Automated glossing of BCS texts
Danko Sipka
danko.sipka at ASU.EDU
Thu Oct 21 02:46:21 UTC 2004
Dear Seelangers,
You may be interested in taking a look at http://cli.la.asu.edu/clitag2, a preliminary test version of a script that lets the user paste in any Bosnian/Croatian/Serbian (BCS) text encoded in cp-1250 (Windows Central European), e.g., from newspapers such as http://www.danas.co.yu or http://www.novilist.hr, and have it automatically tagged with English glosses, with the additional option of expanding all inflected BCS words. At present, the script covers over 90% of a typical newspaper text. When finished, the script is meant to facilitate the early classroom inclusion of authentic materials and to reconcile task-based instruction with a focus on form (focusing on form becomes a part of the task). The resulting tagged text can be downloaded and edited. More elaborate explanations can be found at http://cli.la.asu.edu/clitag2.
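For readers who want a concrete picture of this kind of glossing, here is a rough sketch in Python. It is not the script at the address above, only an illustration under stated assumptions: the four-word LEXICON, the function name gloss, the bracketed tag format, and the reading of "expand" as "show the dictionary form next to the gloss" are all hypothetical choices made for the example.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only):
# inflected form -> (dictionary form, English gloss).
LEXICON = {
    "vlada": ("vlada", "government"),
    "je": ("biti", "to be"),
    "donijela": ("donijeti", "to bring, to pass"),
    "odluku": ("odluka", "decision"),
}

# A run of Unicode letters (no digits, no underscore), so that
# characters such as č, ć, š, ž, đ stay inside a word.
WORD = re.compile(r"[^\W\d_]+")


def gloss(raw: bytes, expand: bool = False):
    """Return (tagged_text, coverage) for cp-1250 encoded input."""
    text = raw.decode("cp1250")  # Windows Central European

    def tag(match):
        word = match.group(0)
        entry = LEXICON.get(word.lower())
        if entry is None:
            return word  # unknown word: leave it untagged
        lemma, english = entry
        # Assumed meaning of "expand": also show the dictionary form.
        label = f"{lemma}: {english}" if expand else english
        return f"{word} [{label}]"

    tokens = WORD.findall(text)
    covered = sum(1 for t in tokens if t.lower() in LEXICON)
    coverage = covered / len(tokens) if tokens else 0.0
    return WORD.sub(tag, text), coverage


sample = "Vlada je donijela odluku juče.".encode("cp1250")
tagged, coverage = gloss(sample)
print(tagged)
print(f"coverage: {coverage:.0%}")
```

In this toy run four of the five words are in the lexicon, so the coverage comes out at 80%; a figure such as "over 90% of a typical newspaper text" would be the same ratio computed with a far larger lexicon.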
I plan to develop analogous resources for Russian and Polish, pending financial support for the project.
I would appreciate any comments off-list at Danko.Sipka at asu.edu.
Best,
Danko
Danko Sipka
Research Associate Professor and Acting Director
Critical Languages Institute (http://cli.la.asu.edu)
Arizona State University
E-mail: Danko.Sipka at asu.edu
Web: http://www.public.asu.edu/~dsipka