[Corpora-List] Data-Driven Learning materials

simon smith ssmith at mcu.edu.tw
Fri Apr 11 04:47:33 UTC 2008


Hi Alex and corpora-list members:

I've just been looking at the resources mentioned by Linda -- very
impressive! I'm afraid this contribution will seem rather modest by
comparison.

I'm involved in two projects in which users are presented with corpus data:
one on Chinese, the other on English. Both of them make use of Adam
Kilgarriff's Sketch Engine corpus query tool.

*In the Chinese project*, Alice Chen and I tried to assess, using pre- and
post-tests, the progress in acquisition of collocational patterns made by a
group of intermediate to advanced Chinese learners. These learners
were exposed for a period of time to a large corpus of Chinese, accessed
through the concordances and usage summaries offered by Sketch Engine. We
prepared a walkthrough guide <http://mcu.edu.tw/~ssmith/walkthrough/> to the
use of corpora for language learning in general (and the Sketch Engine in
particular), and described the work in a
paper <http://www.kilgarriff.co.uk/Publications/2007-SmithChenKilg-PALC.doc> given
at PALC, in Lodz, last year.

The results of that work were rather inconclusive, partly because
our learners were left to their own devices as to how they went about
exploring the corpus, and what they learned from it.

In July, I'll be building on that work with a much more task-focused
Chinese-learning experience. This will be aimed at beginners, and will take
the form of a workshop at TALC 2008,
Lisbon <http://talc8.isla.pt/workshops.html#mandarin>.
Participants will learn about an important collocational category in the
language, that of Verb-Object Compounds, which can be readily illustrated
using corpus tools, and crops up often enough and early enough in every
Chinese learner's exposure to the language to merit special study. If that
sounds a bit dry, we'll also be practising some basic Mandarin, and even
dabbling a little in the writing system. Not to mention learning about
Sketch Engine along the way. If you're going to be at TALC, please consider
joining us!

*The English project* is on *corpus-generated cloze exercises.* Scott
Sommers and I are presenting a paper
<http://mcu.edu.tw/~ssmith/ccu2008-smith.pdf> on this at the 2008
Conference of English Teaching and Learning in R.O.C.
<http://www.ccu.edu.tw/fllcccu/2008EIA/English/Eprogram.php>.

A cloze exercise has three components: a cloze sentence ("The boy stood on
the burning deck"), a key ("burning") and distractors ("lukewarm",
"tepid", "piping hot"..., for the sake of illustration). Our algorithm takes
the key as input from the user, finds an appropriate sentence in the corpus,
and supplies distractors (terms which have the same sort of distribution in
the corpus as the key, but never actually occur with a particular collocate,
such as "deck" in the example).
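
To make the three steps concrete, here is a minimal Python sketch of the
idea. It is not our actual implementation and does not use Sketch Engine:
the toy corpus, the make_cloze function and the hand-supplied candidate list
are all assumptions for illustration. In particular, "same sort of
distribution" is not computed here (the candidates are simply passed in), and
"occurs with the collocate" is approximated as "appears in the same
sentence".

```python
import re


def tokens(sentence):
    """Lower-case word tokens of a sentence (hyphenated words kept whole)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", sentence.lower())


def make_cloze(key, collocate, candidates, corpus, n_distractors=3):
    """Build one cloze item for `key`, or return None if no sentence fits.

    Hypothetical helper: `candidates` stands in for words with a similar
    distribution to `key`, which a real system would take from a thesaurus.
    """
    # Step 1: find a carrier sentence containing both key and collocate.
    carrier = next(
        (s for s in corpus if key in tokens(s) and collocate in tokens(s)),
        None,
    )
    if carrier is None:
        return None

    # Step 2: collect every word seen in a sentence with the collocate.
    seen_with_collocate = set()
    for sentence in corpus:
        toks = tokens(sentence)
        if collocate in toks:
            seen_with_collocate.update(toks)

    # Step 3: keep only candidates never seen with the collocate.
    distractors = [
        c for c in candidates if c != key and c not in seen_with_collocate
    ][:n_distractors]

    # Blank out the first occurrence of the key in the carrier sentence.
    stem = re.sub(
        rf"\b{re.escape(key)}\b", "______", carrier, count=1,
        flags=re.IGNORECASE,
    )
    return {"stem": stem, "key": key, "distractors": distractors}


# Toy corpus, invented for illustration only.
CORPUS = [
    "The boy stood on the burning deck.",
    "He drank a cup of lukewarm tea.",
    "The bath water was tepid.",
    "Sailors scrubbed the wet deck.",
]

item = make_cloze("burning", "deck", ["lukewarm", "tepid", "wet"], CORPUS)
print(item["stem"])
print(item["key"], item["distractors"])
```

In this toy run "wet" is rejected because it is attested with "deck", while
"lukewarm" and "tepid" survive. With a small corpus, a candidate may pass the
filter only because the data are sparse, not because the combination is
really impossible.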

So far, we can generate cloze exercises such as:


*Reality manages the home delivery operations of a range of GUS
organisations , along with an enviable ______ of blue-chip clients .*

Ans: *investment   infrastructure    asset   portfolio*
and
*Albert E Sharp Fund Managers have launched AES European unit trust, which
seeks long-term capital growth from a diversified ______ of European
Securities.*
Ans:  *asset portfolio stock holding*
where "holding" seems to be a false distractor, since it fits the gap about
as well as the key does ;-(

Any feedback on either of these projects would of course be most welcome!

-- 
Simon Smith, PhD

Assistant Professor
English Language Center
Ming Chuan University