[Corpora-List] Most common non-Romance, non-Germanic words in English
Darren Cook
darren at dcook.org
Wed Apr 9 23:17:43 UTC 2014
Trying again - I keep hitting the spam filter, so I'll try splitting my
response up!
> If not, I suppose I could produce one myself easily enough by taking a
> raw frequency list (such as Adam Kilgarriff's BNC lemma counts),
> querying each entry in a machine-readable dictionary which provides
> etymological information, and filtering appropriately. But that
> presupposes that such a dictionary exists. Does anyone know of a
> suitable freely available dictionary for this task?
One approach would be to gather lists of the words of interest:
http://en.wikipedia.org/wiki/List_of_English_words_of_Arabic_origin
http://en.wikipedia.org/wiki/List_of_English_words_of_Japanese_origin
http://en.wikipedia.org/wiki/List_of_English_words_of_Chinese_origin
etc.
As most English words do come from the Romance or Germanic languages,
this is not an impossible task, though you may need to filter further
based on your exact criteria. E.g. tempura entered English from
Japanese, but entered Japanese from Portuguese. Admiral comes from a
French word which comes from an Arabic word; which does that count as?
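
To make the filtering question concrete, here is a rough Python sketch of
how the word lists could be combined with a frequency list. Everything in
it is an assumption for illustration: the ETYMOLOGY table would have to be
compiled by hand from pages like the ones above, the counts are made up
rather than taken from the BNC, and the "immediate"/"ultimate" policy
names are just my labels for the two readings of "origin".

```python
# Hypothetical etymology chains, most recent source language first.
# In practice these would be compiled by hand from the origin lists.
ETYMOLOGY = {
    "tempura": ["Japanese", "Portuguese"],
    "admiral": ["French", "Arabic"],
    "tsunami": ["Japanese"],
}

# Romance and Germanic source languages to filter out (not exhaustive).
EXCLUDED = {
    "French", "Portuguese", "Spanish", "Italian", "Latin",
    "German", "Dutch", "Old English", "Old Norse",
}


def rank_words(freqs, etymology, excluded, policy="immediate"):
    """Return (word, count) pairs, most frequent first.

    policy="immediate" judges a word by the language English borrowed it
    from; policy="ultimate" judges it by the earliest language in the chain.
    """
    if policy not in ("immediate", "ultimate"):
        raise ValueError("policy must be 'immediate' or 'ultimate'")
    ranked = []
    for word, chain in etymology.items():
        # Skip words missing from the frequency list or with no chain.
        if word not in freqs or not chain:
            continue
        origin = chain[0] if policy == "immediate" else chain[-1]
        if origin not in excluded:
            ranked.append((word, freqs[word]))
    # Sort by descending count, breaking ties alphabetically.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Made-up counts, standing in for a real lemma frequency list.
    freqs = {"admiral": 50, "tempura": 5, "tsunami": 20}
    print(rank_words(freqs, ETYMOLOGY, EXCLUDED, policy="immediate"))
    print(rank_words(freqs, ETYMOLOGY, EXCLUDED, policy="ultimate"))
```

With the immediate policy tempura is kept and admiral is dropped; with the
ultimate policy it is the other way round, which is exactly the ambiguity
you would need to settle before building the list.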
Darren