[Corpora-List] Most common non-Romance, non-Germanic words in English

Tristan Miller miller at ukp.informatik.tu-darmstadt.de
Wed Apr 9 16:38:55 UTC 2014


Dear Christian,

On 09/04/14 12:29 PM, Christian Meyer wrote:
>> I'm interested in finding the most frequent words in English which
>> do not have an origin in any Romance or Germanic language.  Does
>> anyone know if such a list is available anywhere?
> 
> The best data you can get is presumably from the OED. In his keynote
> speech at the recent eLex, John Simpson showed "the OED in two
> minutes", which is essentially a visualization of the time when and
> the region from which words entered the English language. A video of
> the talk is available from http://eki.ee/elex2013/videos/. AFAIK, the
> platform is not yet released(?), but if it is, you could try
> collecting the lemmas from the "other" category, which - I guess - is
> what you are looking for. Obviously, it will still be a lot of work
> to extract the lemmas - and they are not ordered by frequency. I am,
> however, not aware of an accessible, ready-to-use list.

Thanks for this.  I haven't seen the video yet, though there's something
similar to what you describe on the OED's website at
<http://www.oed.com/timelines>.  This tool lets you graph and list words
by origin (the lists are shown only a few entries at a time), though not
by frequency.
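
Once lemmas have been collected by origin, ranking them should mostly be
a matter of joining them against a separate frequency list.  Here is a
minimal sketch in Python, assuming two inputs that neither the OED tool
nor this thread provides: a lemma-to-origin mapping and a lemma-to-count
mapping taken from some corpus.  The function name, the category labels
and the toy data are all hypothetical.

```python
# Hypothetical labels for the origin categories to exclude; these are
# assumptions, not the OED's actual category names.
EXCLUDED_ORIGINS = {"romance", "germanic"}


def rank_by_frequency(origins, frequencies, excluded=EXCLUDED_ORIGINS):
    """Return (lemma, count) pairs for lemmas whose origin is not excluded.

    origins:     dict mapping lemma -> origin label
    frequencies: dict mapping lemma -> corpus count
    Lemmas missing from the frequency list get a count of 0.
    """
    kept = [
        (lemma, frequencies.get(lemma, 0))
        for lemma, origin in origins.items()
        if origin.lower() not in excluded
    ]
    # Most frequent first; ties are broken alphabetically by lemma.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# Toy data, made up purely for illustration (not taken from the OED or
# from any real corpus).
toy_origins = {
    "house": "Germanic",
    "table": "Romance",
    "tea": "other",
    "kayak": "other",
}
toy_frequencies = {"house": 500, "table": 300, "tea": 120}

for lemma, count in rank_by_frequency(toy_origins, toy_frequencies):
    print(lemma, count)
```

The hard part remains getting the origin data in the first place; the
join itself is trivial once both lists exist.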

Regards,
Tristan

-- 
Tristan Miller, Research Scientist
Ubiquitous Knowledge Processing Lab (UKP-TUDA)
Department of Computer Science, Technische Universität Darmstadt
Tel: +49 6151 16 6166 | Web: http://www.ukp.tu-darmstadt.de/
