[Corpora-List] "Tajweed" in English dictionaries and corpora

Eric Atwell E.S.Atwell at leeds.ac.uk
Fri Mar 1 22:35:23 UTC 2013


I asked what I thought was a straightforward question:
  "Can anyone point me at research on vocabulary related to Islam,
   and how it figures in British dictionaries and corpora?"

This has started an interesting thread of discussion, but only
Michael Rundell answered the question directly: Macmillan seems to
have the only British English dictionary with vocabulary related to Islam.
Other dictionaries include specialist terms from many other domains
but ignore vocabulary related to Islam used by British English speakers.

Michael, you said "Thanks for 'tajweed', which corpus data suggests we
should include" - what corpus data?  Presumably not the BNC.

I can understand that a general English dictionary cannot include all
specialist-domain terms, but only ones in "common use". I had thought
this might roughly equate to frequency of occurrence in some standard
corpus (e.g. the BNC), but I see that the Oxford English Dictionary does not
rely on frequency in a corpus, but on more vaguely defined "evidence":

http://public.oed.com/about/frequently-asked-questions/#qualify

"How does a word qualify for inclusion in the OED?
The OED requires several independent examples of the word being used,
and also evidence that the word has been in use for a reasonable amount
of time. The exact time-span and number of examples may vary: for
instance, one word may be included on the evidence of only a few
examples, spread out over a long period of time, while another may
gather momentum very quickly, resulting in a wide range of evidence in a
shorter space of time. We also look for the word to reach a level of
general currency where it is unselfconsciously used with the expectation
of being understood: that is, we look for examples of uses of a word
that are not immediately followed by an explanation of its meaning for
the benefit of the reader..."

I guess Google's "About 1,490,000 results" for "Tajweed" is not enough
evidence for the OED.

Or maybe Web-as-Corpus is just generally not good enough for the OED.


Eric Atwell, Language research group,
  School of Computing, Leeds University


On Thu, 28 Feb 2013, Michael Rundell wrote:

> Eric
>
> As far as dictionaries go, the Macmillan English Dictionary
> (http://www.macmillandictionary.com/) includes a thesaurus, and one of the
> categories is words relating to Islam - there are about 70 I think. You can
> see them all here:
> http://www.macmillandictionary.com/thesaurus-category/british/Islam_7
>
> A couple of caveats:
> -this is a general-purpose pedagogical dictionary so definitions are not
> detailed and coverage is not extensive
> -we're not omniscient, so there may be some inaccuracies, and there are sure
> to be other  good candidates currently missing
>
> But the good thing about online dictionaries is that (a) you can update them
> regularly, rather than waiting 5 years for a new printed edition, and (b)
> you can fix things that are inaccurate or misleading.
>
> Thanks for 'tajweed', which corpus data suggests we should include. I've
> added it to the list of items for our next-but-one update (the next update
> is already too far down the line). And if anyone has other suggestions (or
> proposed improvements to existing Islam-related entries) please send to me
> at michael.rundell at lexmasterclass.com
>
> Michael
>
> ----- Original Message -----
> From: "Eric Atwell" <E.S.Atwell at leeds.ac.uk>
> To: "CORPORA discussion forum" <corpora at uib.no>
> Cc: "Eric Atwell" <e.s.atwell at leeds.ac.uk>
> Sent: Thursday, February 28, 2013 10:03 AM
> Subject: [Corpora-List] "Tajweed" in English dictionaries and corpora
>
>
>> Can anyone point me at research on vocabulary related to Islam,
>> and how it figures in British dictionaries and corpora?
>> (other than "Terrorism" of course - well-researched by corpus linguists
>> :-)
>>
>> We have a UK-EPSRC project on "Natural Language Processing Working
>> Together With Arabic And Islamic Studies", focussing on Tajweed.
>> I've just discovered a Quite Interesting fact about Tajweed:
>>
>> It is worth noting that even though "Tajweed" is a term understood by
>> most British Muslims (2.7 million or 5% of the UK population according
>> to UK Census 2011), the word is left out of most British English
>> dictionaries: it is not found in the Oxford English Dictionary, the
>> Collins English Dictionary, or the Longman Dictionary of Contemporary
>> English. "Tajweed" is also not found in the 100-million-word British
>> National Corpus, although Google search for "tajweed" reports "About
>> 1,800,000 results".
>>
>> The only English-language "dictionary definition" I could find for
>> "Tajweed" was in Wikipedia:
>>
>> Tajwīd (Arabic: تجويد taǧwīd: IPA: [tæʒwiːd]) is an Arabic word for
>> elocution and refers to the rules governing pronunciation during
>> recitation of the Qur'an.
>>
>> I would have thought that, although the word is Arabic by origin, it is
>> now a fully-British English loan word, used by many British English
>> speakers....
>>
>>
>> Eric Atwell, Associate Professor, Language research group,
>>  I-AIBS Institute for Artificial Intelligence and Biological Systems
>>  School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
>>  Leeds LS2 9JT, England.        TEL: 0113-3435430  FAX: 0113-3435468
>>  WWW: http://www.comp.leeds.ac.uk/eric
>>       http://www.comp.leeds.ac.uk/nlp
>>       http://www.comp.leeds.ac.uk/arabic
>>
>
>
> --------------------------------------------------------------------------------
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>
