[Corpora-List] Syntax search problem resolved

Sebastian Hoffmann sebhoff at es.unizh.ch
Fri Jun 16 16:03:48 UTC 2006


At 6:24 AM -0700 6/16/06, Linda Bawcom wrote:
>Dear friends, colleagues, and list members,
>
>Thanks to Knut Hofland, Geoffrey Williams, Chris 
>Tribble and Mark Davis, who all very kindly took 
>me by the hand, I was able to find the strings I 
>needed by using, WITH the BNC:
>
><w NP0>* <w PRF>of , although I was unable to 
>get strings with <w NP0>* <w PRF>of <w NP0>* . 
>Well, it all seems quite obvious and logical now, 
>of course!
>
>And since nouns follow 'of', it's then just a 
>matter of deleting items such as United States 
>of America (no pun intended) or Port of Spain. 
>I'm not quite sure whether to include items such 
>as Joan of Arc, Lawrence of Arabia or Prince of 
>Wales when basically I'm looking for the frequency 
>of, e.g., Clinton of Little Rock. I suppose I'll 
>check with John Sinclair, the 'of' expert!
>
>Kindest regards,
>Linda
>

Dear Linda,
I just ran a query for "NP0 of NP0" in BNCweb 
(CQP edition) and got 7850 hits. The frequency 
list feature gives you the following top 50 
combinations:

No.	Lexical item(s)	No. of occurrences	Percent
1	Isle of Man	346	4.41%
2	Isle of Wight	342	4.36%
3	States of America	168	2.14%
4	End of London	97	1.24%
5	Donaldson of Lymington	73	0.93%
6	Isle of Dogs	55	0.70%
7	Bridge of Harwich	50	0.64%
8	Riding of Yorkshire	46	0.59%
9	Jesus of Nazareth	44	0.56%
10	John of Gaunt	43	0.55%
11	Mitterrand of France	38	0.48%
12	Joan of Arc	35	0.45%
13	Goff of Chieveley	34	0.43%
14	Keith of Kinkel	32	0.41%
15	William of Malmesbury	29	0.37%
16	Francis of Assisi	29	0.37%
17	HUSSEIN of Jordan	27	0.34%
18	Lawrence of Arabia	27	0.34%
19	Richard of Gloucester	26	0.33%
20	States of Europe	26	0.33%
21	Highlands of Scotland	26	0.33%
22	Port of Spain	26	0.33%
23	Slynn of Hadley	24	0.31%
24	Kingdom of Great	23	0.29%
25	Isle of Skye	21	0.27%
26	Isle of Lewis	20	0.25%
27	John of Salisbury	19	0.24%
28	Joseph of Arimathea	18	0.23%
29	Edward of England	18	0.23%
30	Michael of Kent	18	0.23%
31	Hassan of Morocco	18	0.23%
32	Julian of Norwich	18	0.23%
33	HUGH OF LINCOLN	18	0.23%
34	Florence of Worcester	18	0.23%
35	Philip of Spain	15	0.19%
36	Isle of Sheppey	15	0.19%
37	Eleanor of Aquitaine	14	0.18%
38	Fahd of Saudi	14	0.18%
39	Mubarak of Egypt	13	0.17%
40	John of God	13	0.17%
41	Philip of France	12	0.15%
42	Teresa of Avila	12	0.15%
43	Hugh of Lyons	12	0.15%
44	Hook of Holland	12	0.15%
45	Fraser of Carmyllie	12	0.15%
46	William of Jumièges	11	0.14%
47	Henry of Lancaster	11	0.14%
48	Brandon of Oakbrook	11	0.14%
49	Morris of Borth-y-Gest	11	0.14%
50	Isle of Innisfree	11	0.14%
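
As a rough illustration of how such a frequency list can be derived from a set of hits, here is a minimal Python sketch. It is not BNCweb's own code; the function name `frequency_list` and the input format (one string per hit) are assumptions made for the example. Strings are counted exactly as typed, so differently capitalised variants stay separate, and each percentage is relative to the total number of hits (for instance, 346 out of 7850 gives 4.41%).

```python
from collections import Counter


def frequency_list(hits, top=50):
    """Rank hit strings by frequency.

    `hits` is assumed to be a list with one string per query match,
    e.g. "Isle of Man". Returns (item, count, percent) tuples, where
    percent is relative to the total number of hits.
    """
    counts = Counter(hits)          # case-sensitive: strings counted as typed
    total = sum(counts.values())
    return [
        (item, count, round(100 * count / total, 2))
        for item, count in counts.most_common(top)
    ]


if __name__ == "__main__":
    sample = ["Isle of Man"] * 3 + ["Joan of Arc"]
    for rank, (item, count, pct) in enumerate(frequency_list(sample), start=1):
        print(f"{rank}\t{item}\t{count}\t{pct:.2f}%")
```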

I can send you the complete list if you want. It 
may also be useful to add a few optional elements 
to your retrieval pattern. For example, you could 
allow sequences of items tagged as NP0 as well as 
instances of NN1 and NN2 that immediately follow 
the second NP0 to get instances like the 
following:

<w NP0>Superintendent <w NP0>Trobridge <w PRF>of 
<w NP0>Ealing <w NN2>Police <w NN1>Station

<w NP0>St <w NP0>Francis <w PRF>of <w NP0>Assisi

<w NP0>Archbishop <w NP0>MacNamara <w PRF>of <w NP0>Dublin
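
To make the extended pattern concrete, here is a minimal Python sketch that applies it to text in the tagged format shown above. It is only an illustration under stated assumptions, not the query BNCweb runs internally: the helper names `parse_tagged` and `find_np_of_np` are made up for the example, and the pattern is read as one or more NP0, then "of" tagged PRF, then one or more NP0, then any number of NN1 or NN2.

```python
import re

# Matches one "<w TAG>word" unit as shown in the examples above.
TOKEN_RE = re.compile(r"<w ([A-Z0-9-]+)>\s*([^<]*)")


def parse_tagged(text):
    """Turn '<w NP0>Isle <w PRF>of ...' into a list of (tag, word) pairs."""
    return [(tag, word.strip()) for tag, word in TOKEN_RE.findall(text)]


def find_np_of_np(tokens):
    """Return strings matching NP0+ 'of'(PRF) NP0+ (NN1|NN2)*."""
    hits = []
    i, n = 0, len(tokens)
    while i < n:
        # One or more proper nouns before "of".
        j = i
        while j < n and tokens[j][0] == "NP0":
            j += 1
        is_of = j < n and tokens[j][0] == "PRF" and tokens[j][1].lower() == "of"
        if j == i or not is_of:
            i = max(j, i + 1)
            continue
        # One or more proper nouns after "of".
        k = j + 1
        while k < n and tokens[k][0] == "NP0":
            k += 1
        if k == j + 1:
            i = j + 1
            continue
        # Optional trailing common nouns (singular or plural).
        m = k
        while m < n and tokens[m][0] in ("NN1", "NN2"):
            m += 1
        hits.append(" ".join(word for _, word in tokens[i:m]))
        i = m
    return hits


if __name__ == "__main__":
    example = ("<w NP0>Superintendent <w NP0>Trobridge <w PRF>of "
               "<w NP0>Ealing <w NN2>Police <w NN1>Station")
    print(find_np_of_np(parse_tagged(example)))
```

Because the leading and trailing NP0 runs are matched greedily, a title such as "St" or "Archbishop" is kept together with the name it precedes.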

Best,
Sebastian

-- 

Dr. Sebastian Hoffmann
Englisches Seminar der Univ. Zürich
Plattenstrasse 47
CH-8032 Zürich
Tel: +41-44-634 3551
Fax: +41-44-634 4908
http://www-es.unizh.ch