[Corpora-List] Markup Examples Needed!

Eric Atwell eric at comp.leeds.ac.uk
Wed Sep 3 11:18:10 UTC 2003


Peet,

The AMALGAM project at Leeds University collected a "MULTI-TREEBANK",
a sample of sentences annotated with 24 rival parsing and PoS-tagging schemes,
see http://www.comp.leeds.ac.uk/amalgam/amalgam/multi-parsed.html
and http://www.comp.leeds.ac.uk/amalgam/amalgam/multi-tagged.html

Parse trees as raw output of 10 rival parsers:
Alice, DESPAR, ENGCG, Principar, Link, RANLP, Carroll/Briscoe Shallow Parser,
WordPerfect's Grammatik, Tosca, Sextant;

Parse trees representing 4 English corpus parsing schemes:
UPenn, ICE, POW Systemic-Functional Bracketed, POW S-F Numerical

PoS-tagged text representing 10 English corpus PoS-tagging schemes:
Brown, ICE, LLC, LOB, UNIX Parts, POW, SEC, UPenn, BNC-C5, and BNC-C6.

The sample sentences were from software manuals (though the PoS-tagged samples
were extended to also include BBC radio and London teenager sentences), see
http://www.comp.leeds.ac.uk/amalgam/amalgam/corpus/tagged/raw/ipsm_raw.html

[note: IF YOU HAVE A PARSER/TAGGER, PLEASE VOLUNTEER TO PARSE/TAG THESE
SENTENCES AND DONATE THE OUTPUT TO THE MULTITREEBANK FOR ALL TO SHARE!]

Unsurprisingly, the sample does not include your example "Time flies like..."
- the nearest (in grammatical structure) I could find in the sample was:
"Select the text you want to protect."


Alice:
(SENT (SENT-MOD (UNK-CAT "Select") (NP (DET "the") (NOUN "text")))
(SENT (VP-ACT (NP "you") (V-TR "want")) (NP NULL-PHON))) (SENT-MOD
(UNK-CAT "to") (NP "protect"))

DESPAR:
VB   select 1  --> 8  -
DT      the 2  --> 3  [
NN     text 3  --> 1  + OBJ
PP      you 4  --> 5  " SUB
VBP    want 5  --> 3  ]
TO       to 6  --> 7  -
VB  protect 7  --> 5  -
.         . 8  --> 0  -
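
A minimal Python sketch of reading the DESPAR lines above into dependency
arcs. The column meanings (tag, word, position, head position, a marker
symbol, an optional relation label) are inferred from this one sample, not
taken from DESPAR documentation, and the names Arc, parse_despar and
children_of are hypothetical.

```python
from typing import List, NamedTuple, Optional

DESPAR_SAMPLE = """\
VB   select 1  --> 8  -
DT      the 2  --> 3  [
NN     text 3  --> 1  + OBJ
PP      you 4  --> 5  " SUB
VBP    want 5  --> 3  ]
TO       to 6  --> 7  -
VB  protect 7  --> 5  -
.         . 8  --> 0  -
"""


class Arc(NamedTuple):
    tag: str              # PoS tag (assumed meaning of column 1)
    word: str
    index: int            # position of the word in the sentence
    head: int             # position of the head word; 0 looks like the root
    marker: str           # symbol such as - [ + " ] (meaning not assumed here)
    label: Optional[str]  # relation label such as OBJ or SUB, if present


def parse_despar(text: str) -> List[Arc]:
    """Split each non-empty line on whitespace and build one Arc per word."""
    arcs = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) < 6 or fields[3] != "-->":
            raise ValueError(f"unexpected line layout: {line!r}")
        tag, word, index, _arrow, head, marker = fields[:6]
        label = fields[6] if len(fields) > 6 else None
        arcs.append(Arc(tag, word, int(index), int(head), marker, label))
    return arcs


def children_of(arcs: List[Arc], head: int) -> List[str]:
    """Words whose head column points at the given position."""
    return [arc.word for arc in arcs if arc.head == head]


arcs = parse_despar(DESPAR_SAMPLE)
for arc in arcs:
    print(f"{arc.word} -> {arc.head} {arc.label or ''}".rstrip())
```

Under this reading, children_of(arcs, 5) collects the words attached to
"want" (position 5), which makes it easy to compare DESPAR's attachments with
those of the other dependency-style outputs.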

ENGCG:
"<Select>"
"select" <*> <SVO> <SV> <P/for> V IMP VFIN @+FMAINV
"<the>"
"the" <Def> DET CENTRAL ART SG/PL @DN>
"<text>"
"text" N NOM SG @OBJ
"<you>"
"you" <NonMod> PRON PERS NOM SG2/PL2 @SUBJ
"<want>"
"want" <SVOC/A> <SVO> <SV> <P/for> V PRES -SG3 VFIN @+FMAINV
"<to>"
"to" INFMARK> @INFMARK>
"<protect>"
"protect" <SVO> V INF @-FMAINV
"<$.>"

Principar:
(
 (Select	~ V_NP	*)
 (the	~ Det	< text	spec)
 (text	~ N	> Select	comp1)
 (you	~ N	< want	subj)
 (want	~ V_CP	> text	rel)
 (to	~ I	> want	comp1)
 (protect	~ V_NP	> to	pred)
 (.	)
)

Link:
parse not found

RANLP:
(VP/NP select
 (N2+/DET1a the
  (N2-
   (N1/INFMOD
    (N1/RELMOD1 (N1/N text)
     (S/THATLESSREL (S1a (N2+/PRO you) (VP/NP want (TRACE1 E)))))
    (VP/TO to (VP/NP protect (TRACE1 E)))))))

Carroll/Briscoe Shallow Parser:
parse not found

WordPerfect's Grammatik:
SENTENCE
   |- CLAUSE 1
   |    |- VERB ---------------- Select
   |    |- DIRECT-OBJECT ------- the text
   |- CLAUSE 2 - RELATIVE
        |- SUBJECT ------------- you
        |- VERB ---------------- want
        |- DIRECT-OBJECT ------- {the text}
        |- VERB-Infinitive ----- to protect
        |- --------------------- .

Tosca:
parse not found

Sextant:
VP  101 Select         select         INF       0 0
NP    2 the            the            DET       1 1  2 (text) DET
NP*   2 text           text           NOUN      2 1  0 (select) DOBJ
NP*   3 you            you            PRON      3 0
VP  102 want           want           INF       4 0
VP  102 to             to             TO        5 0
VP  102 protect        protect        INF       6 1  3 (you) SUBJ
--    0 .              .              .         7 0

UPenn:
( (S
    (NP-SBJ (-NONE- *) )
    (VP (VB select)
      (NP
        (NP (DT the) (NN text) )
        (SBAR
          (WHNP-1 (-NONE- 0) )
          (S
            (NP-SBJ-2 (PRP you) )
            (VP (VBP want)
              (S
                (NP-SBJ (-NONE- *-2) )
                (VP (TO to)
                  (VP (VB protect)
                    (NP (-NONE- *T*-1) )))))))))
    (. .) ))
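
Since the original request asked about hierarchical XML-type output, here is
a rough Python sketch that reads the UPenn bracketing above and re-expresses
it as nested XML. This is only one possible mapping, not an established
conversion: the element name "node", the attribute "cat", the label "TOP" for
the unlabelled outer bracket, and all function names are assumptions made for
illustration. Category labels go into an attribute because labels such as
-NONE- or "." are not valid XML element names.

```python
import re
import xml.etree.ElementTree as ET

PENN_SAMPLE = """
( (S
    (NP-SBJ (-NONE- *) )
    (VP (VB select)
      (NP
        (NP (DT the) (NN text) )
        (SBAR
          (WHNP-1 (-NONE- 0) )
          (S
            (NP-SBJ-2 (PRP you) )
            (VP (VBP want)
              (S
                (NP-SBJ (-NONE- *-2) )
                (VP (TO to)
                  (VP (VB protect)
                    (NP (-NONE- *T*-1) )))))))))
    (. .) ))
"""


def tokenize(text):
    """Brackets are single tokens; anything else is a label or a word."""
    return re.findall(r"\(|\)|[^\s()]+", text)


def _parse_node(tokens, pos):
    """Parse one bracketed node starting at tokens[pos]; return (node, next_pos)."""
    if tokens[pos] != "(":
        raise ValueError(f"expected '(' at token {pos}")
    pos += 1
    label = ""
    if tokens[pos] not in ("(", ")"):
        label = tokens[pos]
        pos += 1
    children = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = _parse_node(tokens, pos)
            children.append(child)
        else:
            children.append(tokens[pos])  # a word (or trace) under a preterminal
            pos += 1
    return (label, children), pos + 1


def parse_penn(text):
    tokens = tokenize(text)
    tree, pos = _parse_node(tokens, 0)
    if pos != len(tokens):
        raise ValueError("trailing tokens after the first tree")
    return tree


def words(node):
    """Surface words in order, skipping empty elements under -NONE-."""
    label, children = node
    if label == "-NONE-":
        return []
    out = []
    for child in children:
        out.extend([child] if isinstance(child, str) else words(child))
    return out


def to_xml(node):
    """Map each bracket to <node cat="...">; words become element text."""
    label, children = node
    elem = ET.Element("node", cat=label or "TOP")
    text = " ".join(c for c in children if isinstance(c, str))
    if text:
        elem.text = text
    for child in children:
        if not isinstance(child, str):
            elem.append(to_xml(child))
    return elem


tree = parse_penn(PENN_SAMPLE)
root = to_xml(tree)
print(ET.tostring(root, encoding="unicode"))
```

The same tokenizer and recursive reader should also cope with other
round-bracketed outputs in this message (RANLP, for example), though the
choice of what counts as a label versus a word would need checking per scheme.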

ICE:
PU CL(main,montr,imp)
 VB VP(trans,imp)
  MVB V(trans,imp) {select}
 OD NP()
  DT DTP()
   DTCE ART(def) {the}
  NPHD N(com,sing) {text}
  NPPO CL(depend,montr,pres)
   SU NP()
    NPHD PRON(pers) {you}
   VB VP(montr,pres)
    MVB V(montr,pres) {want}
   OD CL(depend,montr,infin)
    TO PRTCL(to) {to}
    VB VP(montr,infin)
     MVB V(montr,infin) {protect}
 PUNC PUNC(per) {.}

POW Systemic-Functional Bracketed:
[Z
    [CL
        [M select]
        [C
            [NGP
                [DD the]
                [H text]
                [Q
                    [CL
                        [S
                            [NGP
                                [HP you]
                            ]
                        ]
                        [M want]
                        [C
                            [CL
                                [I to]
                                [M protect]
                            ]
                        ]
                    ]
                ]
            ]
        ]
        [E .]
    ]
]

POW S-F Numerical:
Z CL 1 M select 1 C NGP 2 DD the 2 H text 2 Q CL 3 S NGP HP you 3 M want
3 C CL 4 I to 4 M protect 1 E .

Brown:
select/VB
the/AT
text/NN
you/PPSS
want/VB
to/TO
protect/VB
./.

ICE:
select/V(montr,infin)
the/ART(def)
text/N(com,sing)
you/PRON(pers)
want/V(montr,pres)
to/PRTCL(to)
protect/V(montr,imp)
./PUNC(per)

LLC:
select/VA+0
the/TA
text/NC
you/RC
want/VA+0
to/PD
protect/VA+0
./.

LOB:
select/VB
the/ATI
text/NN
you/PP2
want/VB
to/TO
protect/VB
./.

UNIX Parts:
select/adj
the/art
text/noun
you/pron
want/verb
to/verb
protect/verb
./.

POW:
select/P
the/DD
text/H
you/HP
want/M
to/I
protect/M
./.

SEC:
select/VB
the/ATI
text/NN
you/PP2
want/VB
to/TO
protect/VB
./.

UPenn:
select/VB
the/DT
text/NN
you/PRP
want/VBP
to/TO
protect/VB
./.

BNC-C5:
Select/VVB
the/AT0
text/NN1
you/PNP
want/VVB
to/TO0
protect/VVI
./PUN

BNC-C6:
Select/VV0
the/AT
text/NN1
you/PPY
want/VV0
to/TO
protect/VVI
./YSTP
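
To compare the tagging schemes side by side, the word/TAG lines above can be
read into pairs and aligned by token position. The sketch below is
illustrative only: it uses three of the ten schemes, puts each sample on one
line for brevity, assumes the tag is whatever follows the last "/", and the
names parse_tagged and align are hypothetical.

```python
SAMPLES = {
    "Brown": "select/VB the/AT text/NN you/PPSS want/VB to/TO protect/VB ./.",
    "UPenn": "select/VB the/DT text/NN you/PRP want/VBP to/TO protect/VB ./.",
    "BNC-C5": "Select/VVB the/AT0 text/NN1 you/PNP want/VVB to/TO0 protect/VVI ./PUN",
}


def parse_tagged(text):
    """Turn whitespace-separated word/TAG items into (word, tag) pairs."""
    pairs = []
    for item in text.split():
        word, sep, tag = item.rpartition("/")  # split on the last slash
        if not sep or not word or not tag:
            raise ValueError(f"not a word/TAG item: {item!r}")
        pairs.append((word, tag))
    return pairs


def align(samples):
    """Return one (word, {scheme: tag}) row per token position."""
    parsed = {name: parse_tagged(text) for name, text in samples.items()}
    lengths = {len(pairs) for pairs in parsed.values()}
    if len(lengths) != 1:
        raise ValueError("schemes tokenise the sentence differently")
    names = list(parsed)
    rows = []
    for i in range(lengths.pop()):
        word = parsed[names[0]][i][0].lower()
        rows.append((word, {name: parsed[name][i][1] for name in names}))
    return rows


rows = align(SAMPLES)
for word, tags in rows:
    print(f"{word:8}" + "  ".join(f"{name}={tag}" for name, tag in tags.items()))
```

Aligned this way, differences in granularity show up directly: for "want",
Brown gives VB where UPenn gives VBP, and BNC-C5 separates the infinitive
"protect" (VVI) from the other two verbs (VVB).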




On Sat, 30 Aug 2003, peetm wrote:

> Hi,
>
> I'm really interested in seeing alternative mark-ups of the following
> sentence:
>
> "Time flies like an arrow whereas fruit flies like a banana"
>
> I know that 'accurate' is entirely subjective - and down to the tagger - but
> I'd like to see samples of mark-ups produced by this sentence, 'accurate'
> or not (preferably with an explanation of the mark-up used:
> methodology/tag set - or with links to the same).
>
> I'm especially interested in any mark-up that produces some hierarchical
> XML-type output.
>
> So, if anyone feels like providing me with examples - PLEASE DO SO!
>
> Many thanks,
>
> peetm
>
> email: peet.morris at clg.ox.ac.uk
>
> addr: Computational Linguistics Group
>       University of Oxford
>       The Clarendon Institute
>       Walton Street
>       Oxford
>       OX1 2HG

--
Eric Atwell, CVL: Computer Vision and Language research group
Distributed Multimedia Systems MSc Tutor & SOCRATES/JYA Tutor
School of Computing, University of Leeds, LEEDS LS2 9JT
TEL: 0113-3435761  MOBILE: 0775-1039104 FAX: 0113-3435468
WWW: http://www.comp.leeds.ac.uk/eric  EMAIL: eric at comp.leeds.ac.uk
Visit http://www.computingLEEDS.ac.uk - our newsletter for industry


