French MOR

Christophe Parisse parisse at ext.jussieu.fr
Mon Jan 16 11:36:18 UTC 2006


Well, I beg to differ!

First, I used c-rules because there are many regular words in French, and it
was easier to use c-rules than not to use them.

For example, there are 25144 "words" in the v.cut file. Of these, 11147 are
roots of 1st group verbs (the most frequent regular verbs in French) and 240
are roots of 2nd group verbs. Thanks to c-rules, each of these roots can be
analysed in something like 45 different forms (something like 510615
different forms if written out as full forms). There are also something like
13757 words in v.cut which ARE full forms, but these correspond to (only!)
305 irregular verbs, which have something like 45 different forms each and
are much too irregular for c-rules to be of much use.
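
As a rough check on the orders of magnitude, the figures above can be multiplied out. This is only a sketch: 45 forms per verb is the approximate value given above, so the products land near the quoted totals, not exactly on them. The variable names are made up for the illustration.

```python
# Figures quoted in the message; forms_per_verb is "something like 45",
# so every product below is an approximation.
first_group_roots = 11147
second_group_roots = 240
full_form_entries = 13757
irregular_verbs = 305
forms_per_verb = 45

regular_roots = first_group_roots + second_group_roots
v_cut_entries = regular_roots + full_form_entries
regular_full_forms = regular_roots * forms_per_verb
irregular_full_forms = irregular_verbs * forms_per_verb

print(regular_roots)         # 11387
print(v_cut_entries)         # 25144, the size of v.cut given above
print(regular_full_forms)    # 512415, close to the quoted 510615
print(irregular_full_forms)  # 13725, close to the quoted 13757
```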

For nouns and adjectives, I did the same: the plurals in "s" and the
feminine forms of adjectives in "e" are generated automatically.
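
To make the idea concrete, here is a minimal sketch in Python of what such automatic generation amounts to. It is not MOR's c-rule syntax. The function names and the example words are hypothetical, and the sketch assumes a fully regular root (no words already ending in "s" or "x", no stem changes).

```python
def noun_forms(root):
    """Surface forms of a regular noun: the plural adds "s"."""
    return {"SG": root, "PL": root + "s"}


def adjective_forms(masc_root):
    """Surface forms of a regular adjective.

    The feminine adds "e" to the masculine root; each plural adds "s".
    """
    fem = masc_root + "e"
    return {
        ("MASC", "SG"): masc_root,
        ("MASC", "PL"): masc_root + "s",
        ("FEM", "SG"): fem,
        ("FEM", "PL"): fem + "s",
    }


print(noun_forms("elephant")["PL"])               # elephants
print(adjective_forms("petit")[("FEM", "PL")])    # petites
```

Anything that does not follow this pattern (for instance "cheval" / "chevaux") has to be listed as a full form instead.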

Second, and most important, MOR for French does the SAME thing as MOR for
English.  Just two examples:

ENGLISH:

@Begin
*CHI:   plays
%mor:	v|play-3S^n|play-PL
*CHI:   playing
%mor:	part|play-PROG
*CHI:   oxen
%mor:	n|ox&PL
*CHI:   geese
%mor:	n|goose&PL
*CHI:   problems
%mor:	n|problem-PL
@End

FRENCH:

@Begin
*CHI:	jouent
%mor:	v|jouer-SUBJV:PRES&_3PV^v|jouer&PRES&_3PV
*CHI:	remises
%mor:	v:pp|remettre&_FEM&_PL^n|remise&_FEM-_PL^v|remiser-SUBJV:PRES&_2SV^v|remiser&PRES&_2SV
*CHI:	allumees
%mor:	v:pp|allumer&_FEM&_PL
*CHI:	jouant
%mor:	v:prog|jouer
*CHI:	chevaux
%mor:	n|cheval&_MASC&_PL
*CHI:	elephants
%mor:	n|elephant&_MASC-_PL
*CHI:	fille
%mor:	n|fille&_FEM
@End
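
The notation in both examples can be read the same way: "^" separates alternative analyses, "|" separates the category from the stem, "-" marks a regular suffix and "&" marks a fused (irregular) form. A minimal Python sketch of that reading follows. The function name `parse_mor` is hypothetical, and the sketch assumes the stem itself contains no "-" or "&".

```python
import re


def parse_mor(analysis):
    """Split the content of one %mor analysis into its alternative readings."""
    readings = []
    for alternative in analysis.split("^"):
        pos, rest = alternative.split("|", 1)
        # Splitting on a captured group keeps the "&" / "-" markers:
        # "elephant&_MASC-_PL" -> ["elephant", "&", "_MASC", "-", "_PL"]
        parts = re.split(r"([&-])", rest)
        stem = parts[0]
        markers = [(parts[i], parts[i + 1]) for i in range(1, len(parts), 2)]
        readings.append({"pos": pos, "stem": stem, "markers": markers})
    return readings


for reading in parse_mor("v|jouer-SUBJV:PRES&_3PV^v|jouer&PRES&_3PV"):
    print(reading)
```

With this reading, "v|play-3S^n|play-PL" and "v|jouer-SUBJV:PRES&_3PV^v|jouer&PRES&_3PV" both come out as two alternatives on one stem, which is the sense in which the two systems do the same thing.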

However, I confess that I made an error when generating the list of
exceptions: I coded some regular words using the "&" sign instead of "-".
This is especially true for feminine forms, which are all coded with "&"
whereas many are regular. I can check this and change the signs if
necessary, either in the full form file or by coding a new rule.
Also, some verbs of the 3rd group could be considered regular. These could
be changed too, but there could be disagreement about the list of regular
3rd group verbs.
Finally, one could choose a different notation for infinitives and
participles. I coded them in the main category instead of
using -INF, -PROG, etc. This could easily be changed if necessary.

One final remark. There are around 32,000 roots in MOR for French, which
correspond to close to 600,000 full forms. It seems to me that this is far
from incomplete.

Christophe Parisse

> -----Original Message-----
> From: info-childes at mail.talkbank.org
> [mailto:info-childes at mail.talkbank.org] On behalf of Brian MacWhinney
> Sent: Sunday, 15 January 2006 17:07
> To: info-childes at mail.talkbank.org
> Subject: French MOR
>
>
> Dear Colleagues,
>     Work on the application of MOR to the French corpora in CHILDES
> has lagged a bit, despite the availability of a fairly complete
> lexicon provided by Christophe Parisse.  In part, this is because the
> Parisse French MOR system was constructed to use full form entries,
> rather than the system of arules and crules used for other
> languages.  It would be possible to either continue constructing
> French MOR in this full-form format or to shift to using the analytic
> framework.  Before beginning on this work, I wanted to check to see
> if anyone in the CHILDES community had done any work extending the
> current French MOR grammar.  I want to make sure we are not about to
> reinvent the wheel.  Many thanks.
>
> --Brian MacWhinney, CMU
>
>


