MEGRASP for French

Brian MacWhinney macw at cmu.edu
Wed Jun 29 19:52:08 UTC 2022


Dear Sophie,
    We don’t distribute the training corpora for each language along with the MOR grammars, so you are not seeing the crucial piece that would be needed for French.  You can read chapter 11 of the MOR manual to understand which grammatical relations would have to be tagged.  I would say that creating a reasonable training corpus for French would take about 10 days of solid work.  It could be a bit faster if you use the trick of starting from the English MOR.
    Later this summer I will be exploring the use of Universal Dependency taggers for this purpose, but I can’t promise anything about this now.

—Brian MacWhinney

> On Jun 29, 2022, at 2:44 PM, sophie.fagniart at gmail.com wrote:
> 
> Editing my message:
> 
> The answer was in my own question: by using the English grammar, I have succeeded in running MEGRASP on my training corpus. Now that MEGRASP works on all my transcripts, I need to check whether the output seems reliable.
> 
> If anyone has tried the same thing on French transcripts, I would be very interested in any feedback.
> 
> Regards,
> 
> Fagniart S.
> 
> On Wednesday, June 29, 2022 at 20:29:46 UTC+2, sophie.fagniart at gmail.com wrote:
> Hi everyone,
> 
> I would like to study different syntactic measures on my transcripts, but I saw that the MEGRASP function doesn't exist for French in the current version of CLAN. I've also read in the manual that it would be possible to create a training corpus (and thus a megrasp.mod file), so I have two questions:
> - Can you confirm that it will be possible to use a training corpus to run MEGRASP on my French transcripts?
> - I've just tried the procedure described in the manual and I'm not sure whether I'm doing it right...
> Here is part of the CLAN output (running MEGRASP on a short training corpus in training mode):
>  
> Finished processing file with 7 and 1 errors.
> widthfactor = 1.000000
> preparing for estimation...
> done
> number of samples = 66
> number of features = 599
> calculating empirical expectation...
> done
> performing LMVM
>   0 of 301  logl(err) = -2.708050 (0.4091)
>   1 of 301  logl(err) = -1.856078 (0.4091)
>   2 of 301  logl(err) = -1.780707 (0.4091)
> ....
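
One way to inspect a training log like the one quoted above is to pull the numbers out of the iteration lines. The following is only a minimal Python sketch: it assumes each iteration line has the form "N of M  logl(err) = X (Y)" as in the excerpt, the function name parse_training_log is hypothetical, and reading the value in parentheses as an error rate is an assumption based on the "logl(err)" label, not something confirmed in this thread.

```python
import re

# Matches lines such as "  1 of 301  logl(err) = -1.856078 (0.4091)".
# A leading "> " quote prefix is tolerated because search() is used.
# ASSUMPTION: the parenthesised number is an error rate.
ITERATION = re.compile(
    r"(\d+) of (\d+)\s+logl\(err\) = (-?\d+\.\d+) \((\d+\.\d+)\)"
)


def parse_training_log(text):
    """Return a list of (iteration, log_likelihood, err) tuples."""
    rows = []
    for line in text.splitlines():
        match = ITERATION.search(line)
        if match:
            rows.append(
                (int(match.group(1)),
                 float(match.group(3)),
                 float(match.group(4)))
            )
    return rows


# The three iteration lines quoted in the message above.
sample = """\
  0 of 301  logl(err) = -2.708050 (0.4091)
  1 of 301  logl(err) = -1.856078 (0.4091)
  2 of 301  logl(err) = -1.780707 (0.4091)
"""

rows = parse_training_log(sample)

# True when the log-likelihood rises at every step.
improving = all(later[1] > earlier[1]
                for earlier, later in zip(rows, rows[1:]))
```

On the three quoted lines the log-likelihood rises at each step while the value in parentheses stays at 0.4091; with only 66 samples this says little about how reliable the resulting model would be.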
> 
> But I didn't find any megrasp.mod file in the core files. Do I have to use the English grammar for this procedure, together with the megrasp.mod file created from a French training corpus?
> 
> Thanks a lot for your help,
> 
> Kind regards,
> 
> Fagniart S.

-- 
You received this message because you are subscribed to the Google Groups "chibolts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/1FE62244-D2BE-49CA-8036-CB3EC904A93D%40cmu.edu.