[Corpora-List] automatic search for orthographic recurring patterns

MARC FRYD marc.fryd at univ-poitiers.fr
Wed Dec 8 08:38:53 UTC 2004


Hi,
Perhaps someone on the List will be able to help me with the following
data-mining problem:

Given a corpus of isolated lexical units or collocations, I would like
to determine recurring orthographic patterns, whether initial, e.g.
"CARPO" (carpogenic, carpogenous, carpolite), final, e.g. "IONALISM"
(sensationalism, functionalism, etc.), or internal, e.g. "CHRON"
(synchrony, synchronize, etc.).
The output should be arranged so as to show the relative productivity
of each pattern.
Important constraint: the various patterns will *not* be fed in
initially but should be extracted as a result of the algorithm.
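
To make the task concrete, here is a minimal brute-force sketch in
Python. It is only an illustration under stated assumptions, not a
recommended method: "productivity" is taken to mean the number of
distinct words containing a pattern, and the names recurring_patterns,
min_len and min_words are hypothetical. It enumerates every substring,
so it would only suit small word lists; a suffix tree or a
morphological segmentation tool would probably be needed for a real
corpus.

```python
from collections import defaultdict


def recurring_patterns(words, min_len=4, min_words=2):
    """Find substrings shared by several words, grouped by position.

    Assumption: productivity = number of distinct words in which the
    substring occurs in the given position (initial, final, internal).
    """
    seen = {
        "initial": defaultdict(set),
        "final": defaultdict(set),
        "internal": defaultdict(set),
    }
    for word in {w.lower() for w in words}:
        n = len(word)
        for i in range(n):
            for j in range(i + min_len, n + 1):
                if i == 0 and j == n:
                    continue  # the whole word is not counted as a pattern
                if i == 0:
                    position = "initial"
                elif j == n:
                    position = "final"
                else:
                    position = "internal"
                seen[position][word[i:j]].add(word)

    result = {}
    for position, table in seen.items():
        rows = [
            (pattern, len(members))
            for pattern, members in table.items()
            if len(members) >= min_words
        ]
        # Most productive first; ties broken by longer pattern, then A-Z.
        rows.sort(key=lambda row: (-row[1], -len(row[0]), row[0]))
        result[position] = rows
    return result
```

Calling recurring_patterns on the example words above returns, for each
position, a list of (pattern, count) pairs sorted by count. Note that
every shorter fragment of a pattern is reported as well, so a further
filtering step (keeping only the longest pattern for a given set of
words, say) would still be needed.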
I'll post a summary if I get several replies.
Regards to all list members.
Marc Fryd
