[Lexicog] word frequency

fieldworks_support at SIL.ORG
Tue Sep 13 19:00:32 UTC 2005


I don't know of a way to automatically copy frequency counts from a
wordlist into a dictionary file.  But there are some things you could do in
Shoebox or Toolbox that would help with the task of marking the most
frequent words.

1) You could sort your word list numerically by frequency count. Do this
by changing the sorting of the wordlist database to use the \cz field
(Count Sort). If you want a descending sort with the highest numbers first,
make a language encoding in which the digits are listed in descending
order, i.e. 9 8 7 6 5 4 3 2 1 0, and use that language encoding for your
\cz field.  The \cz field will work better for sorting than the \c field,
because the numbers in the \c field don't have leading zeroes, so
Shoebox/Toolbox won't sort them in numeric order. It would sort '100' right
next to '10' and right next to '1' because they all begin with a '1'.

2) Then from this list, you could see which words in the dictionary need to
be marked as being among the most frequent.  To mark a lemma, type your
symbol in front of it, and in the vernacular language sort order, place the
symbol in the "ignore characters" list.  This should keep the symbol from
changing how the lemmas are sorted.  Alternatively, if you are using the
\lc (lexical citation) field but sorting on the \lx field, you could put
the symbol in the \lc field and not in the \lx field, and the entries
should still sort correctly.
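
To see why the unpadded counts in step 1 sort badly, here is a small
Python sketch. It only imitates character-by-character sorting; it is not
how Shoebox/Toolbox works internally. The sample counts, the padding width
of 5, and the names used are all made up for illustration.

```python
# Illustration only: imitates character-by-character (text) sorting.
# The counts and the padding width are hypothetical sample values.
counts = ["1", "10", "100", "25", "9"]

# Plain text sort compares one character at a time, so every value
# starting with "1" is grouped together, regardless of its size.
text_sorted = sorted(counts)

# Zero-padding to a fixed width (as a count-sort field presumably does)
# makes text order agree with numeric order.
WIDTH = 5  # assumed width, wide enough for the largest count
padded = [c.zfill(WIDTH) for c in counts]
padded_sorted = sorted(padded)

# A sort order that ranks the digits 9 8 7 ... 0 gives a descending sort.
DESCENDING_DIGITS = "9876543210"

def descending_key(value):
    """Rank each digit by its position in the descending digit order."""
    return [DESCENDING_DIGITS.index(ch) for ch in value]

padded_descending = sorted(padded, key=descending_key)

print(text_sorted)
print(padded_sorted)
print(padded_descending)
```

The first list puts '100' next to '10' and '1'; the second and third come
out in true ascending and descending numeric order.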

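The effect of the "ignore characters" list in step 2 can be sketched the
same way. Again this is only a Python imitation of the idea, not
Shoebox/Toolbox behaviour; the '*' marker and the sample lemmas are
hypothetical.

```python
# Illustration only: a sort that skips "ignore characters".
# The marker symbol and the lemmas are hypothetical examples.
MARKER = "*"
IGNORE_CHARACTERS = {MARKER}

def sort_key(lemma):
    """Drop ignored characters before comparing lemmas."""
    return "".join(ch for ch in lemma if ch not in IGNORE_CHARACTERS)

lemmas = ["*dog", "cat", "*ant", "bird"]

# Without the ignore list, marked lemmas clump together at the top,
# because '*' sorts before the letters.
naive_order = sorted(lemmas)

# With the ignore list, marked lemmas stay in their alphabetical place.
ignoring_order = sorted(lemmas, key=sort_key)

print(naive_order)
print(ignoring_order)
```

With the marker ignored, '*dog' sorts after 'cat' just as 'dog' would.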


Steve White, Jaars language software support
704-843-6337, 1-800-215-7813



