LL-L "Resources" 2003.11.11 (10) [E/LS]

Lowlands-L lowlands-l at lowlands-l.net
Wed Nov 12 01:57:11 UTC 2003


======================================================================
L O W L A N D S - L * 11.NOV.2003 (10) * ISSN 189-5582 * LCSN 96-4226
http://www.lowlands-l.net * lowlands-l at lowlands-l.net
Rules & Guidelines: http://www.lowlands-l.net/index.php?page=rules
Posting Address: lowlands-l at listserv.linguistlist.org
Server Manual: http://www.lsoft.com/manuals/1.8c/userindex.html
Archives: http://listserv.linguistlist.org/archives/lowlands-l.html
Encoding: Unicode (UTF-8) [Please switch your view mode to it.]
=======================================================================
You have received this because you have been subscribed upon request.
To unsubscribe, please send the command "signoff lowlands-l" as message
text from the same account to listserv at listserv.linguistlist.org or
sign off at http://linguistlist.org/subscribing/sub-lowlands-l.html.
=======================================================================
A=Afrikaans Ap=Appalachian B=Brabantish D=Dutch E=English F=Frisian
L=Limburgish LS=Lowlands Saxon (Low German) N=Northumbrian
S=Scots Sh=Shetlandic V=(West)Flemish Z=Zeelandic (Zeêuws)
=======================================================================

From: Jan Strunk <strunkjan at hotmail.com>
Subject: Low Saxon search engine

Leiwe Lüe,

vör ein paor wiäken hiev ek maol wat üöwer platdüütsche söökmaschinen
fraogt. Inne tüschentiid hiev ek sauwat 2000 nedderdüütsche dokumenten
uut'n internet rünnerkriegen un sin jüst derbi verscheidene indices
darüöwer te buun. Vandage hiev ek twei fraogen för Ink:

1. Wiet villicht ein van Ink wu ek ein web-interface taum testen opbuun kan?
    Dat problem is dat ek dat giärn äs Java servlet programmeirn wull un
    vüel plats för de daten bruuk. Touminnst 100 MB.

2. Ek wöör de söökalgoritm dei ek jüst an'n programmeirn sin giärn an
    souwat hunnert söökwöör uutprobeiern. Nu will ek mi nich einfak sölfs
    alle dei hunnert wüör uutdinken. (Dao kunn jä well seggen, dat ek
    jüst dei nuomen hev dei am einfaksten sin.) Un ek här auk giärn wüör
    uut verscheidene dialekten un schriivwisen. Kuort, ek wüör Ink heel
    dankbaor wenn It mi allemann ein paor vörschliäge tauschicken
    kunnen. Dei wäör ik denn sammeln un taun utprobeirn bruken. Well
    lust het, schickt einfak'n paor wüör an miin email-adress:
    jstrunk at stanford.edu. (Schriivt man sau äs It dat süss auk daut
    un in alle dialekten!)

In twei wiäken is miin projekt an de Uni tau Enne, aower villicht kömmt
jä wat guedet derbi ruut dat me achterhiär noch tau wat bruken kann.

Dear folks,

A few weeks ago, I asked about Low Saxon search engines. In the
meantime, I have collected around 2000 Low Saxon documents from the web
by hand, yielding quite a big corpus of Low Saxon text (which might be
useful in itself later). I am currently programming and testing some
algorithms for "fuzzy matching" (which means, for example, also finding
zuiken when you type in söken). I now have two more questions for you:

1. Does anyone know where I could set up a web interface for testing by
    Low Saxon internet users? The problem is that I would like to use a
    Java servlet and that I need a lot of space: at least 100 MB.

2. A more important and urgent question: I would like to test the
    different search and matching algorithms on about 100 words. I would
    rather not come up with these words all by myself, because I might
    introduce a bias that way. Moreover, I would like to have search
    words in different dialects and orthographies. To cut a long story
    short: I would be very thankful to all you Low Saxon speakers out
    there (from the whole world, of whatever dialect) if you could send
    me some suggestions for search terms that I could use to evaluate
    the different algorithms. Please send me as many as you like, using
    your own dialect and the orthographic system you normally use.
    Please send any suggestions to:
    jstrunk at stanford.edu

My project here at Stanford University will be over two weeks from now,
and I won't be able to implement and test as much as I wanted to. But
maybe there will be interesting and useful results that can be used for
actually building a search engine in the future.
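
To make the idea of "fuzzy matching" concrete, here is a minimal sketch
in Python. It is only an illustration, not one of the algorithms from
the project: it assumes plain Levenshtein edit distance as the
similarity measure, and the function names, the small word list and the
distance threshold of 3 are all hypothetical. Only zuiken and söken are
taken from the example above.

```python
import unicodedata


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def normalize(word: str) -> str:
    """Lowercase and use composed Unicode forms, so that "ö" counts
    as one character however it was typed."""
    return unicodedata.normalize("NFC", word).lower()


def fuzzy_lookup(query, vocabulary, max_distance=3):
    """Return the vocabulary words within max_distance edits of the
    query, closest first. The default threshold is an assumption."""
    scored = [(edit_distance(normalize(query), normalize(word)), word)
              for word in vocabulary]
    return [word for distance, word in sorted(scored)
            if distance <= max_distance]


# Hypothetical index vocabulary, for illustration only.
vocabulary = ["water", "zuiken", "huus", "söken"]
print(fuzzy_lookup("söken", vocabulary))
```

A fixed threshold like this is crude: it treats every spelling
difference alike and is too loose for short words. Search words sent in
by speakers of different dialects are exactly the kind of test data
that would be needed to tune or replace such a measure.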

Thank you very much!

Jan Strunk
jstrunk at stanford.edu
strunk at linguistics.ruhr-uni-bochum.de

================================END===================================
* Please submit postings to lowlands-l at listserv.linguistlist.org.
* Postings will be displayed unedited in digest form.
* Please display only the relevant parts of quotes in your replies.
* Commands for automated functions (including "signoff lowlands-l") are
  to be sent to listserv at listserv.linguistlist.org or at
  http://linguistlist.org/subscribing/sub-lowlands-l.html.
======================================================================


