[Lexicog] online publishing

Piotr Bański bansp at O2.PL
Sat Apr 23 00:43:58 UTC 2011


List members interested in distributing their dictionaries under a free
license ("free" as in "free speech", which in this case means primarily
GNU GPL) may want to check out the FreeDict project [1]. It's hosted by
SourceForge.net, which settles the issue of a free (as in "free beer")
distribution platform and additionally opens dictionaries to further
re-use by language-documentation and language-processing communities.

The source format is TEI P5 XML [2] (in many possible variants, from a
word-by-word glossary to a professional translation dictionary). The
source XML can then be converted into formats such as DICT or StarDict,
or practically any other; what is feasible depends mostly on the
complexity of the source. A data exchange initiative between FreeDict
and the Apertium machine translation project [3] is also slowly getting
under way.

[1]: http://sourceforge.net/projects/freedict/
[2]: http://www.tei-c.org/
[3]: http://sourceforge.net/projects/apertium/
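
To give a rough idea of what working with such a source looks like, here
is a minimal Python sketch that reads a tiny TEI fragment and pulls out
headword/translation pairs. It is only an illustration, not the FreeDict
conversion tooling: the sample entries, the function names and the
tab-separated output are assumptions made for this example, and the
fragment omits the teiHeader that a real TEI file would carry.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace; the "tei" prefix is just a local alias.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical two-entry fragment (no teiHeader, so not a complete TEI file).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <entry>
      <form><orth>Haus</orth></form>
      <sense><cit type="trans"><quote>house</quote></cit></sense>
    </entry>
    <entry>
      <form><orth>Baum</orth></form>
      <sense><cit type="trans"><quote>tree</quote></cit></sense>
    </entry>
  </body></text>
</TEI>"""


def extract_pairs(tei_xml):
    """Return a list of (headword, [translations]) for each <entry>."""
    root = ET.fromstring(tei_xml)
    pairs = []
    for entry in root.iterfind(".//tei:entry", TEI_NS):
        orth = entry.find("tei:form/tei:orth", TEI_NS)
        if orth is None or not orth.text:
            continue  # skip entries without a usable headword
        quotes = entry.iterfind(".//tei:cit[@type='trans']/tei:quote", TEI_NS)
        translations = [q.text.strip() for q in quotes if q.text]
        pairs.append((orth.text.strip(), translations))
    return pairs


def to_glossary_lines(pairs):
    """Flatten the pairs into simple 'headword<TAB>translations' lines."""
    return ["%s\t%s" % (hw, ", ".join(tr)) for hw, tr in pairs]


if __name__ == "__main__":
    for line in to_glossary_lines(extract_pairs(SAMPLE)):
        print(line)
```

A richer source (grammatical information, several senses, examples)
would need correspondingly more handling, which is what "depending on
the complexity of the source" amounts to in practice.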

FreeDict has over 70 dictionaries now, and is about to become a testing
ground for a standardization initiative linking the TEI
dictionary-encoding guidelines with the ISO TC 37 SC 4 Lexical Markup
Framework standard [4], as soon as one of its admins has finished
editing a certain book that has been delayed for much too long (a matter
of around two months, hopefully). But that standardization issue needn't
keep anyone from contributing to the project :-) I just mention it to
stress that while the project list has been silent for a month or so,
the project is very much alive under the surface. Some more -- slightly
outdated, as usual -- information can be found in the project wiki.[5]

[4]: http://www.tc37sc4.org/
[5]: http://sourceforge.net/apps/mediawiki/freedict/

Everyone's cordially invited to visit and possibly contribute.
In most cases, transcoding into XML isn't difficult -- for one thing, a
LIFT-to-FreeDictTEI conversion is rather trivial, and transcoding from
OpenOffice or plain-text formats has been done as well.
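
As a hedged illustration of the plain-text case, the following Python
sketch turns tab-separated "headword<TAB>translation" lines into TEI
entry elements. The input layout, the function names and the exact
element structure chosen here are assumptions for the sake of the
example; an actual FreeDict submission would also need a teiHeader and
should follow the project's own conventions.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def tei(tag):
    """Qualify a tag name with the TEI namespace for ElementTree."""
    return "{%s}%s" % (TEI_NS, tag)


def line_to_entry(line):
    """Build one <entry> from a 'headword<TAB>translation' line.

    Raises ValueError if the line contains no tab character.
    """
    headword, translation = line.rstrip("\n").split("\t", 1)
    entry = ET.Element(tei("entry"))
    form = ET.SubElement(entry, tei("form"))
    ET.SubElement(form, tei("orth")).text = headword.strip()
    sense = ET.SubElement(entry, tei("sense"))
    cit = ET.SubElement(sense, tei("cit"), {"type": "trans"})
    ET.SubElement(cit, tei("quote")).text = translation.strip()
    return entry


def glossary_to_body(lines):
    """Wrap all non-blank lines as entries inside a <body> element."""
    body = ET.Element(tei("body"))
    for line in lines:
        if line.strip():
            body.append(line_to_entry(line))
    return body


if __name__ == "__main__":
    # Serialize with TEI as the default namespace instead of an ns0: prefix.
    ET.register_namespace("", TEI_NS)
    body = glossary_to_body(["Haus\thouse\n", "Baum\ttree\n"])
    print(ET.tostring(body, encoding="unicode"))
```

The resulting body element can then be placed inside a full TEI
document by hand or by script.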

Best regards,

  Piotr (co-admin of FreeDict)


On 22.04.2011 21:55, Bill Poser wrote:
>
> I think that Ron's idea of a web site for distributing dictionaries is a
> good one, and it shouldn't be too hard to set up. There are, however, a
> couple of issues. One is cost. Running a simple web site does not cost a
> lot, but it does cost something, unless you have an institution that
> will host it for free. A basic web site at a commercial ISP in North
> America is typically on the order of US$60 per year, plus the cost of
> the domain registration, so let's say US$75 per year total. If there are
> a lot of dictionaries, especially ones with images and/or audio, costs
> will go up because of the need for more storage and bandwidth (assuming
> that there are lots of downloads). It's hard to estimate, but since
> we're presumably talking about relatively obscure languages, it doesn't
> seem likely that such a site would run to more than a few hundred
> dollars a year. That isn't a huge amount, but somebody has to pay it.
> Rather than requiring posters to pay, which would deter some and be
> administratively a pain, it would probably be best just to raise that
> from a few donors.
> 
> Secondly, allowing anyone to post freely presents a couple of issues.
> One is that of spam. Some bot may come along and try to post a lot of
> junk. The usual solution to this is a captcha to force the user to prove
> that he or she is human. I don't like them, myself, as I seem to have
> considerable difficulty proving that I am human and often require
> several tries, but that may be just me.
> 
> The other issue is inappropriate posts. Sooner or later, somebody is
> going to come along and post a large junk file or pornography or
> something like that, or a scan of a copyrighted, published dictionary.
> My guess is that this kind of thing won't happen real often, but
> somebody is going to have to be responsible for keeping an eye on the
> site, removing obvious problems, and responding to take-down notices for
> copyright infringement.

