[Lexicog] part-of-speech subcats

Ronald Moe ron_moe at SIL.ORG
Wed Mar 18 23:41:40 UTC 2009


Sebastian Drude wrote:

"I prefer very much the simple Standard Format (text) files toolbox produces
and which I can easily manipulate with a general text editor (or CC
tables)."

 

I just looked at the interlinear export options in FLEx. It appears that it
primarily exports XML, but it also exports to HTML and Open Office Writer. It
has quite a few XML options as well, including export in Word format. There
is a FLEx email discussion list similar to the Lexicography List that is
very active and helpful in answering users' questions.

I understand your liking for small files and for data that can be
manipulated by CC tables. I've been using CC for 25 years, but very few
people use it anymore. I still maintain my DDP list of semantic domains in
Toolbox (and my address list), but FLEx is far more efficient in developing
and editing a dictionary. I used to use CC tables to develop and manipulate
lexical data, but got very tired of my inability to interact with what the
CC tables were doing. They were also hard to write. The Bulk Edit tools in
FLEx enable me to do what I used to do with CC tables, only now I can
interact with them. They are also easier to use. (I was the one who
suggested that FLEx needed the Bulk Edit tools and I helped to design them.)
However there are some tasks that cannot be done with FLEx (or Toolbox) that
can be done with CC tables (or some other similar process such as Perl or
Python).
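
As a rough illustration of the kind of task that is easy to script outside FLEx or Toolbox, here is a minimal Python sketch that reads Standard Format text into records and reports which entries lack a given marker. The function names, the sample data, and the choice of \lx as the record marker are assumptions made for the example; a real Toolbox file may have header lines and a different marker set.

```python
from collections import Counter

def parse_sf(text, record_marker="lx"):
    """Split Standard Format text into records: lists of (marker, value) pairs."""
    records = []
    for line in text.splitlines():
        if line.startswith("\\"):
            marker, _, value = line[1:].partition(" ")
            # A new record starts at each record marker (or at the very first field).
            if marker == record_marker or not records:
                records.append([])
            records[-1].append((marker, value.strip()))
        elif line.strip() and records and records[-1]:
            # A non-empty line without a marker continues the previous field.
            m, v = records[-1][-1]
            records[-1][-1] = (m, (v + " " + line.strip()).strip())
    return records

def missing_marker(records, marker):
    """Return the first field value (the headword) of records lacking `marker`."""
    return [rec[0][1] for rec in records if marker not in {m for m, _ in rec}]

# Hypothetical two-entry lexicon; the second entry has no \ps field.
sample = "\\lx kata\n\\ps n\n\\ge word\n\n\\lx lari\n\\ge run\n"
records = parse_sf(sample)
marker_counts = Counter(m for rec in records for m, _ in rec)
print(missing_marker(records, "ps"))
print(marker_counts)
```

Because the records are plain Python lists, the same structure can be used for renaming markers, reordering fields, or writing the data back out as text.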

One of the best things that is happening these days in the realm of
lexicography software is the development of standards for marking up lexical
data. SIL has developed the Lexicon Interchange FormaT (LIFT) standard
that is being used to facilitate moving data from one software package to
another. LIFT is an XML format. FLEx, WeSay, and Lexique Pro already support
it. It would be nice if Toolbox were modified so that you could export your
dictionary in LIFT format, but I doubt that will ever happen. (See
below.) There is also work going on right now on a wider (international)
scale to set up data standards so that dictionaries can be posted to the web
in a single standardized format. This will enable software designers to
write software that will work on any dictionary that is formatted using the
standard. It will also enable us to post lots of different dictionaries from
different languages on a website and search all the dictionaries for various
kinds of information. It would be nice if there were a data standard for
interlinearized text that would enable us to easily create it, work with it,
and transfer it from program to program.
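
To give a feel for why an XML standard makes this kind of sharing easier, here is a small Python sketch that pulls headword, part of speech, and gloss out of a LIFT-style fragment using only the standard library. The fragment is hand-written and heavily simplified; the element names follow LIFT as I understand it, but it has not been validated against the schema, and the language code "xx" is a placeholder.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written LIFT-style fragment (an assumption, not schema-checked).
LIFT_SAMPLE = """<lift version="0.13">
  <entry id="kata">
    <lexical-unit><form lang="xx"><text>kata</text></form></lexical-unit>
    <sense>
      <grammatical-info value="n"/>
      <gloss lang="en"><text>word</text></gloss>
    </sense>
  </entry>
</lift>"""

def read_entries(lift_xml):
    """Return one (headword, part of speech, gloss) tuple per sense."""
    root = ET.fromstring(lift_xml)
    entries = []
    for entry in root.iter("entry"):
        headword = entry.findtext("lexical-unit/form/text", default="")
        for sense in entry.iter("sense"):
            info = sense.find("grammatical-info")
            pos = info.get("value", "") if info is not None else ""
            gloss = sense.findtext("gloss/text", default="")
            entries.append((headword, pos, gloss))
    return entries

print(read_entries(LIFT_SAMPLE))
```

Any program that can read this structure can work with a dictionary exported by another program, which is the point of agreeing on a standard.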

As far as I know, there are no plans to further develop Toolbox. I believe
the reasons are (1) it is fairly stable as it is (it works well and has few
bugs), (2) it is old technology and therefore difficult to improve, and (3) the
basic design of the program has some inherent flaws (e.g. it does not
constrain the structure of the data, with the result that every Toolbox file
gets messed up over time). It is for these reasons that various people have
developed add-ons such as MDF, Lexique Pro, and SOLID to make up for its
defects.

The bottom line is that every software package does some things well, other
things poorly, and other things not at all.

Ron Moe

 

  _____  

From: lexicographylist at yahoogroups.com
[mailto:lexicographylist at yahoogroups.com] On Behalf Of Sebastian Drude
Sent: Wednesday, March 18, 2009 12:36 PM
To: lexicographylist at yahoogroups.com
Subject: Re: [Lexicog] part-of-speech subcats

 

Thank you Ronald and Alan,


for your helpful replies to my question.  It seems that, if I want to use
Toolbox the way I intend, continuing to use a non-standard (self-made)
\pss field will probably work best; but yes, I will indeed have to
pre-process the dictionary each time I want to print or export.
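
A pre-processing step of this kind could look roughly like the following Python sketch, which folds each \pss value into the preceding \ps line so that only standard markers remain. The output form "\ps n (mass)", the function name, and the rule that a new \lx ends the scope of a \ps are assumptions chosen for illustration, not a Toolbox or MDF convention.

```python
def fold_pss(lines, sub_marker="pss", pos_marker="ps"):
    """Merge each \\pss value into the most recent \\ps line of the same entry."""
    out = []
    last_pos = None  # index in `out` of the current entry's latest \ps line
    for line in lines:
        marker, _, value = line.partition(" ")
        if marker == "\\" + pos_marker:
            last_pos = len(out)
            out.append(line)
        elif marker == "\\" + sub_marker and last_pos is not None and value.strip():
            # Fold the subcategory into the \ps line and drop the \pss line.
            out[last_pos] = f"{out[last_pos].rstrip()} ({value.strip()})"
        else:
            if marker == "\\lx":
                last_pos = None  # a new entry: never merge across entries
            out.append(line)  # an orphan \pss is kept unchanged
    return out

# Hypothetical entries; the second has a \pss with no \ps to attach to.
entry_lines = ["\\lx kata", "\\ps n", "\\pss mass", "\\ge word",
               "\\lx lari", "\\pss intr"]
print("\n".join(fold_pss(entry_lines)))
```

Running this on a copy of the database just before printing or exporting leaves the working file with its \pss fields intact.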

I am aware of FieldWork, thank you, Ronald, and will try its import-process
a.s.a.p.  Maybe that will also help to "clean" my lexical databases without
having to try out SOLID.

I have not used FieldWork yet because I am not comfortable with a
400 MB program (with .NET and SQL, more than 880 MB seems to be needed to
install it all) which saves its data solely in a relational database of
several hundred MB.
I prefer very much the simple Standard Format (text) files toolbox produces
and which I can easily manipulate with a general text editor (or CC tables).

Furthermore, I use ELAN <http://www.lat-mpi.eu/tools/elan> for storing and
publishing texts, which are mostly annotated media files, and ELAN accepts
(with difficulties) interlinearized SF files as import and export options.
So I rely on Toolbox for doing the interlinearization and then transfer the
annotation to EAF (ELAN Annotation Format).  But FieldWork does not (yet?)
permit transcription and annotation of video files and, worse, cannot export
interlinearized text to SF files.  So, how do I get my text data out of
FieldWork (I am not going to program an XSL stylesheet to convert the
FieldWork-generated XML files to SF)?

Generally, I hope that Toolbox will be maintained and developed further as a
complementary tool to FieldWork.

Best,

Sebastian

-- 
| Sebastian Drude (Linguist)
| Sebastian.Drude at fu-berlin.de & Sebastian.Drude at googlemail.com
| http://www.germanistik.fu-berlin.de/il/pers/drude-en.html


