[Lexicog] Lexical Relations vs. Etymology - using "idiom" in the \ps field
David Frank
david_frank at SIL.ORG
Thu Mar 6 19:36:22 UTC 2008
Cheryl --
One other thing I think we need is to see the raw standard format data (backslash codes) that went into producing this MDF-formatted output. Do you have a way of putting that into an e-mail? There clearly seems to be a problem, and someone who is more knowledgeable than I am about how Shoebox/Toolbox works might be able to tell you what the problem is right off, based on how the output looks. But it would be easier to help if we could see what the data going into MDF looks like. What I would want to see is which fields were used (for example \ps) and in what order they occur. If we could just see the data for aŋka, it would help us troubleshoot that type of problem.
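To illustrate the kind of listing that would help, here is a minimal Python sketch that reads standard format text and reports the field markers of one entry in the order they occur. The sample record is made up for illustration and is not the actual data; "pos1", "gloss-one" and "gloss-two" are placeholders, and the function names are hypothetical. It assumes each record starts with \lx and each field starts on its own line.

```python
def parse_sfm_records(text, record_marker="lx"):
    """Split standard format text into records: lists of (marker, value) pairs."""
    records = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("\\"):
            marker, _, value = line[1:].partition(" ")
            if marker == record_marker:
                # A new record begins at every record marker (assumed to be \lx).
                current = []
                records.append(current)
            if current is not None:
                current.append((marker, value.strip()))
        elif current:
            # A line without a backslash continues the previous field.
            prev_marker, prev_value = current[-1]
            current[-1] = (prev_marker, (prev_value + " " + line.strip()).strip())
    return records


def field_order(records, headword):
    """Return the markers of the first record whose \\lx value is `headword`."""
    for record in records:
        if record and record[0][1] == headword:
            return [marker for marker, _ in record]
    return None


# Made-up sample; only the headword and the "p" label come from the discussion.
SAMPLE = r"""
\lx aŋka
\ps pos1
\sn 1
\ge gloss-one
\sn 2
\ps p
\ge gloss-two
"""

print(field_order(parse_sfm_records(SAMPLE), "aŋka"))
```

A listing like this, pasted into an e-mail alongside the raw record, shows at a glance where each \ps sits relative to the sense numbers.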
As for the ones like ammo that are marked "idm" as the part of speech, I would say that putting "idm" as a part of speech is at least part of the problem. What would happen if you left the part of speech field blank for those subentries that are marked "idm"? MDF is set up to make a line break whenever there is a new part of speech label, and you don't want that.
I would think that if you wanted to mark subentries as being idioms, that would be less like marking them as nouns and verbs, and more like marking them as rare or crude. In other words, it could be considered a usage question rather than a grammatical question.
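As a rough sketch of that idea, the following Python function rewrites any part-of-speech field whose value is "idm" into a usage field, leaving all other lines alone. It is only an illustration under stated assumptions: the choice of \ue as the usage marker, the label text "idiom", and the function name are assumptions rather than anything confirmed in this thread, and the sketch does not address where a usage field needs to sit relative to the other fields for MDF to format it well.

```python
def move_idiom_label(lines, ps_marker="\\ps", usage_marker="\\ue", idiom_code="idm"):
    """Replace '\\ps idm' lines with a usage field; keep every other line unchanged.

    The usage marker and the label text are assumptions for illustration.
    """
    out = []
    for line in lines:
        marker, _, value = line.partition(" ")
        if marker == ps_marker and value.strip() == idiom_code:
            out.append(usage_marker + " idiom")
        else:
            out.append(line)
    return out


# Made-up subentry; the headword and gloss are placeholders.
example = ["\\se placeholder", "\\ps idm", "\\ge gloss"]
for line in move_idiom_label(example):
    print(line)
```

It would be wise to try this on a copy of the database and on a single entry first, to see whether the exported output then wraps the way you want.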
In addition to showing the standard format data behind a couple of these entries that MDF formatted in a way that you don't like, it would also help if you could manually adjust one of the improperly formatted entries, such as ammo, to make it look the way you think it ought to look, and then show us. But I suspect that you don't know how you would like it to look, and you only know that the way it does appear doesn't look right.
When I produced a vernacular language dictionary years ago, I didn't handle idioms this way and I didn't use Shoebox and MDF. I used a program that I designed myself, which was tailored to the demands of one specific language and to my own preferences. I have to admit that it was not worth the effort for me to make the program produce exactly what I wanted every time, with no tweaking of the program's formatted output before publishing the dictionary. I had to make manual adjustments in some places. Sometimes that is the most reasonable solution. But you have problems that look like they could be helped.
I am not a Shoebox/Toolbox expert, but it looks to me as if one problem is that you should not have given the part of speech again for sense #2. Someone else reading this e-mail could probably confirm that. Giving a part of speech automatically triggers a line break. I don't know what part of speech "p" is, but that is beside the point.
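One way to find every place where this happens is sketched below in Python: it scans standard format text and flags each \ps that repeats within a single entry or subentry. The sample data is made up ("pos1" and the glosses are placeholders), the function name is hypothetical, and the use of \lx, \se and \sn as entry, subentry and sense markers is an assumption about how the database is set up.

```python
def repeated_ps_fields(sfm_text):
    """Return (headword, sense, ps_value) for each repeated part-of-speech field.

    The count restarts at every entry (lx) or subentry (se), since each of
    those can reasonably carry its own part of speech.
    """
    flagged = []
    headword = None
    sense = None
    ps_seen = 0
    for line in sfm_text.splitlines():
        if not line.startswith("\\"):
            continue
        marker, _, value = line[1:].partition(" ")
        value = value.strip()
        if marker in ("lx", "se"):
            headword, sense, ps_seen = value, None, 0
        elif marker == "sn":
            sense = value
        elif marker == "ps":
            ps_seen += 1
            if ps_seen > 1:
                flagged.append((headword, sense, value))
    return flagged


# Made-up sample; only the headword and the "p" on sense 2 come from the discussion.
SAMPLE_ENTRY = r"""
\lx aŋka
\ps pos1
\sn 1
\ge gloss-one
\sn 2
\ps p
\ge gloss-two
"""

print(repeated_ps_fields(SAMPLE_ENTRY))
```

Each flagged item is a candidate for the unwanted line break, though whether removing the repeated field is the right fix is something a Shoebox/Toolbox expert should confirm.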
-- David
----- Original Message -----
From: Cheryl Reitz
To: lexicographylist at yahoogroups.com
Sent: Thursday, March 06, 2008 12:45 PM
Subject: RE: [Lexicog] Lexical Relations vs. Etymology - using "idiom" in the \ps field
Many thanks, David. Although I haven't had a chance to prepare the raw data, I just made this 6-page excerpt (2MB) available for viewing/download at this link:
http://www.e-multiweb.com/secure/DikDik_20071219_excerpt.pdf
The "new-line" problem does not show up only for idioms, but for all parts of speech. Samples of the problem in this excerpt are:
ammo (2-idm, 3-idm, 4-idm), aŋka (2-p), baalo (3-idm, 4-idm), baala (2-auxv), baawo (2-idm).
We would prefer these members of the \ps Range Set to be "behaving" under the sense numbers like members of the \de or \ge Range Sets, using "wrap-around" behaviour. While it is probably not possible to do this for the various parts of speech, which really must be under \ps, the category idiom seems a little more like a separate definition, behaving like a \de member. You can see the "wrap-around" behaviour we are hoping to get in:
ammo (1-v), amo (1, 2), baalo (1, 2), baawo (1), baaŋja (1), babdo (1,2)
This is a display problem which I can hand-correct in an hour or so (maybe even create a macro), but of course would prefer to have it fixed once and for all instead of having to manually correct it each time we export the dictionary. Sorry, I'm only an amateur and may not know correct terminology for describing this problem.
Please note that this is very much a DRAFT, full of other minor problems as well, some of which we have corrected since the PDF and others we're still addressing. So we'd really appreciate it if you do not use this in any way other than to troubleshoot this issue.
Thanks so much.
Cheryl