<html><head>
</head>
<body>
Dear Toolboxers and Lexicographers,<br><br><br>I hope it is not inappropriate to put this question to both the Toolbox and the lexicography groups (apologies for the cross-posting). Like other questions posted only in the Toolbox group, this one came up while transferring an older, idiosyncratic lexical database to the MDF standard. I guess that some of my questions might be of interest to other users with similar needs or wishes, and perhaps something could be included in future versions of MDF.<br>
<br>One such case is the representation of part-of-speech subcategories.<br><br>This post deals with Toolbox (Standard Format, MDF) databases in particular, but
I would also like to know what solutions other, related tools offer
(in particular, LexiquePro). I am especially interested in exporting to and from Toolbox.<br>
<br><br>In an older (idiosyncratic and inconsistent, but MDF-based) system, I used a combination of the standard <b>\ps</b> field (obligatory) and, in many cases, a <b>\pss</b> field (<i>"part of speech subcategory"</i>, optional). In the first I (ideally) put only abbreviations for the major word classes -- in my case, <b>adv</b>(erb), <b>id</b>(eophone), <b>int</b>(er)<b>j</b>(ection), <b>n</b>(oun), <b>num</b>(eral), <b>part</b>(icle), <b>p</b>(ost)<b>p</b>(osition), <b>pron</b>(oun), <b>v</b>(erb). <br>
In the second, I put labels for subcategories, such as <b>inal</b>(ienable) etc. (for nouns), <b>dem</b>(onstrative), <b>pers</b>(on) (for pronouns), <b>i</b>(n)<b>tr</b>(ansitive), <b>st</b>(ative), <b>tr</b>(ansitive) (for verbs). Some of these apply to several major classes (cross-classification), for instance <b>inter</b>(rogative) or <b>neg</b>(ation) (for adverbs, pronouns, particles, ...).<br>
<br>Every abbreviation in both fields must be linked (via a jump path) to a corresponding entry in another database, where I give the full name and explain exactly what the label stands for and which properties the words in question have.<br>
<br>If I use only MDF fields, I have just one <b>\ps</b> field (plus <b>\pn</b> for the national language). As far as I can see, I have several options for organizing the same information:<br><br>1) one complex abbreviation in the <b>\ps</b> field, such as <b>n.inal</b> or <b>v.tr</b>.<br>
<br>2) several separate abbreviations in the <b>\ps</b> field.<br><br>3) main word classes in <b>\ps</b>, as before, subcategories in another field.<br><br><br>None of these fully satisfies me, for at least the following reasons:<br>
<br><i><b>A)</b></i> With (3), I have the problem of choosing an appropriate field. <b>\pd</b>
seems to be an obvious option (many of the subcategories are indeed relevant
for the paradigm structure). However, the subcategory abbreviation will then be put at
the end of the entry and will be formatted with a label such as
"<i>Parad:</i>", which does not make sense, or is at least counter-intuitive,
for those subclasses that are more semantic in nature.<br>
<br><i><b>B)</b></i> With (2), I would have to add punctuation manually after the first (main) word class. This causes potential problems for consistency, for defining range sets etc.<br><br><i><b>C)</b></i> (1) and (2) are much clumsier for interlinearization, filtering and sorting. <br>
<i>C1)</i> Interlinearization: In the part-of-speech line, the whole complex label of (1) can be too much information (if only for formatting reasons), and (2) does not interlinearize well at all, or at least produces fields with internal spaces (if I define the data type as "single item"), which are painful to export to other formats such as ELAN's <b>eaf</b> files.<br>
<i>C2)</i> Filters that refer to word classes will be much more difficult to formulate correctly.<br><i>C3)</i> Sorting: Sometimes I just want to sort by major word class, searching, say, for verbs ending in a certain letter. With combined labels, I get one internally alphabetized group of verbs per subcategory combination instead of a single list.<br clear="all">
<br><i><b>D)</b></i> With (1), I would create many complex labels which are
hard to administer and which are, in the printed dictionary, much less
attractive and harder to read than separate abbreviations. True, many
subcategories apply to only one major word class anyway; but this
does not hold for others such as <b>inter</b> or <b>neg</b> (see above).<br><br><br>I guess that a solution could be set up using appropriate cc-tables or some other mechanism that does replacements with regular expressions: either by splitting fields automatically for sorting, jumping, interlinearization and similar functions, or by joining fields (such as the MDF <b>\ps</b> and my <b>\pss</b> field) for formatting and printing. <br>
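<br>To make the splitting and joining idea concrete, here is a minimal sketch in Python rather than a cc-table. It assumes that each <b>\pss</b> line directly follows its <b>\ps</b> line and that a period is the separator in the complex label; the function names and the sample record are invented for illustration and are not part of Toolbox or MDF.<br>

```python
import re

# One Standard Format line: backslash, marker, optional value.
FIELD = re.compile(r"^\\(\w+)\s*(.*)$")


def parse_record(text):
    """Parse one record into a list of (marker, value) pairs.

    Lines that do not start with a backslash marker are skipped
    (a simplification; real data may have continuation lines).
    """
    fields = []
    for line in text.strip().splitlines():
        m = FIELD.match(line)
        if m:
            fields.append((m.group(1), m.group(2).strip()))
    return fields


def join_ps(fields, sep="."):
    """Merge a pss field into the ps field right before it (n + inal gives n.inal)."""
    out = []
    for marker, value in fields:
        if marker == "pss" and out and out[-1][0] == "ps":
            out[-1] = ("ps", out[-1][1] + sep + value)
        else:
            out.append((marker, value))
    return out


def split_ps(fields, sep="."):
    """Split a complex ps label back into ps (major class) and pss (the rest)."""
    out = []
    for marker, value in fields:
        if marker == "ps" and sep in value:
            major, sub = value.split(sep, 1)
            out.append(("ps", major))
            out.append(("pss", sub))
        else:
            out.append((marker, value))
    return out


def major_class(ps_value, sep="."):
    """Return only the major word class, e.g. for sorting or filtering."""
    return ps_value.split(sep, 1)[0]


# Made-up sample entry, for illustration only.
record = r"""
\lx po
\ps n
\pss inal
\ge hand
"""

fields = parse_record(record)
joined = join_ps(fields)        # for formatting and printing
restored = split_ps(joined)     # for sorting, filtering, interlinearization
```

<br>With something like this, the database could be kept in one form (separate fields), and the complex labels would be generated only for export or printing.<br>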
But this still has the disadvantage of being difficult to set up in a general and sustainable way, and of requiring me to keep track of different versions of the 'same' database for different purposes.<br><br><br>How do you all represent and organize this kind of information? <br>
What would you recommend?<br><br>With your solution, what happens if you export MDF databases to LexiquePro, LEXUS or other formats, and back to Toolbox?<br><br>Thank you in advance<br><br>Sebastian<br><br>-- <br>| Sebastian Drude (Linguist)<br>
| <a href="mailto:Sebastian.Drude@fu-berlin.de" target="_blank">Sebastian.Drude@fu-berlin.de</a> & <a href="mailto:Sebastian.Drude@googlemail.com" target="_blank">Sebastian.Drude@googlemail.com</a><br>| <a href="http://www.germanistik.fu-berlin.de/il/pers/drude-en.html" target="_blank">http://www.germanistik.fu-berlin.de/il/pers/drude-en.html</a><br>
<br>
</body>
</html>