[Lingtyp] A list of 50 basic sentences

JOO, Ian [Student] ian.joo at connect.polyu.hk
Sun May 9 08:43:41 UTC 2021


Dear Matías,

That is a good point, but I would ask: isn't word order one of the most important parts of syntax?
So, for example, if we compare two languages that use the same words, but one in SVO order and the other in OVS order, the distance will be 2 (two transpositions).
If we compare two languages that both have the same SVO structure, but happen to use different words for V and O, the distance will be 2 (two substitutions).
In this situation, can we really say that the two SVO languages are much more similar to each other than the SVO and OVS languages are? I think that is debatable.
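The two comparisons above can be checked with a small sketch. It assumes plain Levenshtein distance over word tokens, where insertion, deletion and substitution each cost 1; the thread does not say which R function or edit-distance variant was actually used, so this is an illustration, not the original calculation. The tokens S, V, O, V2 and O2 are hypothetical placeholders for words. Note that plain Levenshtein distance has no transposition operation, so under this assumption the value 2 for SVO vs. OVS comes out as two substitutions (at the first and last position).

```python
def levenshtein(a, b):
    """Levenshtein distance between two sequences.

    Insertion, deletion and substitution each cost 1.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical token sequences: S, V, O are shared words,
# V2 and O2 are different words filling the same slots.
svo = ["S", "V", "O"]
ovs = ["O", "V", "S"]
svo_other_words = ["S", "V2", "O2"]

print(levenshtein(svo, ovs))              # 2
print(levenshtein(svo, svo_other_words))  # 2
```

Both pairs come out at distance 2, which is the tie the question above is about.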

Regards,
Ian
On 9 May 2021, 4:36 PM +0800, Matías GN <mortem.dei at gmail.com>, wrote:
Dear Ian,

Using edit distances makes little sense because edit distances are highly sensitive to ordering.
They are, after all, used for finding optimal alignments.
The distance between abcde and edcba is 4, which is the same as the distance between abcde and afglk, but abcde and edcba are clearly much more similar to each other than abcde is to afglk: they consist of exactly the same symbols, only in reverse order.
Whatever you end up measuring will be heavily biased towards word order.
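As a rough illustration of this point, the sketch below compares plain Levenshtein distance with an order-insensitive measure on the three strings mentioned. The order-insensitive measure (`bag_distance`, the size of the multiset symmetric difference) is not from the thread; it is just one possible choice, picked here as an assumption to show what ignoring order would look like.

```python
from collections import Counter


def levenshtein(a, b):
    """Levenshtein distance (insert, delete, substitute; each costs 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def bag_distance(a, b):
    """Order-insensitive distance (illustrative assumption, not from the thread).

    Counts the symbols that occur in one sequence but not in the other,
    i.e. the size of the multiset symmetric difference.
    """
    ca, cb = Counter(a), Counter(b)
    return sum(((ca - cb) + (cb - ca)).values())


for other in ("edcba", "afglk"):
    print(other, levenshtein("abcde", other), bag_distance("abcde", other))
# edcba 4 0
# afglk 4 8
```

The edit distance cannot tell the two pairs apart (4 in both cases), while the order-insensitive measure separates them completely (0 vs. 8).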

Best,

On Sun, 9 May 2021 at 10:23, JOO, Ian [Student] (ian.joo at connect.polyu.hk) wrote:
Dear Sandra,

I think those are good points.
I agree that there could be a translation bias, and the idea is to elicit the most natural, preferred style of utterance from the speaker. As Alex François recommended, it may be better to use a dialogue rather than isolated sentences, to elicit the most natural way of speaking and avoid translation bias. And I agree that 50 sentences is a small number and that the list should be expanded.
As for the standardization of glosses, I try to make them as uniform as possible, for example glossing all copulas as COP rather than 'be', based on the Leipzig Glossing Rules and other common practices.
As for example 1, I forgot to mention that there is another counting measure, substitution: NOM is "substituted" by COP because they are in the same position, so that counts as 1. I'm not 100% sure what counts as the same position; I just let R calculate that, so I should have a closer look into it.

Regards,
Ian
On 9 May 2021, 3:25 PM +0800, Sandra Auderset <sandrauderset at gmail.com>, wrote:
Hi Ian,

Following up on Hartmut, Yunfan and others, I have some questions:
 • What do you do with variation? I’m not familiar with the languages you work on, but ‘Tense Future’ in German could be translated as “Ich werde morgen gehen” or “Ich gehe morgen”. The latter would be more frequent in spoken language, but you might get the former because of translation bias. Would you include both? You say that your method accounts for the choice of the speaker, but again I wonder if this isn’t just translation bias.
 • You say that this method has the advantage of including more frequently observed features. I wonder how you know whether that’s the case or not? Do you mean in spoken or written language? As Yunfan pointed out, with 50 sentences you might easily miss some common features.
 • How do you standardize the glosses? For example, how do you decide whether something should be glossed as ‘be’ or copula? That seems important to me, since glossing is very subjective and you might inadvertently bias the whole calculation, especially since you already wrote up the conclusion.
 • Lastly, I find it odd that Example 2) is calculated as having distance 1. To me, there are two differences: presence/absence of nominative and the presence/absence of a copula. How do you determine that the copula is in the same slot as the nominative for calculation?

Best,
Sandra


Sandra Auderset <https://sauderset.github.io/>
PhD Candidate | [she/her]
Department of Linguistic and Cultural Evolution
MPI for Evolutionary Anthropology
&
Department of Linguistics
University of California Santa Barbara
On Saturday, May 08, 2021 at 19:16, Hartmut Haberland <hartmut at ruc.dk> wrote:
Dear Ian,

I have a few comments.

I was wondering about

Genitive Alienable
Genitive Inalienable

Is it a good idea to use ‘genitive’? Would ‘possessive’ not be better?

Also I wonder about languages like Finnish which express the contrast between definiteness and indefiniteness by word order:

Auto on kadulla. ‘The car is in the street.’
Kadulla on auto. ‘There is a car in the street.’ (-lla is the adessive case.)

Also think of Italian

La macchina è rotta.
È rotta la macchina.

both ‘The car is broken’, but are answers to different questions (Where is your car?, Why are you late?, resp.); same (SV vs. VS) in Greek. How would you get these results?

Best, Hartmut

From: Lingtyp <lingtyp-bounces at listserv.linguistlist.org> On behalf of JOO, Ian [Student]
Sent: 8 May 2021 15:08
To: LINGTYP <lingtyp at listserv.linguistlist.org>
Subject: [Lingtyp] A list of 50 basic sentences
Dear all,

I am trying to make a list of 50 basic sentential meanings.
The goal is to make parallel corpora of different languages based on this list of sentences.
Each sentence on the list serves to check whether a language has a given grammatical feature, and if so, in what form the language expresses it.
When creating each sentence, I tried to limit its vocabulary to basic words that are found in most languages, avoiding culture-specific words.
I would appreciate it if you could have a look at the attached file and advise what I should add/remove/modify.

From Hong Kong,
Ian
Disclaimer:

This message (including any attachments) contains confidential information intended for a specific individual and purpose. If you are not the intended recipient, you should delete this message and notify the sender and The Hong Kong Polytechnic University (the University) immediately. Any disclosure, copying, or distribution of this message, or the taking of any action based on it, is strictly prohibited and may be unlawful.

The University specifically denies any responsibility for the accuracy or quality of information obtained through University E-mail Facilities. Any views and opinions expressed are only those of the author(s) and do not necessarily represent those of the University and the University accepts no liability whatsoever for any losses or damages incurred or caused to any party as a result of the use of such information.

_______________________________________________
Lingtyp mailing list
Lingtyp at listserv.linguistlist.org
http://listserv.linguistlist.org/mailman/listinfo/lingtyp