[Lingtyp] A list of 50 basic sentences

Matías GN mortem.dei at gmail.com
Sun May 9 08:35:56 UTC 2021


Dear Ian,

Using edit distances makes little sense here, because edit distances are
highly sensitive to ordering.
They are, after all, used for finding optimal alignments.
The distance between *abcde* and *edcba* is 4, which is the same as the
distance between *abcde* and *afglk*. But *abcde* and *edcba* are clearly
*much* more similar than *abcde* is to *afglk*: the first pair consists of
the same five symbols in reverse order, while the second pair shares only
the initial *a*.
Whatever you end up measuring will be heavily biased towards word order.
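
The two distances quoted above can be checked with a short sketch. This is a
plain Levenshtein distance in Python with unit cost for insertion, deletion
and substitution. The function name `levenshtein` is only illustrative, and it
is an assumption that this is the same measure as the one computed in R
earlier in the thread.

```python
def levenshtein(a, b):
    """Levenshtein distance between two sequences (unit costs)."""
    # prev[j] is the distance between the already processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


print(levenshtein("abcde", "edcba"))  # 4: same symbols, reversed
print(levenshtein("abcde", "afglk"))  # 4: only the initial symbol is shared
```

Since the function accepts any sequences, it can also be run on lists of
gloss tokens instead of characters. The same effect shows up there: reordering
identical tokens is penalised just like replacing them with different ones.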

Best,

On Sun, 9 May 2021 at 10:23, JOO, Ian [Student] (
ian.joo at connect.polyu.hk) wrote:

> Dear Sandra,
>
> I think those are good points.
> I agree that there could be a translation bias, and the idea is to elicit
> the most natural, preferred style of utterance from the speaker. As Alex
> François recommended, it may be better to use a dialogue rather than
> isolated sentences, to elicit the most natural way of speaking and avoid
> translation bias. And I agree that the number of 50 sentences is small and
> it should be expanded.
> As for the standardization of glosses, I try to make them as uniform as
> possible, for example glossing all copulas as COP rather than 'be', based
> on the Leipzig glossing rules and other common practices.
> As for example 1, I forgot to mention that there is another counting
> measure, substitution - so NOM is "substituted" into COP, because they are
> in the same position, so that counts as 1. I'm not 100% sure what counts as
> the same position; I just let R calculate that, so I should take a closer
> look at it.
>
> Regards,
> Ian
> On 9 May 2021, 3:25 PM +0800, Sandra Auderset <sandrauderset at gmail.com>,
> wrote:
>
> Hi Ian,
>
> Following up on Hartmut, Yunfan and others, I have some questions:
>  • What do you do with variation? I’m not familiar with the languages you
> work on, but ‘Tense Future’ in German could be translated as “Ich werde
> morgen gehen” or “Ich gehe morgen”. The latter would be more frequent in
> spoken language, but you might get the former because of translation bias.
> Would you include both? You say that your method accounts for the choice of
> the speaker, but again I wonder if this isn’t just translation bias.
>  • You say that this method has the advantage of including more frequently
> observed features. I wonder how you know whether that’s the case or not? Do
> you mean in spoken or written language? As Yunfan pointed out, with 50
> sentences you might easily miss some common features.
>  • How do you standardize the glosses? For example, how do you decide
> whether something should be glossed as ‘be’ or copula? That seems important
> to me, since glossing is very subjective and you might inadvertently bias
> the whole calculation, especially since you already wrote up the conclusion.
>  • Lastly, I find it odd that Example 2) is calculated as having distance
> 1. To me, there are two differences: presence/absence of nominative and the
> presence/absence of a copula. How do you determine that the copula is in
> the same slot as the nominative for calculation?
>
> Best,
> Sandra
>
>
> *Sandra Auderset* <https://sauderset.github.io/>
> PhD Candidate | [she/her]
> Department of Linguistic and Cultural Evolution
> MPI for Evolutionary Anthropology &
> Department of Linguistics
> University of California Santa Barbara
>
> On Saturday, May 08, 2021 at 19:16, Hartmut Haberland <hartmut at ruc.dk>
> wrote:
> Dear Ian,
>
> I have a few comments. I was wondering about
>
> Genitive Alienable
> Genitive Inalienable
>
> Is it a good idea to use ‘genitive’? Would ‘possessive’ not be better?
>
> Also I wonder about languages like Finnish, which express the contrast
> between definiteness and indefiniteness by word order:
>
> Auto on kadulla. ‘*The car* is in the street.’
> Kadulla on auto. ‘There is *a car* in the street.’
> (-lla is the adessive case.)
>
> Also think of Italian
>
> La macchina è rotta.
> È rotta la macchina.
>
> both ‘The car is broken’, but they are answers to different questions
> (Where is your car?, Why are you late?, resp.); same (SV vs. VS) in Greek.
> How would you get these results?
>
> Best, Hartmut
>
> *From:* Lingtyp <lingtyp-bounces at listserv.linguistlist.org>
> *On behalf of* JOO, Ian [Student]
> *Sent:* 8 May 2021 15:08
> *To:* LINGTYP <lingtyp at listserv.linguistlist.org>
> *Subject:* [Lingtyp] A list of 50 basic sentences
>
> Dear all,
>
> I am trying to make a list of 50 basic sentential meanings.
> The goal is to make parallel corpora of different languages based on this
> list of sentences.
> Each sentence on the list serves to check whether a language has a given
> grammatical feature, and if so, in what form the language expresses it.
> When creating each sentence, I tried to limit its vocabulary to basic
> words that are found in most languages, avoiding culture-specific words.
> I would appreciate it if you could have a look at the attached file and
> advise what I should add/remove/modify.
>
> From Hong Kong,
> Ian
>
> *Disclaimer:*
>
> *This message (including any attachments) contains confidential
> information intended for a specific individual and purpose. If you are not
> the intended recipient, you should delete this message and notify the
> sender and The Hong Kong Polytechnic University (the University)
> immediately. Any disclosure, copying, or distribution of this message, or
> the taking of any action based on it, is strictly prohibited and may be
> unlawful.*
>
> *The University specifically denies any responsibility for the accuracy or
> quality of information obtained through University E-mail Facilities. Any
> views and opinions expressed are only those of the author(s) and do not
> necessarily represent those of the University and the University accepts no
> liability whatsoever for any losses or damages incurred or caused to any
> party as a result of the use of such information.*
> _______________________________________________
> Lingtyp mailing list
> Lingtyp at listserv.linguistlist.org
> http://listserv.linguistlist.org/mailman/listinfo/lingtyp