[Lingtyp] Making a cross-linguistic database of constructions

JOO, Ian [Student] ian.joo at connect.polyu.hk
Mon Jun 21 07:00:53 UTC 2021


Dear Adam, dear Juergen,

thank you for your thought-provoking comments. I will respond to the ones that I can address right now. (For the rest, I need some time to think.)
As to Adam’s question regarding the fixed order of each construction, the order of the elements within a construction is not necessarily fixed. Elements connected by a hyphen are fixed in their order, as they represent affixation, but elements separated by a space are not necessarily fixed in order. I simply write them in one specific order (the one most common in the language) for convenience.
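To make the hyphen/space convention concrete, here is a minimal sketch in Python. It assumes a structure is stored as a plain string; the function names and the example strings are hypothetical and are not taken from the actual database. Words (space-separated) are treated as freely ordered, while the morphemes inside a word (hyphen-separated) keep their order. The sketch does not handle "A|B" alternatives or parentheses.

```python
from collections import Counter


def parse_structure(structure: str) -> Counter:
    """Parse a structure string into a multiset of words.

    Each word becomes a tuple of morphemes, so affix order inside a
    word is preserved, while the order of the words themselves is
    discarded by the Counter.
    """
    words = [tuple(word.split("-")) for word in structure.split()]
    return Counter(words)


def same_up_to_word_order(a: str, b: str) -> bool:
    """True if two structures differ at most in the order of their words."""
    return parse_structure(a) == parse_structure(b)


# Hypothetical strings, for illustration only.
print(same_up_to_word_order("N1 N2-X V-X", "N2-X N1 V-X"))  # word order differs
print(same_up_to_word_order("N1 V-X", "N1 X-V"))            # affix order differs
```

Under these assumptions the first comparison counts as a match and the second does not, because only the hyphenated order is treated as meaningful.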
As to Juergen’s question regarding how this can be useful for detecting areality, I plan to combine this data with phonological and lexico-semantic data as well. With dozens of constructional parameters combined with dozens of phonological and lexico-semantic parameters, I will have close to 100 parameters, which (I believe) will be sizeable and balanced enough for detecting areality.

Regards,
Ian
On 21 Jun 2021, 2:01 AM +0800, Juergen Bohnemeyer <jb77 at buffalo.edu>, wrote:
Dear Ian — Two points, in no particular order.

First, I’m in a sense more interested in the comparisons that you aim to facilitate than in the structure of the database designed as a tool to facilitate these comparisons. It seems to me that the comparisons would heavily depend on the label in the first column. I assume the idea is that you slap a label like ‘causative’ or ‘comparative’ onto the constructions and then compare ‘causatives’ and only those to one another and ‘comparatives’ to other ‘comparatives’, etc. In other words, the label in the first column determines the “scope” of the comparisons, and in that sense is the key categorization on which the comparisons would hinge. Am I getting this right?

There are several potential issues I see with this. I mention them here, not to discourage you from your project, but simply to inject them as topics of discussion:

(i) Not every construction fits neatly under just one such semantic category label. Just consider, for instance, the enormous range of meanings expressed by adpositional constructions, multi-verb constructions, and clausal connective constructions. Or consider inflectional paradigm cells expressing simultaneously multiple inflectional categories.

(ii) More generally, and more abstractly stated, constructions that we might want to compare to one another in some respect or other vary as much in their semantics and pragmatics as they do in their morphosyntactic properties. I’m getting the sense that the approach you have in mind would unilaterally take semantics to be the basis of comparison, thereby moving it into the “background” of the comparison itself, so to speak, and thus removing it from the assessment of the similarity/distance among constructions/languages. It seems to me that an approach that treats constructions as bundles of morphosyntactic and semantic/pragmatic properties and allows both to enter the assessment of distance would be superior.

(But this raises a (to me) slightly mind-bending problem: what does one use as the criterion of individuation for determining what counts as a construction, i.e., as an entry in your database? This closely resembles the lemma problem of lexicography: a lemma is a string of sounds that has associated with it a set of related meanings. In actual fact, the string of sounds is subject to various kinds of variation. The lexicographer makes decisions as to which variants belong together and which constitute distinct entries represented by distinct lemmata.)

(iii) How does one decide on the level of granularity at which to compare constructions/languages? E.g., should all copular constructions be compared to all other copular constructions? Will locative predications be treated as distinct from copular constructions? It seems existential predications will?

There is, of course, a lot of literature on the problem of the relation between descriptive and typological categorization that is highly relevant to all of these issues.

The second point I wanted to briefly raise: what is this 'distance' that you hope to measure? It seems that it would be a typological measure of overall morphosyntactic similarity across languages. One could use this to examine how much such overall morphosyntactic similarity reflects genealogical and areal dependencies. But I would be hesitant to simply take a language’s placement in a crosslinguistic morphosyntactic similarity space as direct evidence of areal or genealogical relations.

(Unless, that is, we at some point want to reopen the question what it means for two languages to be genealogically related, which seems to have been considered settled by a mainstream of linguists since the days of the Neogrammarians.)

And, of course, the outcome of such an analysis would depend on the coverage of constructions. If you go with a relatively small set of constructions, ranging in the dozens, then the outcome might vary a great deal depending on which constructions you include. So how would you decide which to include?

Well, those are my thoughts. I hope it’s obvious that I think this is a very interesting idea! — Best — Juergen

On Jun 20, 2021, at 4:51 AM, JOO, Ian [Student] <ian.joo at connect.polyu.hk> wrote:

Dear all,

I am thinking about making a cross-linguistic database consisting of the morphosyntactic structures of a set of common constructions across different languages.
Below is a set of constructions and their structures in Korean and Mandarin, as an example.
I have added numbers to multiple elements (N1, N2…) to label them consistently across different languages. (For example, the N1 in the Korean causative and the N1 in the Mandarin causative both refer to the causer.)
“A|B” means A or B. X refers to an element that is variable but required (for example, V-X means that the verb needs some kind of suffix). Parentheses mark optional elements.
By compiling a database like this, I aim to measure the distance between the morphosyntactic structures of different languages.
In cases where a language has no corresponding construction, e.g. no passive construction, the slot would be left blank.
In cases where a language has more than one structure for a construction, I will insert more than one structure (hence the number column).
The database would also include one example for each structure, with an interlinear gloss.
I would greatly appreciate your opinion on the feasibility of this plan, and on whether such a database can be compiled in a meaningful way.
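
As an illustration of how blank slots and multiple structures might enter a distance measure, here is a minimal sketch in Python. The message does not specify a metric, so everything here is an assumption: each language is a dict from a construction label to a list of structure strings (an empty list for a blank slot), two structures are compared by Jaccard distance over their space-separated elements, the closest pair of structures is taken per construction, and constructions missing in either language are skipped. All names and example data are hypothetical.

```python
from itertools import product
from typing import Optional

# Construction label -> list of structure strings ([] = blank slot).
Database = dict[str, list[str]]


def structure_distance(a: str, b: str) -> float:
    """Jaccard distance over the space-separated elements of two structures."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    if not union:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(union)


def language_distance(lang_a: Database, lang_b: Database) -> Optional[float]:
    """Mean distance over the constructions attested in both languages.

    Returns None when the two languages share no attested construction.
    """
    scores = []
    for label in lang_a.keys() & lang_b.keys():
        structs_a, structs_b = lang_a[label], lang_b[label]
        if not structs_a or not structs_b:
            continue  # blank slot: nothing to compare
        # With several structures per construction, use the closest pair.
        scores.append(
            min(structure_distance(a, b) for a, b in product(structs_a, structs_b))
        )
    if not scores:
        return None
    return sum(scores) / len(scores)


# Hypothetical entries, for illustration only.
lang_a = {"causative": ["N1 N2 V-X"], "passive": []}
lang_b = {"causative": ["N1 X N2 V", "N1 N2 V-X"], "passive": ["N1 X V"]}
print(language_distance(lang_a, lang_b))
```

Skipping blank slots is only one option; treating the absence of a construction as maximally distant would be an equally defensible choice and would change the results.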

<스크린샷 2021-06-20 오후 4.45.38.png>

From Hong Kong,
Ian




Disclaimer:


This message (including any attachments) contains confidential information intended for a specific individual and purpose. If you are not the intended recipient, you should delete this message and notify the sender and The Hong Kong Polytechnic University (the University) immediately. Any disclosure, copying, or distribution of this message, or the taking of any action based on it, is strictly prohibited and may be unlawful.

The University specifically denies any responsibility for the accuracy or quality of information obtained through University E-mail Facilities. Any views and opinions expressed are only those of the author(s) and do not necessarily represent those of the University and the University accepts no liability whatsoever for any losses or damages incurred or caused to any party as a result of the use of such information.

_______________________________________________
Lingtyp mailing list
Lingtyp at listserv.linguistlist.org
http://listserv.linguistlist.org/mailman/listinfo/lingtyp
