[Lingtyp] New release IMTVault with 123k examples and concept tagging
Sebastian Nordhoff
sebastian.nordhoff at glottotopia.de
Thu Jan 9 19:28:25 UTC 2025
Dear list members,
thanks to Robert Forkel and Open Text Collections, we can now report
that IMTVault contains more than 123,000 linguistic examples.
You can browse the examples
- by language (e.g. Akan https://imtvault.org/?languageName[0]=Akan),
- by category (e.g. Ergative https://imtvault.org/?cat[0]=ERG),
- by string (e.g. "evening" https://imtvault.org/?q=evening),
- NEW: by concept (e.g. Kitchenware https://imtvault.org/?entities[0]=Kitchenware).
You can also combine the queries. For sentences in Kagayanen that contain
an ergative marker and are about food, the query would be
https://imtvault.org/?languageName%5B0%5D=Kagayanen&entities%5B0%5D=Food&cat%5B0%5D=ERG
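
For those who prefer to build such links in a script, here is a minimal Python sketch. It only assembles URLs from the query parameters shown above (languageName, cat, entities, q); the function name build_query_url and its arguments are made up for illustration and are not part of IMTVault. Whether several values per filter can be passed with increasing indices ([1], [2], ...) is an assumption I have not verified against the site.

```python
from urllib.parse import urlencode

BASE_URL = "https://imtvault.org/"


def build_query_url(languages=(), concepts=(), categories=(), text=None):
    """Assemble an IMTVault browse URL from the filters seen in the examples.

    Hypothetical helper: the parameter names (languageName, entities, cat, q)
    are taken from the URLs in this message; indices above [0] are an
    unverified assumption.
    """
    params = []
    for i, name in enumerate(languages):
        params.append((f"languageName[{i}]", name))
    for i, concept in enumerate(concepts):
        params.append((f"entities[{i}]", concept))
    for i, category in enumerate(categories):
        params.append((f"cat[{i}]", category))
    if text:
        params.append(("q", text))
    # urlencode percent-encodes the square brackets as %5B and %5D
    return BASE_URL + "?" + urlencode(params)


# Kagayanen sentences with an ergative marker that are about food
print(build_query_url(languages=["Kagayanen"],
                      concepts=["Food"],
                      categories=["ERG"]))
```

The printed link should be identical to the combined query given above, since the percent-encoded brackets (%5B, %5D) are equivalent to the plain [ and ] used in the shorter examples.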
My hope is that this will be a nice tool for exploring linguistic diversity.
The concept extraction has its imperfections, which I would like to
minimize. If you come across examples which are tagged in a culturally
inappropriate way, please let me know. For instance, all examples
involving sorcery were also labelled as "pseudoscience". I have manually
removed this particular association, but this will not be the last
unfortunate link we discover.
If you have observations about the interface, about usability, or about
intuitive use, these would also be very welcome, as are all further
suggestions for improvement.
For each example, the original source is linked. If you find an
interesting example you want to use in your work, make sure to read the
original work and appreciate the context in which the example is found.
The original works are all open access.
Best wishes
Sebastian