[Lingtyp] Running an R code to analyze Phonotacticon

Ian Joo ian_joo at nucba.ac.jp
Mon May 22 15:38:07 UTC 2023


Dear all,

Since several of you have already kindly replied to offer me help, I thought it would be more practical to send my data and R script to everyone here. Please find attached the R Markdown script, the data files (Phonotacticon and PanPhon), and a draft of my thesis for your reference. Again, thank you for your kindness.

Regards,
Ian



> On May 22, 2023, at 4:30 PM, Ian Joo <ian_joo at nucba.ac.jp> wrote:
> Dear typologists,
> 
> For my doctoral project, I am compiling and analyzing Phonotacticon, a cross-linguistic database of basic phonotactic information.
> I have collected more than 350 lects in my database as of now, the goal being around 450.
> For my thesis, I have written an R script to analyze the phonological distances between Eurasian lects based on Phonotacticon. The code ran fine up to 200 lects or so, albeit with a running time of several hours. But now that the number of lects has grown (and with it the number of pairwise distances, which increases roughly with the square of the number of lects), my 2020 MacBook Pro with 16 GB of RAM can no longer run the code without crashing partway through.
> Perhaps it’s the hardware limit of my MacBook, or maybe I have written the code in an inefficient way. Either way, I need to run the code somehow to finish revising my thesis. I tried using virtual machines like Google Pro+ with 32 GB of RAM, but the code crashed there too.
> If any of you has a more powerful computer than mine and is experienced with R, I was wondering whether I could send you my R script and data so that you can run it on your computer and send me the results, or better yet, see whether anything is wrong with my R script so that I can fix it and run it on my own computer.
> I would much appreciate your help, which I direly need at this point.
> 
> From the Netherlands,
> Ian

