The core of the project is the creation of a corpus of Torlak vernaculars, based on existing and newly collected recordings of field interviews.
The corpus will be used for quantitative linguistic studies that examine how language production is shaped by the different types of boundaries, and for qualitative cultural studies.
An additional project outcome will be the development of new tools, and/or the improvement of existing ones, for corpus linguistics and natural language processing. These tools will also be applicable to cultural studies.