The primacy of research on written language serves the practical goals of cultivating, standardising and teaching forms of written language. However, written language is regulated and restricted in most of its uses: language users follow institutionally defined rules of the standard language and rely on editors and proofreaders to control their linguistic expression.
Spoken language, especially when spoken spontaneously, reflects natural human linguistic behaviour much better, exposes the great variety of language on all linguistic levels, and reveals additional levels of language (e.g., phonological and prosodic) or language use which are not present in the written modality. Research on spoken language is therefore necessary if linguistics is to have a comprehensive knowledge and understanding of its subject of study.
To perform basic research on spoken language or speech technologies with significant scientific impact, the scarcity of spoken language resources needs to be addressed first, especially for low-resourced languages. Freely available, open spoken language resources will stimulate research on spoken language and its use, as well as the development of speech technologies. However, the development of spoken language resources is not only a matter of applied data collection; it also opens up a number of basic research questions. These research questions will be addressed in the MEZZANINE project (teMeljnE raZiskave Za rAzvoj govorNih vIrov in tehNologij za slovEnščino / Basic Research for the Development of Spoken Language Resources and Speech Technologies for the Slovenian Language), with a focus on the Slovenian language.