Work Packages
The MEZZANINE project is divided into 4 work packages, each containing 2 to 4 activities. Research in each activity follows the defined research questions. Experts from linguistics and technical sciences will cooperate in each work package.
WP1: Acquiring recordings of speech
Activities
Research questions
A1.1-I Spoken language resources in Linguistics and Technical Sciences
RQ1.1.1
What are the needs of different linguistic disciplines and technical sciences regarding spoken language resources?
RQ1.1.2
How well are the existing reference speech corpora balanced with regard to the covered spoken genres?
A1.2-I Advantages and disadvantages of different recording techniques
RQ1.2.1
What recording techniques are used to collect speech data, and what are the characteristics of data collected with particular techniques?
RQ1.2.2
What is the potential of crowdsourcing speech data in small communities, and how can it satisfy the needs of a diverse set of disciplines?
RQ1.2.3
What are the legal considerations when recording speech data or using existing speech data from different sources, and how can they be addressed?
A1.3-T Low-cost limited domain speech data for training a speech recogniser
RQ1.3.1
How should unsupervised or semi-supervised training of a speech recogniser be designed if speech data are only available for a specific domain?
RQ1.3.2
What is the optimal approach for constructing new speech data from the perspective of available low-cost speech data?
A1.4-T The effectiveness of knowledge transfer for different speech/speaker recognition tasks
RQ1.4
Which speech recognition tasks have the lowest potential for knowledge transfer from high-resourced languages to Slovenian?
WP2: Dialect variation
Activities
Research questions
A2.1-L Geolinguistic analysis of non-standard phonemes
RQ2.1
How reliable is the current version of Slovenian dialect phonetic transcription?
A2.2-L A spatial model of basic dialect areas of non-standard phonemes
RQ2.2
How can the spatial distribution of non-standard phonemes be determined?
A2.3-L Creation of diasystemic contrastive tables
RQ2.3
How can a spatial model be created for designing diasystemic contrastive tables of phonemes (dialect vs. standard)?
A2.4-I Definition of an optimal Slovenian phoneme set for ASR
RQ2.4
How can an optimal Slovenian phoneme set be defined that is balanced between the standardised phoneme version and the dialectal phoneme version?
WP3: Speech segmentation and annotation
Activities
Research questions
A3.1-I The basic units of speech
RQ3.1.1
How well do manually annotated speech segments (i.e., utterances) in the Slovenian spoken language resources correlate with prosodic units?
RQ3.1.2
How well do manually annotated speech segments in the Slovenian spoken language resources correlate with syntactic units?
A3.2-I Annotating and modelling disfluencies
RQ3.2.1
What is an appropriate scheme for annotation of disfluencies in speech corpora?
RQ3.2.2
What is the optimal approach to automatic disfluency detection in speech corpora?
A3.3-I Morphosyntactic annotation, lemmatisation and dependency parsing
RQ3.3.1
How can disfluency annotations inform and improve linguistic annotation?
RQ3.3.2
How can training data from other domains and modalities be used efficiently for spoken language processing?
RQ3.3.3
What is the impact of linguistic input representation on the results of linguistic annotation?
A3.4-I Dialogue act annotation
RQ3.4.1
How unambiguous, adequate and informative is the GORDAN scheme compared to the ISO 24617-2 standard?
RQ3.4.2
How can the ISO 24617-2 tagset be expanded to improve its adequacy and informativeness?
WP4: Spoken lexis
Activities
Research questions
A4.1-I Canonical forms of (non-standard) spoken lexis
RQ4.1.1
Which types (distinct words in a corpus) representing the same or similar phenomena were standardised differently in existing spoken language resources?
RQ4.1.2
What is the appropriate categorisation of the analysed heterogeneously interpreted corpus types, and how are canonical forms classified according to different categories (of types)?
RQ4.1.3
How are canonical forms and types included in the lexicon, or linked with lexicon data?
A4.2-I Lexicographic description of (non-standard) spoken language
RQ4.2.1
What are the characteristics of spoken lexis, as opposed to the lexis of written language, and how can these characteristics be analysed automatically (for lexicographic purposes)?
RQ4.2.2
How is the semantic description of spoken language lexis included in semantic (lexicographic) resources for Slovenian?