    Discourse markers in Slovenian and their applicability for developing speech-to-speech translation technologies

    The concept of context in linguistic and discourse theories

    In this paper we review how the concept of context has developed across different fields of linguistics and discourse analysis, and critically examine the positions of different theories on this question. We group the views presented into two strands, social and cognitive. Socially oriented theories of discourse analysis do not develop a broader, generally accepted theory of context and even avoid defining the notion of context itself, on the grounds that it is too complex. Cognitively oriented theories offer more extensively elaborated definitions and theories of context, but we identify a number of shortcomings that keep them from being generally valid.

    The phraseological nature of Christian vocabulary in everyday speech

    This article studies the use of Christian vocabulary, and of coinages derived from it, in phrasemes (with an emphasis on pragmatic phrasemes). Complementing previous studies of phraseology, which were based on lexicographic material, written texts, or written corpora, this study examines everyday spontaneous spoken language. The results show that these are phraseologically very active words that are predominantly used in a pragmatic (non-propositional, metadiscursive) role, especially for expressing the speaker's relationship to the content of the conversation, and less often to the addressee or the circumstances.

    Key word analysis of discourses in Slovene speech: differences and similarities

    One of the aspects of speech that remains under-researched is the internal variety of speech, i.e. the differences and similarities between different types of speech. This paper aims to contribute to this research by making the comparison between different discourses of Slovene spontaneous speech, focusing on the use of vocabulary. The key word analysis (Scott, 1997), conducted on a million‑word corpus of spoken Slovene, was used to identify lexical items and groups of lexical items typical of a particular spoken discourse, or common to different types of spoken discourse. The results indicate that the presence or absence of a particular word class in the key word list can be a good indicator of a type of spoken discourse, or discourses.
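
    As a rough illustration of how a key word list of this kind can be derived, the sketch below scores each word of one discourse against a reference discourse. It assumes log-likelihood as the keyness statistic, which is a common choice in key word analysis; the abstract does not say which statistic was used, and the function names and toy tokens are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness (assumed statistic, not confirmed by the abstract).

    a, b: frequency of the word in the target and reference discourse
    c, d: total token counts of the target and reference discourse
    """
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_target)
    if b > 0:
        score += b * math.log(b / expected_reference)
    return 2 * score


def key_words(target_tokens, reference_tokens, top_n=10):
    """Return words that are relatively more frequent in the target, ranked by keyness."""
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    c, d = len(target_tokens), len(reference_tokens)
    scored = []
    for word, a in target.items():
        b = reference.get(word, 0)
        # Keep only positive key words: higher relative frequency in the target.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Toy data (invented for illustration, not taken from the corpus).
private_talk = ["ja", "ja", "ja", "no", "in"]
public_talk = ["in", "in", "ja", "torej", "torej"]
keys = key_words(private_talk, public_talk)
```

    On real data the resulting list would then be inspected by word class, since the abstract reports that the presence or absence of a word class among the key words is what distinguishes the discourses.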

    On the automatic evaluation of machine translation

    Evaluation of translations is a constant part of machine translation development, and it relies mainly on automatic procedures. These are always based on a reference translation. In this paper we show how widely reference translations can differ in the subtitling domain and how this can affect the score: the same metric can rate the same translation system as unusable or as highly successful solely because we use reference translations obtained by different procedures, all of which are nevertheless fully adequate in language and meaning.
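
    The dependence of an automatic score on the reference can be shown with a very small sketch. It uses clipped unigram precision as a stand-in metric; the abstract does not name the metric the authors used, and the example sentences are invented for illustration.

```python
from collections import Counter


def unigram_precision(hypothesis, reference):
    """Share of hypothesis words that also occur in the reference (counts clipped).

    A deliberately simple stand-in for a reference-based metric; it is an
    assumption for illustration, not the metric from the paper.
    """
    hyp_tokens = hypothesis.lower().split()
    if not hyp_tokens:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    hyp_counts = Counter(hyp_tokens)
    matched = sum(min(n, ref_counts[word]) for word, n in hyp_counts.items())
    return matched / len(hyp_tokens)


# Invented example: one system output, two acceptable references.
system_output = "i do not know where he went"
literal_reference = "I do not know where he went"   # close, full translation
subtitle_reference = "No idea where he is"          # condensed, subtitle-style

score_literal = unigram_precision(system_output, literal_reference)
score_subtitle = unigram_precision(system_output, subtitle_reference)
```

    Both references are adequate renderings of the same line, yet the same output scores perfectly against one and poorly against the other, which is the effect the abstract describes for subtitling.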

    The Slovene BNSI Broadcast News database and reference speech corpus GOS: Towards the uniform guidelines for future work

    The aim of the paper is to search for common guidelines for the future development of speech databases for less resourced languages, in order to make them as useful as possible for both main fields of their use, linguistic research and speech technologies. We compare two standards for creating speech databases, one followed when developing the Slovene speech database for automatic speech recognition, BNSI Broadcast News, the other followed when developing the Slovene reference speech corpus GOS, and outline possible common guidelines for future work. We also present an add-on for the GOS corpus, which enables its usage for automatic speech recognition.

    Can Turn-Taking Highlight the Nature of Non-Verbal Behavior: A Case Study

    The present research explores non-verbal behavior that accompanies the management of turns in naturally occurring conversations. To analyze turn management, we implemented the ISO 24617-2 multidimensional dialog act annotation scheme. The classification of the communicative intent of non-verbal behavior was performed with the annotation scheme for spontaneous authentic communication called the EVA annotation scheme. Both dialog acts and non-verbal communicative intent were observed according to their underlying nature and information exchange channel. Both concepts were divided into foreground and background expressions. We hypothesize that turn management dialog acts, being a background expression, co-occur with communication regulators, a class of non-verbal communicative intent, which are also of background nature. Our case analysis confirms this hypothesis. Furthermore, it reveals that another group of non-verbal communicative intent, the deictics, also often accompany turn management dialog acts. As deictics can be both foreground and background expressions, the premise that background non-verbal communicative intent is interlinked with background dialog acts is upheld. And when deictics were perceived as part of the foreground they co-occurred with foreground dialog acts. Therefore, dialog acts and non-verbal communicative intent share the same underlying nature, which implies a duality of the two concepts

    The e-learning environment Slovenščina na dlani: challenges and solutions

    The paper starts from three challenges that we observe in teaching Slovene in the upper grades of primary school and in secondary school: how to eliminate errors against the standard-language norm that persist in pupils' written work; how to improve phraseological competence; and how to improve communicative language competence. These challenges are the focal point in the development of the modern e-learning environment Slovenščina na dlani, which is based on language and information-communication technologies, supports flexible forms of teaching and distance teaching, eases the teacher's work, and also helps motivate pupils through elements of gamification. In the paper we present the design and implementation of each of the four content modules of the e-environment: orthography, grammar, phraseology, and texts.

    Primary categories of dialogue acts

    The article addresses dialogue act annotation in speech corpora. There are a number of drawbacks to using existing generic schemes for dialogue act annotation; therefore, the primary categories of dialogue acts are redefined and re-evaluated based on authentic speech data annotation done by two independent annotators. The results confirm the appropriateness of the defined dialogue act categories for empirical use and point out ambiguous and borderline cases that need to be addressed in the future.
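
    An evaluation with two independent annotators is often summarized with a chance-corrected agreement coefficient. The sketch below computes Cohen's kappa as one such option; the abstract does not state which agreement measure was used, and the category labels in the example are hypothetical rather than the categories defined in the article.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumed here as an illustrative agreement measure; the article may
    have used a different one.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's label distribution.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six utterances.
annotator_1 = ["question", "answer", "inform", "inform", "answer", "question"]
annotator_2 = ["question", "answer", "inform", "answer", "answer", "question"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

    Items on which the two label sequences differ are the natural starting point for finding the ambiguous and borderline cases mentioned in the abstract.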


