2,015 research outputs found

    Final FLaReNet deliverable: Language Resources for the Future - The Future of Language Resources

    Language Technologies (LT), together with their backbone, Language Resources (LR), provide essential support for the challenge of Multilingualism and the ICT of the future. The main task of language technologies is to bridge language barriers and to help create a new environment where information flows smoothly across frontiers and languages, whatever the country and language of origin. To achieve this goal, all players involved need to act as a community able to join forces on a set of shared priorities. Until now, however, the field of Language Resources and Technology has suffered from an excess of individuality and fragmentation, with a lack of coherence concerning the priorities for the field and the direction in which to move, not to mention a common timeframe. The context encountered by the FLaReNet project was thus that of an active field needing a coherence that can only be given by sharing common priorities and endeavours. FLaReNet has contributed to the creation of this coherence by gathering a wide community of experts and involving them in the definition of an exhaustive set of recommendations.

    The European Language Resources and Technologies Forum: Shaping the Future of the Multilingual Digital Europe

    Proceedings of the 1st FLaReNet Forum on the European Language Resources and Technologies, held in Vienna, at the Austrian Academy of Science, on 12-13 February 2009

    Temporal Parameters of Spontaneous Speech in Forensic Speaker Identification in Case of Language Mismatch: Serbian as L1 and English as L2

    The purpose of the research is to examine the possibility of forensic speaker identification when the questioned and suspect samples are in different languages, using temporal parameters (articulation rate, speaking rate, degree of hesitancy, percentage of pauses, average pause duration). The corpus includes 10 female native speakers of Serbian who are proficient in English. The parameters are tested using a Bayesian likelihood ratio formula in 40 same-speaker and 360 different-speaker pairs, including estimation of error rates, equal error rates and Overall Likelihood Ratio. One-way ANOVA is performed to determine whether inter-speaker variability is higher than intra-speaker variability across languages. The most successful discriminant is degree of hesitancy with ER of 42.5%/28% (EER: 33%), followed by average pause duration with ER of 35%/45.56% (EER: 40%). Although the research features a closed-set comparison, which is not very common in forensic reality, the results are still relevant for forensic phoneticians working on criminal cases or as expert witnesses. This study is a pioneering effort in forensically comparing Serbian and English as well as in forensically testing temporal parameters on bilingual speakers. Further research should focus on comparing two stress-timed or two syllable-timed languages to test whether they are more comparable in terms of temporal aspects of speech.
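
The abstract reports a pair of error rates per parameter without saying how they are derived. As a rough illustration only, the sketch below assumes the common convention that a same-speaker pair with a likelihood ratio below 1 counts as a false rejection and a different-speaker pair with a likelihood ratio of 1 or above counts as a false acceptance. The function name `error_rates`, the threshold of 1 and the sample values are all hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def error_rates(same_lrs: Sequence[float],
                diff_lrs: Sequence[float],
                threshold: float = 1.0) -> Tuple[float, float]:
    """Return (false rejection %, false acceptance %) at an LR threshold.

    Assumption: LR < threshold supports the different-speaker hypothesis,
    LR >= threshold supports the same-speaker hypothesis.
    """
    # Same-speaker pairs wrongly judged as different speakers.
    false_rejections = sum(1 for lr in same_lrs if lr < threshold)
    # Different-speaker pairs wrongly judged as the same speaker.
    false_acceptances = sum(1 for lr in diff_lrs if lr >= threshold)
    return (100.0 * false_rejections / len(same_lrs),
            100.0 * false_acceptances / len(diff_lrs))


# Hypothetical likelihood ratios, invented for illustration only.
same_speaker_lrs = [2.5, 0.8, 1.7, 3.1]
different_speaker_lrs = [0.4, 1.2, 0.9, 0.3, 0.6]

fr, fa = error_rates(same_speaker_lrs, different_speaker_lrs)
print(f"false rejection: {fr:.1f}%, false acceptance: {fa:.1f}%")
```

Sweeping `threshold` until the two rates coincide would give an approximation of the equal error rate under the same assumptions.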

    Approaches towards a Lexical Web: the role of Interoperability

    After highlighting some of the major dimensions that are relevant for Language Resources (LR) and contribute to their infrastructural role, I underline some priority areas of concern today with respect to implementing an open Language Infrastructure, and specifically what we could call a “Lexical Web”. My objective is to show that it is imperative to define an underlying global strategy behind the set of initiatives which are or can be launched in Europe and world-wide, and that an all-embracing vision and cooperation among different communities are necessary to achieve more coherent and useful results. I end by mentioning two new European initiatives that go in this direction and promise to be influential in shaping the future of the LR area.

    Language technologies for a multilingual Europe

    This volume of the series “Translation and Multilingual Natural Language Processing” includes most of the papers presented at the Workshop “Language Technology for a Multilingual Europe”, held at the University of Hamburg on September 27, 2011 in the framework of the conference GSCL 2011 with the topic “Multilingual Resources and Multilingual Applications”, along with several additional contributions. In addition to an overview article on Machine Translation and two contributions on the European initiatives META-NET and Multilingual Web, the volume includes six full research articles. Our intention with this workshop was to bring together various groups concerned with the umbrella topics of multilingualism and language technology, especially multilingual technologies. This encompassed, on the one hand, representatives from research and development in the field of language technologies, and, on the other hand, users from diverse areas such as, among others, industry, administration and funding agencies. The Workshop “Language Technology for a Multilingual Europe” was co-organised by the two GSCL working groups “Text Technology” and “Machine Translation” (http://gscl.info) as well as by META-NET (http://www.meta-net.eu)

    Access to recorded interviews: A research agenda

    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state-of-the-art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed

    Multilingual language resources and interoperability

    This article introduces the topic of “Multilingual language resources and interoperability”. We start with a taxonomy and parameters for classifying language resources. We then provide examples of interoperability issues and of resource architectures designed to solve them. Finally, we discuss aspects of linguistic formalisms and interoperability.