7 research outputs found

Syväoppiminen puhutun kielen tunnistamisessa (Deep learning in spoken language identification)

Author: Lindgren Matias
Publication date: 19/05/2020

This thesis applies deep learning based classification techniques to identify natural languages from speech. The primary motivation is to implement accurate techniques for segmenting multimedia materials by the languages spoken in them. Several existing state-of-the-art, deep learning based approaches are discussed, and a subset of them is selected for quantitative experimentation. The selected model architectures are trained on several well-known spoken language identification datasets covering several different languages. Segmentation granularity varies between models: some support input audio lengths of 0.2 seconds, while others require 10 seconds of input to make a language decision. Results from the thesis experiments show that an unsupervised representation of acoustic units, produced by a deep sequence-to-sequence autoencoder, cannot reach the language identification performance of a supervised representation, produced by a multilingual phoneme recognizer. Contrary to most existing results, in this thesis acoustic-phonetic language classifiers trained on labeled spectral representations outperform phonotactic classifiers trained on bottleneck features of a multilingual phoneme recognizer. More work is required, using transcribed datasets and automatic speech recognition techniques, to investigate why phoneme embeddings did not outperform simple, labeled spectral features. While an accurate online language segmentation tool for multimedia materials could not be constructed, the work completed in this thesis provides several insights for building feasible, modern spoken language identification systems.
As a side-product of the experiments performed during this thesis, a free open source spoken language identification software library called "lidbox" was developed, allowing future experiments to begin where the experiments of this thesis end.

This master's thesis focuses on applying deep neural network models to the automatic identification of natural languages from speech. The primary goal of the work is to implement an accurate method for segmenting multimedia materials by the spoken languages occurring in them. The work reviews several existing neural network based approaches, a subset of which is selected for closer examination in quantitative experiments. The selected model architectures are trained on different speech databases containing several languages. The granularity of language segmentation varies by model, from 0.2 seconds to 10 seconds, depending on how long a time window the model needs to produce a language prediction. The experiments carried out during the thesis show that a discrete acoustic representation of speech, discovered without supervision by a sequence autoencoder, is not sufficient for language identification when compared with the supervised phoneme representation produced by a phoneme recognizer. The work found that acoustic-phonetic language identification models achieve higher identification accuracy than models using the phoneme representation, which differs from many results reported in the literature. The research needs to be continued, for example using transcribed speech databases and speech recognition methods, to explain why the representation produced by the phoneme model did not yield better results than the simpler frequency-spectrum representation.
During this work, spoken language identification proved considerably more challenging than had been estimated at the outset, and a sufficiently accurate language segmentation method for multimedia materials could not be implemented. Nevertheless, the approaches presented in the work offer workable practical methods for building modern spoken language identification systems. As a by-product of this thesis, an open source library for spoken language identification called "lidbox" was also created, allowing the quantitative experiments of this work to be continued from where they were left at its conclusion.

Aaltodoc Publication Archive

Zebrafish Models for Development and Disease 2.0

Publication venue: 'MDPI AG'
Publication date: 17/11/2022

The special issue (Zebrafish Models for Development and Disease 2.0) is a collection of articles highlighting research using the zebrafish (Danio rerio) as an experimental organism. Research described in this special issue addresses various developmental biology, genetic, biomedical and neuroscience topics that should be of general interest to the biomedical research community.

Directory of Open Access Books (DOAB)

Software for Exascale Computing - SPPEXA 2016-2019

Publication venue: 'Springer Science and Business Media LLC'

This open access book summarizes the research done and results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG), presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer’s series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA’s first funding phase, and provides an overview of SPPEXA’s contributions towards exascale computing in today's supercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest.

OAPEN Library

Adaptive spectral smoothening for development of robust keyword spotting system

Author: Biswaranjan Pattanayak
Can D.
Gayadhar Pradhan
Jayant Kumar Rout
Lee L.
Sangeetha J.
Serizel R.
Shahnawazuddin S.
Sinha R.
Sri Rama Murthy K.
Srinivas N.
Wu P.
Yadav I.C.
Publication venue: 'Institution of Engineering and Technology (IET)'

Crossref

Proceedings of the VIIth GSCP International Conference

Publication venue: 'Firenze University Press'
Publication date: 31/05/2022

The 7th International Conference of the Gruppo di Studi sulla Comunicazione Parlata, dedicated to the memory of Claire Blanche-Benveniste, chose Speech and Corpora as its main theme. The wide international origin of the 235 authors, from 21 countries and 95 institutions, led to papers on many different languages. The 89 papers of this volume reflect the themes of the conference: spoken corpora compilation and annotation, together with the connected technological fields; the relation between prosody and pragmatics; speech pathologies; and various papers on phonetics, speech and linguistic analysis, pragmatics and sociolinguistics. Many papers are also dedicated to speech and second language studies. The online publication with FUP allows direct access to sound and video linked to papers (when downloaded).

Directory of Open Access Books (DOAB)

The drivers of Corporate Social Responsibility in the supply chain. A case study.

Author: CIASULLO MARIA VINCENZA
MONETTA Giulia
Publication venue: Montclair
Publication date: 01/01/2011

Purpose: The paper studies the way in which an SME integrates CSR into its corporate strategy, the practices it puts in place, and how its CSR strategies are reflected in its supplier and customer relations. Methodology/Research limitations: A qualitative case study methodology is used. The use of a single case study limits the generalizability of the findings. Findings: The entrepreneur’s ethical beliefs and value system play a fundamental role in shaping sustainable corporate strategy. Furthermore, the type of competitive strategy selected, based on innovation, quality and responsibility, clearly emerges both in well-defined management procedures and in supply chain relations as a whole, which aim to involve partners in the process of sustainable innovation. Originality/value: The paper presents an SME that has devised an original, innovative business model. The study pivots on the issues of innovation and eco-sustainability in a context of drivers for CSR and business ethics. These values are considered fundamental at international level; the United Nations has declared 2011 the “International Year of Forestry”.

Archivio della Ricerca - Università di Salerno

Effects of Complementary use of Organic and Inorganic fertilizers on the growth and yield of Cucumber (Cucumu sativus. L.) on an ultisol

Author: Akata O. S.
Ikeh A. O.
Ndaeyo N. U.
Uduanng P. I.
Ukpe I. O.
Publication date: 01/01/2012

A field study was conducted in the 2008 and 2009 early cropping seasons to assess the response of cucumber (Cucumis sativus L.) to complementary use of organic and inorganic fertilizers in the Uyo agro-ecology. The fertilizer treatments were: NPK (15:15:15) at 100 and 200 kg/ha; poultry manure (PM) at 5 and 10 t/ha; complementary application of 100 kg/ha of NPK + 5 t/ha of PM, 100 kg/ha of NPK + 10 t/ha of PM, 200 kg/ha of NPK + 5 t/ha of PM, and 200 kg/ha of NPK + 10 t/ha of PM; and a control (no fertilizer). Results showed significant differences (P<0.05) in all the growth and yield parameters considered in both cropping seasons. The combined application of 200 kg/ha of NPK and 10 t/ha of PM performed better than sole application of either organic or inorganic fertilizer, with fresh fruit yields of 14.63 and 14.92 t/ha in 2008 and 2009, respectively, and exceeded other treatments by 1-76% and 1-73% in 2008 and 2009, respectively. This strongly indicates the synergistic benefits of using both organic and inorganic fertilizers, even at lower rates.

University of Uyo Institutional Repository