9 research outputs found
Cloud-Based Retrieval Information System Using Concept for Multi-Format Data
An effective and efficient method for retrieving both Web-enabled and non-Web-enabled information entities is essential, because existing search engines remain inaccurate: they still rely on traditional term-based indexing for text documents and on annotation text for image, audio and video files. Previous work has shown that incorporating knowledge in the form of concepts into an information retrieval system may increase retrieval effectiveness. Unfortunately, most work on concept-based information retrieval has focused on a single information format. This paper proposes a multi-format (text, image, video and audio) concept-based information retrieval method for a Cloud environment. The proposed method is implemented in a laboratory-scale heterogeneous cloud environment using Eucalyptus middleware. Experiments are run on 755 multi-format information items, and the performance of the proposed method is measured
Keyword search experiments on spoken news material using phone-based recognition techniques
To make speech databases searchable, they must be labelled with text tags. The obvious solution would be to produce a word-level transcript with a large-vocabulary speech recognizer. Recognizers, however, work with a closed vocabulary, so search terms that matter to us (proper names, name elements) may be impossible to find simply because they are absent from the recognizer's vocabulary. In this paper we compare solutions that perform the preliminary indexing at the phone level only, and can therefore search for arbitrary query terms (phone sequences). The hit accuracy of the examined methods promises to be usable in practice, thanks to the inherently high phone recognition accuracy. In terms of running time, however, even the fastest method proves much slower than would be expected of such an application. Sophisticated indexing techniques will therefore be needed in future work
Automatic Speech Indexing System of Bilingual Video Parliament Interventions
This paper presents the development and evaluation of an automatic audio indexing system designed for a special task: work in a bilingual environment in the Parliament of the Canton of Valais in Switzerland, with two official languages, German and French. As several speakers are bilingual, language changes may occur within speaker or even within utterance. Two audio indexing approaches are presented and compared: in the first, speech indexing is based on bilingual automatic speech recognition; in the second, language identification is used after speaker diarization in order to select the corresponding monolingual speech recognizer for decoding. The approaches are later combined. Speaker adaptive training is also addressed and evaluated. Accuracy of language identification and speech recognition for the monolingual and bilingual cases are presented and compared, in parallel with a brief description of the system and the user interface. Finally, the audio indexing system is also evaluated from an information retrieval point of view
Adaptive framing based similarity measurement between time warped speech signals using Kalman filter
Similarity measurement between speech signals aims to calculate the degree of similarity using acoustic features, and has been receiving much interest because of the need to process large volumes of multimedia information. However, dynamic properties of speech signals, such as varying silence segments and the time warping factor, make it challenging to measure the similarity between speech signals. This manuscript extends our earlier research on adaptive framing based similarity measurement between speech signals using a Kalman filter. Silence removal is enhanced by integrating multiple features for the detection of voiced and unvoiced speech segments. The adaptive frame size measurement is improved by using the acceleration/deceleration phenomenon of object linear motion. A dominant feature set is used to represent the speech signals, along with pre-calculated model parameters that are set by offline tuning of a Kalman filter. Performance is evaluated on additional datasets to assess the impact of the proposed model and silence removal approach on time warped speech similarity measurement. Detailed statistical results indicate an overall accuracy improvement from 91% to 98%, demonstrating the superiority of the extended approach over our previous research on time warped continuous speech similarity measurement
Rapid yet accurate speech indexing using dynamic match lattice spotting
Support for query terms that are typically out-of-vocabulary, such as names, acronyms, and foreign words, is an important requirement of many speech indexing applications. However, to date many unrestricted vocabulary indexing systems have struggled to balance a good detection rate against fast query speeds. This paper presents a fast and accurate unrestricted vocabulary speech indexing technique named Dynamic Match Lattice Spotting (DMLS). The proposed method augments the conventional lattice spotting technique with dynamic sequence matching, together with a number of other novel algorithmic enhancements, to obtain a system that is capable of searching hours of speech in seconds while maintaining excellent detection performance
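The idea of dynamic sequence matching against indexed phone sequences can be illustrated with a minimal sketch. This is not the DMLS implementation: it assumes the lattice has already been flattened into (start time, phone sequence) pairs, it uses plain unit-cost edit distance where a real system would likely use tuned phone confusion costs, and the names `edit_distance`, `spot` and `max_dist` are hypothetical.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def spot(query: Sequence[str],
         index: List[Tuple[float, Sequence[str]]],
         max_dist: int = 1) -> List[Tuple[float, int]]:
    """Return (start_time, distance) for indexed sequences within max_dist.

    `index` is an assumed, pre-extracted list of phone sequences; hits are
    sorted by distance first, then by start time.
    """
    hits = [(start, edit_distance(query, phones)) for start, phones in index]
    hits = [(start, dist) for start, dist in hits if dist <= max_dist]
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    index = [(0.0, ["k", "ae", "t"]),
             (1.5, ["k", "ah", "t"]),
             (3.0, ["d", "ao", "g"])]
    print(spot(["k", "ae", "t"], index))
```

Allowing a small non-zero distance is what lets a query survive phone recognition errors, at the cost of more false alarms as `max_dist` grows.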
Spoken content retrieval: A survey of techniques and technologies
Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR
A Novel Approach for Continuous Speech Tracking and Dynamic Time Warping. Adaptive Framing Based Continuous Speech Similarity Measure and Dynamic Time Warping using Kalman Filter and Dynamic State Model
Dynamic speech properties such as time warping, silence removal and background noise interference are the most challenging issues in continuous speech signal matching. Among them, time warped speech signal matching is of great interest and has been a tough challenge for researchers. An adaptive framing based continuous speech tracking and similarity measurement approach is introduced in this work, following comprehensive research conducted in diverse areas of speech processing. A dynamic state model is introduced, based on a system of linear motion equations, which models the input (test) speech signal frame as a unidirectional moving object along the template speech signal. The most similar corresponding frame position in the template speech is estimated and fused with a feature based similarity observation and the noise variances using a Kalman filter. The Kalman filter provides the final estimated frame position in the template speech at the current time, which is then used to predict a new frame size for the next step. In addition, a keyword spotting approach is proposed by introducing a wavelet decomposition based dynamic noise filter and combination of beliefs. Dempster's theory of belief combination is deployed for the first time in relation to the keyword spotting task. Performance of both the speech tracking and the keyword spotting approaches is evaluated using statistical metrics and gold standards for binary classification. Experimental results proved the superiority of the proposed approaches over the existing methods. The appendix files are not available online
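The fusion of a linear motion prediction with a noisy position observation can be sketched with a small Kalman filter. The sketch below is an illustration only, not the authors' model: it assumes a constant-velocity state (position and velocity of the test frame along the template), a scalar position observation, and made-up noise variances `q` and `r`; the class `FrameTracker` and function `track` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FrameTracker:
    """Constant-velocity Kalman filter over (position, velocity)."""
    pos: float = 0.0    # estimated frame position in the template
    vel: float = 0.0    # estimated advance per step
    p00: float = 100.0  # state covariance entries (assumed large prior)
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 100.0
    q: float = 1e-3     # assumed process noise added to the diagonal
    r: float = 1.0      # assumed observation noise variance

    def step(self, observed_pos: float, dt: float = 1.0) -> Tuple[float, float]:
        # Predict with the linear motion model: pos' = pos + dt * vel.
        pos_pred = self.pos + dt * self.vel
        vel_pred = self.vel
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q

        # Update: fuse the prediction with the observed position.
        innovation = observed_pos - pos_pred
        s = p00 + self.r              # innovation variance
        k0, k1 = p00 / s, p10 / s     # Kalman gain for position, velocity
        self.pos = pos_pred + k0 * innovation
        self.vel = vel_pred + k1 * innovation
        self.p00 = (1.0 - k0) * p00
        self.p01 = (1.0 - k0) * p01
        self.p10 = p10 - k1 * p00
        self.p11 = p11 - k1 * p01
        return self.pos, self.vel


def track(observations: List[float]) -> List[Tuple[float, float]]:
    """Run the filter over a sequence of observed frame positions."""
    tracker = FrameTracker()
    return [tracker.step(z) for z in observations]


if __name__ == "__main__":
    # Observations advancing by 2 frames per step.
    for pos, vel in track([2.0 * k for k in range(1, 6)]):
        print(round(pos, 3), round(vel, 3))
```

In a setting like the one described, the velocity estimate could inform the choice of the next frame size, but how that mapping is done is not specified in the abstract and is left out here.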