1,948 research outputs found

    СИСТЕМА АУДИОВИЗУАЛЬНОГО СИНТЕЗА РУССКОЙ РЕЧИ

    Get PDF
    Описываются имитационная и компиляционная модели аудиовизуального синтеза русской речи и созданный на их основе аудиовизуальный синтезатор речи. Рассматриваются преимущества и недостатки каждой из моделей, а также особенности их построения для русской речи

    Models and Analysis of Vocal Emissions for Biomedical Applications

    Get PDF
    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the particularly felt need of sharing know-how, objectives and results between areas that until then seemed quite distinct such as bioengineering, medicine and singing. MAVEBA deals with all aspects concerning the study of the human voice with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread also in other aspects of research such as occupational voice disorders, neurology, rehabilitation, image and video analysis. MAVEBA takes place every two years always in Firenze, Italy

    Automatic Sign Language Recognition from Image Data

    Get PDF
    Tato práce se zabývá problematikou automatického rozpoznávání znakového jazyka z obrazových dat. Práce představuje pět hlavních přínosů v oblasti tvorby systému pro rozpoznávání, tvorby korpusů, extrakci příznaků z rukou a obličeje s využitím metod pro sledování pozice a pohybu rukou (tracking) a modelování znaků s využitím menších fonetických jednotek (sub-units). Metody využité v rozpoznávacím systému byly využity i k tvorbě vyhledávacího nástroje "search by example", který dokáže vyhledávat ve videozáznamech podle obrázku ruky. Navržený systém pro automatické rozpoznávání znakového jazyka je založen na statistickém přístupu s využitím skrytých Markovových modelů, obsahuje moduly pro analýzu video dat, modelování znaků a dekódování. Systém je schopen rozpoznávat jak izolované, tak spojité promluvy. Veškeré experimenty a vyhodnocení byly provedeny s vlastními korpusy UWB-06-SLR-A a UWB-07-SLR-P, první z nich obsahuje 25 znaků, druhý 378. Základní extrakce příznaků z video dat byla provedena na nízkoúrovňových popisech obrazu. Lepších výsledků bylo dosaženo s příznaky získaných z popisů vyšší úrovně porozumění obsahu v obraze, které využívají sledování pozice rukou a metodu pro segmentaci rukou v době překryvu s obličejem. Navíc, využitá metoda dokáže interpolovat obrazy s obličejem v době překryvu a umožňuje tak využít metody pro extrakci příznaků z obličeje, které by během překryvu nefungovaly, jako např. metoda active appearance models (AAM). Bylo porovnáno několik různých metod pro extrakci příznaků z rukou, jako např. local binary patterns (LBP), histogram of oriented gradients (HOG), vysokoúrovnové lingvistické příznaky a nové navržená metoda hand shape radial distance function (hRDF). Bylo také zkoumáno využití menších fonetických jednotek, než jsou celé znaky, tzv. sub-units. Pro první krok tvorby těchto jednotek byl navržen iterativní algoritmus, který tyto jednotky automaticky vytváří analýzou existujících dat. Bylo ukázáno, že tento koncept je vhodný pro modelování a rozpoznávání znaků. Kromě systému pro rozpoznávání je v práci navržen a představen systém "search by example", který funguje jako vyhledávací systém pro videa se záznamy znakového jazyka a může být využit například v online slovnících znakového jazyka, kde je v současné době složité či nemožné v takovýchto datech vyhledávat. Tento nástroj využívá metody, které byly použity v rozpoznávacím systému. Výstupem tohoto vyhledávacího nástroje je seřazený seznam videí, které obsahují stejný nebo podobný tvar ruky, které zadal uživatel, např. přes webkameru.Katedra kybernetikyObhájenoThis thesis addresses several issues of automatic sign language recognition, namely the creation of vision based sign language recognition framework, sign language corpora creation, feature extraction, making use of novel hand tracking with face occlusion handling, data-driven creation of sub-units and "search by example" tool for searching in sign language corpora using hand images as a search query. The proposed sign language recognition framework, based on statistical approach incorporating hidden Markov models (HMM), consists of video analysis, sign modeling and decoding modules. The framework is able to recognize both isolated signs and continuous utterances from video data. All experiments and evaluations were performed on two own corpora, UWB-06-SLR-A and UWB-07-SLR-P, the first containing 25 signs and second 378. As a baseline feature descriptors, low level image features are used. It is shown that better performance is gained by higher level features that employ hand tracking, which resolve occlusions of hands and face. As a side effect, the occlusion handling method interpolates face area in the frames during the occlusion and allows to use face feature descriptors that fail in such a case, for instance features extracted from active appearance models (AAM) tracker. Several state-of-the-art appearance-based feature descriptors were compared for tracked hands, such as local binary patterns (LBP), histogram of oriented gradients (HOG), high-level linguistic features or newly proposed hand shape radial distance function (denoted as hRDF) that enhances the feature description of hand-shape like concave regions. The concept of sub-units, that uses HMM models based on linguistic units smaller than whole sign and covers inner structures of the signs, was investigated in the proposed iterative method that is a first required step for data-driven construction of sub-units, and shows that such a concept is suitable for sign modeling and recognition tasks. Except of experiments in the sign language recognition, additional tool \textit{search by example} was created and evaluated. This tool is a search engine for sign language videos. Such a system can be incorporated into an online sign language dictionary where it is difficult to search in the sign language data. This proposed tool employs several methods which were examined in the sign language recognition task and allows to search in the video corpora based on an user-given query that consists of one or multiple images of hands. As a result, an ordered list of videos that contain the same or similar hand configurations is returned

    Models and Analysis of Vocal Emissions for Biomedical Applications

    Get PDF
    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the particularly felt need of sharing know-how, objectives and results between areas that until then seemed quite distinct such as bioengineering, medicine and singing. MAVEBA deals with all aspects concerning the study of the human voice with applications ranging from the newborn to the adult and elderly. Over the years the initial issues have grown and spread also in other fields of research such as occupational voice disorders, neurology, rehabilitation, image and video analysis. MAVEBA takes place every two years in Firenze, Italy. This edition celebrates twenty-two years of uninterrupted and successful research in the field of voice analysis

    Current issues of the Russian language teaching XIV

    Get PDF
    Collection of papers “Current issues of the Russian language teaching XIV” is devoted to issues of methodology of teaching Russian as a foreign language, to issues of linguistics and literary science and includes papers related to the use of online tools and resources in teaching Russian. This collection of papers is a result of the international scientific conference “Current issues of the Russian language teaching XIV”, which was scheduled for 8–10 May 2020, but due to the pandemic COVID-19 took place remotely

    The development of corpus-based computer assisted composition program and its application for instrumental music composition

    Get PDF
    In the last 20 years, we have seen the nourishing environment for the development of music software using a corpus of audio data expanding significantly, namely that synthesis techniques producing electronic sounds, and supportive tools for creative activities are the driving forces to the growth. Some software produces a sequence of sounds by means of synthesizing a chunk of source audio data retrieved from an audio database according to a rule. Since the matching of sources is processed according to their descriptive features extracted by FFT analysis, the quality of the result is significantly influenced by the outcomes of the Audio Analysis, Segmentation, and Decomposition. Also, the synthesis process often requires a considerable amount of sample data and this can become an obstacle to establish easy, inexpensive, and user-friendly applications on various kinds of devices. Therefore, it is crucial to consider how to treat the data and construct an efficient database for the synthesis. We aim to apply corpusbased synthesis techniques to develop a Computer Assisted Composition program, and to investigate the actual application of the program on ensemble pieces. The goal of this research is to apply the program to the instrumental music composition, refine its function, and search new avenues for innovative compositional method
    corecore