74 research outputs found

    Speech Recognition

    Get PDF
    Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research

    Learning commonsense human-language descriptions from temporal and spatial sensor-network data

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2006.Includes bibliographical references (p. 105-109) and index.Embedded-sensor platforms are advancing toward such sophistication that they can differentiate between subtle actions. For example, when placed in a wristwatch, such platforms can tell whether a person is shaking hands or turning a doorknob. Sensors placed on objects in the environment now report many parameters, including object location, movement, sound, and temperature. A persistent problem, however, is the description of these sense data in meaningful human-language. This is an important problem that appears across domains ranging from organizational security surveillance to individual activity journaling. Previous models of activity recognition pigeon-hole descriptions into small, formal categories specified in advance; for example, location is often categorized as "at home" or "at the office." These models have not been able to adapt to the wider range of complex, dynamic, and idiosyncratic human activities. We hypothesize that the commonsense, semantically related, knowledge bases can be used to bootstrap learning algorithms for classifying and recognizing human activities from sensors.(cont.) Our system, LifeNet, is a first-person commonsense inference model, which consists of a graph with nodes drawn from a large repository of commonsense assertions expressed in human-language phrases. LifeNet is used to construct a mapping between streams of sensor data and partially ordered sequences of events, co-located in time and space. Further, by gathering sensor data in vivo, we are able to validate and extend the commonsense knowledge from which LifeNet is derived. LifeNet is evaluated in the context of its performance on a sensor-network platform distributed in an office environment. We hypothesize that mapping sensor data into LifeNet will act as a "semantic mirror" to meaningfully interpret sensory data into cohesive patterns in order to understand and predict human action.by Bo Morgan.S.M

    Connected Attribute Filtering Based on Contour Smoothness

    Get PDF

    Review : Deep learning in electron microscopy

    Get PDF
    Deep learning is transforming most areas of science and technology, including electron microscopy. This review paper offers a practical perspective aimed at developers with limited familiarity. For context, we review popular applications of deep learning in electron microscopy. Following, we discuss hardware and software needed to get started with deep learning and interface with electron microscopes. We then review neural network components, popular architectures, and their optimization. Finally, we discuss future directions of deep learning in electron microscopy

    Designing sound : procedural audio research based on the book by Andy Farnell

    Get PDF
    In procedural media, data normally acquired by measuring something, commonly described as sampling, is replaced by a set of computational rules (procedure) that defines the typical structure and/or behaviour of that thing. Here, a general approach to sound as a definable process, rather than a recording, is developed. By analysis of their physical and perceptual qualities, natural objects or processes that produce sound are modelled by digital Sounding Objects for use in arts and entertainments. This Thesis discusses different aspects of Procedural Audio introducing several new approaches and solutions to this emerging field of Sound Design.Em Media Procedimental, os dados os dados normalmente adquiridos através da medição de algo habitualmente designado como amostragem, são substituídos por um conjunto de regras computacionais (procedimento) que definem a estrutura típica, ou comportamento, desse elemento. Neste caso é desenvolvida uma abordagem ao som definível como um procedimento em vez de uma gravação. Através da análise das suas características físicas e perceptuais , objetos naturais ou processos que produzem som, são modelados como objetos sonoros digitais para utilização nas Artes e Entretenimento. Nesta Tese são discutidos diferentes aspectos de Áudio Procedimental, sendo introduzidas várias novas abordagens e soluções para o campo emergente do Design Sonoro
    corecore