9,343 research outputs found

    Multimodal music information processing and retrieval: survey and future challenges

    To improve performance in various music information processing tasks, recent studies exploit different modalities that capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics, and album cover art. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the application it addresses. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that the Music Information Retrieval and Sound and Music Computing research communities should focus on in the coming years.

    Virtual to Real Reinforcement Learning for Autonomous Driving

    Reinforcement learning is considered a promising direction for driving policy learning. However, training an autonomous driving vehicle with reinforcement learning in a real environment involves unaffordable trial and error. It is more desirable to first train in a virtual environment and then transfer to the real environment. In this paper, we propose a novel realistic translation network to make a model trained in a virtual environment workable in the real world. The proposed network can convert non-realistic virtual image input into a realistic one with similar scene structure. Given realistic frames as input, a driving policy trained by reinforcement learning can adapt well to real-world driving. Experiments show that our proposed virtual to real (VR) reinforcement learning (RL) works well. To our knowledge, this is the first successful case of a driving policy trained by reinforcement learning that can adapt to real-world driving data.

    Automatic Affine and Elastic Registration Strategies for Multi-dimensional Medical Images

    Medical images have been used increasingly for diagnosis, treatment planning, monitoring disease processes, and other medical applications. A large variety of medical imaging modalities exists, including CT, X-ray, MRI, and ultrasound. Frequently a group of images needs to be compared to one another and/or combined for research or cumulative purposes. In many medical studies, multiple images are acquired from subjects at different times or with different imaging modalities. Misalignment inevitably occurs, causing anatomical and/or functional feature shifts within the images. Computerized image registration (alignment) approaches can offer automatic and accurate image alignments without extensive user involvement and provide tools for visualizing combined images. This dissertation focuses on providing automatic image registration strategies. After a thorough review of existing image registration techniques, we identified two registration strategies that enhance the current field: (1) an automated rigid body and affine registration using voxel similarity measurements based on a sequential hybrid genetic algorithm, and (2) an automated deformable registration approach based upon a linear elastic finite element formulation. Both methods streamlined the registration process. They are completely automatic and require no user intervention. The proposed registration strategies were evaluated with numerous 2D and 3D MR images with a variety of tissue structures, orientations and dimensions. Multiple registration pathways were provided with guidelines for their applications. The sequential genetic algorithm mimics the pathway of an expert manually doing registration. Experiments demonstrated that the sequential genetic algorithm registration provides high alignment accuracy and is reliable for brain tissues. It avoids the local minima/maxima traps of conventional optimization techniques, and does not require any preprocessing such as thresholding, smoothing, segmentation, or definition of base points or edges. The elastic model was shown to be highly effective at accurately aligning areas of interest that are automatically extracted from the images, such as brains. Using a finite element method to get the displacement of each element node by applying a boundary mapping, this method provides an accurate image registration with excellent boundary alignment of each pair of slices and consequently aligns the entire volume automatically. This dissertation presented numerous volume alignments. Surface geometries were created directly from the aligned segmented images using the Multiple Material Marching Cubes algorithm. Using the proposed registration strategies, multiple subjects were aligned to a standard MRI reference, which is aligned to a segmented reference atlas. Consequently, multiple subjects are aligned to the segmented atlas and a full fMRI analysis is possible.

    Reference face graph for face recognition

    Face recognition has been studied extensively; however, real-world face recognition remains a challenging task. The demand for unconstrained practical face recognition is rising with the explosion of online multimedia such as social networks and video surveillance footage, where face analysis is of significant importance. In this paper, we approach face recognition in the context of graph theory. We recognize an unknown face using an external reference face graph (RFG). An RFG is generated, and recognition of a given face is achieved by comparing it to the faces in the constructed RFG. Centrality measures are utilized to identify distinctive faces in the reference face graph. The proposed RFG-based face recognition algorithm is robust to changes in pose and is also alignment free. The RFG recognition is used in conjunction with DCT locality sensitive hashing for efficient retrieval to ensure scalability. Experiments are conducted on several publicly available databases, and the results show that the proposed approach outperforms the state-of-the-art methods without any preprocessing requirements such as face alignment. Due to the richness of the reference set construction, the proposed method can also handle illumination and expression variation.