
    Beyond Gröbner Bases: Basis Selection for Minimal Solvers

    Many computer vision applications require robust estimation of the underlying geometry, in terms of camera motion and 3D structure of the scene. These robust methods often rely on running minimal solvers in a RANSAC framework. In this paper we show how we can make polynomial solvers based on the action matrix method faster, by careful selection of the monomial bases. These monomial bases have traditionally been based on a Gröbner basis for the polynomial ideal. Here we describe how we can enumerate all such bases in an efficient way. We also show that going beyond Gröbner bases leads to more efficient solvers in many cases. We present a novel basis sampling scheme that we evaluate on a number of problems.
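As a rough illustration of the action matrix idea mentioned in the abstract, the sketch below covers only the simplest possible case: a single monic polynomial in one variable, with the monomial basis {1, x, ..., x^(d-1)}. In that case the matrix of "multiplication by x" modulo the polynomial is the companion matrix, and its eigenvalues are the roots. The function name `roots_via_action_matrix` is hypothetical, and the paper's multivariate solvers and basis selection scheme are not reproduced here.

```python
import numpy as np

def roots_via_action_matrix(coeffs):
    """Roots of the monic polynomial x^d + a_{d-1} x^{d-1} + ... + a_0.

    coeffs = [a_0, a_1, ..., a_{d-1}] (lowest degree first, leading 1 omitted).
    Column j of M holds the coordinates of x * x^j in the basis {1, ..., x^{d-1}}.
    """
    a = np.asarray(coeffs, dtype=float)
    d = len(a)
    M = np.zeros((d, d))
    # For j < d-1, x * x^j = x^(j+1) is already a basis monomial.
    M[1:, :-1] = np.eye(d - 1)
    # x * x^(d-1) = x^d, reduced modulo the polynomial: x^d = -(a_0 + ... + a_{d-1} x^{d-1}).
    M[:, -1] = -a
    return np.linalg.eigvals(M)

# Example: (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
example_roots = roots_via_action_matrix([-6.0, 11.0, -6.0])
```

In the multivariate setting the basis is no longer forced, which is where the choice of monomial basis discussed in the abstract comes in.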

    Ad Hoc Microphone Array Calibration: Euclidean Distance Matrix Completion Algorithm and Theoretical Guarantees

    This paper addresses the problem of ad hoc microphone array calibration where only partial information about the distances between microphones is available. We construct a matrix consisting of the pairwise distances and propose to estimate the missing entries with a novel Euclidean distance matrix (EDM) completion algorithm that alternates between low-rank matrix completion and projection onto the Euclidean distance space. This approach confines the recovered matrix to the EDM cone at each iteration of the matrix completion algorithm. Theoretical guarantees on the calibration performance are obtained for random and locally structured missing entries as well as for measurement noise on the known distances. This study elucidates the links between the calibration error and the number of microphones, the noise level, and the ratio of missing distances. Thorough experiments on real data recordings and simulated setups are conducted to demonstrate these theoretical insights. The proposed Euclidean distance matrix completion algorithm achieves a significant improvement over state-of-the-art techniques for ad hoc microphone array calibration. Comment: In Press, available online, August 1, 2014. http://www.sciencedirect.com/science/article/pii/S0165168414003508, Signal Processing, 201
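The alternation described in this abstract can be sketched roughly as follows. This is a minimal heuristic under stated assumptions, not the paper's algorithm: it assumes the matrix holds squared distances, uses a truncated SVD for the low-rank step, and approximates the EDM projection with classical multidimensional scaling. The function name `complete_edm` and its parameters are hypothetical, and no convergence guarantee is claimed.

```python
import numpy as np

def complete_edm(D_obs, mask, dim=3, iters=200):
    """Fill in missing squared distances (assumed symmetric inputs).

    D_obs : (n, n) array of squared distances; entries where mask is False are ignored.
    mask  : (n, n) boolean array, True where the distance is known.
    dim   : assumed embedding dimension of the microphone positions.
    """
    n = D_obs.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    D = np.where(mask, D_obs, D_obs[mask].mean())  # crude initial guess
    np.fill_diagonal(D, 0.0)
    for _ in range(iters):
        # Low-rank step: a squared EDM of points in R^dim has rank at most dim + 2.
        U, s, Vt = np.linalg.svd(D)
        s[dim + 2:] = 0.0
        D = (U * s) @ Vt
        # EDM step (approximate projection): recover points by classical MDS,
        # then rebuild a valid squared-distance matrix from them.
        D = 0.5 * (D + D.T)
        G = -0.5 * J @ D @ J                      # Gram matrix via double centering
        w, V = np.linalg.eigh(G)
        w = np.clip(w, 0.0, None)
        top = np.argsort(w)[::-1][:dim]
        X = V[:, top] * np.sqrt(w[top])
        g = np.sum(X ** 2, axis=1)
        D = g[:, None] + g[None, :] - 2.0 * X @ X.T
        # Data consistency: keep the measured entries as given.
        D[mask] = D_obs[mask]
        np.fill_diagonal(D, 0.0)
    return D
```

Re-imposing the known entries at the end of each pass keeps the measurements fixed. With noisy measurements, a softer data-fit term would likely be more appropriate.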

    A Deep Learning Technique to Clinch the Detection of Parkinson’s Disease using Speech and Voice Attributes

    Among neurodegenerative diseases, Parkinson’s Disease ranks second only to Alzheimer’s disease. Although extensive research has been carried out in this area, no biomarker has been suggested. At present, diagnosis and monitoring of disease progression are possible only through clinical examination and observation of functional symptoms. Voice impairment has been identified as an early marker for Parkinson’s Disease, and research in this field is therefore gaining popularity. Machine Learning algorithms have proved useful in analyzing large, high-dimensional data, but they have not succeeded in extracting features that correlate strongly enough to predict the disease accurately. This calls for a more powerful technique such as Deep Learning, which uses deep neural networks that can select the optimal features and contribute to identifying the disease. In this paper, an initial step was made by designing an Artificial Neural Network model. It yielded train and test accuracies of more than ninety-nine percent and seventy-five percent, respectively, for classifying the disease, but it overfitted, which decreased performance. Hence, the hyperparameters of the Artificial Neural Network model were tuned to reduce this problem, and performance improved slightly. Two methods were employed for optimization: a regularization method, early stopping, and a validation method, Stratified K-Fold Cross Validation. Of these, the second approach showed better results by slightly reducing overfitting, yielding train and test accuracy scores of approximately ninety-nine percent and ninety-seven percent with K set to five and Stochastic Gradient Descent as the optimizer. Even though the results were promising, the model was unable to reveal the prime attributes that would eventually identify the disease.
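The stratified K-fold validation mentioned in this abstract can be illustrated with a small, library-free sketch. It shows only how folds are built so that each one keeps roughly the class proportions of the full dataset. The function name `stratified_k_fold` and the toy labels are hypothetical, and the paper's network, data, and optimizer are not reproduced.

```python
import random
from collections import defaultdict

def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs with class balance preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    # Shuffle each class separately, then deal its samples round-robin to the folds.
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

# Toy example: 20 samples of class 0 and 10 of class 1, split into 5 folds.
toy_labels = [0] * 20 + [1] * 10
splits = list(stratified_k_fold(toy_labels, k=5))
```

Each test fold in the toy example holds 4 samples of class 0 and 2 of class 1, matching the 2:1 ratio of the whole set.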

    Speaker Recognition Using Machine Learning Techniques

    Speaker recognition is a technique for identifying the person talking to a machine from voice features and acoustics. It has applications in Human Computer Interaction (HCI), biometrics, security, and the Internet of Things (IoT). With advances in technology, hardware is getting more powerful and software is becoming smarter. Consequently, the use of devices to interact effectively with humans and perform complex calculations is also increasing. This is where speaker recognition is important, as it facilitates seamless communication between humans and computers. Additionally, the field of security has seen a rise in biometrics. At present, multiple biometric techniques coexist, for instance iris, fingerprint, voice, and facial recognition. Voice is one metric which, apart from being natural to users, provides comparable and sometimes even higher levels of security than some traditional biometric approaches. Hence, it is a widely accepted biometric technique and is constantly being studied for further improvements. This study aims to evaluate different pre-processing, feature extraction, and machine learning techniques on audio recorded in unconstrained and natural environments to determine which combination works well for speaker recognition and classification. The report presents several methods of audio pre-processing, such as trimming, split and merge, noise reduction, and vocal enhancement, to improve audio obtained from real-world situations. Additionally, a text-independent approach is used, which makes the model flexible across multiple languages. Mel Frequency Cepstral Coefficients (MFCC) are extracted for each audio recording, along with their differentials and accelerations, to evaluate machine learning classification techniques such as kNN, Support Vector Machines, and Random Forest Classifiers. Lastly, the approaches are evaluated against existing research to study which techniques perform well on these sets of audio recordings.
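The "differentials and accelerations" of MFCCs mentioned in this abstract are commonly called delta and delta-delta features. The abstract does not say how they were computed, so the sketch below assumes the widely used regression formula with edge padding by repetition. The function name `delta` and the toy frames are hypothetical, and the MFCC extraction itself is not shown.

```python
def delta(frames, width=2):
    """Delta features for a list of frames (each frame is a list of coefficients).

    Uses d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2),
    repeating the first and last frame at the edges.
    """
    T = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(T):
        row = []
        for k in range(len(frames[t])):
            num = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, T - 1)][k]   # clamp at the last frame
                behind = frames[max(t - n, 0)][k]      # clamp at the first frame
                num += n * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out

# Toy "MFCC" frames: one coefficient ramps linearly, the other stays constant.
toy_frames = [[float(t), 2.0] for t in range(6)]
toy_delta = delta(toy_frames)    # first-order differentials
toy_accel = delta(toy_delta)     # accelerations: deltas of the deltas
```

Stacking each frame with its delta and acceleration rows gives the kind of feature vector that a kNN, SVM, or random forest classifier could then be trained on.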