33,273 research outputs found

    A Comparative Study of Machine Learning Models for Tabular Data Through Challenge of Monitoring Parkinson's Disease Progression Using Voice Recordings

    Full text link
    People with Parkinson's disease must be regularly monitored by their physician to observe how the disease is progressing and potentially adjust treatment plans to mitigate the symptoms. Monitoring the progression of the disease through a voice recording captured by the patient at their own home can make the process faster and less stressful. Using a dataset of voice recordings of 42 people with early-stage Parkinson's disease over a time span of 6 months, we applied multiple machine learning techniques to find a correlation between the voice recording and the patient's motor UPDRS score. We approached this problem using a multitude of both regression and classification techniques. Much of this paper is dedicated to mapping the voice data to motor UPDRS scores using regression techniques in order to obtain a more precise value for unknown instances. Through this comparative study of variant machine learning methods, we realized some old machine learning methods like trees outperform cutting edge deep learning models on numerous tabular datasets.Comment: Accepted at "HIMS'20 - The 6th Int'l Conf on Health Informatics and Medical Systems"; https://americancse.org/events/csce2020/conferences/hims2

    ProtNN: Fast and Accurate Nearest Neighbor Protein Function Prediction based on Graph Embedding in Structural and Topological Space

    Full text link
    Studying the function of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has increasingly become extremely large. Still, the determination of the function of a protein structure remains a difficult, costly, and time consuming task. The difficulties are often due to the essential role of spatial and topological structures in the determination of protein functions in living cells. In this paper, we propose ProtNN, a novel approach for protein function prediction. Given an unannotated protein structure and a set of annotated proteins, ProtNN finds the nearest neighbor annotated structures based on protein-graph pairwise similarities. Given a query protein, ProtNN finds the nearest neighbor reference proteins based on a graph representation model and a pairwise similarity between vector embedding of both query and reference protein-graphs in structural and topological spaces. ProtNN assigns to the query protein the function with the highest number of votes across the set of k nearest neighbor reference proteins, where k is a user-defined parameter. Experimental evaluation demonstrates that ProtNN is able to accurately classify several datasets in an extremely fast runtime compared to state-of-the-art approaches. We further show that ProtNN is able to scale up to a whole PDB dataset in a single-process mode with no parallelization, with a gain of thousands order of magnitude of runtime compared to state-of-the-art approaches
    • …
    corecore