94 research outputs found

    Independent vector analysis based on overlapped cliques of variable width for frequency-domain blind signal separation

    A novel method is proposed to improve the performance of independent vector analysis (IVA) for blind signal separation of acoustic mixtures. IVA is a frequency-domain approach that successfully resolves the well-known permutation problem by applying a spherical dependency model to all pairs of frequency bins. The dependency model of IVA is equivalent to a single clique in an undirected graph; a clique in graph theory is defined as a subset of vertices in which every pair of vertices is connected by an undirected edge. IVA therefore imposes the same amount of statistical dependency on every pair of frequency bins, which may not match the characteristics of real-world signals. The proposed method allows variable amounts of statistical dependency according to the correlation coefficients observed in real acoustic signals and hence enables more accurate modeling of statistical dependencies. The new dependency graph consists of a number of cliques, so that neighboring frequency bins are assigned to the same clique while distant bins are assigned to different cliques. The permutation ambiguity is resolved by the frequency bins shared between neighboring cliques. For speech signals, we observed especially strong correlations across neighboring frequency bins and a decrease in these correlations as the distance between bins increased. The clique sizes are either fixed or determined by the reciprocal of the mel-frequency scale to impose a wider dependency on low-frequency components. Experimental results showed improved performance over conventional IVA. The signal-to-interference ratio improved from 15.5 to 18.8 dB on average for seven different source locations. When we varied the clique sizes according to the observed correlations, the stability of the proposed method increased with a large number of cliques.
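
The clique layout described in this abstract can be illustrated with a minimal sketch. It assumes fixed-width cliques and a constant overlap; the function name `overlapped_cliques` and its parameters are hypothetical, and the mel-scale variable widths mentioned in the abstract are not modeled.

```python
def overlapped_cliques(num_bins, clique_size, overlap):
    """Split frequency bins 0..num_bins-1 into fixed-width cliques.

    Consecutive cliques share `overlap` bins, so every pair of
    neighboring cliques has at least one frequency bin in common.
    """
    if not 0 < overlap < clique_size:
        raise ValueError("overlap must satisfy 0 < overlap < clique_size")
    step = clique_size - overlap
    cliques = []
    start = 0
    while True:
        end = min(start + clique_size, num_bins)
        cliques.append(range(start, end))
        if end == num_bins:  # the last clique reaches the top bin
            break
        start += step
    return cliques


if __name__ == "__main__":
    for clique in overlapped_cliques(num_bins=10, clique_size=4, overlap=2):
        print(list(clique))
```

Because every clique shares bins with its neighbor, a permutation fixed in one clique propagates along the chain to all the others, which is the role the abstract assigns to the overlapped bins.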

    Enhanced IVA for audio separation in highly reverberant environments

    Blind Audio Source Separation (BASS), inspired by the "cocktail-party problem", has been a leading research application for blind source separation (BSS). This thesis concerns the enhancement of frequency domain convolutive blind source separation (FDCBSS) techniques for audio separation in highly reverberant room environments. Independent component analysis (ICA) is a higher order statistics (HOS) approach commonly used in the BSS framework. When applied to audio FDCBSS, ICA based methods suffer from the permutation problem across the frequency bins of each source. Independent vector analysis (IVA) is an FD-BSS algorithm that theoretically solves the permutation problem by using a multivariate source prior, where the sources are considered to be random vectors. The algorithm allows independence between multivariate source signals, and retains dependency between the source signals within each source vector. The source prior adopted to model the nonlinear dependency structure within the source vectors is crucial to the separation performance of the IVA algorithm. The focus of this thesis is on improving the separation performance of the IVA algorithm in the application of BASS. An alternative multivariate Student's t distribution is proposed as the source prior for the batch IVA algorithm. A Student's t probability density function can better model certain frequency domain speech signals due to its tail dependency property. Then, the nonlinear score function, for the IVA, is derived from the proposed source prior. A novel energy driven mixed super Gaussian and Student's t source prior is proposed for the IVA and FastIVA algorithms. The Student's t distribution, in the mixed source prior, can model the high amplitude data points whereas the super Gaussian distribution can model the lower amplitude information in the speech signals. 
The ratio of the two distributions can be adjusted according to the energy of the observed mixtures to adapt to different types of speech signals. A particular multivariate generalized Gaussian distribution is adopted as the source prior for the online IVA algorithm. The nonlinear score function derived from this proposed source prior contains fourth-order relationships between different frequency bins, which provides a more informative and stronger dependency structure and thereby improves the separation performance. An adaptive learning scheme is developed to improve the performance of the online IVA algorithm. The scheme adjusts the learning rate as a function of proximity to the target solutions. The scheme is accompanied by a novel switched source prior technique that takes the best performance properties of the super Gaussian source prior and the generalized Gaussian source prior as the algorithm converges. The methods and techniques proposed in this thesis are evaluated with real speech source signals in different simulated and real reverberant acoustic environments. A variety of measures are used to evaluate the various algorithms. The experimental results demonstrate improved performance of the proposed methods and their robustness in a wide range of situations.
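
How a score function follows from a Student's t source prior can be sketched as below. This is an illustration under simplifying assumptions, not the thesis's derivation: the source vector is taken to be real-valued (frequency-domain signals are complex), the prior is the standard multivariate Student's t with identity scale, and the function names are hypothetical.

```python
import numpy as np


def neg_log_student_t(s, nu):
    """Negative log-density, up to an additive constant, of a multivariate
    Student's t with identity scale: p(s) ~ (1 + ||s||^2 / nu)^(-(nu + K) / 2)."""
    k = s.size
    return 0.5 * (nu + k) * np.log1p(np.dot(s, s) / nu)


def student_t_score(s, nu):
    """Score function: gradient of the negative log-density with respect to s.

    Each component is scaled by the energy of the whole vector, which is
    how the prior couples all frequency bins of one source.
    """
    k = s.size
    return (nu + k) * s / (nu + np.dot(s, s))


if __name__ == "__main__":
    s = np.array([0.5, -1.0, 2.0])
    print(student_t_score(s, nu=4.0))
```

The denominator grows with the total energy of the vector, so large-amplitude frames are down-weighted rather than clipped, consistent with the heavy-tailed behavior the abstract attributes to the Student's t prior.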

    Enhanced independent vector analysis for speech separation in room environments

    PhD Thesis. The human brain has the ability to focus on a desired sound source in the presence of several active sound sources. Machine-based methods lag behind in mimicking this particular human skill. In digital signal processing this problem is termed the cocktail party problem. This thesis aims to further the field of acoustic source separation in the frequency domain based on exploiting source independence. The main challenge in such frequency domain algorithms is the permutation problem. Independent vector analysis (IVA) is a frequency domain blind source separation algorithm which can theoretically obviate the permutation problem by preserving the dependency structure within each source vector whilst eliminating the dependency between the frequency bins of different source vectors. This thesis focuses in particular on improving the separation performance of IVA algorithms used for frequency domain acoustic source separation in real room environments. The source prior is crucial to the separation performance of the IVA algorithm as it is used to model the nonlinear dependency structure within the source vectors. An alternative multivariate Student's t distribution source prior is proposed for the IVA algorithm, as it is known to be well suited for modelling certain speech signals due to its heavy-tailed nature. The nonlinear score function derived from the proposed Student's t source prior can therefore better model the dependency structure within the frequency bins and thereby enhance the separation performance and the convergence speed of the IVA and the fast version of IVA (FastIVA) algorithms. A novel energy driven mixed source prior, combining the Student's t and the original super Gaussian distributions, is also proposed for the IVA algorithms.
Since speech signals can be composed of many high and low amplitude data points, the Student's t distribution in the mixed source prior can account for the high amplitude data points, whereas the original super Gaussian distribution can cater for the other information in the speech signals. Furthermore, the weight of each distribution in the mixed source prior can be adjusted according to the energy of the observed mixtures. The mixed source prior therefore adapts to the measured signals and further enhances the performance of the IVA algorithm. A common approach within the IVA algorithm is to model different speech sources with an identical source prior; however, this does not account for the unique characteristics of each speech signal. Dependency modelling can therefore be improved by modelling different speech sources with different source priors. Hence, the Student's t mixture model (SMM) is introduced as a source prior for the IVA algorithm. This new source prior can adapt to the nature of different speech signals, and the parameters of the proposed SMM source prior are estimated by deriving an efficient expectation maximization (EM) algorithm. As a result of this study, a novel EM framework for the IVA algorithm with the SMM as a source prior is proposed, which is capable of separating the sources efficiently. The proposed algorithms are tested in various realistic reverberant room environments with real speech signals. All the experiments and evaluations demonstrate the robustness and enhanced separation performance of the proposed algorithms.

    Enhanced independent vector analysis for audio separation in a room environment

    Independent vector analysis (IVA) is studied as a frequency domain blind source separation method, which can theoretically avoid the permutation problem by retaining the dependency between different frequency bins of the same source vector while removing the dependency between different source vectors. This thesis focuses upon improving the performance of independent vector analysis when it is used to solve the audio separation problem in a room environment. A specific stability problem of IVA, i.e. the block permutation problem, is identified and analyzed. Then a robust IVA method is proposed to solve this problem by exploiting the phase continuity of the unmixing matrix. Moreover, an auxiliary function based IVA algorithm with an overlapped chain type source prior is proposed as well to mitigate this problem. Then an informed IVA scheme is proposed which combines the geometric information of the sources from video to solve the problem by providing an intelligent initialization for optimal convergence. The proposed informed IVA algorithm can also achieve a faster convergence in terms of iteration numbers and better separation performance. A pitch based evaluation method is defined to judge the separation performance objectively when the information describing the mixing matrix and sources is missing. In order to improve the separation performance of IVA, an appropriate multivariate source prior is needed to better preserve the dependency structure within the source vectors. A particular multivariate generalized Gaussian distribution is adopted as the source prior. The nonlinear score function derived from this proposed source prior contains the fourth order relationships between different frequency bins, which provides a more informative and stronger dependency structure compared with the original IVA algorithm and thereby improves the separation performance. Copula theory is a central tool to model the nonlinear dependency structure. 
The t copula is proposed to describe the dependency structure within frequency domain speech signals because of its tail dependency property: if one variable takes an extreme value, the other variables are expected to take extreme values as well. A multivariate Student's t distribution, constructed by using a t copula with univariate Student's t marginal distributions, is proposed as the source prior. The IVA algorithm with the proposed source prior is then derived. The proposed algorithms are tested with real speech signals in different reverberant room environments, using both modelled room impulse responses and real room recordings. State-of-the-art criteria are used to evaluate the separation performance, and the experimental results confirm the advantage of the proposed algorithms.

    Efficient Noise Suppression for Robust Speech Recognition

    Electrical Engineering. This thesis addresses single-microphone noise estimation techniques for speech recognition in noisy environments. Much research has been done on environmental noise estimation; however, most methods require a voice activity detector (VAD) to estimate the noise characteristics accurately. I propose two approaches for efficient noise estimation without a VAD. The first approach aims at improving conventional quantile-based noise estimation (QBNE). I improved QBNE by adjusting the quantile level (QL) according to the relative amount of noise added to the target speech. Specifically, one of two QLs, i.e., binary levels, is assigned according to the measured statistical moment of the log-scale power spectrum at each frequency. The second approach applies a dual-mixture parametric model to compute the likelihoods of the speech and non-speech classes. I used a dual Gaussian mixture model (GMM) and a Rayleigh mixture model (RMM) for the likelihoods. Under the assumption that speech is generally uncorrelated with environmental noise, the noise power spectrum can be estimated from the mixture model parameters of the speech-absence class. I compared the proposed methods with conventional QBNE and a minimum-statistics-based method on a simple speech recognition task at various signal-to-noise ratio (SNR) levels. Based on the experimental results, the proposed methods are shown to be superior to the conventional methods.
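
The conventional QBNE baseline mentioned in this abstract can be sketched as a per-frequency quantile taken over time. This is a minimal illustration, not the thesis's method: a single fixed quantile level is used, the function name and array layout are assumptions, and the adaptive binary quantile levels proposed in the thesis are not modeled.

```python
import numpy as np


def quantile_noise_estimate(power_spec, q=0.5):
    """Estimate the noise power spectrum without a voice activity detector.

    power_spec: array of shape (frames, freq_bins) holding short-time
    power spectra. For each frequency bin, the q-quantile over time is
    taken as the noise level, on the premise that speech occupies each
    bin only part of the time.
    """
    return np.quantile(power_spec, q, axis=0)


if __name__ == "__main__":
    spec = np.ones((10, 3))   # stationary noise floor with power 1
    spec[:3, :] = 100.0       # speech present in 3 of the 10 frames
    print(quantile_noise_estimate(spec, q=0.5))
```

A lower quantile level is more conservative when speech is active for most frames, which is why the choice of level matters at different noise conditions.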

    Filter-Based Probabilistic Markov Random Field Image Priors: Learning, Evaluation, and Image Analysis

    Markov random fields (MRFs) based on linear filter responses are one of the most popular forms for modeling image priors, owing to their rigorous probabilistic interpretation and versatility in various applications. In this dissertation, we propose an application-independent method to quantitatively evaluate MRF image priors using model samples. To this end, we developed an efficient auxiliary-variable Gibbs sampler for a general class of MRFs with flexible potentials. We found that the popular pairwise and high-order MRF priors capture image statistics quite roughly and exhibit poor generative properties. We further developed new learning strategies and obtained high-order MRFs that capture well the statistics of the inbuilt features, thus being real maximum-entropy models, as well as other important statistical properties of natural images, outlining the capabilities of MRFs. We suggest a multi-modal extension of MRF potentials which not only allows training more expressive priors but also helps to reveal more insights into MRF variants, based on which we are able to train compact, fully-convolutional restricted Boltzmann machines (RBMs) that can model visual repetitive textures even better than more complex and deep models. The learned high-order MRFs allow us to develop new methods for various real-world image analysis problems. For denoising of natural images and deconvolution of microscopy images, the MRF priors are employed in a purely generative setting. We propose efficient sampling-based methods to infer Bayesian minimum mean squared error (MMSE) estimates, which substantially outperform maximum a-posteriori (MAP) estimates and can compete with state-of-the-art discriminative methods. For non-rigid registration of live cell nuclei in time-lapse microscopy images, we propose a global optical flow-based method. The statistics of noise in fluorescence microscopy images are studied to derive an adaptive weighting scheme for increasing model robustness.
High-order MRFs are also employed to train image filters for extracting important features of cell nuclei, and the deformation of the nuclei is then estimated in the learned feature spaces. The developed method outperforms previous approaches in terms of both registration accuracy and computational efficiency.

    Probabilistic characterization and synthesis of complex driven systems

    Thesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2000. Includes bibliographical references (leaves 194-204). Real-world systems that have characteristic input-output patterns but don't provide access to their internal states are as numerous as they are difficult to model. This dissertation introduces a modeling language for estimating and emulating the behavior of such systems given time series data. As a benchmark test, a digital violin is designed from observing the performance of an instrument. Cluster-weighted modeling (CWM), a mixture density estimator around local models, is presented as a framework for function approximation and for the prediction and characterization of nonlinear time series. The general model architecture and estimation algorithm are presented and extended to system characterization tools such as estimator uncertainty, predictor uncertainty and the correlation dimension of the data set. Furthermore, a real-time implementation, a Hidden-Markov architecture, and function approximation under constraints are derived within the framework. CWM is then applied in the context of different problems and data sets, leading to architectures such as cluster-weighted classification, cluster-weighted estimation, and cluster-weighted sampling. Each application relies on a specific data representation, specific pre- and post-processing algorithms, and a specific hybrid of CWM. The third part of this thesis introduces data-driven modeling of acoustic instruments, a novel technique for audio synthesis. CWM is applied along with new sensor technology and various audio representations to estimate models of violin-family instruments. The approach is demonstrated by synthesizing highly accurate violin sounds given off-line input data as well as cello sounds given real-time input data from a cello player. By Bernd Schoner. Ph.D.
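
The idea of "a mixture density estimator around local models" can be sketched for a scalar input as follows. This is a minimal illustration under stated assumptions: Gaussian input densities, local linear models, and parameters that are already known. Parameter estimation (done with an EM-style algorithm in the CWM literature) is omitted, and the function name and argument layout are hypothetical.

```python
import numpy as np


def cwm_predict(x, means, variances, priors, slopes, intercepts):
    """Predict y at the query points x as a cluster-weighted average.

    Each cluster m has a Gaussian input density p(x | c_m), a prior
    p(c_m), and a local linear model f_m(x) = slopes[m] * x + intercepts[m].
    The prediction is sum_m f_m(x) w_m(x), where the weights w_m(x) are
    proportional to p(x | c_m) p(c_m) and sum to one.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))[:, None]      # (n, 1)
    means, variances, priors, slopes, intercepts = (
        np.asarray(a, dtype=float)
        for a in (means, variances, priors, slopes, intercepts)
    )
    dens = np.exp(-0.5 * (x - means) ** 2 / variances) / np.sqrt(
        2.0 * np.pi * variances
    )                                                           # (n, m)
    weights = dens * priors
    weights /= weights.sum(axis=1, keepdims=True)
    local = slopes * x + intercepts                             # (n, m)
    return (weights * local).sum(axis=1)


if __name__ == "__main__":
    # Two clusters with constant local models 0 and 10.
    print(cwm_predict([-5.0, 0.0, 5.0], [-5.0, 5.0], [1.0, 1.0],
                      [0.5, 0.5], [0.0, 0.0], [0.0, 10.0]))
```

Near a cluster center the prediction follows that cluster's local model, and between clusters it blends the local models smoothly according to the weights.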

    Characterizing functional and structural brain alterations driven by chronic alcohol drinking: a resting-state fMRI connectivity and voxel-based morphometry analysis

    Alcohol intake alters the brain's balance, affecting its structure and function, and may cause alcohol use disorders (AUDs). This doctoral thesis studied the effects of chronic, excessive alcohol consumption on the brain from a functional and structural point of view, via analysis of multimodal magnetic resonance (MR) images. We conducted three studies with specific aims: i) To understand how the neuroadaptations triggered by alcohol intake are reflected in between-network resting-state functional connectivity (rs-FC) and brain activity at the onset of alcohol dependence, we performed studies in msP rats under control conditions and after one month with access to alcohol. Group probabilistic independent component analysis (group-PICA) and spatial regression were applied to resting-state functional magnetic resonance imaging (rs-fMRI) images to obtain subject-specific time courses of seven resting-state networks (RSNs). We then estimated rs-FC via L2-regularized partial correlation. We performed a manganese-enhanced MRI (MEMRI) experiment as a readout of neuronal activity. In the alcohol condition, we found hypoconnectivities between the visual network (VN) and the striatal (StrN) and sensory-cortex (SCN) networks, all with increased brain activity. In contrast, hyperconnectivities were found between three pairs of RSNs: 1) the medial prefrontal-cingulate (mPRN) and StrN networks, 2) the SCN and parietal association (PAN) networks, and 3) the motor-retrosplenial (MRN) and SCN networks, with PAN being the only network without a rise in brain activity.
Interestingly, the hypoconnectivities could be explained as control-to-alcohol transitions from direct to indirect connectivity, whereas the hyperconnectivities reflected a transition from an indirect to an even more indirect connection. These findings indicate that RSNs are altered early by prolonged and moderate alcohol exposure, diminishing executive control and behavioral flexibility. ii) To compare cortical gray matter (GM) volume between 34 healthy controls and 35 alcohol-dependent patients who were detoxified and remained abstinent for 1-5 weeks before MRI acquisition, we performed a voxel-based morphometry analysis. The main structures whose GM volume decreased in abstinent subjects compared to controls were the precentral gyrus (PreCG), postcentral gyrus (PostCG), supplementary motor cortex (SMC), middle frontal gyrus (MFG), precuneus (PCUN) and superior parietal lobule (SPL). Decreases in GM volume in these areas may lead to changes in the control of movement (PreCG and SMC), in the processing of tactile and proprioceptive information (PostCG), in personality, insight and prevision (MFG), in sensory appreciation, language understanding and orientation (PCUN), and in the recognition of objects by touch and shape (SPL). iii) To characterize dynamic brain states in functional MRI (fMRI) signals by means of an approach based on the hidden Markov model (HMM), several parameter configurations of HMM-Gaussian in a block-design paradigm were considered, together with different time series: independent components (ICs) and probabilistic functional modes (PFMs) in 14 healthy subjects. The block-design fMRI paradigm consisted of four experimental conditions: rest, visual, motor and visual-motor. Characterizing the dynamics of brain states in fMRI data was possible by applying the HMM-Gaussian approach to PFMs, with mean activity driving the states. The four spatial maps obtained were named HMM-rest, HMM-visual, HMM-motor and HMM-DMN (default mode network). HMM-DMN appeared once a task state had stabilized.
The ultimate goal will be to obtain brain states in our rs-fMRI rat data, to dynamically compare the behavior of brain RSNs as a biomarker of AUD. In conclusion, neuroimaging techniques for estimating rs-FC, brain activity and GM volume can be successfully applied to multimodal MRI to advance the understanding of brain homeostasis in AUDs. These functional and structural alterations are a biomarker of chronic alcoholism that helps explain impairments in executive control, reward evaluation and visuospatial processing. Pérez Ramírez, MÚ. (2018). Characterizing functional and structural brain alterations driven by chronic alcohol drinking: a resting-state fMRI connectivity and voxel-based morphometry analysis [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/11316
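
One common reading of "L2-regularized partial correlation" is a ridge (Tikhonov) penalty added to the covariance before inversion. The sketch below follows that reading as an assumption; the thesis does not specify its implementation here, and the function name, the `lam` parameter and its default value are hypothetical.

```python
import numpy as np


def regularized_partial_correlation(ts, lam=0.1):
    """Partial correlation between network time courses.

    ts: array of shape (timepoints, networks).
    lam: ridge penalty added to the diagonal of the covariance matrix
    before inversion, which stabilizes the estimate for short scans.
    """
    cov = np.cov(ts, rowvar=False)
    prec = np.linalg.inv(cov + lam * np.eye(cov.shape[0]))
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)   # rescale and flip the sign of the precision
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=2000)
    y = x + 0.5 * rng.normal(size=2000)   # y depends on x
    z = y + 0.5 * rng.normal(size=2000)   # z depends on x only through y
    ts = np.column_stack([x, y, z])
    print(np.round(regularized_partial_correlation(ts, lam=0.01), 2))
```

In the toy chain, x and z are strongly correlated but their partial correlation is close to zero, which is the distinction between direct and indirect connectivity that the abstract relies on.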

    On Computable Protein Functions

    Proteins are biological machines that perform the majority of functions necessary for life. Nature has evolved many different proteins, each of which performs a subset of an organism’s functional repertoire. One aim of biology is to solve the sparse, high-dimensional problem of annotating all proteins with their true functions. Experimental characterisation remains the gold standard for assigning function, but is a major bottleneck due to resource scarcity. In this thesis, we develop a variety of computational methods to predict protein function, reduce the functional search space for proteins, and guide the design of experimental studies. Our methods take two distinct approaches: protein-centric methods that predict the functions of a given protein, and function-centric methods that predict which proteins perform a given function. We applied our methods to help solve a number of open problems in biology. First, we identified new proteins involved in the progression of Alzheimer’s disease using proteomics data of brains from a fly model of the disease. Second, we predicted novel plastic hydrolase enzymes in a large data set of 1.1 billion protein sequences from metagenomes. Finally, we optimised a neural network method that extracts a small number of informative features from protein networks, which we used to predict functions of fission yeast proteins.