935 research outputs found

    Analysis of class C G-protein coupled receptors using supervised classification methods

    G protein-coupled receptors (GPCRs) are cell membrane proteins with a key role in regulating the function of cells. This is the result of their ability to transmit extracellular signals, which makes them relevant for pharmacology and has led, over the last decade, to active research in the field of proteomics. The current thesis specifically targets class C GPCRs, which are relevant in therapies for various central nervous system disorders, such as Alzheimer’s disease, anxiety, Parkinson’s disease and schizophrenia. The investigation of protein functionality often relies on knowledge of three-dimensional (3-D) crystal structures, which determine the receptor’s ability to bind the ligands responsible for activating certain functionalities in the protein. Structural information is therefore paramount, but it is not always known or easily unravelled, as is the case for eukaryotic cell membrane proteins such as GPCRs. Lacking information about the 3-D structure, research is often restricted to the analysis of the primary amino acid sequences of the proteins, which are commonly known and available from curated databases. Much research on sequence analysis has focused on the quantitative analysis of aligned sequences, although alternative approaches that apply machine learning techniques to alignment-free sequences have recently been proposed. In this thesis, we focus on the differentiation of class C GPCRs into functionally and structurally related subgroups based on the alignment-free analysis of their sequences using supervised classification models. In the first part of the thesis, the main topic is the construction of supervised classification models for unaligned protein sequences based on physicochemical transformations and n-gram representations of their amino acid sequences. 
These models are useful to assess the internal data quality of the externally labeled dataset and to manage the label noise problem from a data curation perspective. In its second part, the thesis focuses on the analysis of the sequences to discover subtype- and region-specific sequence motifs. For that, we carry out a systematic analysis of the topological sequence segments with supervised classification models and evaluate the subtype discrimination capability of each region. In addition, we apply different types of feature selection techniques to the n-gram representation of the amino acid sequence segments to find subtype- and region-specific motifs. Finally, we compare the findings of this motif search with the partially known 3-D crystallographic structures of class C GPCRs.
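As an illustration of the n-gram representation mentioned in this abstract, the following is a minimal sketch of turning an unaligned amino acid sequence into relative n-gram frequencies. The function names, the toy sequence, and the choice of relative frequencies are assumptions made for this example; the thesis itself may build its features differently.

```python
from collections import Counter


def ngram_counts(sequence, n):
    """Count every overlapping n-gram (substring of length n) in a sequence."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def ngram_vector(sequence, n, vocabulary):
    """Relative frequency of each vocabulary n-gram in the sequence.

    The vocabulary fixes the order of the features, so vectors built from
    different sequences are directly comparable without any alignment.
    """
    counts = ngram_counts(sequence, n)
    total = sum(counts.values())
    return [counts[gram] / total if total else 0.0 for gram in vocabulary]


# Hypothetical toy sequence, not taken from any real receptor.
toy_sequence = "MKTMKT"
print(ngram_counts(toy_sequence, 2))
print(ngram_vector(toy_sequence, 2, ["MK", "TM", "AA"]))
```

Vectors of this kind can be passed to any supervised classifier, and a feature selection step can then be used to keep only the n-grams that best separate the subtypes.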

    Systematic analysis of primary sequence domain segments for the discrimination between class C GPCR subtypes

    G-protein-coupled receptors (GPCRs) are a large and diverse super-family of eukaryotic cell membrane proteins that play an important physiological role as transmitters of extracellular signals. In this paper, we investigate Class C, a member of this super-family that has attracted much attention in pharmacology. The limited knowledge about the complete 3-D crystal structure of Class C receptors makes it necessary to use their primary amino acid sequences for analytical purposes. Here, we provide a systematic analysis of distinct receptor sequence segments with regard to their ability to differentiate between seven class C GPCR subtypes according to their topological location in the extracellular, transmembrane, or intracellular domains. We build on results from previous research that provided preliminary evidence of the potential use of separated domains of complete class C GPCR sequences as the basis for subtype classification. The use of the extracellular N-terminus domain alone was shown to result in only a minor decrease in subtype discrimination in comparison with the complete sequence, despite discarding much of the sequence information. In this paper, we describe the use of Support Vector Machine-based classification models to evaluate the subtype-discriminating capacity of the specific topological sequence segments. Peer Reviewed. Postprint (author's final draft)

    Sensitivity analysis of sensors in a hydraulic condition monitoring system using CNN models

    Condition monitoring (CM) is a useful application in Industry 4.0, where the machine’s health is monitored by computational intelligence methods. Data-driven models, especially from the field of deep learning, are efficient solutions for the analysis of time series sensor data due to their ability to recognize patterns in high-dimensional data and to track the temporal evolution of the signal. Despite the excellent performance of deep learning models in many applications, additional requirements regarding the interpretability of machine learning models are becoming relevant. In this work, we present a study on the sensitivity of sensors in a deep learning-based CM system, providing high-level information about the relevance of the sensors. Several convolutional neural networks (CNNs) have been constructed from a multisensory dataset for the prediction of different degradation states in a hydraulic system. An attribution analysis of the input features provided insights into the contribution of each sensor to the classifier's prediction. Relevant sensors were identified, and CNN models built on the selected sensors matched the original models in prediction quality. The information about the relevance of sensors is useful during system design to decide in good time which sensors are required. Peer Reviewed. Postprint (published version)

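    Entwicklung und Evaluation einer 6-Tages-Intensivtherapie bei Kindern (5 - 10 Jahre) mit Lippen-Kiefer-Gaumen-Segel-Fehlbildung [Development and evaluation of a 6-day intensive therapy for children (5 - 10 years) with cleft lip and palate]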

    Theoretical background: Children with cleft lip and palate anomalies (CLP) or hypernasality show, due to impaired orofacial structures, speech and resonance disorders, which can restrict participation and communication. Children with CLP or hypernasality have an above-average need for speech therapy. Yet there is little international research that develops validated speech therapy interventions for CLP which not only aim at the therapy of speech and resonance disorders but also address the overall characteristics of CLP. As shown in previous studies of CLP and other disorders, intensive therapy and group therapy hold potential for new speech therapy interventions in CLP.
Aim of the study: This work covers the development, application and evaluation of a 6-day intensive therapy for children (5 - 10 years) with CLP or hypernasality. A combination of group and individual therapy was used, applying phonetic-phonological and sensorimotor therapy approaches.
Methodology: 24 children (5.01 - 9.11 years) received a total of 16 hours of group therapy and 5 hours of individual therapy over six days. The phonemes /p/, /t/, /k/, /f/, /ʃ/ and /ʁ/ were defined as target sounds. In a 1.5-hour session, the parents were informed about the relationships between CLP and speech and were instructed in their children's exercises. In an open, dependent study design, changes were measured at four points in time (pre-treatment, post-treatment, and follow-up 3 and 6 months later) in articulation function (PCC, PICC, PVC, VPC-Sum), voice function or resonance (VPC-Hypernasality, Nasalance Ratio), communicative participation (FBA), intelligibility (ICS-G), and in feelings (ASAP-K) and attitudes (KiddyCAT) towards one's own speaking.
Results / Interpretation: The results show that the combination of intensive and group therapy is an effective method for treating speech disorders in CLP or hypernasality. Significant improvements are seen not only in the function of articulation (PCC: Chi² = 25.548, p < .001; PICC: Chi² = 25.331, p < .001; PVC: Chi² = 22.552, p < .001; VPC-Sum: Chi² = 28.289, p < .001) and voice (VPC-Hypernasality: Chi² = 17.323, p = .001; Nasalance Ratio: z = -2.533, p = .011), but also in communicative participation (FBA total score: Chi² = 6.083, p = .048). With the exception of /ʁ/, improvements were achieved for the selected target sounds (/p/, /t/, /k/, /f/, /ʃ/) when they were affected by active rather than passive cleft type characteristics (CTCs). Passive CTCs showed a sound-unspecific reduction. No significant changes were found in attitudes and feelings towards one's own speech. The study indicates that a protected environment and the peer group seem to contribute to improvements in communicative participation. Further evidence suggests that passive CTCs can be treated most effectively with phonologically oriented therapy approaches and active CTCs with phonetically oriented therapy approaches. Larger therapy studies are necessary to investigate this observation. These results should be seen against the background of diagnostic instruments that were, in part, used for the first time, and should be replicated with a larger sample size.

    Trust in Artificial Intelligence: Comparing Trust Processes Between Human and Automated Trustees in Light of Unfair Bias

    Automated systems based on artificial intelligence (AI) increasingly support decisions with ethical implications where decision makers need to trust these systems. However, insights regarding trust in automated systems predominantly stem from contexts where the main driver of trust is that systems produce accurate outputs (e.g., alarm systems for monitoring tasks). It remains unclear whether what we know about trust in automated systems translates to application contexts where ethical considerations (e.g., fairness) are crucial in trust development. In personnel selection, as a sample context where ethical considerations are important, we investigate trust processes in light of a trust violation relating to unfair bias and a trust repair intervention. Specifically, participants evaluated preselection outcomes (i.e., sets of preselected applicants) by either a human or an automated system across twelve selection tasks. We additionally varied information regarding imperfection of the human and automated system. In task rounds five through eight, the preselected applicants were predominantly male, thus constituting a trust violation due to potential unfair bias. Before task round nine, participants received an excuse for the biased preselection (i.e., a trust repair intervention). The results of the online study showed that participants initially had less trust in automated systems. Furthermore, the trust violation and the trust repair intervention had weaker effects for the automated system. Those effects were partly stronger when system imperfection was highlighted. We conclude that insights from classical areas of automation only partially translate to the many emerging application contexts of such systems where ethical considerations are central to trust processes.

    Molecular dynamics forecasting of transmembrane regions in GPRCs by recurrent neural networks

    G protein-coupled receptors are a large super-family of cell membrane proteins that play an important physiological role as transmitters of extracellular signals. Signal transmission through the cell membrane depends on the conformational changes of the transmembrane region of the receptor, and the investigation of the dynamics in these regions is therefore key. Molecular Dynamics (MD) simulations can provide information about the receptor's conformational states at the atom level, and machine learning (ML) methods can be useful for the analysis of these data. In this paper, Recurrent Neural Networks (RNNs) are used to evaluate whether the MD can be modeled by focusing on the different regions of the receptor (intracellular, extracellular and each transmembrane (TM) region). The best results, as measured by root-mean-square deviation (RMSD), are 0.1228 Å for TM4 of 2rh1 (inactive state) and 0.1325 Å for TM4 of 3p0g (active state), which are comparable to the state of the art in non-dynamic 3-D predictions, showing the potential of the proposed approach. This work is funded by the Spanish PID2019-104551RB-I00 research project and by the PRE2020-092428 Ph.D. training program, through the Ministry of Science and Innovation. Peer Reviewed. Postprint (author's final draft)
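For readers unfamiliar with the metric reported above, the following is a minimal sketch of root-mean-square deviation between a predicted and a reference set of atom coordinates. It assumes both sets list the same atoms in the same order and are already in a common frame (no superposition step). The function name and the toy coordinates are hypothetical, and the paper's exact evaluation protocol is not stated in the abstract.

```python
import math


def rmsd(predicted, reference):
    """Root-mean-square deviation between two equally sized lists of (x, y, z).

    Assumes atom i in `predicted` corresponds to atom i in `reference` and
    that no rigid-body alignment is needed before comparison.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    squared_error = 0.0
    for (px, py, pz), (rx, ry, rz) in zip(predicted, reference):
        squared_error += (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
    return math.sqrt(squared_error / len(predicted))


# Toy coordinates in angstroms, invented for illustration only.
reference_frame = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
predicted_frame = [(0.1, 0.0, 0.0), (1.5, 0.1, 0.0)]
print(rmsd(predicted_frame, reference_frame))
```

A lower value means the forecast coordinates lie closer, on average, to the simulated ones.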

    Layer-wise relevance analysis for motif recognition in the activation pathway of the β2-adrenergic GPCR receptor

    G-protein-coupled receptors (GPCRs) are cell membrane proteins of relevance as therapeutic targets, and are associated with the development of treatments for illnesses such as diabetes, Alzheimer’s, or even cancer. Therefore, comprehending the underlying mechanisms of the receptors' functional properties is of particular interest in pharmacoproteomics and in disease therapy at large. Their interaction with ligands elicits multiple molecular rearrangements all along their structure, inducing activation pathways that distinctly influence the cell response. In this work, we studied GPCR signaling pathways from molecular dynamics simulations, as they provide rich information about the dynamic nature of the receptors. We focused on studying the molecular properties of the receptors using deep-learning-based methods. In particular, we designed and trained a one-dimensional convolutional neural network and illustrated its use in classifying the conformational states (active, intermediate, or inactive) of the β2-adrenergic receptor when bound to the full agonist BI-167107. Through a novel explainability-oriented investigation of the prediction results, we were able to identify and assess the contribution of individual motifs (residues) influencing a particular activation pathway. Consequently, we contribute a methodology that assists in the elucidation of the underlying mechanisms of receptor activation–deactivation. This research was funded by the Spanish PID2019-104551RB-I00 research project. Peer Reviewed. Postprint (published version)

    Misclassification of class C G-protein-coupled receptors as a label noise problem

    G-Protein-Coupled Receptors (GPCRs) are cell membrane proteins of relevance to biology and pharmacology. Their supervised classification into subtypes is hampered by label noise, which stems from a combination of expert knowledge limitations and a lack of clear correspondence between labels and different representations of the protein primary sequences. In this brief study, we describe a systematic approach to the analysis of GPCR misclassifications using Support Vector Machines, and use it to assist the discovery of database labeling quality problems and to investigate the extent to which physicochemical transformations of GPCR sequences reflect GPCR subtype labeling. The proposed approach could enable a filtering strategy for the label noise problem. Peer Reviewed. Postprint (published version)

    Reducing the n-gram feature space of class C GPCRs to subtype-discriminating patterns

    G protein-coupled receptors (GPCRs) are a large and heterogeneous superfamily of receptors that are key cell players for their role as extracellular signal transmitters. Class C GPCRs, in particular, are of great interest in pharmacology. The lack of knowledge about their full 3-D structure prompts the use of their primary amino acid sequences for the construction of robust classifiers, capable of discriminating their different subtypes. In this paper, we investigate the use of feature selection techniques to build Support Vector Machine (SVM)-based classification models from selected receptor subsequences described as n-grams. We show that this approach to classification is useful for finding class C GPCR subtype-specific motifs. Peer Reviewed. Postprint (published version)