13,583 research outputs found

    Regression applied to protein binding site prediction and comparison with classification

    Background: Structural genomics centers provide hundreds of protein structures of unknown function, so methods that determine protein function automatically are needed. A protein's function can be inferred by studying the network of its physical interactions, which makes identifying potential binding sites between proteins of primary interest. In the literature, methods for predicting the location of a potential binding site are generally based on classification tools. The aim of this paper is to show that regression tools are more efficient than classification tools for patch-based binding site predictors. For this purpose, we developed a patch-based binding site localization method usable with either regression or classification tools. Results: We compared the predictive performance of regression tools with that of machine learning classifiers. Using leave-one-out cross-validation, we showed that regression tools provide better predictions than classification tools. Among the regression tools, Multilayer Perceptron ranked highest in prediction quality. We also compared the predictive performance of our patch-based method using Multilayer Perceptron with that of three other methods available through a web server. Our method performed similarly to the other methods. Conclusion: Regression is more efficient than classification when applied to our binding site localization method. Where possible, using regression instead of classification in other existing binding site predictors will probably improve results. Furthermore, the method presented in this work is flexible because the size of the predicted binding site is adjustable. This adaptability is useful when either the false positive rate or the false negative rate has to be limited.
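    As a rough illustration of the two ideas in this abstract (leave-one-out evaluation of a regressor, and a predicted site whose size is adjustable), the sketch below fits a one-feature least-squares line in place of the Multilayer Perceptron the authors report. The feature values, the scores, and the names `fit_line`, `loo_predictions` and `select_patches` are assumptions made for the example and do not come from the paper.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        # All x values identical: fall back to predicting the mean.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(xs: List[float], ys: List[float]) -> List[float]:
    """Leave-one-out: predict each sample from a model fit on all the others."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        preds.append(slope * xs[i] + intercept)
    return preds


def select_patches(scores: List[float], threshold: float) -> List[int]:
    """Indices of patches whose regression score reaches the threshold.

    Lowering the threshold enlarges the predicted site (fewer false
    negatives); raising it shrinks the site (fewer false positives).
    """
    return [i for i, s in enumerate(scores) if s >= threshold]


if __name__ == "__main__":
    # Hypothetical per-patch feature and binding score, for illustration only.
    feature = [0.0, 1.0, 2.0, 3.0]
    score = [1.0, 3.0, 5.0, 7.0]
    predicted = loo_predictions(feature, score)
    print(select_patches(predicted, threshold=4.5))
    print(select_patches(predicted, threshold=2.5))
```

    A classifier would return a fixed yes/no label per patch, whereas a continuous score lets the cut-off be moved after training, which is one plausible reading of the adjustability the abstract mentions.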

    Modeling regionalized volumetric differences in protein-ligand binding cavities

    Identifying elements of protein structures that create differences in protein-ligand binding specificity is an essential method for explaining the molecular mechanisms underlying preferential binding. In some cases, influential mechanisms can be visually identified by experts in structural biology, but subtler mechanisms, whose significance may only be apparent from the analysis of many structures, are harder to find. To assist this process, we present a geometric algorithm and two statistical models for identifying significant structural differences in protein-ligand binding cavities. We demonstrate these methods in an analysis of sequentially nonredundant structural representatives of the canonical serine proteases and the enolase superfamily. Here, we observed that statistically significant structural variations identified experimentally established determinants of specificity. We also observed that an analysis of individual regions inside cavities can reveal areas where small differences in shape can correspond to differences in specificity

    Geometric algorithms for cavity detection on protein surfaces

    Macromolecular structures such as proteins drive cellular processes and functions. These biological functions result from interactions between proteins and peptides, catalytic substrates, nucleotides, or even human-made chemicals. Several kinds of interaction can thus be distinguished: protein-ligand, protein-protein, protein-DNA, and so on. These interactions only happen under conditions of chemical and shape complementarity, and usually take place in regions known as binding sites. Typically, a protein has four structural levels. The primary structure of a protein is made up of its amino acid sequences (or chains). Its secondary structure essentially comprises α-helices and β-sheets, which are sub-sequences (or sub-domains) of amino acids of the primary structure. Its tertiary structure results from the composition of sub-domains into domains, which represent the geometric shape of the protein. Finally, the quaternary structure results from the aggregation of two or more tertiary structures, usually known as a protein complex. This thesis falls within the scope of structure-based drug design and protein docking. Specifically, it addresses the fundamental problem of detecting and identifying protein cavities, which are often seen as putative binding sites for ligands in protein-ligand interactions. In general, cavity prediction algorithms fall into three main categories: energy-based, geometry-based, and evolution-based. Evolution-based methods build on estimates of evolutionary sequence conservation; that is, they detect functional sites by computing the evolutionary conservation of amino acid positions in proteins. Energy-based methods build on the computation of interaction energies between protein and ligand atoms. Geometry-based algorithms, in turn, analyze the geometric shape of the protein (i.e., its tertiary structure) to identify cavities.
This thesis focuses on geometric methods. We introduce three new geometry-based algorithms for protein cavity detection. The main contribution of this thesis lies in the use of computer graphics techniques in the analysis and recognition of cavities in proteins, much in the spirit of molecular graphics and modeling. These techniques include field-of-view (FoV), voxel ray casting, back-face culling, shape diameter functions, Morse theory, and critical points. The leading idea is to segment the protein shape, much as meshes are segmented in computer graphics. In practice, protein cavity algorithms are nothing more than segmentation algorithms designed for proteins.

    Shortest Geometric Paths Analysis in Structural Biology

    The surface of a macromolecule, such as a protein, represents the contact point of any interaction that molecule has with solvent, ions, small molecules or other macromolecules. Analyzing the surface of macromolecules has a rich history, but analyzing the distances from this surface to other surfaces or volumes has not been extensively explored. Many important questions can be answered quantitatively through these analyses. These include: what is the depth of a pocket or groove on the surface? what is the overall depth of the protein? how deeply are atoms buried from the surface? where are the tunnels in a protein? where are the pockets and what are their shapes? A single algorithm for one graph problem, namely Dijkstra's shortest paths algorithm, forms the basis for algorithms that answer these questions. Many distances can be measured. For instance, the distance from the convex hull to the molecular surface, measured while avoiding the interior of the surface, is defined as Travel Depth. Alternatively, the distance from the surface to every atom can be measured, giving the Burial Depth of given residues. Measuring the minimum distance to the protein surface for all points in solvent, combined with topological guidance, allows tunnels to be located. Analyzing the surface from the deepest Travel Depth upwards allows pockets to be catalogued over the entire protein surface for additional shape analysis. Ligand binding sites in proteins are significantly deep, though this does not affect the binding affinity. Hyperthermostable proteins have a less deep surface but bury atoms more deeply, forming more spherical shapes than their mesophilic counterparts. Tunnels through proteins can be identified, and for the first time tunnels that are winding or bifurcated can be analyzed. Pockets can be found all over the protein surface, and these pockets can be tracked through time series, mutational series, or over protein families.
All of these results are new and for the first time provide quantitative and statistical verification of some previous hypotheses about protein shape.
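    The Travel Depth idea described above can be sketched with a multi-source run of Dijkstra's algorithm on a small 2-D grid. This is a toy under stated assumptions, not the author's implementation: the real analysis is three-dimensional and starts from the convex hull, whereas here the grid, the `#` marker for molecule interior, the unit step cost, the 4-neighbour connectivity, and the function name `travel_depth` are all hypothetical choices made for illustration.

```python
import heapq
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def travel_depth(grid: List[str], sources: List[Cell]) -> Dict[Cell, float]:
    """Shortest distance from any source cell to every reachable solvent cell.

    Cells marked '#' stand for the molecule interior and are never entered,
    so paths must travel around the molecule, as in the Travel Depth
    definition. Each step between 4-connected neighbours costs 1.
    """
    rows, cols = len(grid), len(grid[0])
    dist: Dict[Cell, float] = {s: 0.0 for s in sources}
    heap: List[Tuple[float, Cell]] = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale queue entry, a shorter path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                nd = d + 1.0
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist


if __name__ == "__main__":
    # Top row plays the role of the convex hull; the column of '.' cells
    # below it is a pocket cut into the molecule.
    grid = [
        ".....",
        "##.##",
        "##.##",
        "#####",
    ]
    hull = [(0, c) for c in range(5)]
    depth = travel_depth(grid, hull)
    deepest = max(depth, key=depth.get)
    print(deepest, depth[deepest])
```

    Swapping the source set (surface cells instead of hull cells) and the set of allowed cells would give a Burial Depth style measurement with the same routine, which is in keeping with the abstract's remark that one shortest-path algorithm underlies all of the analyses.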

    Multi-Target Prediction: A Unifying View on Problems and Methods

    Multi-target prediction (MTP) is concerned with the simultaneous prediction of multiple target variables of diverse type. Due to its enormous application potential, it has developed into an active and rapidly expanding research field that combines several subfields of machine learning, including multivariate regression, multi-label classification, multi-task learning, dyadic prediction, zero-shot learning, network inference, and matrix completion. In this paper, we present a unifying view on MTP problems and methods. First, we formally discuss commonalities and differences between existing MTP problems. To this end, we introduce a general framework that covers the above subfields as special cases. As a second contribution, we provide a structured overview of MTP methods. This is accomplished by identifying a number of key properties, which distinguish such methods and determine their suitability for different types of problems. Finally, we also discuss a few challenges for future research

    Proceedings: Aeronautics and Space Science

    VARIABILITY IN AGN ABSORPTION LINES BASED ON HUBBLE SPACE TELESCOPE/COS DATA; BALQSO KINETIC LUMINOSITY DETERMINATION WITH C III* MEASUREMENTS; BREWSTER ANGLE MICROSCOPY AND CHARACTERIZATIONS OF LANGMUIR FILMS; SORTING LIGHT’S TOTAL ANGULAR MOMENTUM FOR COMMUNICATION SYSTEMS; THE PHOSPHORYLATION PATTERN OF RPA2, IN RESPONSE TO DOUBLE-STRAND BREAKS, DIFFERS DEPENDING ON THE LOCATION IN THE CELL AND THE PHASE OF THE CELL CYCLE; THE DIOPHANTINE EQUATION Ax^4+By^4=Cz^4 IN QUADRATIC FIELDS; THE SBML STANDARD TO SHARE COMPUTATIONAL MODELS OF BIOLOGICAL SYSTEMS; HIGH SPEED ELECTRO-DISCHARGE DRILLING AND WIRE ELECTRODE-DISCHARGE MACHINING OF TITANIUM ALLOYS FOR AEROSPACE APPLICATIONS; ROUTING OVER THE INTERPLANETARY INTERNET; WIRELESS INTEGRATED RELAY SYSTEM (WIRS); HUMAN REACTIONS TO FLUCTUATING NOISE CONDITIONS AS PRODUCED BY LOW-BOOM SUPERSONIC AIRCRAFT; NONINVASIVE, AMBULATORY, LONG-TERM, DEEP GASTROINTESTINAL BIOSENSOR AND IMPLANTER; RECONFIGURATION PLANNING OF MODULAR ROBOT UNDER UNCERTAINTY; DYNAMIC GAIT ADAPTION IN FIXED CONFIGURATION FOR MODULAR SELF-RECONFIGURABLE ROBOTS USING FUZZY LOGIC CONTROL; EARLY STAGE DEVELOPMENT OF A MEDICAL DEVICE FOR NON-INVASIVE MEASUREMENT OF INTRACRANIAL PRESSURE; COMPLIANT LAPAROSCOPIC SURGICAL GRASPER; MODULAR JOYSTICK FOR VIRTUAL REALITY SURGICAL SIMULATION; NOVEL ASSISTIVE LOCOMOTOR TOOL FOR GAIT REHABILITATION IN THE ELDERLY; GAIT VARIABILITY HAS NO RELATION TO COGNITIVE PERFORMANCE ON THE PHONETIC FLUENCY TEST; EFFECT OF TACTILE STIMULI ON LOCOMOTOR RHYTHM; UNDERGRADUATE RESEARCH PIPELINE IN MATHEMATICS; COLLEGE OF SAINT MARY ELEMENTARY SCIENCE OUTREACH PROGRAM; FOSTERING STUDENT AWARENESS ON GLOBAL CLIMATE CHANGE AND ENVIRONMENTAL STEWARDSHIP THROUGH CURRICULAR AND CO-CURRICULAR ACTIVITIES; AUTONOMOUS RC CAR; HIGH-ALTITUDE BALLOON SOLAR PANEL VOLTAGE VARIATION; MICROBENTHIC ALGAE DENSITIES IN THE DUPLIN WATERSHED; ESTIMATING UNCERTAINTY OF REFLECTANCE AND ERROR PROPAGATION IN VEGETATION INDICES; ESTIMATING SURFACE VISIBILITY ON THE U.S. EAST COAST: INCORPORATING THE AEROSOL VERTICAL PROFILE FROM GEOS-5; EFFECTS OF VOLCANIC EMISSIONS ON THE EARTH-ATMOSPHERE SYSTEM; OBSERVING THE TRANSPORTATION OF DUST ON EARTH USING MISR; ARGOS AND MICROGRAVITY FREE FLYER EVALUATION; UNL LUNABOTICS TEAM: DESIGNING A ROBOT FOR THE NASA LUNABOTICS ROBOT COMPETITION; DESIGN, BUILD, FLY; UNIVERSITY STUDENT LAUNCH INITIATIVE; EHD THIN FILM BOILING IN MICROGRAVITY ENVIRONMENTS; COMBINING SATELLITE OBSERVATIONS OF FIRE ACTIVITY AND NUMERICAL WEATHER PREDICTION TO IMPROVE THE PREDICTION OF SMOKE EMISSIONS; SEARCH FOR ASYMMETRIC INTERACTIONS BETWEEN CHIRAL MOLECULES AND SPIN-POLARIZED ELECTRONS; AUTOIGNITION IN AN UNSTRAINED METHANOL/AIR MIXING LAYER; ANALYSIS OF THE HST/COS SPECTRUM OF THE MASS OUTFLOW IN SEYFERT 1 GALAXY MRK 279; CHARACTERIZATION OF A 5.8KV SIC PIN DIODE FOR ELECTRIC SPACE PROPULSION APPLICATIONS; WIRELESS POWER TRANSFER: DESIGN AND APPLICATION; FORCE SENSING OF GRASPING EVENTS FOR MINIATURE SURGICAL ROBOTS; UNDERSTANDING WALKING AND BREATHING COUPLING WHEN ABNORMAL BREATHING PATTERNS ARE PRESENT; EXAMINING THE QUALITY OF MODIS REFLECTANCE PRODUCTS USING A FOUR-BAND SPECTRORADIOMETER; INVESTIGATING LAND AND ATMOSPHERE CHARACTERISTICS DURING THE 2012 CENTRAL PLAINS DROUGHT USING MODIS AND TRMM PRODUCTS; A MARXIST APPROACH TO US HISTORICAL ARCHAEOLOGY: A REVIEW AND SUMMARY OF THE HISTORY AND APPLICATION OF MARXISM ON THE FIELD OF HISTORICAL ARCHAEOLOGY IN THE US; JOHN COLLIER, ANTHROPOLOGY, AND THE INDIAN NEW DEA
