49 research outputs found

    Tools and data services registry: a community effort to document bioinformatics resources

    Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools

    Community-Driven Data Analysis Training for Biology

    The primary problem with the explosion of biomedical datasets is not the data, not computational resources, and not the required storage space, but the general lack of trained and skilled researchers to manipulate and analyze these data. Eliminating this problem requires development of comprehensive educational resources. Here we present a community-driven framework that enables modern, interactive teaching of data analytics in life sciences and facilitates the development of training materials. The key feature of our system is that it is not a static but a continuously improved collection of tutorials. By coupling tutorials with a web-based analysis framework, biomedical researchers can learn by performing computation themselves through a web browser without the need to install software or search for example datasets. Our ultimate goal is to expand the breadth of training materials to include fundamental statistical and data science topics and to precipitate a complete re-engineering of undergraduate and graduate curricula in life sciences. This project is accessible at https://training.galaxyproject.org. We developed an infrastructure that facilitates data analysis training in life sciences. It is an interactive learning platform tuned for current types of data and research problems. Importantly, it provides a means for community-wide content creation and maintenance and, finally, enables trainers and trainees to use the tutorials in a variety of situations, such as those where reliable Internet access is unavailable

    ARIAweb: a server for automated NMR structure calculation

    Nuclear magnetic resonance (NMR) spectroscopy is a method of choice to study the dynamics and determine the atomic structure of macromolecules in solution. The standalone program ARIA (Ambiguous Restraints for Iterative Assignment) for automated assignment of nuclear Overhauser enhancement (NOE) data and structure calculation is well established in the NMR community. To ultimately provide a perfectly transparent and easy to use service, we designed an online user interface to ARIA with additional functionalities. Data conversion, structure calculation setup and execution, followed by interactive visualization of the generated 3D structures are all integrated in ARIAweb and freely accessible at https://ariaweb.pasteur.fr

    Improved reliability, accuracy and quality in automated NMR structure calculation with ARIA.

    In biological NMR, assignment of NOE cross-peaks and calculation of atomic conformations are critical steps in the determination of reliable high-resolution structures. ARIA is an automated approach that performs NOE assignment and structure calculation in a concomitant manner in an iterative procedure. The log-harmonic shape for distance restraint potential and the Bayesian weighting of distance restraints, recently introduced in ARIA, were shown to significantly improve the quality and the accuracy of determined structures. In this paper, we propose two modifications of the ARIA protocol: (1) the softening of the force field together with adapted hydrogen radii, which is meaningful in the context of the log-harmonic potential with Bayesian weighting, and (2) a procedure that automatically adjusts the violation tolerance used in the selection of active restraints, based on the fitting of the structure to the input data sets. The new ARIA protocols were fine-tuned on a set of eight protein targets from the CASD–NMR initiative. As a result, the convergence problems previously observed for some targets were resolved and the obtained structures exhibited better quality. In addition, the new ARIA protocols were applied for the structure calculation of ten new CASD–NMR targets in a blind fashion, i.e. without knowing the actual solution. Even though optimisation of parameters and pre-filtering of unrefined NOE peak lists were necessary for half of the targets, ARIA consistently and reliably determined very precise and highly accurate structures for all cases. In the context of integrative structural biology, an increasing number of experimental methods are used that produce distance data for the determination of 3D structures of macromolecules, stressing the importance of methods that successfully make use of ambiguous and noisy distance data. Keywords: Nuclear magnetic resonance · Automated NOE assignment · Structure determination · ARIA · CASD–NM

    A Comprehensive Dataset of protein-protein interactions and Ligand Binding Pockets for Advancing Drug Discovery

    This dataset presents a comprehensive collection of structural data related to protein-protein interactions (PPIs) and ligand binding pockets. The dataset includes high-quality structural information that can aid researchers in the fields of bioinformatics, structural biology, and drug discovery. It encompasses a diverse set of PPI complexes and associated ligands, enabling detailed investigations into molecular interactions at the atomic level. This article introduces an indispensable resource designed to unlock the full potential of PPIs while pioneering a novel metric for pocket similarity for repurposing protein partners.

    A comprehensive dataset of protein-protein interactions and ligand binding pockets for advancing drug discovery

    This dataset represents a collection of pocket-centric structural data related to protein-protein interactions (PPIs) and PPI-related ligand binding sites. The dataset includes high-quality structural information on more than 23,000 pockets, 3,700 proteins from more than 500 organisms, and nearly 3,500 ligands that can aid researchers in the fields of bioinformatics, structural biology, and drug discovery. It encompasses a diverse set of PPI complexes with more than 1,700 unique protein families, including some with associated ligands, enabling detailed investigations into molecular interactions at the atomic level. This article introduces an indispensable resource designed to unlock the full potential of PPIs while pioneering a novel metric for pocket similarity for hypothesizing the repurposing of protein partners

    A simple genetic algorithm for the optimization of multidomain protein homology models driven by NMR residual dipolar coupling and small angle X-ray scattering data.

    Most proteins comprise several domains and/or participate in functional complexes. Owing to ongoing structural genomic projects, it is likely that it will soon be possible to predict, with reasonable accuracy, the conserved regions of most structural domains. Under these circumstances, it will be important to have methods, based on simple-to-acquire experimental data, that allow building and refining structures of multi-domain proteins or of protein complexes from homology models of the individual domains/proteins. It has been recently shown that small angle X-ray scattering (SAXS) and NMR residual dipolar coupling (RDC) data can be combined to determine the architecture of such objects when the X-ray structures of the domains are known and can be considered as rigid objects. We developed a simple genetic algorithm to achieve the same goal, but by using homology models of the domains considered as deformable objects. We applied it to two model systems, an S1KH bi-domain of the NusA protein and the gammaS-crystallin protein. Despite its simplicity, our algorithm is able to generate good solutions when driven by SAXS and RDC data

    Protein Interaction Explorer (PIE): A Web Platform for Exploring Protein-Protein Interactions

    Protein-protein interactions (PPIs) are fundamental to numerous biological processes and represent promising targets for drug discovery. Understanding the three-dimensional (3D) structure of PPIs is critical for identifying potential drug-binding sites (ligandability). However, assessing ligandability and targeting PPIs pose significant challenges due to the variability in binding sites and the lack of available ligands. The methodology behind this work involves rigorous protein selection criteria, structure quality filters, and preparation steps to ensure the reliability of the data, together with a web platform, the Protein Interaction Explorer (PIE), that allows the exploration of PPIs by integrating advanced visualization tools and leveraging extensive structural data. PIE offers unique features such as: (1) pocket detection, filtration, and characterization based on VolSite, with 89 descriptors providing detailed structural information; (2) pocket similarity based on Euclidean distances and a Gaussian kernel; (3) pocketome visualization, in which the TMAP tool creates a simplified visual representation of pocket similarity; (4) hot spot and binding site prediction, in which FoldX [7] predicts hot spots (crucial amino acids), while InDeep [8] forecasts functional binding sites. By combining various visualization techniques and predictive tools (InDeep, VolSite, FoldX), PIE empowers researchers to navigate the intricate landscape of PPIs and serves as a comprehensive platform for investigating PPIs, offering valuable insights into molecular interactions and facilitating drug discovery endeavors. PIE is user-friendly and readily accessible at https://ippidb.pasteur.fr/targetcentric/. It relies on the NGL visualizer
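
    The abstract above mentions pocket similarity based on Euclidean distances and a Gaussian kernel without giving details. As a rough illustration only, one common way to combine the two is sketched below. The function names, the toy three-value descriptor vectors, and the kernel width `sigma` are assumptions made for this example; they are not taken from PIE, whose actual descriptor scaling and kernel parameters are not stated here.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def gaussian_similarity(a, b, sigma=1.0):
    """Map a Euclidean distance to a similarity in (0, 1].

    sigma is a hypothetical kernel width chosen for illustration;
    identical vectors give 1.0 and the value decays with distance.
    """
    d = euclidean_distance(a, b)
    return math.exp(-(d ** 2) / (2.0 * sigma ** 2))


# Toy descriptor vectors (hypothetical values, not real pocket descriptors).
pocket_a = [0.2, 1.5, 3.0]
pocket_b = [0.2, 1.5, 3.0]
pocket_c = [1.2, 0.5, 2.0]

sim_ab = gaussian_similarity(pocket_a, pocket_b)
sim_ac = gaussian_similarity(pocket_a, pocket_c)
print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

    In practice, descriptors on different scales would usually be standardized before computing distances, so that no single descriptor dominates the similarity.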

    Exploring the richness of the French Galaxy Ecosystem

    The French Bioinformatics Community has embraced Galaxy since its inception, with a pivotal moment being the Galaxy Tour de France led by Nate Coraor, Anton Nekrutenko, and James Taylor in 2012. This adoption has led to the establishment of over 10 Galaxy servers across France, catering to diverse local needs and specialized thematic areas such as ecology, biodiversity, NGS, proteomics, and more. Among these servers, UseGalaxy.fr stands out as the flagship national instance, launched in 2021 and hosted by the French Institute for Bioinformatics (IFB - ELIXIR-FR). With robust infrastructure boasting 8300 CPU cores, 52 TB of RAM, and GPU cards, UseGalaxy.fr offers a comprehensive suite of over 3,000 tools, including interactive options like Jupyter Notebook, AlphaFold, and Helixer. Notably, it has garnered over 6,000 users who have collectively executed over 3.6 million jobs. Moreover, UseGalaxy.fr hosts specialized subdomains catering to various community needs, such as ecology, metabarcoding, and COVID-19 research, with ongoing integration of new subdomains. The community's commitment to collaboration and consolidation is evident as several local servers have migrated to UseGalaxy.fr in recent years, with others expressing interest in doing the same. The French Galaxy community is deeply engaged in a multitude of projects at national, European, and global levels, including EOSC FAIR EASE, EuroScienceGateway, ATLASea and ABRomics. To foster cohesion and synergy within the community, a Galaxy Working Group led by the French Institute for Bioinformatics facilitates regular interactions. This group serves to connect Galaxy users across France, share knowledge, support UseGalaxy.fr, and combat misconceptions about Galaxy within the French scientific community. In this poster presentation, we provide an overview of the dynamic French Galaxy ecosystem, highlighting its diverse servers, engaged researchers, ongoing projects, and collaborative efforts. Through this exploration, we aim to showcase the vibrancy and impact of Galaxy within the French bioinformatics landscape