1,286 research outputs found

    The MIntAct project—IntAct as a common curation platform for 11 molecular interaction databases

    IntAct (freely available at http://www.ebi.ac.uk/intact) is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. IntAct has developed a sophisticated web-based curation tool, capable of supporting both IMEx- and MIMIx-level curation. This tool is now utilized by multiple additional curation teams, all of whom annotate data directly into the IntAct database. Members of the IntAct team supply appropriate levels of training, perform quality control on entries and take responsibility for long-term data maintenance. Recently, the MINT and IntAct databases decided to merge their separate efforts to make optimal use of limited developer resources and maximize the curation output. All data manually curated by the MINT curators have been moved into the IntAct database at EMBL-EBI and are merged with the existing IntAct dataset. Both IntAct and MINT are active contributors to the IMEx consortium (http://www.imexconsortium.org

    DOMINO: a database of domain–peptide interactions

    Many protein interactions are mediated by small protein modules binding to short linear peptides. DOMINO is an open-access database comprising more than 3900 annotated experiments describing interactions mediated by protein-interaction domains. DOMINO can be searched with a versatile search tool and the interaction networks can be visualized with a convenient graphic display applet that explicitly identifies the domains/sites involved in the interactions

    MINT, the molecular interaction database: 2009 update

    MINT (http://mint.bio.uniroma2.it/mint) is a public repository for molecular interactions reported in peer-reviewed journals. Since its last report, MINT has grown considerably in size and evolved in scope to meet the requirements of its users. The main changes include a more precise definition of the curation policy and the development of an enhanced and user-friendly interface to facilitate the analysis of the ever-growing interaction dataset. MINT has adopted the PSI-MI standards for the annotation and for the representation of molecular interactions and is a member of the IMEx consortium

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences including compositionally atypical malaria sequences, 2) the high throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journa

    Researching BWPWAP: how can we save research from itself?


    DeepHTLV: a Deep Learning Framework for Detecting Human T-Lymphotrophic Virus 1 Integration Sites

    In the 1980s, researchers found the first human oncogenic retrovirus, called human T-lymphotrophic virus type 1 (HTLV-1). Since then, HTLV-1 has been identified as the causative agent behind several diseases such as adult T-cell leukemia/lymphoma (ATL) and HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP). As part of its normal replication cycle, the viral genome is converted into DNA and integrated into the host genome. With several hundreds to thousands of unique viral integration sites (VISs) distributed with indeterminate preference throughout the genome, detection of HTLV-1 VISs is a challenging task. Experimental studies typically use molecular biology techniques such as fluorescence in situ hybridization (FISH) or reverse transcriptase quantitative PCR (RT-qPCR) to detect VISs. While these methods are accurate, they cannot be applied in a high-throughput manner. Next generation sequencing (NGS) has generated vast amounts of data, resulting in the development of several computational methods for VIS detection, such as VERSE, VirusFinder, or DeepVISP, for rapid detection of VISs across an entire genome. However, no such model exists for predicting HTLV-1 VISs. In this study, we have developed DeepHTLV: the first deep neural network for accurate detection of HTLV-1 insertion sites. We focused on 1) accurately predicting HTLV-1 VISs by extracting and generating superior feature representations and 2) uncovering the cis-regulatory features surrounding the insertion sites. DeepHTLV was implemented as a deep convolutional neural network (CNN) with self-attention architecture after comparing with several other deep neural network structures. To improve model accuracy, we trained the model using a bootstrap balanced sampling method with 10-fold cross-validation (CV). Furthermore, we demonstrated that this model has higher accuracy than several traditional machine learning models, with a modest improvement in area under the curve (AUC) values of 3-10%.
    To study the cis-regulatory features around HTLV-1 insertion sites, we extracted informative motifs from the convolutional layer. Clustering of these motifs yielded eight unique consensus sequence motifs that represented potential integration sites in humans. The informative motif sequences were matched with a known transcription factor (TF) binding profile database, JASPAR2020, with the sequence matching tool TOMTOM. 79 TF associations were enriched in regions surrounding HTLV-1 VISs. Furthermore, literature screening of HTLV-1, ATL, and HAM/TSP validated nearly half (34) of the predicted TF interactions. This work demonstrates that DeepHTLV can accurately identify HTLV-1 VISs, elucidate surrounding features regulating these insertion sites, and make biologically meaningful predictions about cis-regulatory elements surrounding the insertion sites

    Steps to an Ecology of Networked Knowledge and Innovation: Enabling new forms of collaboration among sciences, engineering, arts, and design

    SEAD network White Papers Report. The final White Papers (posted at http://seadnetwork.wordpress.com/white-paper-abstracts/final-white-papers/) represent a spectrum of interests in advocating for transdisciplinarity among arts, sciences, and technologies. All authors submitted plans of action and identified stakeholders they perceived as instrumental in carrying out such plans. The individual efforts led to an international scope. One of the important characteristics of this collection is that the papers do not represent a collective aim toward an explicit initiative. Rather, they offer a broad array of views on barriers faced and prospective solutions. In summary, the collected White Papers and associated Meta-analyses began as an effort to take the pulse of the SEAD community as broadly as possible. The ideas they generated provide a fruitful basis for gauging trends and challenges in facilitating the growth of the network and implementing future SEAD initiatives. National Science Foundation Grant No. 1142510. Additional funding was provided by the ATEC program at the University of Texas at Dallas and the Institute for Applied Creativity at Texas A&M University

    BioDR: semantic indexing networks for biomedical document retrieval

    In biomedical research, retrieving documents that match a query of interest is a task performed quite frequently. Typically, the set of obtained results is extensive, contains many non-interesting documents, and consists of a flat list, i.e., one not organized or indexed in any way. This work proposes BioDR, a novel approach that allows the semantic indexing of the results of a query by identifying relevant terms in the documents. These terms emerge from a process of Named Entity Recognition that annotates occurrences of biological terms (e.g. genes or proteins) in abstracts or full texts. The system is based on a learning process that builds an Enhanced Instance Retrieval Network (EIRN) from a set of documents manually classified by their relevance to a given problem. The resulting EIRN implements the semantic indexing of documents and terms, allowing for enhanced navigation and visualization tools, as well as the assessment of relevance for new documents. Fundação para a Ciência e a Tecnologia (FCT) Maria Barbeito” contract XuntaHUELLA financed by the Consellería de Sanidade (Xunta de Galicia de Galicia