2,334 research outputs found

    Explorative search of distributed bio-data to answer complex biomedical questions

    Background: The huge amount of biomedical-molecular data increasingly produced is providing scientists with potentially valuable information. Yet the sheer quantity of data makes it difficult to find and extract those data that are most reliable and most related to the biomedical questions to be answered, which are increasingly complex and often involve many different biomedical-molecular aspects. Such questions can be addressed only by comprehensively searching and exploring different types of data, which are frequently ordered and provided by different data sources. Search Computing has been proposed for the management and integration of ranked results from heterogeneous search services. Here, we present its novel application to the explorative search of distributed biomedical-molecular data and the integration of the search results to answer complex biomedical questions. Results: A set of available bioinformatics search services has been modelled and registered in the Search Computing framework, and a Bioinformatics Search Computing application (Bio-SeCo) using such services has been created and made publicly available at http://www.bioinformatics.deib.polimi.it/bio-seco/seco/. It offers an integrated environment that eases search, exploration and ranking-aware combination of heterogeneous data provided by the available registered services, and supplies global results that can support answering complex multi-topic biomedical questions. Conclusions: By using Bio-SeCo, scientists can explore the very large and very heterogeneous biomedical-molecular data available. They can easily make different explorative search attempts, inspect the obtained results, select the most appropriate, expand or refine them, and move forward and backward in the construction of a global complex biomedical query on multiple distributed sources that could eventually find the most relevant results. Thus, Bio-SeCo provides extremely useful automated support for exploratory integrated bio-search, which is fundamental for Life Science data-driven knowledge discovery.

    NETTAB 2012 on “Integrated Bio-Search”

    The NETTAB 2012 workshop, held in Como on November 14-16, 2012, was devoted to "Integrated Bio-Search", that is, to technologies, methods, architectures, systems and applications for searching, retrieving, integrating and analyzing data, information, and knowledge with the aim of answering complex bio-medical-molecular questions, which are some of the most challenging issues in bioinformatics today. It brought together about 80 researchers working in the fields of Bioinformatics, Computational Biology, Biology, Computer Science and Engineering. More than 50 scientific contributions, including keynote and tutorial talks, oral communications, posters and software demonstrations, were presented at the workshop. This preface provides a brief overview of the workshop and briefly introduces the peer-reviewed manuscripts that were accepted for publication in this Supplement.

    Detection of gene annotations and protein-protein interaction associated disorders through transitive relationships between integrated annotations

    Background: Gene function annotations, which are associations between a gene and a term of a controlled vocabulary describing gene functional features, are of paramount importance in modern biology. Datasets of these annotations, such as the ones provided by the Gene Ontology Consortium, are used to design novel biological experiments and interpret their results. Despite their importance, these sources of information have some known issues. They are incomplete, since biological knowledge is far from definitive and evolves rapidly, and some erroneous annotations may be present. Since the curation of novel annotations is costly in terms of both money and time, computational tools that can reliably predict likely annotations, and thus quicken the discovery of new gene annotations, are very useful. Methods: We used a set of computational algorithms and weighting schemes to infer novel gene annotations from a set of known ones. We used the latent semantic analysis approach, implementing two popular algorithms (Latent Semantic Indexing and Probabilistic Latent Semantic Analysis), and proposed a novel method, the Semantic IMproved Latent Semantic Analysis, which adds a clustering step on the set of considered genes. Furthermore, we proposed improving these algorithms by weighting the annotations in the input set. Results: We tested our methods and their weighted variants on the Gene Ontology annotation sets of the genes of three model organisms (Bos taurus, Danio rerio and Drosophila melanogaster). The methods proved able to predict novel gene annotations, and the weighting procedures led to a valuable improvement, although the obtained results vary according to the size of the input annotation set and the considered algorithm. Conclusions: Of the three considered methods, the Semantic IMproved Latent Semantic Analysis provides the best results. In particular, when coupled with a proper weighting policy, it is able to predict a significant number of novel annotations, proving to be a helpful tool in supporting scientists in the curation of gene functional annotations.

    A New Way of Identifying Biomarkers in Biomedical Basic-Research Studies

    A simple, nonparametric and distribution-free method was developed for quick identification of the most meaningful biomarkers among a number of candidates in complex biological phenomena, especially in relatively small samples. This method is independent of rigid model forms or other link functions. It may be applied both to metric and non-metric data as well as to independent or matched parallel samples. With this method, identification of the most relevant biomarkers is not based on inferential methods; therefore, its application does not require corrections of the level of significance, even in cases of thousands of variables. Hence, the introduced method is appropriate to analyze and evaluate data of complex investigations in clinical and pre-clinical basic research, such as gene or protein expressions, phenotype-genotype associations in case-control studies on the basis of thousands of genes and single nucleotide polymorphisms (SNPs), search of prevalence in sleep EEG data, functional magnetic resonance imaging (fMRI) or others.

    The impact of depression forums on illness narratives: a comprehensive NLP analysis of socialization in e-mental health communities

    While depression is globally on the rise, the mental health sector struggles to handle the increased number of cases, especially since the pandemic. These circumstances have resulted in an increased interest in the e-mental health sector. The dataset consists of 67,857 posts from the most popular English-language online health forums between 15 February 2016 and 15 February 2019. The posts were first automatically labelled (biomedical vs. psy framing) via deep learning; second, the time series of framing types of recurring forum users were analysed; third, the clusters of biomedical and psy patterns were analysed; fourth, the discursive characteristics of each cluster were analysed with the help of topic modelling. Five ideal-typical patterns of forum socialization are described: the first and the second clusters express the development of a ‘recovery helper’ role, either by opposing expert discourses or by identifying with the psy discourses; the third cluster expresses the acquisition of a substantively diffuse, uncertain role; the fourth and fifth clusters refer to a trajectory leading to the incorporation of a biomedically framed patient role, or a therapeutic psy subjectivity. Several elements of the data collection potentially undermine representativeness: online forum users, open and public forums, and keyword search. The trajectories identified in our study represent various phases of a general forum socialization process: newcomers (cluster 3); settled patient role (cluster 4) or psy subjectivity (cluster 5); recovery helpers (clusters 1 and 2).

    An Introduction to Programming for Bioscientists: A Python-based Primer

    Computing has revolutionized the biological sciences over the past several decades, such that virtually all contemporary research in the biosciences utilizes computer programs. The computational advances have come on many fronts, spurred by fundamental developments in hardware, software, and algorithms. These advances have influenced, and even engendered, a phenomenal array of bioscience fields, including molecular evolution and bioinformatics; genome-, proteome-, transcriptome- and metabolome-wide experimental studies; structural genomics; and atomistic simulations of cellular-scale molecular assemblies as large as ribosomes and intact viruses. In short, much of post-genomic biology is increasingly becoming a form of computational biology. The ability to design and write computer programs is among the most indispensable skills that a modern researcher can cultivate. Python has become a popular programming language in the biosciences, largely because (i) its straightforward semantics and clean syntax make it a readily accessible first language; (ii) it is expressive and well-suited to object-oriented programming, as well as other modern paradigms; and (iii) the many available libraries and third-party toolkits extend the functionality of the core language into virtually every biological domain (sequence and structure analyses, phylogenomics, workflow management systems, etc.). This primer offers a basic introduction to coding, via Python, and it includes concrete examples and exercises to illustrate the language's usage and capabilities; the main text culminates with a final project in structural bioinformatics. A suite of Supplemental Chapters is also provided. 
Starting with basic concepts, such as that of a 'variable', the Chapters methodically advance the reader to the point of writing a graphical user interface to compute the Hamming distance between two DNA sequences.
    Comment: 65 pages total, including 45 pages text, 3 figures, 4 tables, numerous exercises, and 19 pages of Supporting Information; currently in press at PLOS Computational Biology
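
The Hamming distance named in the primer's abstract can be sketched in a few lines of Python. This is a minimal illustration, not the primer's own code: the function name `hamming_distance`, the case-insensitive comparison, and the example sequences are assumptions made here, and the graphical user interface mentioned in the abstract is omitted.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ.

    Assumption for this sketch: sequences are plain strings and are
    compared case-insensitively ("acgt" equals "ACGT").
    """
    if len(seq_a) != len(seq_b):
        # The Hamming distance is defined only for equal-length sequences.
        raise ValueError("sequences must have the same length")
    # Pair up the characters position by position and count the mismatches.
    return sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))


if __name__ == "__main__":
    # Hypothetical example sequences; they differ at positions 3 and 6.
    print(hamming_distance("GATTACA", "GACTATA"))  # prints 2
```

Raising an error on unequal lengths, rather than silently truncating to the shorter sequence, is a design choice of this sketch; comparing sequences of different lengths would need an alignment step first.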

    Transcriptome and Proteome Research in Veterinary Science: What Is Possible and What Questions Can Be Asked?

    In recent years, several technologies for the complete analysis of the transcriptome and proteome have reached a technological level that allows their routine application as scientific tools. The principle of these methods is the identification and quantification of up to tens of thousands of RNA and protein species in a tissue, in contrast to the sequential analysis of conventional methods such as PCR and Western blotting. Due to their technical progress, transcriptome and proteome analyses are becoming increasingly relevant in all fields of biological research. They are mainly used for the explorative identification of disease-associated complex gene expression patterns and thereby set the stage for hypothesis-driven studies. This review gives an overview of the methods currently available for transcriptome analysis, that is, microarrays, Ref-Seq and quantitative PCR arrays, and discusses their potentials and limitations. Second, the most powerful current approaches to proteome analysis are introduced, that is, 2D-gel electrophoresis, shotgun proteomics and MudPIT, and the diverse technological concepts are reviewed. Finally, experimental strategies for biomarker discovery, experimental settings for the identification of prognostic gene sets, and explorative versus hypothesis-driven approaches for the elucidation of disease-associated genes and molecular pathways are described, and their potential for studies in veterinary research is highlighted.