    A quick guide for building a successful bioinformatics community

    “Scientific community” refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop “The ‘How To Guide’ for Establishing a Successful Bioinformatics Network” at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB)

    A quick guide for student-driven community genome annotation

    High quality gene models are necessary to expand the molecular and genetic tools available for a target organism, but these are available for only a handful of model organisms that have undergone extensive curation and experimental validation over the course of many years. The majority of gene models present in biological databases today have been identified in draft genome assemblies using automated annotation pipelines that are frequently based on orthologs from distantly related model organisms. Manual curation is time consuming and often requires substantial expertise, but is instrumental in improving gene model structure and identification. Manual annotation may seem to be a daunting and cost-prohibitive task for small research communities but involving undergraduates in community genome annotation consortiums can be mutually beneficial for both education and improved genomic resources. We outline a workflow for efficient manual annotation driven by a team of primarily undergraduate annotators. This model can be scaled to large teams and includes quality control processes through incremental evaluation. Moreover, it gives students an opportunity to increase their understanding of genome biology and to participate in scientific research in collaboration with peers and senior researchers at multiple institutions

    Process-oriented Iterative Multiple Alignment for Medical Process Mining

    Adapted from biological sequence alignment, trace alignment is a process mining technique used to visualize and analyze workflow data. Any analysis done with this method, however, is affected by the alignment quality. The best existing trace alignment techniques use progressive guide-trees to heuristically approximate the optimal alignment in O(N^2 L^2) time. These algorithms are heavily dependent on the selected guide-tree metric, often return sum-of-pairs-score-reducing errors that interfere with interpretation, and are computationally intensive for large datasets. To alleviate these issues, we propose process-oriented iterative multiple alignment (PIMA), which contains specialized optimizations to better handle workflow data. We demonstrate that PIMA is a flexible framework capable of achieving better sum-of-pairs score than existing trace alignment algorithms in only O(N L^2) time. We applied PIMA to analyzing medical workflow data, showing how iterative alignment can better represent the data and facilitate the extraction of insights from data visualization. Comment: accepted at ICDMW 201
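The sum-of-pairs score mentioned in this abstract can be illustrated with a minimal Python sketch. This is not the PIMA implementation: the function names, the gap symbol, and the scoring values (match = 1, mismatch = 0, gap = -1) are assumptions chosen for illustration only. The sketch scores an already-aligned set of traces by summing a pairwise score over every pair of rows in every column.

```python
from itertools import combinations

GAP = "-"  # assumed gap symbol; not taken from the paper


def pair_score(a, b, match=1, mismatch=0, gap=-1):
    """Score one pair of aligned symbols (scoring values are illustrative)."""
    if a == GAP and b == GAP:
        return 0  # two gaps carry no information
    if a == GAP or b == GAP:
        return gap  # an activity aligned against a gap
    return match if a == b else mismatch


def sum_of_pairs(alignment):
    """Sum pair_score over all row pairs in every column of the alignment."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned traces must have the same length")
    return sum(
        pair_score(a, b)
        for column in zip(*alignment)          # iterate column by column
        for a, b in combinations(column, 2)    # every unordered pair of rows
    )


# Three hypothetical aligned traces of workflow activities.
traces = [
    ["A", "B", "C"],
    ["A", "-", "C"],
    ["A", "B", "D"],
]
print(sum_of_pairs(traces))
```

A higher score means more activities line up across traces; an alignment method is then judged by how high a score it reaches for the same input.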

    An Introduction to Programming for Bioscientists: A Python-based Primer

    Computing has revolutionized the biological sciences over the past several decades, such that virtually all contemporary research in the biosciences utilizes computer programs. The computational advances have come on many fronts, spurred by fundamental developments in hardware, software, and algorithms. These advances have influenced, and even engendered, a phenomenal array of bioscience fields, including molecular evolution and bioinformatics; genome-, proteome-, transcriptome- and metabolome-wide experimental studies; structural genomics; and atomistic simulations of cellular-scale molecular assemblies as large as ribosomes and intact viruses. In short, much of post-genomic biology is increasingly becoming a form of computational biology. The ability to design and write computer programs is among the most indispensable skills that a modern researcher can cultivate. Python has become a popular programming language in the biosciences, largely because (i) its straightforward semantics and clean syntax make it a readily accessible first language; (ii) it is expressive and well-suited to object-oriented programming, as well as other modern paradigms; and (iii) the many available libraries and third-party toolkits extend the functionality of the core language into virtually every biological domain (sequence and structure analyses, phylogenomics, workflow management systems, etc.). This primer offers a basic introduction to coding, via Python, and it includes concrete examples and exercises to illustrate the language's usage and capabilities; the main text culminates with a final project in structural bioinformatics. A suite of Supplemental Chapters is also provided. 
    Starting with basic concepts, such as that of a 'variable', the Chapters methodically advance the reader to the point of writing a graphical user interface to compute the Hamming distance between two DNA sequences. Comment: 65 pages total, including 45 pages text, 3 figures, 4 tables, numerous exercises, and 19 pages of Supporting Information; currently in press at PLOS Computational Biolog
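As a rough illustration of the Hamming distance named in this abstract, the following Python sketch counts the positions at which two equal-length DNA sequences differ. It is not code from the primer: the function name and the case-insensitive comparison are assumptions made for this example, and the graphical interface described in the abstract is omitted.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("Hamming distance requires sequences of equal length")
    # Compare case-insensitively so 'a' and 'A' count as the same base
    # (an assumption of this sketch).
    return sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))


print(hamming_distance("GATTACA", "GACTATA"))  # differs at positions 3 and 6 -> 2
```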

    The active microbial community more accurately reflects the anaerobic digestion process: 16S rRNA (gene) sequencing as a predictive tool

    Background: Amplicon sequencing methods targeting the 16S rRNA gene have been used extensively to investigate microbial community composition and dynamics in anaerobic digestion. These methods successfully characterize amplicons but do not distinguish micro-organisms that are actually responsible for the process. In this research, the archaeal and bacterial community of 48 full-scale anaerobic digestion plants were evaluated at the DNA (total community) and RNA (active community) level via 16S rRNA (gene) amplicon sequencing. Results: A significantly higher diversity at the DNA level compared with the RNA level was observed for archaea, but not for bacteria. Beta diversity analysis showed a significant difference in community composition between the DNA and RNA of both bacteria and archaea. This corresponded to 25.5% and 42.3% of total OTUs for bacteria and archaea, respectively, that showed a significant difference in their DNA and RNA profiles. Similar operational parameters affected the bacterial and archaeal community, yet the differentiating effect between DNA and RNA was much stronger for archaea. Co-occurrence networks and functional prediction profiling confirmed the clear differentiation between DNA and RNA profiles. Conclusions: A clear difference in active (RNA) and total (DNA) community profiles was observed, implying the need for a combined approach to estimate community stability in anaerobic digestion.

    Highlights of the 2nd Bioinformatics Student Symposium by ISCB RSG-UK [version 1]

    Following the success of the 1st Student Symposium by ISCB RSG-UK, a 2nd Student Symposium took place on 7th October 2015 at The Genome Analysis Centre, Norwich, UK. This short report summarizes the main highlights from the 2nd Bioinformatics Student Symposium.

    Higher education decision making and decision support systems

    The authors illustrate several issues in decision support and decision support systems (DSS), state of the art research in these fields, and also their own studies in designing a higher education DSS. The final section contains our contribution in outlining the modules of the DSS, involving the present systems and databases of FSEGA and UBB, results and activities belonging to FSEGA students, teaching and research staff, to assist decisions for all the actors implicated in the processes, in various specific situations. Keywords: decision support, decision support systems (DSS), higher education institutions, Information and Communication Technologies (ICT)