A quick guide for building a successful bioinformatics community
“Scientific community” refers to a group of people who collaborate on scientific-research-related activities and share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop “The ‘How To Guide’ for Establishing a Successful Bioinformatics Network” at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB).
A quick guide for student-driven community genome annotation
High quality gene models are necessary to expand the molecular and genetic
tools available for a target organism, but these are available for only a
handful of model organisms that have undergone extensive curation and
experimental validation over the course of many years. The majority of gene
models present in biological databases today have been identified in draft
genome assemblies using automated annotation pipelines that are frequently
based on orthologs from distantly related model organisms. Manual curation is
time consuming and often requires substantial expertise, but is instrumental in
improving gene model structure and identification. Manual annotation may seem
to be a daunting and cost-prohibitive task for small research communities but
involving undergraduates in community genome annotation consortiums can be
mutually beneficial for both education and improved genomic resources. We
outline a workflow for efficient manual annotation driven by a team of
primarily undergraduate annotators. This model can be scaled to large teams and
includes quality control processes through incremental evaluation. Moreover, it
gives students an opportunity to increase their understanding of genome biology
and to participate in scientific research in collaboration with peers and
senior researchers at multiple institutions.
Process-oriented Iterative Multiple Alignment for Medical Process Mining
Adapted from biological sequence alignment, trace alignment is a process
mining technique used to visualize and analyze workflow data. Any analysis done
with this method, however, is affected by the alignment quality. The best
existing trace alignment techniques use progressive guide-trees to
heuristically approximate the optimal alignment in O(N^2 L^2) time. These
algorithms are heavily dependent on the selected guide-tree metric, often
return sum-of-pairs-score-reducing errors that interfere with interpretation,
and are computationally intensive for large datasets. To alleviate these
issues, we propose process-oriented iterative multiple alignment (PIMA), which
contains specialized optimizations to better handle workflow data. We
demonstrate that PIMA is a flexible framework capable of achieving better
sum-of-pairs score than existing trace alignment algorithms in only O(N L^2)
time. We applied PIMA to analyzing medical workflow data, showing how iterative
alignment can better represent the data and facilitate the extraction of
insights from data visualization.
Comment: accepted at ICDMW 201
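The sum-of-pairs score mentioned in this abstract can be illustrated with a minimal Python sketch. This is not the PIMA algorithm and does not reproduce the authors' scoring scheme: the function names, the gap symbol, and the match/mismatch/gap values are all assumptions chosen for illustration. It only shows how an existing alignment of traces might be scored column by column.

```python
from itertools import combinations

GAP = "-"  # assumed gap symbol; the abstract does not specify one


def pair_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score one pair of aligned symbols (scoring values are illustrative)."""
    if a == GAP and b == GAP:
        return 0  # two gaps carry no information
    if a == GAP or b == GAP:
        return gap
    return match if a == b else mismatch


def sum_of_pairs(alignment, **scoring):
    """Sum pair_score over every pair of rows in every column."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned traces must have the same length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            total += pair_score(a, b, **scoring)
    return total


# Three hypothetical traces, one activity per character.
example_alignment = ["ABC", "A-C", "ABD"]
example_score = sum_of_pairs(example_alignment)
```

Under these assumed scores, a higher total means the aligned traces agree in more columns, which is the sense in which one alignment can be called better than another.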
Narrative Visualization: Sharing Insights into Complex Data
This paper is a reflection on the emerging genre of narrative visualization, a creative response to the need to share complex data engagingly with the public. In it, we explain how narrative visualization offers authors the opportunity to communicate more effectively with their audience by reproducing and sharing an experience of insight similar to their own. To do so, we propose a two-part model, derived from previous literature, in which insight is understood as both an experience and the product of that experience. We then discuss how the design of narrative visualization should be informed by attempts elsewhere to track the provenance of insights and share them in a collaborative setting. Finally, we present a future direction for research that includes using EEG technology to record neurological patterns during episodes of insight experience as the basis for evaluation.
An Introduction to Programming for Bioscientists: A Python-based Primer
Computing has revolutionized the biological sciences over the past several
decades, such that virtually all contemporary research in the biosciences
utilizes computer programs. The computational advances have come on many
fronts, spurred by fundamental developments in hardware, software, and
algorithms. These advances have influenced, and even engendered, a phenomenal
array of bioscience fields, including molecular evolution and bioinformatics;
genome-, proteome-, transcriptome- and metabolome-wide experimental studies;
structural genomics; and atomistic simulations of cellular-scale molecular
assemblies as large as ribosomes and intact viruses. In short, much of
post-genomic biology is increasingly becoming a form of computational biology.
The ability to design and write computer programs is among the most
indispensable skills that a modern researcher can cultivate. Python has become
a popular programming language in the biosciences, largely because (i) its
straightforward semantics and clean syntax make it a readily accessible first
language; (ii) it is expressive and well-suited to object-oriented programming,
as well as other modern paradigms; and (iii) the many available libraries and
third-party toolkits extend the functionality of the core language into
virtually every biological domain (sequence and structure analyses,
phylogenomics, workflow management systems, etc.). This primer offers a basic
introduction to coding, via Python, and it includes concrete examples and
exercises to illustrate the language's usage and capabilities; the main text
culminates with a final project in structural bioinformatics. A suite of
Supplemental Chapters is also provided. Starting with basic concepts, such as
that of a 'variable', the Chapters methodically advance the reader to the point
of writing a graphical user interface to compute the Hamming distance between
two DNA sequences.
Comment: 65 pages total, including 45 pages text, 3 figures, 4 tables,
numerous exercises, and 19 pages of Supporting Information; currently in
press at PLOS Computational Biology
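The Hamming distance between two DNA sequences, which this abstract names as the computation behind the primer's closing exercise, can be sketched in a few lines of Python. This is an illustrative version, not code from the primer: the function name and the choice to compare case-insensitively are assumptions, and the graphical user interface is left out.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("Hamming distance needs sequences of equal length")
    # Upper-casing is an assumption so that 'a' and 'A' count as the same base.
    return sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))


# Two hypothetical sequences that differ at positions 3 and 6.
example_distance = hamming_distance("GATTACA", "GACTATA")
```

Raising an error on unequal lengths, rather than silently truncating, keeps the function faithful to the definition of Hamming distance, which only applies to sequences of the same length.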
The active microbial community more accurately reflects the anaerobic digestion process: 16S rRNA (gene) sequencing as a predictive tool
Background:
Amplicon sequencing methods targeting the 16S rRNA gene have been used extensively to investigate microbial community composition and dynamics in anaerobic digestion. These methods successfully characterize amplicons but do not distinguish the micro-organisms that are actually responsible for the process. In this research, the archaeal and bacterial communities of 48 full-scale anaerobic digestion plants were evaluated at the DNA (total community) and RNA (active community) levels via 16S rRNA (gene) amplicon sequencing.
Results:
A significantly higher diversity at the DNA level than at the RNA level was observed for archaea, but not for bacteria. Beta diversity analysis showed a significant difference in community composition between the DNA and RNA of both bacteria and archaea. This corresponded to 25.5% and 42.3% of total OTUs for bacteria and archaea, respectively, showing a significant difference between their DNA and RNA profiles. Similar operational parameters affected the bacterial and archaeal communities, yet the differentiating effect between DNA and RNA was much stronger for archaea. Co-occurrence networks and functional prediction profiling confirmed the clear differentiation between DNA and RNA profiles.
Conclusions:
A clear difference between active (RNA) and total (DNA) community profiles was observed, implying the need for a combined approach to estimate community stability in anaerobic digestion.
Highlights of the 2nd Bioinformatics Student Symposium by ISCB RSG-UK [version 1]
Following the success of the 1st Student Symposium by ISCB RSG-UK, a 2nd Student Symposium took place on 7th October 2015 at The Genome Analysis Centre, Norwich, UK. This short report summarizes the main highlights from the 2nd Bioinformatics Student Symposium.
A Quick Guide for Developing Effective Bioinformatics Programming Skills
Bioinformatics programming skills are becoming a necessity across many facets of biology and medicine, owed in part to the continuing explosion of biological data
Higher education decision making and decision support systems
The authors illustrate several issues in decision support and decision support systems (DSS), state of the art research in these fields, and also their own studies in designing a higher education DSS. The final section contains our contribution in outlining the modules of the DSS, involving the present systems and databases of FSEGA and UBB, results and activities belonging to FSEGA students, teaching and research staff, to assist decisions for all the actors implicated in the processes, in various specific situations.
Keywords: decision support, decision support systems (DSS), higher education institutions, Information and Communication Technologies (ICT)