An introduction to scripting in Ruby for biologists
The Ruby programming language has a lot to offer to any scientist with electronic data to process. Not only is the initial learning curve very shallow, but its reflection and meta-programming capabilities allow for the rapid creation of relatively complex applications while still keeping the code short and readable. This paper provides a gentle introduction to this scripting language for researchers without formal informatics training, such as many wet-lab scientists. We hope this will give such researchers an idea of how powerful a tool Ruby can be for their data management tasks and encourage them to learn more about it.
Comparative genomics of tadpole shrimps (Crustacea, Branchiopoda, Notostraca): Dynamic genome evolution against the backdrop of morphological stasis
This analysis presents five genome assemblies of four Notostraca taxa. The origin of Notostraca dates to the Permian/Upper Devonian, and the extant forms show a striking morphological similarity to fossil taxa. The comparison of the sequenced genomes with other Branchiopoda genomes shows that, despite the morphological stasis, Notostraca share a dynamic genome evolution, with high turnover in gene family expansion/contraction and a transposable element content comparable to that of other branchiopods. While the Notostraca substitution rate appears similar to or lower than that of other branchiopods, a subset of genes shows a faster evolutionary pace, highlighting the difficulty of generalizing about genomic stasis versus dynamism. Moreover, we found that the variation in Triops cancriformis transposable element content appeared linked to reproductive strategies, in line with theoretical expectations. Overall, besides providing new genomic resources for the study of these organisms, which appear relevant for their ecology and evolution, we also confirmed the decoupling of morphological and molecular evolution.
Statistical expression deconvolution from mixed tissue samples
Motivation: Global expression patterns within cells are used for purposes ranging from the identification of disease biomarkers to basic understanding of cellular processes. Unfortunately, tissue samples used in cancer studies are usually composed of multiple cell types and the non-cancerous portions can significantly affect expression profiles. This severely limits the conclusions that can be made about the specificity of gene expression in the cell-type of interest. However, statistical analysis can be used to identify differentially expressed genes that are related to the biological question being studied
Centroacinar cells are progenitors that contribute to endocrine pancreas regeneration
Diabetes is associated with a paucity of insulin-producing β-cells. With the goal of finding therapeutic routes to treat diabetes, we aim to find molecular and cellular mechanisms involved in β-cell neogenesis and regeneration. To facilitate discovery of such mechanisms, we use a vertebrate organism where pancreatic cells readily regenerate. The larval zebrafish pancreas contains Notch-responsive progenitors that during development give rise to adult ductal, endocrine, and centroacinar cells (CACs). Adult CACs are also Notch responsive and are morphologically similar to their larval predecessors. To test our hypothesis that adult CACs are also progenitors, we took two complementary approaches: 1) We established the transcriptome for adult CACs. Using gene ontology, transgenic lines, and in situ hybridization, we found that the CAC transcriptome is enriched for progenitor markers. 2) Using lineage tracing, we demonstrated that CACs do form new endocrine cells after β-cell ablation or partial pancreatectomy. We concluded that CACs and their larval predecessors are the same cell type and represent an opportune model to study both β-cell neogenesis and β-cell regeneration. Furthermore, we show that in cftr loss-of-function mutants, there is a deficiency of larval CACs, providing a possible explanation for pancreatic complications associated with cystic fibrosis
Team level identification predicts perceived and actual team performance: longitudinal multilevel analyses with sports teams
Social identification and team performance literatures typically focus on the relationship between individual differences in identification and individual-level performance. By using a longitudinal multilevel approach, involving 369 members of 45 sports teams across England and Italy, we compared how team-level and individual-level variance in social identification together predicted team and individual performance outcomes. As hypothesised, team-level variance in identification significantly predicted subsequent levels of both perceived and actual team performance in cross-lagged analyses. Conversely, individual-level variance in identification did not significantly predict subsequent levels of perceived individual performance. These findings support recent calls for social identity to be considered a multilevel construct and highlight the influence of group-level social identification on group-level processes and outcomes, over and above its individual-level effects
A quasi classical approach to electron impact ionization
A quasi classical approximation to quantum mechanical scattering in the Moeller formalism is developed. While keeping the numerical advantage of a standard Classical-Trajectory Monte Carlo calculation, our approach is no longer restricted to stationary initial distributions. This allows one to improve the results by using better suited initial phase space distributions than the microcanonical one and to gain insight into the collision mechanism by studying the influence of different initial distributions on the cross section. A comprehensive account of results for single, double and triple differential cross sections for atomic hydrogen will be given, in comparison with experiment and other theories. Comment: 21 pages, 10 figures, submitted to J Phys
Improving the performance of DomainDiscovery of protein domain boundary assignment using inter-domain linker index
BACKGROUND: Knowledge of protein domain boundaries is critical for the characterisation and understanding of protein function. The ability to identify domains without knowledge of the structure, using sequence information only, is an essential step in many types of protein analyses. In the present study, we demonstrate that the performance of DomainDiscovery is improved significantly by including the inter-domain linker index value for domain identification from sequence-based information. Improved DomainDiscovery uses a Support Vector Machine (SVM) approach and a unique training dataset built on the principle of consensus among experts in defining domains in protein structure. The SVM was trained using a PSSM (Position Specific Scoring Matrix), secondary structure, solvent accessibility information and the inter-domain linker index to detect possible domain boundaries for a target sequence. RESULTS: Improved DomainDiscovery is compared with other methods by benchmarking against a structurally non-redundant dataset and also CASP5 targets. Improved DomainDiscovery achieves 70% accuracy for domain boundary identification in multi-domain proteins. CONCLUSION: Improved DomainDiscovery compares favourably to the performance of other methods and excels in the identification of domain boundaries for multi-domain proteins as a result of introducing a support vector machine with the benchmark_2 dataset.
A Systematic Survey of Mini-Proteins in Bacteria and Archaea
BACKGROUND: Mini-proteins, defined as polypeptides containing no more than 100 amino acids, are ubiquitous in prokaryotes and eukaryotes. They play significant roles in various biological processes, and their regulatory functions are gradually attracting the attention of scientists. However, the functions of the majority of mini-proteins are still largely unknown due to the constraints of experimental methods and bioinformatic analysis. METHODOLOGY/PRINCIPAL FINDINGS: In this article, we extracted a total of 180,879 mini-proteins from the annotations of 532 sequenced genomes, including 491 strains of Bacteria and 41 strains of Archaea. The average proportion of mini-proteins among all genomic proteins is approximately 10.99%, but different strains exhibit remarkable fluctuations. These mini-proteins display two notable characteristics. First, the majority are species-specific proteins, with an average proportion of 58.79% among six representative phyla. Second, an even larger proportion (70.03% among all strains) consists of hypothetical proteins. However, a fraction of highly conserved hypothetical proteins potentially play crucial roles in organisms. Among mini-proteins with known functions, it seems that regulatory and metabolic proteins are more abundant than essential structural proteins. Furthermore, domains in mini-proteins seem to have greater distributions in Bacteria than Eukarya. Analysis of the evolutionary progression of these domains reveals that they have diverged to new patterns from a single ancestor. CONCLUSIONS/SIGNIFICANCE: Mini-proteins are ubiquitous in bacterial and archaeal species and play significant roles in various functions. The number of mini-proteins in each genome displays remarkable fluctuation, likely resulting from the differential selective pressures that reflect the respective life-styles of the organisms. The answers to many questions surrounding mini-proteins remain elusive and need to be resolved experimentally.