Textpresso for Neuroscience: Searching the Full Text of Thousands of Neuroscience Research Papers
Textpresso is a text-mining system for scientific literature. Its two major features are access to the full text of research papers and the development and use of categories of biological concepts as well as categories that describe or relate objects. A search engine enables the user to search for one or a combination of these categories and/or keywords within an entire literature. Here we describe Textpresso for Neuroscience, part of the core Neuroscience Information Framework (NIF). The Textpresso site currently consists of 67,500 full text papers and 131,300 abstracts. We show that using categories in literature can make a pure keyword query more refined and meaningful. We also show how semantic queries can be formulated with categories only. We explain the build and content of the database and describe the main features of the web pages and the advanced search options. We also give detailed illustrations of the web service developed to provide programmatic access to Textpresso. This web service is used by the NIF interface to access Textpresso. The standalone website of Textpresso for Neuroscience can be accessed at http://www.textpresso.org/neuroscience
Textpresso - an Information Retrieval and Extraction System for Biological Literature
We developed an information retrieval and extraction system that processes the full
text of biological papers. The system, called Textpresso, separates text into
sentences, labels words and phrases according to an ontology (an organized lexicon),
and allows queries to be performed on a database of labeled sentences. The current
ontology comprises approximately one hundred categories of terms, such as "gene",
"regulation", "human disease", "brain area" etc., and also contains main Gene
Ontology (GO) categories. Extraction of particular biological facts, such as gene-gene
interactions, or the curation of GO cellular components, can be accelerated
significantly by ontologies, with Textpresso automatically performing nearly as well as
expert curators at identifying sentences. We have established search engines for four literatures (C. elegans,
Drosophila, Arabidopsis and Neuroscience), and other groups around the world have developed thirteen
systems for other literatures.
Currently, our four systems contain 112,000 papers with 40 million sentences; all
systems worldwide contain 190,000 papers with approximately 65 million sentences
Redox-Active Antibiotics Control Gene Expression and Community Behavior in Divergent Bacteria
It is thought that bacteria excrete redox-active pigments as antibiotics to inhibit competitors. In Pseudomonas aeruginosa, the endogenous antibiotic pyocyanin activates SoxR, a transcription factor conserved in Proteo- and Actinobacteria. In Escherichia coli, SoxR regulates the superoxide stress response. Bioinformatic analysis coupled with gene expression studies in P. aeruginosa and Streptomyces coelicolor revealed that the majority of SoxR regulons in bacteria lack the genes required for stress responses, despite the fact that many of these organisms still produce redox-active small molecules, which indicates that redox-active pigments play a role independent of oxidative stress. These compounds had profound effects on the structural organization of colony biofilms in both P. aeruginosa and S. coelicolor, which shows that "secondary metabolites" play important conserved roles in gene expression and development
Data Carpentry: Workshops to Increase Data Literacy for Researchers
In many domains the rapid generation of large amounts of data is fundamentally changing how research is done. The deluge of data presents great opportunities, but also many challenges in managing, analyzing and sharing data. However, good training resources for researchers who want to develop the skills to work more effectively and productively are scarce, and there is little space in the existing curriculum for courses or additional lectures. To address this need we have developed an introductory two-day intensive workshop, Data Carpentry, designed to teach basic concepts, skills, and tools for working more effectively and reproducibly with data. These workshops are based on Software Carpentry: two-day, hands-on, bootcamp-style workshops teaching best practices in software development, which have demonstrated the success of short workshops in teaching foundational research skills. Data Carpentry focuses on data literacy in particular, with the objective of teaching researchers the skills to retrieve, view, manipulate, analyze and store their own and others' data in an open and reproducible way in order to extract knowledge from data
Perennial grasslands enhance biodiversity and multiple ecosystem services in bioenergy landscapes
Agriculture is being challenged to provide food, and increasingly fuel, for an expanding global population. Producing bioenergy crops on marginal lands (farmland suboptimal for food crops) could help meet energy goals while minimizing competition with food production. However, the ecological costs and benefits of growing bioenergy feedstocks (primarily annual grain crops) on marginal lands have been questioned. Here we show that perennial bioenergy crops provide an alternative to annual grains that increases biodiversity of multiple taxa and sustains a variety of ecosystem functions, promoting the creation of multifunctional agricultural landscapes. We found that switchgrass and prairie plantings harbored significantly greater plant, methanotrophic bacteria, arthropod, and bird diversity than maize. Although biomass production was greater in maize, all other ecosystem services, including methane consumption, pest suppression, pollination, and conservation of grassland birds, were higher in perennial grasslands. Moreover, we found that the linkage between biodiversity and ecosystem services is dependent not only on the choice of bioenergy crop but also on its location relative to other habitats, with local landscape context as important as crop choice in determining provision of some services. Our study suggests that bioenergy policy that supports coordinated land use can diversify agricultural landscapes and sustain multiple critical ecosystem services
Journal of Open Source Software (JOSS): design and first-year review
This article describes the motivation, design, and progress of the Journal of Open Source Software (JOSS). JOSS is a free and open-access journal that publishes articles describing research software. It has the dual goals of improving the quality of the software submitted and providing a mechanism for research software developers to receive credit. While designed to work within the current merit system of science, JOSS addresses the dearth of rewards for key contributions to science made in the form of software. JOSS publishes articles that encapsulate scholarship contained in the software itself, and its rigorous peer review targets the software components: functionality, documentation, tests, continuous integration, and the license. A JOSS article contains an abstract describing the purpose and functionality of the software, references, and a link to the software archive. The article is the entry point of a JOSS submission, which encompasses the full set of software artifacts. Submission and review proceed in the open, on GitHub. Editors, reviewers, and authors work collaboratively and openly. Unlike other journals, JOSS does not reject articles requiring major revision; while not yet accepted, articles remain visible and under review until the authors make adequate changes (or withdraw, if unable to meet requirements). Once an article is accepted, JOSS gives it a digital object identifier (DOI), deposits its metadata in Crossref, and the article can begin collecting citations on indexers like Google Scholar and other services. Authors retain copyright of their JOSS article, releasing it under a Creative Commons Attribution 4.0 International License. In its first year, starting in May 2016, JOSS published 111 articles, with more than 40 additional articles under review. JOSS is a sponsored project of the nonprofit organization NumFOCUS and is an affiliate of the Open Source Initiative (OSI)
Computing Workflows for Biologists: A Roadmap
Extremely large datasets have become routine in biology. However, performing a computational analysis of a large dataset can be overwhelming, especially for novices. Here, we present a step-by-step guide to computing workflows with the biologist end-user in mind. Starting from a foundation of sound data management practices, we make specific recommendations on how to approach and perform computational analyses of large datasets, with a view to enabling sound, reproducible biological research