42 research outputs found

    Textpresso for Neuroscience: Searching the Full Text of Thousands of Neuroscience Research Papers

    Textpresso is a text-mining system for scientific literature. Its two major features are access to the full text of research papers and the development and use of categories of biological concepts as well as categories that describe or relate objects. A search engine enables the user to search for one or a combination of these categories and/or keywords within an entire literature. Here we describe Textpresso for Neuroscience, part of the core Neuroscience Information Framework (NIF). The Textpresso site currently consists of 67,500 full text papers and 131,300 abstracts. We show that using categories in literature can make a pure keyword query more refined and meaningful. We also show how semantic queries can be formulated with categories only. We explain the build and content of the database and describe the main features of the web pages and the advanced search options. We also give detailed illustrations of the web service developed to provide programmatic access to Textpresso. This web service is used by the NIF interface to access Textpresso. The standalone website of Textpresso for Neuroscience can be accessed at http://www.textpresso.org/neuroscience

    Textpresso - an Information Retrieval and Extraction System for Biological Literature

    We developed an information retrieval and extraction system that processes the full text of biological papers. The system, called Textpresso, separates text into sentences, labels words and phrases according to an ontology (an organized lexicon), and allows queries to be performed on a database of labeled sentences. The current ontology comprises approximately one hundred categories of terms, such as "gene", "regulation", "human disease", "brain area" etc., and also contains main Gene Ontology (GO) categories. Extraction of particular biological facts, such as gene-gene interactions, or the curation of GO cellular components, can be accelerated significantly by ontologies, with Textpresso automatically performing nearly as well as expert curators at identifying sentences. We have established search engines for four literatures (C. elegans, Drosophila, Arabidopsis and Neuroscience), and other groups around the world have developed thirteen systems for other literatures. Currently, our four systems contain 112,000 papers with 40 million sentences; all systems worldwide contain 190,000 papers with approximately 65 million sentences.

    Redox-Active Antibiotics Control Gene Expression and Community Behavior in Divergent Bacteria

    It is thought that bacteria excrete redox-active pigments as antibiotics to inhibit competitors. In Pseudomonas aeruginosa, the endogenous antibiotic pyocyanin activates SoxR, a transcription factor conserved in Proteo- and Actinobacteria. In Escherichia coli, SoxR regulates the superoxide stress response. Bioinformatic analysis coupled with gene expression studies in P. aeruginosa and Streptomyces coelicolor revealed that the majority of SoxR regulons in bacteria lack the genes required for stress responses, despite the fact that many of these organisms still produce redox-active small molecules, which indicates that redox-active pigments play a role independent of oxidative stress. These compounds had profound effects on the structural organization of colony biofilms in both P. aeruginosa and S. coelicolor, which shows that "secondary metabolites" play important conserved roles in gene expression and development

    Data Carpentry: Workshops to Increase Data Literacy for Researchers

    In many domains the rapid generation of large amounts of data is fundamentally changing how research is done. The deluge of data presents great opportunities, but also many challenges in managing, analyzing and sharing data. However, good training resources for researchers who want to develop the skills to work more effectively and productively are scarce, and there is little space in the existing curriculum for courses or additional lectures. To address this need we have developed an introductory two-day intensive workshop, Data Carpentry, designed to teach basic concepts, skills, and tools for working more effectively and reproducibly with data. These workshops are based on Software Carpentry: two-day, hands-on, bootcamp style workshops teaching best practices in software development, that have demonstrated the success of short workshops to teach foundational research skills. Data Carpentry focuses on data literacy in particular, with the objective of teaching skills to researchers to enable them to retrieve, view, manipulate, analyze and store their own and others’ data in an open and reproducible way in order to extract knowledge from data.

    Perennial grasslands enhance biodiversity and multiple ecosystem services in bioenergy landscapes

    Agriculture is being challenged to provide food, and increasingly fuel, for an expanding global population. Producing bioenergy crops on marginal lands (farmland suboptimal for food crops) could help meet energy goals while minimizing competition with food production. However, the ecological costs and benefits of growing bioenergy feedstocks (primarily annual grain crops) on marginal lands have been questioned. Here we show that perennial bioenergy crops provide an alternative to annual grains that increases biodiversity of multiple taxa and sustains a variety of ecosystem functions, promoting the creation of multifunctional agricultural landscapes. We found that switchgrass and prairie plantings harbored significantly greater plant, methanotrophic bacteria, arthropod, and bird diversity than maize. Although biomass production was greater in maize, all other ecosystem services, including methane consumption, pest suppression, pollination, and conservation of grassland birds, were higher in perennial grasslands. Moreover, we found that the linkage between biodiversity and ecosystem services is dependent not only on the choice of bioenergy crop but also on its location relative to other habitats, with local landscape context as important as crop choice in determining provision of some services. Our study suggests that bioenergy policy that supports coordinated land use can diversify agricultural landscapes and sustain multiple critical ecosystem services.

    Computing Workflows for Biologists: A Roadmap.

    Extremely large datasets have become routine in biology. However, performing a computational analysis of a large dataset can be overwhelming, especially for novices. Here, we present a step-by-step guide to computing workflows with the biologist end-user in mind. Starting from a foundation of sound data management practices, we make specific recommendations on how to approach and perform computational analyses of large datasets, with a view to enabling sound, reproducible biological research

    Effects of Compression on Language Evolution
