65 research outputs found

    Web-based Named Entity Recognition and Data Integration to Accelerate Molecular Biology Research

    Finding information about a biological entity is a step tightly bound to molecular biology research. Despite ongoing efforts, this task is both tedious and time consuming, and tends to become Sisyphean as the number of entities increases. Our aim is to assist researchers by providing them with summary information about biological entities while they are browsing the web, as well as with simplified programmatic access to biological data. To materialise this aim we employ emerging web technologies offering novel web-browsing experiences and new ways of software communication. Reflect is a tool that couples biological named entity recognition with informative summaries, and can be applied to any web page during web browsing. Invoked either via its browser extensions or via its web page, Reflect highlights gene, protein and chemical molecule names in a web page and dynamically attaches summary information to them. The latter provides an overview of what is known about the entity, such as a description, the domain composition, the 3D structure and links to more detailed resources. The annotation process occurs via easy-to-use interfaces. The fast performance allows Reflect to be an interactive companion for scientific readers/researchers while they are surfing the internet. OnTheFly is a web-based application that not only extends Reflect functionality to Microsoft Word, Microsoft Excel, PDF and plain text format files, but also supports the extraction of networks of known and predicted interactions about the entities recognised in a document. A combination of Reflect and OnTheFly offers a data annotation solution for documents used by life science researchers throughout their work. EasySRS is a set of remote methods that expose the functionality of the Sequence Retrieval System (SRS), a data integration platform used in providing access to life science information including genetic, protein, expression and pathway data. 
EasySRS supports simultaneous queries to all of the integrated resources. Accessed from a single point, via the web, and based on a simple, common query format, EasySRS facilitates the task of biological data collection and annotation. EasySRS has been employed to enrich the entries of a Plant Defence Mechanism database. UniprotProfiler is a prototype application that employs EasySRS to generate graphs of knowledge based on database record cross-references. These graphs are converted into 3D diagrams of interconnected data. The 3D diagram generation occurs via Systems Biology visualisation tools that employ intuitive graphs to replace long result lists and facilitate hypothesis generation and knowledge discovery.

    Arena3D: visualization of biological networks in 3D

    Background: Complexity is a key problem when visualizing biological networks; as the number of entities increases, most graphical views become incomprehensible. Our goal is to enable many thousands of entities to be visualized meaningfully and with high performance. Results: We present a new visualization tool, Arena3D, which introduces a new concept of staggered layers in 3D space. Related data – such as proteins, chemicals, or pathways – can be grouped onto separate layers and arranged via layout algorithms, such as Fruchterman-Reingold, distance geometry, and a novel hierarchical layout. Data on a layer can be clustered via k-means, affinity propagation, Markov clustering, neighbor joining, tree clustering, or UPGMA ('unweighted pair-group method with arithmetic mean'). A simple input format defines the name and URL for each node, and defines connections or similarity scores between pairs of nodes. The use of Arena3D is illustrated with datasets related to Huntington's disease. Conclusion: Arena3D is a user-friendly visualization tool that is able to visualize biological or any other network in 3D space. It is free for academic use and runs on any platform. It can be downloaded or launched directly from http://arena3d.org. The Java3D library and Java 1.5 need to be pre-installed for the software to run.

    Environmental variability and heavy metal concentrations from five lagoons in the Ionian Sea (Amvrakikos Gulf, W Greece)

    Background: Coastal lagoons are ecosystems of major importance as they host a number of species tolerant to disturbances and they are highly productive. Therefore, these ecosystems should be protected to ensure stability and resilience. The lagoons of Amvrakikos Gulf form one of the most important lagoonal complexes in Greece. The optimal ecological status of these lagoons is crucial for the well-being of the biodiversity and the economic prosperity of the local communities. Thus, monitoring of the area is necessary to detect possible sources of disturbance and restore stability. New information: Environmental variables and heavy metal concentrations from five lagoons of Amvrakikos Gulf were measured in seasonal samplings and compared to the findings of previous studies in the area, in order to check for possible sources of disturbance. The analysis showed that i) the values of the abiotic parameters vary with time (season), space (lagoon) and with space over time; ii) the variability of the environmental factors and enrichment in certain elements is naturally induced and no source of contamination is detected in the lagoons.

    Polytraits : a database on biological traits of marine polychaetes

    The study of ecosystem functioning – the role which organisms play in an ecosystem – is becoming increasingly important in marine ecological research. The functional structure of a community can be represented by a set of functional traits assigned to behavioural, reproductive and morphological characteristics. The collection of these traits from the literature is, however, a laborious and time-consuming process, and gaps of knowledge and restricted availability of literature are a common problem. Trait data are not yet readily shared by research communities, and even when they are, a lack of trait data repositories and standards for data formats leads to the publication of trait information in forms which cannot be processed by computers. This paper describes Polytraits (http://polytraits.lifewatchgreece.eu), a database on biological traits of marine polychaetes (bristle worms, Polychaeta: Annelida). At present, the database contains almost 20,000 records on morphological, behavioural and reproductive characteristics of more than 1,000 marine polychaete species, all referenced by literature sources. All data can be freely accessed through the project website in different ways and formats, both human-readable and machine-readable, and have been submitted to the Encyclopedia of Life for archival and integration with trait information from other sources.

    Optimized R functions for analysis of ecological community data using the R virtual laboratory (RvLab)

    Background: Parallel data manipulation using R has previously been addressed by members of the R community, however most of these studies produce ad hoc solutions that are not readily available to the average R user. Our targeted users, ranging from the expert ecologist/microbiologists to computational biologists, often experience difficulties in finding optimal ways to exploit the full capacity of their computational resources. In addition, improving performance of commonly used R scripts becomes increasingly difficult especially with large datasets. Furthermore, the implementations described here can be of significant interest to expert bioinformaticians or R developers. Therefore, our goals can be summarized as: (i) description of a complete methodology for the analysis of large datasets by combining capabilities of diverse R packages, (ii) presentation of their application through a virtual R laboratory (RvLab) that makes execution of complex functions and visualization of results easy and readily available to the end-user. New information: In this paper, the novelty stems from implementations of parallel methodologies which rely on the processing of data on different levels of abstraction and the availability of these processes through an integrated portal. Parallel implementation R packages, such as the pbdMPI (Programming with Big Data – Interface to MPI) package, are used to implement Single Program Multiple Data (SPMD) parallelization on primitive mathematical operations, allowing for interplay with functions of the vegan package. The dplyr and RPostgreSQL R packages are further integrated offering connections to dataframe like objects (databases) as secondary storage solutions whenever memory demands exceed available RAM resources. 
The RvLab is running on a PC cluster, using R version 3.1.2 (2014-10-31) on an x86_64-pc-linux-gnu (64-bit) platform, and offers an intuitive virtual environment interface enabling users to perform analysis of ecological and microbial communities based on optimized vegan functions. A beta version of the RvLab is available after registration at: https://portal.lifewatchgreece.eu

    Establishment of computational biology in Greece and Cyprus: Past, present, and future.

    We review the establishment of computational biology in Greece and Cyprus from its inception to date and issue recommendations for future development. We compare output to other countries of similar geography, economy, and size, based on publication counts recorded in the literature, and predict future growth based on those counts as well as national priority areas. Our analysis may be pertinent to wider national or regional communities with challenges and opportunities emerging from the rapid expansion of the field and related industries. Our recommendations suggest a 2-fold growth margin for the 2 countries, as a realistic expectation for further expansion of the field and the development of a credible roadmap of national priorities, both in terms of research and infrastructure funding.

    The environment ontology in 2016: bridging domains with increased scope, semantic density, and interoperation

    Background The Environment Ontology (ENVO; http://www.environmentontology.org/), first described in 2013, is a resource and research target for the semantically controlled description of environmental entities. The ontology's initial aim was the representation of the biomes, environmental features, and environmental materials pertinent to genomic and microbiome-related investigations. However, the need for environmental semantics is common to a multitude of fields, and ENVO's use has steadily grown since its initial description. We have thus expanded, enhanced, and generalised the ontology to support its increasingly diverse applications. Methods We have updated our development suite to promote expressivity, consistency, and speed: we now develop ENVO in the Web Ontology Language (OWL) and employ templating methods to accelerate class creation. We have also taken steps to better align ENVO with the Open Biological and Biomedical Ontologies (OBO) Foundry principles and interoperate with existing OBO ontologies. Further, we applied text-mining approaches to extract habitat information from the Encyclopedia of Life and automatically create experimental habitat classes within ENVO. Results Relative to its state in 2013, ENVO's content, scope, and implementation have been enhanced and much of its existing content revised for improved semantic representation. ENVO now offers representations of habitats, environmental processes, anthropogenic environments, and entities relevant to environmental health initiatives and the global Sustainable Development Agenda for 2030. Several branches of ENVO have been used to incubate and seed new ontologies in previously unrepresented domains such as food and agronomy. The current release version of the ontology, in OWL format, is available at http://purl.obolibrary.org/obo/envo.owl. 
Conclusions ENVO has been shaped into an ontology which bridges multiple domains including biomedicine, natural and anthropogenic ecology, ‘omics, and socioeconomic development. Through continued interactions with our users and partners, particularly those performing data archiving and synthesis, we anticipate that ENVO’s growth will accelerate in 2017. As always, we invite further contributions and collaboration to advance the semantic representation of the environment, ranging from geographic features and environmental materials, across habitats and ecosystems, to everyday objects in household settings.

    ENVIRONMENTS and EOL : identification of Environment Ontology terms in text and the annotation of the Encyclopedia of Life

    © The Author(s), 2015. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Bioinformatics 31 (2015): 1872-1874, doi:10.1093/bioinformatics/btv045. The association of organisms to their environments is a key issue in exploring biodiversity patterns. This knowledge has traditionally been scattered, but textual descriptions of taxa and their habitats are now being consolidated in centralized resources. However, structured annotations are needed to facilitate large-scale analyses. Therefore, we developed ENVIRONMENTS, a fast dictionary-based tagger capable of identifying Environment Ontology (ENVO) terms in text. We evaluate the accuracy of the tagger on a new manually curated corpus of 600 Encyclopedia of Life (EOL) species pages. We use the tagger to associate taxa with environments by tagging EOL text content monthly, and integrate the results into the EOL to disseminate them to a broad audience of users. Funded by the Encyclopedia Of Life Rubenstein Fellows Program [CRDF EOL-33066-13/E33066], the LifeWatchGreece Research Infrastructure [384676-94/GSRT/ NSRF(C&E)] and the Novo Nordisk Foundation Center for Protein Research [NNF14CC0001].
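    To illustrate what "dictionary-based tagging" generally means, here is a minimal Python sketch. It is not the ENVIRONMENTS implementation: the dictionary entries, the `TERM:` identifiers and the function names are hypothetical placeholders, and a production tagger would presumably use a far larger dictionary and a faster matching strategy.

```python
import re

# Hypothetical mini-dictionary: surface form -> placeholder identifier.
# These are NOT real ENVO identifiers.
DICTIONARY = {
    "coastal lagoon": "TERM:0001",
    "lagoon": "TERM:0002",
    "sea water": "TERM:0003",
    "forest soil": "TERM:0004",
    "soil": "TERM:0005",
}


def build_pattern(dictionary):
    """Compile one regex that matches any dictionary name as a whole word."""
    # Longest names first, so "coastal lagoon" is preferred over "lagoon".
    names = sorted(dictionary, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in names)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def tag(text, dictionary=DICTIONARY):
    """Return (start, end, matched_text, identifier) for each match in text."""
    pattern = build_pattern(dictionary)
    return [
        (m.start(), m.end(), m.group(0), dictionary[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    for hit in tag("Found in a coastal lagoon and in forest soil."):
        print(hit)
```

    Sorting the names by length before building the alternation is one simple way to get longest-match behaviour; other designs (tries, hash lookups over token windows) would serve the same purpose.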

    Seqenv : linking sequences to environments through text mining

    Understanding the distribution of taxa and associated traits across different environments is one of the central questions in microbial ecology. High-throughput sequencing (HTS) studies are presently generating huge volumes of data to address this biogeographical topic. However, these studies are often focused on specific environment types or processes leading to the production of individual, unconnected datasets. The large amounts of legacy sequence data with associated metadata that exist can be harnessed to better place the genetic information found in these surveys into a wider environmental context. Here we introduce a software program, seqenv, to carry out precisely such a task. It automatically performs similarity searches of short sequences against the "nt" nucleotide database provided by NCBI and, out of every hit, extracts, if it is available, the textual metadata field. After collecting all the isolation sources from all the search results, we run a text mining algorithm to identify and parse words that are associated with the Environmental Ontology (EnvO) controlled vocabulary. This, in turn, enables us to determine both in which environments individual sequences or taxa have previously been observed and, by weighted summation of those results, to summarize complete samples. We present two demonstrative applications of seqenv to a survey of ammonia oxidizing archaea as well as to a plankton paleome dataset from the Black Sea. These demonstrate the ability of the tool to reveal novel patterns in HTS and its utility in the fields of environmental source tracking, paleontology, and studies of microbial biogeography. How to cite this article: Sinclair et al. (2016), Seqenv: linking sequences to environments through text mining. PeerJ 4:e2690; DOI 10.7717/peerj.2690
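    The "weighted summation" step can be pictured with a small Python sketch. The abstract does not specify the weighting scheme, so this assumes each sequence's environment terms are converted to relative frequencies and then weighted by that sequence's abundance in the sample; the function names and example data are hypothetical and do not reflect seqenv's actual code.

```python
from collections import Counter, defaultdict


def sequence_profile(hit_terms):
    """Relative frequency of each environment term among one sequence's hits."""
    counts = Counter(hit_terms)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: count / total for term, count in counts.items()}


def sample_profile(abundances, hits_by_sequence):
    """Abundance-weighted sum of per-sequence profiles, normalised to sum to 1.

    abundances:       {sequence_id: abundance of that sequence in the sample}
    hits_by_sequence: {sequence_id: [environment term found in each hit, ...]}
    """
    weighted = defaultdict(float)
    for seq_id, abundance in abundances.items():
        profile = sequence_profile(hits_by_sequence.get(seq_id, []))
        for term, freq in profile.items():
            weighted[term] += abundance * freq
    total = sum(weighted.values())
    if total == 0:
        return {}
    return {term: weight / total for term, weight in weighted.items()}


if __name__ == "__main__":
    # Hypothetical sample: two sequences with abundances 3 and 1.
    abundances = {"s1": 3, "s2": 1}
    hits = {"s1": ["soil", "soil", "lake", "lake"], "s2": ["lake"]}
    print(sample_profile(abundances, hits))
```

    Normalising at both levels keeps a sequence with many database hits from dominating the sample summary; whether seqenv makes the same choice is not stated in the abstract.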

    New Mediterranean biodiversity records (October, 2014)

    The Collective Article 'New Mediterranean Biodiversity Records' of the Mediterranean Marine Science journal offers the means to publish biodiversity records in the Mediterranean Sea. The current article is divided into two parts, for records of alien and native species respectively. The new records of alien species include: the red alga Asparagopsis taxiformis (Crete and Lakonikos Gulf, Greece); the red alga Grateloupia turuturu (along the Israeli Mediterranean shore); the mantis shrimp Clorida albolitura (Gulf of Antalya, Turkey); the mud crab Dyspanopeus sayi (Mar Piccolo of Taranto, Ionian Sea); the blue crab Callinectes sapidus (Chios Island, Greece); the isopod Paracerceis sculpta (northern Aegean Sea, Greece); the sea urchin Diadema setosum (Gökova Bay, Turkey); the molluscs Smaragdia souverbiana, Murex forskoehlii, Fusinus verrucosus, Circenita callipyga, and Aplysia dactylomela (Syria); the cephalaspidean mollusc Haminoea cyanomarginata (Baia di Puolo, Massa Lubrense, Campania, southern Italy); the topmouth gudgeon Pseudorasbora parva (Civitavecchia, Tyrrhenian Sea); the fangtooth moray Enchelycore anatina (Plemmirio marine reserve, Sicily); the silver-cheeked toadfish Lagocephalus sceleratus (Saros Bay, Turkey; and Ibiza channel, Spain); the Indo-Pacific ascidian Herdmania momus in Kastelorizo Island (Greece); and the foraminiferal Clavulina multicamerata (Saronikos Gulf, Greece). The record of L. sceleratus in Spain constitutes the deepest (350-400 m depth) record of the species in the Mediterranean Sea. The new records of native species include: first record of the ctenophore Cestum veneris in Turkish marine waters; the presence of Holothuria tubulosa and Holothuria polii in the Bay of Igoumenitsa (Greece); the first recorded sighting of the bull ray Pteromylaeus bovinus in Maltese waters; and a new record of the fish Lobotes surinamensis from Maliakos Gulf.