A Tree Index to Support Clustering Based Exploratory Data Analysis
Martin C, Nattkemper TW. A Tree Index to Support Clustering Based Exploratory Data Analysis. In: Bioinformatics Research and Development : Second International Conference, BIRD 2008 Vienna, Austria, July 7-9, 2008 Proceedings. Communications in Computer and Information Science. Vol 13. Berlin, Heidelberg: Springer; 2008: 1-15
Getting More out of Biomedical Documents with GATE's Full Lifecycle Open Source Text Analytics.
This software article describes the GATE family of open source text analysis tools and processes. GATE is one of the most
widely used systems of its type with yearly download rates of tens of thousands and many active users in both academic
and industrial contexts. In this paper we report three examples of GATE-based systems operating in the life sciences and in
medicine. First, in genome-wide association studies which have contributed to discovery of a head and neck cancer
mutation association. Second, medical records analysis which has significantly increased the statistical power of treatment/
outcome models in the UK’s largest psychiatric patient cohort. Third, richer constructs in drug-related searching. We also
explore the ways in which the GATE family supports the various stages of the lifecycle present in our examples. We conclude
that the deployment of text mining for document abstraction or rich search and navigation is best thought of as a process,
and that with the right computational tools and data collection strategies this process can be made defined and repeatable.
The GATE research programme is now 20 years old and has grown from its roots as a specialist development tool for text
processing to become a rather comprehensive ecosystem, bringing together software developers, language engineers and
research staff from diverse fields. GATE now has a strong claim to cover a uniquely wide range of the lifecycle of text analysis
systems. It forms a focal point for the integration and reuse of advances that have been made by many people (the majority
outside of the authors’ own group) who work in text processing for biomedicine and other areas. GATE is available online
under GNU open source licences and runs on all major operating systems. Support is available from an active user and
developer community and also on a commercial basis
Serious Games Are Not Serious Enough for Myoelectric Prosthetics
Serious games show a lot of potential for use in movement rehabilitation (eg, after a stroke, injury to the spinal cord, or limb loss). However, the nature of this research leads to diversity both in the background of the researchers and in the approaches of their investigation. Our close examination and categorization of virtual training software for upper limb prosthetic rehabilitation found that researchers typically followed one of two broad approaches: (1) focusing on the game design aspects to increase engagement and muscle training and (2) concentrating on an accurate representation of prosthetic training tasks, to induce task-specific skill transfer. Previous studies indicate muscle training alone does not lead to improved prosthetic control without a transfer-enabling task structure. However, the literature shows a recent surge in the number of game-based prosthetic training tools, which focus on engagement without heeding the importance of skill transfer. This influx appears to have been strongly influenced by the availability of both software and hardware, specifically the launch of a commercially available acquisition device and freely available high-profile game development engines. In this Viewpoint, we share our perspective on the current trends and progress of serious games for prosthetic training
Simple identification tools in FishBase
Simple identification tools for fish species were included in the FishBase information system from its inception. Early tools made use of the relational model and characters like fin ray meristics. Soon pictures and drawings were added as a further help, similar to a field guide. Later came the computerization of existing dichotomous keys, again in combination with pictures and other information, and the ability to restrict possible species by country, area, or taxonomic group. Today, www.FishBase.org offers four different ways to identify species. This paper describes these tools with their advantages and disadvantages, and suggests various options for further
development. It explores the possibility of a holistic and integrated computer-aided strategy
Characterization of microbial communities in carbonate sediments
Microbial communities in carbonate sediments from the alkaline Lake Neusiedl and
the Aldabra Atoll were characterized. The aim was to determine the microbial
community composition and function in the context of their contribution to
biogeochemical cycles and carbonate precipitation. Total DNA and RNA were
extracted from sediment and water samples. 16S ribosomal RNA genes and transcripts
were amplified and sequenced to determine the bacterial community composition.
Metagenomes were assembled from selected sampling sites to determine the
functional potential encoded within the microbial community. Detailed insights into
bacterial genomes and metabolism were gained through isolation and characterisation
of two novel bacterial species derived from Aldabra.
The first sampling campaign represents the proof-of-concept study at Lake
Neusiedl (Chapter C.1 & C.2). In this study the sampling procedure for the push-cores
and water column was established. Bacterial 16S rRNA genes were amplified from the
total DNA, sequenced, and analysed. The results showed that freshwater
picoplanktonic Alphaproteobacteria and Actinobacteriota were abundant in the water
column (Chapter C.1). Together with Synechococcales sheaths they may provide
nucleation sites for carbonate precipitation in the water column. The sediment
followed the standard biogeochemical succession and showed signs of diatom
dissolution (Chapter C.2). This was linked to high abundance of heterotrophic
Gammaproteobacteria and fermenting Chloroflexota, which likely contributed to
maintaining the neutral pH and supported the dissolution process.
The main sampling campaign to the Aldabra Atoll took place at the end of the
dry season in November 2017. Sediment cores and water samples were taken at three
sampling sites in the lagoon and one pool at the island rim (Chapter C.3). The bacterial
community composition was identified using both 16S rRNA genes and transcripts,
covering both present and past members of the community. The sampling sites Cinq
Cases and Westpool D were selected for direct metagenome sequencing and analysis,
as these were landlocked pools with a history of stromatolites (Chapter C.5). The sand
sediment was oxic with low bacterial diversities and dominant Pseudomonas. The
surface was covered by a slightly lithified crust, potentially linked to tidally induced
carbonate oversaturation and precipitation driven by the activity of Gloeocapsopsis
(Chapter C.3). In the mud and silt sediments, bioturbation and tidal mixing led to a
mixed surface and sulphate reduction zone. These were followed by atypical low
bacterial phylogenetic diversity zones with high proportions of Gammaproteobacteria.
Their onset was linked to changes in redox conditions, sediment age and available
organic material (Chapter C.3). This was supported by results from the analysis of
abundant metagenome-assembled genomes (MAGs) of the low-diversity zones at Cinq
Cases. The MAGs harboured key genes for aerobic metabolism and denitrification
(Chapter C.5). MAGs and 16S rRNA genes from Westpool D suggested that a biofilm
comprising Gloeocapsa, Salinivibrio and Francisella is responsible for biologically
induced carbonate precipitation of the local stromatolites. The unlithified microbial
mat at the bottom of the pond harboured Cyanobium and Arthrospira, indicating that
only specific Cyanobacteria support carbonate precipitation (Chapter C.5).
To identify novel bacteria and provide information on the vast majority of
uncultured taxa, we enriched halophilic members of the bacterial community. Two
isolates were selected and characterized both physiologically and genomically
(Chapter C.4). Pontibacillus sp. ALD_SL1 was isolated from the mudflat of the South
Lagoon and exhibited a high relative abundance (30%) in the active bacterial
community of the water column at Cinq Cases. Psychroflexus sp. ALD_RP9 was
isolated from the bacterial bloom at Westpool D. Its ability to form extensive EPS to
protect itself from salt and solar radiation may result in binding Ca2+-ions. Upon EPS
degradation, local increase of Ca2+ and rearrangement of the EPS residues support the
nucleation of carbonates.
This study encompasses the first characterization of microbial communities from
the Aldabra Atoll using amplicon, metagenome, and genome analyses. The study
highlights the different modes of carbonate precipitation, which can occur in the
lacustrine and lagoonal environments. It also provides a basis for in-depth analysis of
individual members of the community and their involvement in sediment
biogeochemical cycling.
2021-12-0
Optimal neighborhood indexing for protein similarity search
Background: Similarity inference, one of the main bioinformatics tasks, must cope with the exponential growth of biological data. A classical approach to handling this data flow involves heuristics with large seed indexes. To speed up this technique, the index can be enhanced by storing additional information that limits the number of random memory accesses. However, this improvement leads to a larger index that may become a bottleneck. In the case of protein similarity search, we propose to decrease the index size by reducing the amino acid alphabet.
Results: The paper presents two main contributions. First, we show that an optimal neighborhood indexing combining an alphabet reduction and a longer neighborhood reduces the memory involved in the process by 35% without sacrificing either the quality of results or the computational time. Second, our approach led us to develop a new kind of substitution score matrices and their associated e-value parameters. In contrast to usual matrices, these matrices are rectangular since they compare amino acid groups from different alphabets. We describe the method used for computing those matrices and we provide some typical examples that can be used in such comparisons. Supplementary data can be found on the website http://bioinfo.lifl.fr/reblosum.
Conclusions: We propose a practical reduction of the index size for the neighborhood data that does not negatively affect the performance of large-scale search in protein sequences. Such an index can be used in any study involving large protein data. Moreover, rectangular substitution score matrices and their associated statistical parameters can have applications in any study involving an alphabet reduction
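As a rough illustration of why reducing the amino acid alphabet shrinks a seed index, the sketch below maps residues to a smaller set of group symbols and indexes k-mers over both alphabets. It is not the method of the paper: the grouping in `GROUPS`, the example sequence, the seed length `k`, and the helper names are all assumptions chosen for illustration, and the neighborhood data and rectangular score matrices described in the abstract are not modelled.

```python
from collections import defaultdict

# Hypothetical partition of the 20 amino acids into 7 groups.
# Illustrative only; this is NOT the reduced alphabet used in the paper.
GROUPS = ["AGST", "C", "DENQ", "FWY", "HKR", "ILMV", "P"]

# Map every residue to the first letter of its group.
REDUCE = {aa: group[0] for group in GROUPS for aa in group}


def reduce_sequence(seq):
    """Rewrite a protein sequence over the reduced alphabet."""
    return "".join(REDUCE[aa] for aa in seq)


def build_seed_index(seq, k):
    """Map each k-mer (seed) to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(seq) - k + 1):
        index[seq[pos:pos + k]].append(pos)
    return index


protein = "MKVLAAGIVGLKR"  # made-up example sequence
k = 3                      # assumed seed length

full_index = build_seed_index(protein, k)
reduced_index = build_seed_index(reduce_sequence(protein), k)

# Upper bound on the number of distinct index keys for each alphabet.
max_full_keys = 20 ** k
max_reduced_keys = len(GROUPS) ** k

print("possible keys, full alphabet:   ", max_full_keys)
print("possible keys, reduced alphabet:", max_reduced_keys)
print("distinct seeds actually indexed:", len(full_index), "vs", len(reduced_index))
```

With fewer possible keys, the table of index entries is smaller, but each seed becomes less specific. The abstract describes offsetting this with a longer neighborhood.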
Rapid population decline in migratory shorebirds relying on Yellow Sea tidal mudflats as stopover sites
Migratory animals are threatened by human-induced global change. However, little is known about how stopover habitat, essential for refuelling during migration, affects the population dynamics of migratory species. Using 20 years of continent-wide citizen science data, we assess population trends of ten shorebird taxa that refuel on Yellow Sea tidal mudflats, a threatened ecosystem that has shrunk by >65% in recent decades. Seven of the taxa declined at rates of up to 8% per year. Taxa with the greatest reliance on the Yellow Sea as a stopover site showed the greatest declines, whereas those that stop primarily in other regions had slowly declining or stable populations. Decline rate was unaffected by shared evolutionary history among taxa and was not predicted by migration distance, breeding range size, non-breeding location, generation time or body size. These results suggest that changes in stopover habitat can severely limit migratory populations