297 research outputs found

    Robot life: simulation and participation in the study of evolution and social behavior.

    This paper explores the use of robots to simulate evolution, in particular the case of Hamilton's Law. The use of robots raises several questions that this paper seeks to address. The first concerns the role of the robots in biological research: do they simulate something (life, evolution, sociality) or do they participate in something? The second concerns the physicality of the robots: what difference does embodiment make to the role of the robot in these experiments? Thirdly, how do life, embodiment and social behavior relate in contemporary biology, and why is it possible for robots to illuminate this relation? These questions are provoked by a strange similarity that has not been noted before: between the problem of simulation in philosophy of science and Deleuze's reading of Plato on the relationship of ideas, copies and simulacra.

    Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering

    Background: The identification and study of proteins from metagenomic datasets can shed light on the roles and interactions of the source organisms in their communities. However, metagenomic datasets are characterized by the presence of organisms with varying GC composition, codon usage biases, etc., and consequently gene identification is challenging. The vast amount of sequence data also requires faster protein family classification tools. Results: We present a computational improvement to a sequence clustering approach that we developed previously to identify and classify protein coding genes in large microbial metagenomic datasets. The clustering approach can be used to identify protein coding genes in prokaryotes, viruses, and intron-less eukaryotes. The computational improvement is based on an incremental clustering method that does not require the expensive all-against-all compute that was required by the original approach, while still preserving the remote homology detection capabilities. We present evaluations of the clustering approach in protein-coding gene identification and classification, and also present the results of updating the protein clusters from our previous work with recent genomic and metagenomic sequences. The clustering results are available via CAMERA (http://camera.calit2.net). Conclusion: The clustering paradigm is shown to be a very useful tool in the analysis of microbial metagenomic data. The incremental clustering method is shown to be much faster than the original approach in identifying genes, grouping sequences into existing protein families, and also identifying novel families that have multiple members in a metagenomic dataset. These clusters provide a basis for further studies of protein families.

    Analysis and comparison of very large metagenomes with fast clustering and functional annotation

    Background: The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes) are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand. Results: The new metagenomic data analysis method Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP) was developed using an ultra-fast sequence clustering algorithm, fast protein family annotation tools, and a novel statistical metagenome comparison method that employs a unique graphic interface. RAMMCAP processes extremely large datasets with only moderate computational effort. It identifies raw read clusters and protein clusters that may include novel gene families, and compares metagenomes using clusters or functional annotations calculated by RAMMCAP. In this study, RAMMCAP was applied to the two largest available metagenomic collections, the "Global Ocean Sampling" and the "Metagenomic Profiling of Nine Biomes". Conclusion: RAMMCAP is a very fast method that can cluster and annotate one million metagenomic reads in only hundreds of CPU hours. It is available from http://tools.camera.calit2.net/camera/rammcap/.

    Autism as a disorder of neural information processing: directions for research and targets for therapy

    The broad variation in phenotypes and severities within autism spectrum disorders suggests the involvement of multiple predisposing factors, interacting in complex ways with normal developmental courses and gradients. Identification of these factors, and the common developmental path into which they feed, is hampered by the large degrees of convergence from causal factors to altered brain development, and divergence from abnormal brain development into altered cognition and behaviour. Genetic, neurochemical, neuroimaging and behavioural findings on autism, as well as studies of normal development and of genetic syndromes that share symptoms with autism, offer hypotheses as to the nature of causal factors and their possible effects on the structure and dynamics of neural systems. Such alterations in neural properties may in turn perturb activity-dependent development, giving rise to a complex behavioural syndrome many steps removed from the root causes. Animal models based on genetic, neurochemical, neurophysiological, and behavioural manipulations offer the possibility of exploring these developmental processes in detail, as do human studies addressing endophenotypes beyond the diagnosis itself.

    Comparison of distance measures in spatial analytical modeling for health service planning

    Background: Several methodological approaches have been used to estimate distance in health service research. In this study, focusing on cardiac catheterization services, Euclidean, Manhattan, and the less widely known Minkowski distance metrics are used to estimate distances from patient residence to hospital. Distance metrics typically produce less accurate estimates than actual measurements, but each metric provides a single model of travel over a given network. Therefore, distance metrics, unlike actual measurements, can be directly used in spatial analytical modeling. Euclidean distance is most often used, but is unlikely to be the most appropriate metric. Minkowski distance is a more promising method. Distances estimated with each metric are contrasted with road distance and travel time measurements, and an optimized Minkowski distance is implemented in spatial analytical modeling. Methods: Road distance and travel time are calculated from the postal code of residence of each patient undergoing cardiac catheterization to the pertinent hospital. The Minkowski metric is optimized to approximate travel time and road distance, respectively. Distance estimates and distance measurements are then compared using descriptive statistics and visual mapping methods. The optimized Minkowski metric is implemented, via the spatial weight matrix, in a spatial regression model identifying socio-economic factors significantly associated with cardiac catheterization. Results: The Minkowski coefficient that best approximates road distance is 1.54; 1.31 best approximates travel time. The latter is also a good predictor of road distance, thus providing the best single model of travel from patient's residence to hospital. The Euclidean metric and the optimal Minkowski metric are alternatively implemented in the regression model, and the results compared. The Minkowski method produces more reliable results than the traditional Euclidean metric. Conclusion: Road distance and travel time measurements are the most accurate estimates, but cannot be directly implemented in spatial analytical modeling. Euclidean distance tends to underestimate road distance and travel time; Manhattan distance tends to overestimate both. The optimized Minkowski distance partially overcomes their shortcomings; it provides a single model of travel over the network. The method is flexible, suitable for analytical modeling, and more accurate than the traditional metrics; its use ultimately increases the reliability of spatial analytical models.
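
    As a rough illustration of the metrics named in this abstract, the sketch below computes a Minkowski distance of order p between two points. It is not the authors' implementation: the function name `minkowski_distance` and the planar coordinates are hypothetical, and only the coefficients 1.54 and 1.31 come from the abstract. Setting p = 1 gives Manhattan distance and p = 2 gives Euclidean distance.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length coordinate tuples."""
    if p < 1:
        raise ValueError("p must be >= 1 for the result to be a metric")
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# Hypothetical planar coordinates (arbitrary units), not data from the study.
residence = (0.0, 0.0)
hospital = (3.0, 4.0)

manhattan = minkowski_distance(residence, hospital, 1)      # p = 1
euclidean = minkowski_distance(residence, hospital, 2)      # p = 2
road_like = minkowski_distance(residence, hospital, 1.54)   # coefficient reported for road distance
time_like = minkowski_distance(residence, hospital, 1.31)   # coefficient reported for travel time

for name, value in [("Manhattan", manhattan), ("p = 1.31", time_like),
                    ("p = 1.54", road_like), ("Euclidean", euclidean)]:
    print(f"{name}: {value:.2f}")
```

    For a fixed pair of points the distance shrinks as p grows, so both intermediate coefficients fall between the Euclidean and Manhattan values, which matches the abstract's remark that one tends to underestimate and the other to overestimate.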

    Is a history of work-related low back injury associated with prevalent low back pain and depression in the general population?

    Background: Little is known about the role of prior occupational low back injury in future episodes of low back pain and disability in the general population. We conducted a study to determine if a lifetime history of work-related low back injury is associated with prevalent severity-graded low back pain, depressive symptoms, or both, in the general population. Methods: We used data from the Saskatchewan Health and Back Pain Survey – a population-based cross-sectional survey mailed to a random, stratified sample of 2,184 Saskatchewan adults 20 to 69 years of age in 1995. Information on the main independent variable was gathered by asking respondents whether they had ever injured their low back at work. Our outcomes, the 6-month period prevalence of severity-graded low back pain and depressive symptoms during the past week, were measured with valid and reliable questionnaires. The associations between prior work-related low back injury and our outcomes were estimated through multinomial and binary multivariable logistic regression with adjustment for age, gender, and other important covariates. Results: Fifty-five percent of the eligible population participated. Of the 1,086 participants who responded to the question about the main independent variable, 38.0% reported a history of work-related low back injury. A history of work-related low back injury was positively associated with low intensity/low disability low back pain (OR, 3.66; 95% CI, 2.48–5.42), with high intensity/low disability low back pain (OR, 4.03; 95% CI, 2.41–6.76), and with high disability low back pain (OR, 6.76; 95% CI, 3.80–12.01). No association was found between a history of work-related low back injury and depression (OR, 0.85; 95% CI, 0.55–1.30). Conclusion: Our analysis shows an association between past occupational low back injury and increasing severity of prevalent low back pain, but not depression. These results suggest that past work-related low back injury may be an important risk factor for future episodes of low back pain and disability in the general population.

    Probing Metagenomics by Rapid Cluster Analysis of Very Large Datasets

    BACKGROUND: The scale and diversity of metagenomic sequencing projects challenge both our technical and conceptual approaches in gene and genome annotations. The recent Sorcerer II Global Ocean Sampling (GOS) expedition yielded millions of predicted protein sequences, which significantly altered the landscape of known protein space by more than doubling its size and adding thousands of new families (Yooseph et al., 2007 PLoS Biol 5, e16). Such datasets, not only by their sheer size, but also by many other features, defy conventional analysis and annotation methods. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we describe an approach for rapid analysis of the sequence diversity and the internal structure of such very large datasets by advanced clustering strategies using the newly modified CD-HIT algorithm. We performed a hierarchical clustering analysis on the 17.4 million Open Reading Frames (ORFs) identified from the GOS study and found over 33 thousand large predicted protein clusters comprising nearly 6 million sequences. Twenty percent of these clusters did not match known protein families by sequence similarity search and might represent novel protein families. Distributions of the large clusters were illustrated on organism composition, functional class, and sample locations. CONCLUSION/SIGNIFICANCE: Our clustering took about two orders of magnitude less computational effort than the similar protein family analysis of the original GOS study. This approach will help to analyze other large metagenomic datasets in the future. A Web server with our clustering results and annotations of predicted protein clusters is available online at http://tools.camera.calit2.net/gos under the CAMERA project.

    CADM1 is a strong neuroblastoma candidate gene that maps within a 3.72 Mb critical region of loss on 11q23

    Background: Recurrent loss of part of the long arm of chromosome 11 is a well established hallmark of a subtype of aggressive neuroblastomas. Despite intensive mapping efforts to localize the culprit 11q tumour suppressor gene, this search has been unsuccessful thus far as no sufficiently small critical region could be delineated for selection of candidate genes. Methods: To refine the critical region of 11q loss, the chromosome 11 status of 100 primary neuroblastoma tumours and 29 cell lines was analyzed using a BAC array containing a chromosome 11 tiling path. For the genes mapping within our refined region of loss, meta-analysis on published neuroblastoma mRNA gene expression datasets was performed for candidate gene selection. The DNA methylation status of the resulting candidate gene was determined using re-expression experiments by treatment of neuroblastoma cells with the demethylating agent 5-aza-2'-deoxycytidine and bisulphite sequencing. Results: Two small critical regions of loss within 11q23 at chromosomal band 11q23.1-q23.2 (1.79 Mb) and 11q23.2-q23.3 (3.72 Mb) were identified. In a first step towards further selection of candidate neuroblastoma tumour suppressor genes, we performed a meta-analysis on published expression profiles of 692 neuroblastoma tumours. Integration of the resulting candidate gene list with expression data of neuroblastoma progenitor cells pinpointed CADM1 as a compelling candidate gene. Meta-analysis indicated that CADM1 expression has prognostic significance and differential expression for the gene was noted in unfavourable neuroblastoma versus normal neuroblasts. Methylation analysis provided no evidence for a two-hit mechanism in 11q deleted cell lines. Conclusion: Our study puts CADM1 forward as a strong candidate neuroblastoma suppressor gene. Further functional studies are warranted to elucidate the role of CADM1 in neuroblastoma development and to investigate the possibility of CADM1 haploinsufficiency in neuroblastoma.

    Coordinating Environmental Genomics and Geochemistry Reveals Metabolic Transitions in a Hot Spring Ecosystem

    We have constructed a conceptual model of biogeochemical cycles and metabolic and microbial community shifts within a hot spring ecosystem via coordinated analysis of the “Bison Pool” (BP) Environmental Genome and a complementary contextual geochemical dataset of ∼75 geochemical parameters. 2,321 16S rRNA clones and 470 megabases of environmental sequence data were produced from biofilms at five sites along the outflow of BP, an alkaline hot spring in Sentinel Meadow (Lower Geyser Basin) of Yellowstone National Park. This channel acts as a >22 m gradient of decreasing temperature, increasing dissolved oxygen, and changing availability of biologically important chemical species, such as those containing nitrogen and sulfur. Microbial life at BP transitions from a 92°C chemotrophic streamer biofilm community in the BP source pool to a 56°C phototrophic mat community. We improved automated annotation of the BP environmental genomes using BLAST-based Markov clustering. We have also assigned environmental genome sequences to individual microbial community members by complementing traditional homology-based assignment with nucleotide word-usage algorithms, allowing more than 70% of all reads to be assigned to source organisms. This assignment yields high genome coverage in dominant community members, facilitating reconstruction of nearly complete metabolic profiles and in-depth analysis of the relation between geochemical and metabolic changes along the outflow. We show that changes in environmental conditions and energy availability are associated with dramatic shifts in microbial communities and metabolic function. We have also identified an organism constituting a novel phylum in a metabolic “transition” community, located physically between the chemotroph- and phototroph-dominated sites. The complementary analysis of biogeochemical and environmental genomic data from BP has allowed us to build ecosystem-based conceptual models for this hot spring, reconstructing whole metabolic networks in order to illuminate community roles in shaping and responding to geochemical variability.