496 research outputs found

    Hidden breakpoints in genome alignments

    During the course of evolution, an organism's genome can undergo changes that affect its large-scale structure. These changes include gene gain, loss, duplication, chromosome fusion, fission, and rearrangement. When gene gain and loss occur in addition to other types of rearrangement, rearrangement breakpoints can exist that are only detectable by comparison of three or more genomes. An arbitrarily large number of these "hidden" breakpoints can exist among genomes that exhibit no rearrangements in pairwise comparisons. We present an extension of the multichromosomal breakpoint median problem to genomes that have undergone gene gain and loss. We then demonstrate that the median distance among three genomes can be used to calculate a lower bound on the number of hidden breakpoints present. We provide an implementation of this calculation including the median distance, along with some practical improvements on the time complexity of the underlying algorithm. We apply our approach to measure the abundance of hidden breakpoints in simulated data sets under a wide range of evolutionary scenarios. We demonstrate that in simulations the hidden breakpoint counts depend strongly on relative rates of inversion and gene gain/loss. Finally, we apply current multiple genome aligners to the simulated genomes, and show that all aligners introduce a high degree of error in hidden breakpoint counts, and that this error grows with evolutionary distance in the simulation. Our results suggest that hidden breakpoint error may be pervasive in genome alignments. Comment: 13 pages, 4 figures
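
    The idea of a breakpoint that no pairwise comparison can see may be easier to follow with a toy case. The sketch below is only an illustration under stated assumptions, not the median-based method of the paper: genomes are modelled as linear lists of signed integer gene identifiers, pairwise breakpoints are counted on the genes two genomes share, and the helper names (`pairwise_breakpoints`, `collinear_order_exists`) are hypothetical. The brute-force search is only feasible for a handful of genes.

```python
from itertools import permutations, product


def reverse_complement(genome):
    """Read a signed gene order from the opposite strand."""
    return [-g for g in reversed(genome)]


def is_subsequence(sub, seq):
    """True if `sub` occurs in `seq` in order (not necessarily contiguously)."""
    it = iter(seq)
    return all(any(s == t for t in it) for s in sub)


def pairwise_breakpoints(g1, g2):
    """Count adjacencies of g1, restricted to genes shared with g2, that g2 lacks.

    Assumption of this sketch: linear genomes, telomere adjacencies ignored.
    """
    shared = {abs(g) for g in g1} & {abs(g) for g in g2}
    r1 = [g for g in g1 if abs(g) in shared]
    r2 = [g for g in g2 if abs(g) in shared]
    adj2 = set()
    for a, b in zip(r2, r2[1:]):
        adj2.add((a, b))
        adj2.add((-b, -a))  # the same adjacency read from the other strand
    return sum((a, b) not in adj2 for a, b in zip(r1, r1[1:]))


def collinear_order_exists(genomes):
    """Brute force: is there one signed gene order that contains every genome
    (or its reverse complement) as a subsequence, i.e. an explanation that
    needs only gene gain and loss and no rearrangement?"""
    genes = sorted({abs(g) for genome in genomes for g in genome})
    for perm in permutations(genes):
        for signs in product((1, -1), repeat=len(genes)):
            order = [s * g for s, g in zip(signs, perm)]
            if all(
                is_subsequence(g, order)
                or is_subsequence(reverse_complement(g), order)
                for g in genomes
            ):
                return True
    return False


# Three toy genomes, each missing one of the genes 1, 2, 3.
toy = [[1, 2], [2, 3], [3, 1]]
pairs = [(0, 1), (0, 2), (1, 2)]
print([pairwise_breakpoints(toy[i], toy[j]) for i, j in pairs])  # [0, 0, 0]
print(collinear_order_exists(toy))  # False
```

    In this toy case every pair of genomes shares a single gene, so no pairwise comparison can show a rearrangement, yet the three gene orders together imply the cyclic constraint 1 before 2, 2 before 3, 3 before 1. At least one rearrangement is therefore needed, and it only becomes visible when all three genomes are compared.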

    Wastewater-based Estimation of Substances Discharged at the Rest Areas along the State Highways in Kentucky

    The availability of licit and illicit stimulants and their adverse consequences for public health have emerged as a major drug threat to communities in the United States. Despite several drug-involved traffic incidents along the interstate highways, this is the first comprehensive, quantitative report of drugs discharged at the rest areas along the interstate highways. In this National Institute of Justice-funded study, the amounts of several discharged drugs, focusing on stimulants but also including opioids and prescription antipsychotics, are being measured in raw wastewater collected from five rest areas and a truck servicing facility using a state-of-the-art mass spectrometry technique. Three stimulants (cocaine, methamphetamine, and amphetamine), two opioids (hydrocodone and tramadol), a THC metabolite, and four antidepressants (venlafaxine, citalopram, fluoxetine, and sertraline) were detected in all of the collected wastewater samples in the early phases of the project. Methamphetamine was the most prevalent stimulant (40.0-1240 mg/d), followed by the cocaine metabolite (9.18-385 mg/d) and amphetamine (14.9-97.9 mg/d). The methamphetamine discharge normalized to the number of rest area users at the Christian County rest area (I-24E) was 1.8-fold higher than at the Whitehaven rest area (I-24W) and 7.8-fold higher than at the Laurel County truck service facility (I-75). The significantly higher ratio of cocaine to its metabolite (>1.0) found at the Whitehaven rest area suggested the possibility of a direct discharge of cocaine on two select days in October. Overall, we established a unique collaboration among the Appalachian High Intensity Drug Trafficking Area (HIDTA), the Kentucky Transportation Cabinet, the Cabinet for Health and Family Services, Murray State University, and the University of Kentucky.

    High-throughput sequence alignment using Graphics Processing Units

    BACKGROUND: The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. RESULTS: This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. CONCLUSION: MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.

    Frequency and Types of Healthcare Encounters in the Week Preceding a Sepsis Hospitalization: A Systematic Review

    OBJECTIVES: Early recognition and treatment are critical to improving sepsis outcomes. We sought to identify the frequency and types of encounters that patients have with the healthcare system in the week prior to a sepsis hospitalization. DATA SOURCES: PubMed, Cumulative Index to Nursing and Allied Health Literature, Scopus, and the Cochrane Library. STUDY SELECTION: Observational cohort studies of patients hospitalized with sepsis or septic shock that were assessed for an outpatient or emergency department encounter with the healthcare system in the week prior to hospital admission. DATA EXTRACTION: The primary outcome was the proportion of patients with a healthcare encounter in the time period assessed (up to 1 week) prior to a hospitalization with sepsis. DATA SYNTHESIS: Six retrospective observational studies encompassing 6,785,728 sepsis admissions were included for evaluation, ranging from a 263-patient single-center cohort to a large database evaluating 6,731,827 sepsis admissions. The average (unweighted) proportion of patients having an encounter with the healthcare system in the week prior to a sepsis hospitalization was 32.7% and ranged from 10.3% to 52.9%. These encounters commonly involved presentation or potential symptoms of infectious diseases, antibiotic prescriptions, and appeared to increase in frequency closer to a sepsis hospitalization admission. No consistent factors were identified that distinguished a healthcare encounter as more or less likely to precede a sepsis hospitalization in the subsequent week. CONCLUSIONS: Patients that present to the hospital with sepsis are frequently evaluated in the healthcare system in the week prior to admission. 
Further research is necessary to understand whether these encounters offer earlier opportunities for intervention to prevent the transition from infection to sepsis, or whether they merely reflect the comorbidities of sepsis patients with a high baseline rate of healthcare encounters or the declining trajectory of a patient's overall health in response to infection.

    Small-scale spatial variability of soil CO2 flux: Implication for monitoring strategy

    In recent decades, soil CO2 flux measurements have often been used in both volcanic and seismically active areas to investigate the interconnections between temporal and spatial anomalies in degassing and telluric activities. In this study, we focus on a narrow degassing area of the Piton de la Fournaise volcano, which was chosen for its proximity to, and link with, the frequently active volcanic area. Our aim is to constrain the degassing in this narrow area and identify the potential processes involved in both spatial and temporal soil CO2 variations in order to provide an enhanced monitoring strategy for soil CO2 flux. We performed a geophysical survey (self-potential measurements: SP; electrical resistivity tomography: ERT) to provide a high-resolution description of the subsurface. We identified one main SP negative anomaly dividing the area into two zones. Based on these results, we set ten control points, from the site of the main SP negative anomaly up to 230 m away, where soil CO2 fluxes were measured weekly during one year of intense eruptive activity at Piton de la Fournaise. Our findings show that lateral and vertical soil heterogeneities and structures exert a strong control on the degassing pattern. We find that temporal soil CO2 flux series at control points close to the main SP negative anomaly better record variations linked to the volcanic activity. We also show that the synchronicity between the increase of soil CO2 flux and deep seismicity can be best explained by a pulsed process pushing out the CO2 already stored and fractionated in the system. Importantly, our findings show that low soil CO2 fluxes and low carbon isotopic signature are able to track variations of volcanic activity in the same way as high fluxes and high carbon isotopic signature do.
This result gives important insights for the monitoring strategy of volcanic and seismotectonic areas in geodynamic contexts characterized by difficult environmental operational conditions, as commonly met in tropical areas.

    Cardiovascular risk estimation and eligibility for statins in primary prevention comparing different strategies.

    Recommendations for statin use for primary prevention of coronary heart disease (CHD) are based on estimation of the 10-year CHD risk. It is unclear which risk algorithm and guidelines should be used in European populations. Using data from a population-based study in Switzerland, we first assessed 10-year CHD risk and eligibility for statins in 5,683 women and men 35 to 75 years of age without cardiovascular disease by comparing recommendations by the European Society of Cardiology without and with extrapolation of risk to age 60 years, the International Atherosclerosis Society, and the US Adult Treatment Panel III. The proportions of participants classified as high-risk for CHD were 12.5% (15.4% with extrapolation), 3.0%, and 5.8%, respectively. Proportions of participants eligible for statins were 9.2% (11.6% with extrapolation), 13.7%, and 16.7%, respectively. Assuming full compliance with each guideline, expected relative decreases in CHD deaths in Switzerland over a 10-year period would be 16.4% (17.5% with extrapolation), 18.7%, and 19.3%, respectively; the corresponding numbers needed to treat to prevent 1 CHD death would be 285 (340 with extrapolation), 380, and 440, respectively. In conclusion, the proportion of subjects classified as high risk for CHD varied over a fivefold range across recommendations. Following the International Atherosclerosis Society and the Adult Treatment Panel III recommendations might prevent more CHD deaths at the cost of higher numbers needed to treat compared with European Society of Cardiology guidelines.

    Hydrogeology of Stromboli volcano, Aeolian Islands (Italy) from the interpretation of resistivity tomograms, self-potential, soil temperature and soil CO2 concentration measurements

    To gain better insight into the hydrogeology and the location of the main tectonic faults of Stromboli volcano in Italy, we collected electrical resistivity measurements, soil CO2 concentrations, temperature and self-potential measurements along two profiles. These two profiles started at the village of Ginostra in the southwest part of the island. The first profile (4.8 km in length) ended at the village of Scari in the northeast part of the volcano and the second one (3.5 km in length) at Forgia Vecchia beach, in the eastern part of the island. These data were used to provide insights regarding the position of shallow aquifers and the extension of the hydrothermal system. This large-scale study is complemented by two high-resolution studies, one at the Pizzo area (near the active vents) and one at Rina Grande where flank collapse areas can be observed. The Pizzo corresponds to one of the main degassing structures of the hydrothermal system. The main degassing area is localized along a higher permeability area corresponding to the head of the gliding plane of the Rina Grande sector collapse. We found that the self-potential data reveal the position of an aquifer above the villages of Scari and San Vincenzo. We provide an estimate of the depth of this aquifer from these data. The lateral extension of the hydrothermal system (resistivity ∼15-60 ohm m) is broader than anticipated, extending in the direction of the villages of Scari and San Vincenzo (in agreement with temperature data recorded in shallow wells). The lateral extension of the hydrothermal system reaches the lower third of the Rina Grande sector collapse area in the eastern part of the island. The hydrothermal body in this area is blocked by an old collapse boundary. This position of the hydrothermal body is consistent with low values of the magnetization (<2.5 A m−1) from previously published work.
The presence of the hydrothermal body below Rina Grande raises questions about the mechanical stability of this flank of the edifice.

    Endogenous Avian Leukosis Virus subgroup E elements of the chicken reference genome

    The chicken reference genome contains two endogenous Avian Leukosis Virus subgroup E (ALVE) insertions, but gaps and unresolved repetitive sequences in previous assemblies have hindered their precise characterisation. Detailed analysis of the most recent reference genome (GRCg6a) now shows both ALVEs within contiguous chromosome assemblies for the first time. ALVE6 (ALVE-JFevA) and ALVE-JFevB are both located on chromosome 1, with ALVE6 close to the p arm telomere. ALVE-JFevB is a structurally intact element containing the ALVE gag, pol and env genes, and is capable of forming replication competent viruses. In contrast, ALVE6 (ALVE-JFevA) contains a 3352 bp 5’ truncation and lacks the entire 5’ LTR and gag gene. Despite this, ALVE6 remains able to produce intact envelope protein, likely due to a mutation in the recognition site for a known inhibitory miRNA (miR-155). Whole genome resequencing datasets from layers, broilers and three independent sources of wild-caught red junglefowl were surveyed for the presence of each of these reference genome ALVEs. ALVE-JFevB was found in no other chicken or red junglefowl genomes, whereas ALVE6 was identified in some layers, broilers and native breeds, but not within any other red junglefowl genome. Improved assembly contiguity has facilitated better characterisation of the two ALVEs of the chicken reference genome. However, both the limited ALVE content and the unique presence of ALVE-JFevB suggest that the reference individual is unrepresentative of ancestral Gallus gallus ALVE diversity.

    CGAT: a comparative genome analysis tool for visualizing alignments in the analysis of complex evolutionary changes between closely related genomes

    BACKGROUND: The recent accumulation of closely related genomic sequences provides a valuable resource for the elucidation of the evolutionary histories of various organisms. However, although numerous alignment calculation and visualization tools have been developed to date, the analysis of complex genomic changes, such as large insertions, deletions, inversions, translocations and duplications, still presents certain difficulties. RESULTS: We have developed a comparative genome analysis tool, named CGAT, which allows detailed comparisons of closely related bacteria-sized genomes mainly through visualizing middle-to-large-scale changes to infer underlying mechanisms. CGAT displays precomputed pairwise genome alignments on both dotplot and alignment viewers with scrolling and zooming functions, and allows users to move along the pre-identified orthologous alignments. Users can place several types of information on this alignment, such as the presence of tandem repeats or interspersed repetitive sequences and changes in G+C contents or codon usage bias, thereby facilitating the interpretation of the observed genomic changes. In addition to displaying precomputed alignments, the viewer can dynamically calculate the alignments between specified regions; this feature is especially useful for examining the alignment boundaries, as these boundaries are often obscure and can vary between programs. Besides the alignment browser functionalities, CGAT also contains an alignment data construction module, which contains various procedures that are commonly used for pre- and post-processing for large-scale alignment calculation, such as the split-and-merge protocol for calculating long alignments, chaining adjacent alignments, and ortholog identification. Indeed, CGAT provides a general framework for the calculation of genome-scale alignments using various existing programs as alignment engines, which allows users to compare the outputs of different alignment programs. 
Earlier versions of this program have been used successfully in our research to infer the evolutionary history of apparently complex genome changes between closely related eubacteria and archaea. CONCLUSION: CGAT is a practical tool for analyzing complex genomic changes between closely related genomes using existing alignment programs and other sequence analysis tools combined with extensive manual inspection.

    Longest Increasing Subsequence under Persistent Comparison Errors

    We study the problem of computing a longest increasing subsequence in a sequence $S$ of $n$ distinct elements in the presence of persistent comparison errors. In this model, every comparison between two elements can return the wrong result with some fixed (small) probability $p$, and comparisons cannot be repeated. Computing the longest increasing subsequence exactly is impossible in this model; therefore, the objective is to identify a subsequence that (i) is indeed increasing and (ii) has a length that approximates the length of the longest increasing subsequence. We present asymptotically tight upper and lower bounds on both the approximation factor and the running time. In particular, we present an algorithm that computes an $O(\log n)$-approximation in time $O(n \log n)$, with high probability. This approximation relies on the fact that we can approximately sort $n$ elements in $O(n \log n)$ time such that the maximum dislocation of an element is at most $O(\log n)$. For the lower bounds, we prove that (i) there is a set of sequences, such that on a sequence picked randomly from this set every algorithm must return an $\Omega(\log n)$-approximation with high probability, and (ii) any $O(\log n)$-approximation algorithm for longest increasing subsequence requires $\Omega(n \log n)$ comparisons, even in the absence of errors.
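
    To make the error model concrete, the following is a minimal sketch under stated assumptions and not the approximation algorithm of the paper. It shows the classical error-free $O(n \log n)$ longest increasing subsequence computation as a baseline, plus a small comparator that models persistent errors: each pair of elements is answered wrongly with probability `p`, and asking again returns the same answer. The names `longest_increasing_subsequence` and `PersistentNoisyComparator` are hypothetical.

```python
import bisect
import random


def longest_increasing_subsequence(seq):
    """Error-free baseline: one longest strictly increasing subsequence of seq."""
    tails = []      # tails[k]: smallest tail value of an increasing run of length k + 1
    tails_idx = []  # position in seq of that tail
    prev = [-1] * len(seq)  # predecessor links used to rebuild the answer
    for i, x in enumerate(seq):
        k = bisect.bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
            tails_idx.append(i)
        else:
            tails[k] = x
            tails_idx[k] = i
        prev[i] = tails_idx[k - 1] if k > 0 else -1
    out = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        out.append(seq[i])
        i = prev[i]
    return out[::-1]


class PersistentNoisyComparator:
    """Toy model of persistent comparison errors on distinct elements."""

    def __init__(self, p, seed=0):
        self.p = p
        self.rng = random.Random(seed)
        self.wrong = {}  # unordered pair -> whether this pair is answered wrongly

    def less(self, a, b):
        key = (a, b) if a < b else (b, a)
        if key not in self.wrong:
            # The error is decided once per pair and then never changes.
            self.wrong[key] = self.rng.random() < self.p
        truth = a < b
        return (not truth) if self.wrong[key] else truth


print(longest_increasing_subsequence([3, 1, 4, 2, 5]))  # [1, 2, 5]

cmp = PersistentNoisyComparator(p=0.1, seed=42)
first = cmp.less(7, 9)
print(all(cmp.less(7, 9) == first for _ in range(100)))  # True: repeating does not help
```

    Because a repeated query always returns the same answer, majority voting over repeated comparisons cannot remove the errors, which is why the abstract targets an approximation rather than the exact longest increasing subsequence.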