An Optimal Linear Time Algorithm for Quasi-Monotonic Segmentation
Monotonicity is a simple yet significant qualitative characteristic. We
consider the problem of segmenting a sequence into up to K segments. We want
segments to be as monotonic as possible and to alternate signs. We propose a
quality metric for this problem using the l_inf norm, and we present an optimal
linear time algorithm based on novel formalism. Moreover, given a
precomputation in time O(n log n) consisting of a labeling of all extrema, we
compute any optimal segmentation in constant time. We compare experimentally
its performance to two piecewise linear segmentation heuristics (top-down and
bottom-up). We show that our algorithm is faster and more accurate.
Applications include pattern recognition and qualitative modeling. Comment: This is the extended version of our ICDM'05 paper (arXiv:cs/0702142)
Practical Evaluation of Lempel-Ziv-78 and Lempel-Ziv-Welch Tries
We present the first thorough practical study of the Lempel-Ziv-78 and the
Lempel-Ziv-Welch computation based on trie data structures. With a careful
selection of trie representations we can beat well-tuned popular trie data
structures like Judy, m-Bonsai or Cedar.
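For readers unfamiliar with the computation this abstract refers to, the following is a minimal sketch of LZ78 factorization driven by a trie. It assumes the simplest possible trie representation (a Python list of per-node dictionaries); the function names `lz78_factorize` and `lz78_decode` are illustrative and are not taken from the paper, whose tuned trie representations are not reproduced here.

```python
def lz78_factorize(text):
    """Return LZ78 factors as (parent_node_id, char) pairs.

    Assumption: the trie is a list of dicts, where trie[i] maps a
    character to the id of the child of node i. Node 0 is the root.
    """
    trie = [{}]
    factors = []
    node = 0  # current position in the trie while matching a phrase
    for ch in text:
        child = trie[node].get(ch)
        if child is not None:
            # The current phrase can be extended; keep walking down.
            node = child
        else:
            # Longest known phrase ends here: add a new leaf and emit a factor.
            trie[node][ch] = len(trie)
            trie.append({})
            factors.append((node, ch))
            node = 0
    if node != 0:
        # Input ended in the middle of a phrase that is already in the trie.
        factors.append((node, ""))
    return factors


def lz78_decode(factors):
    """Rebuild the text from (parent_node_id, char) factors."""
    phrases = [""]  # phrases[i] is the string spelled by trie node i
    out = []
    for parent, ch in factors:
        phrase = phrases[parent] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)
```

The dictionary-per-node layout is chosen here only for readability; the practical question studied in the paper is which trie representation makes this loop fast and compact.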
A Call to Arms: Revisiting Database Design
Good database design is crucial to obtain a sound, consistent database, and -
in turn - good database design methodologies are the best way to achieve the
right design. These methodologies are taught to most Computer Science
undergraduates, as part of any Introduction to Database class. They can be
considered part of the "canon", and indeed, the overall approach to database
design has been unchanged for years. Moreover, none of the major database
research assessments identify database design as a strategic research
direction.
Should we conclude that database design is a solved problem?
Our thesis is that database design remains a critical unsolved problem.
Hence, it should be the subject of more research. Our starting point is the
observation that traditional database design is not used in practice - and if
it were used it would result in designs that are not well adapted to current
environments. In short, database design has failed to keep up with the times.
In this paper, we put forth arguments to support our viewpoint, analyze the
root causes of this situation and suggest some avenues of research. Comment: Removed spurious column break. Nothing else was changed.
SPEC Kit 361: Outreach and Engagement
ARL SPEC Kit. Library outreach is experiencing a renaissance. Librarians have been reaching out to their communities and developing programming for decades, but libraries are increasingly being asked to demonstrate their value to the communities that they serve. In response, outreach positions are becoming more commonplace and communities of practice are emerging around measuring the impact of library outreach activities. This SPEC Kit was born out of the authors’ struggles and successes in providing academic library outreach services at their local institutions. The survey questions were designed to gather information from ARL institutions to create a picture of library outreach that spans institutions: a professional baseline. Questions of organizational priorities, vision, goals, resource allocation, staffing models, and assessment come together to paint the picture of how libraries are approaching outreach programs. The survey was sent to the 125 ARL member institutions in July 2018, with 57 (46%) responding by the August 6 deadline. The data gathered suggest that systematic outreach programs are still very much in their infancy and highly dependent on local organizational culture. This SPEC Kit highlights the areas where libraries share approaches to outreach programs while also shining a spotlight on issues that warrant continued research and attention by outreach librarians and library administrators.
Fourier analysis of 2-point Hermite interpolatory subdivision schemes
Two subdivision schemes with Hermite data on Z are studied. These schemes use 2 or 7 parameters respectively, depending on whether the Hermite data involve only first derivatives or include second derivatives. For a large region in the parameter space, the schemes are C1 or C2 convergent, or at least are convergent on the space of Schwartz distributions. The Fourier transform of any interpolating function can be computed through products of matrices of order 2 or 3. The Fourier transform is related to a specific system of functional equations whose analytic solution is unique except for a multiplicative constant. The main arguments for these results come from the Paley-Wiener-Schwartz theorem on the characterization of the Fourier transforms of distributions with compact support and a theorem of Artzrouni about convergent products of matrices.
Exome-wide association study of pancreatic cancer risk
We conducted a case-control exome-wide association study to discover germline variants in coding regions that affect risk for pancreatic cancer, combining data from 5 studies. We analyzed exome and genome sequencing data from 437 patients with pancreatic cancer (cases) and 1922 individuals not known to have cancer (controls). In the primary analysis, BRCA2 had the strongest enrichment for rare inactivating variants (17/437 cases vs 3/1922 controls) (P=3.27×10^-6; exome-wide statistical significance threshold P<2.5×10^-6). Cases had more rare inactivating variants in DNA repair genes than controls, even after excluding 13 genes known to predispose to pancreatic cancer (adjusted odds ratio, 1.35, P=.045). At the suggestive threshold (P<.001), 6 genes were enriched for rare damaging variants (UHMK1, AP1G2, DNTA, CHST6, FGFR3, and EPHA1) and 7 genes had associations with pancreatic cancer risk, based on the sequence-kernel association test. We confirmed variants in BRCA2 as the most common high-penetrant genetic factor associated with pancreatic cancer and we also identified candidate pancreatic cancer genes. Large collaborations and novel approaches are needed to overcome the genetic heterogeneity of pancreatic cancer predisposition.
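To make the BRCA2 carrier counts quoted in this abstract concrete, the sketch below computes a crude, unadjusted odds ratio from a 2×2 table. This is plain arithmetic on the published counts (17 of 437 cases, 3 of 1922 controls); it is not the statistic or the model the authors used, and the function name `crude_odds_ratio` is hypothetical.

```python
def crude_odds_ratio(case_carriers, n_cases, control_carriers, n_controls):
    """Unadjusted odds ratio for carrying a variant, cases vs controls.

    Assumption: a simple 2x2 table with no covariate adjustment, which
    differs from the adjusted analyses described in the abstract.
    """
    case_noncarriers = n_cases - case_carriers
    control_noncarriers = n_controls - control_carriers
    if case_noncarriers == 0 or control_carriers == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    # Odds of carrying among cases divided by odds of carrying among controls.
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


if __name__ == "__main__":
    # BRCA2 rare inactivating variant counts as quoted in the abstract.
    print(round(crude_odds_ratio(17, 437, 3, 1922), 2))
```

With only three carrier controls, an estimate of this kind is very imprecise, which is one reason the exome-wide P-value threshold matters when reading the result.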
Histone deacetylase activity is necessary for left-right patterning during vertebrate development
Background: Consistent asymmetry of the left-right (LR) axis is a crucial aspect of vertebrate embryogenesis. Asymmetric gene expression of the TGFβ superfamily member Nodal related 1 (Nr1) in the left lateral mesoderm plate is a highly conserved step regulating the situs of the heart and viscera. In Xenopus, movement of maternal serotonin (5HT) through gap-junctional paths at cleavage stages dictates asymmetry upstream of Nr1. However, the mechanisms linking earlier biophysical asymmetries with this transcriptional control point are not known. Results: To understand how an early physiological gradient is transduced into a late, stable pattern of Nr1 expression, we investigated epigenetic regulation during LR patterning. Embryos injected with mRNA encoding a dominant-negative of Histone Deacetylase (HDAC) lacked Nr1 expression and exhibited randomized sidedness of the heart and viscera (heterotaxia) at stage 45. Timing analysis using pharmacological blockade of HDACs implicated cleavage stages as the active period. Inhibition during these early stages was correlated with an absence of Nr1 expression at stage 21, high levels of heterotaxia at stage 45, and the deposition of the epigenetic marker H3K4me2 on the Nr1 gene. To link the epigenetic machinery to the 5HT signaling pathway, we performed a high-throughput proteomic screen for novel cytoplasmic 5HT partners associated with the epigenetic machinery. The data identified the known HDAC partner protein Mad3 as a 5HT-binding regulator. While Mad3 overexpression led to an absence of Nr1 transcription and randomized the LR axis, a mutant form of Mad3 lacking 5HT binding sites was not able to induce heterotaxia, showing that Mad3's biological activity is dependent on 5HT binding. Conclusion: HDAC activity is a new LR determinant controlling the epigenetic state of Nr1 from early developmental stages. The HDAC binding partner Mad3 may be a new serotonin-dependent regulator of asymmetry linking early physiological asymmetries to stable changes in gene expression during organogenesis.
Successful identification of rare variants using oligogenic segregation analysis as a prioritizing tool for whole-exome sequencing studies
We aim to identify rare variants that have large effects on trait variance using a cost-efficient strategy. We use an oligogenic segregation analysis as a prioritizing tool for whole-exome sequencing studies to identify families more likely to harbor rare variants, by estimating the mean number of quantitative trait loci (QTLs) in each family. We hypothesize that families with additional QTLs, relative to the other families, are more likely to segregate functional rare variants. We test the association of rare variants with the traits only in regions where at least modest evidence of linkage with the trait is observed, thereby reducing the number of tests performed. We found that family 7 harbored an estimated two, one, and zero additional QTLs for traits Q1, Q2, and Q4, respectively. Two rare variants (C4S4935 and C6S2981) segregating in family 7 were associated with Q1 and explained a substantial proportion of the observed linkage signal. These rare variants have 31 and 22 carriers, respectively, in the 128-member family and entered through a single but different founder. For Q2, we found one rare variant unique to family 7 that showed small effect and weak evidence of association; this was a false positive. These results are a proof of principle that prioritizing the sequencing of carefully selected extended families is a simple and cost-efficient design strategy for sequencing studies aiming at identifying functional rare variants.
Biomarkers of Methylmercury Exposure Immunotoxicity among Fish Consumers in Amazonian Brazil
Background: Mercury (Hg) is a ubiquitous environmental contaminant with neurodevelopmental and immune system effects. An informative biomarker of Hg-induced immunotoxicity could aid studies on the potential contribution to immune-related health effects