367 research outputs found

    String Indexing for Patterns with Wildcards

    We consider the problem of indexing a string t of length n to report the occurrences of a query pattern p containing m characters and j wildcards. Let occ be the number of occurrences of p in t, and \sigma the size of the alphabet. We obtain the following results.
    - A linear space index with query time O(m + \sigma^j \log \log n + occ). This significantly improves the previously best known linear space index by Lam et al. [ISAAC 2007], which requires query time \Theta(jn) in the worst case.
    - An index with query time O(m + j + occ) using space O(\sigma^{k^2} n \log^k \log n), where k is the maximum number of wildcards allowed in the pattern. This is the first non-trivial bound with this query time.
    - A time-space trade-off, generalizing the index by Cole et al. [STOC 2004].
    We also show that these indexes can be generalized to allow variable length gaps in the pattern. Our results are obtained using a novel combination of well-known and new techniques, which could be of independent interest.

    An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics

    For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale. Analysis of clinicopathologic annotations for over 11,000 cancer patients in the TCGA program leads to the generation of the TCGA Clinical Data Resource, which provides recommendations of clinical outcome endpoint usage for 33 cancer types.

    Real Roots of Random Polynomials and Zero Crossing Properties of Diffusion Equation

    We study various statistical properties of real roots of three different classes of random polynomials which recently attracted a vivid interest in the context of probability theory and quantum chaos. We first focus on gap probabilities on the real axis, i.e. the probability that these polynomials have no real root in a given interval. For generalized Kac polynomials, indexed by an integer d, of large degree n, one finds that the probability of no real root in the interval [0,1] decays as a power law n^{-\theta(d)} where \theta(d) > 0 is the persistence exponent of the diffusion equation with random initial conditions in spatial dimension d. For n \gg 1 even, the probability that they have no real root on the full real axis decays like n^{-2(\theta(2)+\theta(d))}. For Weyl polynomials and Binomial polynomials, this probability decays respectively like \exp{(-2\theta_{\infty} \sqrt{n})} and \exp{(-\pi \theta_{\infty} \sqrt{n})} where \theta_{\infty} is such that \theta(d) = 2^{-3/2} \theta_{\infty} \sqrt{d} in large dimension d. We also show that the probability that such polynomials have exactly k roots on a given interval [a,b] has a scaling form given by \exp{(-N_{ab} \tilde \phi(k/N_{ab}))} where N_{ab} is the mean number of real roots in [a,b] and \tilde \phi(x) a universal scaling function. We develop a simple Mean Field (MF) theory reproducing qualitatively these scaling behaviors, and improve systematically this MF approach using the method of persistence with partial survival, which in some cases yields exact results. Finally, we show that the probability density function of the largest absolute value of the real roots has a universal algebraic tail with exponent -2. These analytical results are confirmed by detailed numerical computations. Comment: 32 pages, 16 figures.

    Driver Fusions and Their Implications in the Development and Treatment of Human Cancers.

    Gene fusions represent an important class of somatic alterations in cancer. We systematically investigated fusions in 9,624 tumors across 33 cancer types using multiple fusion calling tools. We identified a total of 25,664 fusions, with a 63% validation rate. Integration of gene expression, copy number, and fusion annotation data revealed that fusions involving oncogenes tend to exhibit increased expression, whereas fusions involving tumor suppressors have the opposite effect. For fusions involving kinases, we found 1,275 with an intact kinase domain, the proportion of which varied significantly across cancer types. Our study suggests that fusions drive the development of 16.5% of cancer cases and function as the sole driver in more than 1% of them. Finally, we identified druggable fusions involving genes such as TMPRSS2, RET, FGFR3, ALK, and ESR1 in 6.0% of cases, and we predicted immunogenic peptides, suggesting that fusions may provide leads for targeted drug and immune therapy.

    Operation and performance of the ATLAS semiconductor tracker

    The semiconductor tracker is a silicon microstrip detector forming part of the inner tracking system of the ATLAS experiment at the LHC. The operation and performance of the semiconductor tracker during the first years of LHC running are described. More than 99% of the detector modules were operational during this period, with an average intrinsic hit efficiency of (99.74±0.04)%. The evolution of the noise occupancy is discussed, and measurements of the Lorentz angle, δ-ray production and energy loss presented. The alignment of the detector is found to be stable at the few-micron level over long periods of time. Radiation damage measurements, which include the evolution of detector leakage currents, are found to be consistent with predictions and are used in the verification of radiation background simulations.

    Measurement of the cross section of high transverse momentum Z→bb̄ production in proton–proton collisions at √s = 8 TeV with the ATLAS detector

    This Letter reports the observation of a high transverse momentum Z→bb̄ signal in proton–proton collisions at √s = 8 TeV and the measurement of its production cross section. The data analysed were collected in 2012 with the ATLAS detector at the LHC and correspond to an integrated luminosity of 19.5 fb⁻¹. The Z→bb̄ decay is reconstructed from a pair of b-tagged jets, clustered with the anti-kt jet algorithm with R = 0.4, that have low angular separation and form a dijet with pT > 200 GeV. The signal yield is extracted from a fit to the dijet invariant mass distribution, with the dominant, multi-jet background mass shape estimated by employing a fully data-driven technique that reduces the dependence of the analysis on simulation. The fiducial cross section is determined to be σ^fid_{Z→bb̄} = 2.02 ± 0.20 (stat.) ± 0.25 (syst.) ± 0.06 (lumi.) pb = 2.02 ± 0.33 pb, in good agreement with next-to-leading-order theoretical predictions.

    Measurement of the correlation between flow harmonics of different order in lead-lead collisions at √sNN = 2.76 TeV with the ATLAS detector

    Correlations between the elliptic or triangular flow coefficients vm (m=2 or 3) and other flow harmonics vn (n=2 to 5) are measured using √sNN = 2.76 TeV Pb+Pb collision data collected in 2010 by the ATLAS experiment at the LHC, corresponding to an integrated luminosity of 7 μb⁻¹. The vm−vn correlations are measured in midrapidity as a function of centrality, and, for events within the same centrality interval, as a function of event ellipticity or triangularity defined in a forward rapidity region. For events within the same centrality interval, v3 is found to be anticorrelated with v2 and this anticorrelation is consistent with similar anticorrelations between the corresponding eccentricities, ε2 and ε3. However, it is observed that v4 increases strongly with v2, and v5 increases strongly with both v2 and v3. The trend and strength of the vm−vn correlations for n=4 and 5 are found to disagree with εm−εn correlations predicted by initial-geometry models. Instead, these correlations are found to be consistent with the combined effects of a linear contribution to vn and a nonlinear term that is a function of v2² or of v2v3, as predicted by hydrodynamic models. A simple two-component fit is used to separate these two contributions. The extracted linear and nonlinear contributions to v4 and v5 are found to be consistent with previously measured event-plane correlations.

    Search for H→γγ produced in association with top quarks and constraints on the Yukawa coupling between the top quark and the Higgs boson using data taken at 7 TeV and 8 TeV with the ATLAS detector

    A search is performed for Higgs bosons produced in association with top quarks using the diphoton decay mode of the Higgs boson. Selection requirements are optimized separately for leptonic and fully hadronic final states from the top quark decays. The dataset used corresponds to an integrated luminosity of 4.5 fb⁻¹ of proton–proton collisions at a center-of-mass energy of 7 TeV and 20.3 fb⁻¹ at 8 TeV recorded by the ATLAS detector at the CERN Large Hadron Collider. No significant excess over the background prediction is observed and upper limits are set on the tt̄H production cross section. The observed exclusion upper limit at 95% confidence level is 6.7 times the predicted Standard Model cross section value. In addition, limits are set on the strength of the Yukawa coupling between the top quark and the Higgs boson, taking into account the dependence of the tt̄H and tH cross sections as well as the H→γγ branching fraction on the Yukawa coupling. Lower and upper limits at 95% confidence level are set at −1.3 and +8.0 times the Yukawa coupling strength in the Standard Model.

    Search for vectorlike B quarks in events with one isolated lepton, missing transverse momentum, and jets at √s = 8 TeV with the ATLAS detector

    A search has been performed for pair production of heavy vectorlike down-type (B) quarks. The analysis explores the lepton-plus-jets final state, characterized by events with one isolated charged lepton (electron or muon), significant missing transverse momentum, and multiple jets. One or more jets are required to be tagged as arising from b quarks, and at least one pair of jets must be tagged as arising from the hadronic decay of an electroweak boson. The analysis uses the full data sample of pp collisions recorded in 2012 by the ATLAS detector at the LHC, operating at a center-of-mass energy of 8 TeV, corresponding to an integrated luminosity of 20.3 fb⁻¹. No significant excess of events is observed above the expected background. Limits are set on vectorlike B production, as a function of the B branching ratios, assuming the allowable decay modes are B → Wt/Zb/Hb. In the chiral limit with a branching ratio of 100% for the decay B → Wt, the observed (expected) 95% C.L. lower limit on the vectorlike B mass is 810 GeV (760 GeV). In the case where the vectorlike B quark has branching ratio values corresponding to those of an SU(2) singlet state, the observed (expected) 95% C.L. lower limit on the vectorlike B mass is 640 GeV (505 GeV). The same analysis, when used to investigate pair production of a colored, charge 5/3 exotic fermion T5/3, with subsequent decay T5/3 → Wt, sets an observed (expected) 95% C.L. lower limit on the T5/3 mass of 840 GeV (780 GeV).

    Fiducial and differential cross sections of Higgs boson production measured in the four-lepton decay channel in pp collisions at √s = 8 TeV with the ATLAS detector

    Measurements of fiducial and differential cross sections of Higgs boson production in the H → ZZ* → 4ℓ decay channel are presented. The cross sections are determined within a fiducial phase space and corrected for detection efficiency and resolution effects. They are based on 20.3 fb⁻¹ of pp collision data, produced at √s = 8 TeV centre-of-mass energy at the LHC and recorded by the ATLAS detector. The differential measurements are performed in bins of transverse momentum and rapidity of the four-lepton system, the invariant mass of the subleading lepton pair and the decay angle of the leading lepton pair with respect to the beam line in the four-lepton rest frame, as well as the number of jets and the transverse momentum of the leading jet. The measured cross sections are compared to selected theoretical calculations of the Standard Model expectations. No significant deviation from any of the tested predictions is found.