Bayesian Grammar Induction for Language Modeling
We describe a corpus-based induction algorithm for probabilistic context-free
grammars. The algorithm employs a greedy heuristic search within a Bayesian
framework, and a post-pass using the Inside-Outside algorithm. We compare the
performance of our algorithm to n-gram models and the Inside-Outside algorithm
in three language modeling tasks. In two of the tasks, the training data is
generated by a probabilistic context-free grammar and in both tasks our
algorithm outperforms the other techniques. The third task involves
naturally-occurring data, and in this task our algorithm does not perform as
well as n-gram models but vastly outperforms the Inside-Outside algorithm. Comment: 8 pages, LaTeX, uses aclap.st
Bayesian Grammar Induction for Language Modeling
We describe a corpus-based induction algorithm for probabilistic context-free grammars. The algorithm employs a greedy heuristic search within a Bayesian framework, and a post-pass using the Inside-Outside algorithm. We compare the performance of our algorithm to n-gram models and the Inside-Outside algorithm in three language modeling tasks. In two of these domains, our algorithm outperforms these other techniques, marking the first time a grammar-based language model has surpassed n-gram modeling in a task of at least moderate size. Engineering and Applied Science
Scale Invariance and Nonlinear Patterns of Human Activity
We investigate if known extrinsic and intrinsic factors fully account for the
complex features observed in recordings of human activity as measured from
forearm motion in subjects undergoing their regular daily routine. We
demonstrate that the apparently random forearm motion possesses previously
unrecognized dynamic patterns characterized by fractal and nonlinear dynamics.
These patterns are unaffected by changes in the average activity level, and
persist when the same subjects undergo time-isolation laboratory experiments
designed to account for the circadian phase and to control the known extrinsic
factors. We attribute these patterns to a novel intrinsic multi-scale dynamic
regulation of human activity. Comment: 4 pages, three figures
Quantifying stock return distributions in financial markets
This is the final version. Available from Public Library of Science via the DOI in this record. Data Availability: Relevant data were obtained by the authors from the third party Wharton Research Data Services. Raw data sets from the Trades and Quotes database are available from the following URL: https://wrds-web.wharton.upenn.edu/wrds/. Being able to quantify the probability of large price changes in stock markets is of crucial importance in understanding financial crises that affect the lives of people worldwide. Large changes in stock market prices can arise abruptly, within a matter of minutes, or develop across much longer time scales. Here, we analyze a dataset comprising the stocks forming the Dow Jones Industrial Average at a second-by-second resolution in the period from January 2008 to July 2010 in order to quantify the distribution of changes in market prices at a range of time scales. We find that the tails of the distributions of logarithmic price changes, or returns, exhibit power law decays for time scales ranging from 300 seconds to 3600 seconds. For larger time scales, we find that the distribution tails exhibit exponential decay. Our findings may inform the development of models of market behavior across varying time scales. EPSRC IARPA NS
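As a rough illustration of the quantities named in this abstract, the sketch below computes logarithmic returns at a chosen time scale and estimates a power-law tail exponent. The function names are hypothetical, and the Hill estimator is a standard technique swapped in here as an assumption; the abstract does not say which estimator the authors used.

```python
import math

def log_returns(prices, scale):
    """Non-overlapping logarithmic returns, sampling one price every `scale` steps.

    With second-by-second prices, scale=300 gives 300-second returns.
    """
    sampled = prices[::scale]
    return [math.log(later / earlier) for earlier, later in zip(sampled, sampled[1:])]

def hill_tail_exponent(returns, k):
    """Hill estimate of the tail exponent from the k largest absolute returns.

    Assumes the (k+1)-th largest absolute return is strictly positive.
    """
    magnitudes = sorted((abs(r) for r in returns), reverse=True)
    threshold = magnitudes[k]  # (k+1)-th largest value serves as the tail cutoff
    return k / sum(math.log(m / threshold) for m in magnitudes[:k])
```

Repeating the estimate for several values of `scale` would show how the tail behaviour changes with the time scale, though the choice of `k` affects the result and needs care on real data.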
Thermodynamics, Structure, and Dynamics of Water Confined between Hydrophobic Plates
We perform molecular dynamics simulations of 512 water-like molecules that
interact via the TIP5P potential and are confined between two smooth
hydrophobic plates that are separated by 1.10 nm. We find that the anomalous
thermodynamic properties of water are shifted to lower temperatures relative to
the bulk by K. The dynamics and structure of the confined water
resemble bulk water at higher temperatures, consistent with the shift of
thermodynamic anomalies to lower temperature. Due to this shift, our
confined water simulations (down to K) do not reach sufficiently low
temperature to observe a liquid-liquid phase transition found for bulk water at
K using the TIP5P potential. We find that the different
crystalline structures that can form for two different separations of the
plates, 0.7 nm and 1.10 nm, have no counterparts in the bulk system, and
discuss the relevance to experiments on confined water. Comment: 31 pages, 14 figures
Ribosomal Proteins RPS11 and RPS20, Two Stress-Response Markers of Glioblastoma Stem Cells, Are Novel Predictors of Poor Prognosis in Glioblastoma Patients.
Glioblastoma stem cells (GSC), which co-exhibit a tumor-initiating capacity and a radio-chemoresistant phenotype, are a compelling cell model for explaining tumor recurrence. We have previously characterized patient-derived, treatment-resistant GSC clones (TRGC) that survived radiochemotherapy. Compared to glucose-dependent, treatment-sensitive GSC clones (TSGC), TRGC exhibited reduced glucose dependence that favors the fatty acid oxidation pathway as their energy source. Using comparative genome-wide transcriptome analysis, a series of defense signatures associated with TRGC survival were identified and verified by siRNA-based gene knockdown experiments that led to loss of cell integrity. In this study, we investigate the prognostic value of defense signatures in glioblastoma (GBM) patients using gene expression analysis with Probeset Analyzer (131 GBM) and The Cancer Genome Atlas (TCGA) data, and protein expression with a tissue microarray (50 GBM), yielding the first TRGC-derived prognostic biomarkers for GBM patients. Ribosomal protein S11 (RPS11) and RPS20, individually and together, consistently predicted poor survival of newly diagnosed primary GBM tumors when overexpressed at the RNA or protein level [RPS11: Hazard Ratio (HR) = 11.5, p<0.001; RPS20: HR = 4.5, p = 0.03; RPS11+RPS20: HR = 17.99, p = 0.001]. The prognostic significance of RPS11 and RPS20 was further supported by whole tissue section RPS11 immunostaining (27 GBM; HR = 4.05, p = 0.01) and TCGA gene expression data (578 primary GBM; RPS11: HR = 1.19, p = 0.06; RPS20: HR = 1.25, p = 0.02; RPS11+RPS20: HR = 1.43, p = 0.01). Moreover, tumors that exhibited unmethylated O-6-methylguanine-DNA methyltransferase (MGMT) or wild-type isocitrate dehydrogenase 1 (IDH1) were associated with higher RPS11 expression levels [corr (IDH1, RPS11) = 0.64, p = 0.03; corr (MGMT, RPS11) = 0.52, p = 0.04]. These data indicate that increased expression of RPS11 and RPS20 predicts shorter patient survival.
The study also suggests that TRGC are clinically relevant cells that represent resistant tumorigenic clones from patient tumors and that their properties, at least in part, are reflected in poor-prognosis GBM. The screening of TRGC signatures may represent a novel alternative strategy for identifying new prognostic biomarkers.
Relation Between the Widom line and the Strong-Fragile Dynamic Crossover in Systems with a Liquid-Liquid Phase Transition
We investigate, for two water models displaying a liquid-liquid critical
point, the relation between changes in dynamic and thermodynamic anomalies
arising from the presence of the liquid-liquid critical point. We find a
correlation between the dynamic fragility transition and the locus of specific
heat maxima ("Widom line") emanating from the critical point.
Our findings are consistent with a possible relation between the previously
hypothesized liquid-liquid phase transition and the transition in the dynamics
recently observed in neutron scattering experiments on confined water. More
generally, we argue that this connection between and dynamic
crossover is not limited to the case of water, a hydrogen bond network forming
liquid, but is a more general feature of crossing the Widom line. Specifically,
we also study the Jagla potential, a spherically-symmetric two-scale potential
known to possess a liquid-liquid critical point, in which the competition
between two liquid structures is generated by repulsive and attractive ramp
interactions. Comment: 6 pages and 5 figures
An exactly solvable three-particle problem with three-body interaction
The energy spectrum of the three-particle Hamiltonian obtained by replacing
the two-body trigonometric potential of the Sutherland problem by a three-body
one of a similar form is derived. When expressed in appropriate variables, the
corresponding wave functions are shown to be expressible in terms of Jack
polynomials. The exact solvability of the problem with three-body interaction
is explained by a hidden sl(3,R) symmetry. Comment: LaTeX, 15 pages, no figures, slightly shortened version to appear in
Phys. Rev. A, one error corrected
Cartilage-selective genes identified in genome-scale analysis of non-cartilage and cartilage gene expression
Background: Cartilage plays a fundamental role in the development of the human skeleton. Early in embryogenesis, mesenchymal cells condense and differentiate into chondrocytes to shape the early skeleton. Subsequently, the cartilage anlagen differentiate to form the growth plates, which are responsible for linear bone growth, and the articular chondrocytes, which facilitate joint function. However, despite the multiplicity of roles of cartilage during human fetal life, surprisingly little is known about its transcriptome. To address this, a whole genome microarray expression profile was generated using RNA isolated from 18–22 week human distal femur fetal cartilage and compared with a database of control normal human tissues aggregated at UCLA, termed Celsius. Results: 161 cartilage-selective genes were identified, defined as genes significantly expressed in cartilage with low expression and little variation across a panel of 34 non-cartilage tissues. Among these 161 genes were cartilage-specific genes such as cartilage collagen genes and 25 genes which have been associated with skeletal phenotypes in humans and/or mice. Many of the other cartilage-selective genes do not have established roles in cartilage or are novel, unannotated genes. Quantitative RT-PCR confirmed the unique pattern of gene expression observed by microarray analysis. Conclusion: Defining the gene expression pattern for cartilage has identified new genes that may contribute to human skeletogenesis as well as provided further candidate genes for skeletal dysplasias. The data suggest that fetal cartilage is a complex and transcriptionally active tissue and demonstrate that the set of genes selectively expressed in the tissue has been greatly underestimated.
The dynamic crossover in water does not require bulk water
Many of the anomalous properties of water may be explained by invoking a second critical point that terminates the coexistence line between the low- and high-density amorphous states in the liquid. Direct experimental evidence of this point, and the associated polyamorphic liquid–liquid transition, is elusive as it is necessary for liquid water to be cooled below its homogeneous-nucleation temperature. To avoid crystallization, water in the eutectic LiCl solution has been studied, but then it is generally considered that “bulk” water cannot be present. However, recent computational and experimental studies observe cooperative hydration, in which case it is possible that sufficient hydrogen-bonded water is present for the essential characteristics of water to be preserved. For femtosecond optical Kerr-effect and nuclear magnetic resonance measurements, we observe in each case a fractional Stokes–Einstein relation with evidence of the dynamic crossover appearing near 220 K and 250 K respectively. Spectra obtained in the glass state also confirm the complex nature of the hydrogen-bonding modes reported for neat room-temperature water and support predictions of anomalous diffusion due to “worm-hole” structure.
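For reference, a fractional Stokes–Einstein relation is commonly written in the form below. The symbols (diffusion coefficient D, viscosity η, temperature T, exponent ξ) are notation assumed here and do not appear in the abstract, which may express the relation through measured relaxation times instead.

```latex
% Standard Stokes–Einstein: D \propto T / \eta, i.e. \xi = 1.
% Fractional form: the exponent drops below 1.
D \propto \left( \frac{\eta}{T} \right)^{-\xi}, \qquad 0 < \xi < 1
```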