
    Discriminative Segmental Cascades for Feature-Rich Phone Recognition

    Discriminative segmental models, such as segmental conditional random fields (SCRFs) and segmental structured support vector machines (SSVMs), have had success in speech recognition via both lattice rescoring and first-pass decoding. However, such models suffer from slow decoding, which hampers the use of computationally expensive features, such as segment neural networks or other high-order features. A typical solution is to use approximate decoding, either by beam pruning in a single pass or by beam pruning to generate a lattice followed by a second pass. In this work, we study discriminative segmental models trained with a hinge loss (i.e., segmental structured SVMs). We show that beam search is not suitable for learning rescoring models in this approach, though it gives good approximate decoding performance when the model is already well-trained. Instead, we consider an approach inspired by structured prediction cascades, which use max-marginal pruning to generate lattices. We obtain a high-accuracy phonetic recognition system with several expensive feature types: a segment neural network, a second-order language model, and second-order phone boundary features.
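
    The max-marginal pruning mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a lattice given as a list of scored edges whose node ids are topologically numbered (tail < head), and it assumes one thresholding rule sometimes used with structured prediction cascades: a convex combination of the best path score and the mean max-marginal, controlled by a hypothetical parameter `alpha`. The names `max_marginals` and `prune_lattice` are illustrative only.

```python
import math
from typing import Dict, List, Tuple

Edge = Tuple[int, int, float]  # (tail node, head node, edge score)


def max_marginals(edges: List[Edge], start: int, end: int) -> List[float]:
    """Score of the best start-to-end path passing through each edge.

    Assumes node ids are topologically ordered, i.e. tail < head.
    """
    neg_inf = float("-inf")
    nodes = {n for tail, head, _ in edges for n in (tail, head)}
    fwd: Dict[int, float] = {n: neg_inf for n in nodes}
    bwd: Dict[int, float] = {n: neg_inf for n in nodes}
    fwd[start] = 0.0
    bwd[end] = 0.0
    # Forward pass: best score of any path from start to each node.
    for tail, head, score in sorted(edges, key=lambda e: e[0]):
        fwd[head] = max(fwd[head], fwd[tail] + score)
    # Backward pass: best score of any path from each node to end.
    for tail, head, score in sorted(edges, key=lambda e: -e[1]):
        bwd[tail] = max(bwd[tail], bwd[head] + score)
    return [fwd[tail] + score + bwd[head] for tail, head, score in edges]


def prune_lattice(edges: List[Edge], start: int, end: int,
                  alpha: float = 0.5) -> List[Edge]:
    """Keep edges whose max-marginal reaches the (assumed) threshold."""
    mm = max_marginals(edges, start, end)
    finite = [m for m in mm if math.isfinite(m)]
    best = max(finite)
    mean = sum(finite) / len(finite)
    threshold = alpha * best + (1.0 - alpha) * mean
    return [e for e, m in zip(edges, mm) if m >= threshold]


# Toy lattice with three nodes (0 = start, 2 = end).
toy_edges = [(0, 1, 2.0), (0, 1, 0.5), (1, 2, 1.0), (1, 2, -1.0), (0, 2, 1.5)]
toy_mm = max_marginals(toy_edges, start=0, end=2)
toy_kept = prune_lattice(toy_edges, start=0, end=2, alpha=0.5)
```

    Unlike beam pruning, this rule never discards an edge on the best-scoring path, because that edge's max-marginal equals the best path score.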

    Stochastic Optimization for Deep CCA via Nonlinear Orthogonal Iterations

    Deep CCA is a recently proposed deep neural network extension of traditional canonical correlation analysis (CCA), and has been successful for multi-view representation learning in several domains. However, stochastic optimization of the deep CCA objective is not straightforward, because the objective does not decouple over training examples. Previous optimizers for deep CCA are either batch-based algorithms or stochastic optimizers that use large minibatches, which can have high memory consumption. In this paper, we tackle the problem of stochastic optimization for deep CCA with small minibatches, based on an iterative solution to the CCA objective, and show that we can match the performance of previous optimizers and thus alleviate the memory requirement. Comment: in 2015 Annual Allerton Conference on Communication, Control and Computin

    Revisiting the morphology and phylogeny of Lactifluus with three new lineages from southern China

    As a recent group mainly defined by molecular data, the genus Lactifluus is in need of further study to provide insight into the morphological and molecular variation within the genus, species limits and relationships. Phylogenetic analyses of nuc rDNA ITS1-5.8S-ITS2 (ITS), D1 and D2 domains of nuc 28S rDNA (28S), and part of the second largest subunit of the RNA polymerase II (rpb2) (6-7 region) sequences of 28 samples from southern China revealed three new lineages of Lactifluus. Two of them are nested in a major clade that includes the type of Lactifluus and are treated here as two new sections: L. sect. Ambicystidiati and L. sect. Tenuicystidiati. Lactifluus ambicystidiatus, described here as a new species (= sect. Ambicystidiati), has both lamprocystidia and macrocystidia in the hymenium, a unique combination of features within Russulaceae. Furthermore, only remnants of lactiferous hyphae are present in L. ambicystidiatus and our results suggest that the ability to form a lactiferous system has been lost in this lineage. Lactifluus sect. Tenuicystidiati forms a strongly supported monophyletic group as a sister lineage to L. sect. Lactifluus. We recognize it based on the thin-walled macrocystidia and smaller ellipsoid spores with an incomplete reticulum compared with L. sect. Lactifluus. The former placement of L. tenuicystidiatus in the African L. sect. Pseudogymnocarpi is not supported. Using genealogical concordance, we recognize five phylogenetic species within L. sect. Tenuicystidiati and describe two of these as new, L. subpruinosus and L. tropicosinicus. The third lineage, represented by L. leoninus, forms a sister group to L. subg. Lactariopsis sensu stricto. The three lineages provide further evidence for morphological features in Lactifluus being homoplasious. Some sections and species complexes are likely to be composed of more species and merit further investigation. Subtropical-tropical Asia is likely a key region for additional sampling.

    Association Signals Unveiled by a Comprehensive Gene Set Enrichment Analysis of Dental Caries Genome-Wide Association Studies

    Gene set-based analysis of genome-wide association study (GWAS) data has recently emerged as a useful approach to examine the joint effects of multiple risk loci in complex human diseases or phenotypes. Dental caries is a common, chronic, and complex disease leading to a decrease in quality of life worldwide. In this study, we applied gene set enrichment analysis to a major dental caries GWAS dataset, which consists of 537 cases and 605 controls. Using four complementary gene set analysis methods, we analyzed 1331 Gene Ontology (GO) terms collected from the Molecular Signatures Database (MSigDB). Setting the false discovery rate (FDR) threshold at 0.05, we identified 13 significantly associated GO terms. Additionally, 17 terms were included as marginally associated because they were top ranked by each method, although their FDR is higher than 0.05. In total, we identified 30 promising GO terms, including 'Sphingoid metabolic process,' 'Ubiquitin protein ligase activity,' 'Regulation of cytokine secretion,' and 'Ceramide metabolic process.' These GO terms encompass broad functions that potentially interact and contribute to the oral immune response related to caries development, and they have not been reported in standard single marker based analysis. Collectively, our gene set enrichment analysis provided complementary insights into the molecular mechanisms and polygenic interactions in dental caries, revealing promising association signals that could not be detected through single marker analysis of GWAS data. © 2013 Wang et al.
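
    To make the FDR threshold of 0.05 concrete, here is a minimal sketch of one widely used FDR-controlling rule, the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR method the authors used, so this is an assumption for illustration only; the function name and the p-values below are hypothetical and are not data from the study.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float],
                       alpha: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether it is declared significant
    by the Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Everything ranked at or below the cutoff is rejected.
    significant = [False] * m
    for idx in order[:cutoff]:
        significant[idx] = True
    return significant


# Hypothetical per-gene-set p-values (made up for this example).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
example_flags = benjamini_hochberg(example_p, alpha=0.05)
```

    In this made-up example only the two smallest p-values pass, even though five of them are below 0.05 individually, which shows how an FDR criterion is stricter than an unadjusted per-test cutoff.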

    Costly Blackouts? Measuring Productivity and Environmental Effects of Electricity Shortages

    In many countries, unreliable inputs, particularly those lacking storage, can significantly limit a firm's productivity. When blackouts become more frequent, a firm may change factor shares in a number of ways. It may decide to self-generate electricity, to purchase intermediate goods that it used to produce directly, or to improve its technical efficiency. We examine how industrial firms responded to China's severe power shortages in the early 2000s. Fast-growing demand coupled with regulated electricity prices led to blackouts that varied in degree over location and time. Our data consist of annual observations from 1999 to 2004 for approximately 32,000 energy-intensive enterprises from all industries. We estimate the losses in productivity due to factor-neutral and factor-biased effects of electricity scarcity. Our results suggest that enterprises re-optimize among factors in response to electricity scarcity by shifting from energy (both electric and non-electric sources) into materials, a shift from "make" to "buy." These effects are strongest for firms in textiles, timber, chemicals, and metals. Contrary to the literature, we do not find evidence of an increase in self-generation. Finally, we find that these productivity changes, while costly to firms, led to small reductions in carbon emissions.

    Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint

    In certain genetic studies, clinicians and genetic counselors are interested in estimating the cumulative risk of a disease for individuals with and without a rare deleterious mutation. Estimating the cumulative risk is difficult, however, when the estimates are based on family history data. Often, the genetic mutation status of many family members is unknown; instead, only estimated probabilities of a patient having a certain mutation status are available. Also, ages of disease onset are subject to right censoring. Existing methods to estimate the cumulative risk using such family-based data only provide estimates at individual time points, and are not guaranteed to be monotonic or nonnegative. In this paper, we develop a novel method that combines Expectation-Maximization and isotonic regression to estimate the cumulative risk across the entire support. Our estimator is monotonic, satisfies self-consistent estimating equations and has high power in detecting differences between the cumulative risks of different populations. Application of our estimator to a Parkinson's disease (PD) study provides the age-at-onset distribution of PD in PARK2 mutation carriers and noncarriers, and reveals a significant difference between the distributions in compound heterozygous carriers and noncarriers, but not between heterozygous carriers and noncarriers. Comment: Published at http://dx.doi.org/10.1214/14-AOAS730 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
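
    The isotonic regression ingredient of this method can be sketched with the classic pool-adjacent-violators algorithm, which projects a sequence of pointwise estimates onto the set of nondecreasing sequences. This is a minimal sketch of that one step only, not the authors' estimator: the EM step, the handling of censoring and the mutation-status probabilities are all omitted, and the function name and input values are hypothetical.

```python
from typing import List, Optional, Sequence


def isotonic_regression(y: Sequence[float],
                        weights: Optional[Sequence[float]] = None
                        ) -> List[float]:
    """Weighted least-squares nondecreasing fit to y
    using the pool-adjacent-violators algorithm."""
    w = [1.0] * len(y) if weights is None else list(weights)
    # Each block stores [weighted mean, total weight, number of points].
    blocks: List[List[float]] = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand each block back to one fitted value per input point.
    fitted: List[float] = []
    for mean, _, count in blocks:
        fitted.extend([mean] * int(count))
    return fitted


# Hypothetical pointwise cumulative-risk estimates at increasing ages;
# they dip twice, which a cumulative risk curve should never do.
raw_risk = [0.02, 0.05, 0.04, 0.10, 0.08, 0.15]
monotone_risk = isotonic_regression(raw_risk)
```

    Each dip is replaced by the average of the offending neighbours, so the fitted curve is nondecreasing and stays within the range of the raw estimates.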