1,535 research outputs found

    An Unsupervised Autoregressive Model for Speech Representation Learning

    Full text link
    This paper proposes a novel unsupervised autoregressive neural model for learning generic speech representations. In contrast to other speech representation learning methods that aim to remove noise or speaker variabilities, ours is designed to preserve information for a wide range of downstream tasks. In addition, the proposed model does not require any phonetic or word boundary labels, allowing the model to benefit from large quantities of unlabeled data. Speech representations learned by our model significantly improve performance on both phone classification and speaker verification over the surface features and other supervised and unsupervised approaches. Further analysis shows that different levels of speech information are captured by our model at different layers. In particular, the lower layers tend to be more discriminative for speakers, while the upper layers provide more phonetic content.Comment: Accepted to Interspeech 2019. Code available at: https://github.com/iamyuanchung/Autoregressive-Predictive-Codin

    Extracting transcription factor binding sites from unaligned gene sequences with statistical models

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Transcription factor binding sites (TFBSs) are crucial in the regulation of gene transcription. Recently, chromatin immunoprecipitation followed by cDNA microarray hybridization (ChIP-chip array) has been used to identify potential regulatory sequences, but the procedure can only map the probable protein-DNA interaction loci within 1–2 kb resolution. To find out the exact binding motifs, it is necessary to build a computational method to examine the ChIP-chip array binding sequences and search for possible motifs representing the transcription factor binding sites.</p> <p>Results</p> <p>We developed a program to find out accurate motif sites from a set of unaligned DNA sequences in the yeast genome. Compared with MDscan, the prediction results suggest that, overall, our algorithm outperforms MDscan since the predicted motifs are more consistent with previously known specificities reported in the literature and have better prediction ranks. Our program also outperforms the constraint-less Cosmo program, especially in the elimination of false positives.</p> <p>Conclusion</p> <p>In this study, an improved sampling algorithm is proposed to incorporate the binomial probability model to build significant initial candidate motif sets. By investigating the statistical dependence between base positions in TFBSs, the method of dependency graphs and their expanded Bayesian networks is combined. The results show that our program satisfactorily extract transcription factor binding sites from unaligned gene sequences.</p

    DEXON: A Highly Scalable, Decentralized DAG-Based Consensus Algorithm

    Get PDF
    A blockchain system is a replicated state machine that must be fault tolerant. When designing a blockchain system, there is usually a trade-off between decentralization, scalability, and security. In this paper, we propose a novel blockchain system, DEXON, which achieves high scalability while remaining decentralized and robust in the real-world environment. We have two main contributions. First, we present a highly scalable sharding framework for blockchain. This framework takes an arbitrary number of single chains and transforms them into the \textit{blocklattice} data structure, enabling \textit{high scalability} and \textit{low transaction confirmation latency} with asymptotically optimal communication overhead. Second, we propose a single-chain protocol based on our novel verifiable random function and a new Byzantine agreement that achieves high decentralization and low latency

    Orthogonal Constant-Amplitude Sequence Families for System Parameter Identification in Spectrally Compact OFDM

    Full text link
    In rectangularly-pulsed orthogonal frequency division multiplexing (OFDM) systems, constant-amplitude (CA) sequences are desirable to construct preamble/pilot waveforms to facilitate system parameter identification (SPI). Orthogonal CA sequences are generally preferred in various SPI applications like random-access channel identification. However, the number of conventional orthogonal CA sequences (e.g., Zadoff-Chu sequences) that can be adopted in cellular communication without causing sequence identification ambiguity is insufficient. Such insufficiency causes heavy performance degradation for SPI requiring a large number of identification sequences. Moreover, rectangularly-pulsed OFDM preamble/pilot waveforms carrying conventional CA sequences suffer from large power spectral sidelobes and thus exhibit low spectral compactness. This paper is thus motivated to develop several order-I CA sequence families which contain more orthogonal CA sequences while endowing the corresponding OFDM preamble/pilot waveforms with fast-decaying spectral sidelobes. Since more orthogonal sequences are provided, the developed order-I CA sequence families can enhance the performance characteristics in SPI requiring a large number of identification sequences over multipath channels exhibiting short-delay channel profiles, while composing spectrally compact OFDM preamble/pilot waveforms.Comment: 15 pages, 4 figure

    An O(1)-Approximation Algorithm for Dynamic Weighted Vertex Cover with Soft Capacity

    Get PDF
    This study considers the soft capacitated vertex cover problem in a dynamic setting. This problem generalizes the dynamic model of the vertex cover problem, which has been intensively studied in recent years. Given a dynamically changing vertex-weighted graph G=(V,E), which allows edge insertions and edge deletions, the goal is to design a data structure that maintains an approximate minimum vertex cover while satisfying the capacity constraint of each vertex. That is, when picking a copy of a vertex v in the cover, the number of v\u27s incident edges covered by the copy is up to a given capacity of v. We extend Bhattacharya et al.\u27s work [SODA\u2715 and ICALP\u2715] to obtain a deterministic primal-dual algorithm for maintaining a constant-factor approximate minimum capacitated vertex cover with O(log n / epsilon) amortized update time, where n is the number of vertices in the graph. The algorithm can be extended to (1) a more general model in which each edge is associated with a non-uniform and unsplittable demand, and (2) the more general capacitated set cover problem
    • …
    corecore