An Unsupervised Autoregressive Model for Speech Representation Learning
This paper proposes a novel unsupervised autoregressive neural model for
learning generic speech representations. In contrast to other speech
representation learning methods that aim to remove noise or speaker
variabilities, ours is designed to preserve information for a wide range of
downstream tasks. In addition, the proposed model does not require any phonetic
or word boundary labels, allowing the model to benefit from large quantities of
unlabeled data. Speech representations learned by our model significantly
improve performance on both phone classification and speaker verification over
the surface features and other supervised and unsupervised approaches. Further
analysis shows that different levels of speech information are captured by our
model at different layers. In particular, the lower layers tend to be more
discriminative for speakers, while the upper layers provide more phonetic
content. Comment: Accepted to Interspeech 2019. Code available at:
https://github.com/iamyuanchung/Autoregressive-Predictive-Codin
Extracting transcription factor binding sites from unaligned gene sequences with statistical models
Background: Transcription factor binding sites (TFBSs) are crucial in the regulation of gene transcription. Recently, chromatin immunoprecipitation followed by cDNA microarray hybridization (ChIP-chip array) has been used to identify potential regulatory sequences, but the procedure can only map the probable protein-DNA interaction loci within 1–2 kb resolution. To find the exact binding motifs, it is necessary to build a computational method that examines the ChIP-chip array binding sequences and searches for possible motifs representing the transcription factor binding sites. Results: We developed a program to find accurate motif sites from a set of unaligned DNA sequences in the yeast genome. Compared with MDscan, the prediction results suggest that, overall, our algorithm outperforms MDscan, since the predicted motifs are more consistent with previously known specificities reported in the literature and have better prediction ranks. Our program also outperforms the constraint-less Cosmo program, especially in the elimination of false positives. Conclusion: In this study, an improved sampling algorithm is proposed to incorporate the binomial probability model to build significant initial candidate motif sets. By investigating the statistical dependence between base positions in TFBSs, the method of dependency graphs and their expanded Bayesian networks is combined. The results show that our program satisfactorily extracts transcription factor binding sites from unaligned gene sequences.
DEXON: A Highly Scalable, Decentralized DAG-Based Consensus Algorithm
A blockchain system is a replicated state machine that must be fault
tolerant. When designing a blockchain system, there is usually a trade-off
between decentralization, scalability, and security. In this paper, we propose
a novel blockchain system, DEXON, which achieves high scalability while
remaining decentralized and robust in the real-world environment. We have two
main contributions. First, we present a highly scalable sharding framework for
blockchain. This framework takes an arbitrary number of single chains and
transforms them into the blocklattice data structure, enabling
high scalability and low transaction confirmation latency
with asymptotically optimal communication overhead. Second, we propose a
single-chain protocol based on our novel verifiable random function and a new
Byzantine agreement that achieves high decentralization and low latency
Orthogonal Constant-Amplitude Sequence Families for System Parameter Identification in Spectrally Compact OFDM
In rectangularly-pulsed orthogonal frequency division multiplexing (OFDM)
systems, constant-amplitude (CA) sequences are desirable to construct
preamble/pilot waveforms to facilitate system parameter identification (SPI).
Orthogonal CA sequences are generally preferred in various SPI applications
like random-access channel identification. However, the number of conventional
orthogonal CA sequences (e.g., Zadoff-Chu sequences) that can be adopted in
cellular communication without causing sequence identification ambiguity is
insufficient. Such insufficiency causes heavy performance degradation for SPI
requiring a large number of identification sequences. Moreover,
rectangularly-pulsed OFDM preamble/pilot waveforms carrying conventional CA
sequences suffer from large power spectral sidelobes and thus exhibit low
spectral compactness. This paper is thus motivated to develop several order-I
CA sequence families which contain more orthogonal CA sequences while endowing
the corresponding OFDM preamble/pilot waveforms with fast-decaying spectral
sidelobes. Since more orthogonal sequences are provided, the developed order-I
CA sequence families can enhance the performance characteristics in SPI
requiring a large number of identification sequences over multipath channels
exhibiting short-delay channel profiles, while composing spectrally compact
OFDM preamble/pilot waveforms. Comment: 15 pages, 4 figures
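The abstract above cites Zadoff-Chu sequences as the conventional example of orthogonal constant-amplitude sequences. The following is a minimal sketch of those two properties only; it does not implement the order-I families the paper develops. The function names `zadoff_chu` and `cyclic_correlation`, the root index 5, and the length 63 are illustrative assumptions, and the sketch is restricted to odd lengths with a root coprime to the length.

```python
import cmath
from math import gcd


def zadoff_chu(root: int, length: int) -> list[complex]:
    """Return a Zadoff-Chu sequence (odd length, root coprime to length)."""
    if length % 2 == 0 or gcd(root, length) != 1:
        raise ValueError("sketch assumes odd length and gcd(root, length) == 1")
    return [
        cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length)
        for n in range(length)
    ]


def cyclic_correlation(seq: list[complex], shift: int) -> complex:
    """Inner product of a sequence with its own cyclic shift."""
    n = len(seq)
    return sum(seq[k] * seq[(k + shift) % n].conjugate() for k in range(n))


# Hypothetical parameters chosen only for illustration.
seq = zadoff_chu(root=5, length=63)

# Constant amplitude: every sample lies on the unit circle.
max_amplitude_error = max(abs(abs(x) - 1.0) for x in seq)

# Orthogonality: every nonzero cyclic shift has (numerically) zero correlation.
max_offpeak = max(abs(cyclic_correlation(seq, s)) for s in range(1, len(seq)))

print(f"max amplitude error: {max_amplitude_error:.2e}")
print(f"max off-peak correlation magnitude: {max_offpeak:.2e}")
```

Because every nonzero cyclic shift is orthogonal to the original, a single root yields a family of mutually orthogonal constant-amplitude sequences, which is the property the abstract says is in short supply.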
An O(1)-Approximation Algorithm for Dynamic Weighted Vertex Cover with Soft Capacity
This study considers the soft capacitated vertex cover problem in a dynamic setting. This problem generalizes the dynamic model of the vertex cover problem, which has been intensively studied in recent years. Given a dynamically changing vertex-weighted graph G=(V,E), which allows edge insertions and edge deletions, the goal is to design a data structure that maintains an approximate minimum vertex cover while satisfying the capacity constraint of each vertex. That is, when picking a copy of a vertex v in the cover, the number of v's incident edges covered by the copy is up to a given capacity of v. We extend Bhattacharya et al.'s work [SODA'15 and ICALP'15] to obtain a deterministic primal-dual algorithm for maintaining a constant-factor approximate minimum capacitated vertex cover with O(log n / epsilon) amortized update time, where n is the number of vertices in the graph. The algorithm can be extended to (1) a more general model in which each edge is associated with a non-uniform and unsplittable demand, and (2) the more general capacitated set cover problem.
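For readers unfamiliar with the primal-dual idea the abstract builds on, here is a minimal sketch of the classic static, uncapacitated pricing scheme for weighted vertex cover, which gives a 2-approximation. It is not the dynamic or capacitated algorithm described above, and it has none of that algorithm's update-time guarantees. The function name `primal_dual_vertex_cover` and the example graph are illustrative assumptions.

```python
def primal_dual_vertex_cover(
    weights: dict[str, int], edges: list[tuple[str, str]]
) -> set[str]:
    """Static pricing scheme: each uncovered edge pays its endpoints
    until one of them becomes tight (residual weight zero)."""
    residual = dict(weights)  # weight not yet paid for by incident edges
    cover: set[str] = set()
    for u, v in edges:
        if u in cover or v in cover:
            continue  # edge is already covered
        # Raise this edge's dual variable as far as both endpoints allow.
        pay = min(residual[u], residual[v])
        residual[u] -= pay
        residual[v] -= pay
        # Tight vertices enter the cover.
        if residual[u] == 0:
            cover.add(u)
        if residual[v] == 0:
            cover.add(v)
    return cover


# Hypothetical example: a path a - b - c where the middle vertex is cheapest.
example_weights = {"a": 2, "b": 1, "c": 2}
example_edges = [("a", "b"), ("b", "c")]
example_cover = primal_dual_vertex_cover(example_weights, example_edges)
print(example_cover)  # → {'b'}
```

Integer weights keep the tightness check exact. The total weight of the returned cover is at most twice the sum of the edge payments, which in turn is a lower bound on the optimum.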
Angular-Resolved Optical Characteristics and Threshold Gain Analysis of GaN-Based 2-D Photonics Crystal Surface Emitting Lasers