An Unsupervised Autoregressive Model for Speech Representation Learning
This paper proposes a novel unsupervised autoregressive neural model for
learning generic speech representations. In contrast to other speech
representation learning methods that aim to remove noise or speaker
variabilities, ours is designed to preserve information for a wide range of
downstream tasks. In addition, the proposed model does not require any phonetic
or word boundary labels, allowing the model to benefit from large quantities of
unlabeled data. Speech representations learned by our model significantly
improve performance on both phone classification and speaker verification over
the surface features and other supervised and unsupervised approaches. Further
analysis shows that different levels of speech information are captured by our
model at different layers. In particular, the lower layers tend to be more
discriminative for speakers, while the upper layers provide more phonetic
content.

Comment: Accepted to Interspeech 2019. Code available at:
https://github.com/iamyuanchung/Autoregressive-Predictive-Codin
Constructing an Index for Brand Equity: A Hospital Example
If two hospitals are providing identical services in all respects, except for the brand name, why are customers willing to pay more for one hospital than the other? That is, the brand name is not just a name, but a name that contains value (brand equity). Brand equity is the value that the brand name endows to the product, such that consumers are willing to pay a premium price for products with the particular brand name. Accordingly, a company needs to manage its brand carefully so that its brand equity does not depreciate. Although measuring brand equity is important, managers have no brand equity index that is psychometrically robust and parsimonious enough for practice. Indeed, index construction is quite different from conventional scale development. Moreover, researchers might still be unaware of the potential appropriateness of formative indicators for operationalizing particular constructs. Towards this end, drawing on the brand equity literature and following the index construction procedure, this study creates a brand equity index for a hospital. The results reveal a parsimonious five-indicator brand equity index that can adequately capture the full domain of brand equity. This study also illustrates the differences between index construction and scale development.
CONVERSER: Few-Shot Conversational Dense Retrieval with Synthetic Data Generation
Conversational search provides a natural interface for information retrieval
(IR). Recent approaches have demonstrated promising results in applying dense
retrieval to conversational IR. However, training dense retrievers requires
large amounts of in-domain paired data. This hinders the development of
conversational dense retrievers, as abundant in-domain conversations are
expensive to collect. In this paper, we propose CONVERSER, a framework for
training conversational dense retrievers with at most 6 examples of in-domain
dialogues. Specifically, we utilize the in-context learning capability of large
language models to generate conversational queries given a passage in the
retrieval corpus. Experimental results on conversational retrieval benchmarks
OR-QuAC and TREC CAsT 19 show that the proposed CONVERSER achieves comparable
performance to fully-supervised models, demonstrating the effectiveness of our
proposed framework in few-shot conversational dense retrieval. All source code
and generated datasets are available at https://github.com/MiuLab/CONVERSER

Comment: Accepted to SIGDIAL 202
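The in-context generation step can be sketched as prompt assembly: a handful of (passage, history, query) demonstrations followed by the new passage, asking the LLM for the next user query. The demonstration texts and prompt wording below are invented for illustration, and no actual LLM call is made; a real pipeline would send `prompt` to a generation API.

```python
# Few-shot prompt construction for generating conversational queries from
# passages, in the spirit of CONVERSER. Example texts and prompt wording
# are illustrative, not the paper's actual prompts.

FEW_SHOT = [  # (passage, conversation history, next query) -- invented examples
    ("The Eiffel Tower was completed in 1889 for the World's Fair.",
     "Q1: Where is the Eiffel Tower?",
     "When was it built?"),
]

def build_prompt(passage, history, examples=FEW_SHOT):
    """Assemble an in-context learning prompt: a few demonstrations,
    then the new passage, ending where the LLM should continue."""
    parts = []
    for ex_passage, ex_history, ex_query in examples:
        parts.append(f"Passage: {ex_passage}\n"
                     f"History: {ex_history}\n"
                     f"Next query: {ex_query}")
    parts.append(f"Passage: {passage}\nHistory: {history}\nNext query:")
    return "\n\n".join(parts)

prompt = build_prompt(
    "Dense retrieval encodes queries and passages into a shared vector space.",
    "Q1: What is dense retrieval?",
)
print(prompt)
```

The generated query is then paired with its source passage, yielding synthetic in-domain training data for the dense retriever.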
Testing Monotonicity of Mean Potential Outcomes in a Continuous Treatment
While most treatment evaluations focus on binary interventions, a growing
literature also considers continuously distributed treatments, e.g. hours spent
in a training program to assess its effect on labor market outcomes. In this
paper, we propose a Cramér-von Mises-type test for testing whether the mean
potential outcome given a specific treatment has a weakly monotonic
relationship with the treatment dose under a weak unconfoundedness assumption.
This appears interesting for testing shape restrictions, e.g. whether
increasing the treatment dose always has a non-negative effect, no matter what
the baseline level of treatment is. We formally show that the proposed test
controls asymptotic size and is consistent against any fixed alternative. These
theoretical findings are supported by the method's finite sample behavior in
our Monte-Carlo simulations. As an empirical illustration, we apply our test to
the Job Corps study and reject a weakly monotonic relationship between the
treatment (hours in academic and vocational training) and labor market outcomes
like earnings or employment.
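A simplified illustration of a Cramér-von Mises-type discrepancy against weak monotonicity: estimate the mean outcome over a grid of dose bins and aggregate squared decreases between successive bins. This sketch omits the paper's unconfoundedness weighting and the derivation of critical values, and the data and bin count are invented for the example.

```python
import numpy as np

def cvm_monotonicity_stat(dose, outcome, n_bins=10):
    """Squared-violation statistic against a weakly increasing mean
    outcome in the treatment dose. A simplified stand-in for the
    paper's test: no propensity weighting, no critical values."""
    edges = np.quantile(dose, np.linspace(0, 1, n_bins + 1))
    means = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (dose >= lo) & (dose <= hi)
        if mask.any():
            means.append(outcome[mask].mean())
    means = np.array(means)
    # Accumulate squared drops between successive dose bins (violations).
    drops = np.clip(means[:-1] - means[1:], 0, None)
    return float(np.sum(drops ** 2))

rng = np.random.default_rng(1)
d = rng.uniform(0, 10, 2000)
monotone = 0.5 * d + rng.normal(0, 0.1, d.size)       # increasing in dose
hump = -(d - 5) ** 2 + rng.normal(0, 0.1, d.size)     # rises, then falls

s_mono = cvm_monotonicity_stat(d, monotone)
s_hump = cvm_monotonicity_stat(d, hump)
print(f"monotone: {s_mono:.4f}  hump-shaped: {s_hump:.4f}")
```

The statistic is near zero for the monotone dose-response and large for the hump-shaped one; the actual test compares such a statistic against resampled critical values to control size.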
Hey! I Have Something for You: Paging Cycle Based Random Access for LTE-A
The surge of M2M devices imposes new challenges for the current cellular network architecture, especially in radio access networks. One of the key issues is that M2M traffic, characterized by small data and massive connection requests, causes significant collisions and congestion during network access via the random access (RA) procedure. To resolve this problem, in this paper, we propose a paging cycle-based protocol, rpHint, to facilitate the random access procedure in LTE-A. The high-level idea of our design is to leverage a UE's paging cycle as a hint to preassign RA preambles so that UEs can avoid preamble collisions in the first place. rpHint has two modes: (1) collision-free paging, which completely prevents cross-collision between paged user equipment (UEs) and random access UEs, and (2) collision-avoidance paging, which alleviates cross-collision. Moreover, we formulate a mathematical model to derive the optimal paging ratio that maximizes the expected number of successful UEs. This analysis also allows us to dynamically select the better of the two modes. We show via extensive simulations that our design increases the number of successful UEs in an RA procedure by more than 3× compared to the legacy LTE RA scheme.
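A toy simulation conveys why preassignment helps: if paged UEs receive distinct preambles keyed to their paging occasion, they cannot collide with each other or with contending UEs, whereas under legacy RA every UE draws uniformly at random. All numbers below (54 contention preambles, 20 UEs of each kind) are illustrative assumptions, not figures from the paper.

```python
import random
from collections import Counter

random.seed(0)
N_PREAMBLES = 54          # contention preambles per RA opportunity (illustrative)
n_paged, n_random = 20, 20

def count_successes(choices):
    """A UE succeeds when no other UE picked the same preamble."""
    counts = Counter(choices)
    return sum(1 for c in choices if counts[c] == 1)

# Legacy RA: every UE draws a preamble uniformly at random.
legacy = [random.randrange(N_PREAMBLES) for _ in range(n_paged + n_random)]
s_legacy = count_successes(legacy)

# Collision-free paging (sketch): paged UEs get distinct preassigned
# preambles via their paging cycle; only unpaged UEs contend, on the rest.
reserved = list(range(n_paged))                                   # distinct
contending = [random.randrange(n_paged, N_PREAMBLES) for _ in range(n_random)]
s_hint = count_successes(reserved + contending)

print(f"legacy successes: {s_legacy}, paging-hint successes: {s_hint}")
```

By construction, every paged UE succeeds in the preassigned scheme; the residual contention is confined to the unpaged UEs on the remaining preambles.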
On Optimizing Signaling Efficiency of Retransmissions for Voice LTE
The emergence of voice over LTE enables voice traffic transmissions over 4G packet-switched networks. Since voice traffic is characterized by small payloads and frequent transmissions, the corresponding control channel overhead would be high. Semi-persistent scheduling (SPS) was hence proposed in LTE-A to reduce such overhead. However, as wireless channels typically fluctuate, the numerous retransmissions caused by poor channel conditions, which are still scheduled dynamically, lead to a large overhead. To reduce the control message overhead caused by SPS retransmissions, we propose a new SPS retransmission protocol. Unlike traditional SPS, which removes the downlink control indicators (DCIs) directly, we compress some key fields of all retransmissions' DCIs in the same subframe into a fixed-length hint. Thus, the base station does not need to send this information to different users individually but simply announces the hint as a broadcast message. In this way, we reduce the signaling overhead while preserving the flexibility of dynamic scheduling. Our simulation results show that, by enabling DCI compression, our design improves signaling efficiency by 2.16×, and spectral utilization can be increased by up to 60%.
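A back-of-envelope comparison of the two signaling schemes, assuming hypothetical field sizes: the DCI, compressed-field, and broadcast-header bit counts below are illustrative, not exact 3GPP values.

```python
# Rough signaling-overhead comparison for SPS retransmissions (sketch).
# All field sizes are illustrative assumptions, not 3GPP-exact numbers.

DCI_BITS = 43            # approx. size of one downlink control indicator
HINT_FIELD_BITS = 12     # compressed key fields kept per retransmission
HEADER_BITS = 16         # fixed broadcast-hint header

def per_ue_overhead(n_retx):
    """Legacy: one full DCI unicast to each UE with a retransmission."""
    return n_retx * DCI_BITS

def hint_overhead(n_retx):
    """Proposed: one broadcast hint carrying compressed fields for all
    retransmissions in the subframe."""
    return HEADER_BITS + n_retx * HINT_FIELD_BITS

for n in (1, 5, 10):
    print(n, per_ue_overhead(n), hint_overhead(n))
```

Because the fixed header is amortized over all retransmissions in a subframe, the broadcast hint wins whenever the per-retransmission compressed fields are smaller than a full DCI.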
Model Extraction Attack against Self-supervised Speech Models
Self-supervised learning (SSL) speech models generate meaningful
representations of given clips and achieve incredible performance across
various downstream tasks. Model extraction attack (MEA) often refers to an
adversary stealing the functionality of the victim model with only query
access. In this work, we study the MEA problem against SSL speech models with a
small number of queries. We propose a two-stage framework to extract the model.
In the first stage, SSL is conducted on the large-scale unlabeled corpus to
pre-train a small speech model. Secondly, we actively sample a small portion of
clips from the unlabeled corpus and query the target model with these clips to
acquire their representations as labels for the small model's second-stage
training. Experimental results show that our sampling methods can effectively
extract the target model without knowing any information about its model
architecture.
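The second stage can be sketched as black-box distillation: query the victim on a sampled subset of clips and fit the student to the returned representations. Here the victim is a hidden random linear map and least squares stands in for the student's gradient-based training; both are simplifications of the paper's setup, and all dimensions are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
feat_dim, rep_dim = 64, 16

# Black-box "victim" SSL model: the attacker has query access only.
# Here it is a hidden linear map for the sake of a runnable example.
W_target = rng.standard_normal((feat_dim, rep_dim))

def query_target(clips):
    return clips @ W_target

# Stage 2 (sketch): sample a small set of clips from the unlabeled pool and
# use the victim's representations as training labels for the student.
pool = rng.standard_normal((5000, feat_dim))
sampled = pool[rng.choice(pool.shape[0], size=200, replace=False)]
labels = query_target(sampled)

# Student "training": least squares stands in for gradient-based distillation.
W_student, *_ = np.linalg.lstsq(sampled, labels, rcond=None)

held_out = rng.standard_normal((100, feat_dim))
err = np.max(np.abs(query_target(held_out) - held_out @ W_student))
print(f"max held-out representation error: {err:.2e}")
```

With 200 queries against a 64-dimensional input space, the student matches the victim on held-out clips, illustrating how few queries can suffice when the sampled clips cover the input distribution.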