
    Unsupervised Spoken Term Detection with Spoken Queries by Multi-level Acoustic Patterns with Varying Model Granularity

    This paper presents a new approach for unsupervised Spoken Term Detection with spoken queries using multiple sets of acoustic patterns automatically discovered from the target corpus. The different pattern HMM configurations (number of states per model, number of distinct models, number of Gaussians per state) form a three-dimensional model granularity space. The sets of acoustic patterns discovered at different points properly distributed over this three-dimensional space are complementary to one another, and can thus jointly capture the characteristics of the spoken terms. By representing both the spoken content and the spoken query as sequences of acoustic patterns, a series of approaches for matching the pattern index sequences while accounting for signal variations is developed. In this way, not only can the on-line computation load be reduced, but the signal variations caused by different speakers and acoustic conditions can also be reasonably handled. The results indicate that this approach significantly outperformed the unsupervised feature-based DTW baseline by 16.16% in mean average precision on the TIMIT corpus. Comment: Accepted by ICASSP 201
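    The feature-based DTW baseline mentioned above aligns a query's frame-level feature sequence against the spoken content by dynamic time warping. A minimal sketch of that general technique (not the paper's pattern-index matching) might look like the following, where `query` and `doc` are hypothetical frames-by-dimensions feature arrays:

    ```python
    import numpy as np

    def dtw_distance(query, doc):
        """Classic dynamic-time-warping distance between two feature
        sequences (frames x dims), as used in feature-based STD baselines."""
        n, m = len(query), len(doc)
        # Pairwise Euclidean distances between all frame pairs.
        cost = np.linalg.norm(query[:, None, :] - doc[None, :, :], axis=-1)
        # Accumulated-cost matrix with a one-frame border of infinities.
        acc = np.full((n + 1, m + 1), np.inf)
        acc[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                acc[i, j] = cost[i - 1, j - 1] + min(
                    acc[i - 1, j],      # insertion
                    acc[i, j - 1],      # deletion
                    acc[i - 1, j - 1],  # match
                )
        return acc[n, m]
    ```

    Replacing frame-level features with discrete acoustic-pattern indices, as the paper proposes, shrinks the matching problem and moves most of the cost off-line.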

    MPTQ-ViT: Mixed-Precision Post-Training Quantization for Vision Transformer

    While vision transformers (ViTs) have shown great potential in computer vision tasks, their intense computation and memory requirements pose challenges for practical applications. Existing post-training quantization methods leverage value redistribution or specialized quantizers to address the non-normal distributions in ViTs. However, without considering the asymmetry in activations, and relying on hand-crafted settings, these methods often struggle to maintain performance under low-bit quantization. To overcome these challenges, we introduce SmoothQuant with bias term (SQ-b) to alleviate the asymmetry issue and reduce the clamping loss. We also introduce optimal scaling factor ratio search (OPT-m) to determine quantization parameters automatically via a data-dependent mechanism. To further enhance compressibility, we combine the above techniques and propose a mixed-precision post-training quantization framework for vision transformers (MPTQ-ViT). We develop greedy mixed-precision quantization (Greedy MP) to allocate layer-wise bit-widths considering both model performance and compressibility. Our experiments on ViT, DeiT, and Swin demonstrate significant accuracy improvements over the SOTA on the ImageNet dataset. Specifically, our proposed methods achieve accuracy improvements ranging from 0.90% to 23.35% on 4-bit ViTs with single-precision and from 3.82% to 78.14% on 5-bit fully quantized ViTs with mixed-precision.
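    The asymmetry issue described above can be illustrated with a generic sketch: shifting each activation channel by a per-channel center (a bias term) before symmetric uniform quantization, so the representable range is not wasted on a lopsided distribution. This is only an assumption-laden illustration of the idea, not the paper's SQ-b formulation; `quantize_with_shift` and its return values are hypothetical names:

    ```python
    import numpy as np

    def quantize_with_shift(x, n_bits=4):
        """Center each channel, then apply uniform symmetric quantization.
        x: (tokens, channels) activation matrix. Returns the dequantized
        approximation plus the shift and scale used per channel."""
        # Per-channel center of the activation range (the "bias term").
        shift = (x.max(axis=0) + x.min(axis=0)) / 2.0
        centered = x - shift
        qmax = 2 ** (n_bits - 1) - 1
        # Per-channel scale so the largest centered value maps to qmax.
        scale = np.abs(centered).max(axis=0) / qmax
        scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero
        q = np.clip(np.round(centered / scale), -qmax - 1, qmax)
        return q * scale + shift, shift, scale
    ```

    Without the shift, a channel whose values all sit near, say, 10 to 14 would be quantized against a range anchored at zero, wasting most of the low-bit grid; centering first keeps the quantization error bounded by the per-channel scale.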

    Self-organization and the Process of Dynamic Learner Language Development

    Adopting Complex Dynamic Systems Theory (CDST) in Second Language Acquisition (SLA) is a testament to both revolutionary and evolutionary advancement in the field's theory and empirical practice. CDST is revolutionary because it warrants systems thinking about SLA phenomena, breaking the chain of dichotomous conceptualization on vital issues such as the mechanism of language acquisition and learning and the effectiveness of positive and negative evidence. The emergence of CDST, on the other hand, is an evolutionary product nurtured by the painstaking collaborations of SLA scholars over more than two decades of scientific inquiry (see, e.g., Han, 2019; Hiver & Al-Hoorie, 2019; Larsen-Freeman & Cameron, 2008; Ortega & Han, 2017). To capitalize on CDST as a valid approach to scholarly work, it is necessary to grapple with its fundamental constructs. This forum piece accentuates a critical notion of CDST: self-organization. By first referring to the theoretical aspects of self-organization, it seeks to demonstrate the relevance of this notion in SLA. It then reviews three sample studies homing in on learner language development through a CDST lens, with a specific focus on self-organization.