Adaptive Storey's null proportion estimator
False discovery rate (FDR) is a commonly used criterion in multiple testing
and the Benjamini-Hochberg (BH) procedure is arguably the most popular approach
with FDR guarantee. To improve power, the adaptive BH procedure has been
proposed by incorporating various null proportion estimators, among which
Storey's estimator has gained substantial popularity. The performance of
Storey's estimator hinges on a critical hyper-parameter, where a pre-fixed
configuration lacks power and existing data-driven hyper-parameters compromise
the FDR control. In this work, we propose a novel class of adaptive
hyper-parameters and establish the FDR control of the associated BH procedure
using a martingale argument. Within this class of data-driven hyper-parameters,
we present a specific configuration designed to maximize the number of
rejections and characterize the convergence of this proposal to the optimal
hyper-parameter under a commonly-used mixture model. We evaluate our adaptive
Storey's null proportion estimator and the associated BH procedure on extensive
simulated data and a motivating protein dataset. Our proposal exhibits
significant power gains when dealing with a considerable proportion of weak
non-nulls or a conservative null distribution. Comment: 17 pages, 4 figures, 1 table
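For orientation, the classical fixed-hyper-parameter version of the procedure described above can be sketched as follows. This is not the adaptive hyper-parameter proposed in the paper: it is the textbook Storey estimator with the usual +1 correction, evaluated at a pre-fixed threshold `lam`, followed by BH run at level alpha divided by the estimated null proportion. The function names, the default `lam=0.5`, and the cap of the estimate at 1 are assumptions made for illustration.

```python
def storey_pi0(pvalues, lam=0.5):
    """Storey's null proportion estimate at a fixed threshold lam.

    Uses the common +1 correction: (1 + #{p_i > lam}) / (m * (1 - lam)).
    The cap at 1.0 is a convenience choice for this sketch.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, (1 + above) / (m * (1.0 - lam)))


def adaptive_bh(pvalues, alpha=0.05, lam=0.5):
    """Indices rejected by BH run at level alpha / pi0_hat (step-up rule)."""
    m = len(pvalues)
    pi0 = storey_pi0(pvalues, lam)
    # Sort hypotheses by p-value, keeping track of original positions.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Largest rank whose p-value falls under the adjusted BH line.
        if pvalues[i] <= rank * alpha / (m * pi0):
            k = rank
    return sorted(order[:k])
```

A smaller estimated null proportion raises the BH rejection line, which is where the power gain of the adaptive procedure comes from; the choice of `lam` is exactly the hyper-parameter the abstract discusses.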
Data Augmentation for Time-Series Classification: An Extensive Empirical Study and Comprehensive Survey
Data Augmentation (DA) has emerged as an indispensable strategy in Time
Series Classification (TSC), primarily due to its capacity to amplify training
samples, thereby bolstering model robustness, diversifying datasets, and
curtailing overfitting. However, the current landscape of DA in TSC is plagued
with fragmented literature reviews, nebulous methodological taxonomies,
inadequate evaluative measures, and a dearth of accessible, user-oriented
tools. In light of these challenges, this study embarks on an exhaustive
dissection of DA methodologies within the TSC realm. Our initial approach
involved an extensive literature review spanning a decade, revealing that
contemporary surveys scarcely capture the breadth of advancements in DA for
TSC, prompting us to meticulously analyze over 100 scholarly articles to
distill more than 60 unique DA techniques. This rigorous analysis precipitated
the formulation of a novel taxonomy, purpose-built for the intricacies of DA in
TSC, categorizing techniques into five principal echelons:
Transformation-Based, Pattern-Based, Generative, Decomposition-Based, and
Automated Data Augmentation. Our taxonomy promises to serve as a robust
navigational aid for scholars, offering clarity and direction in method
selection. Addressing the conspicuous absence of holistic evaluations for
prevalent DA techniques, we executed an all-encompassing empirical assessment,
wherein upwards of 15 DA strategies were subjected to scrutiny across 8 UCR
time-series datasets, employing ResNet and a multi-faceted evaluation paradigm
encompassing Accuracy, Method Ranking, and Residual Analysis, yielding a
benchmark accuracy of 88.94 ± 11.83%. Our investigation underscored the
inconsistent efficacies of DA techniques, with..
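As a rough illustration of the Transformation-Based category named in the taxonomy, the sketch below applies two widely used transformations, jittering (additive Gaussian noise) and magnitude scaling, to univariate series. It is not taken from the survey's tooling: the function names, the noise levels, and the number of copies are hypothetical choices.

```python
import random


def jitter(series, sigma=0.03, rng=None):
    """Add independent Gaussian noise to every time step."""
    rng = rng or random.Random()
    return [x + rng.gauss(0.0, sigma) for x in series]


def scale(series, sigma=0.1, rng=None):
    """Multiply the whole series by one random factor drawn around 1."""
    rng = rng or random.Random()
    factor = rng.gauss(1.0, sigma)
    return [x * factor for x in series]


def augment(dataset, labels, copies=2, seed=0):
    """Return the original samples plus `copies` augmented versions of each.

    Augmented samples keep the label of the series they were derived from.
    """
    rng = random.Random(seed)
    out_x, out_y = list(dataset), list(labels)
    for series, label in zip(dataset, labels):
        for _ in range(copies):
            out_x.append(scale(jitter(series, rng=rng), rng=rng))
            out_y.append(label)
    return out_x, out_y
```

Both transformations preserve series length and label, which is why they are easy to drop in front of a classifier; whether they help on a given dataset is the kind of question the empirical study addresses.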
Identifying Strongly Lensed Gravitational Waves with the Third-generation Detectors
Joint detection of GW signals by a network of instruments yields higher
signal-to-noise ratios (SNRs) and so improves the ability to detect faint and
distant GW signals, which could also improve the ability to detect lensed GWs,
especially for third-generation (3G) detectors such as the Einstein Telescope
(ET) and Cosmic Explorer (CE). However, identifying Strongly Lensed
Gravitational Waves (SLGWs) is still challenging. In this article we focus on
the identification ability of 3G detectors. We predict and analyze the SNR
distribution of SLGW signals and show that only 50.6% of SLGW pairs detected by
ET alone can be identified by the Lens Bayes factor (LBF), currently a popular
method for identifying SLGWs. For SLGW pairs detected by the CE&ET network,
owing to the superior spatial resolution, this number rises to 87.3%. Moreover,
we obtain an approximate analytical relation between SNR and LBF. We give clear
SNR limits for identifying SLGWs and estimate the expected yearly detection
rates of galaxy-scale lensed GWs that can be identified with a 3G detector
network. Comment: 9 pages, 7 figures
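The statement that a detector network reaches higher SNRs than a single instrument can be illustrated with the standard quadrature rule for combining per-detector SNRs, which assumes independent noise in each detector. This is a generic sketch, not the paper's pipeline; the function name and the example values are hypothetical.

```python
import math


def network_snr(detector_snrs):
    """Combine per-detector SNRs in quadrature (independent-noise assumption)."""
    return math.sqrt(sum(rho * rho for rho in detector_snrs))
```

For example, `network_snr([3.0, 4.0])` gives 5.0, larger than either single-detector value.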
Long-tail Cross Modal Hashing
Existing Cross Modal Hashing (CMH) methods are mainly designed for balanced
data, while imbalanced data with a long-tail distribution is more common in
the real world. Several long-tail hashing methods have been proposed, but they
cannot adapt to multi-modal data, due to the complex interplay between labels
and the individuality and commonality information of multi-modal data. Furthermore, CMH
methods mostly mine the commonality of multi-modal data to learn hash codes,
which may override tail labels encoded by the individuality of respective
modalities. In this paper, we propose LtCMH (Long-tail CMH) to handle
imbalanced multi-modal data. LtCMH first adopts auto-encoders to mine the
individuality and commonality of different modalities by minimizing the
dependency between the individuality of respective modalities and by enhancing
the commonality of these modalities. Then it dynamically combines the
individuality and commonality with direct features extracted from respective
modalities to create meta features that enrich the representation of tail
labels, and binarizes the meta features to generate hash codes. LtCMH significantly
outperforms state-of-the-art baselines on long-tail datasets and holds a better
(or comparable) performance on datasets with balanced labels. Comment: Accepted by the Thirty-Seventh AAAI Conference on Artificial
Intelligence (AAAI 2023)
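The final step described above, turning real-valued features into binary hash codes and retrieving by code similarity, can be sketched generically as follows. The abstract does not say how LtCMH binarizes its meta features; sign thresholding and Hamming-distance ranking are assumed here because they are the common convention in hashing-based retrieval, and all names are hypothetical.

```python
def binarize(features):
    """Threshold real-valued features at zero to obtain a binary code."""
    return [1 if v >= 0 else 0 for v in features]


def hamming(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def rank_by_hamming(query_code, database_codes):
    """Database indices ordered from nearest to farthest code."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )
```

In a cross-modal setting the query code would come from one modality (say, text) and the database codes from another (say, images), which only works if both modalities are mapped into the same code space.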
Facile synthesis of mono-disperse sub-20 nm NaY(WO4)2:Er3+,Yb3+ upconversion nanoparticles: A new choice for nanothermometry
- …