Feature Augmentation via Nonparametrics and Selection (FANS) in High Dimensional Classification
We propose a high dimensional classification method that involves
nonparametric feature augmentation. Knowing that marginal density ratios are
the most powerful univariate classifiers, we use the ratio estimates to
transform the original feature measurements. Subsequently, penalized logistic
regression is invoked, taking as input the newly transformed or augmented
features. This procedure trains models equipped with local complexity and
global simplicity, thereby avoiding the curse of dimensionality while creating
a flexible nonlinear decision boundary. The resulting method is called Feature
Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by
generalizing the Naive Bayes model, writing the log ratio of joint densities as
a linear combination of those of marginal densities. It is related to
generalized additive models, but has better interpretability and computability.
Risk bounds are developed for FANS. In numerical analysis, FANS is compared
with competing methods, so as to provide a guideline on its best application
domain. Real data analysis demonstrates that FANS performs very competitively
on benchmark email spam and gene expression data sets. Moreover, FANS is
implemented by an extremely fast algorithm through parallel computing. Comment: 30 pages, 2 figure
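The motivation stated in the abstract above can be written out as a short formula. This is only a sketch: the symbols $f$, $g$ (class-conditional joint densities), $f_j$, $g_j$ (their marginals), $p$ (number of features) and $\beta_j$ (coefficients) are assumed notation, not taken from the paper.

```latex
% Naive Bayes: features are independent within each class,
% so the joint log density ratio is a plain sum of marginal log ratios.
\log \frac{f(x)}{g(x)} = \sum_{j=1}^{p} \log \frac{f_j(x_j)}{g_j(x_j)}

% Generalization described in the abstract: a linear combination of the
% marginal log ratios, with coefficients fitted by penalized logistic regression.
\log \frac{f(x)}{g(x)} \approx \beta_0 + \sum_{j=1}^{p} \beta_j \log \frac{f_j(x_j)}{g_j(x_j)}
```

Setting $\beta_0 = 0$ and every $\beta_j = 1$ recovers the Naive Bayes form, which is why the second expression can be read as a generalization of it.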
STABILITY ANALYSIS AND HOPF BIFURCATION OF DENSITY-DEPENDENT PREDATOR-PREY SYSTEMS WITH BEDDINGTON-DEANGELIS FUNCTIONAL RESPONSE
In this article, we study stability and Hopf bifurcation in a density-dependent predator-prey system with the Beddington-DeAngelis functional response under certain parametric conditions. We start with the condition for the existence of a unique positive equilibrium, and provide two sufficient conditions for its local stability using the Lyapunov function method and the Routh-Hurwitz criterion, respectively. Then, we establish sufficient conditions for the global stability of the positive equilibrium by proving the non-existence of closed orbits in the first quadrant R²+. Afterwards, we analyze the Hopf bifurcation geometrically by exploring the monotonicity of the trace of the Jacobian matrix with respect to r and analytically verifying that there is a unique r* such that the trace is equal to 0. We also introduce an auxiliary map by restricting all five parameters to a special one-dimensional geometrical structure and analyze the Hopf bifurcation with respect to all five parameters. Finally, we present numerical simulations that agree with our analytical results.
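For readers unfamiliar with the terms used in the abstract above, the standard textbook forms are sketched here. The parameter names $\alpha, a, b, c$ and the notation $J$ for the Jacobian at the positive equilibrium are assumptions; the paper's own parameterization may differ.

```latex
% Beddington-DeAngelis functional response (a common parameterization):
% prey density x, predator density y.
f(x, y) = \frac{\alpha x}{a + b x + c y}

% Planar system: the characteristic polynomial of the Jacobian J is
\lambda^2 - \operatorname{tr}(J)\,\lambda + \det(J) = 0

% Routh-Hurwitz criterion in two dimensions (local asymptotic stability):
\operatorname{tr}(J) < 0, \qquad \det(J) > 0

% Hopf bifurcation candidate at r = r^*: purely imaginary eigenvalues
% crossing the imaginary axis with nonzero speed.
\operatorname{tr} J(r^*) = 0, \qquad \det J(r^*) > 0, \qquad
\left.\frac{d}{dr} \operatorname{tr} J(r)\right|_{r = r^*} \neq 0
```

The last condition is where monotonicity of the trace in $r$ helps: a strictly monotonic trace has at most one zero, and a nonzero derivative at that zero gives the transversality condition.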
How Re-sampling Helps for Long-Tail Learning?
Long-tail learning has received significant attention in recent years due to
the challenge it poses with extremely imbalanced datasets. In these datasets,
only a few classes (known as the head classes) have an adequate number of
training samples, while the rest of the classes (known as the tail classes) are
infrequent in the training data. Re-sampling is a classical and widely used
approach for addressing class imbalance issues. Unfortunately, recent studies
claim that re-sampling brings negligible performance improvements in modern
long-tail learning tasks. This paper aims to investigate this phenomenon
systematically. Our research shows that re-sampling can considerably improve
generalization when the training images do not contain semantically irrelevant
contexts. In other scenarios, however, it can learn unexpected spurious
correlations between irrelevant contexts and target labels. We design
experiments on two homogeneous datasets, one containing irrelevant context and
the other not, to confirm our findings. To prevent the learning of spurious
correlations, we propose a new context shift augmentation module that generates
diverse training images for the tail class by maintaining a context bank
extracted from the head-class images. Experiments demonstrate that our proposed
module can boost the generalization and outperform other approaches, including
class-balanced re-sampling, decoupled classifier re-training, and data
augmentation methods. The source code is available at
https://www.lamda.nju.edu.cn/code_CSA.ashx. Comment: Accepted by NeurIPS 202
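As a point of reference for the abstract above, class-balanced re-sampling in its generic form can be sketched as follows. This is not the paper's context shift augmentation module; the label list and the function name `class_balanced_weights` are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def class_balanced_weights(labels):
    """Return per-sample weights giving every class the same total sampling mass."""
    counts = Counter(labels)
    num_classes = len(counts)
    # Each class gets total mass 1 / num_classes, split evenly among its samples.
    return [1.0 / (num_classes * counts[y]) for y in labels]


# Hypothetical imbalanced label list: one head class, one tail class.
labels = ["head"] * 90 + ["tail"] * 10
weights = class_balanced_weights(labels)

# Draw a re-sampled batch of indices; tail samples are now drawn about as
# often as head samples, although each tail image is repeated far more.
rng = random.Random(0)
batch = rng.choices(range(len(labels)), weights=weights, k=1000)
```

Because each tail image is repeated many times, any context it happens to contain is repeated too, which is the setting in which the abstract reports spurious correlations being learned.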
SPA: On-Line Availability Upgrades for Parity-based RAIDs through Supplementary Parity Augmentations
In this paper, we propose a simple but powerful on-line availability upgrade mechanism, Supplementary Parity Augmentations (SPA), to address the availability issue for parity-based RAID systems. The basic idea of SPA is to store and update the supplementary parity units on one or a few newly augmented spare disks for on-line RAID systems in the operational mode, thus achieving the goals of improving the reconstruction performance while tolerating multiple disk failures and latent sector errors simultaneously. By applying the exclusive OR operations appropriately among supplementary parity, full parity and data units, SPA can reconstruct the data on the failed disks with a fraction of the original overhead that is proportional to the supplementary parity coverage, thus significantly reducing the overhead of data regeneration and decreasing recovery time in parity-based RAID systems. In particular, SPA has two supplementary-parity coverage orientations, SPA Vertical and SPA Diagonal, which cater to users' different availability needs. The former, which calculates the supplementary parity of a fixed subset of the disks, can tolerate more disk failures and sector errors, whereas the latter shifts the coverage of supplementary parity by one disk for each stripe to balance the workload and thus maximize the performance of reconstruction during recovery. SPA with a single supplementary-parity disk can be viewed as a variant of the RAID5+0 architecture, but it differs significantly in that it can easily and dynamically upgrade a RAID5 system to a RAID5+0-like system without any change to the data layout of the RAID5 system. Our extensive trace-driven simulation study shows that both SPA orientations can significantly improve the reconstruction performance of the RAID5 system while SPA Diagonal significantly improves the reconstruction performance of RAID5+0, at an acceptable performance overhead imposed in the operational mode. Moreover, our analytical reliability modeling and Sequential Monte-Carlo simulation demonstrate that both SPA orientations consistently more than double the MTTDL of the RAID5 system and noticeably improve the reliability of the RAID5+0 system.
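The XOR-based reconstruction described above can be illustrated with a toy stripe. This is a minimal sketch under stated assumptions, not the paper's layout: the four-unit stripe, the coverage set `covered`, and the helper names are all hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe: four data units, one per data disk.
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66", b"\x77\x88"]
full_parity = xor_blocks(data)  # RAID5-style parity over all data units
covered = [0, 1]                # assumed supplementary-parity coverage
supp_parity = xor_blocks([data[i] for i in covered])


def reconstruct(failed):
    """Rebuild one failed data unit; return (block, number of units read)."""
    if failed in covered:
        # Inside the coverage: supplementary parity plus the other covered units.
        sources = [supp_parity] + [data[i] for i in covered if i != failed]
    else:
        # Outside the coverage: full parity XOR supplementary parity cancels
        # the covered units, leaving only the uncovered ones to read.
        others = [data[i] for i in range(len(data))
                  if i not in covered and i != failed]
        sources = [full_parity, supp_parity] + others
    return xor_blocks(sources), len(sources)
```

In this toy stripe, plain RAID5 recovery of any data unit reads four units (three surviving data units plus the full parity), while `reconstruct(0)` reads two and `reconstruct(2)` reads three, which is the sense in which the overhead scales with the supplementary parity coverage.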
Expressions of COX-2 and VEGF-C in gastric cancer: correlations with lymphangiogenesis and prognostic implications
Background: Cyclooxygenase-2 (COX-2) has recently been considered to promote lymphangiogenesis by up-regulating vascular endothelial growth factor-C (VEGF-C) in breast and lung cancer. However, the impact of COX-2 on lymphangiogenesis of gastric cancer remains unclear. This study aims to test the expression of COX-2 and VEGF-C in human gastric cancer, and to analyze the correlation with lymphatic vessel density (LVD), clinicopathologic features and survival prognosis. Methods: Using immunohistochemistry, COX-2, VEGF-C and level of LVD were analyzed in 56 R0-resected primary gastric adenocarcinomas, while paracancerous normal mucosal tissues were also collected as control from 25 concurrent patients. The relationships among COX-2 and VEGF-C expression, LVD, and clinicopathologic parameters were analyzed. The correlations of COX-2, VEGF-C and level of LVD with patient prognosis were also evaluated by univariate tests and multivariate Cox regression. Results: The expression rates of COX-2 and VEGF-C were 69.64% and 55.36%, respectively, in gastric carcinoma. Peritumoral LVD was significantly higher than that in both normal and intratumoral tissue (P < 0.05). It was significantly correlated with lymph node metastasis and invasion depth (P = 0.003, P = 0.05). VEGF-C was significantly associated with peritumoral LVD (r = 0.308, P = 0.021). However, COX-2 was not correlated with VEGF-C (r = 0.110, P = 0.419) or LVD (r = 0.042, P = 0.758). Univariate analysis showed that survival time was impaired by higher COX-2 expression and higher peritumoral LVD. Multivariate survival analysis showed that age, COX-2 expression and peritumoral LVD were independent prognostic factors. Conclusions: Although COX-2 expression was associated with survival time, it was not correlated with VEGF-C and peritumoral LVD. Our data did not show that overexpression of COX-2 promotes tumor lymphangiogenesis through an up-regulation of VEGF-C expression in gastric carcinoma. Age, COX-2 and peritumoral LVD were independent prognostic factors for human gastric carcinoma.
A Survey on Extreme Multi-label Learning
Multi-label learning has attracted significant attention from both academia
and industry in recent decades. Although existing multi-label learning
algorithms have achieved good performance in various tasks, they implicitly assume
that the target label space is not huge, which can be restrictive in
real-world scenarios. Moreover, it is infeasible to directly adapt them to
extremely large label spaces because of the compute and memory overhead.
Therefore, eXtreme Multi-label Learning (XML) is becoming an important task, and
many effective approaches have been proposed. To fully understand XML, we conduct a
survey study in this paper. We first clarify a formal definition for XML from
the perspective of supervised learning. Then, based on different model
architectures and challenges of the problem, we provide a thorough discussion
of the advantages and disadvantages of each category of methods. For the
benefit of conducting empirical studies, we collect abundant resources
regarding XML, including code implementations and useful tools. Lastly, we
propose possible research directions in XML, such as new evaluation metrics,
the tail label problem, and weakly supervised XML. Comment: A preliminary versio
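The supervised-learning setup that the abstract above refers to is commonly written as follows. This is a generic sketch, not the survey's own definition; the symbols $N$, $d$, $L$ and $f$ are assumed notation.

```latex
% Training set of N instances with d-dimensional features and L candidate labels.
\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}, \qquad
x_i \in \mathbb{R}^{d}, \qquad y_i \in \{0, 1\}^{L}

% Goal: learn a scoring function over all labels; the "extreme" regime is the
% one where L is so large that per-label models become too costly.
f : \mathbb{R}^{d} \to \mathbb{R}^{L}
```

Here $y_{il} = 1$ means that label $l$ is relevant to instance $i$, and typically only a small number of entries of each $y_i$ are nonzero.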
Investigation of ultra-thin Al₂O₃ film as Cu diffusion barrier on low-k (k=2.5) dielectrics
Ultrathin Al₂O₃ films were deposited by PEALD as a Cu diffusion barrier on low-k (k=2.5) material. The thermal stability and electrical properties of the Cu/low-k system with Al₂O₃ layers of different thicknesses were studied after annealing. The AES, TEM and EDX results revealed that the ultrathin Al₂O₃ films are thermally stable and have excellent Cu diffusion barrier performance. The electrical measurements of dielectric breakdown and TDDB tests further confirmed that the ultrathin Al₂O₃ film is a potential Cu diffusion barrier in the Cu/low-k interconnect system.