4,694 research outputs found

    Feature Augmentation via Nonparametrics and Selection (FANS) in High Dimensional Classification

    We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing. Comment: 30 pages, 2 figures
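The feature-augmentation step described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the Gaussian kernel, the fixed `bandwidth`, the `eps` guard, and all function names are assumptions made here. Each feature is replaced by an estimated log ratio of its class-conditional marginal densities; the transformed features would then be passed to a penalized logistic regression, which is not shown.

```python
import math

def gaussian_kde(sample, bandwidth):
    """Return a 1-D Gaussian kernel density estimate built from `sample`."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / norm

    return density

def fit_log_ratio_transforms(X, y, bandwidth=0.5, eps=1e-12):
    """For each feature j, estimate x -> log(f_j(x) / g_j(x)).

    f_j is the marginal density of feature j in class 1, g_j in class 0.
    `bandwidth` and `eps` are illustrative choices (assumptions).
    """
    transforms = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        f, g = gaussian_kde(pos, bandwidth), gaussian_kde(neg, bandwidth)
        # Bind f and g as defaults so each lambda keeps its own estimators.
        transforms.append(
            lambda x, f=f, g=g: math.log(f(x) + eps) - math.log(g(x) + eps)
        )
    return transforms

def augment(X, transforms):
    """Apply the per-feature log-ratio transforms to every row of X."""
    return [[t(v) for t, v in zip(transforms, row)] for row in X]

# Toy data: one feature, class 1 centred near 2 and class 0 near -2.
X_train = [[2.0], [2.2], [1.8], [-2.0], [-2.1], [-1.9]]
y_train = [1, 1, 1, 0, 0, 0]
transforms = fit_log_ratio_transforms(X_train, y_train)
Z = augment([[2.0], [-2.0]], transforms)
```

A positive augmented value indicates that the class-1 marginal density dominates at that point, and a negative value indicates the opposite.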

    STABILITY ANALYSIS AND HOPF BIFURCATION OF DENSITY-DEPENDENT PREDATOR-PREY SYSTEMS WITH BEDDINGTON-DEANGELIS FUNCTIONAL RESPONSE

    In this article, we study a density-dependent predator-prey system with the Beddington-DeAngelis functional response for stability and Hopf bifurcation under certain parametric conditions. We start with the condition for the existence of the unique positive equilibrium, and provide two sufficient conditions for its local stability by the Lyapunov function method and the Routh-Hurwitz criterion, respectively. Then, we establish sufficient conditions for the global stability of the positive equilibrium by proving the non-existence of closed orbits in the first quadrant R²+. Afterwards, we analyze the Hopf bifurcation geometrically by exploring the monotonic property of the trace of the Jacobian matrix with respect to r and analytically verifying that there is a unique r* such that the trace is equal to 0. We also introduce an auxiliary map by restricting all five parameters to a special one-dimensional geometrical structure and analyze the Hopf bifurcation with respect to all five parameters. Finally, numerical simulations are presented that agree with our analytical results.
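The role of the trace in the Hopf analysis can be made explicit for a generic planar system. The following is a textbook sketch, not the paper's derivation; the symbols J, T(r), and D(r) are introduced here for illustration and denote the Jacobian at the positive equilibrium, its trace, and its determinant.

```latex
\begin{align*}
  \lambda_{\pm}(r) &= \frac{T(r) \pm \sqrt{T(r)^{2} - 4D(r)}}{2},
    & T(r) &= \operatorname{tr} J, \quad D(r) = \det J, \\
  \text{local stability:} \quad & T(r) < 0 \ \text{and} \ D(r) > 0
    && \text{(Routh-Hurwitz for a quadratic)}, \\
  \text{Hopf candidate:} \quad & T(r^{*}) = 0,\ D(r^{*}) > 0
    && \Longrightarrow \ \lambda_{\pm}(r^{*}) = \pm i\sqrt{D(r^{*})}, \\
  \text{transversality:} \quad &
    \frac{d}{dr}\operatorname{Re}\lambda_{\pm}(r)\Big|_{r=r^{*}}
    = \tfrac{1}{2}\,T'(r^{*}) \neq 0 .
\end{align*}
```

Under this reading, strict monotonicity of T in r guarantees at most one root r*, and a nonzero derivative at that root supplies the transversality condition.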

    How Re-sampling Helps for Long-Tail Learning?

    Long-tail learning has received significant attention in recent years due to the challenge it poses with extremely imbalanced datasets. In these datasets, only a few classes (known as the head classes) have an adequate number of training samples, while the rest of the classes (known as the tail classes) are infrequent in the training data. Re-sampling is a classical and widely used approach for addressing class imbalance issues. Unfortunately, recent studies claim that re-sampling brings negligible performance improvements in modern long-tail learning tasks. This paper aims to investigate this phenomenon systematically. Our research shows that re-sampling can considerably improve generalization when the training images do not contain semantically irrelevant contexts. In other scenarios, however, it can learn unexpected spurious correlations between irrelevant contexts and target labels. We design experiments on two homogeneous datasets, one containing irrelevant context and the other not, to confirm our findings. To prevent the learning of spurious correlations, we propose a new context shift augmentation module that generates diverse training images for the tail class by maintaining a context bank extracted from the head-class images. Experiments demonstrate that our proposed module can boost the generalization and outperform other approaches, including class-balanced re-sampling, decoupled classifier re-training, and data augmentation methods. The source code is available at https://www.lamda.nju.edu.cn/code_CSA.ashx. Comment: Accepted by NeurIPS 202
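For readers unfamiliar with class-balanced re-sampling, which the abstract names as a baseline, a minimal sketch follows. It assumes the common formulation in which every class receives equal total sampling probability; the abstract does not specify the exact sampler, and the function names and toy labels are hypothetical.

```python
import random
from collections import Counter

def class_balanced_weights(labels):
    """Per-sample weights that give every class the same total probability."""
    counts = Counter(labels)
    num_classes = len(counts)
    return [1.0 / (num_classes * counts[label]) for label in labels]

def resample_indices(labels, k, seed=0):
    """Draw k training indices with replacement using class-balanced weights."""
    rng = random.Random(seed)
    weights = class_balanced_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=k)

# Toy long-tailed label list: 90 head-class samples, 10 tail-class samples.
labels = ["head"] * 90 + ["tail"] * 10
weights = class_balanced_weights(labels)
indices = resample_indices(labels, k=1000)
```

With these weights, each tail sample is drawn nine times as often as each head sample, so both classes are expected to appear equally often in a re-sampled batch.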

    SPA: On-Line Availability Upgrades for Parity-based RAIDs through Supplementary Parity Augmentations

    In this paper, we propose a simple but powerful on-line availability upgrade mechanism, Supplementary Parity Augmentations (SPA), to address the availability issue for parity-based RAID systems. The basic idea of SPA is to store and update the supplementary parity units on one or a few newly augmented spare disks for on-line RAID systems in the operational mode, thus achieving the goals of improving the reconstruction performance while tolerating multiple disk failures and latent sector errors simultaneously. By applying the exclusive OR operations appropriately among supplementary parity, full parity and data units, SPA can reconstruct the data on the failed disks with a fraction of the original overhead that is proportional to the supplementary parity coverage, thus significantly reducing the overhead of data regeneration and decreasing recovery time in parity-based RAID systems. In particular, SPA has two supplementary-parity coverage orientations, SPA Vertical and SPA Diagonal, which cater to users' different availability needs. The former, which calculates the supplementary parity of a fixed subset of the disks, can tolerate more disk failures and sector errors; the latter shifts the coverage of supplementary parity by one disk for each stripe to balance the workload and thus maximize the performance of reconstruction during recovery. SPA with a single supplementary-parity disk can be viewed as a variant of, but significantly different from, the RAID5+0 architecture, in that the former can easily and dynamically upgrade a RAID5 system to a RAID5+0-like system without any change to the data layout of the RAID5 system. Our extensive trace-driven simulation study shows that both SPA orientations can significantly improve the reconstruction performance of the RAID5 system, while SPA Diagonal significantly improves the reconstruction performance of RAID5+0, at an acceptable performance overhead imposed in the operational mode. Moreover, our analytical reliability modeling and Sequential Monte-Carlo simulation demonstrate that both SPA orientations consistently more than double the MTTDL of the RAID5 system and noticeably improve the reliability of the RAID5+0 system.
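The XOR relationships behind this reconstruction saving can be illustrated with a toy stripe. This is a sketch under stated assumptions, not the paper's layout: the four-disk stripe, the covered subset {0, 1}, the byte values, and the helper name `xor_blocks` are all hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe of a hypothetical RAID5 array with four data units.
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66", b"\x77\x88"]
full_parity = xor_blocks(data)  # ordinary RAID5 parity over all data units

# Assumption: the supplementary parity covers the fixed subset {disk 0, disk 1}.
covered = [0, 1]
supp_parity = xor_blocks([data[i] for i in covered])

# Case 1: disk 1 (inside the covered subset) fails.
# Conventional rebuild reads the 3 surviving data units plus the full parity.
via_full = xor_blocks([data[0], data[2], data[3], full_parity])
# Rebuild with supplementary parity reads only 1 data unit plus 1 parity unit.
via_supp = xor_blocks([data[0], supp_parity])

# Case 2: disk 2 (outside the covered subset) fails.
# Full parity XOR supplementary parity equals the parity of the uncovered
# units, so the rebuild skips the covered data units entirely.
via_both = xor_blocks([full_parity, supp_parity, data[3]])
```

In this toy stripe the covered-disk rebuild reads two units instead of four, which matches the abstract's claim that the overhead scales with the supplementary parity coverage.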

    Expressions of COX-2 and VEGF-C in gastric cancer: correlations with lymphangiogenesis and prognostic implications

    Background: Cyclooxygenase-2 (COX-2) has recently been considered to promote lymphangiogenesis by up-regulating vascular endothelial growth factor-C (VEGF-C) in breast and lung cancer. However, the impact of COX-2 on lymphangiogenesis of gastric cancer remains unclear. This study aims to test the expression of COX-2 and VEGF-C in human gastric cancer, and to analyze the correlation with lymphatic vessel density (LVD), clinicopathologic features and survival prognosis. Methods: Using immunohistochemistry, COX-2, VEGF-C and level of LVD were analyzed in 56 R0-resected primary gastric adenocarcinomas, while paracancerous normal mucosal tissues were also collected as control from 25 concurrent patients. The relationships among COX-2 and VEGF-C expression, LVD, and clinicopathologic parameters were analyzed. The correlations of COX-2, VEGF-C and level of LVD with patient prognosis were also evaluated by univariate tests and multivariate Cox regression. Results: The expression rates of COX-2 and VEGF-C were 69.64% and 55.36%, respectively, in gastric carcinoma. Peritumoral LVD was significantly higher than that in both normal and intratumoral tissue (P < 0.05). It was significantly correlated with lymph node metastasis and invasion depth (P = 0.003, P = 0.05). VEGF-C was significantly associated with peritumoral LVD (r = 0.308, P = 0.021). However, COX-2 was not correlated with VEGF-C (r = 0.110, P = 0.419) or LVD (r = 0.042, P = 0.758). Univariate analysis showed that survival time was impaired by higher COX-2 expression and higher peritumoral LVD. Multivariate survival analysis showed that age, COX-2 expression and peritumoral LVD were independent prognostic factors. Conclusions: Although COX-2 expression was associated with survival time, it was not correlated with VEGF-C and peritumoral LVD. Our data did not show that overexpression of COX-2 promotes tumor lymphangiogenesis through an up-regulation of VEGF-C expression in gastric carcinoma. Age, COX-2 and peritumoral LVD were independent prognostic factors for human gastric carcinoma.

    A Survey on Extreme Multi-label Learning

    Multi-label learning has attracted significant attention from both academia and industry in recent decades. Although existing multi-label learning algorithms have achieved good performance in various tasks, they implicitly assume that the target label space is not huge, which can be restrictive in real-world scenarios. Moreover, it is infeasible to directly adapt them to an extremely large label space because of the compute and memory overhead. Therefore, eXtreme Multi-label Learning (XML) is becoming an important task, and many effective approaches have been proposed. To fully understand XML, we conduct a survey study in this paper. We first clarify a formal definition for XML from the perspective of supervised learning. Then, based on different model architectures and challenges of the problem, we provide a thorough discussion of the advantages and disadvantages of each category of methods. For the benefit of conducting empirical studies, we collect abundant resources regarding XML, including code implementations and useful tools. Lastly, we propose possible research directions in XML, such as new evaluation metrics, the tail label problem, and weakly supervised XML. Comment: A preliminary version

    Investigation of ultra-thin Al₂O₃ film as Cu diffusion barrier on low-k (k=2.5) dielectrics

    Ultrathin Al₂O₃ films were deposited by PEALD as a Cu diffusion barrier on low-k (k=2.5) material. The thermal stability and electrical properties of the Cu/low-k system with Al₂O₃ layers of different thicknesses were studied after annealing. The AES, TEM and EDX results revealed that the ultrathin Al₂O₃ films are thermally stable and have excellent Cu diffusion barrier performance. The electrical measurements of dielectric breakdown and TDDB tests further confirmed that the ultrathin Al₂O₃ film is a potential Cu diffusion barrier in the Cu/low-k interconnect system.