5,485 research outputs found

Box Drawings for Learning with Imbalanced Data

Author: Abe N.
Chawla N. V.
Qi Y.
Sniadecki J.
Wu G.
Publication date: 07/06/2014

The vast majority of real world classification problems are imbalanced, meaning there are far fewer data from the class of interest (the positive class) than from other classes. We propose two machine learning algorithms to handle highly imbalanced classification problems. The classifiers constructed by both methods are created as unions of axis-parallel rectangles around the positive examples, and thus have the benefit of being interpretable. The first algorithm uses mixed integer programming to optimize a weighted balance between positive and negative class accuracies. Regularization is introduced to improve generalization performance. The second method uses an approximation in order to assist with scalability. Specifically, it follows a "characterize then discriminate" approach, where the positive class is characterized first by boxes, and then each box boundary becomes a separate discriminative classifier. This method has two computational advantages: it can be easily parallelized, and it considers only the relevant regions of feature space.
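As an illustration of the classifier form described in this abstract, the following minimal sketch shows how a union of axis-parallel boxes could label points. It covers prediction only; neither the mixed integer program nor the "characterize then discriminate" procedure is reproduced. The function names and the example boxes are hypothetical and do not come from the paper.

```python
def in_box(point, lower, upper):
    """Return True if every coordinate of point lies within [lower, upper]."""
    return all(lo <= x <= hi for x, lo, hi in zip(point, lower, upper))


def predict_union_of_boxes(points, boxes):
    """Label a point 1 (positive) if it falls inside at least one box, else 0.

    boxes is a list of (lower_corner, upper_corner) pairs; these are assumed
    to have been learned already by some other procedure.
    """
    return [int(any(in_box(p, lo, hi) for lo, hi in boxes)) for p in points]


# Hypothetical boxes drawn around two clusters of positive examples.
example_boxes = [
    ((0.0, 0.0), (1.0, 1.0)),
    ((2.0, 2.0), (3.0, 4.0)),
]
example_points = [(0.5, 0.5), (1.5, 1.5), (2.5, 3.0)]
example_labels = predict_union_of_boxes(example_points, example_boxes)
```

Because each decision reduces to a few interval checks per feature, a positive prediction can be explained by naming the box that contains the point, which is the interpretability benefit the abstract mentions.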

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Crossref

Deep Over-sampling Framework for Classifying Imbalanced Data

Author: B Krawczyk
C Dong
G Hinton
GE Hinton
H He
KQ Weinberger
MD Zeiler
NV Chawla
P Jeatrakul
RA Dunne
S Ando
S Köknar-Tezel
Y Bengio
Y Lecun
ZH Zhou
Publication date: 12/07/2017

Class imbalance is a challenging issue in practical classification problems for deep learning models as well as traditional models. Traditionally successful countermeasures such as synthetic over-sampling have had limited success with complex, structured data handled by deep learning models. In this paper, we propose Deep Over-sampling (DOS), a framework for extending the synthetic over-sampling method to exploit the deep feature space acquired by a convolutional neural network (CNN). Its key feature is an explicit, supervised representation learning, for which the training data presents each raw input sample with a synthetic embedding target in the deep feature space, which is sampled from the linear subspace of in-class neighbors. We implement an iterative process of training the CNN and updating the targets, which induces smaller in-class variance among the embeddings, to increase the discriminative power of the deep representation. We present an empirical study using public benchmarks, which shows that the DOS framework not only counteracts class imbalance better than the existing method, but also improves the performance of the CNN in the standard, balanced settings
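The embedding targets mentioned in this abstract are sampled from the subspace spanned by in-class neighbors. A rough sketch of one way to do such sampling is given below, assuming random convex weights over the k nearest same-class embeddings. The function names, the choice of Euclidean distance, and the convex weighting are assumptions made for illustration; the CNN training loop and the paper's exact sampling scheme are not shown.

```python
import math
import random


def in_class_neighbors(index, embeddings, labels, k):
    """Indices of the k nearest embeddings that share the label of `index`."""
    same = [j for j in range(len(embeddings))
            if j != index and labels[j] == labels[index]]
    same.sort(key=lambda j: math.dist(embeddings[index], embeddings[j]))
    return same[:k]


def sample_target(index, embeddings, labels, k, rng):
    """Sample a synthetic target as a random convex combination of neighbors.

    Assumption: weights are drawn uniformly and normalized to sum to 1.
    If the sample has no in-class neighbors, its own embedding is returned.
    """
    neighbors = in_class_neighbors(index, embeddings, labels, k)
    if not neighbors:
        return list(embeddings[index])
    raw = [rng.random() + 1e-12 for _ in neighbors]  # offset avoids all-zero weights
    total = sum(raw)
    weights = [w / total for w in raw]
    dim = len(embeddings[index])
    return [sum(w * embeddings[j][d] for w, j in zip(weights, neighbors))
            for d in range(dim)]


# Toy embeddings: three samples of class 0 and a single sample of class 1.
toy_embeddings = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
toy_labels = [0, 0, 0, 1]
toy_target = sample_target(0, toy_embeddings, toy_labels, k=2, rng=random.Random(0))
```

In the toy data the target for sample 0 lies on the segment between (1, 0) and (0, 1), so it stays inside the region occupied by its own class.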

arXiv.org e-Print Archive

Crossref

Disruption of nNOS-NOS1AP protein-protein interactions suppresses neuropathic pain in mice

Author: Chawla Aarti
Courtney Michael J.
Hohmann Andrea G.
Hudmon Andy
Lai Yvonne Y.
Lee Wan-Hung
Li Li-Li
Publication venue: 'Ovid Technologies (Wolters Kluwer Health)'
Publication date: 01/05/2018

Elevated N-methyl-D-aspartate receptor (NMDAR) activity is linked to central sensitization and chronic pain. However, NMDAR antagonists display limited therapeutic potential because of their adverse side effects. Novel approaches targeting the NR2B-PSD95-nNOS complex to disrupt signaling pathways downstream of NMDARs show efficacy in preclinical pain models. Here, we evaluated the involvement of interactions between neuronal nitric oxide synthase (nNOS) and the nitric oxide synthase 1 adaptor protein (NOS1AP) in pronociceptive signaling and neuropathic pain. TAT-GESV, a peptide inhibitor of the nNOS-NOS1AP complex, disrupted the in vitro binding between nNOS and its downstream protein partner NOS1AP but not its upstream protein partner postsynaptic density 95 kDa (PSD95). Putative inactive peptides (TAT-cp4GESV and TAT-GESVΔ1) failed to do so. Only the active peptide protected primary cortical neurons from glutamate/glycine-induced excitotoxicity. TAT-GESV, administered intrathecally (i.t.), suppressed mechanical and cold allodynia induced by either the chemotherapeutic agent paclitaxel or a traumatic nerve injury induced by partial sciatic nerve ligation. TAT-GESV also blocked the paclitaxel-induced phosphorylation at Ser15 of p53, a substrate of p38 MAPK. Finally, TAT-GESV (i.t.) did not induce NMDAR-mediated motor ataxia in the rotarod test and did not alter basal nociceptive thresholds in the radiant heat tail-flick test. These observations support the hypothesis that antiallodynic efficacy of an nNOS-NOS1AP disruptor may result, at least in part, from blockade of p38 MAPK-mediated downstream effects. Our studies demonstrate, for the first time, that disrupting nNOS-NOS1AP protein-protein interactions attenuates mechanistically distinct forms of neuropathic pain without unwanted motor ataxic effects of NMDAR antagonists

IUPUIScholarWorks

On the optimality of gluing over scales

Author: I. Newman
J. Matoušek
J.R. Lee
N. Linial
R. Krauthgamer
S. Arora
S. Chawla
S. Rao
T.J. Laakso
U. Lang
Y. Aumann
Y. Benyamini
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

We show that for every $\alpha > 0$, there exist $n$-point metric spaces $(X,d)$ where every "scale" admits a Euclidean embedding with distortion at most $\alpha$, but the whole space requires distortion at least $\Omega(\sqrt{\alpha \log n})$. This shows that the scale-gluing lemma [Lee, SODA 2005] is tight, and disproves a conjecture stated there. This matching upper bound was known to be tight at both endpoints, i.e. when $\alpha = \Theta(1)$ and $\alpha = \Theta(\log n)$, but nowhere in between. More specifically, we exhibit $n$-point spaces with doubling constant $\lambda$ requiring Euclidean distortion $\Omega(\sqrt{\log \lambda \log n})$, which also shows that the technique of "measured descent" [Krauthgamer, et al., Geometric and Functional Analysis] is optimal. We extend this to obtain a similar tight result for $L_p$ spaces with $p > 1$.

Comment: minor revision

arXiv.org e-Print Archive

CiteSeerX

Crossref

Cholangitis Due to Candidiasis of the Extra-Hepatic Biliary Tract

Author: Chawla Y. K.
Singh Kartar
Vaiphei K.
Wig J. D.
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/1998

A case of isolated candidal fungal balls in the common bile duct causing obstructive jaundice and cholangitis is described. There were no predisposing factors. The fungal balls were removed from the common bile duct and a transduodenal sphincteroplasty was performed. Microscopic analysis yielded colonies of candida. The postoperative period was uneventful. At follow-up there was no evidence of candida infection. The patient is now 3 years post-surgery and is well.

Crossref

Directory of Open Access Journals

PubMed Central

Cell culture–based production of defective interfering influenza A virus particles in perfusion mode using an alternating tangential flow filtration system

Author: Cattaneo M.
Chawla A.
Genzel Y.
Hein M.
Kupke S.
Reichl U.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

Respiratory diseases including influenza A virus (IAV) infections represent a major threat to human health. While the development of a vaccine requires a lot of time, a fast countermeasure could be the use of defective interfering particles (DIPs) for antiviral therapy. IAV DIPs are usually characterized by a large internal deletion in one viral RNA segment. Consequently, DIPs can only propagate in the presence of infectious standard viruses (STVs), which compensate for the missing gene function. Here, they interfere with and suppress the STV replication and might act “universally” against many IAV subtypes. We recently reported a production system for purely clonal DIPs utilizing genetically modified cells. In the present study, we established an automated perfusion process for production of a DIP, called DI244, using an alternating tangential flow filtration (ATF) system for cell retention. Viable cell concentrations and DIP titers more than 10 times higher than for a previously reported batch cultivation were observed. Furthermore, we investigated a novel tubular cell retention device for its potential for continuous virus harvesting into the permeate. Performance comparable to that of typically used hollow fiber membranes was found during the cell growth phase. During the virus replication phase, the tubular membrane, in contrast to the hollow fiber membrane, allowed 100% of the produced virus particles to pass through. To our knowledge, this is the first time a continuous virus harvest was shown for a membrane-based perfusion process. Overall, the process established offers interesting possibilities for advanced process integration strategies for next-generation virus particle and virus vector manufacturing.

Key points
• An automated perfusion process for production of IAV DIPs was established.
• DIP titers of 7.40E+9 plaque forming units per mL were reached.
• A novel tubular cell retention device enabled continuous virus harvesting.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00253-021-11561-y

PubMed Central

MPG.PuRe

Parallel Perceptrons, Activation Margins and Imbalanced Training Set Pruning

Author: J. Dorronsoro
J.A. Swets
N. Chawla
N. Nilsson
P. Auer
T. Fawcett
Y. Freund
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

The final publication is available at Springer via http://dx.doi.org/10.1007/11492542_6

Proceedings of Second Iberian Conference, IbPRIA 2005, Estoril, Portugal, June 7-9, 2005, Part II

A natural way to deal with training samples in imbalanced class problems is to prune them, removing redundant patterns, which are easy to classify and probably over-represented, and label-noisy patterns, which belong to one class but are labelled as members of another. This allows classifier construction to focus on borderline patterns, likely to be the most informative ones. To appropriately define the above subsets, in this work we use as base classifiers the so-called parallel perceptrons, a novel approach to committee machine training that allows, among other things, margins to be defined naturally for hidden unit activations. We use these margins to define the above pattern types and to iteratively perform subsample selections in an initial training set that enhance classification accuracy and allow for balanced classifier performance even when class sizes are greatly different.

With partial support of Spain’s CICyT, TIC 01–572, TIN2004–0767

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

WTEN: An advanced coupled tensor factorization strategy for learning from imbalanced data

Author: AK Menon
FM Harper
G Wu
H He
JP Bradford
NV Chawla
R Akbani
T Fawcett
T Jo
TG Kolda
XY Liu
Y Koren
ZH Zhou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

© Springer International Publishing AG 2016. Learning from imbalanced and sparse data in multi-mode and high-dimensional tensor formats efficiently is a significant problem in data mining research. On one hand, Coupled Tensor Factorization (CTF) has become one of the most popular methods for joint analysis of heterogeneous sparse data generated from different sources. On the other hand, techniques such as sampling, cost-sensitive learning, etc. have been applied to many supervised learning models to handle imbalanced data. This research focuses on studying the effectiveness of combining advantages of both CTF and imbalanced data learning techniques for missing entry prediction, especially for entries with rare class labels. Importantly, we have also investigated the implication of joint analysis of the main tensor and extra information. One of our major goals is to design a robust weighting strategy for CTF to be able to not only effectively recover missing entries but also perform well when the entries are associated with imbalanced labels. Experiments on both real and synthetic datasets show that our approach outperforms existing CTF algorithms on imbalanced data.
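To make the idea of weighting entries by class more concrete, here is a minimal sketch of a class-weighted squared reconstruction error over the observed entries of a three-mode tensor under a rank-R CP-style factorization. It is not the WTEN weighting strategy, and the coupling with extra information is omitted. The function name, the weight values, and the data layout are all assumptions chosen for illustration.

```python
def weighted_reconstruction_error(observed, A, B, C, class_weight):
    """Sum of class-weighted squared errors over observed tensor entries.

    observed: list of (i, j, k, value, label) tuples for known entries.
    A, B, C: factor matrices as lists of rows, one row per index of each mode.
    class_weight: hypothetical mapping from label to a positive weight, e.g.
    larger weights for rare labels so that their errors count for more.
    """
    total = 0.0
    for i, j, k, value, label in observed:
        # CP-style prediction: sum over rank components of the factor products.
        prediction = sum(a * b * c for a, b, c in zip(A[i], B[j], C[k]))
        total += class_weight[label] * (value - prediction) ** 2
    return total


# Toy rank-1 factors and two observed entries, one rare and one common.
toy_A = [[1.0], [2.0]]
toy_B = [[2.0]]
toy_C = [[3.0]]
toy_observed = [(0, 0, 0, 5.0, "rare"), (1, 0, 0, 12.0, "common")]
toy_weights = {"rare": 4.0, "common": 1.0}
toy_loss = weighted_reconstruction_error(toy_observed, toy_A, toy_B, toy_C, toy_weights)
```

With these toy values the rare entry is predicted as 6 against an observed 5, so its unit squared error is scaled by the weight 4, while the common entry is reconstructed exactly and contributes nothing.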

Crossref

OPUS - University of Technology Sydney