
3,459 research outputs found

Outlier Detection from Network Data with Subnetwork Interpretation

Author: Basu Prithwish
Dang Xuan-Hong
Silva Arlei
Singh Ambuj
Swami Ananthram
Publication date: 30/09/2016

Detecting a small number of outliers in a set of observations is always challenging. The problem is harder still in the setting of multiple network samples, where computing the anomalous degree of a network sample is generally not sufficient: explaining why the network is exceptional, expressed in the form of a subnetwork, is equally important. In this paper, we develop a novel algorithm to address these two key problems. We treat each network sample as a potential outlier and identify the subnetworks that best discriminate it from nearby regular samples. The algorithm is developed in the framework of network regression, combined with constraints on both network topology and L1-norm shrinkage, to perform subnetwork discovery. Our method thus goes beyond subspace/subgraph discovery, and we show that it converges to a global optimum. Evaluation on various real-world network datasets demonstrates that our algorithm not only outperforms baselines in both network and high-dimensional settings, but also discovers highly relevant and interpretable local subnetworks, further enhancing our understanding of anomalous networks.

arXiv.org e-Print Archive

Crossref

Product-based Neural Networks for User Response Prediction

Author: Cai Han
Qu Yanru
Ren Kan
Wang Jun
Wen Ying
Yu Yong
Zhang Weinan
Publication date: 01/11/2016

Predicting user responses, such as clicks and conversions, is of great importance and is used in many Web applications, including recommender systems, web search and online advertising. The data in those applications is mostly categorical and contains multiple fields; it is typically transformed into a high-dimensional sparse binary feature representation via one-hot encoding. Faced with this extreme sparsity, traditional models may be limited to mining shallow patterns from the data, i.e. low-order feature combinations. Deep models like deep neural networks, on the other hand, cannot be directly applied to the high-dimensional input because of the huge feature space. In this paper, we propose Product-based Neural Networks (PNN) with an embedding layer to learn a distributed representation of the categorical data, a product layer to capture interactive patterns between inter-field categories, and further fully connected layers to explore high-order feature interactions. Our experimental results on two large-scale real-world ad click datasets demonstrate that PNNs consistently outperform the state-of-the-art models on various metrics. Comment: 6 pages, 5 figures, ICDM201
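As a rough illustration of the ideas named in this abstract (one-hot encoding of multi-field categorical data, a per-field embedding lookup, and pairwise products between field embeddings), the following is a minimal pure-Python sketch. It is not the authors' implementation: the field names, vocabulary, embedding size and random weights are all hypothetical, the product is assumed to be a plain inner product, and no training or fully connected layers are included.

```python
import random

def one_hot(record, vocab):
    """Encode a multi-field categorical record as one sparse binary vector."""
    vec = []
    for field, categories in vocab.items():
        vec.extend(1 if record[field] == c else 0 for c in categories)
    return vec

def embed(record, tables):
    """Look up one dense vector per field (same result as one-hot times a weight matrix)."""
    return [tables[field][record[field]] for field in tables]

def inner_products(embeddings):
    """Inner product for every pair of field embeddings (assumed form of the product layer)."""
    out = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            out.append(sum(a * b for a, b in zip(embeddings[i], embeddings[j])))
    return out

# Hypothetical vocabulary: three fields with a handful of categories each.
vocab = {
    "weekday": ["Mon", "Tue", "Wed"],
    "gender": ["F", "M"],
    "city": ["London", "Paris"],
}

# Hypothetical, randomly initialised embedding tables (untrained).
rng = random.Random(0)
dim = 4
tables = {
    field: {c: [rng.uniform(-1, 1) for _ in range(dim)] for c in cats}
    for field, cats in vocab.items()
}

record = {"weekday": "Tue", "gender": "F", "city": "London"}

x = one_hot(record, vocab)      # sparse binary input, one active entry per field
e = embed(record, tables)       # three dense vectors of length `dim`
p = inner_products(e)           # one value per pair of fields

# Concatenated embeddings plus pairwise products would feed the later dense layers.
features = [v for emb in e for v in emb] + p
print(x, len(features))
```

With three fields the sketch yields three pairwise products; the number of pairs grows quadratically with the number of fields, which is one reason such a layer is usually kept separate from the raw sparse input.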

arXiv.org e-Print Archive

Crossref

Automating the Construction of Jet Observables with Machine Learning

Author: Datta Kaustuv
Larkoski Andrew
Nachman Benjamin
Publication venue: American Physical Society (APS)
Publication date: 05/03/2019

Machine-learning assisted jet substructure tagging techniques have the potential to significantly improve searches for new particles and Standard Model measurements in hadronic final states. Techniques with simple analytic forms are particularly useful for establishing robustness and gaining physical insight. We introduce a procedure to automate the construction of a large class of observables that are chosen to completely specify $M$-body phase space. The procedure is validated on the task of distinguishing $H\rightarrow b\bar{b}$ from $g\rightarrow b\bar{b}$, where $M=3$ and previous brute-force approaches to construct an optimal product observable for the $M$-body phase space have established the baseline performance. We then use the new method to design tailored observables for the boosted $Z'$ search, where $M=4$ and brute-force methods are intractable. The new classifiers outperform standard $2$-prong tagging observables, illustrating the power of the new optimization method for improving searches and measurement at the LHC and beyond. Comment: 15 pages, 8 tables, 12 figures

arXiv.org e-Print Archive

Repository for Publications and Research Data

eScholarship - University of California