Outlier Detection from Network Data with Subnetwork Interpretation
Detecting a small number of outliers from a set of data observations is
always challenging. This problem is more difficult in the setting of multiple
network samples, where computing the anomalous degree of a network sample is
generally not sufficient. In fact, explaining why the network is exceptional,
expressed in the form of a subnetwork, is equally important. In this paper,
we develop a novel algorithm to address these two key problems. We treat each
network sample as a potential outlier and identify the subnetworks that best
discriminate it from nearby regular samples. The algorithm is developed in the
framework of network regression combined with the constraints on both network
topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus
goes beyond subspace/subgraph discovery and we show that it converges to a
global optimum. Evaluation on various real-world network datasets demonstrates
that our algorithm not only outperforms baselines in both network and
high-dimensional settings, but also discovers highly relevant and interpretable
local subnetworks, further enhancing our understanding of anomalous networks.
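To illustrate the L1-norm shrinkage mentioned in the abstract above, here is a minimal sketch of soft-thresholding, the standard proximal step for an L1 penalty. It is not the paper's algorithm: the abstract does not give the update rule, and the names `soft_threshold`, `shrink_edges`, and `lam`, as well as the edge weights, are hypothetical. The sketch only shows how shrinkage zeroes out weak edge coefficients so that the surviving edges form a sparse candidate subnetwork.

```python
def soft_threshold(value, lam):
    """Shrink a single coefficient toward zero by lam (assumed L1 penalty weight)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def shrink_edges(edge_weights, lam):
    """Apply soft-thresholding to every edge coefficient.

    edge_weights maps a hypothetical edge (u, v) to its regression coefficient.
    Edges whose coefficient is shrunk to exactly zero are dropped, so the
    returned dict describes a sparser subnetwork.
    """
    shrunk = {edge: soft_threshold(w, lam) for edge, w in edge_weights.items()}
    return {edge: w for edge, w in shrunk.items() if w != 0.0}


# Illustrative values only, not data from the paper.
weights = {("a", "b"): 0.5, ("b", "c"): -0.2, ("c", "d"): 0.05}
subnetwork = shrink_edges(weights, lam=0.1)
```

In this toy run the weakest edge, `("c", "d")`, is removed and the remaining coefficients move toward zero by `lam`. The topology constraint described in the abstract is not modeled here.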
Product-based Neural Networks for User Response Prediction
Predicting user responses, such as clicks and conversions, is of great
importance and has found its usage in many Web applications including
recommender systems, web search and online advertising. The data in those
applications is mostly categorical and contains multiple fields; a typical
representation is to transform it into a high-dimensional sparse binary feature
representation via one-hot encoding. Faced with extreme sparsity,
traditional models may be limited to mining shallow patterns from the
data, i.e. low-order feature combinations. Deep models like deep neural
networks, on the other hand, cannot be directly applied to the
high-dimensional input because of the huge feature space. In this paper, we
propose Product-based Neural Networks (PNNs) with an embedding layer to learn
a distributed representation of the categorical data, a product layer to
capture interactive patterns between inter-field categories, and further fully
connected layers to explore high-order feature interactions. Our experimental
results on two large-scale real-world ad click datasets demonstrate that PNNs
consistently outperform the state-of-the-art models on various metrics.
Comment: 6 pages, 5 figures, ICDM201
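The three stages named in the PNN abstract above (embedding layer, product layer, fully connected layers) can be sketched as a forward pass in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not say which product operation is used, so a pairwise inner product is assumed, and all names (`init_params`, `pnn_forward`, the field sizes, the layer widths) are hypothetical.

```python
import math
import random


def embed(field_indices, tables):
    """Look up one embedding vector per field.

    Picking row i of a table is equivalent to multiplying that field's
    one-hot vector by the embedding matrix.
    """
    return [tables[f][i] for f, i in enumerate(field_indices)]


def inner_products(vectors):
    """Product layer (assumed variant): inner product of every pair of fields."""
    out = []
    for a in range(len(vectors)):
        for b in range(a + 1, len(vectors)):
            out.append(sum(x * y for x, y in zip(vectors[a], vectors[b])))
    return out


def dense(inputs, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def pnn_forward(field_indices, params):
    """Return a predicted response probability for one categorical sample."""
    vectors = embed(field_indices, params["tables"])
    linear_part = [x for v in vectors for x in v]      # concatenated embeddings
    product_part = inner_products(vectors)             # pairwise interactions
    hidden = [max(0.0, h) for h in
              dense(linear_part + product_part, params["w1"], params["b1"])]
    logit = dense(hidden, params["w2"], params["b2"])[0]
    return 1.0 / (1.0 + math.exp(-logit))


def init_params(field_sizes, dim, hidden, seed=0):
    """Randomly initialise parameters (untrained, for shape illustration only)."""
    rng = random.Random(seed)
    n = len(field_sizes)
    in_dim = n * dim + n * (n - 1) // 2

    def rand(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "tables": [rand(size, dim) for size in field_sizes],
        "w1": rand(hidden, in_dim), "b1": [0.0] * hidden,
        "w2": rand(1, hidden), "b2": [0.0],
    }


# Three hypothetical fields with 3, 4, and 2 categories each.
params = init_params(field_sizes=[3, 4, 2], dim=4, hidden=5)
probability = pnn_forward([0, 2, 1], params)
```

With three fields the product layer yields three pairwise terms, which are concatenated with the embeddings before the hidden layer. The weights here are random, so `probability` is meaningful only as a shape check, not as a prediction.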
Automating the Construction of Jet Observables with Machine Learning
Machine-learning assisted jet substructure tagging techniques have the
potential to significantly improve searches for new particles and Standard
Model measurements in hadronic final states. Techniques with simple analytic
forms are particularly useful for establishing robustness and gaining physical
insight. We introduce a procedure to automate the construction of a large class
of observables that are chosen to completely specify -body phase space. The
procedure is validated on the task of distinguishing
from , where and previous brute-force approaches
to construct an optimal product observable for the -body phase space have
established the baseline performance. We then use the new method to design
tailored observables for the boosted search, where and brute-force
methods are intractable. The new classifiers outperform standard -prong
tagging observables, illustrating the power of the new optimization method for
improving searches and measurements at the LHC and beyond.
Comment: 15 pages, 8 tables, 12 figures