Probabilistic Receiver Architecture Combining BP, MF, and EP for Multi-Signal Detection
Receiver algorithms which combine belief propagation (BP) with the mean field
(MF) approximation are well-suited for inference of both continuous and
discrete random variables. In wireless scenarios involving detection of
multiple signals, the standard construction of the combined BP-MF framework
includes the equalization or multi-user detection functions within the MF
subgraph. In this paper, we show that the MF approximation is not particularly
effective for multi-signal detection. We develop a new factor graph
construction for application of the BP-MF framework to problems involving the
detection of multiple signals. We then develop a low-complexity variant of the
proposed construction in which Gaussian BP is applied to the equalization
factors. In this case, the factor graph of the joint probability distribution
is divided into three subgraphs: (i) a MF subgraph comprised of the observation
factors and channel estimation, (ii) a Gaussian BP subgraph which is applied to
multi-signal detection, and (iii) a discrete BP subgraph which is applied to
demodulation and decoding. Expectation propagation is used to approximate
discrete distributions with a Gaussian distribution and links the discrete BP
and Gaussian BP subgraphs. The result is a probabilistic receiver architecture
with strong theoretical justification which can be applied to multi-signal
detection. Comment: 30 pages, 9 figures
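As a rough illustration of how expectation propagation can link a discrete BP subgraph to a Gaussian BP subgraph, the sketch below shows only the projection step: a discrete symbol distribution is replaced by the Gaussian with the same mean and variance. The function name `project_to_gaussian` and the QPSK example are assumptions made for illustration and are not taken from the paper; a full EP update would also divide out the incoming Gaussian message, which is omitted here.

```python
import numpy as np

def project_to_gaussian(symbols, probs):
    """Moment-match a discrete distribution over complex symbols to a Gaussian.

    Returns the mean and variance of the discrete distribution, which are the
    parameters of the matching (circularly symmetric) Gaussian.
    Hypothetical helper, not the paper's implementation.
    """
    symbols = np.asarray(symbols, dtype=complex)
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum()          # normalise beliefs from the decoder
    mean = np.sum(probs * symbols)
    var = np.sum(probs * np.abs(symbols - mean) ** 2)
    return mean, var

# Unit-energy QPSK constellation (assumed example alphabet).
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

# Beliefs favouring the first symbol shrink the variance below 1.
mean, var = project_to_gaussian(qpsk, [0.7, 0.1, 0.1, 0.1])
```

With uniform beliefs the projection returns zero mean and unit variance; as the discrete beliefs sharpen, the variance shrinks toward zero.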
A Spectral Framework for Anomalous Subgraph Detection
A wide variety of application domains are concerned with data consisting of
entities and their relationships or connections, formally represented as
graphs. Within these diverse application areas, a common problem of interest is
the detection of a subset of entities whose connectivity is anomalous with
respect to the rest of the data. While the detection of such anomalous
subgraphs has received a substantial amount of attention, no
application-agnostic framework exists for analysis of signal detectability in
graph-based data. In this paper, we describe a framework that enables such
analysis using the principal eigenspace of a graph's residuals matrix, commonly
called the modularity matrix in community detection. Leveraging this analytical
tool, we show that the framework has a natural power metric in the spectral
norm of the anomalous subgraph's adjacency matrix (signal power) and of the
background graph's residuals matrix (noise power). We propose several
algorithms based on spectral properties of the residuals matrix, with more
computationally expensive techniques providing greater detection power.
Detection and identification performance are presented for a number of signal
and noise models, including clusters and bipartite foregrounds embedded into
simple random backgrounds as well as graphs with community structure and
realistic degree distributions. The trends observed verify intuition gleaned
from other signal processing areas, such as greater detection power when the
signal is embedded within a less active portion of the background. We
demonstrate the utility of the proposed techniques in detecting small, highly
anomalous subgraphs in real graphs derived from Internet traffic and product
co-purchases. Comment: In submission to the IEEE, 16 pages, 8 figures
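The quantities named in this abstract can be sketched numerically. The example below assumes the usual modularity-matrix form of the residuals, B = A - k k^T / (2m), where k is the degree vector and m the number of edges. It computes the spectral norm of a planted clique's adjacency matrix (signal power) and of a random background's residuals matrix (noise power). The graph sizes, the Erdős–Rényi background, and all function names are illustrative assumptions, not the authors' experimental setup.

```python
import numpy as np

def residuals_matrix(adj):
    """Modularity-style residuals B = A - k k^T / (2m); assumes at least one edge."""
    adj = np.asarray(adj, dtype=float)
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()                # sum of degrees equals 2m
    return adj - np.outer(degrees, degrees) / two_m

def spectral_norm(mat):
    """Largest singular value of a matrix."""
    return np.linalg.norm(mat, 2)

rng = np.random.default_rng(0)
n, p, clique_size = 200, 0.05, 12

# Background: undirected Erdős–Rényi graph without self-loops.
upper = np.triu(rng.random((n, n)) < p, k=1)
background = (upper | upper.T).astype(float)

# Foreground: a clique on the first `clique_size` nodes.
clique = np.zeros((n, n))
clique[:clique_size, :clique_size] = 1.0
np.fill_diagonal(clique, 0.0)

observed = np.maximum(background, clique)

signal_power = spectral_norm(clique)                        # clique_size - 1
noise_power = spectral_norm(residuals_matrix(background))
top_eigenvalue = np.linalg.eigvalsh(residuals_matrix(observed))[-1]
```

Comparing `signal_power` with `noise_power` gives a quick feel for whether the planted subgraph should stand out in the principal eigenspace of the observed residuals.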
Unsupervised Learning of Spike Patterns for Seizure Detection and Wavefront Estimation of High Resolution Micro Electrocorticographic (μECoG) Data
For the past few years, we have developed flexible, active, multiplexed
recording devices for high resolution recording over large, clinically relevant
areas in the brain. While this technology has enabled a much higher-resolution
view of the electrical activity of the brain, the analytical methods to
process, categorize and respond to the huge volumes of seizure data produced by
these devices have not yet been developed. In this work, we propose an
unsupervised learning framework for spike analysis, which by itself reveals
spike patterns. By applying advanced video processing techniques to separate
a multi-channel recording into individual spike segments, unfolding the spike
segment manifold, and identifying natural clusters of spike patterns, we are
able to find the common spike motion patterns. We further explore using
these patterns for more interesting and practical problems, such as seizure
prediction and spike wavefront prediction. These methods have been applied to
in-vivo feline seizure recordings and yielded promising results.
Thresholds For Detecting An Anomalous Path From Noisy Environments
We consider the "searching for a trail in a maze" composite hypothesis
testing problem, in which one attempts to detect an anomalous directed path in
a lattice 2D box of side n based on observations on the nodes of the box. Under
the signal hypothesis, one observes independent Gaussian variables of unit
variance at all nodes, with zero mean off the anomalous path and mean \mu_n on
it. Under the null hypothesis, one observes i.i.d. standard Gaussians on all
nodes. Arias-Castro et al. (2008) showed that if the unknown directed path
under the signal hypothesis has a known initial location, then detection is
possible (in the minimax sense) if \mu_n \gg 1/\sqrt{\log n}, while it is not
possible if \mu_n \ll 1/(\log n \sqrt{\log \log n}). In this paper, we show that this
result continues to hold even when the initial location of the unknown path is
not known. As is the case with Arias-Castro et al. (2008), the upper bound here
also applies when the path is undirected. The improvement is achieved by
replacing the linear detection statistic used in Arias-Castro et al. (2008)
with a polynomial statistic, which is obtained by employing a multi-scale
analysis on a quadratic statistic to bootstrap its performance. Our analysis is
motivated by ideas developed in the context of the analysis of random polymers
in Lacoin (2010).
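For reference, the testing problem and the two thresholds quoted in this abstract can be written out as follows. The symbols X_v (the observation at node v) and \pi (the unknown directed path) are notation assumed here for illustration; only \mu_n and n appear in the abstract itself.

```latex
\begin{align*}
H_0 &: \; X_v \sim \mathcal{N}(0, 1) \ \text{i.i.d. for all nodes } v, \\
H_1 &: \; X_v \sim \mathcal{N}\bigl(\mu_n \mathbf{1}\{v \in \pi\},\, 1\bigr)
        \ \text{independently, for some unknown directed path } \pi, \\
    &\quad \text{detection is possible if } \mu_n \gg \frac{1}{\sqrt{\log n}},
      \qquad
      \text{impossible if } \mu_n \ll \frac{1}{\log n \,\sqrt{\log \log n}}.
\end{align*}
```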
Multiresolution Representations for Piecewise-Smooth Signals on Graphs
What is a mathematically rigorous way to describe the taxi-pickup
distribution in Manhattan, or the profile information in online social
networks? A deep understanding of how to represent such data not only provides
insights into the data properties, but also benefits many subsequent
processing procedures, such as denoising, sampling, recovery and localization.
In this paper, we model those complex and irregular data as piecewise-smooth
graph signals and propose a graph dictionary to effectively represent those
graph signals. We first propose the graph multiresolution analysis, which
provides a principle to design good representations. We then propose a
coarse-to-fine approach, which iteratively partitions a graph into two
subgraphs until we reach individual nodes. This approach efficiently implements
the graph multiresolution analysis and the induced graph dictionary promotes
sparse representations of piecewise-smooth graph signals. Finally, we validate the
proposed graph dictionary on two tasks: approximation and localization. The
empirical results show that the proposed graph dictionary outperforms eight
other representation methods on six datasets, including traffic networks,
social networks and point cloud meshes.
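The coarse-to-fine idea of repeatedly splitting a graph into two subgraphs until individual nodes remain can be sketched as a recursive bisection. The abstract does not say how each split is chosen, so this example substitutes a simple stand-in: a median split along the Fiedler vector (the eigenvector for the second-smallest eigenvalue of the induced subgraph's Laplacian). The names `bisect`, `coarse_to_fine`, and `leaves` are hypothetical, and this is not the authors' partitioning rule.

```python
import numpy as np

def bisect(adj, nodes):
    """Split `nodes` into two near-equal halves along the Fiedler vector.

    Stand-in partitioning rule chosen for illustration only.
    """
    sub = adj[np.ix_(nodes, nodes)]
    laplacian = np.diag(sub.sum(axis=1)) - sub
    _, vecs = np.linalg.eigh(laplacian)      # eigenvalues in ascending order
    order = np.argsort(vecs[:, 1], kind="stable")
    half = len(nodes) // 2
    left = sorted(nodes[i] for i in order[:half])
    right = sorted(nodes[i] for i in order[half:])
    return left, right

def coarse_to_fine(adj, nodes=None):
    """Return a nested tuple (binary tree) whose leaves are single nodes."""
    if nodes is None:
        nodes = list(range(adj.shape[0]))
    if len(nodes) == 1:
        return nodes[0]
    left, right = bisect(adj, nodes)
    return (coarse_to_fine(adj, left), coarse_to_fine(adj, right))

def leaves(tree):
    """Flatten the partition tree into a list of node indices."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

# Example: a path graph on 8 nodes.
n = 8
path = np.zeros((n, n))
for i in range(n - 1):
    path[i, i + 1] = path[i + 1, i] = 1.0

tree = coarse_to_fine(path)
```

Each level of the resulting tree corresponds to one resolution of the graph, and a dictionary atom could then be attached to every split. How the atoms are defined is left open here, since the abstract does not specify it.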
Sparsity Learning Based Multiuser Detection in Grant-Free Massive-Device Multiple Access
In this work, we study the multiuser detection (MUD) problem for a grant-free
massive-device multiple access (MaDMA) system, where a large number of
single-antenna user devices transmit sporadic data to a multi-antenna base
station (BS). Specifically, we put forth two MUD schemes, termed random
sparsity learning multiuser detection (RSL-MUD) and structured sparsity
learning multiuser detection (SSL-MUD) for the time-slotted and
non-time-slotted grant-free MaDMA systems, respectively. In the time-slotted
RSL-MUD scheme, active users generate and transmit data packets with random
sparsity. In the non-time-slotted SSL-MUD scheme, we introduce a
sliding-window-based detection framework, and the user signals in each
observation window naturally exhibit structured sparsity. We show that by
exploiting the sparsity embedded in the user signals, we can recover the user
activity state, the channel, and the user data in a single phase, without using
pilot signals for channel estimation and/or active user identification. To this
end, we develop a message-passing based statistical inference framework for the
BS to blindly detect the user data without any prior knowledge of the
identities and the channel state information (CSI) of the active users.
Simulation results show that our RSL-MUD and SSL-MUD schemes significantly
outperform their counterpart schemes in both reducing the transmission overhead
and improving the error behavior of the system. Comment: 12 pages, 9 figures, 3 tables
Catching Loosely Synchronized Behavior in Face of Camouflage
Fraud has severely detrimental impacts on the business of social networks and
other online applications. A user can become a fake celebrity by purchasing
"zombie followers" on Twitter. A merchant can boost his reputation through fake
reviews on Amazon. This phenomenon is also conspicuous on Facebook, Yelp,
and TripAdvisor, among others. In all these cases, fraudsters try to manipulate the
platform's ranking mechanism by faking interactions between the fake accounts
they control and the target customers. Comment: Submitted to WWW 2019, Oct.201
SliceNDice: Mining Suspicious Multi-attribute Entity Groups with Multi-view Graphs
Given the reach of web platforms, bad actors have considerable incentives to
manipulate and defraud users at the expense of platform integrity. This has
spurred research in numerous suspicious behavior detection tasks, including
detection of sybil accounts, false information, and payment scams/fraud. In
this paper, we draw the insight that many such initiatives can be tackled in a
common framework by posing a detection task which seeks to find groups of
entities which share too many properties with one another across multiple
attributes (sybil accounts created at the same time and location, propaganda
spreaders broadcasting articles with the same rhetoric and with similar
reshares, etc.). Our work makes four core contributions: Firstly, we posit a
novel formulation of this task as a multi-view graph mining problem, in which
distinct views reflect distinct attribute similarities across entities, and
contextual similarity and attribute importance are respected. Secondly, we
propose a novel suspiciousness metric for scoring entity groups given the
abnormality of their synchronicity across multiple views, which obeys intuitive
desiderata that existing metrics do not. Finally, we propose the SliceNDice
algorithm which enables efficient extraction of highly suspicious entity
groups, and demonstrate its practicality in production, in terms of strong
detection performance and discoveries on Snapchat's large advertiser ecosystem
(89% precision and numerous discoveries of real fraud rings), marked
outperformance of baselines (over 97% precision/recall in simulated settings)
and linear scalability. Comment: Published in Proceedings of 2019 IEEE 6th International Conference on
Data Science and Advanced Analytics (DSAA)
Signal Representations on Graphs: Tools and Applications
We present a framework for representing and modeling data on graphs. Based on
this framework, we study three typical classes of graph signals: smooth graph
signals, piecewise-constant graph signals, and piecewise-smooth graph signals.
For each class, we provide an explicit definition of the graph signals and
construct a corresponding graph dictionary with desirable properties. We then
study how such a graph dictionary works in two standard tasks: approximation and
sampling followed by recovery, from both theoretical and algorithmic
perspectives. Finally, for each class, we present a case study of a real-world
problem using the proposed methodology.
Statistical Evaluation of Spectral Methods for Anomaly Detection in Networks
Monitoring of networks for anomaly detection has attracted a lot of attention
in recent years, especially with the rise of connected devices and social
networks. This is important because anomaly detection spans a wide range of
applications, from detecting terrorist cells in counter-terrorism efforts to
phishing attacks in social network circles. For this reason, numerous
techniques for anomaly detection have been introduced. However, application of
these techniques to more complex network models is hindered by various
challenges such as the size of the network being investigated, how much a priori
information is needed, and the size of the anomalous graph, among others. A recent
technique introduced by Miller et al., which relies on a spectral framework for
anomaly detection, has the potential to address many of these challenges. In
their discussion of the spectral framework, three algorithms were proposed that
relied on the eigenvalues and eigenvectors of the residual matrix of a binary
network. The authors demonstrated the ability to detect anomalous subgraphs
that were less than 1% of the network size. However, to date, little
work has been done to evaluate the statistical performance of these
algorithms. This study investigates the statistical properties of the spectral
methods, specifically the Chi-square and L1 norm algorithms proposed by Miller et al.
We will analyze the performance of the algorithms using simulated networks and
also extend the methods' application to count networks. Finally, we will make
some methodological improvements and recommendations for both algorithms. Comment: 39 pages, 17 figures