Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy

Authors: Boccignone, Giuseppe; Napoletano, Paolo; Tisato, Francesco
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 27/04/2015

In this paper we consider the problem of deploying attention to subsets of video streams in order to collate the most relevant data and information of interest related to a given task. We formalize this monitoring problem as a foraging problem. We propose a probabilistic framework that models the observer's attentive behavior as the behavior of a forager. Moment to moment, the forager focuses its attention on the most informative stream/camera, detects interesting objects or activities, or switches to a more profitable stream. The proposed approach is suitable for multi-stream video summarization. It can also serve as a preliminary step for more sophisticated video surveillance tasks, e.g. activity and behavior analysis. Experimental results achieved on the UCR Videoweb Activities Dataset, a publicly available dataset, are presented to illustrate the utility of the proposed technique.

Comment: Accepted to IEEE Transactions on Image Processin

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

Finding Subcube Heavy Hitters in Analytics Data Streams

Authors: Kveton, Branislav; Muthukrishnan, S.; Vu, Hoa T.; Xian, Yikun
Publication date: 01/01/2018

Data streams typically have items with a large number of dimensions. We study the fundamental heavy-hitters problem in this setting. Formally, the data stream consists of $d$-dimensional items $x_1,\ldots,x_m \in [n]^d$. A $k$-dimensional subcube $T$ is a subset of distinct coordinates $\{ T_1,\cdots,T_k \} \subseteq [d]$. A subcube heavy hitter query ${\rm Query}(T,v)$, with $v \in [n]^k$, outputs YES if $f_T(v) \geq \gamma$ and NO if $f_T(v) < \gamma/4$, where $f_T(v)$ is the fraction of stream items whose coordinates $T$ have joint values $v$. The all subcube heavy hitters query ${\rm AllQuery}(T)$ outputs all joint values $v$ that return YES to ${\rm Query}(T,v)$. The one-dimensional version of this problem, where $d=1$, has been heavily studied in data stream theory, databases, networking and signal processing. The subcube heavy hitters problem is applicable in all these cases. We present a simple one-pass streaming algorithm, based on reservoir sampling, that solves the subcube heavy hitters problem in $\tilde{O}(kd/\gamma)$ space. This is optimal up to poly-logarithmic factors given the established lower bound. In the worst case, this is $\Theta(d^2/\gamma)$, which is prohibitive for large $d$, and our goal is to circumvent this quadratic bottleneck. Our main contribution is a model-based approach to the subcube heavy hitters problem. In particular, we assume that the dimensions are related to each other via the Naive Bayes model, with or without a latent dimension. Under this assumption, we present a new two-pass, $\tilde{O}(d/\gamma)$-space algorithm for our problem, and a fast algorithm for answering ${\rm AllQuery}(T)$ in $O(k/\gamma^2)$ time. Our work develops the direction of model-based data stream analysis, with much that remains to be explored.

Comment: To appear in WWW 201
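The reservoir-sampling idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the class name `SubcubeSampler`, the caller-chosen sample size, and the decision threshold of $\gamma/2$ (picked only because it lies between the YES bound $\gamma$ and the NO bound $\gamma/4$) are all assumptions made for illustration.

```python
import random
from collections import Counter


class SubcubeSampler:
    """Uniform reservoir sample of a stream of d-dimensional items (hypothetical helper)."""

    def __init__(self, sample_size, seed=None):
        self.sample_size = sample_size
        self.reservoir = []
        self.seen = 0
        self.rng = random.Random(seed)

    def update(self, item):
        """Process one stream item, keeping a uniform sample of everything seen so far."""
        self.seen += 1
        if len(self.reservoir) < self.sample_size:
            self.reservoir.append(tuple(item))
        else:
            # Keep the new item with probability sample_size / seen.
            j = self.rng.randrange(self.seen)
            if j < self.sample_size:
                self.reservoir[j] = tuple(item)

    def estimate(self, T, v):
        """Estimate f_T(v): the fraction of items whose coordinates T equal v."""
        if not self.reservoir:
            return 0.0
        v = tuple(v)
        hits = sum(1 for x in self.reservoir if tuple(x[i] for i in T) == v)
        return hits / len(self.reservoir)

    def query(self, T, v, gamma):
        """Answer Query(T, v); the gamma / 2 threshold is an assumption of this sketch."""
        return self.estimate(T, v) >= gamma / 2

    def all_query(self, T, gamma):
        """Answer AllQuery(T): all sampled joint values that pass the threshold."""
        if not self.reservoir:
            return []
        counts = Counter(tuple(x[i] for i in T) for x in self.reservoir)
        n = len(self.reservoir)
        return sorted(v for v, c in counts.items() if c / n >= gamma / 2)
```

Coordinates in `T` are zero-based indices here. When the sample size is at least the stream length, the reservoir holds the entire stream and the estimates are exact; with a smaller sample they are approximations whose quality depends on the sample size.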

arXiv.org e-Print Archive

Crossref

Brain covariance selection: better individual functional connectivity models using population prior

Authors: Gramfort, Alexandre; Poline, Jean Baptiste; Thirion, Bertrand; Varoquaux, Gaël
Publication date: 01/01/2010

Spontaneous brain activity, as observed in functional neuroimaging, has been shown to display reproducible structure that expresses brain architecture and carries markers of brain pathologies. An important view of modern neuroscience is that such large-scale structure of coherent activity reflects modularity properties of brain connectivity graphs. However, to date, there has been no demonstration that the limited and noisy data available in spontaneous activity observations could be used to learn full-brain probabilistic models that generalize to new data. Learning such models entails two main challenges: i) modeling full brain connectivity is a difficult estimation problem that faces the curse of dimensionality and ii) variability between subjects, coupled with the variability of functional signals between experimental runs, makes the use of multiple datasets challenging. We describe subject-level brain functional connectivity structure as a multivariate Gaussian process and introduce a new strategy to estimate it from group data, by imposing a common structure on the graphical model in the population. We show that individual models learned from functional Magnetic Resonance Imaging (fMRI) data using this population prior generalize better to unseen data than models based on alternative regularization schemes. To our knowledge, this is the first report of a cross-validated model of spontaneous brain activity. Finally, we use the estimated graphical model to explore the large-scale characteristics of functional architecture and show for the first time that known cognitive networks appear as the integrated communities of the functional connectivity graph.

Comment: in Advances in Neural Information Processing Systems, Vancouver : Canada (2010
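The abstract does not spell out the estimator, so the following is only an assumed form of what "imposing a common structure on the graphical model in the population" could look like: a penalized Gaussian log-likelihood in which a group penalty ties the sparsity pattern of the subject-level precision matrices together. The symbols $K^s$ (precision matrix of subject $s$), $\hat{\Sigma}^s$ (its empirical covariance), $S$ (number of subjects) and $\lambda$ (regularization strength) are notation introduced here for illustration.

```latex
\min_{\{K^s \succ 0\}_{s=1}^{S}}
  \sum_{s=1}^{S} \Big( \operatorname{tr}\big(K^s \hat{\Sigma}^s\big) - \log\det K^s \Big)
  \;+\; \lambda \sum_{i \neq j} \sqrt{\sum_{s=1}^{S} \big(K^s_{ij}\big)^2}
```

Under this form, the penalty drives an off-diagonal entry $(i,j)$ to zero in all subjects at once, so every subject shares the same graph support while keeping its own edge weights.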

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

HAL-CEA

Process Monitoring on Sequences of System Call Count Vectors

Authors: Dymshits, Michael; Myara, Ben; Tolpin, David
Publication date: 12/07/2017

We introduce a methodology for efficient monitoring of processes running on hosts in a corporate network. The methodology is based on collecting streams of system calls produced by all or selected processes on the hosts, and sending them over the network to a monitoring server, where machine learning algorithms are used to identify changes in process behavior due to malicious activity, hardware failures, or software errors. The methodology uses a sequence of system call count vectors as the data format, which can handle large and varying volumes of data. Unlike previous approaches, the methodology introduced in this paper is suitable for distributed collection and processing of data in large corporate networks. We evaluate the methodology both in a laboratory setting and on a real-life setup, and provide statistics characterizing its performance and accuracy.

Comment: 5 pages, 4 figures, ICCST 201
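To make the data format concrete, here is a minimal Python sketch of turning a stream of system calls into a sequence of count vectors. The fixed time window, the `(timestamp, syscall_name)` event layout, the function name `count_vectors`, and the extra "other" slot for calls outside the vocabulary are assumptions of this sketch, not details taken from the paper.

```python
def count_vectors(events, vocabulary, window_seconds):
    """Convert (timestamp, syscall_name) events into per-window count vectors.

    Each vector has one slot per vocabulary entry plus a final slot that
    counts system calls not present in the vocabulary.
    """
    index = {name: i for i, name in enumerate(vocabulary)}
    other = len(vocabulary)
    size = len(vocabulary) + 1
    buckets = {}
    for timestamp, name in events:
        # Assign the event to the window that contains its timestamp.
        bucket = int(timestamp // window_seconds)
        vector = buckets.setdefault(bucket, [0] * size)
        vector[index.get(name, other)] += 1
    if not buckets:
        return []
    first, last = min(buckets), max(buckets)
    # Emit one vector per window, including all-zero vectors for idle windows.
    return [buckets.get(b, [0] * size) for b in range(first, last + 1)]
```

Each vector has a fixed length regardless of how many calls fall into the window, which is one way to read the abstract's point that the format copes with large and varying volumes of data.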

arXiv.org e-Print Archive

Crossref