
17,272 research outputs found

A Two-step Statistical Approach for Inferring Network Traffic Demands (Revises Technical Report BUCS-2003-003)

Author: Diot C.
Matta I.
Medina A.
Salamatian K.
Taft N.
Publication venue: Boston University Computer Science Department
Publication date: 01/03/2004

Accurate knowledge of traffic demands in a communication network enables or enhances a variety of traffic engineering and network management tasks of paramount importance for operational networks. Directly measuring a complete set of these demands is prohibitively expensive because of the huge amounts of data that must be collected and the performance impact that such measurements would impose on the regular behavior of the network. As a consequence, we must rely on statistical techniques to produce estimates of actual traffic demands from partial information. The performance of such techniques is, however, constrained by their reliance on limited information and by the heavy computation they incur, which limits their convergence behavior. In this paper we study a two-step approach for inferring network traffic demands. First, we elaborate and evaluate a modeling approach for generating good starting points to be fed to iterative statistical inference techniques. We call these starting points informed priors since they are obtained using actual network information such as packet traces and SNMP link counts. Second, we provide a very fast variant of the EM algorithm which extends its computation range, increasing its accuracy and decreasing its dependence on the quality of the starting point. Finally, we evaluate and compare alternative mechanisms for generating starting points and the convergence characteristics of our EM algorithm against a recently proposed Weighted Least Squares approach.

National Science Foundation (ANI-0095988, EIA-0202067, ITR ANI-0205294)

Boston University Institutional Repository (OpenBU)

Nonparametric Feature Extraction from Dendrograms

Author: Chehreghani Morteza Haghir
Chehreghani Mostafa Haghir
Publication date: 18/11/2019

We propose feature extraction from dendrograms in a nonparametric way. The Minimax distance measures correspond to building a dendrogram with the single linkage criterion and defining specific forms of a level function and a distance function over it. Therefore, we extend this method to arbitrary dendrograms. We develop a generalized framework wherein different distance measures can be inferred from different types of dendrograms, level functions and distance functions. Via an appropriate embedding, we compute a vector-based representation of the inferred distances, in order to enable many numerical machine learning algorithms to employ such distances. Then, to address the model selection problem, we study the aggregation of different dendrogram-based distances, in solution space and in representation space respectively, in the spirit of deep representations. In the first approach, for example for the clustering problem, we build a graph with positive and negative edge weights according to the consistency of the clustering labels of different objects among different solutions, in the context of ensemble methods. Then, we use an efficient variant of correlation clustering to produce the final clusters. In the second approach, we investigate the sequential combination of different distances and features, in the spirit of multi-layered architectures, to obtain the final features. Finally, we demonstrate the effectiveness of our approach via several numerical studies.
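
The link between Minimax distances and single linkage mentioned in this abstract can be illustrated with a small sketch. Under the standard definition, the Minimax distance between two objects is the smallest achievable value of the largest edge along any path connecting them, which is also the height at which single linkage first merges them. The code below is only an illustration of that definition, not the authors' implementation; the function name `minimax_distances` and the toy data are assumptions made for this example.

```python
def minimax_distances(d):
    """Return the matrix of Minimax (bottleneck) path distances.

    d is a symmetric list-of-lists of pairwise distances. Entry [i][j] of
    the result is the minimum, over all paths from i to j, of the largest
    edge on the path. This is a Floyd-Warshall-style closure that uses
    (min, max) in place of (min, +).
    """
    n = len(d)
    m = [row[:] for row in d]  # copy so the input is left untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(m[i][k], m[k][j])  # bottleneck of the path through k
                if via_k < m[i][j]:
                    m[i][j] = via_k
    return m


# Toy data (hypothetical): four points on a line.
points = [0, 1, 2, 10]
d = [[abs(a - b) for b in points] for a in points]
m = minimax_distances(d)
print(m[0][2])  # largest hop on the path 0 -> 1 -> 2
print(m[0][3])  # largest hop on the path 0 -> 1 -> 2 -> 10
```

In this toy example the direct distance between the first and last point is 10, but the Minimax distance is governed only by the largest gap that must be crossed along the chain of intermediate points.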

arXiv.org e-Print Archive

A Bayesian approach for inferring neuronal connectivity from calcium fluorescent imaging data

Author: Mishchenko Yuriy
Paninski Liam
Vogelstein Joshua T.
Publication venue: Institute of Mathematical Statistics
Publication date: 21/07/2011

Deducing the structure of neural circuits is one of the central problems of modern neuroscience. Recently-introduced calcium fluorescent imaging methods permit experimentalists to observe network activity in large populations of neurons, but these techniques provide only indirect observations of neural spike trains, with limited time resolution and signal quality. In this work we present a Bayesian approach for inferring neural circuitry given this type of imaging data. We model the network activity in terms of a collection of coupled hidden Markov chains, with each chain corresponding to a single neuron in the network and the coupling between the chains reflecting the network's connectivity matrix. We derive a Monte Carlo Expectation-Maximization algorithm for fitting the model parameters; to obtain the sufficient statistics in a computationally-efficient manner, we introduce a specialized blockwise-Gibbs algorithm for sampling from the joint activity of all observed neurons given the observed fluorescence data. We perform large-scale simulations of randomly connected neuronal networks with biophysically realistic parameters and find that the proposed methods can accurately infer the connectivity in these networks given reasonable experimental and computational constraints. In addition, the estimation accuracy may be improved significantly by incorporating prior knowledge about the sparseness of connectivity in the network, via standard $L_1$ penalization methods.

Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/09-AOAS303 by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Crossref

PowerSpy: Location Tracking using Mobile Device Power Analysis

Author: Boneh Dan
Michalevsky Yan
Nakibly Gabi
Schulman Aaron
Veerapandian Gunaa Arumugam
Publication date: 17/08/2015

Modern mobile platforms like Android enable applications to read aggregate power usage on the phone. This information is considered harmless and reading it requires no user permission or notification. We show that by simply reading the phone's aggregate power consumption over a period of a few minutes an application can learn information about the user's location. Aggregate phone power consumption data is extremely noisy due to the multitude of components and applications that simultaneously consume power. Nevertheless, by using machine learning algorithms we are able to successfully infer the phone's location. We discuss several ways in which this privacy leak can be remedied.

Comment: Usenix Security 201

arXiv.org e-Print Archive

CiteSeerX

Active Learning of Multiple Source Multiple Destination Topologies

Author: Animashree An
Athina Markopoulou
Maciej Kurant
Michael Rabbat
Pegah Sattari
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/03/2013

We consider the problem of inferring the topology of a network with $M$ sources and $N$ receivers (hereafter referred to as an $M$-by-$N$ network), by sending probes between the sources and receivers. Prior work has shown that this problem can be decomposed into two parts: first, infer smaller subnetwork components (i.e., $1$-by-$N$'s or $2$-by-$2$'s) and then merge these components to identify the $M$-by-$N$ topology. In this paper, we focus on the second part, which had previously received less attention in the literature. In particular, we assume that a $1$-by-$N$ topology is given and that all $2$-by-$2$ components can be queried and learned using end-to-end probes. The problem is which $2$-by-$2$'s to query and how to merge them with the given $1$-by-$N$, so as to exactly identify the $2$-by-$N$ topology, and optimize a number of performance metrics, including the number of queries (which directly translates into measurement bandwidth), time complexity, and memory usage. We provide a lower bound, $\lceil \frac{N}{2} \rceil$, on the number of $2$-by-$2$'s required by any active learning algorithm and propose two greedy algorithms. The first algorithm follows the framework of multiple hypothesis testing, in particular Generalized Binary Search (GBS), since our problem is one of active learning, from $2$-by-$2$ queries. The second algorithm is called the Receiver Elimination Algorithm (REA) and follows a bottom-up approach: at every step, it selects two receivers, queries the corresponding $2$-by-$2$, and merges it with the given $1$-by-$N$; it requires exactly $N-1$ steps, which is much less than all $\binom{N}{2}$ possible $2$-by-$2$'s. Simulation results over synthetic and realistic topologies demonstrate that both algorithms correctly identify the $2$-by-$N$ topology and are near-optimal, but REA is more efficient in practice.
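
To put the query counts quoted in this abstract in perspective, the short sketch below tabulates the three quantities it names for a few network sizes: the lower bound of ceil(N/2) queries, the N-1 steps taken by REA, and the N-choose-2 count of all possible 2-by-2 components. The function name `query_counts` and the sample values of N are illustrative assumptions; only the three formulas come from the abstract.

```python
import math


def query_counts(n):
    """Return (lower_bound, rea_steps, exhaustive) for n receivers.

    lower_bound: ceil(n / 2), the stated minimum number of 2-by-2 queries
    rea_steps:   n - 1, the stated number of steps taken by REA
    exhaustive:  C(n, 2), the number of all possible 2-by-2 components
    """
    if n < 2:
        raise ValueError("need at least two receivers")
    lower_bound = (n + 1) // 2  # integer form of ceil(n / 2)
    rea_steps = n - 1
    exhaustive = math.comb(n, 2)
    return lower_bound, rea_steps, exhaustive


# Hypothetical network sizes, chosen only for illustration.
for n in (4, 10, 100):
    print(n, query_counts(n))
```

The gap between the second and third values grows linearly with N, which is why a method needing N-1 queries is far cheaper than querying every receiver pair.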

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Caltech Authors

Understanding Internet topology: principles, models, and validation

Author: David Alderson
John C. Doyle
Lun Li
Walter Willinger
Publication date: 01/01/2005

Building on a recent effort that combines a first-principles approach to modeling router-level connectivity with a more pragmatic use of statistics and graph theory, we show in this paper that for the Internet, an improved understanding of its physical infrastructure is possible by viewing the physical connectivity as an annotated graph that delivers raw connectivity and bandwidth to the upper layers in the TCP/IP protocol stack, subject to practical constraints (e.g., router technology) and economic considerations (e.g., link costs). More importantly, by relying on data from Abilene, a Tier-1 ISP, and the Rocketfuel project, we provide empirical evidence in support of the proposed approach and its consistency with networking reality. To illustrate its utility, we: 1) show that our approach provides insight into the origin of high variability in measured or inferred router-level maps; 2) demonstrate that it easily accommodates the incorporation of additional objectives of network design (e.g., robustness to router failure); and 3) discuss how it complements ongoing community efforts to reverse-engineer the Internet.

CiteSeerX

Caltech Authors

Calhoun, Institutional Archive of the Naval Postgraduate School

Data based identification and prediction of nonlinear and complex dynamical systems

Author: Grebogi Celso
Lai Ying-Cheng
Wang Wen-Xu
Publication venue: Elsevier BV
Publication date: 27/04/2017

We thank Dr. R. Yang (formerly at ASU), Dr. R.-Q. Su (formerly at ASU), and Mr. Zhesi Shen for their contributions to a number of original papers on which this Review is partly based. This work was supported by ARO under Grant No. W911NF-14-1-0504. W.-X. Wang was also supported by NSFC under Grants No. 61573064 and No. 61074116, as well as by the Fundamental Research Funds for the Central Universities, Beijing Nova Programme.

Peer reviewed. Postprint

arXiv.org e-Print Archive

Aberdeen University Research