Rank-1 Tensor Approximation Methods and Application to Deflation
Because of the attractiveness of the canonical polyadic (CP) tensor
decomposition in various applications, several algorithms have been designed to
compute it, but efficient ones are still lacking. Iterative deflation
algorithms based on successive rank-1 approximations can be used to perform
this task, since the latter are rather easy to compute. We first present an
algebraic rank-1 approximation method that performs better than the standard
higher-order singular value decomposition (HOSVD) for three-way tensors.
Second, we propose a new iterative rank-1 approximation algorithm that improves
on any other rank-1 approximation method. Third, we describe a probabilistic
framework that allows us to study the convergence of deflation CP decomposition
(DCPD) algorithms based on successive rank-1 approximations. A set of computer
experiments then validates the theoretical results and demonstrates the
efficiency of DCPD algorithms compared to other algorithms.
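The deflation idea in this abstract can be sketched compactly. The following is a minimal illustration, not the authors' algorithm: it computes a rank-1 approximation of a 3-way tensor by higher-order power iteration initialized from the leading HOSVD singular vectors, then greedily deflates. The function names (`rank1_approx`, `deflation_cp`) and the iteration count are assumptions for illustration; note that for general tensors, greedy deflation is not guaranteed to recover an exact CP decomposition, which is precisely why the paper studies its convergence.

```python
import numpy as np

def rank1_approx(T, n_iter=50):
    """Rank-1 approximation of a 3-way tensor via higher-order power
    iteration, initialized from the leading HOSVD singular vectors
    (illustrative sketch, not the paper's improved method)."""
    I, J, K = T.shape
    # leading left singular vector of each mode unfolding (truncated HOSVD init)
    a = np.linalg.svd(T.reshape(I, J * K))[0][:, 0]
    b = np.linalg.svd(np.moveaxis(T, 1, 0).reshape(J, I * K))[0][:, 0]
    c = np.linalg.svd(np.moveaxis(T, 2, 0).reshape(K, I * J))[0][:, 0]
    for _ in range(n_iter):
        # alternate: contract the tensor against the other two factors
        a = np.einsum('ijk,j,k->i', T, b, c); a /= np.linalg.norm(a)
        b = np.einsum('ijk,i,k->j', T, a, c); b /= np.linalg.norm(b)
        c = np.einsum('ijk,i,j->k', T, a, b); c /= np.linalg.norm(c)
    lam = np.einsum('ijk,i,j,k->', T, a, b, c)
    return lam, a, b, c

def deflation_cp(T, rank):
    """Greedy deflation: repeatedly subtract a rank-1 approximation."""
    R = T.copy()
    terms = []
    for _ in range(rank):
        lam, a, b, c = rank1_approx(R)
        terms.append((lam, a, b, c))
        R = R - lam * np.einsum('i,j,k->ijk', a, b, c)
    return terms, R
```

On an exactly rank-1 tensor, one deflation step drives the residual to numerical zero; the interesting (and studied) question is how the residuals behave for higher ranks.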
Computing fuzzy rough approximations in large scale information systems
Rough set theory is a popular and powerful machine learning tool. It is especially suitable for dealing with information systems that exhibit inconsistencies, i.e., objects that have the same values for the conditional attributes but a different value for the decision attribute. In line with the emerging granular computing paradigm, rough set theory groups objects together based on the indiscernibility of their attribute values. Fuzzy rough set theory extends rough set theory to data with continuous attributes and detects degrees of inconsistency in the data. Key to this is turning the indiscernibility relation into a gradual relation, acknowledging that objects can be similar to a certain extent. In very large datasets with millions of objects, computing the gradual indiscernibility relation (or, in other words, the soft granules) is very demanding, both in terms of runtime and in terms of memory. It is, however, required for the computation of the lower and upper approximations of concepts in the fuzzy rough set analysis pipeline. Current non-distributed implementations in R are limited by memory capacity. For example, we found that a state-of-the-art non-distributed implementation in R could not handle 30,000 rows and 10 attributes on a node with 62GB of memory. This is clearly insufficient to scale fuzzy rough set analysis to massive datasets. In this paper we present a parallel and distributed solution based on the Message Passing Interface (MPI) to compute fuzzy rough approximations in very large information systems. Our results show that our parallel approach scales with problem size to information systems with millions of objects. To the best of our knowledge, no other parallel and distributed solutions have been proposed so far in the literature for this problem.
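The serial kernel that the abstract says must be distributed can be written in a few lines. The sketch below is an assumption-laden illustration, not the paper's MPI implementation: it uses a per-attribute similarity `1 - |x - y| / range`, averaged over attributes, as the gradual indiscernibility relation, and the Łukasiewicz implicator and t-norm for the lower and upper approximations (common choices in the fuzzy rough set literature, but not confirmed by the abstract). The O(n²) relation matrix built here is exactly the memory bottleneck the paper addresses.

```python
import numpy as np

def fuzzy_rough_approximations(X, concept):
    """Lower/upper fuzzy rough approximations of a fuzzy concept.

    X       : (n, m) array of continuous conditional attributes
    concept : (n,) membership degree of each object in the concept
    """
    rng_ = X.max(axis=0) - X.min(axis=0)
    rng_[rng_ == 0] = 1.0                      # constant attributes: all similar
    # gradual indiscernibility relation R(x, y) in [0, 1] -- the O(n^2) bottleneck
    diffs = np.abs(X[:, None, :] - X[None, :, :]) / rng_
    R = 1.0 - diffs.mean(axis=2)
    # Lukasiewicz implicator I(a,b) = min(1, 1-a+b) and t-norm T(a,b) = max(0, a+b-1)
    lower = np.min(np.minimum(1.0, 1.0 - R + concept[None, :]), axis=1)
    upper = np.max(np.maximum(0.0, R + concept[None, :] - 1.0), axis=1)
    return lower, upper
```

Because the relation is reflexive, the lower approximation never exceeds the concept membership and the upper approximation never falls below it; a distributed version would partition the rows of `R` across workers.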
State space collapse and diffusion approximation for a network operating under a fair bandwidth sharing policy
We consider a connection-level model of Internet congestion control,
introduced by Massouli\'{e} and Roberts [Telecommunication Systems 15 (2000)
185--201], that represents the randomly varying number of flows present in a
network. Here, bandwidth is shared fairly among elastic document transfers
according to a weighted $\alpha$-fair bandwidth sharing policy introduced by Mo
and Walrand [IEEE/ACM Transactions on Networking 8 (2000) 556--567]. Assuming Poisson arrivals and exponentially distributed document
sizes, we focus on the heavy traffic regime in which the average load placed on
each resource is approximately equal to its capacity. A fluid model (or
functional law of large numbers approximation) for this stochastic model was
derived and analyzed in a prior work [Ann. Appl. Probab. 14 (2004) 1055--1083]
by two of the authors. Here, we use the long-time behavior of the solutions of
the fluid model established in that paper to derive a property called
multiplicative state space collapse, which, loosely speaking, shows that in
diffusion scale, the flow count process for the stochastic model can be
approximately recovered as a continuous lifting of the workload process.
Comment: Published at http://dx.doi.org/10.1214/08-AAP591 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
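For intuition about the Mo-Walrand policy named in this abstract: a weighted $\alpha$-fair allocation maximizes $\sum_i w_i x_i^{1-\alpha}/(1-\alpha)$ subject to the capacity constraints. In the special case of a single link of capacity $C$ (an assumption made here purely for illustration; the paper concerns general networks), the optimum has the closed form $x_i = C\, w_i^{1/\alpha} / \sum_j w_j^{1/\alpha}$, which a short sketch can verify:

```python
import numpy as np

def alpha_fair_single_link(w, C, alpha):
    """Weighted alpha-fair bandwidth shares for flows on ONE link of capacity C.

    Maximizes sum_i w_i * x_i**(1-alpha) / (1-alpha) s.t. sum_i x_i <= C.
    The single-link optimum is x_i = C * w_i**(1/alpha) / sum_j w_j**(1/alpha);
    alpha -> 1 recovers weighted proportional fairness, large alpha
    approaches max-min fairness.
    """
    w = np.asarray(w, dtype=float)
    s = w ** (1.0 / alpha)
    return C * s / s.sum()
```

With `alpha = 1` a flow of weight 2 sharing a link with two weight-1 flows receives half the capacity, matching the proportional-fairness intuition.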
On approximating copulas by finite mixtures
Copulas are now frequently used to approximate or estimate multivariate
distributions because of their ability to take into account the multivariate
dependence of the variables while controlling the approximation properties of
the marginal densities. Copula based multivariate models can often also be more
parsimonious than fitting a flexible multivariate model, such as a mixture of
normals model, directly to the data. However, to be effective, it is imperative
that the family of copula models considered is sufficiently flexible. Although
finite mixtures of copulas have been used to construct flexible families of
copulas, their approximation properties are not well understood and we show
that natural candidates such as mixtures of elliptical copulas and mixtures of
Archimedean copulas cannot approximate a general copula arbitrarily well. Our
article develops fundamental tools for approximating a general copula
arbitrarily well by a mixture and proposes a family of finite mixtures that can
do so. We illustrate empirically on a financial data set that our approach for
estimating a copula can be much more parsimonious and results in a better fit
than approximating the copula by a mixture of normal copulas.
Comment: 26 pages, 1 figure, and 2 tables.
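To make the object under discussion concrete, the sketch below samples from a two-component finite mixture of bivariate Gaussian (normal) copulas, the natural candidate family the abstract says is *not* universally flexible. The function name, component parametrization, and the use of `math.erf` for the normal CDF are all assumptions for illustration; the paper's proposed mixture family is different.

```python
import math
import numpy as np

def sample_gaussian_copula_mixture(n, rhos, weights, seed=0):
    """Draw n samples from a finite mixture of bivariate Gaussian copulas.

    rhos    : correlation parameter of each component
    weights : mixture weights (must sum to 1)
    Returns an (n, 2) array of points in [0, 1]^2 with uniform marginals
    and the mixed dependence structure.
    """
    rng = np.random.default_rng(seed)
    comp = rng.choice(len(rhos), size=n, p=weights)   # pick a component per draw
    z = rng.standard_normal((n, 2))
    rho = np.asarray(rhos, dtype=float)[comp]
    # correlate the second coordinate with the first (Cholesky of 2x2 correlation)
    z2 = rho * z[:, 0] + np.sqrt(1.0 - rho**2) * z[:, 1]
    # standard normal CDF applied elementwise, via the error function
    phi = np.vectorize(lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0))))
    return np.column_stack([phi(z[:, 0]), phi(z2)])
```

Mixing, say, a strongly positive and a strongly negative component produces an X-shaped dependence pattern that no single Gaussian copula can match, which gives some intuition for why mixtures help, even though the abstract shows this particular family still cannot approximate every copula arbitrarily well.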