    When Hashes Met Wedges: A Distributed Algorithm for Finding High Similarity Vectors

    Full text link
    Finding similar user pairs is a fundamental task in social networks, with numerous applications in ranking and personalization tasks such as link prediction and tie strength detection. A common manifestation of user similarity is based upon network structure: each user is represented by a vector that represents the user's network connections, where pairwise cosine similarity among these vectors defines user similarity. The predominant task for user similarity applications is to discover all similar pairs that have a pairwise cosine similarity value larger than a given threshold τ. In contrast to previous work where τ is assumed to be quite close to 1, we focus on recommendation applications where τ is small, but still meaningful. The all-pairs cosine similarity problem is computationally challenging on networks with billions of edges, and especially so for settings with small τ. To the best of our knowledge, there is no practical solution for computing all user pairs with, say, τ = 0.2 on large social networks, even using the power of distributed algorithms. Our work directly addresses this challenge by introducing a new algorithm, WHIMP, that solves this problem efficiently in the MapReduce model. The key insight in WHIMP is to combine the "wedge-sampling" approach of Cohen-Lewis for approximate matrix multiplication with the SimHash random projection techniques of Charikar. We provide a theoretical analysis of WHIMP, proving that it has near-optimal communication cost while maintaining computation cost comparable with the state of the art. We also empirically demonstrate WHIMP's scalability by computing all highly similar pairs on four massive data sets, and show that it accurately finds high similarity pairs. In particular, we note that WHIMP successfully processes the entire Twitter network, which has tens of billions of edges.
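    As a rough illustration of the SimHash ingredient mentioned in the abstract (a sketch of Charikar's random-projection signatures, not of WHIMP itself), the snippet below estimates cosine similarity from the fraction of agreeing sign bits; the function names, bit counts, and toy data are illustrative assumptions, not part of the paper.

        import numpy as np

        def simhash_signatures(X, num_bits=256, seed=0):
            """SimHash signatures: each bit is the sign of a random hyperplane
            projection, so bit agreement reflects the angle between vectors."""
            rng = np.random.default_rng(seed)
            planes = rng.standard_normal((X.shape[1], num_bits))
            return (X @ planes) >= 0  # one row of boolean bits per vector

        def estimated_cosine(sig_a, sig_b):
            """Estimate cos(theta) from the agreement rate: for random
            hyperplanes, P[bits agree] = 1 - theta / pi."""
            agreement = np.mean(sig_a == sig_b)
            return np.cos(np.pi * (1.0 - agreement))

        # Toy usage: two random user vectors (the small-tau regime in the paper).
        users = np.random.default_rng(1).standard_normal((2, 1000))
        sigs = simhash_signatures(users)
        print(estimated_cosine(sigs[0], sigs[1]))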

    Sample Complexity Analysis for Learning Overcomplete Latent Variable Models through Tensor Methods

    Full text link
    We provide guarantees for learning latent variable models, with an emphasis on the overcomplete regime, where the dimensionality of the latent space can exceed the observed dimensionality. In particular, we consider multiview mixtures, spherical Gaussian mixtures, ICA, and sparse coding models. We provide tight concentration bounds for empirical moments through novel covering arguments. We analyze parameter recovery through a simple tensor power update algorithm. In the semi-supervised setting, we exploit the label or prior information to get a rough estimate of the model parameters, and then refine it using the tensor method on unlabeled samples. We establish that learning is possible when the number of components scales as k = o(d^{p/2}), where d is the observed dimension and p is the order of the observed moment employed in the tensor method. Our concentration bound analysis also leads to a minimax sample complexity for semi-supervised learning of spherical Gaussian mixtures. In the unsupervised setting, we use a simple initialization algorithm based on SVD of the tensor slices, and provide guarantees under the stricter condition that k ≤ βd (where the constant β can be larger than 1), where the tensor method recovers the components in polynomial running time (and exponential in β). Our analysis establishes that a wide range of overcomplete latent variable models can be learned efficiently with low computational and sample complexity through tensor decomposition methods.
    Comment: Title change
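    For intuition about the "simple tensor power update" the abstract refers to, here is a minimal sketch of a power-iteration step on a symmetric third-order tensor, assuming an orthogonal toy decomposition; it omits the paper's initialization, deflation, and noise analysis, and all names are illustrative.

        import numpy as np

        def tensor_power_update(T, v, num_iters=100):
            """Repeatedly apply v <- T(I, v, v) / ||T(I, v, v)|| for a symmetric
            3rd-order tensor T, then read off the recovered weight."""
            for _ in range(num_iters):
                # Contraction: w_i = sum_{j,k} T[i, j, k] v_j v_k
                w = np.einsum('ijk,j,k->i', T, v, v)
                v = w / np.linalg.norm(w)
            lam = np.einsum('ijk,i,j,k->', T, v, v, v)  # estimated component weight
            return lam, v

        # Toy usage: rank-2 tensor with orthogonal components of weights 2 and 1.
        d = 5
        a1, a2 = np.eye(d)[0], np.eye(d)[1]
        T = 2.0 * np.einsum('i,j,k->ijk', a1, a1, a1) + np.einsum('i,j,k->ijk', a2, a2, a2)
        v0 = np.random.default_rng(0).standard_normal(d)
        lam, v = tensor_power_update(T, v0 / np.linalg.norm(v0))
        print(lam, v)  # converges to the dominant component (weight ~2, vector ~a1)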

    A differential method for bounding the ground state energy

    Get PDF
    For a wide class of Hamiltonians, a novel method to obtain lower and upper bounds for the lowest energy is presented. Unlike perturbative or variational techniques, this method does not involve the computation of any integral (a normalisation factor or a matrix element). It just requires the determination of the absolute minimum and maximum, over the whole configuration space, of the local energy associated with a normalisable trial function (the calculation of the norm is not needed). After a general introduction, the method is applied to three non-integrable systems: the asymmetric annular billiard, the many-body spinless Coulombian problem, and the hydrogen atom in a constant and uniform magnetic field. Being more sensitive than variational methods to any local perturbation of the trial function, this method can be used to systematically improve the energy bounds through a skilled local analysis; an algorithm relying on this method can therefore be constructed, and an explicit example for a one-dimensional problem is given.
    Comment: Accepted for publication in Journal of Physics
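    The following sketch illustrates the core idea of bracketing the ground state energy by the extrema of the local energy E_L(x) = (Hψ)(x)/ψ(x), using an assumed toy setup (1D harmonic oscillator, slightly mismatched Gaussian trial function, finite-difference derivatives on a grid) rather than the non-integrable systems treated in the paper.

        import numpy as np

        def local_energy(psi, x, potential):
            """Local energy E_L(x) = (H psi)(x) / psi(x) for H = -1/2 d^2/dx^2 + V(x),
            with the second derivative taken by central finite differences."""
            h = x[1] - x[0]
            d2psi = (psi[2:] - 2.0 * psi[1:-1] + psi[:-2]) / h**2
            return -0.5 * d2psi / psi[1:-1] + potential(x[1:-1])

        # Toy setup: harmonic oscillator (exact ground state energy 0.5) with the
        # trial function exp(-a x^2 / 2), a = 0.9 (slightly too wide).
        x = np.linspace(-5.0, 5.0, 2001)
        psi = np.exp(-0.5 * 0.9 * x**2)
        E_L = local_energy(psi, x, lambda x: 0.5 * x**2)

        # The bounds are min E_L <= E_0 <= max E_L; a genuine bound requires the
        # extrema over the whole configuration space, while this grid only
        # illustrates the idea (here roughly 0.45 <= 0.5 <= 2.8).
        print(E_L.min(), E_L.max())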

    Parameter Selection and Uncertainty Measurement for Variable Precision Probabilistic Rough Set

    Get PDF
    In this paper, we consider the problem of parameter selection and uncertainty measurement for a variable precision probabilistic rough set. Firstly, within the framework of the variable precision probabilistic rough set model, the relative discernibility of a variable precision rough set in probabilistic approximation space is discussed, and the conditions that make the precision parameter α discernible in a variable precision probabilistic rough set are put forward. At the same time, we address the difficulty of specifying the precision parameter in advance, and we propose a systematic threshold selection method based on the relative discernibility of sets, using the concept of relative discernibility in probabilistic approximation space. A numerical example is then used to test the validity of the proposed method. Secondly, we discuss the problem of uncertainty measurement for the variable precision probabilistic rough set. The concept of classical fuzzy entropy is introduced into probabilistic approximation space, and the uncertain information that comes from both the approximation space and the approximated objects is fully considered. An axiomatic approach is then established for uncertainty measurement in a variable precision probabilistic rough set, and several related properties are discussed. Thirdly, we study attribute reduction for the variable precision probabilistic rough set; the definition of a reduct and its characteristic theorems are given. The main contribution of this paper is twofold: one is to propose a method of parameter selection for a variable precision probabilistic rough set, and the other is to present a new approach to uncertainty measurement and a method of attribute reduction for a variable precision probabilistic rough set.
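    For readers unfamiliar with the underlying model, here is a minimal sketch of variable precision (probabilistic) lower and upper approximations with a single precision parameter α; it shows only the textbook definitions, not the paper's parameter-selection procedure or fuzzy-entropy measure, and all names and the toy data are illustrative.

        from collections import defaultdict

        def vp_approximations(universe, equiv_class_of, target, alpha):
            """Variable precision rough approximations of a target set: a block [x]
            joins the lower approximation when P(target | [x]) >= alpha and the
            upper approximation when P(target | [x]) > 1 - alpha, for alpha in (0.5, 1]."""
            blocks = defaultdict(set)
            for x in universe:
                blocks[equiv_class_of(x)].add(x)

            lower, upper = set(), set()
            for block in blocks.values():
                p = len(block & target) / len(block)  # conditional probability P(X | [x])
                if p >= alpha:
                    lower |= block
                if p > 1.0 - alpha:
                    upper |= block
            return lower, upper

        # Toy usage: universe partitioned by an attribute value, alpha = 0.7.
        U = set(range(10))
        X = {0, 1, 2, 3, 8}
        lower, upper = vp_approximations(U, lambda x: x // 3, X, alpha=0.7)
        print(lower, upper)  # the boundary region is upper - lower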

    The investigation of the Bayesian rough set model

    Get PDF
    The original Rough Set model is concerned primarily with algebraic properties of approximately defined sets. The Variable Precision Rough Set (VPRS) model extends the basic rough set theory to incorporate probabilistic information. The article presents a non-parametric modification of the VPRS model called the Bayesian Rough Set (BRS) model, where the set approximations are defined by using the prior probability as a reference. Mathematical properties of BRS are investigated. It is shown that the quality of BRS models can be evaluated using a probabilistic gain function, which is suitable for the identification and elimination of redundant attributes.
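    The sketch below illustrates the non-parametric idea described in the abstract: regions are defined relative to the prior probability P(X) rather than a fixed precision threshold. It is a minimal reading of that idea with assumed names and toy data, not the paper's full BRS formulation or its gain function.

        from collections import defaultdict

        def brs_regions(universe, equiv_class_of, target):
            """Bayesian-rough-set-style regions: a block [x] is positive when
            P(X | [x]) exceeds the prior P(X), negative when it falls below the
            prior, and boundary when it equals the prior."""
            prior = len(target) / len(universe)
            blocks = defaultdict(set)
            for x in universe:
                blocks[equiv_class_of(x)].add(x)

            positive, negative, boundary = set(), set(), set()
            for block in blocks.values():
                p = len(block & target) / len(block)
                if p > prior:
                    positive |= block
                elif p < prior:
                    negative |= block
                else:
                    boundary |= block
            return positive, negative, boundary

        # Toy usage: universe partitioned by an attribute value.
        U = set(range(10))
        X = {0, 1, 2, 3, 8}
        print(brs_regions(U, lambda x: x // 3, X))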
