5,746 research outputs found

When Hashes Met Wedges: A Distributed Algorithm for Finding High Similarity Vectors

Author: Andoni A.
Davis T.
Gionis A.
Goel A.
Shrivastava A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/03/2017

Finding similar user pairs is a fundamental task in social networks, with numerous applications in ranking and personalization tasks such as link prediction and tie strength detection. A common manifestation of user similarity is based upon network structure: each user is represented by a vector that represents the user's network connections, where pairwise cosine similarity among these vectors defines user similarity. The predominant task for user similarity applications is to discover all similar pairs that have a pairwise cosine similarity value larger than a given threshold \tau. In contrast to previous work where \tau is assumed to be quite close to 1, we focus on recommendation applications where \tau is small, but still meaningful. The all pairs cosine similarity problem is computationally challenging on networks with billions of edges, and especially so for settings with small \tau. To the best of our knowledge, there is no practical solution for computing all user pairs with, say, \tau = 0.2 on large social networks, even using the power of distributed algorithms. Our work directly addresses this challenge by introducing a new algorithm, WHIMP, that solves this problem efficiently in the MapReduce model. The key insight in WHIMP is to combine the "wedge-sampling" approach of Cohen-Lewis for approximate matrix multiplication with the SimHash random projection techniques of Charikar. We provide a theoretical analysis of WHIMP, proving that it has near optimal communication costs while maintaining computation cost comparable with the state of the art. We also empirically demonstrate WHIMP's scalability by computing all highly similar pairs on four massive data sets, and show that it accurately finds high similarity pairs. In particular, we note that WHIMP successfully processes the entire Twitter network, which has tens of billions of edges.
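
The SimHash ingredient named in the abstract above can be illustrated with a small sketch. This is not the WHIMP algorithm: it omits wedge sampling and MapReduce entirely and only shows how sign-of-random-projection signatures give an estimate of cosine similarity. The function names, the signature length, and the fixed seed are assumptions made for illustration.

```python
import math
import random


def cosine(u, v):
    """Exact cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def make_planes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian directions (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def simhash(vec, planes):
    """One bit per random direction: the sign of the projection onto it."""
    return [
        1 if sum(p_i * x_i for p_i, x_i in zip(plane, vec)) >= 0 else 0
        for plane in planes
    ]


def estimate_cosine(sig_u, sig_v):
    """Two signatures disagree on a bit with probability angle / pi,
    so the disagreement rate yields an angle estimate, hence a cosine."""
    disagree = sum(a != b for a, b in zip(sig_u, sig_v))
    theta = math.pi * disagree / len(sig_u)
    return math.cos(theta)
```

With two toy adjacency vectors such as `u = [1, 0, 1, 1, 0]` and `v = [1, 1, 0, 1, 0]`, comparing `estimate_cosine(simhash(u, planes), simhash(v, planes))` against `cosine(u, v)` for a few thousand bits shows how close the estimate gets; a pair would then be reported when the estimate exceeds the threshold \tau.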

arXiv.org e-Print Archive

Crossref

Spatially and Temporally Directed Noise Cancellation Using Federated Learning

Author: Goel Shantanu
Itankar Piyush T.
Publication venue: Technical Disclosure Commons
Publication date: 19/03/2020

Machine learning models can be trained to cancel noise of diverse types or spectral characteristics, e.g. traffic noise, background chatter, etc. Such models are trained by feeding training data that includes labeled noise waveforms, which is an expensive and time-consuming procedure. Further, the effectiveness of such machine learning models is limited in canceling types of noise absent from training data. Trained models occupy significant amounts of memory which limits their use in consumer devices. This disclosure describes the use of federated learning techniques to train noise canceling models locally at diverse device locations and times. With user permission, the trained models are tagged with timestamp and location, such that when a user device has time or location matching a particular noise cancellation model, the particular model is provided to the user device. Noise cancellation on the user device is then performed with a compact machine learning model that is suited to the time and location of the user device

Technical Disclosure Commons

Decremental All-Pairs ALL Shortest Paths and Betweenness Centrality

Author: C Demetrescu
DR Karger
K Goel
LC Freeman
M Nasre
M Thorup
T Coffman
U Brandes
Publication date: 14/11/2014

We consider the all pairs all shortest paths (APASP) problem, which maintains the shortest path dag rooted at every vertex in a directed graph G=(V,E) with positive edge weights. For this problem we present a decremental algorithm (that supports the deletion of a vertex, or weight increases on edges incident to a vertex). Our algorithm runs in amortized O(\vstar^2 \cdot \log n) time per update, where n=|V|, and \vstar bounds the number of edges that lie on shortest paths through any given vertex. Our APASP algorithm can be used for the decremental computation of betweenness centrality (BC), a graph parameter that is widely used in the analysis of large complex networks. No nontrivial decremental algorithm for either problem was known prior to our work. Our method is a generalization of the decremental algorithm of Demetrescu and Italiano [DI04] for unique shortest paths, and for graphs with \vstar = O(n), we match the bound in [DI04]. Thus for graphs with a constant number of shortest paths between any pair of vertices, our algorithm maintains APASP and BC scores in amortized time O(n^2 \log n) under decremental updates, regardless of the number of edges in the graph. Comment: An extended abstract of this paper will appear in Proc. ISAAC 201
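
The object this abstract maintains, the shortest path dag rooted at a vertex, can be sketched in its static form. The code below is not the decremental algorithm described above: it simply recomputes the dag from scratch with Dijkstra's algorithm, recording every predecessor that lies on some shortest path. The function name and the adjacency-list input format are assumptions for illustration, and integer weights are assumed so that the equality test on distances is exact.

```python
import heapq


def shortest_path_dag(graph, source):
    """Static sketch: distances and all shortest-path predecessors.

    graph maps each node to a list of (neighbor, weight) pairs with
    positive weights (assumed integers so ties compare exactly).
    Returns (dist, preds), where preds[v] lists every u such that the
    edge u -> v lies on some shortest path from source to v.
    """
    dist = {source: 0}
    preds = {source: []}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, w in graph.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                # Strictly better path: reset the predecessor list.
                dist[v] = nd
                preds[v] = [u]
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                # Equally short path: keep both predecessors.
                preds[v].append(u)
    return dist, preds
```

Running this once per vertex gives the full set of dags for a static graph; the point of a decremental algorithm is to avoid that recomputation after each deletion or weight increase.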

arXiv.org e-Print Archive

Crossref

Comparative Study of Popular Data Mining Algorithms

Author: Goel Harsh
Venkat Narayana Rao T.
Publication venue: Auricle Global Society of Education and Research
Publication date: 30/09/2018

Data science is an appealing field in the present world: with the advancement of science, a huge assortment of data exists in numerous forms. Such data must be handled with care and stored safely so that it can be retrieved as needed. Some of the popular or commonly used algorithms are the Apriori algorithm, K-Means clustering, Support Vector Machines (SVM) and Association Rule Mining algorithms. This paper focuses on the above-mentioned algorithms, and a comparison is made in terms of Technique, Time Utilization Software taking real time data examples.

International Journal on Future Revolution in Computer Science & Communication Engineering

Coupled discrete/continuum simulations of the impact of granular slugs with clamped beams: stand-off effects

Author: Deshpande VS
Goel A
Liu T
Uth T
Wadley HNG
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

Coupled discrete particle/continuum simulations of the normal (zero obliquity) impact of granular slugs against the centre of deformable, end-clamped beams are reported. The simulations analyse the experiments of Uth et al. (2015) enabling a detailed interpretation of their observations of temporal evolution of granular slug and a strong stand-off distance dependence of the structural response. The high velocity granular slugs were generated by the pushing action of a piston and develop a spatial velocity gradient due to elastic energy stored during the loading phase by the piston. The velocity gradient within the “stretching” slug is a strong function of the inter-particle contact stiffness and the time the piston takes to ramp up to its final velocity. Other inter-particle contact properties such as damping and friction are shown to have negligible effect on the evolution of the granular slug. The velocity gradients result in a slug density that decreases with increasing stand-off distance, and therefore the pressure imposed by the slug on the beams is reduced with increasing stand-off. This results in the stand-off dependence of the beam's deflection observed by Uth et al. (2015). The coupled simulations capture both the permanent deflections of the beams and their dynamic deformation modes with a high degree of fidelity. These simulations shed new light on the stand-off effect observed during the loading of structures by shallow-buried explosions

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

Queen Mary Research Online

CUED - Cambridge University Engineering Department

Error Measures for Noise-Free Surrogate Approximations

Author: Goel Tushar
Haftka Raphael T.
Shyy Wei
Publication venue: 'American Institute of Aeronautics and Astronautics (AIAA)'
Publication date: 01/01/2008

Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/76064/1/AIAA-2008-901-367.pd

Hong Kong University of Science and Technology Institutional Repository

Deep Blue Documents at the University of Michigan

Budget feasible mechanisms on matroids

Author: A Schrijver
DNC Tse
EH Clarke
G Demange
G Goel
H Chan
LM Ausubel
S Bikhchandani
T Groves
W Vickrey
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Motivated by many practical applications, in this paper we study budget feasible mechanisms where the goal is to procure independent sets from matroids. More specifically, we are given a matroid \mathcal{M} = (E, \mathcal{I}) where each ground (indivisible) element is a selfish agent. The cost of each element (i.e., for selling the item or performing a service) is only known to the element itself. There is a buyer with a budget having additive valuations over the set of elements E. The goal is to design an incentive compatible (truthful) budget feasible mechanism which procures an independent set of the matroid under the given budget that yields the largest value possible to the buyer. Our result is a deterministic, polynomial-time, individually rational, truthful and budget feasible mechanism with 4-approximation to the optimal independent set. Then, we extend our mechanism to the setting of matroid intersections in which the goal is to procure common independent sets from multiple matroids. We show that, given a polynomial time deterministic blackbox that returns \alpha-approximation solutions to the matroid intersection problem, there exists a deterministic, polynomial time, individually rational, truthful and budget feasible mechanism with (3\alpha+1)-approximation to the optimal common independent set.

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Building a GUI Application for Viewing and Searching Apache Kafka Messages

Author: Shivkumar Goel, M Prabhanath Nair, T Varghese John
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/06/2017

Apache Kafka is a scalable messaging system with the publish-subscribe model at its core. Several traditional messaging systems such as MSMQ and RabbitMQ exist, but they have limitations in terms of performance and throughput. Kafka, developed at LinkedIn, is the latest messaging technology being adopted by most of the top internet companies. The purpose of this paper is to provide a GUI and search tool to view and monitor messages inside Kafka.

International Journal on Recent and Innovation Trends in Computing and Communication