Complete Genome Sequences of a Clinical Isolate and an Environmental Isolate of Vibrio parahaemolyticus.

Author: Fischer Markus
Jones Jessica L
Kong Nguyet
Lüdeke Catharina HM
Weimer Bart C
Publication venue: eScholarship, University of California
Publication date: 01/03/2015

Vibrio parahaemolyticus is the leading cause of seafood-borne infections in the United States. We report complete genome sequences for two V. parahaemolyticus strains isolated in 2007, CDC_K4557 and FDA_R31, of clinical and oyster origin, respectively. These two sequences might assist in the investigation of differential virulence of this organism.

PubMed Central

eScholarship - University of California

PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems

Author: Chun Byung-Gon
Interlandi Matteo
Lee Yunseong
Santambrogio Marco Domenico
Scolari Alberto
Weimer Markus
Publication date: 01/01/2018

Machine Learning models are often composed of pipelines of transformations. While this design allows single model components to be executed efficiently at training time, prediction serving has different requirements, such as low latency, high throughput, and graceful performance degradation under heavy load. Current prediction serving systems treat models as black boxes, ignoring prediction-time-specific optimizations in favor of ease of deployment. In this paper, we present PRETZEL, a prediction serving system introducing a novel white box architecture that enables both end-to-end and multi-model optimizations. Using production-like model pipelines, our experiments show that PRETZEL improves performance along several dimensions: compared to state-of-the-art approaches, PRETZEL on average reduces 99th percentile latency by 5.5x while reducing memory footprint by 25x and increasing throughput by 4.7x.

Comment: 16 pages, 14 figures, 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 201

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Workshop on information heterogeneity and fusion in recommender systems (HetRec 2010)

Author: Brusilovsky Peter L.
Cantador Iván
Koren Yehuda
Kuflik Tsvi
Weimer Markus
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010

This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in RecSys '10 Proceedings of the fourth ACM conference on Recommender systems, http://dx.doi.org/10.1145/1864708.1864796

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

Iterative MapReduce for Large Scale Machine Learning

Author: Borkar Vinayak
Bu Yingyi
Carey Michael J.
Condie Tyson
Polyzotis Neoklis
Ramakrishnan Raghu
Rosen Joshua
Weimer Markus
Publication date: 13/03/2013

Large datasets ("Big Data") are becoming ubiquitous because the potential value in deriving insights from data, across a wide range of business and scientific applications, is increasingly recognized. In particular, machine learning - one of the foundational disciplines for data analysis, summarization and inference - on Big Data has become routine at most organizations that operate large clouds, usually based on systems such as Hadoop that support the MapReduce programming paradigm. It is now widely recognized that while MapReduce is highly scalable, it suffers from a critical weakness for machine learning: it does not support iteration. Consequently, one has to program around this limitation, leading to fragile, inefficient code. Further, reliance on the programmer is inherently flawed in a multi-tenanted cloud environment, since the programmer does not have visibility into the state of the system when his or her program executes. Prior work has sought to address this problem by either developing specialized systems aimed at stylized applications, or by augmenting MapReduce with ad hoc support for saving state across iterations (driven by an external loop). In this paper, we advocate support for looping as a first-class construct, and propose an extension of the MapReduce programming paradigm called Iterative MapReduce. We then develop an optimizer for a class of Iterative MapReduce programs that cover most machine learning techniques, provide theoretical justifications for the key optimization steps, and empirically demonstrate that system-optimized programs for significant machine learning tasks are competitive with state-of-the-art specialized solutions.

arXiv.org e-Print Archive

CiteSeerX

Hiding in Plain Sight: A Longitudinal Study of Combosquatting Abuse

Author: Alex Gontmakher
Artem Dinaburg
Chris Kanich
Fidelis Threat Research Team
FireEye
Jakobsson Markus
Kreibich Christian
Lee
Leyla Bilge Engin Kirda
Manos Antonakakis Roberto Perdisci
Marczak William R
Monrose
Rahbarinia R. Perdisci
Rodney Joffe
Snyder Peter
Symantec
TECHNOLOGIES.
Wang Yi-Min
Weimer
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 28/08/2017

Domain squatting is a common adversarial practice where attackers register domain names that are purposefully similar to popular domains. In this work, we study a specific type of domain squatting called "combosquatting," in which attackers register domains that combine a popular trademark with one or more phrases (e.g., betterfacebook[.]com, youtube-live[.]com). We perform the first large-scale, empirical study of combosquatting by analyzing more than 468 billion DNS records collected from passive and active DNS data sources over almost six years. We find that almost 60% of abusive combosquatting domains live for more than 1,000 days, and even worse, we observe increased activity associated with combosquatting year over year. Moreover, we show that combosquatting is used to perform a spectrum of different types of abuse, including phishing, social engineering, affiliate abuse, trademark abuse, and even advanced persistent threats. Our results suggest that combosquatting is a real problem that requires increased scrutiny by the security community.

Comment: ACM CCS 1

arXiv.org e-Print Archive

Crossref