A Structured Approach to Predicting Image Enhancement Parameters
Social networking on mobile devices has become commonplace in everyday life, and advances in mobile imaging have made capturing photos trivial. People therefore take many photos every day and want them to be visually attractive. This has given rise to automated, one-touch enhancement tools. However, the inability of those tools to provide personalized and content-adaptive enhancement has paved the way for machine-learned methods. Typical existing machine-learned methods predict the enhancement parameters for a new image heuristically (e.g. by kNN search), relating the image to a set of similar training images. These heuristic methods require constant interaction with the training images, which makes the parameter prediction sub-optimal and computationally expensive at test time. This paper presents a novel approach that predicts the enhancement parameters for a new image using only its features, without consulting any training images. We propose to model the interaction between an image's features and its corresponding enhancement parameters using matrix factorization (MF) principles, and we propose a way to integrate the image features into the MF formulation. We show that our approach outperforms heuristic approaches, as well as recent approaches in MF and structured prediction, on both synthetic and real-world image enhancement data.
Comment: WACV 201
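The abstract does not spell out the MF formulation, but the core idea, predicting parameters from features alone with no test-time search over training images, can be sketched as a learned feature-to-parameter map. The snippet below is a minimal, hypothetical stand-in: a single enhancement parameter (brightness gain) modelled as a linear function of a single feature (mean luminance), fit on invented data.

```python
# Hypothetical sketch: predict an enhancement parameter directly from an
# image feature, with no test-time search over training images. A single
# parameter (brightness gain) is modelled as a linear function of a single
# feature (mean luminance); the paper's MF formulation is richer.
def fit_linear(features, params):
    n = len(features)
    mx, my = sum(features) / n, sum(params) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(features, params))
    var = sum((x - mx) ** 2 for x in features)
    a = cov / var
    return a, my - a * mx         # slope, intercept

def predict(model, feature):
    a, b = model
    return a * feature + b

# invented data: darker images were given larger brightness gains
features = [0.2, 0.4, 0.6, 0.8]   # mean luminance of training images
params   = [1.6, 1.2, 0.8, 0.4]   # gains chosen for them
model = fit_linear(features, params)
print(round(predict(model, 0.5), 2))  # 1.0
```

At test time this is a single function evaluation, in contrast to a kNN search over the whole training set.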
Restricted Boltzmann Machines for Robust and Fast Latent Truth Discovery
We address the problem of latent truth discovery, LTD for short, where the
goal is to discover the underlying true values of entity attributes in the
presence of noisy, conflicting or incomplete information. Despite the multitude of algorithms addressing the LTD problem in the literature, little is known about their overall performance with respect to effectiveness
(in terms of truth discovery capabilities), efficiency and robustness. A
practical LTD approach should satisfy all these characteristics so that it can
be applied to heterogeneous datasets of varying quality and degrees of
cleanliness.
We propose a novel algorithm for LTD that satisfies the above requirements.
The proposed model is based on Restricted Boltzmann Machines, thus coined
LTD-RBM. In extensive experiments on various heterogeneous and publicly
available datasets, LTD-RBM is superior to state-of-the-art LTD techniques in
terms of an overall consideration of effectiveness, efficiency and robustness.
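LTD-RBM itself is not reproduced here; as a sketch of the LTD setting, the snippet below implements a classic iterative weighted-vote baseline that alternates between estimating truths and source reliabilities. All sources, entities and claimed values are invented.

```python
# Sketch of the LTD task via a classic iterative baseline (not LTD-RBM):
# alternate between a reliability-weighted vote for each entity's truth
# and re-estimating each source's reliability. All data invented.
def truth_discovery(claims, iters=10):
    """claims: {entity: {source: value}} -> {entity: value}."""
    sources = {s for votes in claims.values() for s in votes}
    weight = {s: 1.0 for s in sources}       # initial reliabilities
    truths = {}
    for _ in range(iters):
        for entity, votes in claims.items():
            score = {}
            for s, v in votes.items():
                score[v] = score.get(v, 0.0) + weight[s]
            truths[entity] = max(score, key=score.get)
        for s in sources:                    # reliability = agreement rate
            hits = total = 0
            for entity, votes in claims.items():
                if s in votes:
                    total += 1
                    hits += votes[s] == truths[entity]
            weight[s] = hits / total if total else 0.5
    return truths

claims = {
    "city_of:Acme": {"A": "Boston", "B": "Boston", "C": "Austin"},
    "ceo_of:Acme":  {"A": "Kim",    "B": "Kim",    "C": "Lee"},
    "hq_of:Beta":   {"A": "Paris",  "C": "Paris"},
}
print(truth_discovery(claims))
```

Sources A and B agree with the majority and end up with high reliability, so their claims dominate the weighted vote.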
Bias in OLAP Queries: Detection, Explanation, and Removal
Online analytical processing (OLAP) is an essential element of decision-support systems. OLAP tools provide the insights and understanding needed for improved decision making. However, the answers to OLAP queries can be biased and lead to perplexing and incorrect insights. In this paper, we propose HypDB, a system to detect, explain, and resolve bias in decision-support queries. We give a simple definition of a \emph{biased query} and detect bias by performing a set of independence tests on the data. We propose a novel technique that generates explanations for the bias, assisting the analyst in understanding what is going on. Additionally, we develop an automated method for rewriting a biased query into an unbiased one that reflects what the analyst intended to examine. In a thorough evaluation on several real datasets, we demonstrate both the quality and the performance of our techniques, including the completely automatic discovery of the revolutionary insights from a famous 1973 discrimination case.
Comment: This paper is an extended version of a paper presented at SIGMOD 201
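The kind of bias HypDB targets can be illustrated with a Simpson's-paradox style example (all counts below are invented): an overall group-by comparison suggests one conclusion, while stratifying on a confounding attribute, as a rewritten query would, reverses it.

```python
# Simpson's-paradox sketch (invented counts): (admitted, applied) per
# (group, dept). The overall rates and the per-dept rates disagree.
counts = {
    ("M", "eng"): (80, 100), ("M", "art"): (10, 50),
    ("F", "eng"): (45, 50),  ("F", "art"): (30, 100),
}

def rate(pairs):
    admitted = sum(p[0] for p in pairs)
    applied = sum(p[1] for p in pairs)
    return admitted / applied

# "biased query": overall admission rate per group
overall = {g: rate([v for (gg, _), v in counts.items() if gg == g])
           for g in "MF"}
# "rewritten query": rate per (group, dept), conditioning on the confounder
per_dept = {k: v[0] / v[1] for k, v in counts.items()}
print(overall)   # {'M': 0.6, 'F': 0.5} -- F looks worse overall ...
print(per_dept)  # ... yet F's rate is higher within every department
```

The confounder here is which department each group tends to apply to, exactly the kind of dependency an independence test would flag.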
FairMod - Making Predictive Models Discrimination Aware
Predictive models such as decision trees and neural networks may produce discriminatory predictions. This paper proposes a method to post-process the predictions of a predictive model so that the processed predictions are non-discriminatory. The method considers multiple protected variables together, which makes the problem more challenging than handling a single protected variable. It uses a well-cited discrimination metric and adapts it to allow the specification of explanatory variables, such as position, profession, and education, that describe the context of an application. It models the post-processing of predictions as a nonlinear optimization problem, finding the best adjustments to the predictions such that the discrimination constraints on all protected variables are met simultaneously. The proposed method is independent of the classification method and can handle cases that existing methods cannot: satisfying multiple protected attributes at the same time, allowing multiple explanatory attributes, and being independent of the classification model type. An evaluation on four real-world datasets shows that the proposed method is as effective as existing methods, in addition to this extra power.
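The paper's nonlinear program is not reproduced here; a far simpler post-processing sketch in the same spirit chooses a per-group decision threshold so that positive-prediction rates match across protected groups. Scores and group labels below are invented.

```python
# Sketch (not the paper's nonlinear program): per-group thresholds chosen
# so each protected group gets the same positive-prediction rate.
def equalize_rates(scores_by_group, target_rate):
    thresholds = {}
    for g, scores in scores_by_group.items():
        ranked = sorted(scores, reverse=True)
        k = round(target_rate * len(ranked))   # positives to allow
        thresholds[g] = ranked[k - 1] if k else float("inf")
    return thresholds

scores = {"A": [0.9, 0.8, 0.4, 0.3], "B": [0.6, 0.5, 0.2, 0.1]}
th = equalize_rates(scores, 0.5)
print(th)  # {'A': 0.8, 'B': 0.5}: both groups now pass 2 of 4
```

Like the paper's method, this operates purely on predictions and is therefore independent of the underlying classifier; unlike the paper, it handles only one protected variable and no explanatory attributes.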
Bridging observational studies and randomized experiments by embedding the former in the latter
The health effects of environmental exposures have been studied for decades, typically using standard regression models to assess exposure-outcome associations found in observational, non-experimental data. We propose and illustrate a different approach to examining causal effects of environmental exposures on health outcomes from observational data. Our strategy attempts to structure the observational data to approximate data from a hypothetical, but realistic, randomized experiment. This approach, based on insights from classical experimental design, involves four stages and relies on modern computing to implement the effort in two of them. More specifically, our strategy involves: 1) a conceptual stage that precisely formulates the causal question in terms of a hypothetical randomized experiment in which the exposure is assigned to units; 2) a design stage that attempts to reconstruct (or approximate) a randomized experiment before any outcome data are observed; 3) a statistical analysis comparing the outcomes of interest in the exposed and non-exposed units of the hypothetical randomized experiment; and 4) a summary stage providing conclusions about the statistical evidence for the sizes of possible causal effects of the exposure on the outcomes. We illustrate our approach with an example examining the effect of parental smoking on children's lung function, using data collected from families living in East Boston in the 1970s. To complement traditional, purely model-based approaches, our strategy, which includes outcome-free matched sampling, provides workable tools to quantify possible detrimental exposure effects on human health outcomes, especially because it also includes transparent diagnostics to assess the assumptions of the four-stage statistical approach being applied.
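The design stage's outcome-free matched sampling can be sketched as greedy 1:1 nearest-neighbour matching of exposed units to controls on a covariate, performed before any outcome data are examined. Units and covariate values below are invented.

```python
# Design-stage sketch: greedy 1:1 nearest matching on a covariate,
# done without looking at any outcomes (all values invented).
def match(exposed, controls):
    pairs, free = [], dict(controls)
    for unit, x in sorted(exposed.items()):
        best = min(free, key=lambda c: abs(free[c] - x))
        pairs.append((unit, best))
        del free[best]            # each control is used at most once
    return pairs

exposed  = {"e1": 30, "e2": 45}   # covariate, e.g. parent's age
controls = {"c1": 29, "c2": 31, "c3": 44}
print(match(exposed, controls))   # [('e1', 'c1'), ('e2', 'c3')]
```

Because only covariates enter the matching, the analyst cannot (even inadvertently) tune the design toward a desired outcome comparison.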
Multiplicative Coevolution Regression Models for Longitudinal Networks and Nodal Attributes
We introduce a simple and extendable coevolution model for the analysis of
longitudinal network and nodal attribute data. The model features parameters
that describe three phenomena: homophily, contagion and autocorrelation of the
network and nodal attribute process. Homophily here describes how changes to
the network may be associated with between-node similarities in terms of their
nodal attributes. Contagion refers to how node-level attributes may change
depending on the network. The model we present is based upon a pair of
intertwined autoregressive processes. We obtain least-squares parameter
estimates for continuous-valued fully-observed network and attribute data. We
also provide methods for Bayesian inference in several other cases, including
ordinal network and attribute data, and models involving latent nodal
attributes. These model extensions are applied to an analysis of international
relations data and to data from a study of teen delinquency and friendship
networks.
Comment: 20 page
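The least-squares estimation for the attribute half of the model can be sketched as a two-regressor autoregression: the next attribute value depends on the node's own value (autocorrelation) and on its neighbours' mean (contagion). The data below are invented and noiseless, so the fit recovers the coefficients exactly.

```python
# Sketch of the attribute half of the coevolution model: the next value
# of a node's attribute depends on its own value (autocorrelation, a) and
# on its neighbours' mean (contagion, c). Fit by 2x2 normal equations.
def fit_ar(xs, nbars, ys):
    sxx = sxn = snn = sxy = sny = 0.0
    for x, n, y in zip(xs, nbars, ys):
        sxx += x * x; sxn += x * n; snn += n * n
        sxy += x * y; sny += n * y
    det = sxx * snn - sxn * sxn
    a = (snn * sxy - sxn * sny) / det
    c = (sxx * sny - sxn * sxy) / det
    return a, c

xs  = [1.0, 2.0, 0.5, 1.5]        # node's attribute at time t (invented)
nbs = [0.0, 1.0, 2.0, 0.5]        # neighbours' mean attribute at time t
ys  = [0.5 * x + 0.3 * n for x, n in zip(xs, nbs)]   # attribute at t+1
a, c = fit_ar(xs, nbs, ys)
print(round(a, 3), round(c, 3))   # 0.5 0.3 (noiseless, so exact)
```

The paper's full model intertwines this with a matching autoregression for the network; this sketch fits only the nodal-attribute equation.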
Latent Constraints: Learning to Generate Conditionally from Unconditional Generative Models
Deep generative neural networks have proven effective at both conditional and
unconditional modeling of complex data distributions. Conditional generation
enables interactive control, but creating new controls often requires expensive
retraining. In this paper, we develop a method to condition generation without
retraining the model. By post-hoc learning latent constraints, value functions
that identify regions in latent space that generate outputs with desired
attributes, we can conditionally sample from these regions with gradient-based
optimization or amortized actor functions. Combining attribute constraints with
a universal "realism" constraint, which enforces similarity to the data
distribution, we generate realistic conditional images from an unconditional
variational autoencoder. Further, using gradient-based optimization, we
demonstrate identity-preserving transformations that make the minimal
adjustment in latent space to modify the attributes of an image. Finally, with
discrete sequences of musical notes, we demonstrate zero-shot conditional
generation, learning latent constraints in the absence of labeled data or a
differentiable reward function. Code with dedicated cloud instance has been
made publicly available (https://goo.gl/STGMGx).
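The identity-preserving edit can be sketched as gradient descent in latent space on an attribute loss plus a proximity penalty to the starting point. The attribute function below is a hand-written toy; the paper instead learns such value functions post hoc.

```python
# Sketch of identity-preserving latent editing: gradient descent on an
# attribute loss plus a penalty for moving away from the start z0.
# attr() is a hand-written toy; the paper learns such value functions.
def attr(z):
    return z[0] + 0.5 * z[1]

GRAD_ATTR = [1.0, 0.5]            # gradient of attr (constant here)

def edit(z0, target, lam=0.1, lr=0.2, steps=200):
    z = list(z0)
    for _ in range(steps):
        err = attr(z) - target
        for i in range(len(z)):
            g = 2 * err * GRAD_ATTR[i] + 2 * lam * (z[i] - z0[i])
            z[i] -= lr * g
    return z

z = edit([0.0, 0.0], target=1.0)
print(round(attr(z), 2))  # ~0.93: near the target, shrunk by the penalty
```

The proximity term lam plays the role of the "minimal adjustment": it trades off hitting the attribute target against staying close to the original latent code.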
Semi-supervised Embedding Learning for High-dimensional Bayesian Optimization
Bayesian optimization is a broadly applied methodology for optimizing expensive black-box functions. Despite its success, it still faces the challenge of high-dimensional search spaces. To alleviate this problem, we propose a novel Bayesian optimization framework (termed SILBO), which iteratively finds a low-dimensional space in which to perform Bayesian optimization through semi-supervised dimension reduction. SILBO incorporates both labeled points and unlabeled points acquired from the acquisition function to guide the learning of the embedding space. To accelerate the learning procedure, we present a randomized method for generating the projection matrix. Furthermore, to map from the low-dimensional space back to the high-dimensional original space, we propose two mapping strategies, chosen according to the evaluation overhead of the objective function. Experimental results on both synthetic functions and hyperparameter optimization tasks demonstrate that SILBO outperforms existing state-of-the-art high-dimensional Bayesian optimization methods.
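The randomized-projection idea can be sketched as a fixed random embedding: search a low-dimensional space and lift candidates to the original space before evaluating the black box. Here plain random search stands in for the Bayesian optimization loop, SILBO's semi-supervised learning is omitted, and the dimensions and objective are invented.

```python
# Sketch of the random-embedding idea (the semi-supervised learning of
# SILBO is omitted): search a d-dim space, lift points to D dims with a
# fixed random matrix, and evaluate the black box there. Random search
# stands in for the Bayesian optimization loop; everything is invented.
import random

random.seed(0)
D, d = 20, 2
A = [[random.gauss(0, 1) for _ in range(d)] for _ in range(D)]

def lift(z):                      # low-dim candidate -> high-dim point
    return [sum(A[i][j] * z[j] for j in range(d)) for i in range(D)]

def objective(x):                 # toy black box with low effective dim
    return (x[0] - 1) ** 2 + (x[3] + 0.5) ** 2

candidates = [[0.0, 0.0]] + [[random.uniform(-1, 1) for _ in range(d)]
                             for _ in range(500)]
best = min(candidates, key=lambda z: objective(lift(z)))
print(objective(lift(best)) <= objective(lift([0.0, 0.0])))  # True
```

The search happens entirely in the 2-dimensional space, yet every evaluation is made in the 20-dimensional original space, which is the point of embedding-based high-dimensional BO.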
Packet Score based network security and Traffic Optimization
One of the critical threats to internet security is Distributed Denial of Service (DDoS). By introducing automated online attack classification and attack-packet discarding, this paper helps to resolve this network security issue to a certain level. Incoming packets are assigned scores, on a per-packet basis, based on the priority associated with their attributes and on a comparison with the probability distribution of arriving packets.
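A minimal sketch of per-packet scoring, assuming a profile of attribute-value probabilities under legitimate traffic (all attributes and probabilities below are invented): packets whose attribute values are unlikely under the profile receive low scores and are discarded.

```python
# Sketch of per-packet scoring (all probabilities invented): score each
# packet by how likely its attribute values are under a legitimate-traffic
# profile; packets scoring below a threshold are discarded.
profile = {
    "proto": {"tcp": 0.8, "udp": 0.15, "icmp": 0.05},
    "ttl":   {"high": 0.7, "low": 0.3},
}

def score(packet):
    s = 1.0
    for attr, value in packet.items():
        s *= profile[attr].get(value, 0.01)  # unseen values look suspicious
    return s

def filter_packets(packets, threshold=0.05):
    return [p for p in packets if score(p) >= threshold]

packets = [{"proto": "tcp", "ttl": "high"},   # legitimate-looking
           {"proto": "icmp", "ttl": "low"}]   # attack-looking
print(len(filter_packets(packets)))  # 1: the second packet is dropped
```

In a real deployment the profile would be estimated online from the observed distribution of arriving packets, so that a sudden flood of unusual packets scores poorly.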
Next Stop "NoOps": Enabling Cross-System Diagnostics Through Graph-based Composition of Logs and Metrics
Performing diagnostics in IT systems is an increasingly complicated task that even the most skillful operators cannot complete in satisfactory time. Systems and their architectures change very rapidly in response to business and user demand. Many organizations see value in the NoOps ("No Operations") maintenance and management model. One of the implementations of
this model is a system that is maintained automatically without any human
intervention. The path to NoOps involves not only precise and fast diagnostics
but also reusing as much knowledge as possible after the system is reconfigured
or changed. The biggest challenge is to leverage knowledge about one IT system and reuse it to diagnose another, different system. We propose a
framework of weighted graphs which can transfer knowledge, and perform
high-quality diagnostics of IT systems. We encode all possible data in a graph
representation of a system state and automatically calculate weights of these
graphs. Then, thanks to the evaluation of similarity between graphs, we
transfer knowledge about failures from one system to another and use it for
diagnostics. We successfully evaluate the proposed approach on Spark, Hadoop,
Kafka and Cassandra systems.
Comment: Peer-reviewed, accepted as a regular paper to IEEE Cluster 2018. To be published through proceedings in September 201
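The graph-similarity transfer can be sketched with weighted edge sets: a new system state is diagnosed with the label of the most similar known failure graph under a weighted-Jaccard similarity. Components, weights and failure labels below are invented.

```python
# Sketch of graph-based transfer (components, weights, labels invented):
# system states are weighted edge sets; diagnose a new state by the label
# of the most similar known failure graph (weighted-Jaccard similarity).
def similarity(g1, g2):
    edges = set(g1) | set(g2)
    num = sum(min(g1.get(e, 0.0), g2.get(e, 0.0)) for e in edges)
    den = sum(max(g1.get(e, 0.0), g2.get(e, 0.0)) for e in edges)
    return num / den if den else 0.0

known = {  # failure graphs learned on one system
    "disk-failure":    {("app", "disk"): 0.9, ("app", "net"): 0.1},
    "network-failure": {("app", "net"): 0.8, ("app", "disk"): 0.1},
}

def diagnose(state):
    return max(known, key=lambda label: similarity(known[label], state))

# a state observed on a *different* system, sharing only edge structure
print(diagnose({("app", "net"): 0.7}))  # network-failure
```

Because the similarity depends only on edge identities and weights, knowledge encoded for one system can be applied to another system whose graph shares structure, which is the transfer idea in the abstract.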