ICE: Enabling Non-Experts to Build Models Interactively for Large-Scale Lopsided Problems

Author: Aparna Lakshmiratan
Carlos Garcia
David Chickering
David Grangier
Denis Charles
Jina Suh
Johan Verwey
Jurado Suarez
Léon Bottou
Patrice Simard
Saleema Amershi
Publication date: 01/01/2014

Quick interaction between a human teacher and a learning machine presents numerous benefits and challenges when working with web-scale data. The human teacher guides the machine towards accomplishing the task of interest. The learning machine leverages big data to find examples that maximize the training value of its interaction with the teacher. When the teacher is restricted to labeling examples selected by the machine, this problem is an instance of active learning. When the teacher can provide additional information to the machine (e.g., suggestions on what examples or predictive features should be used) as the learning task progresses, then the problem becomes one of interactive learning. To accommodate the two-way communication channel needed for efficient interactive learning, the teacher and the machine need an environment that supports an interaction language. The machine can access, process, and summarize more examples than the teacher can see in a lifetime. Based on the machine's output, the teacher can revise the definition of the task or make it more precise. Both the teacher and the machine continuously learn and benefit from the interaction. We have built a platform to (1) produce valuable and deployable models and (2) support research on both the machine learning and user interface challenges of the interactive learning problem. The platform relies on a dedicated, low-latency, distributed, in-memory architecture that allows us to construct web-scale learning machines with quick interaction speed. The purpose of this paper is to describe this architecture and demonstrate how it supports our research efforts. Preliminary results are presented as illustrations of the architecture but are not the primary focus of the paper.

arXiv.org e-Print Archive

CiteSeerX

Track 3: Computations in theoretical physics -- techniques and methods

Author: Luisoni Gionata
Poslavsky Stanislav
Schroder York
Publication venue: IOP Publishing
Publication date: 12/04/2016

Here, we attempt to summarize the activities of Track 3 of the 17th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2016). Comment: 10 pages, 3 figures, to appear in the proceedings of ACAT 201

arXiv.org e-Print Archive

CERN Document Server

Using the R Package crlmm for Genotyping and Copy Number Estimation

Author: Benilton Carvalho
Ingo Ruczinski
Matthew E. Ritchie
Rafael A. Irizarry
Robert B. Scharpf

Genotyping platforms such as Affymetrix can be used to assess genotype-phenotype as well as copy number-phenotype associations at millions of markers. While genotyping algorithms are largely concordant when assessed on HapMap samples, tools to assess copy number changes are more variable and often discordant. One explanation for the discordance is that copy number estimates are susceptible to systematic differences between groups of samples that were processed at different times or by different labs. Analysis algorithms that do not adjust for batch effects are prone to spurious measures of association. The R package crlmm implements a multilevel model that adjusts for batch effects and provides allele-specific estimates of copy number. This paper illustrates a workflow for the estimation of allele-specific copy number and integration of the marker-level estimates with complementary Bioconductor software for inferring regions of copy number gain or loss. All analyses are performed in the statistical environment R.

Research Papers in Economics