109,144 research outputs found
Evorus: A Crowd-powered Conversational Assistant Built to Automate Itself Over Time
Crowd-powered conversational assistants have been shown to be more robust
than automated systems, but do so at the cost of higher response latency and
monetary costs. A promising direction is to combine the two approaches for high
quality, low latency, and low cost solutions. In this paper, we introduce
Evorus, a crowd-powered conversational assistant built to automate itself over
time by (i) allowing new chatbots to be easily integrated to automate more
scenarios, (ii) reusing prior crowd answers, and (iii) learning to
automatically approve response candidates. Our 5-month-long deployment with 80
participants and 281 conversations shows that Evorus can automate itself
without compromising conversation quality. Crowd-AI architectures have long
been proposed as a way to reduce cost and latency for crowd-powered systems;
Evorus demonstrates how automation can be introduced successfully in a deployed
system. Its architecture allows future researchers to make further innovation
on the underlying automated components in the context of a deployed open domain
dialog system.Comment: 10 pages. To appear in the Proceedings of the Conference on Human
Factors in Computing Systems 2018 (CHI'18
Using Visualization to Support Data Mining of Large Existing Databases
In this paper. we present ideas how visualization technology can be used to improve the difficult process of querying very large databases. With our VisDB system, we try to provide visual support not only for the query specification process. but also for evaluating query results and. thereafter, refining the query accordingly. The main idea of our system is to represent as many data items as possible by the pixels of the display device. By arranging and coloring the pixels according to the relevance for the query, the user gets a visual impression of the resulting data set and of its relevance for the query. Using an interactive query interface, the user may change the query dynamically and receives immediate feedback by the visual representation of the resulting data set. By using multiple windows for different parts of the query, the user gets visual feedback for each part of the query and, therefore, may easier understand the overall result. To support complex queries, we introduce the notion of approximate joins which allow the user to find data items that only approximately fulfill join conditions. We also present ideas how our technique may be extended to support the interoperation of heterogeneous databases. Finally, we discuss the performance problems that are caused by interfacing to existing database systems and present ideas to solve these problems by using data structures supporting a multidimensional search of the database
BioSimulator.jl: Stochastic simulation in Julia
Biological systems with intertwined feedback loops pose a challenge to
mathematical modeling efforts. Moreover, rare events, such as mutation and
extinction, complicate system dynamics. Stochastic simulation algorithms are
useful in generating time-evolution trajectories for these systems because they
can adequately capture the influence of random fluctuations and quantify rare
events. We present a simple and flexible package, BioSimulator.jl, for
implementing the Gillespie algorithm, -leaping, and related stochastic
simulation algorithms. The objective of this work is to provide scientists
across domains with fast, user-friendly simulation tools. We used the
high-performance programming language Julia because of its emphasis on
scientific computing. Our software package implements a suite of stochastic
simulation algorithms based on Markov chain theory. We provide the ability to
(a) diagram Petri Nets describing interactions, (b) plot average trajectories
and attached standard deviations of each participating species over time, and
(c) generate frequency distributions of each species at a specified time.
BioSimulator.jl's interface allows users to build models programmatically
within Julia. A model is then passed to the simulate routine to generate
simulation data. The built-in tools allow one to visualize results and compute
summary statistics. Our examples highlight the broad applicability of our
software to systems of varying complexity from ecology, systems biology,
chemistry, and genetics. The user-friendly nature of BioSimulator.jl encourages
the use of stochastic simulation, minimizes tedious programming efforts, and
reduces errors during model specification.Comment: 27 pages, 5 figures, 3 table
Crowdbreaks: Tracking Health Trends using Public Social Media Data and Crowdsourcing
In the past decade, tracking health trends using social media data has shown
great promise, due to a powerful combination of massive adoption of social
media around the world, and increasingly potent hardware and software that
enables us to work with these new big data streams. At the same time, many
challenging problems have been identified. First, there is often a mismatch
between how rapidly online data can change, and how rapidly algorithms are
updated, which means that there is limited reusability for algorithms trained
on past data as their performance decreases over time. Second, much of the work
is focusing on specific issues during a specific past period in time, even
though public health institutions would need flexible tools to assess multiple
evolving situations in real time. Third, most tools providing such capabilities
are proprietary systems with little algorithmic or data transparency, and thus
little buy-in from the global public health and research community. Here, we
introduce Crowdbreaks, an open platform which allows tracking of health trends
by making use of continuous crowdsourced labelling of public social media
content. The system is built in a way which automatizes the typical workflow
from data collection, filtering, labelling and training of machine learning
classifiers and therefore can greatly accelerate the research process in the
public health domain. This work introduces the technical aspects of the
platform and explores its future use cases
- …