257,623 research outputs found
Framework for reliable, real-time facial expression recognition for low resolution images
International audienceAutomatic recognition of facial expressions is a challenging problem specially for low spatial resolution facial images. It has many potential applications in human-computer interactions, social robots, deceit detection, interactive video and behavior monitoring. In this study we present a novel framework that can recognize facial expressions very efficiently and with high accuracy even for very low resolution facial images. The proposed framework is memory and time efficient as it extracts texture features in a pyramidal fashion only from the perceptual salient regions of the face. We tested the framework on different databases, which includes Cohn-Kanade (CK+) posed facial expression database, spontaneous expressions of MMI facial expression database and FG-NET facial expressions and emotions database (FEED) and obtained very good results. Moreover, our proposed framework exceeds state-of-the-art methods for expression recognition on low resolution images
Evaluating Human-Language Model Interaction
Many real-world applications of language models (LMs), such as writing
assistance and code autocomplete, involve human-LM interaction. However, most
benchmarks are non-interactive in that a model produces output without human
involvement. To evaluate human-LM interaction, we develop a new framework,
Human-AI Language-based Interaction Evaluation (HALIE), that defines the
components of interactive systems and dimensions to consider when designing
evaluation metrics. Compared to standard, non-interactive evaluation, HALIE
captures (i) the interactive process, not only the final output; (ii) the
first-person subjective experience, not just a third-party assessment; and
(iii) notions of preference beyond quality (e.g., enjoyment and ownership). We
then design five tasks to cover different forms of interaction: social
dialogue, question answering, crossword puzzles, summarization, and metaphor
generation. With four state-of-the-art LMs (three variants of OpenAI's GPT-3
and AI21 Labs' Jurassic-1), we find that better non-interactive performance
does not always translate to better human-LM interaction. In particular, we
highlight three cases where the results from non-interactive and interactive
metrics diverge and underscore the importance of human-LM interaction for LM
evaluation.Comment: Authored by the Center for Research on Foundation Models (CRFM) at
the Stanford Institute for Human-Centered Artificial Intelligence (HAI
Diamonds in Dystopia
Diamonds in Dystopia is a body of work and web framework for creatively datamining large sources of text for mobile interaction. So far we have used it for live-streaming poetry performances at various locations such as SXSW Interactive and TEDx in addition to fine arts installations. As a perfor- mance it is a web driven app for incorporating improvisation into experiential storytelling. The audience acts as collaborator by sending word selections by tapping language on their mobiles to trigger reactions to send a distilled, improvisational stanza culled from a massive corpus of text to the poet on stage. The individual taps coming from the audience also trigger synthesized audio effects at varying pitches to create a musical experience as well as contributing to a vi- sual projection of the poem and audience interactivity. Cre- ated by Vincent A. Cellucci (poet), Jesse Allison (Professor of Experimental Music), and Derick Ostrenko (Professor of Digital Art), the applications use natural language processing on text to generate an innovative media stage project. The app creators are interested in creative data mining and incorporating interactive media into performances that challenge people’s perceptions and expectations for the mediums of music, digital art and design, and poetry
DeepSI: Interactive Deep Learning for Semantic Interaction
In this paper, we design novel interactive deep learning methods to improve
semantic interactions in visual analytics applications. The ability of semantic
interaction to infer analysts' precise intents during sensemaking is dependent
on the quality of the underlying data representation. We propose the
framework that integrates deep learning into
the human-in-the-loop interactive sensemaking pipeline, with two important
properties. First, deep learning extracts meaningful representations from raw
data, which improves semantic interaction inference. Second, semantic
interactions are exploited to fine-tune the deep learning representations,
which then further improves semantic interaction inference. This feedback loop
between human interaction and deep learning enables efficient learning of user-
and task-specific representations. To evaluate the advantage of embedding the
deep learning within the semantic interaction loop, we compare
against a state-of-the-art but more basic use
of deep learning as only a feature extractor pre-processed outside of the
interactive loop. Results of two complementary studies, a human-centered
qualitative case study and an algorithm-centered simulation-based quantitative
experiment, show that more accurately
captures users' complex mental models with fewer interactions
Recommended from our members
Physical Plan Instrumentation in Databases: Mechanisms and Applications
Database management systems (DBMSs) are designed with the goal set to compile SQL queries to physical plans that, when executed, provide results to the SQL queries. Building on this functionality, an ever-increasing number of application domains (e.g., provenance management, online query optimization, physical database design, interactive data profiling, monitoring, and interactive data visualization) seek to operate on how queries are executed by the DBMS for a wide variety of purposes ranging from debugging and data explanation to optimization and monitoring. Unfortunately, DBMSs provide little, if any, support to facilitate the development of this class of important application domains. The effect is such that database application developers and database system architects either rewrite the database internals in ad-hoc ways; work around the SQL interface, if possible, with inevitable performance penalties; or even build new databases from scratch only to express and optimize their domain-specific application logic over how queries are executed.
To address this problem in a principled manner in this dissertation, we introduce a prototype DBMS, namely, Smoke, that exposes instrumentation mechanisms in the form of a framework to allow external applications to manipulate physical plans. Intuitively, a physical plan is the underlying representation that DBMSs use to encode how a SQL query will be executed, and providing instrumentation mechanisms at this representation level allows applications to express and optimize their logic on how queries are executed.
Having such an instrumentation-enabled DBMS in-place, we then consider how to express and optimize applications that rely their logic on how queries are executed. To best demonstrate the expressive and optimization power of instrumentation-enabled DBMSs, we express and optimize applications across several important domains including provenance management, interactive data visualization, interactive data profiling, physical database design, online query optimization, and query discovery. Expressivity-wise, we show that Smoke can express known techniques, introduce novel semantics on known techniques, and introduce new techniques across domains. Performance-wise, we show case-by-case that Smoke is on par with or up-to several orders of magnitudes faster than state-of-the-art imperative and declarative implementations of important applications across domains.
As such, we believe our contributions provide evidence and form the basis towards a class of instrumentation-enabled DBMSs with the goal set to express and optimize applications across important domains with core logic over how queries are executed by DBMSs
OpenFace: An open source facial behavior analysis toolkit
Over the past few years, there has been an increased
interest in automatic facial behavior analysis and understanding.
We present OpenFace – an open source tool
intended for computer vision and machine learning researchers,
affective computing community and people interested
in building interactive applications based on facial
behavior analysis. OpenFace is the first open source tool
capable of facial landmark detection, head pose estimation,
facial action unit recognition, and eye-gaze estimation.
The computer vision algorithms which represent the core of
OpenFace demonstrate state-of-the-art results in all of the
above mentioned tasks. Furthermore, our tool is capable of
real-time performance and is able to run from a simple webcam
without any specialist hardware. Finally, OpenFace
allows for easy integration with other applications and devices
through a lightweight messaging system.European Community Seventh Framework Programme (FP7/2007-2013) under grant agreement No. 289021 (ASC-Inclusion)
Interactive Machine Learning with Applications in Health Informatics
Recent years have witnessed unprecedented growth of health data, including millions of biomedical research publications, electronic health records, patient discussions on health forums and social media, fitness tracker trajectories, and genome sequences. Information retrieval and machine learning techniques are powerful tools to unlock invaluable knowledge in these data, yet they need to be guided by human experts. Unlike training machine learning models in other domains, labeling and analyzing health data requires highly specialized expertise, and the time of medical experts is extremely limited. How can we mine big health data with little expert effort? In this dissertation, I develop state-of-the-art interactive machine learning algorithms that bring together human intelligence and machine intelligence in health data mining tasks. By making efficient use of human expert's domain knowledge, we can achieve high-quality solutions with minimal manual effort.
I first introduce a high-recall information retrieval framework that helps human users efficiently harvest not just one but as many relevant documents as possible from a searchable corpus. This is a common need in professional search scenarios such as medical search and literature review. Then I develop two interactive machine learning algorithms that leverage human expert's domain knowledge to combat the curse of "cold start" in active learning, with applications in clinical natural language processing. A consistent empirical observation is that the overall learning process can be reliably accelerated by a knowledge-driven "warm start", followed by machine-initiated active learning. As a theoretical contribution, I propose a general framework for interactive machine learning. Under this framework, a unified optimization objective explains many existing algorithms used in practice, and inspires the design of new algorithms.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/147518/1/raywang_1.pd
- …