Zero Shot Learning for Code Education: Rubric Sampling with Deep Learning Inference
In modern computer science education, massive open online courses (MOOCs) log
thousands of hours of data about how students solve coding challenges. Being so
rich in data, these platforms have garnered the interest of the machine
learning community, with many new algorithms attempting to autonomously provide
feedback to help future students learn. But what about those first hundred
thousand students? In most educational contexts (i.e. classrooms), assignments
do not have enough historical data for supervised learning. In this paper, we
introduce a human-in-the-loop "rubric sampling" approach to tackle the "zero
shot" feedback challenge. We are able to provide autonomous feedback for the
first students working on an introductory programming assignment with accuracy
that substantially outperforms data-hungry algorithms and approaches
human-level fidelity. Rubric sampling requires minimal teacher effort, can associate
feedback with specific parts of a student's solution and can articulate a
student's misconceptions in the language of the instructor. Deep learning
inference enables rubric sampling to further improve as more
assignment-specific student data is acquired. We demonstrate our results on a novel
dataset from Code.org, the world's largest programming education platform.
Comment: To appear at AAAI 2019; 9 page
Neural Collaborative Subspace Clustering
We introduce the Neural Collaborative Subspace Clustering, a neural model
that discovers clusters of data points drawn from a union of low-dimensional
subspaces. In contrast to previous attempts, our model runs without the aid of
spectral clustering. This allows our algorithm to scale gracefully to large
datasets. At its heart, our neural model benefits
from a classifier which determines whether a pair of points lies on the same
subspace or not. Essential to our model is the construction of two affinity
matrices, one from the classifier and the other from a notion of subspace
self-expressiveness, to supervise training in a collaborative scheme. We
thoroughly assess and contrast the performance of our model against various
state-of-the-art clustering algorithms including deep subspace-based ones.
Comment: Accepted to ICML 201
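As an illustration of the self-expressiveness affinity mentioned in the abstract above, here is a minimal sketch, not the authors' implementation. It assumes a coefficient matrix `coeffs` is already available, where row i holds the weights used to reconstruct point i from the other points, and it uses the common symmetrization (|C| + |C|^T) / 2. The function name and the toy values are hypothetical.

```python
def self_expressive_affinity(coeffs):
    """Symmetric affinity from self-expression coefficients.

    coeffs[i][j] is the (assumed, precomputed) weight of point j in the
    reconstruction of point i. The affinity averages the magnitudes of
    the two directed weights so the result is symmetric and nonnegative.
    """
    n = len(coeffs)
    return [
        [0.5 * (abs(coeffs[i][j]) + abs(coeffs[j][i])) for j in range(n)]
        for i in range(n)
    ]


# Toy coefficients: points 0 and 1 mostly reconstruct each other,
# as do points 2 and 3 (hypothetical values for illustration only).
coeffs = [
    [0.0, 0.9, 0.0, 0.1],
    [0.8, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.0, -1.0],
    [0.1, 0.0, 0.7, 0.0],
]
affinity = self_expressive_affinity(coeffs)
```

In this toy case the large entries of `affinity` pair up points 0 and 1 and points 2 and 3, which is the kind of signal a second affinity source, such as a pairwise classifier, could be compared against.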
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision
community because it plays an important role in video surveillance. Many
algorithms have been proposed to handle this task. The goal of this paper is to
review existing works using traditional methods or based on deep learning
networks. Firstly, we introduce the background of pedestrian attribute
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and the corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criteria. Thirdly, we
analyse the concepts of multi-task learning and multi-label learning, and
explain the relations between these two learning paradigms and pedestrian
attribute recognition. We also review some popular network architectures which
have been widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attribute group-based and part-based
approaches, etc. Fifthly, we show some applications which take pedestrian
attributes into consideration and achieve better performance. Finally, we
summarize this paper and give several possible research directions for
pedestrian attribute recognition. The project page of this paper can be found
at the following website:
https://sites.google.com/view/ahu-pedestrianattributes/.
Comment: Check our project page for High Resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
Scalable Recollections for Continual Lifelong Learning
Given the recent success of Deep Learning applied to a variety of single
tasks, it is natural to consider more human-realistic settings. Perhaps the
most difficult of these settings is that of continual lifelong learning, where
the model must learn online over a continuous stream of non-stationary data. A
successful continual lifelong learning system must have three key capabilities:
it must learn and adapt over time, it must not forget what it has learned, and
it must be efficient in both training time and memory. Recent techniques have
focused their efforts primarily on the first two capabilities while questions
of efficiency remain largely unexplored. In this paper, we consider the problem
of efficient and effective storage of experiences over very large time-frames.
In particular we consider the case where typical experiences are O(n) bits and
memories are limited to O(k) bits for k << n. We present a novel scalable
architecture and training algorithm in this challenging domain and provide an
extensive evaluation of its performance. Our results show that we can achieve
considerable gains on top of state-of-the-art methods such as GEM.
Comment: AAAI 201
TasNet: time-domain audio separation network for real-time, single-channel speech separation
Robust speech processing in multi-talker environments requires effective
speech separation. Recent deep learning systems have made significant progress
toward solving this problem, yet it remains challenging, particularly in
real-time, short-latency applications. Most methods attempt to construct a mask
for each source in a time-frequency representation of the mixture signal, which
is not necessarily an optimal representation for speech separation. In addition,
time-frequency decomposition results in inherent problems such as
phase/magnitude decoupling and the long time window required to achieve
sufficient frequency resolution. We propose Time-domain Audio Separation
Network (TasNet) to overcome these limitations. We directly model the signal in
the time-domain using an encoder-decoder framework and perform the source
separation on nonnegative encoder outputs. This method removes the frequency
decomposition step and reduces the separation problem to estimation of source
masks on encoder outputs, which are then synthesized by the decoder. Our system
outperforms the current state-of-the-art causal and noncausal speech separation
algorithms, reduces the computational cost of speech separation, and
significantly reduces the minimum required latency of the output. This makes
TasNet suitable for applications where low-power, real-time implementation is
desirable, such as in hearable and telecommunication devices.
Comment: Camera ready version for ICASSP 2018, Calgary, Canad
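The encoder, mask, and decoder pipeline described in the abstract above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the TasNet architecture: the basis and decoder weights are random rather than learned, the `scores` array is a random stand-in for the output of a separation network, and all names and sizes are hypothetical.

```python
import math
import random


def encode(segment, basis):
    """Nonnegative encoder output: ReLU of the segment projected on each basis row."""
    return [max(0.0, sum(b * x for b, x in zip(row, segment))) for row in basis]


def softmax_masks(scores):
    """Turn scores[c][n] into masks[c][n] that sum to 1 over sources c."""
    n_src, n_basis = len(scores), len(scores[0])
    masks = [[0.0] * n_basis for _ in range(n_src)]
    for n in range(n_basis):
        peak = max(scores[c][n] for c in range(n_src))
        exps = [math.exp(scores[c][n] - peak) for c in range(n_src)]
        total = sum(exps)
        for c in range(n_src):
            masks[c][n] = exps[c] / total
    return masks


def decode(weights, decoder):
    """Linear decoder: weighted sum of the decoder's basis signals."""
    length = len(decoder[0])
    return [sum(w * row[t] for w, row in zip(weights, decoder)) for t in range(length)]


rng = random.Random(0)
SEG_LEN, N_BASIS, N_SRC = 8, 16, 2  # hypothetical sizes
basis = [[rng.gauss(0, 1) for _ in range(SEG_LEN)] for _ in range(N_BASIS)]
decoder = [[rng.gauss(0, 1) for _ in range(SEG_LEN)] for _ in range(N_BASIS)]
segment = [rng.gauss(0, 1) for _ in range(SEG_LEN)]  # one short mixture segment
scores = [[rng.gauss(0, 1) for _ in range(N_BASIS)] for _ in range(N_SRC)]

mixture_weights = encode(segment, basis)
masks = softmax_masks(scores)
# Apply each source mask to the encoder output, then synthesize with the decoder.
sources = [
    decode([m * w for m, w in zip(masks[c], mixture_weights)], decoder)
    for c in range(N_SRC)
]
mixture_hat = decode(mixture_weights, decoder)
```

Because the masks sum to one and the decoder in this sketch is linear, the decoded sources add up to the decoded mixture, which is a convenient sanity check on the masking step.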
Creating Capsule Wardrobes from Fashion Images
We propose to automatically create capsule wardrobes. Given an inventory of
candidate garments and accessories, the algorithm must assemble a minimal set
of items that provides maximal mix-and-match outfits. We pose the task as a
subset selection problem. To permit efficient subset selection over the space
of all outfit combinations, we develop submodular objective functions capturing
the key ingredients of visual compatibility, versatility, and user-specific
preference. Since adding garments to a capsule only expands its possible
outfits, we devise an iterative approach to allow near-optimal submodular
function maximization. Finally, we present an unsupervised approach to learn
visual compatibility from "in the wild" full body outfit photos; the
compatibility metric translates well to cleaner catalog photos and improves
over existing methods. Our results on thousands of pieces from popular fashion
websites show that automatic capsule creation has the potential to mimic skilled
fashionistas in assembling flexible wardrobes, while being significantly more
scalable.
Comment: Accepted to CVPR 201
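To make the subset selection framing in the abstract above concrete, here is a minimal sketch of the standard greedy heuristic for maximizing a monotone submodular function under a size budget. It is not the authors' iterative method or objective: the coverage objective (number of distinct style tags covered) is a simple stand-in, and the inventory, tags, and function names are hypothetical.

```python
def coverage(selected, tags_by_item):
    """Stand-in submodular objective: number of distinct tags covered."""
    covered = set()
    for item in selected:
        covered |= tags_by_item[item]
    return len(covered)


def greedy_select(tags_by_item, budget):
    """Repeatedly add the item with the largest marginal gain in coverage."""
    selected = []
    remaining = set(tags_by_item)
    for _ in range(min(budget, len(tags_by_item))):
        base = coverage(selected, tags_by_item)
        best_item, best_gain = None, 0
        for item in sorted(remaining):  # sorted for deterministic tie-breaking
            gain = coverage(selected + [item], tags_by_item) - base
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no remaining item adds anything new
            break
        selected.append(best_item)
        remaining.remove(best_item)
    return selected


# Hypothetical inventory: each garment is tagged with occasions it suits.
inventory = {
    "blazer": {"office", "formal", "evening"},
    "jeans": {"casual", "weekend"},
    "sneakers": {"casual", "sport"},
    "white_shirt": {"office", "casual"},
}
capsule = greedy_select(inventory, budget=2)
```

With this toy inventory the sketch picks the blazer first (three new tags) and then the jeans (two new tags). For monotone submodular objectives, this kind of greedy selection is known to come within a factor of 1 - 1/e of the optimum.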