2,576 research outputs found
Using humanoid robots to study human behavior
Our understanding of human behavior advances as our humanoid robotics work progresses, and vice versa. This team's work focuses on trajectory formation and planning, learning from demonstration, oculomotor control, and interactive behaviors. They are programming robotic behavior based on how we humans “program” behavior in, or train, each other.
Scalable Inference of Customer Similarities from Interactions Data using Dirichlet Processes
Under the sociological theory of homophily, people who are similar to one
another are more likely to interact with one another. Marketers often have
access to data on interactions among customers from which, with homophily as a
guiding principle, inferences could be made about the underlying similarities.
However, larger networks face a quadratic explosion in the number of potential
interactions that need to be modeled. This scalability problem renders
probability models of social interactions computationally infeasible for all
but the smallest networks. In this paper we develop a probabilistic framework
for modeling customer interactions that is both grounded in the theory of
homophily, and is flexible enough to account for random variation in who
interacts with whom. In particular, we present a novel Bayesian nonparametric
approach, using Dirichlet processes, to moderate the scalability problems that
marketing researchers encounter when working with networked data. We find that
this framework is a powerful way to draw insights into latent similarities of
customers, and we discuss how marketers can apply these insights to
segmentation and targeting activities.
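The paper's full model is not reproduced here, but its key scalability device, a Dirichlet process prior over customer segments, can be illustrated through its Chinese-restaurant-process representation: customers join larger segments with probability proportional to segment size, so the number of segments grows only logarithmically with network size rather than quadratically with the number of potential interactions. A minimal sketch (the function name and concentration value are hypothetical, not from the paper):

```python
import random
from collections import Counter

def crp_assignments(n_customers, alpha, seed=0):
    """Sample segment assignments from a Chinese restaurant process.

    Each new customer joins an existing segment with probability
    proportional to its current size, or starts a new segment with
    probability proportional to the concentration alpha.
    """
    rng = random.Random(seed)
    assignments = []
    counts = Counter()
    for _ in range(n_customers):
        segments = list(counts.keys())
        # weights: existing segment sizes, plus alpha for a brand-new segment
        weights = [counts[s] for s in segments] + [alpha]
        choice = rng.choices(segments + [len(segments)], weights=weights)[0]
        assignments.append(choice)
        counts[choice] += 1
    return assignments

labels = crp_assignments(100, alpha=2.0)
print(len(set(labels)))  # number of inferred segments, typically ~ alpha * log(n)
```

Because interactions can then be modeled at the segment level rather than per customer pair, the quadratic explosion in pairwise terms is avoided.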
Statistical Learning Approaches to Information Filtering
Enabling computer systems to understand human thinking or behavior has long been an exciting challenge for computer scientists. In recent years one such topic, information filtering, has emerged to help users find desired information items (e.g.~movies, books, news) in large amounts of available data, and has become crucial in many applications, such as product recommendation, image retrieval, spam email filtering, news filtering, and web navigation.
An information filtering system must be able to understand users' information needs. Existing approaches either infer a user's profile from his/her connections to other users, i.e.~collaborative filtering (CF), or analyze the content descriptions of liked or disliked examples annotated by the user, i.e.~content-based filtering (CBF). These methods work well to some extent, but face difficulties due to a lack of insight into the structure of the problem.
This thesis studies a broad range of information
filtering technologies. Novel and principled machine
learning methods are proposed to model users' information needs. The work demonstrates that the uncertainty of user profiles and the connections between them can be effectively modelled using probability theory and Bayes' rule. As one major contribution, the thesis clarifies the ``structure'' of information filtering and gives rise to principled solutions. In summary, the work of this thesis covers the following
three aspects:
Collaborative filtering: We develop a probabilistic model for memory-based collaborative filtering (PMCF), which has clear links with classical memory-based CF. Various heuristics for improving memory-based CF have been proposed
in the literature; in contrast, extensions to PMCF can be made in a principled probabilistic way. With PMCF, we describe a CF paradigm that interacts with
users, instead of passively receiving data as in conventional CF, and actively chooses the most informative patterns to learn from, thereby greatly reducing user effort and computational cost.
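PMCF itself is probabilistic, but the classical memory-based backbone it builds on can be sketched in a few lines: a plain user-based nearest-neighbour predictor with cosine similarity. All names and the toy ratings matrix below are hypothetical illustrations, not from the thesis:

```python
import numpy as np

def predict_rating(ratings, user, item, k=2):
    """Predict ratings[user, item] from the k most similar users
    who have rated the item (0 marks an unrated entry)."""
    mask = ratings[:, item] > 0
    mask[user] = False
    candidates = np.where(mask)[0]
    if candidates.size == 0:
        return 0.0
    u = ratings[user]
    # cosine similarity between the target user and each candidate
    sims = np.array([ratings[c] @ u /
                     (np.linalg.norm(ratings[c]) * np.linalg.norm(u))
                     for c in candidates])
    order = np.argsort(sims)[-k:]            # indices of the k nearest neighbours
    w, top = sims[order], candidates[order]
    return float(w @ ratings[top, item] / w.sum())  # similarity-weighted average

# toy ratings matrix: 4 users x 3 items
R = np.array([[5, 3, 0],
              [4, 3, 4],
              [5, 2, 5],
              [1, 1, 2]], dtype=float)
print(predict_rating(R, user=0, item=2))
```

PMCF replaces the ad hoc similarity weighting above with a proper probability model, which is what makes principled extensions such as active pattern selection possible.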
Content-based filtering: One major problem for CBF is the
deficiency and high dimensionality of content-descriptive
features. Information items (e.g.~images or articles) are typically described by high-dimensional features with mixed types of attributes that appear to have been developed independently yet are intrinsically related. We derive a generalized principal component analysis to merge high-dimensional and heterogeneous content features into a low-dimensional continuous latent space. The derived features bring great convenience to CBF, because most existing algorithms easily cope with low-dimensional continuous data, and, more importantly, the extracted features highlight the intrinsic semantics of the original content features.
Hybrid filtering: How to combine CF and CBF in a ``smart'' way remains one of the most challenging problems in information filtering, and little principled work exists so far. This thesis shows that people's information needs can be naturally modelled in a hierarchical Bayesian framework, where each individual's data are generated from his/her own profile model, which is itself a sample from a common distribution over the population of user profiles. Users are thus connected to each other via this common distribution. Because such a distribution is too complex in real-world applications for the usually applied parametric models, we introduce a nonparametric hierarchical Bayesian model based on Dirichlet processes. We derive effective and efficient algorithms to learn this model. The resulting hybrid filtering methods are surprisingly simple and intuitively understandable, offering clear insights into previous work on pure CF, pure CBF, and hybrid filtering.
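The hierarchical idea, every user profile as a draw from a common population distribution, can be illustrated with its simplest parametric stand-in: normal-normal partial pooling, where each user's estimate shrinks toward the population mean in proportion to how little data the user has. The thesis replaces this restrictive parametric choice with a Dirichlet process; all names and variance values below are hypothetical:

```python
import numpy as np

def partial_pool(user_means, user_counts, tau2=1.0, sigma2=1.0):
    """Shrink each user's observed mean preference toward the
    population mean; users with little data are pooled more.
    tau2: between-user variance, sigma2: within-user noise variance."""
    user_means = np.asarray(user_means, dtype=float)
    user_counts = np.asarray(user_counts, dtype=float)
    mu = np.average(user_means, weights=user_counts)   # population-level estimate
    w = user_counts / (user_counts + sigma2 / tau2)    # per-user weight on own data
    return w * user_means + (1 - w) * mu

# a data-rich user keeps its own estimate; a sparse user is pulled to the pool
est = partial_pool([1.0, 3.0], [50, 1])
print(est)
```

This is how the common distribution connects users: information flows from data-rich users to data-poor ones through the shared population estimate, which is exactly the coupling the nonparametric model preserves without assuming normality.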
Algorithms for CVaR Optimization in MDPs
In many sequential decision-making problems we may want to manage risk by
minimizing some measure of variability in costs in addition to minimizing a
standard criterion. Conditional value-at-risk (CVaR) is a relatively new risk
measure that addresses some of the shortcomings of the well-known
variance-related risk measures, and because of its computational efficiencies
has gained popularity in finance and operations research. In this paper, we
consider the mean-CVaR optimization problem in MDPs. We first derive a formula
for computing the gradient of this risk-sensitive objective function. We then
devise policy gradient and actor-critic algorithms, each of which uses a specific
method to estimate this gradient and updates the policy parameters in the
descent direction. We establish the convergence of our algorithms to locally
risk-sensitive optimal policies. Finally, we demonstrate the usefulness of our
algorithms in an optimal stopping problem.
Comment: Submitted to NIPS 1
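The algorithms themselves are developed in the paper; the risk measure they optimize is straightforward to estimate from cost samples. A minimal sketch of empirical CVaR (function name hypothetical): CVaR at level alpha is the expected cost over the worst (1 - alpha) fraction of outcomes, which is why it captures heavy tails that variance-based measures can understate.

```python
import numpy as np

def cvar(costs, alpha=0.95):
    """Empirical CVaR_alpha: mean cost over the worst (1 - alpha) tail."""
    costs = np.sort(np.asarray(costs, dtype=float))
    var = np.quantile(costs, alpha)       # value-at-risk threshold
    return float(costs[costs >= var].mean())

costs = np.arange(1, 101)                 # mock cost samples 1..100
print(cvar(costs, alpha=0.9))             # mean of the worst 10% of costs
```

A mean-CVaR objective then trades off the average cost against this tail expectation, and the paper's gradient formula is what makes that objective amenable to policy gradient and actor-critic updates.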