Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners
In real-world educational settings, an effective teacher adaptively
chooses the next example to teach based on the learner's current state.
However, most existing work in algorithmic machine teaching focuses on the
batch setting, where adaptivity plays no role. In this paper, we study the case
of teaching consistent, version space learners in an interactive setting. At
any time step, the teacher provides an example, the learner performs an update,
and the teacher observes the learner's new state. We highlight that adaptivity
does not speed up the teaching process when considering existing models of
version space learners, such as "worst-case" (the learner picks the next
hypothesis randomly from the version space) and "preference-based" (the learner
picks a hypothesis according to some global preference). Inspired by human
teaching, we propose a new model where the learner picks hypotheses according
to some local preference defined by the current hypothesis. We show that our
model exhibits several desirable properties, e.g., adaptivity plays a key role,
and the learner's transitions over hypotheses are smooth/interpretable. We
develop efficient teaching algorithms and demonstrate our results via
simulation and user studies.
Comment: NeurIPS 2018 (extended version).
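The interaction loop described above (teacher shows an example, the version space shrinks, the learner moves to a nearby consistent hypothesis) can be sketched on a toy hypothesis class. This is a minimal illustration, not the paper's construction: the threshold classifiers, the greedy teacher, and the nearest-hypothesis local preference are all simplifying assumptions.

```python
# Toy machine-teaching loop for a version space learner with a *local*
# preference, as sketched in the abstract. Hypothesis class: threshold
# classifiers on {0,...,9}, h_t(x) = 1 iff x >= t (an illustrative choice).
domain = list(range(10))
thresholds = list(range(11))  # candidate thresholds t = 0..10

def consistent(t, examples):
    # A threshold t stays in the version space iff it labels every
    # shown example correctly.
    return all(int(x >= t) == y for x, y in examples)

def teach_adaptively(target_t, learner_t):
    """Greedy adaptive teacher: each step, show the labeled example that
    most shrinks the version space; the learner then jumps to the
    consistent hypothesis nearest its current one (local preference)."""
    examples = []
    while learner_t != target_t:
        best = min(((x, int(x >= target_t)) for x in domain),
                   key=lambda ex: sum(consistent(t, examples + [ex])
                                      for t in thresholds))
        examples.append(best)
        version = [t for t in thresholds if consistent(t, examples)]
        # local preference: smallest move from the current hypothesis
        learner_t = min(version, key=lambda t: abs(t - learner_t))
    return examples

demo = teach_adaptively(target_t=4, learner_t=9)
print(f"taught in {len(demo)} example(s)")
```

Because the teacher observes the learner's state after every update, it can stop as soon as the local-preference learner lands on the target, which is the kind of saving adaptivity buys in the paper's model.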
RayMVSNet++: Learning Ray-based 1D Implicit Fields for Accurate Multi-View Stereo
Learning-based multi-view stereo (MVS) has so far centered on 3D
convolution on cost volumes. Due to the high computation and memory consumption
of 3D CNN, the resolution of output depth is often considerably limited.
Different from most existing works dedicated to adaptive refinement of cost
volumes, we opt to directly optimize the depth value along each camera ray,
mimicking the range finding of a laser scanner. This reduces the MVS problem to
ray-based depth optimization which is much more light-weight than full cost
volume optimization. In particular, we propose RayMVSNet which learns
sequential prediction of a 1D implicit field along each camera ray with the
zero-crossing point indicating scene depth. This sequential modeling, conducted
based on transformer features, essentially learns the epipolar line search in
traditional multi-view stereo. We devise a multi-task learning scheme for
better optimization convergence and depth accuracy. We find that the
monotonicity property of the SDFs along each ray greatly benefits depth
estimation. Our method
ranks first on both the DTU and the Tanks & Temples datasets among all previous
learning-based methods, achieving an overall reconstruction score of 0.33mm on
DTU and an F-score of 59.48% on Tanks & Temples. It is able to produce
high-quality depth estimation and point cloud reconstruction in challenging
scenarios such as objects/scenes with non-textured surfaces, severe occlusion,
and highly varying depth range. Further, we propose RayMVSNet++ to enhance
contextual feature aggregation for each ray through designing an attentional
gating unit to select semantically relevant neighboring rays within the local
frustum around that ray. RayMVSNet++ achieves state-of-the-art performance on
the ScanNet dataset. In particular, it attains an AbsRel of 0.058m and produces
accurate results on the two subsets of textureless regions and large depth
variation.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence. arXiv
admin note: substantial text overlap with arXiv:2204.0132
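The core idea of reducing MVS to ray-based depth optimization, with the zero-crossing of a 1D implicit field marking the surface, can be illustrated in a few lines. This is a hand-rolled sketch, not RayMVSNet's learned predictor: the function name, the sample values, and the linear interpolation are all illustrative assumptions, relying only on the monotonicity property the abstract mentions.

```python
import numpy as np

def ray_depth_from_sdf(depths, sdf):
    """Locate the zero-crossing of a 1D signed field sampled at
    increasing depths along one camera ray.

    Assumes the field is monotonically decreasing along the ray
    (positive in front of the surface, negative behind it), so there is
    a single sign change bracketing the true depth."""
    depths = np.asarray(depths, dtype=float)
    sdf = np.asarray(sdf, dtype=float)
    sign = np.signbit(sdf)            # True where the field is negative
    i = int(np.argmax(sign)) - 1      # last sample still in front
    # linear interpolation between the two bracketing samples
    t = sdf[i] / (sdf[i] - sdf[i + 1])
    return float(depths[i] + t * (depths[i + 1] - depths[i]))

# Samples of a field whose surface sits at depth 1.8
est = ray_depth_from_sdf([1.0, 1.5, 2.0, 2.5], [0.8, 0.3, -0.2, -0.7])
print(est)  # ~1.8
```

The appeal of this formulation is that each ray is a cheap 1D problem, so per-ray depth can be refined far beyond the resolution a full 3D cost volume would allow.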
Leveraging Large Language Models in Conversational Recommender Systems
A Conversational Recommender System (CRS) offers increased transparency and
control to users by enabling them to engage with the system through a real-time
multi-turn dialogue. Recently, Large Language Models (LLMs) have exhibited an
unprecedented ability to converse naturally and incorporate world knowledge and
common-sense reasoning into language understanding, unlocking the potential of
this paradigm. However, effectively leveraging LLMs within a CRS introduces new
technical challenges, including properly understanding and controlling a
complex conversation and retrieving from external sources of information. These
issues are exacerbated by a large, evolving item corpus and a lack of
conversational data for training. In this paper, we provide a roadmap for
building an end-to-end large-scale CRS using LLMs. In particular, we propose
new implementations for user preference understanding, flexible dialogue
management and explainable recommendations as part of an integrated
architecture powered by LLMs. For improved personalization, we describe how an
LLM can consume interpretable natural language user profiles and use them to
modulate session-level context. To overcome conversational data limitations in
the absence of an existing production CRS, we propose techniques for building a
controllable LLM-based user simulator to generate synthetic conversations. As a
proof of concept, we introduce RecLLM, a large-scale CRS for YouTube videos
built on LaMDA, and demonstrate its fluency and diverse functionality through
illustrative example conversations.
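The user-simulator idea above (an LLM plays the user so that synthetic training conversations can be generated without a production CRS) boils down to a simple alternating loop. The sketch below stubs out both sides with deterministic functions; `simulate_user_turn` and `crs_respond` are hypothetical placeholders where real systems would prompt an LLM conditioned on a natural language user profile.

```python
# Minimal skeleton of a controllable user-simulator loop for generating
# synthetic CRS conversations. All function names here are illustrative
# stand-ins, not RecLLM's actual API.

def simulate_user_turn(profile, history):
    # Real version: prompt a simulator LLM with the user profile and the
    # dialogue so far. Here, a canned script stands in.
    script = ["I want a short cooking video.",
              "Something vegetarian, please.",
              "That works, thanks!"]
    return script[min(len(history) // 2, len(script) - 1)]

def crs_respond(utterance, history):
    # Placeholder for the LLM-powered CRS (preference understanding,
    # retrieval, dialogue management, explanation).
    return "Recommending based on: " + repr(utterance)

def generate_synthetic_conversation(profile, turns=3):
    history = []
    for _ in range(turns):
        u = simulate_user_turn(profile, history)
        history.append(("user", u))
        history.append(("system", crs_respond(u, history)))
    return history

convo = generate_synthetic_conversation({"interests": ["cooking"]})
print(len(convo), "messages")  # 3 user turns + 3 system turns
```

Making the simulator controllable (e.g., by conditioning it on an interpretable profile, as the abstract describes) is what lets the generated conversations cover targeted preference patterns rather than arbitrary chit-chat.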