Enhancing Topic Extraction in Recommender Systems with Entropy Regularization
In recent years, many recommender systems have utilized textual data for
topic extraction to enhance interpretability. However, our findings reveal a
noticeable deficiency in the coherence of keywords within topics, resulting in
low explainability of the model. This paper introduces a novel approach called
entropy regularization to address the issue, leading to more interpretable
topics extracted from recommender systems, while ensuring that the performance
of the primary task stays competitively strong. The effectiveness of the
strategy is validated through experiments on a variation of the probabilistic
matrix factorization model that utilizes textual data to extract item
embeddings. The experimental results show a significant improvement in topic
coherence, which is quantified by cosine similarity on word embeddings.
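The coherence measure mentioned above can be illustrated with a minimal sketch. It assumes that coherence is the mean pairwise cosine similarity between the embeddings of a topic's keywords; the abstract does not give the exact formula, so this is one plausible reading and not the authors' implementation. The names `topic_coherence`, `entropy`, and `toy_embeddings` are hypothetical, and the two-dimensional vectors are made-up toy data. The `entropy` helper only shows the Shannon entropy of a topic-word distribution, the kind of quantity an entropy regularizer would presumably act on; how the paper applies it is not stated here.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def topic_coherence(keywords, embeddings):
    """Mean pairwise cosine similarity over a topic's keywords (assumed definition)."""
    pairs = list(combinations(keywords, 2))
    total = sum(cosine_similarity(embeddings[a], embeddings[b]) for a, b in pairs)
    return total / len(pairs)


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution, skipping zero entries."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Toy, hand-made embeddings used only for illustration.
toy_embeddings = {
    "guitar": [1.0, 0.0],
    "piano": [0.8, 0.6],
    "invoice": [0.0, 1.0],
}

coherent = topic_coherence(["guitar", "piano"], toy_embeddings)
mixed = topic_coherence(["guitar", "piano", "invoice"], toy_embeddings)
```

In this toy setting the topic that mixes in an unrelated keyword scores lower than the two-keyword music topic, and a uniform topic-word distribution has higher entropy than a peaked one.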
CN-Celeb-AV: A Multi-Genre Audio-Visual Dataset for Person Recognition
Audio-visual person recognition (AVPR) has received extensive attention.
However, most datasets used for AVPR research so far are collected in
constrained environments, and thus cannot reflect the true performance of AVPR
systems in real-world scenarios. To meet the demand for research on AVPR in
unconstrained conditions, this paper presents a multi-genre AVPR dataset
collected `in the wild', named CN-Celeb-AV. This dataset contains more than
419k video segments from 1,136 persons from public media. In particular, we put
more emphasis on two real-world complexities: (1) data in multiple genres; (2)
segments with partial information. A comprehensive study was conducted to
compare CN-Celeb-AV with two popular public AVPR benchmark datasets, and the
results demonstrated that CN-Celeb-AV is more in line with real-world scenarios
and can be regarded as a new benchmark dataset for AVPR research. The dataset
also includes a development set that can be used to boost the performance of
AVPR systems in real-life situations. The dataset is free for researchers and
can be downloaded from http://cnceleb.org/.
Comment: INTERSPEECH 202
RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation
Robots need to explore their surroundings to adapt to and tackle tasks in
unknown environments. Prior work has proposed building scene graphs of the
environment but typically assumes that the environment is static, omitting
regions that require active interactions. This severely limits their ability to
handle more complex tasks in household and office environments: before setting
up a table, robots must explore drawers and cabinets to locate all utensils and
condiments. In this work, we introduce the novel task of interactive scene
exploration, wherein robots autonomously explore environments and produce an
action-conditioned scene graph (ACSG) that captures the structure of the
underlying environment. The ACSG accounts for both low-level information, such
as geometry and semantics, and high-level information, such as the
action-conditioned relationships between different entities in the scene. To
this end, we present the Robotic Exploration (RoboEXP) system, which
incorporates the Large Multimodal Model (LMM) and an explicit memory design to
enhance our system's capabilities. The robot reasons about what to explore and
how to explore it, accumulating new information through the interaction process
and incrementally constructing the ACSG. We apply our system across various
real-world settings in a zero-shot manner, demonstrating its effectiveness in
exploring and modeling environments it has never seen before. Leveraging the
constructed ACSG, we illustrate the effectiveness and efficiency of our RoboEXP
system in facilitating a wide range of real-world manipulation tasks involving
rigid, articulated objects, nested objects like Matryoshka dolls, and
deformable objects like cloth.
Comment: Project Page: https://jianghanxiao.github.io/roboexp-web
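The idea of an action-conditioned scene graph described in this abstract can be sketched as a small data structure. This is a hypothetical illustration and not the RoboEXP implementation: the class `SceneGraph`, its methods, and the relation labels ("on", "affords", "reveals") are all assumptions made for the example. Objects and actions are both nodes; an action hangs off the object it applies to, and a hidden object hangs off the action that reveals it.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    # Node name -> "object" or "action".
    kinds: dict = field(default_factory=dict)
    # Node name -> (parent name, relation label).
    parent: dict = field(default_factory=dict)

    def add_object(self, name, parent=None, relation="on"):
        """Add a directly visible object, optionally attached to another object."""
        self.kinds[name] = "object"
        if parent is not None:
            self.parent[name] = (parent, relation)

    def add_action(self, action, target):
        """Add an action node that can be applied to an existing object."""
        self.kinds[action] = "action"
        self.parent[action] = (target, "affords")

    def add_revealed(self, name, action):
        """Add an object that only becomes observable after an action is taken."""
        self.kinds[name] = "object"
        self.parent[name] = (action, "reveals")

    def actions_to_reach(self, name):
        """Ordered list of actions needed before the named object is accessible."""
        actions = []
        node = name
        while node in self.parent:
            node, _ = self.parent[node]
            if self.kinds[node] == "action":
                actions.append(node)
        return list(reversed(actions))


# Nested-doll example: each doll must be opened to reveal the next one.
graph = SceneGraph()
graph.add_object("table")
graph.add_object("fork", parent="table")
graph.add_object("doll_large", parent="table")
graph.add_action("open_doll_large", target="doll_large")
graph.add_revealed("doll_medium", action="open_doll_large")
graph.add_action("open_doll_medium", target="doll_medium")
graph.add_revealed("doll_small", action="open_doll_medium")

plan = graph.actions_to_reach("doll_small")
```

Walking from an object back to the root collects the actions that condition its visibility, which is the sense in which such a graph could support downstream manipulation planning.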
Blood-coated sensor for high-throughput ptychographic cytometry on a Blu-ray disc
The Blu-ray drive is an engineering masterpiece that integrates disc rotation,
pickup head translation, and three lasers in a compact and portable format.
Here we integrate a blood-coated image sensor with a modified Blu-ray drive for
high-throughput cytometric analysis of various bio-specimens. In this device,
samples are mounted on the rotating Blu-ray disc and illuminated by the
built-in lasers from the pickup head. The resulting coherent diffraction
patterns are then recorded by the blood-coated image sensor. The rich spatial
features of the blood-cell monolayer help down-modulate the object information
for sensor detection, thus forming a high-resolution computational bio-lens
with a theoretically unlimited field of view. With the acquired data, we
develop a lensless coherent diffraction imaging modality termed rotational
ptychography for image reconstruction. We show that our device can resolve the
435 nm linewidth on the resolution target and has a field of view only limited
by the size of the Blu-ray disc. To demonstrate its applications, we perform
high-throughput urinalysis by locating disease-related calcium oxalate crystals
over the entire microscope slide. We also quantify different types of cells on
a blood smear with an acquisition speed of ~10,000 cells per second. For in
vitro experiments, we monitor live bacterial cultures over the entire Petri dish
with single-cell resolution. Using biological cells as a computational lens
could enable new intriguing imaging devices for point-of-care diagnostics.
Modifying a Blu-ray drive with the blood-coated sensor further allows the
spread of high-throughput optical microscopy from well-equipped laboratories to
citizen scientists worldwide.
Lensless polarimetric coded ptychography (pol-CP) for high-resolution, high-throughput birefringence imaging on a chip
Polarimetric imaging provides valuable insights into the polarization state
of light interacting with a sample. It can infer crucial birefringence
properties of bio-specimens without using any labels, thereby facilitating the
diagnosis of diseases such as cancer and osteoarthritis. In this study, we
introduce a novel polarimetric coded ptychography (pol-CP) approach that
enables high-resolution, high-throughput birefringence imaging on a chip. Our
platform deviates from traditional lens-based polarization systems by employing
an integrated polarimetric coded sensor for lensless diffraction data
acquisition. Utilizing Jones calculus, we quantitatively determine the
birefringence retardance and orientation information of bio-specimens from four
recovered intensity images. Our portable pol-CP prototype can resolve the
435-nm linewidth on the resolution target and the imaging field of view for a
single acquisition is limited only by the detector size of 41 mm^2. The
prototype allows for the acquisition of gigapixel birefringence images with a
180-mm^2 field of view in ~3.5 minutes, achieving an imaging throughput
comparable to that of a conventional whole slide scanner. To demonstrate its
biomedical applications, we perform high-throughput imaging of malaria-infected
blood smears, locating parasites using birefringence contrast. We also generate
birefringence maps of label-free thyroid smears to identify thyroid follicles.
Notably, the recovered birefringence maps emphasize the same regions as
autofluorescence images, indicating the potential for rapid on-site evaluation
of label-free biopsies. The reported approach offers a portable, turnkey
solution for high-resolution, high-throughput polarimetric analysis without
using lenses, with potential applications in disease diagnosis, sample
screening, and label-free chemical imaging.
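As background for the Jones calculus mentioned in this abstract, the sketch below models a birefringent sample as a textbook linear retarder with retardance `delta` and slow-axis orientation `theta`, and computes the intensity transmitted between crossed polarizers. This is a generic illustration of how retardance and orientation enter a Jones matrix; it is not the pol-CP recovery procedure, which the abstract does not detail. The function names and the sign convention of the rotation matrix are assumptions.

```python
import cmath
import math


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rotation(theta):
    """2x2 rotation matrix for angle theta in radians (one common convention)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def retarder(delta, theta):
    """Jones matrix of a linear retarder: R(-theta) * diag(e^{-i d/2}, e^{+i d/2}) * R(theta)."""
    phase = [[cmath.exp(-0.5j * delta), 0], [0, cmath.exp(0.5j * delta)]]
    return matmul2(rotation(-theta), matmul2(phase, rotation(theta)))


def crossed_polarizer_intensity(delta, theta):
    """Intensity after a vertical analyzer for unit horizontally polarized input."""
    jones = retarder(delta, theta)
    e_y = jones[1][0]  # vertical output component for input field [1, 0]
    return abs(e_y) ** 2


# Quarter-wave retardance with the axis at 45 degrees.
intensity = crossed_polarizer_intensity(math.pi / 2, math.pi / 4)
```

The result agrees with the closed form sin^2(2*theta) * sin^2(delta/2), which shows why a single intensity image cannot separate retardance from orientation and why several polarization-resolved images are needed.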