MatchZoo: A Learning, Practicing, and Developing System for Neural Text Matching
Text matching is the core problem in many natural language processing (NLP)
tasks, such as information retrieval, question answering, and conversation.
Recently, deep learning technology has been widely adopted for text matching,
making neural text matching a new and active research domain. With a large
number of neural matching models emerging rapidly, it becomes increasingly
difficult for researchers, especially newcomers, to learn and understand
these new models. Moreover, it is often difficult to try these models due to
tedious data pre-processing, complicated parameter configuration, and
numerous optimization tricks, not to mention that public code is sometimes
unavailable. Finally, for researchers who want to develop new models, it is
also no easy task to implement a neural text matching model from scratch and
to compare it with a range of existing models. In this paper, we therefore
present a novel system, namely MatchZoo, to facilitate the learning,
practicing, and designing of neural text matching models. The system consists
of a powerful matching library and a user-friendly, interactive studio, which
help researchers: 1) learn state-of-the-art neural text matching models
systematically; 2) train, test, and apply these models with simple
configurable steps; and 3) develop their own models with rich APIs and
assistance.
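As an illustration of the kind of model such a library covers (this is not MatchZoo's actual API), an interaction-based neural matcher builds a word-by-word similarity matrix between two texts and pools it into a matching score. The tiny embedding table below is a made-up stand-in for pretrained word vectors.

```python
import numpy as np

# Toy embedding table; in practice these would be pretrained word vectors.
EMB = {
    "cat": np.array([1.0, 0.0]),
    "dog": np.array([0.9, 0.1]),
    "car": np.array([0.0, 1.0]),
}

def match_score(text_a, text_b):
    """Interaction-based matching: cosine similarity matrix + pooling."""
    A = np.stack([EMB[w] for w in text_a])
    B = np.stack([EMB[w] for w in text_b])
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    sim = A @ B.T                         # word-by-word interaction matrix
    return float(sim.max(axis=1).mean())  # best match per word, averaged

print(match_score(["cat"], ["dog"]))  # high: similar vectors
print(match_score(["cat"], ["car"]))  # low: near-orthogonal vectors
```

Deep models such as those in MatchZoo replace the fixed pooling with learned convolutions or attention over the interaction matrix, but the input structure is the same.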
Learning Visual Features from Snapshots for Web Search
When applying learning to rank algorithms to Web search, a large number of
features are usually designed to capture the relevance signals. Most of these
features are computed based on the extracted textual elements, link analysis,
and user logs. However, Web pages are not solely linked texts, but have
structured layout organizing a large variety of elements in different styles.
Such layout itself can convey useful visual information, indicating the
relevance of a Web page. For example, the query-independent layout (i.e., raw
page layout) can help identify the page quality, while the query-dependent
layout (i.e., page rendered with matched query words) can further tell rich
structural information (e.g., size, position and proximity) of the matching
signals. However, such visual information of layout has been seldom utilized in
Web search in the past. In this work, we propose to learn rich visual features
automatically from the layout of Web pages (i.e., Web page snapshots) for
relevance ranking. Both query-independent and query-dependent snapshots are
considered as the new inputs. We then propose a novel visual perception model
inspired by humans' visual search behavior during page viewing to extract the
visual features. This model can be learned end-to-end together with traditional
human-crafted features. We also show that such visual features can be
efficiently acquired in the online setting with an extended inverted indexing
scheme. Experiments on benchmark collections demonstrate that learning visual
features from Web page snapshots can significantly improve the performance of
relevance ranking in ad-hoc Web retrieval tasks.
Comment: CIKM 201
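The query-dependent snapshot idea can be sketched as rendering a coarse page layout in which only the regions containing matched query terms are lit up, preserving the size and position of the matching signals. The grid size and term placements below are invented for illustration, not taken from the paper.

```python
import numpy as np

def query_dependent_snapshot(page_terms, query, shape=(8, 8)):
    """Render a toy query-dependent snapshot: a grid where cells covered by
    matched query terms are lit (1.0) and everything else stays dark (0.0).

    page_terms: list of (word, row, col) placements on a coarse layout grid.
    """
    snap = np.zeros(shape)
    qset = set(query)
    for word, r, c in page_terms:
        if word in qset:
            snap[r, c] = 1.0  # matching signal keeps its layout position
    return snap

page = [("deep", 0, 0), ("learning", 0, 1), ("news", 5, 3)]
snap = query_dependent_snapshot(page, ["deep", "learning"])
print(snap.sum())  # two cells lit: the two matched terms
```

A visual model can then consume such a rendering directly; a query-independent snapshot is simply the same grid with every term lit, regardless of the query.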
Multi-dimensional Fusion and Consistency for Semi-supervised Medical Image Segmentation
In this paper, we introduce a novel semi-supervised learning framework
tailored for medical image segmentation. Central to our approach is the
innovative Multi-scale Text-aware ViT-CNN Fusion scheme. This scheme adeptly
combines the strengths of both ViTs and CNNs, capitalizing on the unique
advantages of both architectures as well as the complementary information in
vision-language modalities. Further enriching our framework, we propose the
Multi-Axis Consistency framework for generating robust pseudo labels, thereby
enhancing the semi-supervised learning process. Our extensive experiments on
several widely-used datasets unequivocally demonstrate the efficacy of our
approach.
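A simplified view of consistency-based pseudo-labeling (not the paper's exact Multi-Axis Consistency formulation): predictions from several views or axes are averaged, and only voxels on which all views agree are kept as pseudo labels for the unlabeled data. The probability maps below are toy values.

```python
import numpy as np

def consensus_pseudo_labels(preds, thresh=0.5):
    """Average per-view probability maps and keep agreeing voxels.

    preds: list of probability maps (one per view/axis), all the same shape.
    Returns (labels, mask): binary labels plus a mask marking voxels where
    every view agrees with the averaged label - the only voxels used as
    pseudo labels during semi-supervised training.
    """
    stack = np.stack(preds)                    # (n_views, H, W)
    mean = stack.mean(axis=0)
    labels = (mean > thresh).astype(int)
    agree = np.all((stack > thresh) == labels.astype(bool), axis=0)
    return labels, agree

p1 = np.array([[0.9, 0.2], [0.6, 0.1]])
p2 = np.array([[0.8, 0.3], [0.4, 0.2]])
labels, mask = consensus_pseudo_labels([p1, p2])
```

Here the views disagree on one voxel (0.6 vs 0.4), so it is masked out rather than used as a noisy pseudo label.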
Novel feedback-Bayesian BP neural network combined with extended Kalman filtering for the battery state-of-charge estimation.
The state-of-charge (SOC) estimation of lithium-ion batteries plays an important role in real-time monitoring and safety. The high non-linearity of lithium-ion batteries makes accurate real-time estimation difficult. Taking the lithium-ion battery as the research object, its working characteristics are studied under various working conditions. To reduce the error caused by the non-linearity of the lithium battery system, a BP neural network, with its strong ability to approximate non-linear functions, is combined with extended Kalman filtering. At the same time, to eliminate over-fitting during training, Bayesian regularization is used to optimize the neural network. Taking into account the real-time requirements of lithium-ion batteries, a feedback network is adopted to integrate the algorithm in real time. A simulation model is established, and the results are analyzed under various working conditions. Experimental results show that the algorithm converges quickly, tracks well, and keeps the estimation error within 1.10%. This verifies that the feedback-Bayesian BP neural network combined with the extended Kalman filtering algorithm can improve the accuracy of lithium-ion battery state-of-charge estimation.
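A minimal scalar extended Kalman filter step of the kind such hybrid estimators build on: coulomb counting as the state transition, terminal voltage as the measurement. The linear open-circuit-voltage curve and all noise constants below are made-up stand-ins; in the paper's setup, a BP neural network would model the voltage mapping instead.

```python
def ekf_soc_step(soc, P, i_meas, v_meas, dt, Q=1e-6, R=1e-3,
                 capacity=3600.0, ocv=lambda s: 3.0 + 1.2 * s):
    """One predict/update step of a scalar EKF for state of charge.

    State transition: coulomb counting, soc' = soc - i*dt/capacity.
    Measurement: terminal voltage from a toy linear OCV curve (slope 1.2),
    so the measurement Jacobian H is simply that slope.
    """
    # Predict step
    soc_pred = soc - i_meas * dt / capacity
    P_pred = P + Q
    # Update step
    H = 1.2                                  # d(ocv)/d(soc) for the toy curve
    K = P_pred * H / (H * P_pred * H + R)    # Kalman gain
    soc_new = soc_pred + K * (v_meas - ocv(soc_pred))
    P_new = (1 - K * H) * P_pred
    return soc_new, P_new

soc, P = 0.5, 1e-2
soc, P = ekf_soc_step(soc, P, i_meas=1.0, v_meas=3.72, dt=1.0)
```

A measured voltage above the predicted one pulls the SOC estimate upward, and the covariance P shrinks as the measurement is absorbed, which is the correction the neural network's open-loop prediction lacks.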
Visual Named Entity Linking: A New Dataset and A Baseline
Visual Entity Linking (VEL) is a task to link regions of images with their
corresponding entities in Knowledge Bases (KBs), which is beneficial for many
computer vision tasks such as image retrieval, image captioning, and visual
question answering. However, existing VEL tasks either rely on textual data to
complement multi-modal linking or link objects only to general entities, and
thus fail to perform named entity linking on large amounts of image data. In
this paper, we consider a purely Visual-based Named Entity Linking (VNEL) task,
where the input only consists of an image. The task is to identify objects of
interest (i.e., visual entity mentions) in images and link them to
corresponding named entities in KBs. Since each entity often contains rich
visual and textual information in KBs, we thus propose three different
sub-tasks, i.e., visual to visual entity linking (V2VEL), visual to textual
entity linking (V2TEL), and visual to visual-textual entity linking (V2VTEL).
In addition, we present a high-quality human-annotated visual person linking
dataset, named WIKIPerson. Based on WIKIPerson, we establish a series of
baseline algorithms for the solution of each sub-task, and conduct experiments
to verify the quality of proposed datasets and the effectiveness of baseline
methods. We envision this work will help solicit more research on VNEL in the
future. The code and datasets are publicly available at
https://github.com/ict-bigdatalab/VNEL.
Comment: 13 pages, 11 figures, published to EMNLP 2022 (Findings)
L^2R: Lifelong Learning for First-stage Retrieval with Backward-Compatible Representations
First-stage retrieval is a critical task that aims to retrieve relevant
document candidates from a large-scale collection. While existing retrieval
models have achieved impressive performance, they are mostly studied on static
datasets, ignoring that in the real world, the data on the Web is continuously
growing with potential distribution drift. Consequently, retrievers trained on
static old data may not suit new-coming data well and inevitably produce
sub-optimal results. In this work, we study lifelong learning for first-stage
retrieval, especially focusing on the setting where the emerging documents are
unlabeled since relevance annotation is expensive and may not keep up with data
emergence. Under this setting, we aim to develop model updating with two goals:
(1) to effectively adapt to the evolving distribution with the unlabeled
new-coming data, and (2) to avoid re-inferring all embeddings of old documents
to efficiently update the index each time the model is updated.
We first formalize the task and then propose a novel Lifelong Learning method
for the first-stage Retrieval, namely L^2R. L^2R adopts the typical memory
mechanism for lifelong learning, and incorporates two crucial components: (1)
selecting diverse support negatives for model training and memory updating for
effective model adaptation, and (2) a ranking alignment objective to ensure the
backward-compatibility of representations to save the cost of index rebuilding
without hurting the model performance. For evaluation, we construct two new
benchmarks from LoTTE and Multi-CPR datasets to simulate the document
distribution drift in realistic retrieval scenarios. Extensive experiments show
that L^2R significantly outperforms competitive lifelong learning baselines.
Comment: accepted by CIKM202
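The backward-compatibility idea can be sketched as a ranking alignment penalty: the new model is penalized whenever it inverts a document pair that the old model ranked for a query, so old document embeddings remain usable without re-indexing. This hinge-style formulation and the toy scores are illustrative stand-ins, not L^2R's exact objective.

```python
def ranking_alignment_loss(old_scores, new_scores, margin=0.0):
    """Hinge penalty on document pairs whose relative order flips between
    the old and new model's scores for the same query (a simplified
    stand-in for a ranking alignment objective).
    """
    loss = 0.0
    n = len(old_scores)
    for i in range(n):
        for j in range(n):
            if old_scores[i] > old_scores[j]:
                # old model preferred doc i; keep that order in the new model
                loss += max(0.0, margin + new_scores[j] - new_scores[i])
    return loss

aligned = ranking_alignment_loss([3.0, 2.0, 1.0], [0.9, 0.5, 0.1])  # order kept
flipped = ranking_alignment_loss([3.0, 2.0, 1.0], [0.1, 0.5, 0.9])  # order inverted
```

Adding such a term to the adaptation loss lets the model fit the new distribution while keeping its scores compatible with embeddings already in the index.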