LS-DTKMS: A Local Search Algorithm for Diversified Top-k MaxSAT Problem
Maximum Satisfiability (MaxSAT), an important optimization problem, has a range of applications, including network routing, planning and scheduling, and combinatorial auctions. In these applications, one usually benefits from having not just a single solution but k diverse solutions. Motivated by this, we study an extension of MaxSAT, named the Diversified Top-k MaxSAT (DTKMS) problem, which is to find k feasible assignments of a given formula such that each assignment satisfies all hard clauses and all of them together satisfy the maximum number of soft clauses. This paper presents a local search algorithm, LS-DTKMS, for the DTKMS problem, which exploits novel scoring functions to select variables and assignments. Experiments demonstrate that LS-DTKMS outperforms the top-k MaxSAT based DTKMS solvers and state-of-the-art solvers for the diversified top-k clique problem.
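As a minimal sketch of the DTKMS objective described in this abstract (not the LS-DTKMS algorithm itself, whose scoring functions are not given here), the following evaluates a candidate set of k assignments. It assumes "all of them together satisfy" means a soft clause counts once if at least one assignment satisfies it; the clause encoding, the function names, and the `-1` return value for infeasible sets are illustrative choices, not taken from the paper.

```python
from typing import Dict, List

Clause = List[int]            # DIMACS-style literals: 3 means x3, -3 means NOT x3
Assignment = Dict[int, bool]  # variable index -> truth value


def satisfies(clause: Clause, assignment: Assignment) -> bool:
    """Return True if at least one literal of the clause is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def dtkms_objective(hard: List[Clause], soft: List[Clause],
                    assignments: List[Assignment]) -> int:
    """Count soft clauses satisfied by at least one assignment.

    Returns -1 (a hypothetical convention) if any assignment
    violates a hard clause, i.e. the set is infeasible.
    """
    for a in assignments:
        if not all(satisfies(c, a) for c in hard):
            return -1
    return sum(1 for c in soft if any(satisfies(c, a) for a in assignments))


# Toy instance: one hard clause (x1 OR x2), four unit soft clauses.
hard = [[1, 2]]
soft = [[1], [-1], [2], [-2]]
a1 = {1: True, 2: False}
a2 = {1: False, 2: True}
print(dtkms_objective(hard, soft, [a1]))      # one assignment alone
print(dtkms_objective(hard, soft, [a1, a2]))  # two diverse assignments together
```

On this toy instance the two complementary assignments jointly cover soft clauses that neither covers alone, which is the motivation for asking for k diverse solutions.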
Energy-efficient Connected Cruise Control with Lean Penetration of Connected Vehicles
This paper focuses on energy-efficient longitudinal controller design for a
connected automated truck that travels in mixed traffic consisting of connected
and non-connected vehicles. The truck has access to information about connected
vehicles beyond line of sight using vehicle-to-vehicle (V2V) communication. A
novel connected cruise control design is proposed which incorporates additional
delays into the control law when responding to distant connected vehicles to
account for the finite propagation of traffic waves. The speeds of
non-connected vehicles are modeled as stochastic processes. A fundamental
theorem is proven which links the spectral properties of the motion signals to
the average energy consumption. This enables us to tune controller parameters
and maximize energy efficiency. Simulations with synthetic data and real
traffic data are used to demonstrate the energy efficiency of the control
design. It is demonstrated that even with lean penetration of connected
vehicles, our controller can bring significant energy savings.
Comment: This is submitted to IEEE Transactions on Intelligent Transportation
Systems
Learning to Correct Noisy Labels for Fine-Grained Entity Typing via Co-Prediction Prompt Tuning
Fine-grained entity typing (FET) is an essential task in natural language
processing that aims to assign semantic types to entities in text. However, FET
poses a major challenge known as the noise labeling problem, whereby current
methods rely on estimating noise distribution to identify noisy labels but are
confused by diverse noise distribution deviation. To address this limitation,
we introduce Co-Prediction Prompt Tuning for noise correction in FET, which
leverages multiple prediction results to identify and correct noisy labels.
Specifically, we integrate prediction results to recall labeled labels and
utilize a differentiated margin to identify inaccurate labels. Moreover, we
design an optimization objective concerning divergent co-predictions during
fine-tuning, ensuring that the model captures sufficient information and
maintains robustness in noise identification. Experimental results on three
widely-used FET datasets demonstrate that our noise correction approach
significantly enhances the quality of various types of training samples,
including those annotated using distant supervision, ChatGPT, and
crowdsourcing.
Comment: Accepted by Findings of EMNLP 2023, 11 pages
A Boundary Offset Prediction Network for Named Entity Recognition
Named entity recognition (NER) is a fundamental task in natural language
processing that aims to identify and classify named entities in text. However,
span-based methods for NER typically assign entity types to text spans,
resulting in an imbalanced sample space and neglecting the connections between
non-entity and entity spans. To address these issues, we propose a novel
approach for NER, named the Boundary Offset Prediction Network (BOPN), which
predicts the boundary offsets between candidate spans and their nearest entity
spans. By leveraging the guiding semantics of boundary offsets, BOPN
establishes connections between non-entity and entity spans, enabling
non-entity spans to function as additional positive samples for entity
detection. Furthermore, our method integrates entity type and span
representations to generate type-aware boundary offsets instead of using entity
types as detection targets. We conduct experiments on eight widely-used NER
datasets, and the results demonstrate that our proposed BOPN outperforms
previous state-of-the-art methods.
Comment: Accepted by Findings of EMNLP 2023, 13 pages
A Study of Unsupervised Evaluation Metrics for Practical and Automatic Domain Adaptation
Unsupervised domain adaptation (UDA) methods facilitate the transfer of
models to target domains without labels. However, these methods necessitate a
labeled target validation set for hyper-parameter tuning and model selection.
In this paper, we aim to find an evaluation metric capable of assessing the
quality of a transferred model without access to target validation labels. We
begin with the metric based on mutual information of the model prediction.
Through empirical analysis, we identify three prevalent issues with this
metric: 1) It does not account for the source structure. 2) It can be easily
attacked. 3) It fails to detect negative transfer caused by the over-alignment
of source and target features. To address the first two issues, we incorporate
source accuracy into the metric and employ a new MLP classifier that is held
out during training, significantly improving the result. To tackle the final
issue, we integrate this enhanced metric with data augmentation, resulting in a
novel unsupervised UDA metric called the Augmentation Consistency Metric (ACM).
Additionally, we empirically demonstrate the shortcomings of previous
experiment settings and conduct large-scale experiments to validate the
effectiveness of our proposed metric. Furthermore, we employ our metric to
automatically search for the optimal hyper-parameter set, achieving superior
performance compared to manually tuned sets across four common benchmarks.
Code will be available soon.
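The abstract above starts from a metric based on the mutual information of the model prediction. One common way to estimate such a quantity from softmax outputs is the entropy of the average prediction minus the average per-sample entropy; the sketch below assumes that form, which may differ from the exact metric used in the paper. The function names and the `eps` smoothing constant are illustrative.

```python
import math
from typing import List


def entropy(p: List[float], eps: float = 1e-12) -> float:
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x + eps) for x in p)


def prediction_mutual_information(probs: List[List[float]]) -> float:
    """Estimate I(input; predicted label) from per-sample class probabilities.

    probs[i][j] is the predicted probability of class j for sample i.
    The estimate is H(mean prediction) - mean(H(prediction)).
    """
    n = len(probs)
    k = len(probs[0])
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    conditional = sum(entropy(row) for row in probs) / n
    return entropy(marginal) - conditional


# Confident and class-balanced predictions score high (about ln 2 for two classes).
confident = [[1.0, 0.0], [0.0, 1.0]]
# Uniform predictions carry no information about the input and score about 0.
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(prediction_mutual_information(confident))
print(prediction_mutual_information(uniform))
```

Because this score depends only on target predictions, it rewards any confident and balanced labeling, correct or not, which is consistent with the abstract's observation that the metric ignores source structure and can be easily attacked.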
Energy-efficient Reactive and Predictive Connected Cruise Control
In this paper, we propose a framework for the longitudinal control of
connected and automated vehicles traveling in mixed traffic consisting of
connected and non-connected human-driven vehicles. Reactive and predictive
controllers are proposed. Reactive controllers are given by explicit feedback
control laws. In predictive controllers, the control input is optimized in a
receding-horizon fashion, which depends on the predictions of motions of
preceding vehicles. Beyond-line-of-sight information is obtained via
vehicle-to-vehicle (V2V) communication, and is utilized in the proposed
reactive and predictive controllers. Simulations utilizing real traffic data
are used to show that connectivity can bring significant energy savings.
Comment: 18 pages, 12 figures, submitted to Transportation Research Part C:
Emerging Technologies
ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training
We propose ProtLLM, a versatile cross-modal large language model (LLM) for
both protein-centric and protein-language tasks. ProtLLM features a unique
dynamic protein mounting mechanism, enabling it to handle complex inputs where
the natural language text is interspersed with an arbitrary number of proteins.
Besides, we propose the protein-as-word language modeling approach to train
ProtLLM. By developing a specialized protein vocabulary, we equip the model
with the capability to predict not just natural language but also proteins from
a vast pool of candidates. Additionally, we construct a large-scale interleaved
protein-text dataset, named InterPT, for pre-training. This dataset
comprehensively encompasses both (1) structured data sources like protein
annotations and (2) unstructured data sources like biological research papers,
thereby endowing ProtLLM with crucial knowledge for understanding proteins. We
evaluate ProtLLM on classic supervised protein-centric tasks and explore its
novel protein-language applications. Experimental results demonstrate that
ProtLLM not only achieves superior performance against protein-specialized
baselines on protein-centric tasks but also induces zero-shot and in-context
learning capabilities on protein-language tasks.
Comment: https://protllm.github.io/project
Semantic Segmentation to Extract Coronary Arteries in Invasive Coronary Angiograms
Accurate semantic segmentation of each coronary artery using invasive coronary angiography (ICA) is important for stenosis assessment and coronary artery disease (CAD) diagnosis. In this paper, we propose a multi-step semantic segmentation algorithm based on analyzing arterial segments extracted from ICAs. The proposed algorithm first extracts the entire arterial binary mask (binary vascular tree) using a deep learning-based method. Then we extract the centerline of the binary vascular tree and separate it into different arterial segments. Finally, by extracting the underlying arterial topology, position, and pixel features, we construct a coronary artery segment classifier based on a support vector machine. Each arterial segment is classified into the left coronary artery (LCA), left anterior descending (LAD), and other types of arterial segments. The proposed method was tested on a dataset with 225 ICAs and achieved a mean accuracy of 70.33% for the multi-class artery classification and a mean intersection over union of 0.6868 for semantic segmentation of arteries. The experimental results show the effectiveness of the proposed algorithm for analyzing the individual arteries in ICAs.
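For readers unfamiliar with the mean intersection over union figure reported above, the following is a minimal sketch of how such a score is commonly computed from per-pixel class labels. It assumes flattened label lists and skips classes absent from both prediction and ground truth; the paper's exact averaging protocol is not stated in the abstract, and the function name is illustrative.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             classes: Sequence[int]) -> float:
    """Average per-class intersection over union of two label sequences."""
    ious: List[float] = []
    for c in classes:
        # Pixels labeled c in both maps, and pixels labeled c in either map.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that appear in neither map
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Four pixels, three classes (0 = background, 1 and 2 = two artery types).
pred = [0, 1, 1, 2]
target = [0, 1, 2, 2]
print(mean_iou(pred, target, classes=[0, 1, 2]))
```

Here class 0 matches exactly (IoU 1.0) while classes 1 and 2 each overlap on one of two pixels (IoU 0.5), so the mean is 2/3.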