37 research outputs found
Valid Inequalities and Facets for Multi-Module (Survivable) Capacitated Network Design Problem
In this dissertation, we develop new methodologies and algorithms to solve the multi-module (survivable) network design problem. Many real-world decision-making problems can be modeled as network design problems, especially on networks with capacity requirements on arcs or edges. In most cases, network design problems of this type that have been studied involve different types of capacity sizes (modules), and we call them the multi-module capacitated network design (MMND) problem. MMND problems arise in various industrial applications, such as transportation, telecommunication, power grid, data centers, and oil production, among many others.
In the first part of the dissertation, we study the polyhedral structure of the MMND problem. We summarize current literature on polyhedral study of MMND, which generates the family of the so-called cutset inequalities based on the traditional mixed integer rounding (MIR). We then introduce a new family of inequalities for MMND based on the so-called n-step MIR, and show that various classes of cutset inequalities in the literature are special cases of these inequalities. We do so by studying a mixed integer set, the cutset polyhedron, which is closely related to MMND. We We also study the strength of this family of inequalities by providing some facet-defining conditions. These inequalities are then tested on MMND instances, and our computational results show that these classes of inequalities are very effective for solving MMND problems. Generalizations of these inequalities for some variants of MMND are also discussed.
Network design problems have many generalizations depending on the application. In the second part of the dissertation, we study a highly applicable form of SND, referred to as multi-module SND (MM-SND), in which transmission capacities on edges can be sum of integer multiples of differently sized capacity modules. For the first time, we formulate MM-SND as a mixed integer program (MIP) using preconfigured-cycles (p-cycles) to reroute flow on failed edges. We derive several classes of valid inequalities for this MIP, and show that the valid inequalities previously developed in the literature for single-module SND are special cases of our inequalities. Furthermore, we show that our valid inequalities are facet-defining for MM-SND in many cases. Our computational results, using a heuristic separation algorithm, show that these inequalities are very effective in solving MM-SND. In particular they are more effective than compared to using single-module inequalities alone.
Lastly, we generalize the inequalities for MMND for other mixed integer sets relaxed from MMND and the cutset polyhedron. These inequalities also generalize several valid inequalities in the literature. We conclude the dissertation by summarizing the work and pointing out potential directions for future research
Optimum Search Schemes for Approximate String Matching Using Bidirectional FM-Index
Finding approximate occurrences of a pattern in a text using a full-text
index is a central problem in bioinformatics and has been extensively
researched. Bidirectional indices have opened new possibilities in this regard
allowing the search to start from anywhere within the pattern and extend in
both directions. In particular, use of search schemes (partitioning the pattern
and searching the pieces in certain orders with given bounds on errors) can
yield significant speed-ups. However, finding optimal search schemes is a
difficult combinatorial optimization problem.
Here for the first time, we propose a mixed integer program (MIP) capable to
solve this optimization problem for Hamming distance with given number of
pieces. Our experiments show that the optimal search schemes found by our MIP
significantly improve the performance of search in bidirectional FM-index upon
previous ad-hoc solutions. For example, approximate matching of 101-bp Illumina
reads (with two errors) becomes 35 times faster than standard backtracking.
Moreover, despite being performed purely in the index, the running time of
search using our optimal schemes (for up to two errors) is comparable to the
best state-of-the-art aligners, which benefit from combining search in index
with in-text verification using dynamic programming. As a result, we anticipate
a full-fledged aligner that employs an intelligent combination of search in the
bidirectional FM-index using our optimal search schemes and in-text
verification using dynamic programming outperforms today's best aligners. The
development of such an aligner, called FAMOUS (Fast Approximate string Matching
using OptimUm search Schemes), is ongoing as our future work
Together We Make Sense -- Learning Meta-Sense Embeddings from Pretrained Static Sense Embeddings
Sense embedding learning methods learn multiple vectors for a given ambiguous
word, corresponding to its different word senses. For this purpose, different
methods have been proposed in prior work on sense embedding learning that use
different sense inventories, sense-tagged corpora and learning methods.
However, not all existing sense embeddings cover all senses of ambiguous words
equally well due to the discrepancies in their training resources. To address
this problem, we propose the first-ever meta-sense embedding method --
Neighbour Preserving Meta-Sense Embeddings, which learns meta-sense embeddings
by combining multiple independently trained source sense embeddings such that
the sense neighbourhoods computed from the source embeddings are preserved in
the meta-embedding space. Our proposed method can combine source sense
embeddings that cover different sets of word senses. Experimental results on
Word Sense Disambiguation (WSD) and Word-in-Context (WiC) tasks show that the
proposed meta-sense embedding method consistently outperforms several
competitive baselines.Comment: Accepted to Findings of ACL 202
An image is worth 1000 lies: adversarial transferability across prompts on vision-language models
Different from traditional task-specific vision models, recent large VLMs can readily adapt to different vision tasks by simply using different textual instructions, i.e., prompts. However, a well-known concern about traditional task-specific vision models is that they can be misled by imperceptible adversarial perturbations. Furthermore, the concern is exacerbated by the phenomenon that the same adversarial perturbations can fool different task-specific models. Given that VLMs rely on prompts to adapt to different tasks, an intriguing question emerges: Can a single adversarial image mislead all predictions of VLMs when a thousand different prompts are given? This question essentially introduces a novel perspective on adversarial transferability: cross-prompt adversarial transferability. In this work, we propose the Cross-Prompt Attack (CroPA). This proposed method updates the visual adversarial perturbation with learnable textual prompts, which are designed to counteract the misleading effects of the adversarial image. By doing this, CroPA significantly improves the transferability of adversarial examples across prompts. Extensive experiments are conducted to verify the strong cross-prompt adversarial transferability of CroPA with prevalent VLMs including Flamingo, BLIP-2, and InstructBLIP in various different tasks
Noisy Correspondence Learning with Meta Similarity Correction
Despite the success of multimodal learning in cross-modal retrieval task, the
remarkable progress relies on the correct correspondence among multimedia data.
However, collecting such ideal data is expensive and time-consuming. In
practice, most widely used datasets are harvested from the Internet and
inevitably contain mismatched pairs. Training on such noisy correspondence
datasets causes performance degradation because the cross-modal retrieval
methods can wrongly enforce the mismatched data to be similar. To tackle this
problem, we propose a Meta Similarity Correction Network (MSCN) to provide
reliable similarity scores. We view a binary classification task as the
meta-process that encourages the MSCN to learn discrimination from positive and
negative meta-data. To further alleviate the influence of noise, we design an
effective data purification strategy using meta-data as prior knowledge to
remove the noisy samples. Extensive experiments are conducted to demonstrate
the strengths of our method in both synthetic and real-world noises, including
Flickr30K, MS-COCO, and Conceptual Captions.Comment: Accepted at CVPR 202
Model-agnostic origin attribution of generated images with few-shot examples
Recent progress in visual generative models enables the generation of high-quality images. To prevent the misuse of generated images, it is important to identify the origin model that generates them. In this work, we study the origin attribution of generated images in a practical setting where only a few images generated by a source model are available and the source model cannot be accessed. The goal is to check if a given image is generated by the source model. We first formulate this problem as a few-shot one-class classification task. To solve the task, we propose OCC-CLIP, a CLIP-based framework for few-shot oneclass classification, enabling the identification of an image’s source model, even among multiple candidates. Extensive experiments corresponding to various generative models verify the effectiveness of our OCC-CLIP framework. Furthermore, an experiment based on the recently released DALL ·E-3 API verifies the real-world applicability of our solution
Noise-Tolerant Learning for Audio-Visual Action Recognition
Recently, video recognition is emerging with the help of multi-modal
learning, which focuses on integrating distinct modalities to improve the
performance or robustness of the model. Although various multi-modal learning
methods have been proposed and offer remarkable recognition results, almost all
of these methods rely on high-quality manual annotations and assume that
modalities among multi-modal data provide semantically relevant information.
Unfortunately, the widely used video datasets are usually coarse-annotated or
collected from the Internet. Thus, it inevitably contains a portion of noisy
labels and noisy correspondence. To address this challenge, we use the
audio-visual action recognition task as a proxy and propose a noise-tolerant
learning framework to find anti-interference model parameters against both
noisy labels and noisy correspondence. Specifically, our method consists of two
phases that aim to rectify noise by the inherent correlation between
modalities. First, a noise-tolerant contrastive training phase is performed to
make the model immune to the possible noisy-labeled data. To alleviate the
influence of noisy correspondence, we propose a cross-modal noise estimation
component to adjust the consistency between different modalities. As the noisy
correspondence existed at the instance level, we further propose a
category-level contrastive loss to reduce its interference. Second, in the
hybrid-supervised training phase, we calculate the distance metric among
features to obtain corrected labels, which are used as complementary
supervision to guide the training. Extensive experiments on a wide range of
noisy levels demonstrate that our method significantly improves the robustness
of the action recognition model and surpasses the baselines by a clear margin.Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessibl
Learning Edge Representations via Low-Rank Asymmetric Projections
We propose a new method for embedding graphs while preserving directed edge
information. Learning such continuous-space vector representations (or
embeddings) of nodes in a graph is an important first step for using network
information (from social networks, user-item graphs, knowledge bases, etc.) in
many machine learning tasks.
Unlike previous work, we (1) explicitly model an edge as a function of node
embeddings, and we (2) propose a novel objective, the "graph likelihood", which
contrasts information from sampled random walks with non-existent edges.
Individually, both of these contributions improve the learned representations,
especially when there are memory constraints on the total size of the
embeddings. When combined, our contributions enable us to significantly improve
the state-of-the-art by learning more concise representations that better
preserve the graph structure.
We evaluate our method on a variety of link-prediction task including social
networks, collaboration networks, and protein interactions, showing that our
proposed method learn representations with error reductions of up to 76% and
55%, on directed and undirected graphs. In addition, we show that the
representations learned by our method are quite space efficient, producing
embeddings which have higher structure-preserving accuracy but are 10 times
smaller