Region-Based Image Retrieval Revisited
The region-based image retrieval (RBIR) technique is revisited. In early attempts
at RBIR in the late 1990s, researchers found many ways to specify region-based
queries and spatial relationships; however, the methods for characterizing
regions, such as color histograms, were very poor at that time. Here,
we revisit RBIR by incorporating semantic specification of objects and
intuitive specification of spatial relationships. Our contributions are the
following. First, to support multiple aspects of semantic object specification
(category, instance, and attribute), we propose a multitask CNN feature that
allows us to use deep learning techniques and to jointly handle multi-aspect
object specification. Second, to help users specify spatial relationships among
objects in an intuitive way, we propose techniques for recommending spatial
relationships. In particular, by mining the search results, the system can
recommend feasible spatial relationships among the objects. The system can also
recommend likely spatial relationships from the assigned object category names,
based on a language prior. Moreover, object-level inverted indexing supports very fast
shortlist generation, and re-ranking based on spatial constraints provides
users with instant RBIR experiences.
Comment: To appear in ACM Multimedia 2017 (Oral
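The two-stage search described in this abstract (an object-level inverted index for shortlisting, then re-ranking by a spatial constraint) can be illustrated with a minimal sketch. This is not the authors' implementation: the data layout, the function names, and the single "left of" constraint on box centers are all assumptions made for illustration, and the CNN features are replaced by plain category labels.

```python
from collections import defaultdict


def build_index(images):
    """Map each object label to the set of image ids that contain it."""
    index = defaultdict(set)
    for image_id, objects in images.items():
        for label, _box in objects:
            index[label].add(image_id)
    return index


def center_x(box):
    """Horizontal center of a box given as (x0, y0, x1, y1)."""
    x0, _y0, x1, _y1 = box
    return (x0 + x1) / 2


def shortlist(index, labels):
    """Images that contain every queried label (set intersection)."""
    sets = [index.get(label, set()) for label in labels]
    return set.intersection(*sets) if sets else set()


def is_left_of(objects, left_label, right_label):
    """True if some left_label box is centered left of some right_label box."""
    lefts = [center_x(b) for label, b in objects if label == left_label]
    rights = [center_x(b) for label, b in objects if label == right_label]
    return any(lx < rx for lx in lefts for rx in rights)


def search(images, index, left_label, right_label):
    """Shortlist by labels, then put images meeting the constraint first."""
    candidates = shortlist(index, [left_label, right_label])
    return sorted(
        candidates,
        key=lambda i: (not is_left_of(images[i], left_label, right_label), i),
    )


# Hypothetical toy database: image id -> list of (label, box).
images = {
    "a": [("dog", (0, 0, 10, 10)), ("ball", (20, 0, 30, 10))],
    "b": [("ball", (0, 0, 10, 10)), ("dog", (20, 0, 30, 10))],
    "c": [("dog", (0, 0, 10, 10))],
}
index = build_index(images)
results = search(images, index, "dog", "ball")
```

Here image "c" is dropped at the shortlist stage because it has no ball, and "a" is ranked ahead of "b" because only "a" has the dog to the left of the ball.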
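Determination of HIV-1 Vif Residues That Interact with CBFβ by Site-Directed Mutagenesis
Kyoto University 0048; new system, course-based doctorate; Doctor of Medical Science; Degree No. Kō 18881; Medical Doctorate No. 3992; 新制||医||1009 (University Library); 31832; Division of Medicine, Graduate School of Medicine, Kyoto University; examiners: Professor Yoshio Koyanagi (chief), Professor Masao Matsuoka, Professor Keizo Tomonaga; conferred under Article 4, Paragraph 1 of the Degree Regulations; Doctor of Medical Science; Kyoto University; DFA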
General and Practical Tuning Method for Off-the-Shelf Graph-Based Index: SISAP Indexing Challenge Report by Team UTokyo
Despite the efficacy of graph-based algorithms for Approximate Nearest
Neighbor (ANN) searches, the optimal tuning of such systems remains unclear.
This study introduces a method to tune the performance of off-the-shelf
graph-based indexes, focusing on the dimension of vectors, database size, and
entry points of graph traversal. We utilize a black-box optimization algorithm
to perform integrated tuning to meet the required levels of recall and Queries
Per Second (QPS). We applied our approach to Task A of the SISAP 2023 Indexing
Challenge and placed second in the 10M and 30M tracks. Our method improves
performance substantially compared to brute-force methods. This research offers
a universally applicable tuning method for graph-based indexes, extending
beyond the specific conditions of the competition to broader uses.
Comment: Accepted paper on 2nd place solution of SISAP 2023 Indexing Challenge Task
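The idea of tuning an index as a black box under a recall requirement can be sketched as follows. This is only an illustration under stated assumptions: plain random search stands in for whatever black-box optimizer the authors used, the parameter names `dim` and `ef_search` are hypothetical, and `toy_evaluate` is a made-up cost model rather than a measurement of a real index.

```python
import random


def tune(evaluate, space, min_recall, trials=50, seed=0):
    """Random search: maximize QPS among settings that reach min_recall.

    evaluate(params) must return a (recall, qps) pair.
    Returns (params, recall, qps) for the best feasible trial, or None.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        recall, qps = evaluate(params)
        if recall >= min_recall and (best is None or qps > best[2]):
            best = (params, recall, qps)
    return best


def toy_evaluate(params):
    """Made-up model: larger settings raise recall and lower throughput."""
    dim, ef = params["dim"], params["ef_search"]
    recall = min(1.0, 0.6 + dim / 1024 + ef / 512)
    qps = 100000 / (dim * ef)
    return recall, qps


# Hypothetical search space: reduced vector dimension and a search-time knob.
space = {"dim": [64, 128, 256], "ef_search": [16, 32, 64, 128]}
best = tune(toy_evaluate, space, min_recall=0.9)
```

In practice `evaluate` would build or query the actual index and measure recall and QPS on held-out queries; the constraint-then-maximize structure stays the same.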
Adversarial Doodles: Interpretable and Human-drawable Attacks Provide Describable Insights
DNN-based image classification models are susceptible to adversarial attacks.
Most previous adversarial attacks do not focus on the interpretability of the
generated adversarial examples, and we cannot gain insights into the mechanism
of the target classifier from the attacks. Therefore, we propose Adversarial
Doodles, which have interpretable shapes. We optimize black Bézier curves to
fool the target classifier by overlaying them onto the input image. By
introducing random perspective transformation and regularizing the doodled
area, we obtain compact attacks that cause misclassification even when humans
replicate them by hand. Adversarial doodles provide describable and intriguing
insights into the relationship between our attacks and the classifier's output.
We utilize adversarial doodles to discover biases inherent in the target
classifier, such as "We add two strokes on its head, a triangle onto its body,
and two lines inside the triangle on a bird image. Then, the classifier
misclassifies the image as a butterfly."
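Since the attack parameterizes strokes as Bézier curves, it may help to see how a curve is evaluated from its control points. The sketch below assumes cubic curves (the abstract does not state the degree) and uses the standard Bernstein form; the function names are illustrative, and the rasterization and optimization steps of the attack are not shown.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    # Bernstein basis weights for degree 3.
    w0, w1, w2, w3 = u**3, 3 * u**2 * t, 3 * u * t**2, t**3
    x = w0 * p0[0] + w1 * p1[0] + w2 * p2[0] + w3 * p3[0]
    y = w0 * p0[1] + w1 * p1[1] + w2 * p2[1] + w3 * p3[1]
    return (x, y)


def sample_curve(control_points, num_samples=20):
    """Evenly spaced parameter samples along one stroke."""
    p0, p1, p2, p3 = control_points
    return [
        cubic_bezier(p0, p1, p2, p3, i / (num_samples - 1))
        for i in range(num_samples)
    ]


# Hypothetical stroke in normalized image coordinates.
stroke = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
points = sample_curve(stroke)
```

Because the curve is a smooth function of its control points, the control points are natural variables to optimize, and a person can redraw the resulting stroke by hand.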
Fast Partitioned Learned Bloom Filter
A Bloom filter is a memory-efficient data structure for approximate
membership queries used in numerous fields of computer science. Recently,
learned Bloom filters that achieve better memory efficiency using machine
learning models have attracted attention. One such filter, the partitioned
learned Bloom filter (PLBF), achieves excellent memory efficiency. However,
PLBF requires $\mathcal{O}(N^3 k)$ time to construct the data structure,
where $N$ and $k$ are the hyperparameters of PLBF. One can improve memory
efficiency by increasing $N$, but the construction time becomes extremely long.
Thus, we propose two methods that can reduce the construction time while
maintaining the memory efficiency of PLBF. First, we propose fast PLBF, which
can construct the same data structure as PLBF with a smaller time complexity
$\mathcal{O}(N^2 k)$. Second, we propose fast PLBF++, which can construct the data
structure with even smaller time complexity $\mathcal{O}(N k \log N + N k^2)$. Fast PLBF++
does not necessarily construct the same data structure as PLBF. Still, it is
almost as memory efficient as PLBF, and it is proved that fast PLBF++ has the
same data structure as PLBF when the distribution satisfies a certain
constraint. Our experimental results from real-world datasets show that (i)
fast PLBF and fast PLBF++ can construct the data structure up to 233 and 761
times faster than PLBF, (ii) fast PLBF can achieve the same memory efficiency
as PLBF, and (iii) fast PLBF++ can achieve almost the same memory efficiency as
PLBF.
Comment: NeurIPS 202
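As background for the abstract above, here is a minimal sketch of the classic (non-learned) Bloom filter that learned variants such as PLBF build on. It is not PLBF itself: there is no learned model and no partitioning. The class name, the SHA-256 double-hashing scheme, and the sizes are assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Approximate set membership: no false negatives, some false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from one digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


keys = ["alpha", "beta", "gamma"]
bf = BloomFilter(num_bits=1024, num_hashes=4)
for key in keys:
    bf.add(key)
```

Every inserted key is always reported as present; a key that was never inserted is usually, but not always, reported as absent. Learned Bloom filters aim to reach the same false positive rate with less memory.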
Defense-Prefix for Preventing Typographic Attacks on CLIP
Vision-language pre-training models (VLPs) have exhibited revolutionary
improvements in various vision-language tasks. In VLP, some adversarial attacks
fool a model into false or absurd classifications. Previous studies addressed
these attacks by fine-tuning the model or changing its architecture. However,
these methods risk losing the original model's performance and are difficult to
apply to downstream tasks. In particular, their applicability to other tasks
has not been considered. In this study, we reduce the
impact of typographic attacks on CLIP without changing the model parameters. To
achieve this, we expand the idea of "prefix learning" and introduce our
simple yet effective method: Defense-Prefix (DP), which inserts the DP token
before a class name to make words "robust" against typographic attacks. Our
method can be easily applied to downstream tasks, such as object detection,
because the proposed method is independent of the model parameters. Our method
significantly improves the accuracy of classification tasks for typographic
attack datasets, while maintaining the zero-shot capabilities of the model. In
addition, we leverage our proposed method for object detection, demonstrating
its high applicability and effectiveness. The code and datasets will be
publicly available.
Comment: Under revie