164 research outputs found
Multiple Instance Curriculum Learning for Weakly Supervised Object Detection
When supervising an object detector with weakly labeled data, most existing
approaches are prone to getting trapped in discriminative object parts, e.g.,
finding the face of a cat instead of its full body, because they lack
supervision on the extent of full objects. To address this challenge, we
incorporate object segmentation into the detector training, which guides the
model to correctly localize the full objects. We propose the multiple instance
curriculum learning (MICL) method, which injects curriculum learning (CL) into
the multiple instance learning (MIL) framework. The MICL method starts by
automatically picking the easy training examples, where the extent of the
segmentation masks agrees with the detection bounding boxes. The training set is
gradually expanded to include harder examples to train strong detectors that
handle complex images. The proposed MICL method with segmentation in the loop
outperforms the state-of-the-art weakly supervised object detectors by a
substantial margin on the PASCAL VOC datasets.
Comment: Published in BMVC 201
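The easy-example criterion described above — keeping images where the segmentation mask and the detection box agree — can be sketched as an IoU test. This is an illustrative reconstruction with hypothetical helper names and threshold, not the paper's exact implementation.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def select_easy_examples(examples, threshold=0.7):
    """Keep examples whose segmentation-derived box agrees with the
    detector's box (IoU above a curriculum threshold); the rest are
    deferred to later curriculum stages."""
    return [ex for ex in examples
            if iou(ex["seg_box"], ex["det_box"]) >= threshold]
```

Lowering `threshold` over training rounds would gradually admit the harder examples, matching the curriculum expansion the abstract describes.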
Large Foundation Models for Power Systems
Foundation models, such as Large Language Models (LLMs), can respond to a
wide range of format-free queries without any task-specific data collection or
model training, creating various research and application opportunities for the
modeling and operation of large-scale power systems. In this paper, we outline
how large foundation models such as GPT-4 are developed, and discuss how
they can be leveraged in challenging power and energy system tasks. We first
investigate the potential of existing foundation models by validating their
performance on four representative tasks across power system domains, including
the optimal power flow (OPF), electric vehicle (EV) scheduling, knowledge
retrieval for power engineering technical reports, and situation awareness. Our
results indicate strong capabilities of such foundation models in boosting the
efficiency and reliability of power system operational pipelines. We also
provide suggestions and projections on future deployment of foundation models
in power system applications.
Comment: Code available at https://github.com/chennnnnyize/LLM_PowerSystem
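The OPF task mentioned above can be illustrated, in a highly simplified form, by merit-order economic dispatch: serve demand from the cheapest generators first, ignoring network constraints. All generator data here is made up for illustration; real OPF adds power-flow and line-limit constraints.

```python
def merit_order_dispatch(gens, demand):
    """Dispatch the cheapest generators first until demand is met.
    gens: list of (cost_per_mwh, capacity_mw) tuples.
    Returns per-generator output in MW, in the input order."""
    out = [0.0] * len(gens)
    remaining = demand
    # Visit generators in ascending cost order (merit order).
    for i, (_, cap) in sorted(enumerate(gens), key=lambda t: t[1][0]):
        take = min(cap, max(0.0, remaining))
        out[i] = take
        remaining -= take
    return out
```

For example, with a 20 $/MWh, 100 MW unit and a 35 $/MWh, 80 MW unit serving 120 MW of load, the cheap unit runs at its 100 MW limit and the expensive unit covers the remaining 20 MW.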
Task-Adaptive Tokenization: Enhancing Long-Form Text Generation Efficacy in Mental Health and Beyond
We propose task-adaptive tokenization as a way to adapt the generation
pipeline to the specifics of a downstream task and enhance long-form generation
in mental health. Inspired by insights from cognitive science, our
task-adaptive tokenizer samples variable segmentations from multiple outcomes,
with sampling probabilities optimized based on task-specific data. We introduce
a strategy for building a specialized vocabulary and introduce a vocabulary
merging protocol that allows for the integration of task-specific tokens into
the pre-trained model's tokenization step. Through extensive experiments on
psychological question-answering tasks in both Chinese and English, we find
that our task-adaptive tokenization approach brings a significant improvement
in generation performance while using up to 60% fewer tokens. Preliminary
experiments point to promising results when using our tokenization approach
with very large language models.
Comment: Accepted at the main conference of The 2023 Conference on Empirical Methods in Natural Language Processing; 8 page
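The core idea above — sampling variable segmentations of a word with probabilities driven by task-specific token scores — can be sketched as a toy unigram-style sampler. The vocabulary, scores, and function names here are hypothetical stand-ins, not the paper's actual tokenizer.

```python
import random

def segmentations(word, vocab):
    """All ways to split `word` into tokens from `vocab` (toy recursion)."""
    if not word:
        return [[]]
    results = []
    for i in range(1, len(word) + 1):
        piece = word[:i]
        if piece in vocab:
            for rest in segmentations(word[i:], vocab):
                results.append([piece] + rest)
    return results

def sample_segmentation(word, vocab, rng=random):
    """Sample one segmentation with probability proportional to the
    product of its tokens' task-specific unigram scores."""
    segs = segmentations(word, vocab)
    weights = []
    for seg in segs:
        w = 1.0
        for tok in seg:
            w *= vocab[tok]
        weights.append(w)
    return rng.choices(segs, weights=weights, k=1)[0]
```

A task-specific vocabulary would upweight longer domain tokens, so frequent mental-health terms tend to be emitted as single tokens, which is one way fewer tokens per generation could be achieved.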
Instance Embedding Transfer to Unsupervised Video Object Segmentation
We propose a method for unsupervised video object segmentation by
transferring the knowledge encapsulated in image-based instance embedding
networks. The instance embedding network produces an embedding vector for each
pixel that enables identifying all pixels belonging to the same object. Though
trained on static images, the instance embeddings are stable over consecutive
video frames, which allows us to link objects together over time. Thus, we
adapt the instance networks trained on static images to video object
segmentation and incorporate the embeddings with objectness and optical flow
features, without model retraining or online fine-tuning. The proposed method
outperforms state-of-the-art unsupervised segmentation methods on the DAVIS
dataset and the FBMS dataset.
Comment: To appear in CVPR 201
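The temporal-linking idea above — instance embeddings staying stable across frames so the same object can be matched frame to frame — can be sketched as a nearest-neighbor match in embedding space. Function names and the similarity threshold are illustrative assumptions, not the paper's exact procedure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def link_object(prev_embedding, candidate_embeddings, min_sim=0.5):
    """Link an object to the most similar candidate embedding in the
    next frame; return its index, or None if nothing is similar enough."""
    best_i, best_s = None, min_sim
    for i, emb in enumerate(candidate_embeddings):
        s = cosine(prev_embedding, emb)
        if s > best_s:
            best_i, best_s = i, s
    return best_i
```

Because the embeddings come from a network trained only on static images, a matcher like this needs no retraining or online fine-tuning, consistent with the abstract's claim.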