RoadText-1K: Text Detection & Recognition Dataset for Driving Videos
Perceiving text is crucial to understanding the semantics of outdoor scenes and
hence is a critical requirement for building intelligent systems for driver
assistance and self-driving. Most of the existing datasets for text detection
and recognition comprise still images and are mostly compiled keeping text in
mind. This paper introduces a new "RoadText-1K" dataset for text in driving
videos. The dataset is 20 times larger than the existing largest dataset for
text in videos. Our dataset comprises 1000 video clips of driving without any
bias towards text and with annotations for text bounding boxes and
transcriptions in every frame. State of the art methods for text detection,
recognition and tracking are evaluated on the new dataset and the results
highlight the challenges of unconstrained driving videos compared to existing
datasets. This suggests that RoadText-1K is suited for research and development
of reading systems, robust enough to be incorporated into more complex
downstream tasks like driver assistance and self-driving. The dataset can be
found at http://cvit.iiit.ac.in/research/projects/cvit-projects/roadtext-1k
Comment: to be published in ICRA 202
Object-QA: Towards High Reliable Object Quality Assessment
In object recognition applications, object images usually appear with
different quality levels. In practice, it is important to indicate the quality
of object images for better application performance, e.g. filtering out
low-quality object image frames to maintain robust video object recognition
results and speed up inference. However, no previous work explicitly
addresses this problem. In this paper, we define the problem of
object quality assessment for the first time and propose an effective approach
named Object-QA to produce highly reliable quality scores for object images.
Concretely, Object-QA first employs a well-designed relative quality assessing
module that learns the intra-class-level quality scores by referring to the
difference between object images and their estimated templates. Then an
absolute quality assessing module is designed to generate the final quality
scores by aligning the quality score distributions across classes. In addition,
Object-QA can be implemented with only object-level annotations, and is also
easily deployed to a variety of object recognition tasks. To the best of our knowledge,
this is the first work to put forward the definition of this problem and
conduct quantitative evaluations. Validations on 5 different datasets show that
Object-QA can not only produce highly reliable quality scores that accord with human
cognition, but also improve application performance.
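
The two ideas in this abstract, aligning quality scores across classes and filtering out low-quality frames, can be illustrated with a minimal sketch. This is not the Object-QA method: per-class z-score normalisation is swapped in as a simple stand-in for the absolute quality assessing module, and the function names, the input format, and the threshold are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def align_scores_across_classes(samples):
    """Put per-class scores on a common scale (illustrative stand-in only).

    samples: list of (class_label, relative_score) pairs, where the relative
    score is assumed to be comparable only within its own class.
    Returns the z-scored values in the same order as the input.
    """
    by_class = defaultdict(list)
    for label, score in samples:
        by_class[label].append(score)

    # Per-class mean and standard deviation; guard against zero spread.
    stats = {}
    for label, scores in by_class.items():
        sigma = pstdev(scores)
        stats[label] = (mean(scores), sigma if sigma > 0 else 1.0)

    return [(score - stats[label][0]) / stats[label][1]
            for label, score in samples]


def filter_low_quality(frames, scores, threshold):
    """Keep only frames whose aligned score reaches the (assumed) threshold."""
    return [frame for frame, score in zip(frames, scores) if score >= threshold]


# Hypothetical example: two classes whose raw scores live on different scales.
samples = [("car", 0.2), ("car", 0.4), ("person", 10.0), ("person", 30.0)]
frames = ["f0", "f1", "f2", "f3"]
aligned = align_scores_across_classes(samples)
kept = filter_low_quality(frames, aligned, threshold=0.0)
```

After alignment, a single threshold can be applied to all classes, which is the practical reason for making scores comparable across classes before filtering.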