5,319 research outputs found
Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers
The massive amounts of digitized historical documents acquired over the last
decades naturally lend themselves to automatic processing and exploration.
Research work seeking to automatically process facsimiles and extract
information thereby are multiplying with, as a first essential step, document
layout analysis. If the identification and categorization of segments of
interest in document images have seen significant progress over the last years
thanks to deep learning techniques, many challenges remain with, among others,
the use of finer-grained segmentation typologies and the consideration of
complex, heterogeneous documents such as historical newspapers. Besides, most
approaches consider visual features only, ignoring textual signal. In this
context, we introduce a multimodal approach for the semantic segmentation of
historical newspapers that combines visual and textual features. Based on a
series of experiments on diachronic Swiss and Luxembourgish newspapers, we
investigate, among others, the predictive power of visual and textual features
and their capacity to generalize across time and sources. Results show
consistent improvement of multimodal models in comparison to a strong visual
baseline, as well as better robustness to high material variance
A Survey on Retrieval of Mathematical Knowledge
We present a short survey of the literature on indexing and retrieval of
mathematical knowledge, with pointers to 72 papers and tentative taxonomies of
both retrieval problems and recurring techniques.Comment: CICM 2015, 20 page
Automatic Palaeographic Exploration of Genizah Manuscripts
The Cairo Genizah is a collection of hand-written documents containing approximately
350,000 fragments of mainly Jewish texts discovered in the late 19th
century. The
fragments are today spread out in some 75 libraries and private collections worldwide,
but there is an ongoing effort to document and catalogue all extant fragments.
Palaeographic information plays a key role in the study of the Genizah collection.
Script style, andâmore specificallyâhandwriting, can be used to identify fragments that
might originate from the same original work. Such matched fragments, commonly
referred to as âjoinsâ, are currently identified manually by experts, and presumably only
a small fraction of existing joins have been discovered to date. In this work, we show
that automatic handwriting matching functions, obtained from non-specific features
using a corpus of writing samples, can perform this task quite reliably. In addition, we
explore the problem of grouping various Genizah documents by script style, without
being provided any prior information about the relevant styles. The automatically
obtained grouping agrees, for the most part, with the palaeographic taxonomy. In cases
where the method fails, it is due to apparent similarities between related scripts
User Review-Based Change File Localization for Mobile Applications
In the current mobile app development, novel and emerging DevOps practices
(e.g., Continuous Delivery, Integration, and user feedback analysis) and tools
are becoming more widespread. For instance, the integration of user feedback
(provided in the form of user reviews) in the software release cycle represents
a valuable asset for the maintenance and evolution of mobile apps. To fully
make use of these assets, it is highly desirable for developers to establish
semantic links between the user reviews and the software artefacts to be
changed (e.g., source code and documentation), and thus to localize the
potential files to change for addressing the user feedback. In this paper, we
propose RISING (Review Integration via claSsification, clusterIng, and
linkiNG), an automated approach to support the continuous integration of user
feedback via classification, clustering, and linking of user reviews. RISING
leverages domain-specific constraint information and semi-supervised learning
to group user reviews into multiple fine-grained clusters concerning similar
users' requests. Then, by combining the textual information from both commit
messages and source code, it automatically localizes potential change files to
accommodate the users' requests. Our empirical studies demonstrate that the
proposed approach outperforms the state-of-the-art baseline work in terms of
clustering and localization accuracy, and thus produces more reliable results.Comment: 15 pages, 3 figures, 8 table
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework which puts in evidence the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypothesis assumed and thus, the constraints imposed on the type of video
that each technique is able to address. Expliciting the hypothesis and
constraints makes the framework particularly useful to select a method, given
an application. Another advantage of the proposed organization is that it
allows categorizing newest approaches seamlessly with traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion in the end of
the paper, where we also present the main open issues in the area.Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4
table
- âŠ