Research Agenda in Intelligent Infrastructure to Enhance Disaster Management, Community Resilience and Public Safety
Modern societies can be understood as the intersection of four interdependent
systems: (1) the natural environment of geography, climate and weather; (2) the
built environment of cities, engineered systems, and physical infrastructure;
(3) the social environment of human populations, communities and socio-economic
activities; and (4) an information ecosystem that overlays the other three
domains and provides the means for understanding, interacting with, and
managing the relationships between the natural, built, and human environments.
As the nation and its communities become more connected, networked and
technologically sophisticated, new challenges and opportunities arise that
demand a rethinking of current approaches to public safety and emergency
management. Addressing the current and future challenges requires an equally
sophisticated program of research, technology development, and strategic
planning. The design and integration of intelligent infrastructure (including
embedded sensors, the Internet of Things (IoT), advanced wireless information
technologies, real-time data capture and analysis, and machine-learning-based
decision support) holds the potential to greatly enhance public safety,
emergency management, disaster recovery, and overall community resilience,
while addressing new and emerging threats to public safety and security.
Ultimately, the objective of this program of research and development is to
save lives, reduce risk and disaster impacts, permit efficient use of material
and social resources, and protect quality of life and economic stability across
entire regions.
Comment: A Computing Community Consortium (CCC) white paper, 4 pages
Multi-Character Field Recognition for Arabic and Chinese Handwriting
Two methods, Symbolic Indirect Correlation (SIC) and Style Constrained Classification (SCC), are proposed for recognizing handwritten Arabic and Chinese words and phrases. SIC reassembles variable-length segments of an unknown query that match similar segments of labeled reference words. Recognition is based on the correspondence between the order of the feature vectors and of the lexical transcript in both the query and the references. SIC implicitly incorporates language context in the form of letter n-grams. SCC is based on the notion that the style (distortion or noise) of a character is a good predictor of the distortions arising in other characters, even of a different class, from the same source. It is adaptive in the sense that, with a long enough field, its accuracy converges to that of a style-specific classifier trained on the writer of the unknown query. Neither SIC nor SCC requires the query words to appear among the references.
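As a rough illustration of the segment-matching idea behind SIC, the sketch below matches fixed-size windows of a query feature sequence against labeled reference sequences and assembles the aligned transcript fragments in reading order. This is a simplification, not the authors' method: the paper uses variable-length segments and letter n-gram context, and the proportional feature-to-transcript mapping here is an assumption made for brevity.

    # Simplified, illustrative sketch of SIC-style segment matching.
    # NOT the authors' implementation: fixed-size windows stand in for
    # the variable-length segments described in the abstract.
    import numpy as np

    def best_matching_segment(query_win, reference, transcript):
        """Slide query_win over a reference feature sequence and return
        the transcript characters aligned with the best match."""
        w = len(query_win)
        best_pos, best_dist = 0, np.inf
        for pos in range(len(reference) - w + 1):
            dist = np.linalg.norm(reference[pos:pos + w] - query_win)
            if dist < best_dist:
                best_pos, best_dist = pos, dist
        # Map feature positions to transcript positions proportionally
        # (a crude stand-in for SIC's order correspondence).
        chars_per_feat = len(transcript) / len(reference)
        start = int(best_pos * chars_per_feat)
        end = int((best_pos + w) * chars_per_feat)
        return transcript[start:end], best_dist

    def sic_like_decode(query, references, window=8):
        """Assemble a hypothesis for the query by matching each of its
        windows against all labeled (features, transcript) references,
        in reading order."""
        hypothesis = []
        for start in range(0, len(query) - window + 1, window):
            win = query[start:start + window]
            candidates = [best_matching_segment(win, ref, txt)
                          for ref, txt in references]
            hypothesis.append(min(candidates, key=lambda c: c[1])[0])
        return "".join(hypothesis)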
An Open Architecture for End-to-End Document Analysis Benchmarking
ISBN: 978-1-4577-1350-7
In this paper we present a fully operational, scalable, and open architecture that allows end-to-end document analysis benchmarking to be performed without developing the whole pipeline. By decomposing the analysis process into coarse-grained tasks, and by building upon community-provided state-of-the-art algorithms, our architecture allows virtually any combination of elementary document analysis algorithms, regardless of their runtime environment, programming language, or data structures. Its flexible structure makes it straightforward to plug in new experimental algorithms, compare them to equivalent algorithms, and observe their effects on end-to-end tasks, without needing to install, compile, or otherwise interact with any software other than one's own.
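As a hedged sketch of how a client might compose such a service-oriented pipeline, the snippet below chains three coarse-grained stages over HTTP, feeding each stage's output into the next. The endpoint URLs and payload format are invented for illustration; the platform defines its own interfaces.

    # Hypothetical client-side pipeline composition. The URLs below are
    # illustrative placeholders, not the platform's actual endpoints.
    import requests  # third-party HTTP client

    PIPELINE = [
        "https://example.org/services/binarize",       # hypothetical
        "https://example.org/services/segment-lines",  # hypothetical
        "https://example.org/services/ocr",            # hypothetical
    ]

    def run_pipeline(image_bytes):
        """Feed the output of each coarse-grained task into the next.
        Every stage is an independent web service, so the language and
        runtime behind any stage are interchangeable."""
        payload = image_bytes
        for url in PIPELINE:
            resp = requests.post(url, data=payload, timeout=60)
            resp.raise_for_status()
            payload = resp.content  # intermediate result passed downstream
        return payload

Swapping in an experimental algorithm then amounts to replacing one URL, which is the property this kind of architecture is designed around.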
A Platform for Storing, Visualizing, and Interpreting Collections of Noisy Documents
The goal of document image analysis is to produce interpretations that match those of a fluent and knowledgeable human viewing the same input. Because computer vision techniques are not perfect, the text that results from processing scanned pages is frequently noisy. Building on previous work, we propose a new paradigm for handling the inevitable incomplete, partial, erroneous, or slightly orthogonal interpretations that commonly arise in document datasets. Starting from the observation that interpretations depend on application context or user viewpoint, we describe a platform now under development that is capable of managing multiple interpretations for a document and offers an unprecedented level of interaction, so that users can freely build upon, extend, or correct existing interpretations. In this way, the system supports the creation of a continuously expanding and improving document analysis repository that can be used to support research in the field.
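One plausible way to model multiple interpretations per document is sketched below: each interpretation records who produced it, in what context, and which earlier interpretation it extends or corrects, so competing readings coexist instead of overwriting one another. The field names are assumptions made for illustration, not the platform's actual schema.

    # Illustrative data model for coexisting document interpretations.
    # Field names are assumptions, not the platform's schema.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Interpretation:
        content: str                   # e.g. a noisy OCR transcript
        author: str                    # user or algorithm that produced it
        context: str                   # application context / viewpoint
        parent: Optional["Interpretation"] = None  # what it corrects

    @dataclass
    class Document:
        image_path: str
        interpretations: List[Interpretation] = field(default_factory=list)

        def derive(self, base, content, author, context):
            """Record a correction or extension without discarding the
            original, so competing readings can coexist."""
            child = Interpretation(content, author, context, parent=base)
            self.interpretations.append(child)
            return child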
The Non-Geek's Guide to the DAE Platform
The Document Analysis and Exploitation (DAE) platform is a sophisticated technical environment that consists of a repository containing document images, implementations of document analysis algorithms, and the results of these algorithms when applied to data in the repository. The use of a web-services model makes it possible to set up document analysis pipelines that form the basis for reproducible protocols. Since the platform keeps track of all intermediate results, it becomes an information resource for the analysis of experimental data. This paper provides a tutorial on how to get started using the platform. It covers the technical details needed to overcome the initial hurdles and have a productive experience with DAE.
The DAE Platform: a Framework for Reproducible Research in Document Image Analysis
We present the DAE Platform in the specific context of reproducible research. DAE was developed at Lehigh University for the Document Image Analysis research community, to distribute document images and associated document analysis algorithms, along with an unlimited range of annotations and ground truth for benchmarking and evaluation of new contributions to the state of the art. DAE was conceived from the beginning with reproducibility and data provenance in mind. In this paper we analyze more specifically how this approach answers a number of challenges raised by the need to provide fully reproducible experimental research. Furthermore, since DAE has been up and running without interruption since 2010, we are in a position to provide a qualitative analysis of the technological choices made at the time, and to suggest some new perspectives in light of more recent technologies and practices.
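The sketch below illustrates the general provenance idea described above (every intermediate result remembers the algorithm, version, and inputs that produced it, so an experiment can be replayed). The names and storage model are illustrative assumptions, not DAE's actual data model.

    # Minimal provenance-tracking sketch; illustrative, not DAE's design.
    import hashlib

    def content_id(data: bytes) -> str:
        """Stable identifier for any stored artifact."""
        return hashlib.sha256(data).hexdigest()[:16]

    def record_step(store: dict, algorithm: str, version: str,
                    inputs: list, output: bytes) -> str:
        """Store an output together with the full recipe that produced
        it: algorithm name, version, and ids of upstream artifacts."""
        out_id = content_id(output)
        store[out_id] = {
            "algorithm": algorithm,
            "version": version,
            "inputs": inputs,   # ids of upstream artifacts
            "output": output,
        }
        return out_id

Replaying an experiment is then a matter of walking the recorded graph from the raw inputs forward; two runs agree when the recorded versions and inputs agree.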
Challenges for the Engineering Drawing Lehigh Steel Collection
The Lehigh Steel Collection (LSC) is an extremely large, heterogeneous set of documents dating from the 1960s through the 1990s. It was retrieved by Lehigh University after it acquired research facilities from Bethlehem Steel, a now-bankrupt company that was once the second-largest steel producer and the largest shipbuilder in the United States. The documents account for and describe research and development activities that were conducted on site, and consist of a very wide range of technical documentation, handwritten notes and memos, annotated printed documents, etc. This paper addresses only a sub-part of this collection: the approximately 4000 engineering drawings and blueprints that were retrieved. The challenge resides essentially in the fact that these documents come in different sizes and shapes, in a wide variety of states of conservation and degradation, and, more importantly, in bulk and without ground truth. Making them available to the research community through digitization is a step in the right direction; the question now is what to do with them. This paper lays down some first stepping stones for enhancing the documents' metadata and annotations.
Towards Improved Paper-Based Election Technology
Resources are presented for fostering paper-based election technology. They comprise a diverse collection of real and simulated ballot and survey images, and software tools for ballot synthesis, registration, segmentation, and ground-truthing. The grids underlying the designated locations of voter marks are extracted from 13,315 degraded ballot images. The actual skew angles of sample ballots, recorded as part of complete ballot descriptions compiled with the interactive ground-truthing tool, are compared with their automatically extracted parameters. The average error is 0.1 degrees. These results provide a baseline for the application of digital image analysis to the scrutiny of electoral ballots.
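For readers unfamiliar with skew extraction, the sketch below shows one standard approach, projection-profile variance maximization, which is not necessarily the method used here: rotate the binarized ballot over a range of candidate angles and keep the angle whose horizontal projection has the highest variance (text and grid lines align with pixel rows at the correct angle).

    # Standard projection-profile skew estimation; a common technique,
    # not necessarily the one used in the paper.
    import numpy as np
    from scipy.ndimage import rotate  # SciPy's image rotation

    def estimate_skew(binary_img, max_angle=3.0, step=0.05):
        """binary_img: 2-D array with ink pixels set to 1.
        Returns the estimated skew angle in degrees."""
        best_angle, best_score = 0.0, -1.0
        for angle in np.arange(-max_angle, max_angle + step, step):
            rotated = rotate(binary_img, angle, reshape=False, order=0)
            profile = rotated.sum(axis=1)   # row-wise ink counts
            score = profile.var()           # sharp rows => high variance
            if score > best_score:
                best_angle, best_score = angle, score
        return best_angle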