Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. They fall into three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.
published or submitted for publication
Sustainable Waste Sorter
Indiana University Purdue University Indianapolis
The purpose of this project is to help people eliminate the confusion over whether they should throw their trash away or dispose of it in a recycling bin. The sustainable waste sorter is an informative device that tells the user where to place their trash. Our customer, and the origin of the idea, was Roche Diagnostics Operations, a multinational healthcare organization whose Indianapolis location focuses on creating and developing diabetic test strips. The device consists of four main components: a Raspberry Pi 2 Model B, a camera module, an LCD screen, and a casing/mount that holds all of these components together. All of these components are compatible with the Raspberry Pi 2 Model B. The software was written in Python, and the database uses MySQL. During the development of the device, the most challenging task was learning to develop in a language new to the team, Python. Once the device reached a stable state, it was piloted at Roche Diagnostics Operations. The purpose of the first of three pilot sessions was to verify that the device worked in the environment and that the items entered in the database were recognized; the device passed that test. The second pilot session had the same purpose as the first, but with more items in the database. The device received more interaction during the second pilot session, and the team decided to schedule a third pilot session once all the items were entered into the database and a revamped user interface was completed. The team entered about 800 entries into the database and created a new, interactive user interface for the device. The third pilot session was a success: the items scanned by testers were recognized, and the new user interface was also successful. Overall, the sustainable waste sorter project was successful and educational.
As students, we applied the fundamentals from our previous courses to this project. This allowed us to enhance our problem-solving and project-management skills. As people use the device, we hope that it educates them on how to recycle properly, thereby improving the environmental state of our planet.
Computer Engineering Technology
Corpus Conversion Service: A Machine Learning Platform to Ingest Documents at Scale
Over the past few decades, the volume of scientific articles and technical
literature has increased exponentially. Consequently, there is a great
need for systems that can ingest these documents at scale and make the
contained knowledge discoverable. Unfortunately, both the format of these
documents (e.g. the PDF format or bitmap images) and the presentation of
the data (e.g. complex tables) make the extraction of qualitative and
quantitative data extremely challenging. In this paper, we present a modular,
cloud-based platform to ingest documents at scale. This platform, called the
Corpus Conversion Service (CCS), implements a pipeline which allows users to
parse and annotate documents (i.e. collect ground-truth), train
machine-learning classification algorithms and ultimately convert any type of
PDF or bitmap document to a structured content representation format. We will
show that each of the modules is scalable due to an asynchronous microservice
architecture and can therefore handle massive amounts of documents.
Furthermore, we will show that our capability to gather ground-truth is
accelerated by machine-learning algorithms by at least one order of magnitude.
This allows us to both gather large amounts of ground-truth in very little time
and obtain very good precision/recall metrics in the range of 99% with regard
to content conversion to structured output. The CCS platform is currently
deployed on IBM internal infrastructure and serving more than 250 active users
for knowledge-engineering project engagements.
Comment: Accepted paper at KDD 2018 conference
ICT in Czech companies: business efficiency potentials to be achieved.
The paper presents an analysis of business potential based on data published by the Czech Statistic Authority (SÚ). It shows that the current state of the infrastructure, even in small Czech companies, makes it possible to expand ERP and CRM systems, Internet trading, Supply Chain Management, and other new trends. Internet security is of the greatest importance here; however, it cannot be seen as a major obstacle to new trading methods. The greatest challenge identified is process and workflow optimization. To streamline workflow, document management that supports nearly seamless integration across functional areas is of the greatest importance. Moreover, process optimization can run into difficulties due to the cross-organizational functionality of new IT architecture concepts such as Service Oriented Architecture, WEB2 concepts, and other methods and means. The paper briefly mentions the value flow approach as an alternative to process modeling and the workflow approach. Value-oriented methods can overcome the limitations of the process-oriented approach.
ICT infrastructure; Business processes; Process modeling; Document management; Value chains; Business semantics
The NASA Astrophysics Data System: Data Holdings
Since its inception in 1993, the ADS Abstract Service has become an
indispensable research tool for astronomers and astrophysicists worldwide. In
those seven years, much effort has been directed toward improving both the
quantity and the quality of references in the database. From the original
database of approximately 160,000 astronomy abstracts, our dataset has grown
almost tenfold to approximately 1.5 million references covering astronomy,
astrophysics, planetary sciences, physics, optics, and engineering. We collect
and standardize data from approximately 200 journals and present the resulting
information in a uniform, coherent manner. With the cooperation of journal
publishers worldwide, we have been able to place scans of full journal articles
on-line back to the first volumes of many astronomical journals, and we are
able to link to current versions of articles, abstracts, and datasets for
essentially all of the current astronomy literature. The trend toward
electronic publishing in the field, the use of electronic submission of
abstracts for journal articles and conference proceedings, and the increasingly
prominent use of the World Wide Web to disseminate information have enabled the
ADS to build a database unparalleled in other disciplines.
The ADS can be accessed at http://adswww.harvard.edu
Comment: 24 pages, 1 figure, 6 tables, 3 appendices