Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. They fall into three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.
The Astrolabe Project: Identifying and Curating Astronomical Dark Data through Development of Cyberinfrastructure Resources
As research datasets and analyses grow in complexity, data that could be
valuable to other researchers and to support the integrity of published work
remain uncurated across disciplines. These data are especially concentrated in
the Long Tail of funded research, where curation resources and related
expertise are often inaccessible. In the domain of astronomy, it is undisputed
that uncurated dark data exist, but the scope of the problem remains uncertain.
The Astrolabe Project is a collaboration between University of Arizona
researchers, the CyVerse cyberinfrastructure environment, and the American
Astronomical Society, with a mission to identify and ingest
previously-uncurated astronomical data, and to provide a robust computational
environment for analysis and sharing of data, as well as services for authors
wishing to deposit data associated with publications. Following expert feedback
obtained through two workshops held in 2015 and 2016, Astrolabe is funded in
part by the National Science Foundation. The system is being actively developed
within CyVerse, and Astrolabe collaborators are soliciting heterogeneous
datasets and potential users for the prototype system. Astrolabe team members
are currently working to characterize the properties of uncurated astronomical
data, and to develop automated methods for locating potentially-useful data to
be targeted for ingest into Astrolabe, while cultivating a user community for
the new data management system.
Comment: To be published in Proceedings of Library and Information Services in
Astronomy (LISA) VIII; conference held in Strasbourg, France, June 6-9, 201
Astrolabe: Curating, Linking and Computing Astronomy's Dark Data
Where appropriate repositories are not available to support all relevant
astronomical data products, data can fall into darkness: unseen and unavailable
for future reference and re-use. Some data in this category are legacy or old
data, but newer datasets are also often uncurated and could remain "dark". This
paper provides a description of the design motivation and development of
Astrolabe, a cyberinfrastructure project that addresses a set of community
recommendations for locating and ensuring the long-term curation of dark or
otherwise at-risk data and integrated computing. This paper also describes the
outcomes of the series of community workshops that informed creation of
Astrolabe. According to participants in these workshops, much astronomical dark
data currently exist that are not curated elsewhere, as well as software that
can only be executed by a few individuals and therefore becomes unusable
because of changes in computing platforms. Astronomical research questions and
challenges would be better addressed with integrated data and computational
resources that fall outside the scope of existing observatory and space mission
projects. As a solution, the design of the Astrolabe system is aimed at
developing new resources for management of astronomical data. The project is
based in CyVerse cyberinfrastructure technology and is a collaboration between
the University of Arizona and the American Astronomical Society. Overall the
project aims to support open access to research data by leveraging existing
cyberinfrastructure resources and promoting scientific discovery by making
potentially-useful data in a computable format broadly available to the
astronomical community.
Comment: Accepted for publication in the Astrophysical Journal Supplement
Series, 22 pages, 2 figures
Augmenting optical character recognition (OCR) for improved digitization: Strategies to access scientific data in natural history collections
The Augmenting OCR Working Group (A-OCR WG) at Integrated Digitized Biocollections (iDigBio) seeks to improve community OCR strategies and algorithms for faster, better parsing of OCR output derived from valuable data on natural history collection specimen labels. This task is exceedingly difficult because museum labels are often annotated and vary in content, form, and font. Under the National Science Foundation's (NSF) Advancing Digitization of Biological Collections (ADBC) program, iDigBio is building a cyberinfrastructure to aggregate quality data from museum specimens housed in collections across the United States for use by researchers, educators, environmentalists, and the public. The A-OCR WG formed by community consensus in March 2012 to begin its role in this endeavor, defining reachable goals that include setting up a hackathon concurrent with iConference 2013. This paper reports on the definition of some key problems identified by the A-OCR WG, because these science problems will drive research and cyberinfrastructure development.
Datasphere at the Biosphere II: Computation and data in the wild
Biological field stations provide a unique set of opportunities and challenges for digital curation. The stations serve as the center of short-term and long-term biological research, from biomolecular-scale to ecosystems-scale research. They represent some of the last remaining “natural” areas in certain regions. Stations provide unique information about local biotic and abiotic conditions. Data shared among the stations support continental-scale and global research initiatives. The stations themselves support a large number of researchers who often come from multiple universities and other research and teaching institutions around the world. Because of this decentralized user base, it is particularly difficult for stations to capture data and other research products generated by research at the stations. The authors, part of a larger NSF-funded “Empowering Long Tail Research” project (NSF:#1216872), conducted a survey of field station researchers and then held a two-day workshop to identify challenges and opportunities for “grand challenge” research questions that could be enabled through development of cyberinfrastructure. The information gathered through this study will inform future proposals for cyberinfrastructure development.
Graduate Curriculum for Biological Information Specialists: A Key to Integration of Scale in Biology
Scientific data problems do not stand in isolation. They are part of a larger set of challenges associated with the escalation of scientific information and changes in scholarly communication in the digital environment. Biologists in particular are generating enormous sets of data at a high rate, and new discoveries in the biological sciences will increasingly depend on the integration of data across multiple scales. This work will require new kinds of information expertise in key areas. To build this professional capacity we have developed two complementary educational programs: a Biological Information Specialist (BIS) master's degree and a concentration in Data Curation (DC). We believe that BISs will be central in the development of cyberinfrastructure and information services needed to facilitate interdisciplinary and multi-scale science. Here we present three sample cases from our current research projects to illustrate areas in which we expect information specialists to make important contributions to biological research practice.
O Serviço de documentação textual e iconografia do Museu Paulista
The article takes stock of the curatorial work carried out during the 1990s by the current Textual Documentation and Iconography Service (Serviço de Documentação Textual e Iconografia) of the Museu Paulista, which is responsible for the MP Fund / Permanent Archive (Fundo MP/Arquivo Permanente), hundreds of collections and textual fonds, and 50,000 iconographic items, a large part of which are gathered in photographic collections. It shows how documentation work goes beyond the limits of SVDHICO to integrate with the overall activities of the Museum and with other research groups. It also points to new methodologies for working with images that allow curatorship to be carried out in a way that is integrated with interdisciplinary research and cultural dissemination.
Image Retrieval as Linguistic and Nonlinguistic Visual Model Matching