Interactive Machine Learning (IML) Markup of OCR Generated Text by Exploiting Domain Knowledge: A Biodiversity Case Study

Author: Heidorn P. Bryan
Wei Qin
Publication date: 28/02/2008

Illinois Digital Environment for Access to Learning and Scholarship Repository

Digital Image Access & Retrieval

Author: Heidorn P. Bryan
Sandore Beth
Publication venue: Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Publication date: 01/01/1997

The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. They fall into three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.

Illinois Digital Environment for Access to Learning and Scholarship Repository

The Astrolabe Project: Identifying and Curating Astronomical Dark Data through Development of Cyberinfrastructure Resources

Author: Heidorn P. Bryan
Stahlman Gretchen R.
Steffen Julie
Publication venue: EDP Sciences
Publication date: 15/05/2018

As research datasets and analyses grow in complexity, data that could be valuable to other researchers and to support the integrity of published work remain uncurated across disciplines. These data are especially concentrated in the Long Tail of funded research, where curation resources and related expertise are often inaccessible. In the domain of astronomy, it is undisputed that uncurated dark data exist, but the scope of the problem remains uncertain. The Astrolabe Project is a collaboration between University of Arizona researchers, the CyVerse cyberinfrastructure environment, and the American Astronomical Society, with a mission to identify and ingest previously uncurated astronomical data, and to provide a robust computational environment for analysis and sharing of data, as well as services for authors wishing to deposit data associated with publications. Following expert feedback obtained through two workshops held in 2015 and 2016, Astrolabe is funded in part by the National Science Foundation. The system is being actively developed within CyVerse, and Astrolabe collaborators are soliciting heterogeneous datasets and potential users for the prototype system. Astrolabe team members are currently working to characterize the properties of uncurated astronomical data, and to develop automated methods for locating potentially useful data to be targeted for ingest into Astrolabe, while cultivating a user community for the new data management system. Comment: To be published in Proceedings of Library and Information Services in Astronomy (LISA) VIII; conference held in Strasbourg, France, June 6-9, 201

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

The University of Arizona

Astrolabe: Curating, Linking and Computing Astronomy's Dark Data

Author: Heidorn P. Bryan
Stahlman Gretchen R.
Steffen Julie
Publication venue: American Astronomical Society
Publication date: 26/02/2018

Where appropriate repositories are not available to support all relevant astronomical data products, data can fall into darkness: unseen and unavailable for future reference and re-use. Some data in this category are legacy or old data, but newer datasets are also often uncurated and could remain "dark". This paper provides a description of the design motivation and development of Astrolabe, a cyberinfrastructure project that addresses a set of community recommendations for locating and ensuring the long-term curation of dark or otherwise at-risk data and integrated computing. This paper also describes the outcomes of the series of community workshops that informed creation of Astrolabe. According to participants in these workshops, much astronomical dark data currently exist that are not curated elsewhere, as well as software that can only be executed by a few individuals and therefore becomes unusable because of changes in computing platforms. Astronomical research questions and challenges would be better addressed with integrated data and computational resources that fall outside the scope of existing observatory and space mission projects. As a solution, the design of the Astrolabe system is aimed at developing new resources for management of astronomical data. The project is based in CyVerse cyberinfrastructure technology and is a collaboration between the University of Arizona and the American Astronomical Society. Overall, the project aims to support open access to research data by leveraging existing cyberinfrastructure resources and promoting scientific discovery by making potentially useful data in a computable format broadly available to the astronomical community. Comment: Accepted for publication in the Astrophysical Journal Supplement Series, 22 pages, 2 figures

arXiv.org e-Print Archive

Crossref

The University of Arizona

Augmenting optical character recognition (OCR) for improved digitization: Strategies to access scientific data in natural history collections

Author: Heidorn P. Bryan
Paul Deborah L.
Publication venue: iSchools
Publication date: 01/02/2013

The Augmenting OCR Working Group (A-OCR WG) at Integrated Digitized Biocollections (iDigBio) seeks to improve community OCR strategies and algorithms for faster, better parsing of OCR output derived from valuable data on natural history collection specimen labels. This task is exceedingly difficult because museum labels are often annotated, and vary in content, form and font. Under the National Science Foundation's (NSF) Advancing Digitization of Biological Collections (ADBC) program, iDigBio is building a cyberinfrastructure to aggregate quality data from museum specimens housed in collections across the United States for use by researchers, educators, environmentalists and the public. In March of 2012, the A-OCR WG was formed by community consensus and began its role in this endeavor by defining reachable goals, including setting up a hackathon concurrent with iConference 2013. This paper reports on the definition of some key problems identified by the A-OCR WG, since these science problems will drive research and cyberinfrastructure development.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Datasphere at the Biosphere II: Computation and data in the wild

Author: Chong Steven
Heidorn P. Bryan
Stahlman Gretchen Renee
Publication venue: iSchools
Publication date: 15/03/2015

Biological Field Stations provide a unique set of opportunities and challenges for digital curation. The stations serve as the center of short-term and long-term biological research, from biomolecular-scale to ecosystems-scale research. They represent some of the last remaining “natural” areas in certain regions. Stations provide unique information about local biotic and abiotic conditions. Data shared among the stations support continental-scale and global research initiatives. The stations themselves support a large number of researchers who often come from multiple universities and other research and teaching institutions around the world. Because of this decentralized user base, it is particularly difficult for stations to capture data and other research products generated by research at the stations. The authors, part of a larger NSF-funded “Empowering Long Tail Research” project (NSF:#1216872), conducted a survey of field station researchers and then held a two-day workshop to identify challenges and opportunities for “grand challenge” research questions that could be enabled through development of cyberinfrastructure. The information gathered through this study will inform future proposals for cyberinfrastructure development.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Graduate Curriculum for Biological Information Specialists: A Key to Integration of Scale in Biology

Author: Cragin Melissa H.
Heidorn Bryan P.
Palmer Carole L.
Wright Dan
Publication venue: Edinburgh University Library
Publication date: 01/12/2007

Scientific data problems do not stand in isolation. They are part of a larger set of challenges associated with the escalation of scientific information and changes in scholarly communication in the digital environment. Biologists in particular are generating enormous sets of data at a high rate, and new discoveries in the biological sciences will increasingly depend on the integration of data across multiple scales. This work will require new kinds of information expertise in key areas. To build this professional capacity, we have developed two complementary educational programs: a Biological Information Specialist (BIS) master's degree and a concentration in Data Curation (DC). We believe that BISs will be central in the development of cyberinfrastructure and information services needed to facilitate interdisciplinary and multi-scale science. Here we present three sample cases from our current research projects to illustrate areas in which we expect information specialists to make important contributions to biological research practice.

Directory of Open Access Journals

International Journal of Digital Curation

O Serviço de documentação textual e iconografia do Museu Paulista

Author: ADAMS Gavin
ALMEIDA Adilson José
ALSFORD Stephen
BARBUY Heloisa
BARBUY Heloisa (Org.)
BEARMAN D
BELLUZZO Ana Maria
BESSER Howard
BITTENCOURT Vera
BREFE Ana Claudia Fonseca
BRILLIANT Richard
BURKE Peter
CACALY S
CARVALHO Vânia Carneiro de
CHRISTO Maraliz de Castro Vieira
COSTA Maria Cristina
FERNANDES Paula Porta S
FRANCASTEL Pierre
HEIDORN P. Bryan
HUTTER Lucy Maffei
HÖRNER Erik
LEVI Darrell E
LIMA Solange Ferraz de
MACAMBIRA Yvoty
MAKINO M
MAKINO Miyoko
MALHEIRO Maria Cecília Stávale
MENESES Ulpiano
MENESES Ulpiano Bezerra de
MILANI Lara B
Miyoko Makino
MORETTIN Eduardo Victorio
NEME Mario
OLIVEIRA Cecilia Helena de Salles
OLIVEIRA Cecília Helena de Salles
OLIVEIRA Lilian Aparecida de
PANOFSKY Erwin
PETRELLA Yara Lígia Mello Moreira
POINTON Marcia
RIBEIRO Angela Maria Gianeze
ROSEMBERG Liana Ruth Bergstein
RUIZ Adilson
Shirley Ribeiro da Silva
SMIT Johanna W
Solange Ferraz de Lima
SOUZA Jonas Soares de
TAUNAY Affonso de E
TESSITORE Viviane
Vânia Carneiro de Carvalho
WITTER José Sebastião
Publication venue: Universidade de São Paulo. Museu Paulista
Publication date: 01/01/2003

The article takes stock of the curatorial work carried out during the 1990s by the current Textual Documentation and Iconography Service (Serviço de Documentação Textual e Iconografia) of the Museu Paulista, which is responsible for the MP Fund / Permanent Archive (Fundo MP/Arquivo Permanente), hundreds of collections and textual fonds, and 50,000 iconographic items, a large part of which are gathered in photographic collections. It shows how the documentation work extends beyond the limits of SVDHICO to integrate with the museum's overall activities and with other research groups. It also points to new methodologies for working with images that allow curatorship to be carried out in a way that is integrated with interdisciplinary research and cultural dissemination.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Directory of Open Access Journals

Cadernos Espinosanos (E-Journal)

Image Retrieval as Linguistic and Nonlinguistic Visual Model Matching

Author: Heidorn P. Bryan
Publication venue: Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Publication date: 01/01/1999

Published or submitted for publication.

Illinois Digital Environment for Access to Learning and Scholarship Repository