eScience Symposium Reflections from Christopher Erdmann: Redefining the Librarian’s Role in eScience
Christopher Erdmann, Head Librarian, Wolbach Library, Harvard Smithsonian Center for Astrophysics, shares his multiple experiences at the University of Massachusetts and New England Area Librarian eScience Symposium, as well as current challenges he sees for data science librarians. Christopher presented on a panel at the Eighth Annual eScience Symposium discussing the “future of data science” from a librarian’s perspective. See the video of this panel presentation on the 2016 eScience Symposium website.
A transcript of this interview is available for download via the Download button above
Telescope Bibliographies: an Essential Component of Archival Data Management and Operations
Assessing the impact of astronomical facilities rests upon an evaluation of the scientific discoveries which their data have enabled. Telescope bibliographies, which link data products with the literature, provide a way to use bibliometrics as an impact measure for the underlying data. In this paper we argue that the creation and maintenance of telescope bibliographies should be considered an integral part of an observatory's operations. We review the existing tools, services, and workflows which support these curation activities, giving an estimate of the effort and expertise required to maintain an archive-based telescope bibliography.
Comment: 10 pages, 3 figures, to appear in SPIE Astronomical Telescopes and Instrumentation, SPIE Conference Series 844
Social Research Collaboration: Libraries Need Not Apply?
Social media was born as an efficient method of personal networking. As more and more researchers took to social media platforms, we have witnessed an organic growth of collaboration among scholars, faculty, and students. This phenomenon has profoundly changed the way we conduct research through social media. Research through collaboration is now increasingly important in order to achieve a higher impact throughout the research community. But where does the library fit into this? The simple answer is that researchers are now bypassing the library.
This presentation will look at the new reality of social research collaboration and discuss what kinds of web-based tools can support the workflow and peer collaboration of researchers. The presenters will also discuss why it is essential for libraries to become part of the solution before they are left out in the cold.
Challenges and solutions for Latin named entity recognition
Although spanning thousands of years and genres as diverse as liturgy, historiography, lyric and other forms of prose and poetry, the body of Latin texts is still relatively sparse compared to English. Data sparsity in Latin presents a number of challenges for traditional Named Entity Recognition techniques. Solving such challenges and enabling reliable Named Entity Recognition in Latin texts can facilitate many down-stream applications, from machine translation to digital historiography, enabling Classicists, historians, and archaeologists, for instance, to track the relationships of historical persons, places, and groups on a large scale. This paper presents the first annotated corpus for evaluating Named Entity Recognition in Latin, as well as a fully supervised model that achieves over 90% F-score on a held-out test set, significantly outperforming a competitive baseline. We also present a novel active learning strategy that predicts how many and which sentences need to be annotated for named entities in order to attain a specified degree of accuracy when recognizing named entities automatically in a given text. This maximizes the productivity of annotators while simultaneously controlling quality.
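The F-score reported above is the harmonic mean of precision and recall. As a rough illustration only, the sketch below computes an entity-level F-score by exact match over (start, end, label) spans. The span representation, the function name `entity_f1`, and the toy labels are assumptions made for this example and are not taken from the paper, whose exact evaluation protocol is not described here.

```python
def entity_f1(gold, predicted):
    """Entity-level F1 using exact span-and-label matching (an assumed convention)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy spans (start token, end token, label); purely illustrative.
gold_spans = {(0, 1, "PERSON"), (3, 4, "PLACE"), (6, 7, "GROUP")}
predicted_spans = {(0, 1, "PERSON"), (3, 4, "PLACE"), (8, 9, "PERSON")}
score = entity_f1(gold_spans, predicted_spans)
```

Here two of three predictions are correct and two of three gold entities are found, so precision and recall are both two thirds.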
How Do Astronomers Share Data? Reliability and Persistence of Datasets Linked in AAS Publications and a Qualitative Study of Data Practices among US Astronomers
We analyze data sharing practices of astronomers over the past fifteen years. An analysis of URL links embedded in papers published by the American Astronomical Society reveals that the total number of links included in the literature rose dramatically from 1997 until 2005, when it leveled off at around 1500 per year. The analysis also shows that the availability of linked material decays with time: in 2011, 44% of links published a decade earlier, in 2001, were broken. A rough analysis of link types reveals that links to data hosted on astronomers' personal websites become unreachable much faster than links to datasets on curated institutional sites. To gauge astronomers' current data sharing practices and preferences further, we performed in-depth interviews with 12 scientists and online surveys with 173 scientists, all at a large astrophysical research institute in the United States: the Harvard-Smithsonian Center for Astrophysics, in Cambridge, MA. Both the in-depth interviews and the online survey indicate that, in principle, there is no philosophical objection to data-sharing among astronomers at this institution. Key reasons that more data are not presently shared more efficiently in astronomy include: the difficulty of sharing large data sets; overreliance on non-robust, non-reproducible mechanisms for sharing data (e.g. emailing it); unfamiliarity with options that make data-sharing easier (faster) and/or more robust; and, lastly, a sense that other researchers would not want the data to be shared. We conclude with a short discussion of a new effort to implement an easy-to-use, robust system for data sharing in astronomy, at theastrodata.org, and we analyze the uptake of that system to date.
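The link-decay figure quoted above (the share of links from a given publication year that are broken at check time) is a simple per-year ratio. The following is a minimal sketch of that bookkeeping, assuming each link has already been checked and recorded as a (publication year, reachable) pair. The function name and the toy records are hypothetical and do not reproduce the study's data or tooling.

```python
from collections import defaultdict


def broken_fraction_by_year(links):
    """Return {publication_year: fraction of links that were unreachable}.

    `links` is an iterable of (publication_year, is_reachable) pairs.
    """
    totals = defaultdict(int)
    broken = defaultdict(int)
    for year, is_reachable in links:
        totals[year] += 1
        if not is_reachable:
            broken[year] += 1
    return {year: broken[year] / totals[year] for year in sorted(totals)}


# Made-up records for illustration only; not figures from the paper.
toy_links = [
    (2001, False), (2001, True), (2001, False), (2001, True),
    (2005, True), (2005, True), (2005, False), (2005, True),
]
fractions = broken_fraction_by_year(toy_links)
```

Running the same computation separately for personal and institutional hosts would give the kind of comparison by link type that the abstract mentions.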
Identifying barriers to healthcare as reported by rural and medically underserved patients in Oklahoma
OBJECTIVE: The Rural Patient Experience survey seeks to identify barriers to healthcare faced by patients in rural Oklahoma. Through the administration of a survey directly to patients, this study will analyze the current status of healthcare access, availability, and usage among rural Oklahoma populations. Results can be used to implement effective improvements in healthcare access tailored to specific patient-identified barriers.
METHODS: Surveys will be distributed to individuals residing in rural communities and Health Professional Shortage Areas in the state of Oklahoma. The study involves patients of healthcare facilities in partnership with Oklahoma State University's Center for Health System Innovation, and the facilities that agree to participate in the study will allow access to their patient panel. Patients residing in rural zip codes will be pooled into a randomly sampled population for survey distribution. Two-thirds (67%) of qualifying patients from each patient panel will be randomly selected to receive a survey in order to achieve a sample of adequate size. Responses will be analyzed using summary statistics, descriptive statistics, and significance testing.
RESULTS & CONCLUSIONS: The survey is still being developed, and results are pending its distribution.
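The sampling step in the methods (filter a patient panel to rural zip codes, then randomly select two-thirds of the qualifying patients) might be sketched as follows. The record layout, the function name `sample_panel`, the fixed seed, and the placeholder zip codes are all assumptions for illustration; the study's actual procedure and data are not specified beyond the description above.

```python
import random


def sample_panel(patients, rural_zips, fraction=2 / 3, seed=0):
    """Randomly select a fraction of the patients whose zip code is rural."""
    qualifying = [p for p in patients if p["zip"] in rural_zips]
    sample_size = round(len(qualifying) * fraction)
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(qualifying, sample_size)


# Placeholder panel: zip codes here are stand-ins, not study data.
rural_zips = {"73001"}
panel = [{"id": i, "zip": "73001" if i % 4 else "73102"} for i in range(12)]
selected = sample_panel(panel, rural_zips)
```

Sampling is done without replacement, so no patient can receive more than one survey from the same panel.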
Practical, Efficient, and Customizable Active Learning for Named Entity Recognition in the Digital Humanities
Scholars in inter-disciplinary fields like the Digital Humanities are increasingly interested in semantic annotation of specialized corpora. Yet, under-resourced languages, imperfect or noisily structured data, and user-specific classification tasks make it difficult to meet their needs using off-the-shelf models. Manual annotation of large corpora from scratch, meanwhile, can be prohibitively expensive. Thus, we propose an active learning solution for named entity recognition, attempting to maximize a custom model’s improvement per additional unit of manual annotation. Our system robustly handles any domain or user-defined label set and requires no external resources, enabling quality named entity recognition for Humanities corpora where such resources are not available. Evaluating on typologically disparate languages and datasets, we reduce required annotation by 20-60% and greatly outperform a competitive active learning baseline.
New York University–Paris Sciences Lettres Global Alliance grant; National Endowment for the Humanities grant, award HAA-256078-17; Computational Approaches to Modeling Language lab at New York University Abu Dhab
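For readers unfamiliar with active learning, a common generic strategy is uncertainty sampling: ask annotators to label the sentences the current model is least sure about. The sketch below ranks sentences by mean per-token entropy of the model's predicted label distribution. This is a textbook heuristic swapped in for illustration, not the selection strategy of the paper above, and the function names and toy probabilities are hypothetical.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one token's label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(sentence_probs, budget):
    """Return indices of the `budget` sentences with highest mean token entropy.

    `sentence_probs[i]` is a list of per-token label distributions for sentence i.
    """
    scored = []
    for index, tokens in enumerate(sentence_probs):
        if tokens:
            mean_entropy = sum(token_entropy(t) for t in tokens) / len(tokens)
        else:
            mean_entropy = 0.0
        scored.append((mean_entropy, index))
    # Most uncertain first; ties broken by original order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [index for _, index in scored[:budget]]


# Toy two-label distributions per token; illustrative values only.
toy_probs = [
    [[0.99, 0.01], [0.98, 0.02]],  # confident sentence
    [[0.50, 0.50], [0.60, 0.40]],  # very uncertain sentence
    [[0.90, 0.10]],                # moderately uncertain sentence
]
chosen = select_for_annotation(toy_probs, budget=2)
```

In a full loop, the chosen sentences would be annotated, added to the training set, and the model retrained before the next round of selection.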