
    Improving average ranking precision in user searches for biomedical research datasets

    Availability of research datasets is a keystone for health and life science study reproducibility and scientific progress. Due to the heterogeneity and complexity of these data, a main challenge to be overcome by research data management systems is to provide users with the best answers for their search queries. In the context of the 2016 bioCADDIE Dataset Retrieval Challenge, we investigate a novel ranking pipeline to improve the search of datasets used in biomedical experiments. Our system comprises a query expansion model based on word embeddings, a similarity measure algorithm that takes into consideration the relevance of the query terms, and a dataset categorisation method that boosts the rank of datasets matching query constraints. The system was evaluated using a corpus with 800k datasets and 21 annotated user queries. Our system provides competitive results when compared to the other challenge participants. In the official run, it achieved the highest infAP among the participants, being +22.3% higher than the median infAP of the participants' best submissions. Overall, it ranks in the top 2 if an aggregated metric using the best official measures per participant is considered. The query expansion method showed a positive impact on the system's performance, increasing our baseline by up to +5.0% and +3.4% for the infAP and infNDCG metrics, respectively. Our similarity measure algorithm seems to be robust, in particular compared to the Divergence From Randomness framework, having smaller performance variations under different training conditions. Finally, the result categorisation did not have a significant impact on the system's performance. We believe that our solution could be used to enhance biomedical dataset management systems. In particular, the use of data-driven query expansion methods could be an alternative to the complexity of biomedical terminologies.
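
    The abstract above mentions query expansion based on word embeddings but gives no implementation details. The following is only a minimal sketch of the general idea, not the authors' method: each query term is expanded with its nearest neighbours by cosine similarity. The toy vectors, the function names `cosine` and `expand_query`, and the parameters `k` and `threshold` are all assumptions made for illustration.

```python
import math

# Toy 3-dimensional embeddings. These values are hypothetical and exist
# only to make the example runnable; real systems use trained vectors.
EMBEDDINGS = {
    "cancer": [0.9, 0.1, 0.0],
    "tumor": [0.8, 0.2, 0.1],
    "carcinoma": [0.85, 0.15, 0.05],
    "mouse": [0.0, 0.9, 0.1],
    "protein": [0.1, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(terms, embeddings, k=2, threshold=0.9):
    """Append up to k neighbours per term whose similarity meets the threshold.

    Terms without an embedding are kept but not expanded.
    """
    expanded = list(terms)
    for term in terms:
        if term not in embeddings:
            continue
        # Score every vocabulary word not already in the query.
        scored = sorted(
            (
                (cosine(embeddings[term], vec), word)
                for word, vec in embeddings.items()
                if word not in expanded
            ),
            reverse=True,
        )
        expanded.extend(word for score, word in scored[:k] if score >= threshold)
    return expanded


print(expand_query(["cancer"], EMBEDDINGS))
```

    With these toy vectors, the query `["cancer"]` gains the two closely related terms, while a term with no close neighbours (such as `"mouse"`) is left unchanged. The similarity threshold is one simple way to limit query drift; the abstract does not say how the original system controls it.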

    Detecting Family Resemblance: Automated Genre Classification.

    This paper presents results in automated genre classification of digital documents in PDF format. It describes genre classification as an important ingredient in contextualising scientific data and in retrieving targeted material for improving research. The current paper compares the role of visual layout, stylistic features and language model features in clustering documents and presents results in retrieving five selected genres (Scientific Article, Thesis, Periodicals, Business Report, and Form) from a pool of materials populated with documents of the nineteen most popular genres found in our experimental data set.

    Internet source evaluation: The role of implicit associations and psychophysiological self-regulation

    This study focused on middle school students’ source evaluation skills as a key component of digital literacy. Specifically, it examined the role of two unexplored individual factors that may affect the evaluation of sources providing information about the controversial topic of the health risks associated with the use of mobile phones. The factors were the implicit association of mobile phone with health or no health, and psychophysiological self-regulation as reflected in basal Heart Rate Variability (HRV). Seventy-two seventh graders read six webpages that provided contrasting information on the unsettled topic of the potential health risks related to the use of mobile phones. Then they were asked to rank-order the six websites along the dimension of reliability (source evaluation). Findings revealed that students were able to discriminate between the most and least reliable websites, justifying their ranking in light of different criteria. However, overall, they were not very accurate in rank-ordering all six Internet sources. Both implicit associations and HRV correlated with source evaluation. The interaction between the two individual variables was a significant predictor of participants’ performance in rank-ordering the websites for reliability. A slope analysis revealed that when students had an average psychophysiological self-regulation, the stronger their association of the mobile phone with health, the better their performance on source evaluation. The theoretical and educational significance of the study is discussed.

    Exploring the Academic/Creative Writing Binary

    I began to work on this study in my ENG 201: Writing in the Disciplines class during my junior year at Pace University. After being asked to write a paper on what writing looks like in my discipline, I realized that my perceptions of the kinds of writing done by faculty and students in a university English department were limited and constricting as a result of the binary way in which I viewed academic and creative forms of writing. For instance, I had trouble believing that my creative writing professor studied pre-med in undergrad. I continued my research on this topic by developing a study to discover how faculty and undergraduates think about writing in an English department. In conducting this research, I hoped to redefine and illustrate potential overlaps between academic and creative writing and to propose new (perhaps more fluid or capacious) ways of labeling and conveying the kind of writing students and faculty produce. Specifically, I wanted to explore whether these are terms or categories that either group uses, or whether faculty and students’ perceptions of academic and creative writing challenge these categories. I explored these concepts through a qualitative study. After obtaining IRB approval, I devoted one class of Meaghan Brewer’s English 201: Writing in the Disciplines to a workshop where students in the class brought in samples of their own writing and then put them into categories and created labels. Students filled out a form giving a rationale for how they labeled different kinds of writing before having a class discussion. I repeated the same process in a composition faculty meeting in the English department. These activities are modeled on activities described in research by composition scholar Anne Ruggles Gere. This highly contextual, qualitative research is commonplace in composition studies and has been present in the majority of my initial literature review.
    In conducting this study, my largest obstacle was the small amount of time I had to analyze the results of my activities between drafts. However, the data collected exceeded my expectations in that, as in much of the research cited in this paper, I found students had binary views of academic and creative writing despite not using them often as labels. For the most part, they described academic writing as constricting and reliant on structure, whereas they saw creative writing as a freer style that allowed them to voice an opinion. In contrast, faculty used these terms more frequently, but thought about them in less binary ways. After having a group discussion, both faculty and students appeared to have broadened the way they looked at writing, which is what I was hoping to encourage with this study. My findings suggest that faculty members need to create curricula that encourage students to see genres in more complex ways. Future research might explore how expanding the approach to teaching genre could redefine student perceptions of college writing.

    Evaluating Multilingual Gisting of Web Pages

    We describe a prototype system for multilingual gisting of Web pages, and present an evaluation methodology based on the notion of gisting as decision support. This evaluation paradigm is straightforward, rigorous, permits fair comparison of alternative approaches, and should easily generalize to evaluation in other situations where the user is faced with decision-making on the basis of information in restricted or alternative form. Comment: 7 pages, uses psfig and aaai style