SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data
There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, retrospective standardization is often required, which involves matching original (unstructured or locally coded) data to widely used coding or ontology systems such as SNOMED CT (clinical terms), ICD-10 (International Classification of Disease) and HPO (Human Phenotype Ontology). This data curation is usually time-consuming and performed by a human expert. To help mechanize this process, we have developed SORTA, a computer-aided system for rapidly encoding free text or locally coded values to a formal coding system or ontology. SORTA matches original data values (uploaded in semicolon-delimited format) to a target coding system (uploaded as an Excel spreadsheet or in OWL (Web Ontology Language) or OBO (Open Biomedical Ontologies) format). It then semi-automatically shortlists candidate codes for each data value using Lucene and n-gram based matching algorithms, and can also learn from matches chosen by human experts. We evaluated SORTA's applicability in two use cases. For the LifeLines biobank, we used SORTA to recode 90 000 free text values (including 5211 unique values) about physical exercise to MET (Metabolic Equivalent of Task) codes. For the CINEAS clinical symptom coding system, we used SORTA to map to HPO, enriching HPO when necessary (315 terms matched so far). At rank 1 of the shortlists, we found a precision/recall of 0.97/0.98 in LifeLines and of 0.58/0.45 in CINEAS. More importantly, users found the tool both a major time saver and a quality improvement because SORTA reduced the chances of human mistakes. Thus, SORTA can dramatically ease data (re)coding tasks and we believe it will prove useful for many more projects.
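As a rough illustration of the n-gram based shortlisting described in this abstract, the sketch below ranks candidate codes by character-bigram Dice similarity. It is not SORTA's implementation: the abstract does not specify the scoring formula, the Lucene step is omitted, and the function names, codes and labels here are hypothetical.

```python
def ngrams(text, n=2):
    """Return the set of character n-grams of a normalized, padded string."""
    normalized = " ".join(text.lower().split())
    padded = f"^{normalized}$"  # mark the start and end of the string
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice_similarity(a, b, n=2):
    """Dice coefficient over n-gram sets: 2*|A & B| / (|A| + |B|)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def shortlist(value, candidates, top_k=3):
    """Rank candidate codes for one data value, best match first.

    `candidates` maps a code to its label; both are hypothetical here.
    """
    scored = [
        (dice_similarity(value, label), code, label)
        for code, label in candidates.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical target coding system (not real MET or HPO codes).
codes = {"C1": "running", "C2": "swimming", "C3": "cycling"}

# A misspelled free-text value still ranks the intended code first.
for score, code, label in shortlist("runing", codes):
    print(f"{code}\t{label}\t{score:.2f}")
```

In a semi-automatic workflow of this kind, a human expert would pick the correct code from the shortlist rather than accept the top hit blindly.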
BiobankUniverse: Automatic matchmaking between datasets for biobank data discovery and integration
Motivation: Biobanks are indispensable for large-scale genetic/epidemiological studies, yet it remains difficult for researchers to determine which biobanks contain data matching their research questions. Results: To overcome this, we developed a new matching algorithm that identifies pairs of related data elements between biobanks and research variables with high precision and recall. It integrates lexical comparison, Unified Medical Language System ontology tagging and semantic query expansion. The result is BiobankUniverse, a fast matchmaking service for biobanks and researchers. Biobankers upload their data elements and researchers upload their desired study variables; BiobankUniverse then automatically shortlists matching attributes between them. Users can quickly explore matching potential and search for biobanks/data elements matching their research. They can also curate matches and define personalized data-universes.
Robots should be seen and not heard…sometimes: Anthropomorphism and AI service robot interactions
There is a growing need to understand how consumers will interact with artificially intelligent (AI) domestic service robots, which are currently entering consumer homes at increasing rates, yet without a theoretical understanding of the consumer preferences influencing interaction roles such robots may play within the home. Guided by anthropomorphism theory, this study explores how different levels of robot humanness and social interaction opportunities affect consumers' liking for service robots. A review of the extant literature is conducted, yielding three hypotheses that are tested via 953 responses to an online scenario-based experiment. Findings indicate that while consumers prefer higher levels of humanness and moderate-to-high levels of social interaction opportunity, only some participants liked robots more when dialogue (high-interaction opportunity) was offered. Resulting from this study is the proposed Humanized-AI Social Interactivity Framework. The framework extends previous studies in marketing and consumer behavior literature by offering an increased understanding of how households will choose to interact with service robots in domestic environments based on humanness and social interaction. Guidelines for practitioners and two overarching themes for future research emerge from this study. This paper contributes to an increased understanding of potential interactions with service robots in domestic environments.
MOLGENIS/connect: a system for semi-automatic integration of heterogeneous phenotype data with applications in biobanks
Motivation: While the size and number of biobanks, patient registries and other data collections are increasing, biomedical researchers still often need to pool data for statistical power, a task that requires time-intensive retrospective integration. Results: To address this challenge, we developed MOLGENIS/connect, a semi-automatic system to find, match and pool data from different sources. The system shortlists relevant source attributes from thousands of candidates using ontology-based query expansion to overcome variations in terminology. Then it generates algorithms that transform source attributes to a common target DataSchema. These include unit conversion, categorical value matching and complex conversion patterns (e.g. calculation of BMI). In comparison to human experts, MOLGENIS/connect was able to auto-generate 27% of the algorithms perfectly, with an additional 46% needing only minor editing, representing a reduction in the human effort and expertise needed to pool data. Availability and Implementation: Source code, binaries and documentation are available as open-source under LGPLv3 from http://github.com/molgenis/molgenis and www.molgenis.org/connect
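To make the three kinds of transformation named in this abstract concrete (unit conversion, categorical value matching, and a derived value such as BMI), here is a minimal sketch. It does not reproduce the algorithms MOLGENIS/connect generates or their format; every function name, attribute name and mapping table below is an assumption chosen for illustration.

```python
# Hypothetical conversion factors to kilograms and metres.
UNIT_TO_SI = {"kg": 1.0, "g": 0.001, "m": 1.0, "cm": 0.01}

# Hypothetical mapping from one source's local sex codes to target categories.
SEX_CODE_TO_TARGET = {"1": "male", "2": "female", "M": "male", "F": "female"}


def to_si(value, unit):
    """Unit conversion: express a measurement in kg or m."""
    return value * UNIT_TO_SI[unit]


def match_category(source_value, mapping):
    """Categorical value matching; None flags a value needing manual curation."""
    return mapping.get(str(source_value).strip())


def bmi(weight, weight_unit, height, height_unit):
    """Derived value: BMI = weight in kg divided by squared height in m."""
    height_m = to_si(height, height_unit)
    if height_m <= 0:
        raise ValueError("height must be positive")
    return to_si(weight, weight_unit) / height_m ** 2


def transform(record):
    """Map one hypothetical source record onto a hypothetical target schema."""
    return {
        "sex": match_category(record["sex_code"], SEX_CODE_TO_TARGET),
        "bmi": round(bmi(record["weight_g"], "g", record["height_cm"], "cm"), 1),
    }


source_record = {"sex_code": "2", "weight_g": 70000, "height_cm": 175}
print(transform(source_record))
```

Returning `None` for an unmapped category, rather than guessing, leaves room for the manual editing step that the abstract says many generated algorithms still need.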
Crossing Party Lines: Political Identity and Partisans’ Reactions to Violating Party Norms
The current studies examined the experiences of undergraduate political partisans who cross party lines to support a preferred, out-of-party candidate, and thus open themselves to the possibility of being misclassified as a member of a rival political party. Strongly identified partisans who endorsed an out-of-party candidate, and thus expected others to misclassify them, reported heightened threats to belonging and coherence (Study 1), unless they disclaimed rival party status by asserting their political affiliation. In Study 2, strongly identified partisans who could be misclassified were less confident in their choice of an out-of-party candidate compared to partisans who asserted their political affiliation. These results highlight the impact of identity misclassification concerns on strongly identified partisans whose personal preferences are inconsistent with party norms.