Interactive Machine Learning with Applications in Health Informatics

Author: Wang Yue
Publication date: 01/01/2018

Recent years have witnessed unprecedented growth of health data, including millions of biomedical research publications, electronic health records, patient discussions on health forums and social media, fitness tracker trajectories, and genome sequences. Information retrieval and machine learning techniques are powerful tools to unlock invaluable knowledge in these data, yet they need to be guided by human experts. Unlike training machine learning models in other domains, labeling and analyzing health data requires highly specialized expertise, and the time of medical experts is extremely limited. How can we mine big health data with little expert effort? In this dissertation, I develop state-of-the-art interactive machine learning algorithms that bring together human intelligence and machine intelligence in health data mining tasks. By making efficient use of human experts' domain knowledge, we can achieve high-quality solutions with minimal manual effort. I first introduce a high-recall information retrieval framework that helps human users efficiently harvest not just one but as many relevant documents as possible from a searchable corpus. This is a common need in professional search scenarios such as medical search and literature review. Then I develop two interactive machine learning algorithms that leverage human experts' domain knowledge to combat the curse of "cold start" in active learning, with applications in clinical natural language processing. A consistent empirical observation is that the overall learning process can be reliably accelerated by a knowledge-driven "warm start", followed by machine-initiated active learning. As a theoretical contribution, I propose a general framework for interactive machine learning. Under this framework, a unified optimization objective explains many existing algorithms used in practice, and inspires the design of new algorithms.

PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/147518/1/raywang_1.pd

Deep Blue Documents at the University of Michigan

Text Summarization Techniques: A Brief Survey

Author: Allahyari Mehdi
Assefi Mehdi
Gutierrez Juan B.
Kochut Krys
Pouriyeh Seyedamin
Safaei Saeid
Trippe Elizabeth D.
Publication date: 01/01/2017

In recent years, there has been an explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to be useful. In this review, the main approaches to automatic text summarization are described. We review the different processes for summarization and describe the effectiveness and shortcomings of the different methods. Comment: Some of references format have update

arXiv.org e-Print Archive

Georgia Southern University: Digital Commons@Georgia Southern

Computational Approaches to Assisting Patients' Medical Comprehension from Electronic Health Records

Author: Zheng Jiaping
Publication venue: ScholarWorks@UMass Amherst
Publication date: 16/07/2020

Patient-centered care has been established as a fundamental approach to improve the quality of health care in a seminal report by the Institute of Medicine published at the start of the century. Improved access to health information and demand for greater transparency contributed to its move into the mainstream. Research has also demonstrated that actively involving patients in the management of their own health can lead to better outcomes, and potentially lower costs. However, despite the efforts in many areas of medicine to embrace patient-centered care, engaging patients is still considered a challenge. One of the barriers is the lack of effective tools to help patients understand their health conditions, options and their consequences. Patient portals are now widely adopted by hospitals and other healthcare practices to provide patients with the capabilities to view their own Electronic Health Records. They are a rich resource of information for patients. However, the language in the records is generally difficult for patients without training in medicine to understand. Furthermore, the amount of information can often be overwhelming as well. In this work, we propose computational approaches to foster patient engagement from three aspects by exploiting the rich information in the medical records. First, we design a framework to automatically generate health literacy instruments to measure a patient's literacy levels. This framework exploits readily available large-scale corpora to generate instruments in a commonly used test format. Second, we investigate methods that can determine the readability of complex documents such as health records. We propose to rank document readability, instead of assigning a grade level or a pre-defined difficulty category. Lastly, we examine the problem of finding targeted educational materials to facilitate patient comprehension of medical notes. We study methods to formulate effective queries from specialized and long clinical narratives. In addition, we propose a neural-network-based method to identify medical concepts that are important to patients. The three aspects of this work address the issues of the overabundance and technical complexity of medical language in health records. We demonstrate that our approaches are effective with various experiments and evaluation metrics.

ScholarWorks@UMass Amherst

Comparing Attributional and Relational Similarity as a Means to Identify Clinically Relevant Drug-gene Relationships

Author: Fathiamini Safa
Publication venue: DigitalCommons@TMC
Publication date: 15/08/2018

In emerging domains, such as precision oncology, knowledge extracted from explicit assertions may be insufficient to identify relationships of interest. One solution to this problem involves drawing inference on the basis of similarity. Computational methods have been developed to estimate the semantic similarity and relatedness between terms and relationships that are distributed across corpora of literature such as Medline abstracts and other forms of human-readable text. Most research on distributional similarity has focused on the notion of attributional similarity, which estimates the similarity between entities based on the contexts in which they occur across a large corpus. A relatively under-researched area concerns relational similarity, in which the similarity between pairs of entities is estimated from the contexts in which these entity pairs occur together. While it seems intuitive that models capturing the structure of the relationships between entities might mediate the identification of biologically important relationships, there is to date no comparison of the relative utility of attributional and relational models for this purpose. In this research, I compare the performance of a range of relational and attributional similarity methods on the task of identifying drugs that may be therapeutically useful in the context of particular aberrant genes, as identified by a team of human experts. My hypothesis is that relational similarity will be of greater utility than attributional similarity as a means to identify biological relationships that may provide answers to clinical questions (such as “which drugs INHIBIT gene x?”) in the context of rapidly evolving domains. My results show that models based on relational similarity outperformed models based on attributional similarity on this task. As the methods explained in this research can be applied to identify any sort of relationship for which cue pairs exist, my results suggest that relational similarity may be a suitable approach to apply to other biomedical problems. Furthermore, I found models based on neural word embeddings (NWE) to be particularly useful for this task, given their higher performance than Random Indexing-based models, and significantly less computational effort needed to create them. NWE methods (such as those produced by the popular word2vec tool) are a relatively recent development in the domain of distributional semantics, and are considered by many as the state of the art when it comes to semantic language modeling. However, their application in identifying biologically important relationships from Medline in general, and specifically in the domain of precision oncology, has not been well studied. The results of this research can guide the design and implementation of biomedical question answering and other relationship extraction applications for precision medicine, precision oncology and other similar domains, where there is rapid emergence of novel knowledge. The methods developed and evaluated in this project can help NLP applications provide more accurate results by leveraging corpus-based methods that are by design scalable and robust.

DigitalCommons@The Texas Medical Center

Context-sensitive information retrieval

Author: Bai Jing
Publication date: 01/01/2007

Thesis digitized by the Direction des bibliothèques of the Université de Montréal

Dépôt Institutionnel Numérique

CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

Author: Boujemaa Nozha
Compañó Ramón
Dosch Christoph
Geurts Joost
Karlgren Jussi
King Paul
Kompatsiaris Yiannis
Köhler Joachim
Le Moine Jean-Yves
Ortgies Robert
Point Jean-Charles
Rotenberg Boris
Rudström Åsa
Sebe Nicu
Publication venue: Chorus Project Consortium
Publication date: 01/01/2007

Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art related to multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventory the impact and legal consequences of these technical advances and point out future directions of research.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Question format shifts bias away from the emphasised response in tests of recognition memory

Author: Mill Ravi Dev
O'Connor Akira Robert
Publication venue: Elsevier BV
Publication date: 01/05/2016

The question asked to interrogate memory has the potential to influence response bias at retrieval, yet this has not been systematically investigated. According to framing effects in the field of eyewitness testimony, retrieval cueing effects in cognitive psychology and the acquiescence bias in questionnaire responding, the question should establish a confirmatory bias. Conversely, according to findings from the rewarded decision-making literature involving mixed incentives, the question should establish a disconfirmatory bias. Across three experiments (ns = 90 [online], 29 [laboratory] and 29 [laboratory]) we demonstrate a disconfirmatory bias: "old?" decreased old responding. This bias is underpinned by a goal-driven mechanism wherein participants seek to maximise emphasised response accuracy at the expense of frequency. Moreover, we demonstrate that disconfirmatory biases can be generated without explicit reference to the goal state. We conclude that subtle aspects of the test environment influence retrieval to a greater extent than has been previously considered. Postprint. Peer reviewed.

St Andrews Research Repository

Pretrained Transformers for Text Ranking: BERT and Beyond

Author: Lin Jimmy
Nogueira Rodrigo
Yates Andrew
Publication date: 01/01/2020

The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query. Although the most common formulation of text ranking is search, instances of the task can also be found in many natural language processing applications. This survey provides an overview of text ranking with neural network architectures known as transformers, of which BERT is the best-known example. The combination of transformers and self-supervised pretraining has been responsible for a paradigm shift in natural language processing (NLP), information retrieval (IR), and beyond. In this survey, we provide a synthesis of existing work as a single point of entry for practitioners who wish to gain a better understanding of how to apply transformers to text ranking problems and researchers who wish to pursue work in this area. We cover a wide range of modern techniques, grouped into two high-level categories: transformer models that perform reranking in multi-stage architectures and dense retrieval techniques that perform ranking directly. There are two themes that pervade our survey: techniques for handling long documents, beyond typical sentence-by-sentence processing in NLP, and techniques for addressing the tradeoff between effectiveness (i.e., result quality) and efficiency (e.g., query latency, model and index size). Although transformer architectures and pretraining techniques are recent innovations, many aspects of how they are applied to text ranking are relatively well understood and represent mature techniques. However, there remain many open research questions, and thus in addition to laying out the foundations of pretrained transformers for text ranking, this survey also attempts to prognosticate where the field is heading.

arXiv.org e-Print Archive

MPG.PuRe

Discrete Emotion Effects on Lexical Decision Response Times

Author: A Agresti
A Balota D
A Lewis P
A Russell J
A Schacht
A Stevenson R
Angela Sirigu
Arthur M. Jacobs
B Briesemeister B
B New
Benny B. Briesemeister
C Britton J
C Darwin
C Herbert
D Kreibig S
E Izard C
F Pratto
G Scott G
J Grainger
J Hewig
J Hofmann M
J Holcomb P
J Kissler
J Lang P
J Larsen R
J Panksepp
J Parrott D
J Thomas S
K Lund
L Etcoff N
L Feldman Barrett
L Kuchinke
L-H Võ M
Lars Kuchinke
M Bradley M
M Nakic
P Ekman
P Kanske
P Laukka
R Graf
R Reisenzein
S Andrews
S Burnett
S Windmann
S-T Kousta
T Armstrong
T Murphy S
W Wundt
W Young A
Z Estes
Publication venue: Public Library of Science
Publication date: 01/01/2011

Our knowledge about affective processes, especially concerning effects on cognitive demands like word processing, is increasing steadily. Several studies consistently document valence and arousal effects, and although there is some debate on possible interactions and different notions of valence, broad agreement on a two-dimensional model of affective space has been achieved. Alternative models like the discrete emotion theory have received little interest in word recognition research so far. Using backward elimination and multiple regression analyses, we show that five discrete emotions (i.e., happiness, disgust, fear, anger and sadness) explain as much variance as two published dimensional models assuming continuous or categorical valence, with the variables happiness, disgust and fear significantly contributing to this account. Moreover, these effects even persist in an experiment with discrete emotion conditions when the stimuli are controlled for emotional valence and arousal levels. We interpret this result as evidence for discrete emotion effects in visual word recognition that cannot be explained by the two-dimensional affective space account.

Institutional Repository of the Freie Universität Berlin

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

The role of phonology in visual word recognition: evidence from Chinese

Author: Ip JKM
Lau DKY
Leung MT
Weekes BS
Publication venue: United States Sports Academy
Publication date: 01/01/2010

Posters - Letter/Word Processing V: abstract no. 5024. The hypothesis of bidirectional coupling of orthography and phonology predicts that phonology plays a role in visual word recognition, as observed in the effects of feedforward and feedback spelling-to-sound consistency on lexical decision. However, because orthography and phonology are closely related in alphabetic languages (homophones in alphabetic languages are usually orthographically similar), it is difficult to exclude an influence of orthography on phonological effects in visual word recognition. Chinese languages contain many written homophones that are orthographically dissimilar, allowing a test of the claim that phonological effects can be independent of orthographic similarity. We report a study of visual word recognition in Chinese based on a mega-analysis of lexical decision performance with 500 characters. The results from multiple regression analyses, after controlling for orthographic frequency, stroke number, and radical frequency, showed main effects of feedforward and feedback consistency, as well as interactions between these variables and phonological frequency and number of homophones. Implications of these results for resonance models of visual word recognition are discussed. Postprint.

HKU Scholars Hub