Intelligent Word Embeddings of Free-Text Radiology Reports

Author: Banerjee Imon
Goldman Roger Eric
Madhavan Sriraman
Rubin Daniel L.
Publication date: 01/01/2017

Radiology reports are a rich resource for advancing deep learning applications in medicine by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the ambiguity and subtlety of natural language. We propose a hybrid strategy that combines semantic-dictionary mapping and word2vec modeling for creating dense vector embeddings of free-text radiology reports. Our method leverages the benefits of both semantic-dictionary mapping and unsupervised learning. Using the vector representation, we automatically classify the radiology reports into three classes denoting confidence in the diagnosis of intracranial hemorrhage by the interpreting radiologist. We performed experiments with varying hyperparameter settings of the word embeddings and a range of different classifiers. The best performance achieved was a weighted precision of 88% and a weighted recall of 90%. Our work offers the potential to leverage unstructured electronic health record data by allowing direct analysis of narrative clinical notes. Comment: AMIA Annual Symposium 201

arXiv.org e-Print Archive

eScholarship - University of California

Mostly-Unsupervised Statistical Segmentation of Japanese Kanji Sequences

Author: Ando Rie Kubota
Lee Lillian
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 10/05/2002

Given the lack of word delimiters in written Japanese, word segmentation is generally considered a crucial first step in processing Japanese texts. Typical Japanese segmentation algorithms rely either on a lexicon and syntactic analysis or on pre-segmented data, but these are labor-intensive, and the lexico-syntactic techniques are vulnerable to the unknown word problem. In contrast, we introduce a novel, more robust statistical method utilizing unsegmented training data. Despite its simplicity, the algorithm yields performance on long kanji sequences comparable to and sometimes surpassing that of state-of-the-art morphological analyzers over a variety of error metrics. The algorithm also outperforms another mostly-unsupervised statistical algorithm previously proposed for Chinese. Additionally, we present a two-level annotation scheme for Japanese to incorporate multiple segmentation granularities, and introduce two novel evaluation metrics, both based on the notion of a compatible bracket, that can account for multiple granularities simultaneously. Comment: 22 pages. To appear in Natural Language Engineering

arXiv.org e-Print Archive

CiteSeerX

Crossref

Comprehensive Review of Opinion Summarization

Author: Ganesan Kavita
Kim Hyun Duk
Sondhi Parikshit
Zhai ChengXiang
Publication date: 01/01/2011

The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. People have introduced various techniques and paradigms for solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of the key weaknesses of these approaches. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that are left to be addressed, as this will help set the trend for future research in this area. Unpublished; not peer reviewed.

CiteSeerX

Illinois Digital Environment for Access to Learning and Scholarship Repository

Irregular speech rate dissociates auditory cortical entrainment, evoked responses, and frontal alpha

Author: Gross Joachim
Ince Robin A. A.
Kayser Christoph
Kayser Stephanie J.
Publication venue: 'Society for Neuroscience'
Publication date: 04/11/2015

The entrainment of slow rhythmic auditory cortical activity to the temporal regularities in speech is considered to be a central mechanism underlying auditory perception. Previous work has shown that entrainment is reduced when the quality of the acoustic input is degraded, but has also linked rhythmic activity at similar time scales to the encoding of temporal expectations. To understand these bottom-up and top-down contributions to rhythmic entrainment, we manipulated the temporal predictive structure of speech by parametrically altering the distribution of pauses between syllables or words, thereby rendering the local speech rate irregular while preserving intelligibility and the envelope fluctuations of the acoustic signal. Recording EEG activity in human participants, we found that this manipulation did not alter neural processes reflecting the encoding of individual sound transients, such as evoked potentials. However, the manipulation significantly reduced the fidelity of auditory delta (but not theta) band entrainment to the speech envelope. It also reduced left frontal alpha power, and this alpha reduction was predictive of the reduced delta entrainment across participants. Our results show that rhythmic auditory entrainment in the delta and theta bands reflects functionally distinct processes. Furthermore, they reveal that delta entrainment is under top-down control and likely reflects prefrontal processes that are sensitive to acoustical regularities rather than the bottom-up encoding of acoustic features.

Crossref

PubMed Central

Enlighten

Statistical Inferences for Polarity Identification in Natural Language

Author: Feuerriegel Stefan
Neumann Dirk
Pröllochs Nicolas
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2018

Information forms the basis for all human behavior, including the ubiquitous decision-making that people constantly perform in their everyday lives. It is thus the mission of researchers to understand how humans process information to reach decisions. In order to facilitate this task, this work proposes a novel method of studying the reception of granular expressions in natural language. The approach utilizes LASSO regularization as a statistical tool to extract decisive words from textual content and draw statistical inferences based on the correspondence between the occurrences of words and an exogenous response variable. Accordingly, the method immediately suggests significant implications for social sciences and Information Systems research: everyone can now identify text segments and word choices that are statistically relevant to authors or readers and, based on this knowledge, test hypotheses from behavioral research. We demonstrate the contribution of our method by examining how authors communicate subjective information through narrative materials. This allows us to answer the question of which words to choose when communicating negative information. On the other hand, we show that investors trade not only upon facts in financial disclosures but are distracted by filler words and non-informative language. Practitioners, for example those in the fields of investor communications or marketing, can exploit our insights to enhance their writings based on the true perception of word choice.
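
As a rough illustration of the idea in this abstract, the sketch below fits an L1-penalized (LASSO) linear model on bag-of-words counts so that words unrelated to the response receive a weight of exactly zero. It is a minimal sketch, not the authors' implementation: the coordinate-descent solver, the absence of an intercept, the function names, the penalty value, and the four toy documents with their response values are all assumptions made for illustration.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values within the penalty become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_word_weights(documents, response, penalty=0.1, sweeps=100):
    """Minimize (1/2n)*||y - Xw||^2 + penalty*||w||_1 by coordinate descent.

    X holds word counts per document (hypothetical bag-of-words features).
    No intercept is fitted, which is a simplification for this sketch.
    """
    vocab = sorted({word for doc in documents for word in doc.split()})
    counts = [[doc.split().count(word) for word in vocab] for doc in documents]
    n = len(documents)
    weights = [0.0] * len(vocab)
    for _ in range(sweeps):
        for j in range(len(vocab)):
            # Mean squared count of word j; positive because every
            # vocabulary word occurs in at least one document.
            scale = sum(row[j] ** 2 for row in counts) / n
            rho = 0.0
            for row, y in zip(counts, response):
                prediction = sum(w * x for w, x in zip(weights, row))
                # Residual with word j's own contribution added back.
                partial_residual = y - prediction + weights[j] * row[j]
                rho += row[j] * partial_residual
            weights[j] = soft_threshold(rho / n, penalty) / scale
    return dict(zip(vocab, weights))


# Invented toy data: the response could stand for any exogenous variable.
documents = ["the gain", "the gain the", "the loss", "the loss the"]
response = [1.0, 1.0, -1.0, -1.0]
word_weights = lasso_word_weights(documents, response)
decisive_words = [word for word, weight in word_weights.items() if weight != 0.0]
```

On this toy input the filler word "the" is dropped while "gain" and "loss" keep weights of opposite sign, which is the selection behavior the abstract relies on; the statistical inference step it mentions is not covered here.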

arXiv.org e-Print Archive

Repository for Publications and Research Data

Directory of Open Access Journals

FigShare

Learning and comparing functional connectomes across subjects

Author: Craddock R. C.
Varoquaux Gaël
Publication venue: 'Elsevier BV'
Publication date: 14/04/2013

Functional connectomes capture brain interactions via synchronized fluctuations in the functional magnetic resonance imaging signal. If measured during rest, they map the intrinsic functional architecture of the brain. With task-driven experiments they represent integration mechanisms between specialized brain areas. Analyzing their variability across subjects and conditions can reveal markers of brain pathologies and mechanisms underlying cognition. Methods of estimating functional connectomes from the imaging signal have undergone rapid developments and the literature is full of diverse strategies for comparing them. This review aims to clarify links across functional-connectivity methods as well as to set out the different steps needed to perform a group study of functional connectomes.
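
One simple way to picture "synchronized fluctuations" is a matrix of pairwise correlations between regional time series. The sketch below computes such a matrix with plain Pearson correlation. This is only an assumed, minimal estimator chosen for illustration, and the abstract does not say which estimators the review covers; the region names and signal values are invented.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally long, non-constant series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sum((x - mean_a) ** 2 for x in a)
    spread_b = sum((y - mean_b) ** 2 for y in b)
    return covariance / math.sqrt(spread_a * spread_b)


def correlation_connectome(region_signals):
    """Map each pair of region names to the correlation of their signals."""
    names = list(region_signals)
    return {
        (p, q): pearson(region_signals[p], region_signals[q])
        for p in names
        for q in names
    }


# Invented toy signals for three hypothetical regions.
region_signals = {
    "region_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "region_b": [2.0, 4.0, 6.0, 8.0, 10.0],  # moves with region_a
    "region_c": [5.0, 4.0, 3.0, 2.0, 1.0],   # moves against region_a
}
connectome = correlation_connectome(region_signals)
```

A group study would compute one such matrix per subject before comparing them; that comparison step is left out of this sketch.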

arXiv.org e-Print Archive

Crossref

HAL-Inserm

INRIA a CCSD electronic archive server

HAL-CEA

The extraction of the new components from electrogastrogram (EGG), using both adaptive filtering and electrocardiographic (ECG) derived respiration signal

Author: Komorowski Dariusz
Pietraszek Stanislaw
Provazník Ivo
Tkacz Ewaryst
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/06/2015

Electrogastrographic examination (EGG) is a noninvasive method for investigating stomach slow wave propagation. The typical frequency range of the EGG signal is 0.015 to 0.15 Hz (or 0.015 to 0.3 Hz), and the signal is usually captured with a sampling frequency not exceeding 4 Hz. In this paper a new method for recording EGG signals with a high sampling frequency (200 Hz) is proposed. The high sampling frequency allows collection of a signal that includes not only the EGG component but also signals from other organs of the digestive system, such as the duodenum and colon, as well as a signal connected with respiratory movements and, finally, the electrocardiographic signal (ECG). The presented method improves the quality of analysis of EGG signals by better suppressing respiratory disturbance and by extracting new components from high sampling electrogastrographic signals (HSEGG) obtained from the abdomen surface. The source of the required new signal components can be inner organs such as the duodenum and colon. One of the main problems that appears during analysis of EGG signals and extraction of signal components from inner organs is how to suppress the respiratory components. In this work an adaptive filtering method that requires a reference signal is proposed.
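
To make "an adaptive filtering method that requires a reference signal" concrete, the sketch below uses a least-mean-squares (LMS) noise canceller, a standard filter of that kind. The abstract does not name the algorithm, so LMS, the tap count, the step size, and all synthetic signals here are assumptions: a 0.05 Hz sine stands in for the gastric slow wave, a 0.25 Hz sine for respiration, and the 4 Hz sampling rate is taken from the conventional rate mentioned in the abstract.

```python
import math


def lms_cancel(primary, reference, n_taps=4, step=0.01):
    """Remove the part of primary that is linearly predictable from reference.

    The filter output estimates the interference; the error signal
    (primary minus estimate) is returned as the cleaned signal.
    """
    weights = [0.0] * n_taps
    cleaned = []
    for n in range(len(primary)):
        # Latest n_taps reference samples, zero-padded before the start.
        window = [reference[n - k] if n >= k else 0.0 for k in range(n_taps)]
        estimate = sum(w * x for w, x in zip(weights, window))
        error = primary[n] - estimate
        # LMS update: move the weights along the error-times-input direction.
        weights = [w + 2.0 * step * error * x for w, x in zip(weights, window)]
        cleaned.append(error)
    return cleaned


def mean_square(values):
    return sum(v * v for v in values) / len(values)


FS = 4.0   # sampling frequency in Hz
N = 2000   # number of samples (500 seconds)
t = [n / FS for n in range(N)]
slow_wave = [math.sin(2 * math.pi * 0.05 * x) for x in t]        # assumed EGG-like component
respiration_ref = [math.sin(2 * math.pi * 0.25 * x) for x in t]  # assumed reference signal
interference = [0.8 * math.sin(2 * math.pi * 0.25 * x + 0.5) for x in t]
primary = [s + r for s, r in zip(slow_wave, interference)]

cleaned = lms_cancel(primary, respiration_ref)

# Compare the residual disturbance over the last 500 samples, after adaptation.
tail = slice(N - 500, N)
error_before = mean_square([p - s for p, s in zip(primary[tail], slow_wave[tail])])
error_after = mean_square([c - s for c, s in zip(cleaned[tail], slow_wave[tail])])
```

The reference only needs to be correlated with the disturbance, not identical to it; the filter learns the gain and phase shift. In the setting described by the title, the reference would be the ECG-derived respiration signal rather than a synthetic sine.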

Springer - Publisher Connector

PubMed Central

Digital library of Brno University of Technology