22,212 research outputs found
Intelligent Word Embeddings of Free-Text Radiology Reports
Radiology reports are a rich resource for advancing deep learning
applications in medicine by leveraging the large volume of data continuously
being updated, integrated, and shared. However, there are significant
challenges as well, largely due to the ambiguity and subtlety of natural
language. We propose a hybrid strategy that combines semantic-dictionary
mapping and word2vec modeling for creating dense vector embeddings of free-text
radiology reports. Our method leverages the benefits of both
semantic-dictionary mapping as well as unsupervised learning. Using the vector
representation, we automatically classify the radiology reports into three
classes denoting confidence in the diagnosis of intracranial hemorrhage by the
interpreting radiologist. We performed experiments with varying hyperparameter
settings of the word embeddings and a range of different classifiers. Best
performance achieved was a weighted precision of 88% and weighted recall of
90%. Our work offers the potential to leverage unstructured electronic health
record data by allowing direct analysis of narrative clinical notes.
Comment: AMIA Annual Symposium 201
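The weighted precision and recall quoted in this abstract can be illustrated with a minimal sketch. It assumes the common support-weighted definition, where each class's score is weighted by its share of the true labels; the abstract does not state which definition the authors used. The class names and the toy labels are hypothetical.

```python
from collections import Counter


def weighted_precision_recall(y_true, y_pred):
    """Return support-weighted precision and recall over all true classes."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    precision = 0.0
    recall = 0.0
    for label, count in support.items():
        true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        class_precision = true_pos / predicted if predicted else 0.0
        class_recall = true_pos / count
        weight = count / total  # class share of the true labels
        precision += weight * class_precision
        recall += weight * class_recall
    return precision, recall


# Hypothetical three-class toy data (confidence levels are illustrative only).
y_true = ["high", "high", "low", "low", "none", "none"]
y_pred = ["high", "low", "low", "low", "none", "high"]
precision, recall = weighted_precision_recall(y_true, y_pred)
print(f"weighted precision={precision:.3f}, weighted recall={recall:.3f}")
```

Under this definition the weighted recall equals plain accuracy, which is a quick sanity check on any implementation.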
Mostly-Unsupervised Statistical Segmentation of Japanese Kanji Sequences
Given the lack of word delimiters in written Japanese, word segmentation is
generally considered a crucial first step in processing Japanese texts. Typical
Japanese segmentation algorithms rely either on a lexicon and syntactic
analysis or on pre-segmented data; but these are labor-intensive, and the
lexico-syntactic techniques are vulnerable to the unknown word problem. In
contrast, we introduce a novel, more robust statistical method utilizing
unsegmented training data. Despite its simplicity, the algorithm yields
performance on long kanji sequences comparable to and sometimes surpassing that
of state-of-the-art morphological analyzers over a variety of error metrics.
The algorithm also outperforms another mostly-unsupervised statistical
algorithm previously proposed for Chinese.
Additionally, we present a two-level annotation scheme for Japanese to
incorporate multiple segmentation granularities, and introduce two novel
evaluation metrics, both based on the notion of a compatible bracket, that can
account for multiple granularities simultaneously.
Comment: 22 pages. To appear in Natural Language Engineering
Comprehensive Review of Opinion Summarization
The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. People have introduced various techniques and paradigms for solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of the key weaknesses of these approaches. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that are left to be addressed, as this will help set the trend for future research in this area.
Unpublished; not peer reviewed
Irregular speech rate dissociates auditory cortical entrainment, evoked responses, and frontal alpha
The entrainment of slow rhythmic auditory cortical activity to the temporal regularities in speech is considered to be a central mechanism underlying auditory perception. Previous work has shown that entrainment is reduced when the quality of the acoustic input is degraded, but has also linked rhythmic activity at similar time scales to the encoding of temporal expectations. To understand these bottom-up and top-down contributions to rhythmic entrainment, we manipulated the temporal predictive structure of speech by parametrically altering the distribution of pauses between syllables or words, thereby rendering the local speech rate irregular while preserving intelligibility and the envelope fluctuations of the acoustic signal. Recording EEG activity in human participants, we found that this manipulation did not alter neural processes reflecting the encoding of individual sound transients, such as evoked potentials. However, the manipulation significantly reduced the fidelity of auditory delta (but not theta) band entrainment to the speech envelope. It also reduced left frontal alpha power, and this alpha reduction was predictive of the reduced delta entrainment across participants. Our results show that rhythmic auditory entrainment in the delta and theta bands reflects functionally distinct processes. Furthermore, they reveal that delta entrainment is under top-down control and likely reflects prefrontal processes that are sensitive to acoustical regularities rather than the bottom-up encoding of acoustic features.
Statistical Inferences for Polarity Identification in Natural Language
Information forms the basis for all human behavior, including the ubiquitous
decision-making that people constantly perform in their everyday lives. It is
thus the mission of researchers to understand how humans process information to
reach decisions. In order to facilitate this task, this work proposes a novel
method of studying the reception of granular expressions in natural language.
The approach utilizes LASSO regularization as a statistical tool to extract
decisive words from textual content and draw statistical inferences based on
the correspondence between the occurrences of words and an exogenous response
variable. Accordingly, the method immediately suggests significant implications
for social sciences and Information Systems research: everyone can now identify
text segments and word choices that are statistically relevant to authors or
readers and, based on this knowledge, test hypotheses from behavioral research.
We demonstrate the contribution of our method by examining how authors
communicate subjective information through narrative materials. This allows us
to answer the question of which words to choose when communicating negative
information. On the other hand, we show that investors trade not only upon
facts in financial disclosures but are also distracted by filler words and
non-informative language. Practitioners - for example those in the fields of
investor communications or marketing - can exploit our insights to enhance
their writings based on the true perception of word choice.
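The idea of using LASSO to pick out decisive words can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes a plain coordinate-descent solver for the objective (1/2n)·||y − Xw||² + alpha·||w||₁, no intercept, and a response that is already centered. The vocabulary, word counts, response values, and the `alpha` setting are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z else 0.0
    return w


# Hypothetical word counts per document and a centered response per document.
vocabulary = ["loss", "growth", "the"]
X = [[2, 0, 3], [0, 2, 3], [1, 0, 2], [0, 1, 2], [0, 0, 3], [1, 1, 3]]
y = [-2.0, 2.0, -1.0, 1.0, 0.0, 0.0]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [word for word, w in zip(vocabulary, weights) if w != 0.0]
print(dict(zip(vocabulary, weights)), selected)
```

In this toy data the filler word "the" receives a coefficient of exactly zero while the two content words keep shrunken, opposite-signed coefficients, which is the selection behavior the abstract relies on.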
Learning and comparing functional connectomes across subjects
Functional connectomes capture brain interactions via synchronized
fluctuations in the functional magnetic resonance imaging signal. If measured
during rest, they map the intrinsic functional architecture of the brain. With
task-driven experiments they represent integration mechanisms between
specialized brain areas. Analyzing their variability across subjects and
conditions can reveal markers of brain pathologies and mechanisms underlying
cognition. Methods of estimating functional connectomes from the imaging signal
have undergone rapid developments and the literature is full of diverse
strategies for comparing them. This review aims to clarify links across
functional-connectivity methods as well as to expose different steps to perform
a group study of functional connectomes.
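One simple way to estimate a functional connectome is the pairwise Pearson correlation between regional time series. The sketch below assumes that estimator purely for illustration; the review covers several others. The region names and signal values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_connectome(signals):
    """Map each ordered pair of regions to the correlation of their signals."""
    regions = list(signals)
    return {(a, b): pearson(signals[a], signals[b]) for a in regions for b in regions}


# Hypothetical time series for three brain regions.
signals = {
    "region_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "region_b": [2.0, 4.0, 6.0, 8.0, 10.0],
    "region_c": [5.0, 4.0, 3.0, 2.0, 1.0],
}
connectome = correlation_connectome(signals)
print(connectome[("region_a", "region_b")], connectome[("region_a", "region_c")])
```

A group study would compute one such matrix per subject and then compare the matrices across subjects or conditions.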
The extraction of the new components from electrogastrogram (EGG), using both adaptive filtering and electrocardiographic (ECG) derived respiration signal
Electrogastrographic examination (EGG) is a noninvasive method for investigating the propagation of stomach slow waves. The typical frequency range of the EGG signal is 0.015 to 0.15 Hz (or 0.015–0.3 Hz), and the signal is usually captured with a sampling frequency not exceeding 4 Hz. In this paper, a new method for recording EGG signals with a high sampling frequency (200 Hz) is proposed. The high sampling frequency allows the collection of a signal that includes not only the EGG component but also signals from other organs of the digestive system, such as the duodenum and colon, as well as signals connected with respiratory movements and, finally, the electrocardiographic (ECG) signal. The presented method improves the quality of EGG signal analysis by better suppressing respiratory disturbance and by extracting new components from high-sampling electrogastrographic signals (HSEGG) obtained from the abdominal surface. The source of the required new signal components can be inner organs such as the duodenum and colon. One of the main problems that arises when analyzing EGG signals and extracting signal components from inner organs is how to suppress the respiratory components. In this work, an adaptive filtering method that requires a reference signal is proposed.
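Adaptive filtering with a reference signal can be sketched with a least-mean-squares (LMS) noise canceller. This is an assumed, generic technique; the abstract does not say which adaptive algorithm the authors used. The signals are synthetic: a 0.05 Hz slow wave, a 0.25 Hz respiratory disturbance, and a respiration reference standing in for an ECG-derived respiration signal. The 2 Hz sampling rate, tap count, and step size are illustrative choices, not values from the paper.

```python
import math


def lms_cancel(primary, reference, n_taps=4, step=0.01):
    """Subtract the part of `primary` that an LMS filter can predict from `reference`."""
    weights = [0.0] * n_taps
    cleaned = []
    for n in range(len(primary)):
        # Most recent n_taps reference samples, zero-padded at the start.
        window = [reference[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        estimate = sum(w * x for w, x in zip(weights, window))
        error = primary[n] - estimate  # the error is the cleaned output sample
        weights = [w + 2 * step * error * x for w, x in zip(weights, window)]
        cleaned.append(error)
    return cleaned


# Synthetic signals (all parameters are assumptions for illustration).
fs = 2.0  # samples per second
t = [i / fs for i in range(1200)]
slow_wave = [math.sin(2 * math.pi * 0.05 * s) for s in t]
respiration = [0.8 * math.sin(2 * math.pi * 0.25 * s + 0.5) for s in t]
primary = [a + b for a, b in zip(slow_wave, respiration)]
reference = [math.sin(2 * math.pi * 0.25 * s) for s in t]

cleaned = lms_cancel(primary, reference)


def mse(a, b):
    """Mean squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


half = len(t) // 2  # skip the adaptation transient
mse_before = mse(primary[half:], slow_wave[half:])
mse_after = mse(cleaned[half:], slow_wave[half:])
print(mse_before, mse_after)
```

Because the reference correlates only with the respiratory component, the filter learns to cancel that component and leaves the slow wave largely intact.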