Social media mining for identification and exploration of health-related information from pregnant women
Widespread use of social media has led to the generation of substantial
amounts of information about individuals, including health-related information.
Social media provides the opportunity to study health-related information about
selected population groups who may be of interest for a particular study. In
this paper, we explore the possibility of utilizing social media to perform
targeted data collection and analysis from a particular population group --
pregnant women. We hypothesize that we can use social media to identify cohorts
of pregnant women and follow them over time to analyze crucial health-related
information. To identify potentially pregnant women, we employ simple
rule-based searches that attempt to detect pregnancy announcements with
moderate precision. To further filter out false positives and noise, we employ
a supervised classifier trained on a small amount of hand-annotated data. We then
collect their posts over time to create longitudinal health timelines and
attempt to divide the timelines into different pregnancy trimesters. Finally,
we assess the usefulness of the timelines by performing a preliminary analysis
to estimate drug intake patterns of our cohort at different trimesters. Our
rule-based cohort identification technique collected 53,820 users over thirty
months from Twitter. Our pregnancy announcement classification technique
achieved an F-measure of 0.81 for the pregnancy class, resulting in 34,895 user
timelines. Analysis of the timelines revealed that pertinent health-related
information, such as drug intake and adverse reactions, can be mined from the
data. Our approach to using user timelines in this fashion has produced very
encouraging results and can be employed for other important tasks where
cohorts, for which health-related information may not be available from other
sources, are required to be followed over time to derive population-based
estimates.
Comment: 9 page
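The first stage described in this abstract, a rule-based search for pregnancy announcements, can be sketched roughly as follows. This is a minimal illustration only: the abstract does not list the actual rules, so the patterns and the function name `is_candidate_announcement` are assumptions made for the example.

```python
import re

# Hypothetical announcement patterns; the abstract does not give the real rules.
ANNOUNCEMENT_PATTERNS = [
    re.compile(r"\bi(?:'m| am)\s+\d{1,2}\s+weeks\s+pregnant\b", re.IGNORECASE),
    re.compile(r"\bi(?:'m| am)\s+pregnant\b", re.IGNORECASE),
    re.compile(r"\bmy\s+due\s+date\s+is\b", re.IGNORECASE),
]


def is_candidate_announcement(text: str) -> bool:
    """Return True if any rule matches the post text.

    Matches are only candidates: rules like these have moderate precision,
    so a supervised classifier would still be needed to filter them.
    """
    return any(pattern.search(text) for pattern in ANNOUNCEMENT_PATTERNS)
```

A post such as "I am pregnant with ideas" would also match these rules, which is the kind of false positive the second, supervised filtering stage is meant to remove.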
Inferring Strategies for Sentence Ordering in Multidocument News Summarization
The problem of organizing information for multidocument summarization so that
the generated summary is coherent has received relatively little attention.
While sentence ordering for single document summarization can be determined
from the ordering of sentences in the input article, this is not the case for
multidocument summarization where summary sentences may be drawn from different
input articles. In this paper, we propose a methodology for studying the
properties of ordering information in the news genre and describe experiments
done on a corpus of multiple acceptable orderings we developed for the task.
Based on these experiments, we implemented a strategy for ordering information
that combines constraints from chronological order of events and topical
relatedness. Evaluation of our augmented algorithm shows a significant
improvement of the ordering over two baseline strategies.
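One simple way to combine chronological order with topical relatedness is sketched below. It is not the authors' algorithm, which the abstract does not specify: the `Sentence` type, its `topic` label, and the block-ordering rule are all assumptions chosen to show how the two constraints can interact.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Sentence:
    text: str
    event_date: date  # date of the event the sentence reports
    topic: str        # hypothetical topic label, e.g. from clustering


def order_sentences(sentences: list[Sentence]) -> list[Sentence]:
    """Keep topically related sentences together, ordered by time.

    Sentences are grouped into topic blocks, each block is sorted
    chronologically, and blocks are ordered by their earliest event.
    """
    blocks: dict[str, list[Sentence]] = {}
    for sentence in sentences:
        blocks.setdefault(sentence.topic, []).append(sentence)
    for block in blocks.values():
        block.sort(key=lambda s: s.event_date)
    ordered_blocks = sorted(blocks.values(), key=lambda b: b[0].event_date)
    return [s for block in ordered_blocks for s in block]
```

Under this sketch, a purely chronological baseline would interleave topics whenever their events overlap in time, while the grouped ordering keeps each topic contiguous.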
Timeline Generation: Tracking individuals on Twitter
In this paper, we propose an unsupervised framework to reconstruct a person's
life history by creating a chronological list of personal important events
(PIE) for individuals based on the tweets they published. By analyzing
individual tweet collections, we find that the tweets suitable for inclusion in
the personal timeline are those about personal (as opposed to
public) and time-specific (as opposed to time-general) topics. To further
extract these types of topics, we introduce a non-parametric multi-level
Dirichlet Process model to recognize four types of tweets: personal
time-specific (PersonTS), personal time-general (PersonTG), public
time-specific (PublicTS) and public time-general (PublicTG) topics, which, in
turn, are used for further personal event extraction and timeline generation.
To the best of our knowledge, this is the first work focused on the generation
of timelines for individuals from Twitter data. For evaluation, we have built a
new gold-standard set of timelines based on Twitter and Wikipedia that contains
PIE-related events from 20 ordinary Twitter users and 20 celebrities.
Experiments on real Twitter data quantitatively demonstrate the effectiveness
of our approach.
Towards Building a Knowledge Base of Monetary Transactions from a News Collection
We address the problem of extracting structured representations of economic
events from a large corpus of news articles, using a combination of natural
language processing and machine learning techniques. The developed techniques
allow for semi-automatic population of a financial knowledge base, which, in
turn, may be used to support a range of data mining and exploration tasks. The
key challenge we face in this domain is that the same event is often reported
multiple times, with varying correctness of details. We address this challenge
by first collecting all information pertinent to a given event from the entire
corpus, then considering all possible representations of the event, and
finally using a supervised learning method to rank these representations by
the associated confidence scores. A main innovative element of our approach is
that it jointly extracts and stores all attributes of the event as a single
representation (quintuple). Using a purpose-built test set we demonstrate that
our supervised learning approach can achieve 25% improvement in F1-score over
baseline methods that consider the earliest, the latest or the most frequent
reporting of the event.
Comment: Proceedings of the 17th ACM/IEEE-CS Joint Conference on Digital
Libraries (JCDL '17), 201
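The three baselines named in this abstract (earliest, latest, and most frequent reporting of an event) can be sketched as below. The `Report` type and the contents of the quintuple are hypothetical; the abstract says only that an event is stored as a single five-attribute representation.

```python
from collections import Counter
from typing import NamedTuple


class Report(NamedTuple):
    published: str    # ISO 8601 date string, so string order is date order
    quintuple: tuple  # five event attributes; the fields are assumed here


def earliest(reports: list[Report]) -> tuple:
    """Baseline 1: trust the first report of the event."""
    return min(reports, key=lambda r: r.published).quintuple


def latest(reports: list[Report]) -> tuple:
    """Baseline 2: trust the most recent report of the event."""
    return max(reports, key=lambda r: r.published).quintuple


def most_frequent(reports: list[Report]) -> tuple:
    """Baseline 3: trust the representation reported most often."""
    counts = Counter(r.quintuple for r in reports)
    return counts.most_common(1)[0][0]
```

The supervised approach in the abstract replaces these fixed rules with a learned ranking over all candidate representations, scored by confidence.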
Relationships between working memory, expressive vocabulary and arithmetical reasoning in children with and without intellectual disabilities.
This experiment examined the relationships between working memory and two measures of achievement, namely expressive vocabulary and arithmetical reasoning, in children with and without intellectual disabilities (ID). For 11-12-year-old children with intellectual disabilities, memory measures tapping the central executive were the most important predictors of both expressive vocabulary and arithmetical reasoning, with phonological memory making a small additional contribution to expressive vocabulary. For mainstream 11-12-year-old children, phonological memory was the best predictor of expressive vocabulary, whereas arithmetical reasoning ability was predicted by visual memory and, to a lesser extent, phonological memory. The third group of children, 7-8-year-old mainstream children, had been matched on mental age with the intellectual disability group. For these children, the most important predictor of expressive vocabulary was phonological memory, with a small additional contribution from visual memory. Arithmetical reasoning was best predicted by memory measures tapping the central executive, with an additional contribution from phonological memory. These results suggest that different working memory resources are used by children of varying ages and ability levels to carry out at least some cognitive tasks.