The dynamics of reading development in L2 English for academic purposes
Using a mixed-methods approach, this study investigates the complex and dynamic developmental trajectories of English academic reading ability in 27 Chinese undergraduate Chemistry majors. Twelve parallel tests were designed, validated, and administered weekly during one semester. The analyses included a group pre-post design to measure academic reading gains, a regression analysis to predict the initial reading score with English proficiency and Chemistry knowledge as predictors, individual longitudinal case studies to measure variability and phase shifts, and a cluster analysis to discover (un)common developmental patterns. Finally, a qualitative study used interviews to identify difficulties in reading and strategies to overcome them. English proficiency predicted the initial reading score, and the group gained significantly in academic reading. Each learner showed a different non-linear pattern, and the cluster analysis revealed few similar patterns among learners. The high gainers showed relatively more variability over time and used more, more varied, and more sophisticated learning and reading strategies to improve.
Storia: Summarizing Social Media Content based on Narrative Theory using Crowdsourcing
People from all over the world use social media to share thoughts and
opinions about events, and understanding what people say through these channels
has been of increasing interest to researchers, journalists, and marketers
alike. However, while automatically generated summaries enable people to
consume large amounts of data efficiently, they do not provide the context
needed for a viewer to fully understand an event. Narrative structure can
provide templates for the order and manner in which this data is presented to
create stories that are oriented around narrative elements rather than
summaries made up of facts. In this paper, we use narrative theory as a
framework for identifying the links between social media content. To do this,
we designed crowdsourcing tasks to generate summaries of events based on
commonly used narrative templates. In a controlled study, for certain types of
events, people were more emotionally engaged with stories created with
narrative structure and were also more likely to recommend them to others
compared to summaries created without narrative structure.
A Corpus-Based Approach for Building Semantic Lexicons
Semantic knowledge can be a great asset to natural language processing
systems, but it is usually hand-coded for each application. Although some
semantic information is available in general-purpose knowledge bases such as
WordNet and Cyc, many applications require domain-specific lexicons that
represent words and categories for a particular topic. In this paper, we
present a corpus-based method that can be used to build semantic lexicons for
specific categories. The input to the system is a small set of seed words for a
category and a representative text corpus. The output is a ranked list of words
that are associated with the category. A user then reviews the top-ranked words
and decides which ones should be entered in the semantic lexicon. In
experiments with five categories, users typically found about 60 words per
category in 10-15 minutes to build a core semantic lexicon.
Comment: 8 pages - to appear in Proceedings of EMNLP-
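The workflow described in this abstract (seed words plus a corpus in, a ranked candidate list out) can be illustrated with a minimal sketch. The scoring rule below, the fraction of a word's occurrences that fall within a small window of a seed word, is an assumption chosen for illustration and is not claimed to be the authors' exact statistic; the function name `rank_candidates` and the `window` parameter are hypothetical.

```python
from collections import Counter


def rank_candidates(sentences, seeds, window=2):
    """Rank non-seed words by how often they occur near a seed word.

    Assumed score: (occurrences within `window` tokens of a seed)
    divided by (total occurrences). Illustrative only.
    """
    seeds = {s.lower() for s in seeds}
    near_seed = Counter()  # occurrences close to any seed word
    total = Counter()      # all occurrences in the corpus

    for sentence in sentences:
        tokens = sentence.lower().split()
        seed_positions = [i for i, t in enumerate(tokens) if t in seeds]
        for i, tok in enumerate(tokens):
            if tok in seeds:
                continue  # seeds are already in the lexicon
            total[tok] += 1
            if any(abs(i - p) <= window for p in seed_positions):
                near_seed[tok] += 1

    scores = {w: near_seed[w] / total[w] for w in near_seed}
    # Highest score first; ties broken by raw count, then alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], -near_seed[w], w))


if __name__ == "__main__":
    corpus = [
        "the rifle and pistol were seized",
        "a pistol and grenade were found",
        "the grenade exploded near the car",
    ]
    for word in rank_candidates(corpus, ["rifle", "grenade"])[:5]:
        print(word)
```

Because this sketch applies no part-of-speech or stopword filtering, function words can rank highly; that is consistent with the abstract's final step, in which a user reviews the top-ranked words and decides which ones enter the lexicon.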
Machine Learning at Microsoft with ML .NET
Machine Learning is transitioning from an art and science into a technology
available to every developer. In the near future, every application on every
platform will incorporate trained models to encode data-based decisions that
would be impossible for developers to author. This presents a significant
engineering challenge, since currently data science and modeling are largely
decoupled from standard software development processes. This separation makes
incorporating machine learning capabilities inside applications unnecessarily
costly and difficult, and furthermore discourages developers from embracing ML
in the first place. In this paper we present ML .NET, a framework developed at
Microsoft over the last decade in response to the challenge of making it easy
to ship machine learning models in large software applications. We present its
architecture, and illuminate the application demands that shaped it.
Specifically, we introduce DataView, the core data abstraction of ML .NET which
allows it to capture full predictive pipelines efficiently and consistently
across training and inference lifecycles. We close the paper with a
surprisingly favorable performance study of ML .NET compared to more recent
entrants, and a discussion of some lessons learned.
Information extraction and data mining from Chinese financial news.
Ng Anny. Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. Includes bibliographical references (leaves 139-142). Abstracts in English and Chinese.
Chapter 1. Introduction
  1.1 Problem Definition
  1.2 Thesis Organization
Chapter 2. Chinese Text Summarization Using Genetic Algorithm
  2.1 Introduction
  2.2 Related Work
  2.3 Genetic Algorithm Approach
    2.3.1 Fitness Function
    2.3.2 Genetic operators
  2.4 Implementation Details
  2.5 Experimental results
  2.6 Limitations and Future Work
  2.7 Conclusion
Chapter 3. Event Extraction from Chinese Financial News
  3.1 Introduction
  3.2 Method
    3.2.1 Data Set Preparation
    3.2.2 Positive Word
    3.2.3 Negative Word
    3.2.4 Window
    3.2.5 Event Extraction
  3.3 System Overview
  3.4 Implementation
    3.4.1 Event Type and Positive Word
    3.4.2 Company Name
    3.4.3 Negative Word
    3.4.4 Event Extraction
  3.5 Stock Database
    3.5.1 Stock Movements
    3.5.2 Implementation
    3.5.3 Stock Database Transformation
  3.6 Performance Evaluation
    3.6.1 Performance measures
    3.6.2 Evaluation
  3.7 Conclusion
Chapter 4. Mining Frequent Episodes
  4.1 Introduction
    4.1.1 Definitions
  4.2 Related Work
  4.3 Double-Part Event Tree for the database
    4.3.1 Complexity of tree construction
  4.4 Mining Frequent Episodes with the DE-tree
    4.4.1 Conditional Event Trees
    4.4.2 Single Path Conditional Event Tree
    4.4.3 Complexity of Mining Frequent Episodes with DE-Tree
    4.4.4 An Example
    4.4.5 Completeness of finding frequent episodes
  4.5 Implementation of DE-Tree
  4.6 Method 2: Node-List Event Tree
    4.6.1 Tree construction
    4.6.2 Order of Position Bits
  4.7 Implementation of NE-tree construction
    4.7.1 Complexity of NE-Tree Construction
  4.8 Mining Frequent Episodes with NE-tree
    4.8.1 Conditional NE-Tree
    4.8.2 Single Path Conditional NE-Tree
    4.8.3 Complexity of Mining Frequent Episodes with NE-Tree
    4.8.4 An Example
  4.9 Performance evaluation
    4.9.1 Synthetic data
    4.9.2 Real data
  4.10 Conclusion
Chapter 5. Mining N-most Interesting Episodes
  5.1 Introduction
  5.2 Method
    5.2.1 Threshold Improvement
    5.2.2 Pseudocode
  5.3 Experimental Results
    5.3.1 Synthetic Data
    5.3.2 Real Data
  5.4 Conclusion
Chapter 6. Mining Frequent Episodes with Event Constraints
  6.1 Introduction
  6.2 Method
  6.3 Experimental Results
    6.3.1 Synthetic Data
    6.3.2 Real Data
  6.4 Conclusion
Chapter 7. Conclusion
Chapter A. Test Cases
  A.1 Text 1
  A.2 Text 2
Bibliography
Knowledge-Enhanced Personalized Review Generation with Capsule Graph Neural Network
Personalized review generation (PRG) aims to automatically produce review
text reflecting user preference, which is a challenging natural language
generation task. Most previous studies do not explicitly model factual
description of products, tending to generate uninformative content. Moreover,
they mainly focus on word-level generation, but cannot accurately reflect more
abstractive user preference in multiple aspects. To address the above issues,
we propose a novel knowledge-enhanced PRG model based on capsule graph neural
network (Caps-GNN). We first construct a heterogeneous knowledge graph (HKG)
for utilizing rich item attributes. We adopt Caps-GNN to learn graph capsules
for encoding underlying characteristics from the HKG. Our generation process
contains two major steps, namely aspect sequence generation and sentence
generation. First, based on graph capsules, we adaptively learn aspect capsules
for inferring the aspect sequence. Then, conditioned on the inferred aspect
label, we design a graph-based copy mechanism to generate sentences by
incorporating related entities or words from HKG. To our knowledge, we are the
first to utilize a knowledge graph for the PRG task. The incorporated KG
information is able to enhance user preference at both aspect and word levels.
Extensive experiments on three real-world datasets have demonstrated the
effectiveness of our model on the PRG task.
Comment: Accepted by CIKM 2020 (Long Paper)