Multi-document summarization based on atomic semantic events and their temporal relations
Automatic multi-document summarization (MDS) is the process of extracting the most important information, such as events and entities, from multiple natural language texts focused on the same topic. We extract all types of atomic semantic information and feed them to a topic model to experiment with their effects on a summary. We design a coherent summarization system by taking into account the relative positions of sentences in the original text.
Our generic MDS system outperforms the best recent multi-document summarization system on DUC 2004 in terms of ROUGE-1 recall and F-measure. Our query-focused summarization system achieves a statistically similar result to the state-of-the-art unsupervised system on the DUC 2007 query-focused MDS task in the ROUGE-2 recall measure. Update summarization is a newer form of MDS in which novel yet salient sentences are chosen as summary sentences, based on the assumption that the user has already read a given set of documents. In this thesis, we present an event-based update summarization approach in which novelty is detected from the temporal ordering of events and saliency is ensured by event and entity distribution. To our knowledge, no other study has deeply investigated the effects of novelty information acquired from the temporal ordering of events (assuming that a sentence contains one or more events) in the domain of update MDS. Our update MDS system outperforms the state-of-the-art update MDS system in terms of ROUGE-2 and ROUGE-SU4 recall measures. Our MDS systems also generate quality summaries, which are manually evaluated against popular evaluation criteria.
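To illustrate the mechanics, here is a minimal, hypothetical sketch of the novelty/salience interplay described above: a sentence counts as novel when its events are temporally ordered after everything in the already-read set, and plain frequency counting stands in for the thesis's event and entity distribution model. The `Sentence` structure and the scoring rule are assumptions for illustration only.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Sentence:
    text: str
    events: list        # event labels extracted from the sentence
    event_times: list   # one timestamp per event (temporal ordering)
    entities: list      # entity mentions in the sentence

def update_summary(read_docs, new_docs, k=3):
    """Pick k sentences from new_docs that are both salient and novel."""
    # Novelty: every event in the sentence must follow the latest event
    # the user has already read (temporal-ordering heuristic).
    latest_read = max((t for s in read_docs for t in s.event_times),
                      default=float("-inf"))
    # Salience: frequency of the sentence's events/entities in the new set
    # (a stand-in for the event and entity distribution model).
    freq = Counter(x for s in new_docs for x in s.events + s.entities)

    def score(s):
        novel = bool(s.event_times) and all(t > latest_read
                                            for t in s.event_times)
        return sum(freq[x] for x in s.events + s.entities) if novel else 0.0

    return sorted(new_docs, key=score, reverse=True)[:k]
```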
Automatic text summarisation using linguistic knowledge-based semantics
Text summarisation is the task of reducing a text document to a short substitute summary. Since the field's inception, almost all summarisation research to date has involved identifying and extracting the most important document/cluster segments, an approach called extraction. This typically involves scoring each document sentence according to a composite scoring function consisting of surface-level and semantic features. Enabling machines to analyse text features and understand their meaning potentially requires both text semantic analysis and equipping computers with external semantic knowledge. This thesis addresses extractive text summarisation by proposing a number of semantic and knowledge-based approaches. The work combines the high-quality semantic information in WordNet, the crowdsourced encyclopaedic knowledge in Wikipedia, and the manually crafted categorial variation in CatVar to improve summary quality. These improvements are accomplished through sentence-level morphological analysis and the incorporation of Wikipedia-based named-entity semantic relatedness while using heuristic algorithms. The study also investigates how sentence-level semantic analysis based on semantic role labelling (SRL), leveraged with background world knowledge, influences sentence textual similarity and text summarisation. The proposed sentence similarity and summarisation methods were evaluated on standard publicly available datasets such as the Microsoft Research Paraphrase Corpus (MSRPC), TREC-9 Question Variants, and the Document Understanding Conference 2002, 2005 and 2006 (DUC 2002, DUC 2005, DUC 2006) corpora. The project also uses Recall-Oriented Understudy for Gisting Evaluation (ROUGE) for the quantitative assessment of the proposed summarisers' performances. Results showed our systems to be effective compared with related state-of-the-art summarisation methods and baselines. Of the proposed summarisers, the SRL Wikipedia-based system demonstrated the best performance.
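To make the composite scoring idea concrete, here is a minimal sketch of an extractive sentence scorer of the kind described, assuming a pluggable `relatedness(a, b)` measure (for example WordNet path similarity or Wikipedia-based entity relatedness). The feature set and weights are illustrative assumptions, not the thesis's actual formulation.

```python
def sentence_score(sentence_tokens, doc_tokens, position, n_sentences,
                   relatedness, weights=(0.4, 0.3, 0.3)):
    """Composite extractive score: surface + semantic + position features.

    relatedness(a, b) -> float in [0, 1] is a pluggable semantic measure,
    e.g. WordNet path similarity or Wikipedia-based entity relatedness.
    """
    w_surf, w_sem, w_pos = weights
    sent_vocab = set(sentence_tokens)
    doc_vocab = set(doc_tokens)
    # Surface-level feature: token overlap with the document/cluster.
    surface = len(sent_vocab & doc_vocab) / max(len(sent_vocab), 1)
    # Semantic feature: best relatedness of each sentence token to the
    # document vocabulary, averaged over the sentence.
    semantic = sum(max((relatedness(t, v) for v in doc_vocab), default=0.0)
                   for t in sent_vocab) / max(len(sent_vocab), 1)
    # Position feature: sentences earlier in the document score higher.
    pos = 1.0 - position / max(n_sentences - 1, 1)
    return w_surf * surface + w_sem * semantic + w_pos * pos
```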
Deep Architectures for Visual Recognition and Description
In recent times, digital media content is inherently multimedia, comprising text, audio, image and video. Several outstanding Computer Vision (CV) problems are being successfully solved with the help of modern Machine Learning (ML) techniques. Plenty of research work has already been carried out in the fields of Automatic Image Annotation (AIA), Image Captioning and Video Tagging. Video Captioning, i.e., automatic description generation from digital video, however, is a different and complex problem altogether. This study compares various existing video captioning approaches available today and attempts to classify and analyse them based on different parameters, viz., the type of captioning method (generation/retrieval), the type of learning models employed, the desired length of the output description generated, etc. This dissertation also critically analyses the existing benchmark datasets used in various video captioning models and the evaluation metrics for assessing the final quality of the resultant video descriptions. A detailed study of important existing models, highlighting their comparative advantages as well as disadvantages, is also included.
In this study, a novel approach to video captioning on the Microsoft Video Description (MSVD) and Microsoft Video-to-Text (MSR-VTT) datasets is proposed, using supervised learning techniques to train a deep combinational framework that achieves better-quality video captioning by predicting semantic tags. We develop simple shallow CNNs (2D and 3D) as feature extractors, Deep Neural Networks (DNNs) and Bidirectional LSTMs (BiLSTMs) as tag prediction models, and a Recurrent Neural Network (RNN/LSTM) as the language model. The aim of the work was to provide an alternative route to generating captions from videos via semantic tag prediction, and to deploy simpler, shallower deep architectures with lower memory requirements, so that the approach is not memory intensive and the developed models remain stable and viable options as the scale of the data increases.
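The tag-then-caption pipeline described above can be sketched as follows. This is a hypothetical PyTorch illustration, assuming pooled CNN features as input; the module shapes, the multi-label tag head, and the way tags condition the LSTM's initial state are assumptions for illustration, not the dissertation's exact architecture.

```python
import torch
import torch.nn as nn

class TagPredictor(nn.Module):
    """Maps pooled 2D/3D CNN video features to semantic-tag probabilities."""
    def __init__(self, feat_dim, n_tags, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_tags), nn.Sigmoid(),  # multi-label tags
        )
    def forward(self, video_feats):            # (batch, feat_dim)
        return self.net(video_feats)

class CaptionLM(nn.Module):
    """LSTM language model conditioned on predicted tags + video features."""
    def __init__(self, vocab_size, feat_dim, n_tags, emb=256, hidden=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.init_proj = nn.Linear(feat_dim + n_tags, hidden)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)
    def forward(self, tokens, video_feats, tags):
        # Initialise the LSTM state from video features and predicted tags.
        h0 = torch.tanh(self.init_proj(torch.cat([video_feats, tags], dim=-1)))
        h0 = h0.unsqueeze(0)                    # (1, batch, hidden)
        c0 = torch.zeros_like(h0)
        out, _ = self.lstm(self.embed(tokens), (h0, c0))
        return self.out(out)                    # per-step vocabulary logits
```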
This study also successfully employed deep architectures such as the Convolutional Neural Network (CNN) for speeding up the automation of hand gesture recognition and classification for the sign language of the Indian classical dance form 'Bharatnatyam'. This hand gesture classification work is primarily aimed at (1) building a novel dataset of 2D single-hand gestures belonging to 27 classes, collected from (i) the Google search engine (Google Images), (ii) YouTube videos (dynamic, with background considered) and (iii) professional artists under staged environment constraints (plain backgrounds); (2) exploring the effectiveness of CNNs for identifying and classifying the single-hand gestures by optimizing the hyperparameters; and (3) evaluating the impact of transfer learning and double transfer learning, a novel concept explored for achieving higher classification accuracy.
Information fusion for automated question answering
Until recently, research efforts in automated Question Answering (QA) have mainly
focused on getting a good understanding of questions to retrieve correct answers. This
includes deep parsing, lookups in ontologies, question typing and machine learning
of answer patterns appropriate to question forms. In contrast, I have focused on the
analysis of the relationships between answer candidates as provided in open domain
QA on multiple documents. I argue that such candidates have intrinsic properties,
partly independent of the question, and those properties can be exploited to provide better quality and more user-oriented answers in QA. Information fusion refers to the technique of merging pieces of information from
different sources. In QA over free text, it is motivated by the frequency with which
different answer candidates are found in different locations, leading to a multiplicity
of answers. The reason for such multiplicity is, in part, the massive amount of data
used for answering, and also its unstructured and heterogeneous content: besides ambiguities in user questions leading to heterogeneity in extractions, systems have to deal
with redundancy, granularity and possible contradictory information. Hence the need
for answer candidate comparison. While frequency has proved to be a significant characteristic of a correct answer, I evaluate the value of other relationships characterizing answer variability and redundancy. Partially inspired by recent developments in multi-document summarization, I redefine the concept of "answer" within an engineering approach to QA based on the
Model-View-Controller (MVC) pattern of user interface design. An "answer model"
is a directed graph in which nodes correspond to entities projected from extractions
and edges convey relationships between such nodes. The graph represents the fusion
of information contained in the set of extractions. Different views of the answer model
can be produced, capturing the fact that the same answer can be expressed and presented in various ways: picture, video, sound, written or spoken language, or a formal
data structure. Within this framework, an answer is a structured object contained in the
model and retrieved by a strategy to build a particular view depending on the end user
(or task)'s requirements. I describe shallow techniques to compare entities and enrich the model by discovering four broad categories of relationships between entities in the model: equivalence, inclusion, aggregation and alternative. Quantitatively, answer candidate modeling improves answer extraction accuracy. It also proves to be more robust to incorrect answer
candidates than traditional techniques. Qualitatively, models provide meta-information
encoded by relationships that allow shallow reasoning to help organize and generate
the final output.
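A minimal sketch of such an answer model, using networkx: nodes are entities projected from extractions, edges carry one of the four relationship categories, and a view is computed by traversing the fused graph. The `compare` callable and the frequency-based view below are hypothetical stand-ins for the shallow comparison techniques and retrieval strategies described.

```python
import networkx as nx

RELATIONS = {"equivalence", "inclusion", "aggregation", "alternative"}

def build_answer_model(extractions, compare):
    """Fuse answer-candidate extractions into a directed answer model.

    extractions: iterable of (entity, source) pairs projected from text.
    compare(a, b): returns one of RELATIONS, or None if unrelated.
    """
    g = nx.DiGraph()
    for entity, source in extractions:
        if entity not in g:
            g.add_node(entity, sources=set())
        g.nodes[entity]["sources"].add(source)
    for a in g.nodes:
        for b in g.nodes:
            if a != b:
                rel = compare(a, b)
                if rel in RELATIONS:
                    g.add_edge(a, b, rel=rel)
    return g

def frequency_view(g, top=1):
    """One possible view: rank entities by support across sources,
    pooling support along 'equivalence' edges."""
    def support(n):
        merged = {n} | {m for m in g.successors(n)
                        if g.edges[n, m]["rel"] == "equivalence"}
        return len(set().union(*(g.nodes[m]["sources"] for m in merged)))
    return sorted(g.nodes, key=support, reverse=True)[:top]
```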
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them.
Comment: Published in Journal of AI Research (JAIR), volume 61, pp. 75-170. 118 pages, 8 figures, 1 table.
Extracting Temporal and Causal Relations between Events
Structured information resulting from temporal information processing is
crucial for a variety of natural language processing tasks, for instance to
generate timeline summarization of events from news documents, or to answer
temporal/causal-related questions about some events. In this thesis we present
a framework for an integrated temporal and causal relation extraction system.
We first develop a robust extraction component for each type of relations, i.e.
temporal order and causality. We then combine the two extraction components
into an integrated relation extraction system, CATENA---CAusal and Temporal
relation Extraction from NAtural language texts---, by utilizing the
presumption about event precedence in causality: causing events must happen BEFORE resulting events. Several resources and techniques to improve
our relation extraction systems are also discussed, including word embeddings
and training data expansion. Finally, we report our adaptation efforts of
temporal information processing for languages other than English, namely
Italian and Indonesian.
Comment: PhD Thesis.
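The precedence presumption lends itself to a simple post-processing rule over the two components' outputs. The sketch below is a simplified, hypothetical illustration of that interaction, assuming dictionary-shaped classifier outputs; CATENA's actual combination of its temporal and causal sieves is richer than this.

```python
def apply_precedence(temporal, causal):
    """Post-process classifier outputs with the precedence presumption:
    if e1 CAUSES e2, then e1 must be BEFORE e2 in the temporal graph.

    temporal: {(e1, e2): "BEFORE" | "AFTER" | ...} from the temporal component
    causal:   {(e1, e2): "CAUSE" | "NONE"} from the causal component
    """
    fixed = dict(temporal)
    for (e1, e2), rel in causal.items():
        if rel == "CAUSE":
            if fixed.get((e1, e2)) == "AFTER":
                # Causal link contradicts the temporal label; in this
                # simplified rule the causal evidence wins.
                fixed[(e1, e2)] = "BEFORE"
            elif (e1, e2) not in fixed:
                # Propagate the presumption to unlabelled pairs.
                fixed[(e1, e2)] = "BEFORE"
    return fixed

# Example: the causal link overrides a conflicting temporal label.
print(apply_precedence({("bomb", "damage"): "AFTER"},
                       {("bomb", "damage"): "CAUSE"}))
# {('bomb', 'damage'): 'BEFORE'}
```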
Unsupervised Graph-Based Similarity Learning Using Heterogeneous Features.
Relational data refers to data that contains explicit relations among objects. Nowadays, relational
data are universal and have a broad appeal in many different application domains. The
problem of estimating similarity between objects is a core requirement for many standard
Machine Learning (ML), Natural Language Processing (NLP) and Information Retrieval
(IR) problems such as clustering, classification, word sense disambiguation, etc. Traditional
machine learning approaches represent the data using simple, concise representations such
as feature vectors. While this works very well for homogeneous data, i.e., data with a single feature type such as text, it does not fully exploit the availability of different feature types. For example, scientific publications have text, citations, authorship information and venue information.
Each of the features can be used for estimating similarity. Representing such
objects has been a key issue in efficient mining (Getoor and Taskar, 2007). In this thesis,
we propose natural representations for relational data using multiple, connected layers of
graphs, one for each feature type. We also propose novel algorithms for estimating similarity using multiple heterogeneous features, and present novel algorithms for tasks like topic detection and music recommendation using the estimated similarity measure. We
demonstrate superior performance of the proposed algorithms (root mean squared error of
24.81 on the Yahoo! KDD Music recommendation data set and classification accuracy of 88% on the ACL Anthology Network data set) over many state-of-the-art algorithms,
such as Latent Semantic Analysis (LSA), Multiple Kernel Learning (MKL) and spectral
clustering, and baselines on large, standard data sets.
Ph.D. Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/89824/1/mpradeep_1.pd
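The layered representation suggests a simple way to fuse per-feature-type similarities. The sketch below is an illustrative stand-in, assuming one precomputed similarity matrix per graph layer, with a plain weighted sum in place of the thesis's unsupervised learning of the combination.

```python
import numpy as np

def fused_similarity(layers, weights=None):
    """Combine per-feature-type similarity matrices into one estimate.

    layers: dict of feature name -> (n x n) similarity matrix, one graph
    layer per feature type (text, citations, authorship, venue, ...).
    """
    names = list(layers)
    if weights is None:
        # Uniform weights as a simple default.
        weights = {name: 1.0 / len(names) for name in names}
    n = next(iter(layers.values())).shape[0]
    fused = np.zeros((n, n))
    for name in names:
        fused += weights[name] * layers[name]
    return fused

# Example: two objects described by a text layer and a citation layer.
text_sim = np.array([[1.0, 0.2], [0.2, 1.0]])
cite_sim = np.array([[1.0, 0.8], [0.8, 1.0]])
print(fused_similarity({"text": text_sim, "citations": cite_sim}))
```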
Research in the Language, Information and Computation Laboratory of the University of Pennsylvania
This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However, the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania.
It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as Combinatory Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition.
Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue it's easier than ever to do so: this document is accessible on the "information superhighway". Just call up http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html
In addition, you can find many of the papers referenced in the CLiFF Notes on the net. Most can be obtained by following links from the authors' abstracts in the web version of this report.
The abstracts describe the researchers' many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn.
Argumentative zoning: information extraction from scientific text
Let me tell you, writing a thesis is not always a barrel of laughs, and strange things can happen, too. For example, at the height of my thesis paranoia, I had a recurrent dream in which my cat Amy gave me detailed advice on how to restructure the thesis chapters, which was awfully nice of her. But I also had a lot of human help throughout this time, whether things were going fine or berserk. Most of all, I want to thank Marc Moens: I could not have had a better or more knowledgeable supervisor. He always took time for me, however busy he might have been, reading chapters thoroughly in two days. He both had the calmness of mind to give me lots of freedom in research, and the right judgement to guide me away, tactfully but determinedly, from the occasional catastrophe or other waiting along the way. He was great fun to work with and also became a good friend. My work has profited from the interdisciplinary, interactive and enlightened atmosphere at the Human Communication Centre and the Centre for Cognitive Science (which is now called something else). The Language Technology Group was a great place to work in, as my research was grounded in practical applications developed