22,939 research outputs found
Dynamics of the Corruption Eradication in Indonesia
The paper discusses an important aspect of the complexity of corruption eradication in Indonesia. Corruption eradication is practically not merely about law enforcement, but also related to social, economic, and political aspects of the nation. By extracting the data from national news media and implement models describing the sentiment relations among political actors, the connection between balance of the sentiment among political elites and the critical levels of the investigation and law enforcement is apparently demonstrated. The focus group discussions among experts, practitioners, and social activists confirm the model
Aspect-Based Sentiment Analysis Using a Two-Step Neural Network Architecture
The World Wide Web holds a wealth of information in the form of unstructured
texts such as customer reviews for products, events and more. By extracting and
analyzing the expressed opinions in customer reviews in a fine-grained way,
valuable opportunities and insights for customers and businesses can be gained.
We propose a neural network based system to address the task of Aspect-Based
Sentiment Analysis to compete in Task 2 of the ESWC-2016 Challenge on Semantic
Sentiment Analysis. Our proposed architecture divides the task in two subtasks:
aspect term extraction and aspect-specific sentiment extraction. This approach
is flexible in that it allows to address each subtask independently. As a first
step, a recurrent neural network is used to extract aspects from a text by
framing the problem as a sequence labeling task. In a second step, a recurrent
network processes each extracted aspect with respect to its context and
predicts a sentiment label. The system uses pretrained semantic word embedding
features which we experimentally enhance with semantic knowledge extracted from
WordNet. Further features extracted from SenticNet prove to be beneficial for
the extraction of sentiment labels. As the best performing system in its
category, our proposed system proves to be an effective approach for the
Aspect-Based Sentiment Analysis
Link prediction in very large directed graphs: Exploiting hierarchical properties in parallel
Link prediction is a link mining task that tries to find new edges within a given graph. Among the targets of link prediction there is large directed graphs, which are frequent structures nowadays. The typical sparsity of large graphs demands of high precision predictions in order to obtain usable results. However, the size of those graphs only permits the execution of scalable algorithms. As a trade-off between those two problems we recently proposed a link prediction algorithm for directed graphs that exploits hierarchical properties. The algorithm can be classified as a local score, which entails scalability. Unlike the rest of local scores, our proposal assumes the existence of an underlying model for the data which allows it to produce predictions with a higher precision. We test the validity of its hierarchical assumptions on two clearly hierarchical data sets, one of them based on RDF. Then we test it on a non-hierarchical data set based on Wikipedia to demonstrate its broad applicability. Given the computational complexity of link prediction in very large graphs we also introduce some general recommendations useful to make of link prediction an efficiently parallelized problem.Peer ReviewedPostprint (published version
Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provided a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques allow
to gather a large amount of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users and this
offers unprecedented opportunities to analyze human behavior at a very large
scale. We discuss also the potential of cross-fertilization, i.e., on the
possibility of re-using Web Data Extraction techniques originally designed to
work in a given domain, in other domains.Comment: Knowledge-based System
Recommended from our members
Proceedings of QG2010: The Third Workshop on Question Generation
These are the peer-reviewed proceedings of "QG2010, The Third Workshop on Question Generation". The workshop included a special track for "QGSTEC2010: The First Question Generation Shared Task and Evaluation Challenge".
QG2010 was held as part of The Tenth International Conference on Intelligent Tutoring Systems (ITS2010)
Web 2.0, language resources and standards to automatically build a multilingual named entity lexicon
This paper proposes to advance in the current state-of-the-art of automatic Language Resource (LR) building by taking into consideration three elements: (i) the knowledge available in existing LRs, (ii) the vast amount of information available from the collaborative paradigm that has emerged from the Web 2.0 and (iii) the use of standards to improve interoperability. We present a case study in which a set of LRs for different languages (WordNet for English and Spanish and Parole-Simple-Clips for Italian) are
extended with Named Entities (NE) by exploiting Wikipedia and the aforementioned LRs. The practical result is a multilingual NE lexicon connected to these LRs and to two ontologies: SUMO and SIMPLE. Furthermore, the paper addresses an important problem which affects the Computational Linguistics area in the present, interoperability, by making use of the ISO LMF standard to encode this lexicon. The different steps of the procedure (mapping, disambiguation, extraction, NE identification and postprocessing) are comprehensively explained and evaluated. The resulting resource contains 974,567, 137,583 and 125,806 NEs for English, Spanish and Italian respectively. Finally, in order to check the usefulness of the constructed resource, we apply it into a state-of-the-art Question Answering system and evaluate its impact; the NE lexicon improves the system’s accuracy by 28.1%. Compared to previous approaches to build NE repositories, the current proposal represents a step forward in terms of automation, language independence, amount of NEs acquired and richness of the information represented
- …