
    From Lexical to Semantic Features in Paraphrase Identification

    The task of paraphrase identification has been applied to diverse scenarios in Natural Language Processing, such as machine translation, summarization, or plagiarism detection. In this paper, we present a comparative study of the performance of lexical, syntactic, and semantic features on the task of paraphrase identification in the Microsoft Research Paraphrase Corpus. In our experiments, semantic features do not yield a gain in results, and syntactic features lead to the best results, but only when combined with lexical features.
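    A minimal sketch of the kind of lexical feature the study compares is token overlap between the two sentences. The Jaccard measure and the 0.5 threshold below are illustrative assumptions, not the paper's actual feature set or tuned values.

    ```python
    def jaccard_overlap(s1: str, s2: str) -> float:
        """Jaccard similarity between the token sets of two sentences."""
        t1, t2 = set(s1.lower().split()), set(s2.lower().split())
        if not t1 and not t2:
            return 1.0
        return len(t1 & t2) / len(t1 | t2)

    def is_paraphrase(s1: str, s2: str, threshold: float = 0.5) -> bool:
        """Label a pair as a paraphrase when lexical overlap exceeds a threshold."""
        return jaccard_overlap(s1, s2) >= threshold
    ```

    In practice such a lexical score would be one feature among many fed to a classifier, alongside syntactic and semantic features.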

    A hybrid approach for arabic semantic relation extraction

    Information retrieval applications are essential tools to manage the huge amount of information on the Web. Ontologies are of great importance in these applications: data belonging to a domain of interest are represented and related semantically in the ontology, which helps to navigate, manage, and reuse these data. Despite the growing need for ontologies, only a few works have addressed the Arabic language. Indeed, Arabic texts are highly ambiguous, especially when diacritics are absent. Besides, existing works do not cover all the types of semantic relations that are useful to structure Arabic ontologies. A lot of work has been done on cooccurrence-based techniques, which lead to over-generation. In this paper, we propose a new approach for Arabic semantic relation extraction. We use vocalized texts to reduce ambiguity and propose a new distributional approach for similarity calculus, which we compare to cooccurrence. We discuss our contribution through experimental results and propose some perspectives for future research.
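    The distributional idea the abstract contrasts with cooccurrence can be sketched as follows: represent each word by counts of the words appearing in a window around it, and compare two words by the cosine of their count vectors. The window size and the toy corpus format are assumptions for illustration, not the paper's actual similarity calculus.

    ```python
    import math
    from collections import Counter

    def context_vectors(sentences, window=2):
        """Map each word to a Counter of words seen within `window` tokens of it."""
        vectors = {}
        for sent in sentences:
            tokens = sent.split()
            for i, w in enumerate(tokens):
                ctx = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
                vectors.setdefault(w, Counter()).update(ctx)
        return vectors

    def cosine(c1: Counter, c2: Counter) -> float:
        """Cosine similarity between two sparse count vectors."""
        dot = sum(c1[k] * c2[k] for k in c1)
        n1 = math.sqrt(sum(v * v for v in c1.values()))
        n2 = math.sqrt(sum(v * v for v in c2.values()))
        return dot / (n1 * n2) if n1 and n2 else 0.0
    ```

    Words that occur in similar contexts (e.g. two animal names in parallel sentences) end up with similar vectors even if they never cooccur directly, which is what distinguishes this approach from raw cooccurrence counting.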

    Sentiment Lexicon Adaptation with Context and Semantics for the Social Web

    Sentiment analysis over social streams offers governments and organisations a fast and effective way to monitor the public's feelings towards policies, brands, businesses, etc. General-purpose sentiment lexicons have been used to compute sentiment from social streams, since they are simple and effective. They calculate the overall sentiment of texts by using a general collection of words with predetermined sentiment orientation and strength. However, a word's sentiment often varies with the context in which it appears, and new words might be encountered that are not covered by the lexicon, particularly in social media environments where content emerges and changes rapidly and constantly. In this paper, we propose a lexicon adaptation approach that uses contextual as well as semantic information extracted from DBPedia to update the words' weighted sentiment orientations and to add new words to the lexicon. We evaluate our approach on three different Twitter datasets and show that enriching the lexicon with contextual and semantic information improves sentiment computation by 3.4% in average accuracy and by 2.8% in average F1 measure.
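    A toy version of lexicon adaptation: nudge each word's prior sentiment weight toward the average sentiment of the texts it occurs in, and let unseen words enter the lexicon with a neutral prior. The update rule and the `rate` parameter are illustrative assumptions; the paper's actual method additionally uses semantic information from DBPedia, which this sketch omits.

    ```python
    def adapt_lexicon(lexicon, labelled_tweets, rate=0.5):
        """lexicon: word -> weight in [-1, 1]; labelled_tweets: (text, score) pairs.

        Returns a new lexicon where each observed word's weight is interpolated
        between its prior and the mean sentiment of the texts containing it.
        """
        observed = {}
        for text, score in labelled_tweets:
            for word in text.lower().split():
                observed.setdefault(word, []).append(score)
        adapted = dict(lexicon)
        for word, scores in observed.items():
            mean = sum(scores) / len(scores)
            prior = adapted.get(word, 0.0)  # new words start from a neutral prior
            adapted[word] = (1 - rate) * prior + rate * mean
        return adapted
    ```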

    A Novel ILP Framework for Summarizing Content with High Lexical Variety

    Summarizing content contributed by individuals can be challenging, because people make different lexical choices even when describing the same events. However, there remains a significant need to summarize such content. Examples include student responses to post-class reflective questions, product reviews, and news articles published by different news agencies about the same events. The high lexical diversity of these documents hinders a system's ability to effectively identify salient content and reduce summary redundancy. In this paper, we overcome this issue by introducing an integer linear programming-based summarization framework. It incorporates a low-rank approximation of the sentence-word co-occurrence matrix to intrinsically group semantically-similar lexical items. We conduct extensive experiments on datasets of student responses, product reviews, and news documents. Our approach compares favorably to a number of extractive baselines as well as a neural abstractive summarization system. The paper finally sheds light on when and why the proposed framework is effective at summarizing content with high lexical variety.
    Comment: Accepted for publication in the journal of Natural Language Engineering, 201
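    The low-rank approximation step mentioned above can be sketched with a truncated SVD: projecting the sentence-word co-occurrence matrix onto its top-k singular directions maps semantically-similar lexical items into a shared latent space. The rank `k` is an illustrative choice, and the ILP selection stage from the paper is omitted here.

    ```python
    import numpy as np

    def low_rank_approx(A: np.ndarray, k: int) -> np.ndarray:
        """Best rank-k approximation of A in the least-squares sense (Eckart-Young)."""
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        return U[:, :k] * s[:k] @ Vt[:k, :]
    ```

    In a summarizer, the smoothed matrix replaces the raw counts, so two sentences sharing no words but similar latent profiles are no longer scored as unrelated.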

    Feature Selection of Network Intrusion Data using Genetic Algorithm and Particle Swarm Optimization

    This paper describes the advantages of using Evolutionary Algorithms (EA) for feature selection on a network intrusion dataset. Most current Network Intrusion Detection Systems (NIDS) are unable to detect intrusions in real time because of the high-dimensional data produced during daily operation. Extracting knowledge from huge data such as intrusion data requires a new approach. The more complex the datasets, the higher the computation time and the harder they are to interpret and analyze. This paper investigates the performance of feature selection algorithms on network intrusion data. We used Genetic Algorithms (GA) and Particle Swarm Optimization (PSO) as feature selection algorithms. When applied to network intrusion datasets, both GA and PSO significantly reduced the number of features. Our experiments show that GA successfully reduces the number of attributes from 41 to 15, while PSO reduces the number of attributes from 41 to 9. Using k-Nearest Neighbour (k-NN) as a classifier, the GA-reduced dataset, which consists of 37% of the original attributes, improves accuracy from 99.28% to 99.70%, and its execution time is also 4.8 times faster than that of the original dataset. Using the same classifier, the PSO-reduced dataset, which consists of 22% of the original attributes, has the fastest execution time (7.2 times faster than that of the original dataset). However, its accuracy is slightly reduced, by 0.02%, from 99.28% to 99.26%. Overall, both GA and PSO are good feature selection techniques, because they have shown very good performance in reducing the number of features significantly while maintaining, and sometimes improving, the classification accuracy, as well as reducing the computation time.
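    The GA side of the approach can be sketched as follows: individuals are binary masks over the feature indices, and fitness rewards subsets that score well while staying small. The toy fitness function, population size, and operators below are illustrative assumptions; the paper evaluates candidate subsets with a k-NN classifier on the intrusion data instead.

    ```python
    import random

    def ga_select(n_features, fitness, pop_size=20, generations=40, seed=0):
        """Evolve binary feature masks toward higher fitness.

        Uses truncation selection, one-point crossover, and point mutation;
        survivors are carried over unchanged (elitism).
        """
        rng = random.Random(seed)
        pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=fitness, reverse=True)
            survivors = pop[: pop_size // 2]           # truncation selection
            children = []
            while len(survivors) + len(children) < pop_size:
                a, b = rng.sample(survivors, 2)
                cut = rng.randrange(1, n_features)     # one-point crossover
                child = a[:cut] + b[cut:]
                i = rng.randrange(n_features)          # point mutation
                child[i] = 1 - child[i]
                children.append(child)
            pop = survivors + children
        return max(pop, key=fitness)
    ```

    Swapping the toy fitness for "k-NN accuracy minus a penalty on subset size" gives the wrapper-style selection the paper describes; PSO differs mainly in how candidate masks are updated between iterations.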

    Impact of a Non-Traditional Research Approach

    Construction Management research has not been successful in changing the practices of the construction industry. The method of receiving grants and the peer-review paper system that academics rely on to achieve promotion do not align with academic researchers becoming experts who can bring change to industry practices. Poor construction industry performance has been documented for the past 25 years in the international construction management field. However, after 25 years and billions of dollars of research investment, the solution remains elusive. Research has shown that very few researchers have a hypothesis, run cycles of research tests in the industry, and succeed in changing industry practices. The most impactful research identified in this thesis has led to the conclusions that pre-planning is critical, that hiring contractors who have expertise results in better performance, and that risk is mitigated when the supply chain partners work together and expertise is utilized at the beginning of projects. The problems with construction non-performance have persisted. Legal contract issues have become more important. Traditional research approaches have not identified the severity and the source of construction non-performance. The problem seems to be as complex as ever. The construction industry practices and the academic research community remain in silos. This research proposes that the problem may lie in the traditional construction management research structure and methodology. The research has identified a unique non-traditional research program that has documented over 1700 industry tests, which have resulted in a decrease in client management by up to 79%, contractors adding value by up to 38%, increased customer satisfaction by up to 140%, reduced change order rates as low as -0.6%, and a decreased cost of services by up to 31%.
    The purpose of this thesis is to document the performance of the non-traditional research program around the above identified results. The documentation of such an effort will shed more light on what is required for a sustainable, industry-impacting, and academic-expert-based research program.
    Dissertation/Thesis: Masters Thesis Construction 201

    Data analytics 2016: proceedings of the fifth international conference on data analytics
