12 research outputs found

    Machine learning algorithms and techniques for sentiment analysis in scientific paper reviews: a systematic literature review

    Sentiment analysis, also referred to as opinion mining, is an automated process for identifying and classifying subjective information, such as sentiments, in a piece of text, usually comments and reviews. Supported by machine learning algorithms, it can identify positive, neutral, or negative opinions and rank or classify them in order to draw conclusions or extract information. This paper therefore performs a systematic literature review to report the state of the art of machine learning techniques for sentiment analysis applied to reviews, comments, and evaluations of scientific papers. This work has been supported by IViSSEM: POCI-01-0145-FEDER-28284, COMPETE: POCI-01-0145-FEDER-007043 and FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2013.

    Classification of Radical Web Content in Indonesia using Web Content Mining and k-Nearest Neighbor Algorithm

    Radical content, in a procedural sense, is content that provokes violence and spreads hatred and anti-nationalism. The definition of radical content differs from country to country. In Indonesia, it is most closely associated with provocation and with ethnic and religious hatred, known in Indonesian as SARA. SARA content is very difficult to detect because of its large volume, its unstructured form, and noise that can lead to multiple interpretations. This problem can threaten religious unity and harmony. A system that can distinguish radical from non-radical content is therefore required. We propose a text mining approach that uses a DF threshold and Human Brain for feature extraction. The system is divided into several steps: data collection (part of preprocessing), text mining, feature selection, classification to group the data by class label, similarity calculation on the training data, and visualization of radical and non-radical content. The experimental results show that combining 10-fold cross-validation with k-Nearest Neighbor (kNN) classification achieves 66.37% accuracy with k = 7 [1]
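
    The classification steps named in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it reads "DF" as document frequency, assumes cosine similarity over raw term counts, and uses hypothetical function names (df_filter, knn_predict, cross_validate). The "Human Brain" feature step is not covered because the abstract does not describe it.

```python
import math
from collections import Counter

def df_filter(docs, min_df):
    """Keep terms that occur in at least min_df documents (assumed meaning of 'DF threshold')."""
    df = Counter(term for doc in docs for term in set(doc.split()))
    return {term for term, n in df.items() if n >= min_df}

def vectorize(doc, vocab):
    """Bag-of-words term counts restricted to the selected vocabulary."""
    return Counter(t for t in doc.split() if t in vocab)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (an assumed similarity measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(train, query, k):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(train, key=lambda pair: cosine(pair[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def cross_validate(docs, labels, k=7, folds=10, min_df=2):
    """Accuracy under simple interleaved k-fold splitting (fold f tests items f, f+folds, ...)."""
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, len(docs), folds))
        train_idx = [i for i in range(len(docs)) if i not in test_idx]
        # Select features on the training split only, to avoid leaking test data.
        vocab = df_filter([docs[i] for i in train_idx], min_df)
        train = [(vectorize(docs[i], vocab), labels[i]) for i in train_idx]
        for i in test_idx:
            correct += knn_predict(train, vectorize(docs[i], vocab), k) == labels[i]
    return correct / len(docs)
```

    Calling cross_validate(docs, labels) with the defaults mirrors the k = 7, 10-fold setting reported in the abstract; the toy tokenization by whitespace stands in for whatever preprocessing the paper actually used.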

    Leveraging Indexical Pragmatics (OFIP) for Search Engine: An Ontology-based Approach

    The relevance of search results is an important indicator of information retrieval performance. A domain-specific Search Engine (SE), distinct from a general web SE, focuses on a specific segment of online content and may increase the relevance of search results. Traditional methods to improve domain-specific SE precision depend heavily on query expansion, lexical analysis of texts, and large amounts of training data. These methods suffer from limited effectiveness and efficiency because expanded query terms and coarse language features bring in uncontrollable complexity and increase dimensionality. Our design, leveraging the integrated power of computational syntax, semantics, and indexical pragmatics, proposes an ontology-driven framework that is tailored to work in a dynamic Internet environment without large amounts of manually annotated training data. This article presents our design, which is essential for building a domain-specific SE, and its instantiation in the terrorism domain.

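    La lucha contra el terrorismo y la delincuencia organizada: Una visión desde la lingüística y la ingeniería del conocimiento [The fight against terrorism and organised crime: A view from linguistics and knowledge engineering]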

    The aim of Natural Language Processing is to create computational systems for the production and comprehension of language by machines. In this regard, symbolic approaches to language put forth conceptual models which represent both common and specialised knowledge. This paper describes the ontological modelling of the “collective criminal agent” and its implementation in FunGramKB, a knowledge base for language processing and artificial reasoning. More specifically, the study focuses on the conceptual definition of three terminological units from the domains of terrorism and organised crime: cartel, oriented cluster, and terrorist cell. The main assumption is that ontological modelling applied to language technologies can play a major role in combating a variety of security threats to today’s society. The goal of natural language processing is to create computational systems for language production and comprehension. A priority of this approach is to build conceptual models that allow human knowledge to be formalised. This article addresses the development of models that, containing lexical units from the domains of terrorism and organised crime, can be used with the FunGramKB knowledge base to carry out language processing and artificial reasoning tasks. The article starts from the concept of the “collective criminal agent” and illustrates the conceptual formalisation of the units cartel, oriented cluster, and terrorist cell. The conceptualisation of lexical units is a fundamental step towards developing applications that offer solutions to the various problems arising in professional settings and in society as a whole.

    Artificial and Natural Topic Detection in Online Social Networks

    Online Social Networks (OSNs), such as Twitter, offer attractive means of social interaction and communication, but also raise privacy and security issues. OSNs provide valuable information for marketing and competitiveness, drawn from users' posts and opinions stored in a huge volume of data covering many themes, topics, and subjects. To mine the topics discussed on an OSN, we present a novel application of the Louvain method to topic modeling, based on modularity-driven community detection in graphs. The proposed approach succeeded in finding topics in five different datasets composed of textual content from Twitter and YouTube. Another important contribution concerns texts posted by spammers: the behavior observed in the graph community architecture (density and degree) indicates the strength of a topic and allows it to be classified as natural or artificial, the latter being created by spammers on OSNs.
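
    For reference, the modularity score that the Louvain method greedily maximizes is usually written as below. The notation is the standard one and is not taken from this abstract: A_ij is the weight of the edge between nodes i and j, k_i is the weighted degree of node i, m is the total edge weight, and c_i is the community assigned to node i.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

    How the abstract's graph is built from posts (for example, which nodes and edge weights are used) is not stated, so any concrete construction would be an assumption.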

    An Empirical Approach for Extreme Behavior Identification through Tweets Using Machine Learning

    This research was supported by the Ministry of Trade, Industry & Energy (MOTIE, Korea) under the Industrial Technology Innovation Program (No. 10063130), the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2019R1A2C1006159), and MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2019-2016-0-00313) supervised by the IITP (Institute for Information & communications Technology Promotion), and the 2018 Yeungnam University Research Grant. Peer reviewed

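    Fraudes, violences et autres comportements déviants dans le sport professionnel et olympique : Opportunités et limites des sources ouvertes en ligne comme moyen de renseignement [Fraud, violence, and other deviant behaviour in professional and Olympic sport: Opportunities and limits of online open sources as a means of intelligence]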

    Online open sources are increasingly used as intelligence tools. This contribution explores how they can be used to study deviance in professional and Olympic sport. The study considered fraud committed on the field (doping, match-fixing, and eligibility fraud) and off it (corruption), violence on and off the field of play (hooliganism and terrorism), and other harmful behaviours by athletes, whether on or off the field. An online monitoring system was set up to collect articles published in English in 2016. Of the 775 cases identified, fraud (mainly doping and match-fixing cases) alone accounted for 85% of cases. In total, 87 countries are involved in this study, although certain subregions, namely Eastern Europe (17.4%), East Africa (16.1%), and Australia and New Zealand (10.2%), stand out with a higher concentration of cases. As for temporal patterns, cases are concentrated mainly in August and November for fraud, and in June and July for violence. These trends may suggest a seasonality of deviant behaviour. By focusing on biases related to language and source types, further research on the subject could contribute to systematic monitoring of the media representation of these phenomena around the world.

    Government and Nongovernmental Collaboration to Build Community Resiliency Against Terrorism in Oklahoma City

    The way communities build resiliency and prepare for acts of terrorism is ambiguous in the United States; best practices remain unclear. Due to mobility and advancements in communication technologies, individuals and organizations share information, incite anger, recruit, and act on ideological grievances with ease. Such grievances are bolstered by the political and social exclusion of disparate groups through poorly designed policies and ineffective government structures. Using a combination of social constructivism and systems thinking theories, this case study explored collaboration efforts between government agencies and nongovernment experts in Oklahoma City, OK, identifying best practices as a result of lessons learned following the 1995 bombing of the Alfred P. Murrah Federal Building. Data were acquired through public records related to the bombing, combined with a qualitative survey of 31 community leaders. These data were inductively coded and subjected to a thematic analysis procedure. Key findings indicate that while open communication with the community and increased coordination were suggested by participants, reports were kept internal to each agency and not widely shared or implemented effectively across the community. Sharing the identified best practices and acknowledging collaboration opportunities promotes positive social change by involving the broader community and building early resiliency to address ideological grievances and create more effective community counterterrorism plans.