2,257 research outputs found

    Knowledge Modelling and Learning through Cognitive Networks

    One of the most promising developments in modelling knowledge is cognitive network science, which aims to investigate cognitive phenomena driven by the networked, associative organization of knowledge. For example, investigating the structure of semantic memory via semantic networks has illuminated how memory recall patterns influence phenomena such as creativity, memory search, learning, and more generally, knowledge acquisition, exploration, and exploitation. In parallel, neural network models for artificial intelligence (AI) are also becoming more widespread as inferential models for understanding which features drive language-related phenomena such as meaning reconstruction, stance detection, and emotional profiling. Whereas cognitive networks map explicitly which entities engage in associative relationships, neural networks perform an implicit mapping of correlations in cognitive data as weights, obtained after training over labelled data and whose interpretation is not immediately evident to the experimenter. This book aims to bring together quantitative, innovative research that focuses on modelling knowledge through cognitive and neural networks to gain insight into mechanisms driving cognitive processes related to knowledge structuring, exploration, and learning. The book comprises a variety of publication types, including reviews and theoretical papers, empirical research, computational modelling, and big data analysis. All papers here share a commonality: they demonstrate how the application of network science and AI can extend and broaden cognitive science in ways that traditional approaches cannot

    Sentiment Analysis on Twitter Data and Social Trends: The Case of Greek General Elections

    Sentiment analysis and opinion mining use natural language processing and related techniques (machine learning, lexicons) to identify and extract subjective information from text data. They are commonly used to determine the overall sentiment of a text, such as whether it is positive, negative, or neutral. The purpose of this thesis is to analyze sentiment in Twitter data. More specifically, a lexicon-based approach was implemented to analyze sentiment in tweets related to the 2019 general elections in Greece. The tweets are in Greek and are classified as positive, negative, or neutral based on the overall sentiment they express. Sentiment analysis of the datasets, implemented in Python, supports conclusions about the social trends that developed on pre-election Twitter in relation to the six (6) political parties that elected Members of Parliament (MPs) in the 2019 elections. The results are presented as visualizations built with Tableau to give a clearer and more complete understanding. In addition to describing the implementation, the thesis presents the main challenges, limitations, and difficulties encountered in processing the Greek language, along with aspects of sentiment analysis and opinion mining that need improvement, both in the proposed application and in other existing ones.
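A lexicon-based classifier of the kind this abstract describes can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the four-word lexicon, the scoring rule (sum of word polarities), and the function names are all assumptions made for the example.

```python
import re
import unicodedata

def _norm(text):
    """Lowercase and NFC-normalize so accented Greek letters compare equal."""
    return unicodedata.normalize("NFC", text).lower()

# Hypothetical toy lexicon (word -> polarity). A real Greek lexicon would be
# far larger and would need stemming or lemmatization to cope with inflection.
LEXICON = {
    _norm(word): score
    for word, score in {
        "καλός": 1,    # "good"
        "ελπίδα": 1,   # "hope"
        "κακός": -1,   # "bad"
        "ψέμα": -1,    # "lie"
    }.items()
}

def tokenize(text):
    """Split normalized text into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", _norm(text))

def classify(text, lexicon=LEXICON):
    """Sum the polarity of known words and map the total to a label."""
    score = sum(lexicon.get(token, 0) for token in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Words missing from the lexicon contribute nothing, so a tweet with no known words comes out as neutral. Exact-match lookup misses inflected forms, which is one way the Greek-language difficulties mentioned in the abstract show up in practice.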

    Cappadocian kinship

    Cappadocian kinship systems are very interesting from a sociolinguistic and anthropological perspective because of the mixture of inherited Greek and borrowed Turkish kinship terms. Precisely because the number of Turkish kinship terms differs from one variety to another, it is necessary to talk about Cappadocian kinship systems in the plural rather than about the Cappadocian kinship system in the singular. Although reference will be made to other Cappadocian varieties, this paper will focus on the kinship systems of Mišotika and Aksenitika, the two Central Cappadocian dialects still spoken today in several communities in Greece. Particular attention will be given to the use of borrowed Turkish kinship terms, which sometimes seem to co-exist together with their inherited Greek counterparts, e.g. mána vs. néne ‘mother’, ailfó/aelfó vs. γardáš ‘brother’ etc. In the final part of the paper some kinship terms with obscure or hitherto unknown etymology will be discussed, e.g. káka ‘grandmother’, ižá ‘aunt’, lúva ‘uncle (father’s brother)’ etc

    Identification and monitoring polarization from social network perspective

    Abstract. Polarization is a new phenomenon that threatens the cohesion and social development of our society. The rise of social media is known to have contributed significantly to its emergence, as can be seen in the multiplication of far-right and racist online communities and in ill-structured political discourse, for example in recent US or EU elections. Automatic identification of polarization from social media plays a key role in devising an appropriate defence strategy to tackle the issue and avoid escalation. This thesis implements several methods to identify polarization in Twitter data from the Trump-Clinton US election campaign, using metrics such as the Belief Polarization Index (BPI) and sentiment analysis. Furthermore, semantic role labelling and argument mining were applied to derive the argument structure of polarized discourse. In particular, we constructed thirteen topics of interest that were used as potential candidates for polarized discourse. For each topic, the cosine distance between the two candidates' frequencies of that topic over time was used to indicate polarization (the Belief Polarization Index). Statistical inference on sentiment scores was used to convey either a positive or a negative polarity, which was then further examined using argument structure. All the proposed approaches attempt to measure the polarization between two individuals from different perspectives, which may offer hints or references for future research.
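The cosine-distance measure behind the Belief Polarization Index can be illustrated with a short sketch. The abstract does not give the exact BPI formula, so this assumes the plain definition, one minus cosine similarity, applied to two per-period frequency series for a single topic. The weekly counts are invented for the example.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length numeric sequences."""
    if len(u) != len(v):
        raise ValueError("sequences must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for an all-zero vector")
    return 1.0 - dot / (norm_u * norm_v)

# Hypothetical weekly counts of tweets on one topic for each candidate.
candidate_a = [3, 5, 8, 2]
candidate_b = [6, 10, 16, 4]  # same temporal pattern as A, twice the volume
candidate_c = [0, 0, 0, 9]    # attention concentrated in the final week

print(cosine_distance(candidate_a, candidate_b))
print(cosine_distance(candidate_a, candidate_c))
```

Cosine distance ignores overall volume and compares only the shape of the two series. Two candidates who raise a topic at the same times score near zero even if one tweets far more, while non-negative series that never overlap score one.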

    Qualities, objects, sorts, and other treasures : gold digging in English and Arabic

    In the present monograph, we will deal with questions of lexical typology in the nominal domain. By the term "lexical typology in the nominal domain", we refer to crosslinguistic regularities in the interaction between (a) those areas of the lexicon whose elements are capable of being used in the construction of "referring phrases" or "terms" and (b) the grammatical patterns in which these elements are involved. In the traditional analyses of a language such as English, such phrases are called "nominal phrases". In the study of the lexical aspects of the relevant domain, however, we will not confine ourselves to the investigation of "nouns" and "pronouns" but intend to take into consideration all those parts of speech which systematically alternate with nouns, either as heads or as modifiers of nominal phrases. In particular, this holds true for adjectives both in English and in other Standard European Languages. It is well known that adjectives are often difficult to distinguish from nouns, or that elements with an overt adjectival marker are used interchangeably with nouns, especially in particular semantic fields such as those denoting MATERIALS or NATIONALITIES. That is, throughout this work the expression "lexical typology in the nominal domain" should not be interpreted as "a typology of nouns", but, rather, as the cross-linguistic investigation of lexical areas constitutive for "referring phrases" irrespective of how the parts-of-speech system in a specific language is defined.

    Sentiment Analysis Using Machine Learning Techniques

    Before buying a product, people usually visit various shops, ask about the product, its cost, and its warranty, and then buy based on the opinions they received about cost and quality of service. This process is time-consuming, and the chance of being cheated by the seller is higher because nobody guides the buyer to where an authentic product can be had at a fair price. Nowadays, however, a good number of people depend on the online market for their purchases. This is because information about products is available from multiple sources; online buying is thus comparatively cheap and also offers home delivery. Before placing an order, customers very often refer to the comments or reviews of current users of the product, which help them judge the quality of the product as well as the service provided by the seller. Similarly, quite a few specialists in the field of movies watch a movie and then comment on its quality, i.e., whether or not to watch it, or rate it on a five-star scale. These reviews are mainly in text format and are sometimes hard to understand. Thus, they need to be processed appropriately to obtain meaningful information. Classification of these reviews is one approach to extracting knowledge from them. In this thesis, different machine learning techniques are used to classify the reviews. Simulations and experiments are carried out to evaluate the performance of the proposed classification methods. A good number of researchers have considered two review datasets for sentiment classification, namely the aclIMDb and Polarity datasets. The IMDb dataset is divided into training and testing data. Thus, the training data are used to train the machine learning algorithms and the testing data are used to evaluate them. The Polarity dataset, on the other hand, does not have separate training and testing data, so k-fold cross-validation is used to classify the reviews. Four machine learning techniques (MLTs), viz. Naive Bayes (NB), Support Vector Machine (SVM), Random Forest (RF), and Linear Discriminant Analysis (LDA), are used to classify these movie reviews. Different performance evaluation parameters are used to evaluate the techniques. Among the four algorithms, the RF technique is observed to yield the most accurate classification results. Secondly, n-gram-based classification of reviews is carried out on the aclIMDb dataset.
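The k-fold cross-validation applied to the Polarity dataset can be sketched in plain Python. This is only an illustration of the general technique: the function names and the `train_fn`/`predict_fn` interface are assumptions, and the thesis's actual classifiers (NB, SVM, RF, LDA) and tooling are not reproduced here.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def cross_validate(samples, labels, k, train_fn, predict_fn):
    """Return the mean accuracy over k folds for any train/predict pair."""
    accuracies = []
    for train, test in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train], [labels[i] for i in train])
        hits = sum(predict_fn(model, samples[i]) == labels[i] for i in test)
        accuracies.append(hits / len(test))
    return sum(accuracies) / k
```

Each review is used for testing exactly once and for training in the other k - 1 folds, which is why the technique suits a dataset with no predefined split. The folds here are contiguous, so in practice the data would normally be shuffled first, or stratified by label.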

    Multiword expressions

    Multiword expressions (MWEs) are a challenge for both natural language applications and linguistic theory because they often defy the machinery developed for free combinations, where the default is that the meaning of an utterance can be predicted from its structure. There is a rich body of primarily descriptive work on MWEs for many European languages, but comparative work is scarce. This volume brings together MWE experts to explore the benefits of a multilingual perspective on MWEs. The ten contributions look at MWEs in Bulgarian, English, French, German, Maori, Modern Greek, Romanian, Serbian, and Spanish. They discuss prominent issues in MWE research such as the classification of MWEs, their formal grammatical modeling, and the description of individual MWE types from the point of view of different theoretical frameworks, such as Dependency Grammar, Generative Grammar, Head-driven Phrase Structure Grammar, Lexical Functional Grammar, and Lexicon Grammar.

    Ontology Enrichment from Free-text Clinical Documents: A Comparison of Alternative Approaches

    While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships, as well as difficulty in updating the ontology as domain knowledge changes. Methodologies developed in the fields of Natural Language Processing (NLP), Information Extraction (IE), Information Retrieval (IR), and Machine Learning (ML) provide techniques for automating the enrichment of ontology from free-text documents. In this dissertation, I extended these methodologies into biomedical ontology development. First, I reviewed existing methodologies and systems developed in the fields of NLP, IR, and IE, and discussed how existing methods can benefit the development of biomedical ontologies. This previously unconducted review was published in the Journal of Biomedical Informatics. Second, I compared the effectiveness of three methods from two different approaches, the symbolic (the Hearst method) and the statistical (the Church and Lin methods), using clinical free-text documents. Third, I developed a methodological framework for Ontology Learning (OL) evaluation and comparison. This framework permits evaluation of the two types of OL approaches that include three OL methods. The significance of this work is as follows: 1) The results from the comparative study showed the potential of these methods for biomedical ontology enrichment. For the two targeted domains (NCIT and RadLex), the Hearst method revealed an average of 21% and 11% new concept acceptance rates, respectively. The Lin method produced a 74% acceptance rate for NCIT; the Church method, 53%. 
As a result of this study (published in the Journal of Methods of Information in Medicine), many suggested candidates have been incorporated into the NCIT; 2) The evaluation framework is flexible and general enough that it can analyze the performance of ontology enrichment methods for many domains, thus expediting the process of automation and minimizing the likelihood that key concepts and relationships would be missed as domain knowledge evolves
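The symbolic approach named in this abstract, the Hearst method, relies on lexico-syntactic patterns such as "X such as Y, Z and W" to propose concept relationships. The sketch below implements only that one pattern, over single-word terms with a regular expression. It is a simplified assumption for illustration, not the dissertation's pipeline, which would work on parsed noun phrases. The Church and Lin statistical methods are not shown.

```python
import re

# One classic Hearst pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# The lookahead keeps "and"/"or" from being captured as a list item.
PATTERN = re.compile(
    r"(?P<hyper>\w+),? such as "
    r"(?P<hypos>\w+(?:, (?!and |or )\w+)*(?:,? (?:and|or) \w+)?)"
)

def hearst_pairs(text):
    """Return (hyponym, hypernym) candidate pairs found by the pattern."""
    pairs = []
    for match in PATTERN.finditer(text):
        hyponyms = re.split(r",? (?:and|or) |, ", match.group("hypos"))
        pairs.extend((h, match.group("hyper")) for h in hyponyms if h)
    return pairs

# Hypothetical clinical-style sentence.
print(hearst_pairs("The scan excluded tumors such as lipoma, glioma and sarcoma."))
```

Each extracted pair is only a candidate. In the study described above, proposed concepts went through an acceptance step before entering the ontology, which is what the reported acceptance rates measure.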