Search CORE

9 research outputs found

Tracing Linguistic Relations in Winning and Losing Sides of Explicit Opposing Groups

Author: Cambria Erik
Mondal Anupam
Sanli Ceyda
Publication date: 01/01/2017

Linguistic relations in oral conversations show how opinions are constructed and developed within a restricted time. The relations bond ideas, arguments, thoughts, and feelings, re-shape them during a speech, and finally build knowledge out of all the information provided in the conversation. Speakers share a common interest to discuss. It is expected that each speaker's reply includes duplicated forms of words from previous speakers. However, linguistic adaptation is observed and evolves along a more complex path than just transferring slightly modified versions of common concepts. A conversation aiming at a benefit at the end shows an emergent cooperation that induces the adaptation. Not only cooperation but also competition drives the adaptation, or an opposite scenario, and one can capture the dynamic process by tracking how the concepts are linguistically linked. To uncover salient complex dynamic events in verbal communications, we attempt to discover self-organized linguistic relations hidden in a conversation with explicitly stated winners and losers. We examine open access data of the United States Supreme Court. Our understanding is crucial in big data research to guide how transition states in opinion mining and decision-making should be modeled, and how the knowledge required to guide the model should be pinpointed by filtering large amounts of data. Comment: Full paper, Proceedings of FLAIRS-2017 (30th Florida Artificial Intelligence Research Society), Special Track, Artificial Intelligence for Big Social Data Analysis

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Semantic analysis and search of natural-language texts for an Internet portal

Author: Kobets Nataliya
Kovaliuk Tetiana
Publication venue: CEUR Workshop Proceedings
Publication date: 15/05/2019

The article addresses a set of problems in the semantic analysis of natural-language texts: automating the generation of metadata files that describe the semantic representation of a web page; constructing a semantic network for a given set of texts; executing semantic search over a given set of texts using the metadata files; and exporting the semantic network to RDF format. Algorithms are described for extracting knowledge from text, constructing a semantic network, and executing queries on that network. The lexico-syntactic patterns method served as the main approach to these problems. For the software implementation of the method, a specification for describing lexico-syntactic patterns was developed and a pattern interpreter based on a morphological dictionary of the Ukrainian language was created. Experimental studies were carried out for a set of patterns in the «classification of living organisms» subject domain. A modified Boyer–Moore–Horspool algorithm was used to interpret the lexico-syntactic patterns.

Borys Grinchenko Kyiv University Institutional repository
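
The abstract above mentions a modified Boyer–Moore–Horspool algorithm but does not say what the modification is. As a point of reference only, here is a minimal sketch of the standard, unmodified Horspool string search. The function name `horspool_search` and the sample sentence are illustrative assumptions, not taken from the paper.

```python
def horspool_search(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1.

    Standard Boyer-Moore-Horspool; NOT the authors' modified variant,
    which the abstract does not describe.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    if m > n:
        return -1

    # Bad-character table: for every character of the pattern except the
    # last one, the distance from its rightmost occurrence to the pattern end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Shift according to the text character aligned with the last
        # pattern position; unseen characters allow a full-length jump.
        pos += shift.get(text[pos + m - 1], m)
    return -1


# Hypothetical usage: locate a literal anchor word of a pattern in a sentence.
sentence = "a whale is a mammal"
print(horspool_search(sentence, "mammal"))
```

The shift table is what lets the search skip ahead by more than one character after a mismatch, which is the usual reason to prefer it over a naive scan when matching many literal pattern fragments.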

Building WordNet for Afaan Oromoo

Author: Bacha Biru Abera
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 29/05/2020

WordNet is a lexical database with many relations that help disambiguate word senses in natural languages. Among the WordNet relations, synonymy and hyponymy play a major role in natural language processing and artificial intelligence applications. In this paper, word embedding (Word2Vec) and lexico-syntactic patterns (LSP) are used to automatically extract synonyms and hyponyms, respectively. The word embedding is evaluated with two algorithms, continuous bag of words and skip-gram, and shows superior results. Applying the Word2Vec algorithms to Afaan Oromoo texts achieved 80.09% and 85.04% for continuous bag of words and skip-gram, respectively. According to these results, the skip-gram algorithm does a better job on frequent word pairs than continuous bag of words, but continuous bag of words is faster. Lexico-syntactic patterns, with and without Word2Vec, are also evaluated for extracting the hyponym relation from Afaan Oromoo texts using the information retrieval metrics precision, recall, and F-measure. Without Word2Vec, the lexico-syntactic patterns registered a precision, recall, and F-measure of 66.73%, 72%, and 69.26%, respectively; combined with Word2Vec, they registered 81.14%, 80.8%, and 81.1%. Two factors could affect the accuracy of the results: 1) the style of Afaan Oromoo writers, who often write a noun phrase with many adjectives to describe the noun for the reader; and 2) some instances of the LSP may be missed because of misspellings and other typographical errors. Keywords: Afaan Oromoo WordNet, Word embedding, Lexico syntactic patterns, Extraction of WordNet relations. DOI: 10.7176/CEIS/11-3-01 Publication date: May 31st 202

International Institute for Science, Technology and Education (IISTE): E-Journals
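
The abstract above reports precision, recall, and F-measure without stating the formula. Assuming the usual balanced F-measure (the harmonic mean of precision and recall), a minimal sketch of the computation looks like this. The function name `f_measure` is an illustrative choice, and the check uses only the figures reported for the patterns without Word2Vec.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: the paper uses the standard F1 definition; the abstract
    does not state the formula explicitly.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract for LSP without Word2Vec (in percent).
print(round(f_measure(66.73, 72.0), 2))
```

Under this assumption the reported 66.73% precision and 72% recall reproduce the reported F-measure of 69.26%.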

Meronymy Relation Extraction with Lexico-Syntactic Patterns

Author: Kardinata Eunike
Rakhmawati Nur Aini
Publication venue: 'Tanjungpura University'
Publication date: 27/04/2020

An ontology consists of concepts and relations, each of which can be extracted by a variety of methods. One method that can be used for relation extraction is based on lexico-syntactic patterns. Put simply, relation extraction is done by obtaining a pattern that indicates a relation. An experiment is then carried out to test whether the pattern obtained can predict the relation correctly. This study tests meronymy relation patterns obtained from the dataset of an earlier study. Evaluation uses recall and precision. The study found that the amount of variation (diversity) in a set of patterns indicating a relation can affect the ability of that set to predict the relation correctly. The more pattern variations a single relation has, the more prediction accuracy tends to decrease.

JEPIN (Jurnal Edukasi dan Penelitian Informatika)
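
The abstract above describes matching patterns that signal a meronymy (part-whole) relation and scoring the predictions with precision and recall. The following is a minimal sketch of that idea under loud assumptions: the two English regular expressions, the function names, and the example sentences are all hypothetical, since the abstract does not list the actual patterns or the dataset.

```python
import re

# Hypothetical English patterns for illustration only; the paper's own
# pattern set (taken from an earlier study's dataset) is not reproduced here.
PATTERNS = [
    re.compile(r"(?P<part>\w+) is (?:a )?part of (?:the |a )?(?P<whole>\w+)"),
    re.compile(r"(?P<whole>\w+) consists of (?:the |a )?(?P<part>\w+)"),
]


def extract_meronyms(sentence: str) -> set[tuple[str, str]]:
    """Return the (part, whole) pairs matched by any pattern in the sentence."""
    pairs = set()
    for pattern in PATTERNS:
        for match in pattern.finditer(sentence.lower()):
            pairs.add((match.group("part"), match.group("whole")))
    return pairs


def precision_recall(predicted: set, gold: set) -> tuple[float, float]:
    """Precision and recall of predicted pairs against a gold-standard set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical usage with made-up sentences and a made-up gold set.
found = extract_meronyms("The wheel is part of the car.")
found |= extract_meronyms("A book consists of pages.")
print(found, precision_recall(found, {("wheel", "car"), ("engine", "car")}))
```

Adding more pattern variants to `PATTERNS` widens coverage but also raises the chance of spurious matches, which is the trade-off the abstract reports.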

Vector representation of words with semantic relations: experimental observations

Author: Maria Karyaeva S.
Pavel Braslavski I.
Valery Sokolov A.
Publication venue: 'P.G. Demidov Yaroslavl State University'
Publication date: 19/12/2018

The ability to identify semantic relations between words has made the word2vec model widely used in NLP tasks. The idea of word2vec rests on a simple rule: two words are more similar if they occur in similar contexts. Each word can be represented as a vector, so vectors with close coordinates can be interpreted as words with close meanings. This makes it possible to automate the extraction of semantic relations (synonymy, hypernymy and hyponymy, and others). Establishing semantic relations by hand is considered a laborious and biased task that requires a large amount of time and the help of experts. Unfortunately, the associative list of words produced by the word2vec model also contains words that have no relation to the head word for which the list was built. In this paper, we consider additional criteria that may help solve this problem. Observations and experiments with well-known characteristics, such as word frequency and position in the associative list, may be useful for improving the extraction of semantic relations for the Russian language using word embeddings. The experiments use a word2vec model trained on the Flibusta corpora and labeled Wiktionary data as reference examples of semantic relations. Semantically related words (or terms) are applicable to thesauri, ontologies, and intelligent systems for natural language processing.

Modeling and Analysis of Information Systems / Моделирование и анализ информационных систем (МАИС)
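
The abstract above describes ranking neighbouring words by vector closeness and then filtering the associative list by word frequency and list position. A minimal sketch of that pipeline follows, assuming cosine similarity as the closeness measure. The toy vectors, frequencies, thresholds, and function names are hypothetical and do not come from the paper or from any trained model.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def associative_list(target: str, vectors: dict, top_n: int = 3) -> list:
    """Rank all other words by similarity to the target word."""
    scores = [(w, cosine(vectors[target], vec))
              for w, vec in vectors.items() if w != target]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_n]


def filter_candidates(ranked: list, freq: dict,
                      max_rank: int, min_freq: int) -> list[str]:
    """Keep words that sit high in the list and are frequent enough.

    The two thresholds are illustrative; the abstract names the criteria
    (frequency, list position) but not concrete cut-off values.
    """
    return [word for rank, (word, _) in enumerate(ranked, start=1)
            if rank <= max_rank and freq.get(word, 0) >= min_freq]


# Toy two-dimensional vectors and corpus frequencies (made up).
vectors = {"cat": [1.0, 0.1], "kitten": [0.9, 0.2], "car": [0.1, 1.0]}
freq = {"kitten": 50, "car": 1000}
ranked = associative_list("cat", vectors, top_n=2)
print(filter_candidates(ranked, freq, max_rank=1, min_freq=10))
```

In practice the vectors would come from a trained embedding model rather than a hand-written dictionary; the filtering step stays the same.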

ASPER: Attention-based Approach to Extract Syntactic Patterns denoting Semantic Relations in Sentential Context

Author: Md. Ahsanul Kabir
Mohammed Al Hasan
Typer Philips
Xiao Luo
Publication date: 01/01/2021

Semantic relationships between a pair of entities in a sentence, such as hyponym-hypernym, cause-effect, and meronym-holonym, are usually reflected through syntactic patterns. Automatic extraction of such patterns benefits several downstream tasks, including entity extraction, ontology building, and question answering. Unfortunately, automatic extraction of such patterns has not yet received much attention from NLP and information retrieval researchers. In this work, we propose an attention-based supervised deep learning model, ASPER, which extracts syntactic patterns between entities exhibiting a given semantic relation in the sentential context. We validate the performance of ASPER on three distinct semantic relations (hyponym-hypernym, cause-effect, and meronym-holonym) on six datasets. Experimental results show that for all these semantic relations, ASPER can automatically identify a collection of syntactic patterns reflecting the existence of such a relation between a pair of entities in a sentence. Compared with existing methodologies for syntactic pattern extraction, ASPER's performance is substantially superior.

IUPUIScholarWorks

Design of an E-learning system using semantic information and cloud computing technologies

Author: Badawy Fathelbab Mohammed Ahmed
Publication date: 22/09/2023

Humanity is currently suffering from many difficult problems that threaten the life and survival of the human race. All of mankind can easily be affected, directly or indirectly, by these problems. Education is a key solution for most of them. In this thesis we try to make use of current technologies to enhance and ease the learning process. We have designed an e-learning system based on semantic information and cloud computing, in addition to many other technologies that contribute to improving the educational process and raising the level of students. The design was built after much research on useful technology, its types, and examples of actual systems previously discussed by other researchers. In addition to the proposed design, an algorithm was implemented to identify topics found in large textual educational resources. It was tested and proved efficient compared with other methods. The algorithm can extract the main topics from textual learning resources, link related resources, and generate interactive dynamic knowledge graphs. It accomplishes those tasks accurately and efficiently even for larger books. We used Wikipedia Miner, TextRank, and Gensim within our algorithm. Our algorithm's accuracy was evaluated against Gensim and was substantially higher.
Augmenting the system design with the implemented algorithm will produce many useful services for improving the learning process, such as: automatically identifying the main topics of large textual learning resources and connecting them to other well-defined concepts from Wikipedia; enriching current learning resources with semantic information from external sources; providing students with browsable, dynamic, interactive knowledge graphs; and making use of learning groups to encourage students to share their learning experiences and feedback with other learners. Doctoral Programme in Telematic Engineering, Universidad Carlos III de Madrid. President: Luis Sánchez Fernández. Secretary: Luis de la Fuente Valentín. Member: Norberto Fernández Garcí

Universidad Carlos III de Madrid e-Archivo