9 research outputs found
Tracing Linguistic Relations in Winning and Losing Sides of Explicit Opposing Groups
Linguistic relations in oral conversations present how opinions are
constructed and developed in a restricted time. The relations bond ideas,
arguments, thoughts, and feelings, re-shape them during a speech, and finally
build knowledge out of all information provided in the conversation. Speakers
share a common interest to discuss. It is expected that each speaker's reply
includes duplicated forms of words from previous speakers. However, linguistic
adaptation is observed and evolves in a more complex path than just
transferring slightly modified versions of common concepts. A conversation
that aims at a benefit in the end shows emergent cooperation, which induces the
adaptation. Not only cooperation but also competition drives the adaptation, or
the opposite scenario, and one can capture this dynamic process by tracking how
the concepts are linguistically linked. To uncover salient complex dynamic
events in verbal communications, we attempt to discover self-organized
linguistic relations hidden in a conversation with explicitly stated winners
and losers. We examine open-access data from the United States Supreme Court.
This understanding is important in big data research: it guides how transition
states in opinion mining and decision-making should be modeled, and how the
knowledge needed to guide such a model can be pinpointed by filtering large
amounts of data. Comment: Full paper, Proceedings of FLAIRS-2017 (30th Florida
Artificial Intelligence Research Society), Special Track, Artificial
Intelligence for Big Social Data Analysis
Semantic Analysis and Search of Natural Language Texts for an Internet Portal
This article addresses a set of problems in the semantic analysis of natural-language texts: automating the generation of metadata files that describe the semantic representation of a web page; constructing a semantic network for a given set of texts; executing semantic search over a given set of texts using the metadata files; and exporting the semantic network to RDF format. Algorithms are described for extracting knowledge from text, constructing the semantic network, and executing queries on it. The approach is based on the method of lexico-syntactic patterns. As part of the software implementation, a specification for describing lexico-syntactic patterns was developed and a pattern interpreter was built on the morphological dictionary of the Ukrainian language. Experiments were carried out on a set of patterns for the subject domain "classification of living organisms". A modified Boyer–Moore–Horspool algorithm was used to interpret the lexico-syntactic patterns.
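For readers unfamiliar with the string-matching algorithm named in this abstract, the following is a minimal Python sketch of the textbook Boyer–Moore–Horspool search. It is not the authors' modified version, which the abstract does not describe; the function name horspool_search is an illustrative assumption.

```python
def horspool_search(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1.

    Textbook Boyer-Moore-Horspool; NOT the modified variant from the paper.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Bad-character table: for every character except the last one in the
    # pattern, the distance from its rightmost occurrence to the pattern end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        # A slice comparison keeps the sketch short; classic implementations
        # compare character by character from right to left.
        if text[pos:pos + m] == pattern:
            return pos
        # Shift by the entry for the text character aligned with the last
        # pattern position; characters absent from the table skip m places.
        pos += shift.get(text[pos + m - 1], m)
    return -1


print(horspool_search("classification of living organisms", "living"))
```

The skip table is what lets the search jump over stretches of text that cannot contain a match, which is presumably why a variant of it suits repeated pattern lookups.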
Building WordNet for Afaan Oromoo
WordNet is a lexical database whose many relations help disambiguate word senses in natural languages. Among the WordNet relations, synonymy and hyponymy play a major role in natural language processing and artificial intelligence applications. In this paper, word embeddings (Word2Vec) and lexico-syntactic patterns (LSPs) are used to automatically extract synonyms and hyponyms, respectively. The word embedding is evaluated with two algorithms, continuous bag of words (CBOW) and skip-gram, and shows superior results: applied to Afaan Oromoo texts, Word2Vec scored 80.09% with CBOW and 85.04% with skip-gram. According to these results, skip-gram handles frequent word pairs better than CBOW, although CBOW is faster. Lexico-syntactic patterns, with and without Word2Vec, are also evaluated for hyponym extraction from Afaan Oromoo texts using the information retrieval metrics precision, recall, and F-measure. Without Word2Vec, the patterns reach 66.73% precision, 72% recall, and 69.26% F-measure; combined with Word2Vec, they reach 81.14%, 80.8%, and 81.1%, respectively. Two factors could affect the accuracy of the results: 1) the writing style of Afaan Oromoo authors, who often attach many adjectives to a noun in a noun phrase; and 2) some instances of the LSPs may be missed because of misspellings and other typographical errors. Keywords: Afaan Oromoo WordNet, Word embedding, Lexico-syntactic patterns, Extraction of WordNet relations. DOI: 10.7176/CEIS/11-3-01 Publication date: May 31st 202
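The F-measure reported here is conventionally the harmonic mean of precision and recall. The short Python sketch below assumes that balanced (F1) definition, which the abstract does not state explicitly, and checks it against the figures given for patterns without Word2Vec; the function name f_measure is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures from the abstract for lexico-syntactic patterns without Word2Vec.
print(round(f_measure(66.73, 72.0), 2))  # 69.26
```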
Meronymy Relation Extraction with Lexico-Syntactic Patterns
An ontology consists of concepts and relations, each of which can be extracted with a variety of methods. One method that can be used for relation extraction is based on lexico-syntactic patterns. Put simply, relation extraction is carried out by obtaining a pattern that indicates a relation; an experiment is then run to test whether the pattern can predict the relation correctly. In this study, experiments were conducted to test meronymy relation patterns obtained from the dataset of a previous study. Evaluation used recall and precision. The study found that the amount of variation (diversity) within a set of patterns indicating a relation can affect the ability of that set to predict the relation correctly: the more pattern variation within one relation, the more prediction accuracy tends to decrease
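To make the idea of a lexico-syntactic pattern concrete, here is a minimal Python sketch that extracts (part, whole) pairs with two hand-written regular expressions. The English patterns, the PATTERNS table, and the function name extract_meronyms are all illustrative assumptions; they are not the patterns or the dataset used in the study.

```python
import re

# Hypothetical patterns. Each entry records whether the first captured
# word is the part ("part_whole") or the whole ("whole_part").
PATTERNS = [
    (re.compile(r"\b(\w+) is (?:a )?part of (?:a |an |the )?(\w+)"), "part_whole"),
    (re.compile(r"\b(\w+) consists of (?:a |an |the )?(\w+)"), "whole_part"),
]


def extract_meronyms(sentence: str) -> list[tuple[str, str]]:
    """Return (part, whole) pairs matched by any pattern in the sentence."""
    pairs = []
    lowered = sentence.lower()
    for regex, order in PATTERNS:
        for first, second in regex.findall(lowered):
            # Normalise both pattern orientations to (part, whole).
            pairs.append((first, second) if order == "part_whole" else (second, first))
    return pairs


print(extract_meronyms("The engine is part of the car."))
```

Adding more surface variants to such a table widens coverage but also admits more false matches, which is one way to picture the trade-off the abstract reports.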
Vector Representation of Words with Semantic Relations: Experimental Observations
The ability to identify semantic relations between words has made the word2vec model widely used in NLP tasks. Word2vec rests on a simple rule: two words that occur in similar contexts receive a higher similarity. Each word is represented as a vector, so words whose vectors lie close together can be interpreted as similar. This makes it possible to establish semantic relations (synonymy, hypernymy, hyponymy, and others) by automatic extraction. Extracting semantic relations by hand is considered time-consuming and biased, and it requires expert help. Unfortunately, the associative list of words that word2vec returns does not consist of related words only. In this paper, we present additional criteria that may help solve this problem. Observations and experiments with well-known characteristics, such as word frequency and position in the associative list, might be useful for improving the extraction of semantic relations for the Russian language with word embeddings. The experiments use a word2vec model trained on the Flibusta corpora, with word pairs from Wiktionary serving as examples of semantic relationships. Semantically related words are used in thesauri, ontologies, and intelligent systems for natural language processing.
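The "associative list" discussed in this abstract is, in effect, a ranking of vocabulary words by vector similarity to a target word. The Python sketch below illustrates that ranking with cosine similarity on hand-made three-dimensional toy vectors; the vectors, the toy dictionary, and the function names are assumptions for illustration and have nothing to do with the Flibusta-trained model.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def associative_list(word: str, vectors: dict[str, list[float]], top_n: int = 3):
    """Rank all other words by cosine similarity to the given word."""
    target = vectors[word]
    scored = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy vectors (made up); real word2vec vectors have hundreds of dimensions.
toy = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.95, 0.05, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3],
}

for neighbour, score in associative_list("cat", toy):
    print(f"{neighbour}\t{score:.3f}")
```

A ranking like this returns the nearest words regardless of whether they are synonyms, hypernyms, or merely co-occurring terms, which is why extra criteria such as frequency or list position are worth testing.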
ASPER: Attention-based Approach to Extract Syntactic Patterns denoting Semantic Relations in Sentential Context
Semantic relationships between a pair of entities in a sentence, such as hyponym-hypernym, cause-effect, and meronym-holonym, are usually reflected through syntactic patterns. Automatic extraction of such patterns benefits several downstream tasks, including entity extraction, ontology building, and question answering. Unfortunately, automatic extraction of such patterns has not yet received much attention from NLP and information retrieval researchers. In this work, we propose an attention-based supervised deep learning model, ASPER, which extracts syntactic patterns between entities exhibiting a given semantic relation in the sentential context. We validate the performance of ASPER on three distinct semantic relations (hyponym-hypernym, cause-effect, and meronym-holonym) on six datasets. Experimental results show that for all these semantic relations, ASPER can automatically identify a collection of syntactic patterns reflecting the existence of such a relation between a pair of entities in a sentence. In comparison to the existing methodologies of syntactic pattern extraction, ASPER's performance is substantially superior
Design of an E-learning system using semantic information and cloud computing technologies
Humanity is currently suffering from many difficult problems that threaten the life and survival of the human race. It is very easy for all mankind to be affected, directly or indirectly, by these problems. Education is a key solution for most of them. In our thesis we tried to make use of current technologies to enhance and ease the learning process.
We have designed an e-learning system based on semantic information and cloud computing, in addition to many other technologies that contribute to improving the educational process and raising the level of students. The design was built after much research on useful technology, its types, and examples of actual systems that were previously discussed by other researchers.
In addition to the proposed design, an algorithm was implemented to identify topics in large textual educational resources. It was tested and proved efficient compared with other methods. The algorithm can extract the main topics from textual learning resources, link related resources, and generate interactive dynamic knowledge graphs, and it accomplishes these tasks accurately and efficiently even for large books. We used Wikipedia Miner, TextRank, and Gensim within our algorithm. We evaluated our algorithm's accuracy against Gensim's and found a large improvement.
Augmenting the system design with the implemented algorithm will produce many useful services for improving the learning process, such as: automatically identifying the main topics of large textual learning resources and connecting them to other well-defined concepts from Wikipedia; enriching current learning resources with semantic information from external sources; providing students with browsable, dynamic, interactive knowledge graphs; and making use of learning groups to encourage students to share their learning experiences and feedback with other learners. Doctoral Programme in Telematics Engineering, Universidad Carlos III de Madrid. President: Luis Sánchez Fernández. Secretary: Luis de la Fuente Valentín. Member: Norberto Fernández Garcí