261 research outputs found

    Character-based Neural Semantic Parsing

    Humans and computers do not speak the same language. Many day-to-day tasks would be far more efficient if we could communicate with computers in natural language instead of relying on an interface. For this, the computer must not treat a sentence as a collection of individual words, but must understand its deeper, compositional meaning. One way to tackle this problem is to automatically assign each sentence a formal, structured meaning representation that is easy for computers to interpret. Earlier attempts usually relied heavily on predefined rules, word lists or representations of the syntax of the text, which made these methods complicated to use in general. In this thesis we employ an algorithm that learns to assign meaning representations to texts automatically, without using any such external resource. Specifically, we use a type of artificial neural network called a sequence-to-sequence model, in a process often referred to as deep learning. The devil is in the details, but we find that this type of algorithm can produce high-quality meaning representations, with better performance than the more traditional methods. Moreover, a main finding of the thesis is that, counterintuitively, it is often better to represent the text as a sequence of individual characters rather than words. This is likely because it helps the model deal with spelling errors, unknown words and inflections.
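The point about unknown words and inflections can be illustrated with a toy comparison of word-level and character-level input. This is only a sketch under stated assumptions: the two training sentences, the test sentence and the helper names (`word_tokens`, `char_tokens`, `unknown_rate`) are hypothetical and are not taken from the thesis, which uses a neural sequence-to-sequence model rather than a vocabulary lookup.

```python
def word_tokens(sentence):
    """Split a sentence into lowercase, whitespace-separated words."""
    return sentence.lower().split()


def char_tokens(sentence):
    """Split a sentence into lowercase characters, spaces included."""
    return list(sentence.lower())


def unknown_rate(tokens, vocab):
    """Fraction of tokens that never occurred in the training vocabulary."""
    return sum(token not in vocab for token in tokens) / len(tokens)


# Hypothetical toy data, chosen only to illustrate the effect.
train_sentences = ["the cat sleeps", "a dog barks"]
test_sentence = "the cats sleep"  # unseen inflections of known words

word_vocab = {t for s in train_sentences for t in word_tokens(s)}
char_vocab = {t for s in train_sentences for t in char_tokens(s)}

word_unknown = unknown_rate(word_tokens(test_sentence), word_vocab)
char_unknown = unknown_rate(char_tokens(test_sentence), char_vocab)

print(f"unknown words:      {word_unknown:.2f}")
print(f"unknown characters: {char_unknown:.2f}")
```

In this toy setting two of the three test words are out of vocabulary, while every test character was seen during training, which is one plausible reason a character-level model copes better with inflected or misspelled input.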

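    Rekomendasi Perbaikan Pernyataan Kebutuhan yang Rancu dalam Spesifikasi Kebutuhan Perangkat Lunak Menggunakan Teknik Berbasis Aturan (Recommendations for Improving Ambiguous Requirement Statements in Software Requirements Specifications Using a Rule-Based Technique)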

    The first stage in software development is to investigate, collect and present all user requirements in a software requirements specification (SRS) document. The diverse academic backgrounds, differing experience and limited knowledge of requirements engineers make mistakes possible when an SRS document is written. One of the most common mistakes in an SRS document is the use of ambiguous words. This can lead to misinterpretation and make it difficult for stakeholders in the development process to understand the requirements of the software to be built. This research proposes an approach for recommending improvements to ambiguous software requirement statements. The proposed method is a rule-based technique that uses an n-gram language model. The reliability of the proposed method is evaluated using the Gwet's AC1 statistical index. The analysis shows that the proposed recommendation method achieves a better level of agreement than a recommendation method based on n-gram frequency statistics. The proposed recommendation method reaches a highest Gwet's AC1 value of 0.5263, which corresponds to a moderate level of agreement.
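For readers unfamiliar with the agreement index named in this abstract, the following is a minimal sketch of Gwet's AC1 for two raters. It assumes the standard two-rater formulation, AC1 = (pa - pe) / (1 - pe) with pe = (1 / (Q - 1)) * sum of pi_q * (1 - pi_q). The function name `gwet_ac1`, the example labels and the choice to infer the category set from the observed ratings are assumptions made for illustration; the abstract does not describe how the authors computed the index.

```python
from collections import Counter


def gwet_ac1(ratings_a, ratings_b):
    """Gwet's AC1 agreement coefficient for two raters on the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")

    n = len(ratings_a)
    # Assumption: the category set is whatever labels were actually used.
    categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    if q < 2:
        return 1.0  # a single category means the raters agree trivially

    # Observed agreement: share of items given the same label by both raters.
    pa = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement, based on each category's average marginal proportion.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    pe = 0.0
    for category in categories:
        pi = (count_a[category] + count_b[category]) / (2 * n)
        pe += pi * (1 - pi)
    pe /= q - 1

    return (pa - pe) / (1 - pe)


# Hypothetical ratings: two reviewers judging four recommendations.
rater_1 = ["ok", "ok", "bad", "ok"]
rater_2 = ["ok", "bad", "bad", "ok"]
print(round(gwet_ac1(rater_1, rater_2), 4))
```

When the full set of possible categories is known in advance, Q should be that number rather than the count of labels that happen to appear in the data.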

    Recent Trends in Computational Intelligence

    Traditional models struggle to cope with complexity, noise and changing environments, while Computational Intelligence (CI) offers solutions to complicated problems as well as inverse problems. The main feature of CI is adaptability, and it spans the fields of machine learning and computational neuroscience. CI also comprises biologically inspired technologies such as swarm intelligence, as part of evolutionary computation, and encompasses wider areas such as image processing, data collection and natural language processing. This book discusses the use of CI for solving a variety of applications optimally, demonstrating its wide reach and relevance. Combining optimization methods with data mining strategies yields a strong and reliable prediction tool for handling real-life applications.

    Statistical language modelling of dialogue material in the British national corpus.

    Statistical language modelling may be used not only to uncover the patterns that underlie the composition of utterances and texts, but also to build practical language processing technology. Contemporary language applications in automatic speech recognition, sentence interpretation and even machine translation exploit statistical models of language. Spoken dialogue systems, in which a human user interacts with a machine via a speech interface in order to get information, make bookings, lodge complaints, etc., are examples of such systems that are now technologically feasible. The majority of statistical language modelling studies to date have concentrated on written text material (or read versions thereof). However, it is well known that dialogue differs significantly from written text in its lexical content and sentence structure. Furthermore, significant logical, thematic and lexical connections are expected between successive turns within a dialogue, whereas "turns" are not generally meaningful in written text. There is therefore a need for statistical language modelling studies to be performed on dialogue, particularly with the longer-term aim of using such models in human-machine dialogue interfaces. In this thesis, I describe the studies I have carried out on statistically modelling the dialogue material within the British National Corpus (BNC), a very large corpus of modern British English compiled during the 1990s. The thesis opens with a general introductory survey of the field of automatic speech recognition, followed by a general introduction to some standard techniques of statistical language modelling that are employed later in the thesis. The structure of dialogue is then discussed using some perspectives from linguistic theory, and some previous approaches (not necessarily statistical) to modelling dialogue are reviewed. Next, a qualitative description is given of the BNC and the dialogue data within it, together with some descriptive statistics and the results of constructing simple trigram language models for both dialogue and text data. The main part of the thesis describes experiments on applying statistical language models based on word caches, word "trigger" pairs and turn clustering to the dialogue data. Several different approaches are used for each type of model, and an analysis of the strengths and weaknesses of these techniques is presented. The results of the experiments lead to a better understanding of how statistical language modelling might be applied to dialogue for the benefit of future language technologies.

    Proceedings of the 17th Annual Conference of the European Association for Machine Translation

    Proceedings of the 17th Annual Conference of the European Association for Machine Translation (EAMT)

    Proceedings of the 9th Dutch-Belgian Information Retrieval Workshop

    Proceedings of the VIIth GSCP International Conference

    The 7th International Conference of the Gruppo di Studi sulla Comunicazione Parlata, dedicated to the memory of Claire Blanche-Benveniste, chose Speech and Corpora as its main theme. The broad international background of the 235 authors, from 21 countries and 95 institutions, led to papers on many different languages. The 89 papers in this volume reflect the themes of the conference: the compilation and annotation of spoken corpora, together with the related technological fields; the relation between prosody and pragmatics; speech pathologies; and various papers on phonetics, speech and linguistic analysis, pragmatics and sociolinguistics. Many papers are also dedicated to speech and second language studies. The online publication with FUP allows direct access to the sound and video linked to the papers (when downloaded).

    Baltic Journal of English Language, Literature and Culture, Vol.12
