Search CORE

1,138 research outputs found

Chart-driven Connectionist Categorial Parsing of Spoken Korean

Author: Lee Geunbae
Lee Jong-Hyeok
Lee WonIl
Publication date: 29/11/1995

While most speech and natural language systems developed for English and other Indo-European languages neglect morphological processing and integrate speech and natural language at the word level, for agglutinative languages such as Korean and Japanese, morphological processing plays a major role in language processing, since these languages have very complex morphological phenomena and relatively simple syntactic functionality. Obviously, degenerate morphological processing limits the usable vocabulary size of the system, and a word-level dictionary results in an exponential explosion in the number of dictionary entries. For agglutinative languages, we need sub-word-level integration that leaves room for general morphological processing. In this paper, we develop a phoneme-level integration model of speech and linguistic processing through general morphological analysis for agglutinative languages, together with an efficient parsing scheme for that integration. Korean is modeled lexically based on the categorial grammar formalism with unordered-argument and suppressed-category extensions, and a chart-driven connectionist parsing method is introduced.
Comment: 6 pages, Postscript file, Proceedings of ICCPOL'9

arXiv.org e-Print Archive

Methods for Structural Pattern Recognition: Complexity and Applications

Author: Průša Daniel
Publication date: 01/01/2018

Department of Cybernetics

Data Compression Concepts and Algorithms and Their Applications to Bioinformatics

Author: Nalbantoglu Ozkan U
Russell David J
Sayood Khalid
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/12/2009

Data compression at its base is concerned with how information is organized in data. Understanding this organization can lead to efficient ways of representing the information and hence data compression. In this paper we review the ways in which ideas and approaches fundamental to the theory and practice of data compression have been used in the area of bioinformatics. We look at how basic theoretical ideas from data compression, such as the notions of entropy, mutual information, and complexity, have been used for analyzing biological sequences in order to discover hidden patterns, infer phylogenetic relationships between organisms, and study viral populations. Finally, we look at how inferred grammars for biological sequences have been used to uncover structure in biological sequences.
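To make the notions of entropy and mutual information mentioned in this abstract concrete, here is a minimal sketch of how they might be estimated from a nucleotide string. It is not taken from the paper: the function names, the lag parameter, and the toy sequences are illustrative assumptions, and the estimates are plain empirical frequencies with no bias correction.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Empirical Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(seq: str, lag: int = 1) -> float:
    """Empirical mutual information (bits) between symbols `lag` positions apart.

    `lag` is a hypothetical parameter chosen for this sketch.
    """
    pairs = list(zip(seq, seq[lag:]))
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # I(X;Y) = sum over (a, b) of p(a,b) * log2( p(a,b) / (p(a) * p(b)) )
    return sum(
        (c / n) * math.log2(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )


if __name__ == "__main__":
    # Toy sequences, not real biological data.
    print(shannon_entropy("ACGT"))
    print(mutual_information("ACACACAC"))
```

A uniform sequence over four nucleotides reaches the maximum of 2 bits per symbol, while a strictly alternating sequence has high mutual information at lag 1 because each symbol determines the next.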

Directory of Open Access Journals

Erciyes University - AVESIS

Data Compression Concepts and Algorithms and Their Applications to Bioinformatics

Author: Nalbantoglu Ozkan U
Russell David J
Sayood Khalid
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2010

All Purpose Textual Data Information Extraction, Visualization and Querying

Publication date: 01/01/2018

abstract: Since the advent of the internet, and even more so since the rise of social media platforms, the explosive growth and availability of textual data have made analysis a tedious task. Information extraction systems are available, but they are generally too specific and often extract only the kinds of information they deem necessary and worth extracting. With data visualization theory and fast, interactive querying methods, leaving out information might not be necessary. This thesis explores textual data visualization techniques, intuitive querying, and a novel approach to all-purpose textual information extraction that encodes a large text corpus to improve human understanding of the information it contains. It presents a modified traversal algorithm on the dependency parse output of text that extracts all subject-predicate-object pairs while ensuring that no information is missed. To support full-scale, all-purpose information extraction from large text corpora, a data preprocessing pipeline is recommended before the extraction is run. The output format is designed specifically to fit a node-edge-node model and form the building blocks of a network, which makes understanding the text and querying information from the corpus quick and intuitive. It attempts to reduce reading time and enhance understanding of the text using an interactive graph and timeline.
Dissertation/Thesis
Masters Thesis Software Engineering 201
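The extraction idea described in this abstract can be illustrated with a rough sketch. This is not the thesis's modified traversal algorithm, which the abstract does not specify. It is a simplified stand-in that assumes a dependency parse is already available as a flat token list. The `Token` type, the dependency labels (`nsubj`, `dobj`, and so on), and the sample sentence are all hypothetical choices made for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Token(NamedTuple):
    idx: int    # position of the token in the sentence
    text: str   # surface form
    dep: str    # dependency label (assumed convention, e.g. "nsubj", "dobj")
    head: int   # index of the head token


Triple = Tuple[str, str, str]


def extract_triples(tokens: List[Token]) -> List[Triple]:
    """Pair every subject child with every object child of the same head."""
    triples = []
    for head in tokens:
        children = [t for t in tokens if t.head == head.idx and t.idx != head.idx]
        subjects = [t.text for t in children if t.dep in ("nsubj", "nsubjpass")]
        objects = [t.text for t in children if t.dep in ("dobj", "obj", "attr")]
        for s in subjects:
            for o in objects:
                triples.append((s, head.text, o))
    return triples


def to_graph(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Store triples as node-edge-node adjacency: subject -> [(predicate, object)]."""
    graph: Dict[str, List[Tuple[str, str]]] = {}
    for s, p, o in triples:
        graph.setdefault(s, []).append((p, o))
    return graph


# Hand-written parse of "Alice founded Acme" (illustrative, not parser output).
SENTENCE = [
    Token(0, "Alice", "nsubj", 1),
    Token(1, "founded", "ROOT", 1),
    Token(2, "Acme", "dobj", 1),
]

if __name__ == "__main__":
    print(to_graph(extract_triples(SENTENCE)))
```

Keeping the output as subject, predicate, and object strings means each triple maps directly onto two nodes and one labeled edge, which is the shape the abstract describes for querying and visualization.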

The Cooperative Participatory Evaluation of Renewable Technologies on Ecosystem Services (CORPORATES)

Author: Davies I. M.
Gubbins M.
Irvine K. N.
Kafas A.
Kenter J.
MacDonald A.
O'Hara Murray R.
Potts T.
Scott B. E.
Slater A.-M.
Tweddle J. F.
Wright K.
Publication venue: Marine Scotland Science
Publication date: 04/02/2016

Publisher PD

Toward Semantic Machine Translation

Author: Andreas Jacob
Publication date: 01/01/2012

This thesis presents a novel approach to interlingual machine translation using λ-calculus expressions as an intermediate representation. It investigates and extends existing algorithms which learn a combinatory categorial grammar for semantic parsing, and introduces two new algorithms for generation out of logical forms inspired by that semantic parser. The results of a set of new experiments for generation and parsing are described, as well as an evaluation of the performance of a semantic translation system created by joining the semantic parser and generator together. Experimental results demonstrate that under certain conditions, this semantic model achieves better performance than a standard phrase-based statistical MT system in both an automated evaluation of translation output and a manual evaluation of adequacy and fluency.

Natural Language Interfaces to Data

Author: Efthymiou Vasilis
Lei Chuan
Quamar Abdul
Özcan Fatma
Publication venue: Now Publishers
Publication date: 26/12/2022

Recent advances in NLU and NLP have resulted in renewed interest in natural language interfaces to data, which provide an easy mechanism for non-technical users to access and query the data. While early systems evolved from keyword search and focused on simple factual queries, the complexity of both the input sentences and the generated SQL queries has grown over time. More recently, there has also been a lot of focus on using conversational interfaces for data analytics, empowering a range of non-technical users with quick insights into the data. There are three main challenges in natural language querying (NLQ): (1) identifying the entities involved in the user utterance, (2) connecting the different entities in a meaningful way over the underlying data source to interpret user intents, and (3) generating a structured query in the form of SQL or SPARQL. There are two main approaches for interpreting a user's NLQ. Rule-based systems make use of semantic indices, ontologies, and KGs to identify the entities in the query, understand the intended relationships between those entities, and utilize grammars to generate the target queries. With the advances in deep learning (DL)-based language models, there have been many text-to-SQL approaches that try to interpret the query holistically using DL models. Hybrid approaches that combine rule-based techniques and DL models are also emerging, drawing on the strengths of both. Conversational interfaces are the next natural step beyond one-shot NLQ, exploiting query context across multiple turns of conversation for disambiguation. In this article, we review the background technologies that are used in natural language interfaces and survey the different approaches to NLQ. We also describe conversational interfaces for data analytics and discuss several benchmarks used for NLQ research and evaluation.
Comment: The full version of this manuscript, as published by Foundations and Trends in Databases, is available at http://dx.doi.org/10.1561/190000007
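The three NLQ challenges listed in this abstract (identifying entities, connecting them over the data source, generating a structured query) can be sketched with a toy rule-based pipeline. This is an illustrative assumption, not a system from the survey: the lexicon, the `employee` table, and the function names are all hypothetical, and the "connection" step is reduced to attaching every mentioned column to the single mentioned table.

```python
import re
import sqlite3

# Hypothetical lexicon mapping utterance words to schema elements.
LEXICON = {
    "employees": ("table", "employee"),
    "names": ("column", "name"),
    "salaries": ("column", "salary"),
    "departments": ("column", "department"),
}


def identify_entities(utterance: str):
    """Step 1: look up each word of the utterance in the lexicon."""
    words = re.findall(r"[a-z]+", utterance.lower())
    return [LEXICON[w] for w in words if w in LEXICON]


def to_sql(utterance: str) -> str:
    """Steps 2 and 3: attach columns to the one table, then emit SQL."""
    entities = identify_entities(utterance)
    tables = [name for kind, name in entities if kind == "table"]
    columns = [name for kind, name in entities if kind == "column"]
    if len(tables) != 1:
        raise ValueError("this sketch expects exactly one table mention")
    select = ", ".join(columns) if columns else "*"
    return f"SELECT {select} FROM {tables[0]}"


def run_demo():
    """Run the generated query against an in-memory table with made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER, department TEXT)")
    conn.execute("INSERT INTO employee VALUES ('Ada', 120, 'R&D')")
    query = to_sql("Show the names and salaries of all employees")
    rows = conn.execute(query).fetchall()
    conn.close()
    return query, rows


if __name__ == "__main__":
    print(run_demo())
```

A real rule-based system would replace the flat lexicon with the semantic indices, ontologies, and grammars the abstract mentions, and would need join-path reasoning once more than one table is involved.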

arXiv.org e-Print Archive