CHOOSING A DATABASE QUERY LANGUAGE

Author: Jarke Matthias
Vassiliou Yannis
Publication venue: Stern School of Business, New York University
Publication date: 01/04/1984

A methodology is presented for selecting query languages suitable for certain user types. The method is based on a trend model of query language development on the dimensions of functional capabilities and usability. Expected developments are exemplified by the description of "second generation" database query languages. From the trend model are derived: a classification scheme for query languages; a criterion hierarchy for query language evaluation; a comprehensive classification scheme of query language users and their requirements; and recommendations for allocating language classes to user types. The method integrates the results of existing human factors studies and provides a structured framework for future research. Information Systems Working Papers Series

New York University Faculty Digital Archive

Investigation on Applying Modular Ontology to Statistical Language Model for Information Retrieval

Author: Chang Jia Kang

The objective of this research is to provide a novel approach to improving retrieval performance by combining ontology with the statistical language model (SLM). The proposed methods consist of two major processes, namely ontology-based query expansion (OQE) and ontology-based document classification (ODC). The experiments required the development of an independent search tool that combines OQE and ODC in a traditional SLM-based information retrieval (IR) process using a Web document collection. This research considers the ongoing challenges of modular-ontology-enhanced SLM-based search and makes three contributions. The first concerns how to apply modular ontology to query expansion in a bespoke language model search tool (LMST). The second considers how to incorporate OQE with the language model to improve search performance. The third examines how to use such semantic-based document classification to improve smoothing accuracy. The role of ontology in the research is to provide formally described domains of interest that serve as context, to enhance system query effectiveness.
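The combination of query expansion and a smoothed language model described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name a smoothing method, so Jelinek-Mercer interpolation is assumed here, and the `ontology` dictionary, the function names, and the toy documents are all hypothetical rather than taken from the thesis.

```python
import math
from collections import Counter


def expand_query(terms, ontology):
    """Append ontology-related terms to the query (hypothetical OQE step).

    `ontology` is an assumed mapping from a term to a list of related terms.
    """
    expanded = list(terms)
    for term in terms:
        for related in ontology.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


def query_log_likelihood(query_terms, doc_tokens, collection_tokens, lam=0.5):
    """Score a document by query likelihood with Jelinek-Mercer smoothing.

    Assumption: p(t|d) = (1 - lam) * p_ml(t|d) + lam * p(t|collection).
    Terms that never occur in the collection are skipped for simplicity.
    """
    doc_counts = Counter(doc_tokens)
    coll_counts = Counter(collection_tokens)
    score = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / len(doc_tokens) if doc_tokens else 0.0
        p_coll = coll_counts[term] / len(collection_tokens)
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            continue
        score += math.log(p)
    return score


# Toy data, invented for illustration only.
docs = {
    "d1": ["heart", "attack", "symptoms"],
    "d2": ["stock", "market", "news"],
}
collection = [tok for tokens in docs.values() for tok in tokens]
ontology = {"cardiac": ["heart"]}

query = expand_query(["cardiac", "symptoms"], ontology)
scores = {name: query_log_likelihood(query, toks, collection)
          for name, toks in docs.items()}
best = max(scores, key=scores.get)
```

In this toy setup the original query term "cardiac" matches nothing, and it is the expanded term "heart" that lets the medical document outrank the unrelated one.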

CLoK

Conversational Financial Information Retrieval Model (ConFIRM)

Author: Choi Stephen
Gazeley William
Li Tingting
Wong Siu Ho
Publication date: 10/11/2023

With the exponential growth in large language models (LLMs), leveraging their emergent properties for specialized domains like finance merits exploration. However, regulated fields such as finance pose unique constraints, requiring domain-optimized frameworks. We present ConFIRM, an LLM-based conversational financial information retrieval model tailored for query intent classification and knowledge base labeling. ConFIRM comprises two modules: 1) a method to synthesize finance domain-specific question-answer pairs, and 2) evaluation of parameter efficient fine-tuning approaches for the query classification task. We generate a dataset of over 4000 samples, assessing accuracy on a separate test set. ConFIRM achieved over 90% accuracy, essential for regulatory compliance. ConFIRM provides a data-efficient solution to extract precise query intent for financial dialog systems. Comment: 10 pages, 2 figures, 2 tables, 2 appendices

arXiv.org e-Print Archive

Detecting missing content queries in an SMS-Based HIV/AIDS FAQ retrieval system

Author: Ounis Iadh
Rogers Simon
Thuma Edwin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Automated Frequently Asked Question (FAQ) answering systems use pre-stored sets of question-answer pairs as an information source to answer natural language questions posed by users. The main problem with this kind of information source is that there is no guarantee that a relevant question-answer pair exists for every user query. In this paper, we propose to deploy a binary classifier in an existing SMS-Based HIV/AIDS FAQ retrieval system to detect user queries that have no relevant question-answer pair in the FAQ document collection. Before deploying such a classifier, we first evaluate different feature sets for training in order to determine which sets of features build the model that yields the best classification accuracy. We carry out our evaluation using seven different feature sets generated from a query log before and after retrieval by the FAQ retrieval system. Our results suggest that combining different feature sets markedly improves the classification accuracy.

Crossref

Enlighten

A Critical Review of Temporal Database Management Systems

Author: Fang Weiqi
Publication venue: ProQuest Dissertations & Theses
Publication date: 01/01/1989

There has been significant research activity in temporal databases during the last decade. However, the development of a semantics of time, a temporal model for efficient database systems, and temporal query languages still needs much study. Building on the research of the TDB group [Snodgrass 1987], the review of TDBMS research in this dissertation emphasises three aspects. 1) The formulation of a semantics of time at the conceptual level. A topology of time and types of time attributes are introduced. A new taxonomy for time attributes is presented: assertion time, event time, and recording time. 2) The development of a model for TDBMS analogous to relational databases. Based on Snodgrass' classification, four kinds of databases (snapshot, rollback, historical and temporal) are discussed in depth. The discussion, however, departs from the TDB model's representation in some important respects: a historical relation for most enterprises is an interval relation, not a sequence of snapshot slices indexed by valid time. The term "tuple" no longer simply refers to an entity as in traditional relational databases. It refers to different levels of representation of an object: entity, entity state, observation of entity, and observation of entity state in different types of databases. 3) The design of temporal query languages. We do not present a new temporal query language in this dissertation, but we discuss a Quel-like temporal query language, TQuel, in some depth. TQuel is compared with two other temporal query languages, TOSQL and Legol 2.0. We centre the main discussion on TQuel's semantics for tuple calculus. The classification of the relationships between overlapping intervals suggests an approach using temporal logic to classify the derived tuples in tuple calculus. Under such an approach, a new presentation of the tuple modification calculus is proposed, not only for interval relations, but also for event relations.
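The abstract mentions classifying the relationships between overlapping intervals but does not say which scheme is used. As a rough illustration only, the sketch below assumes Allen's thirteen interval relations and half-open intervals given as `(start, end)` pairs; the function name and the choice of scheme are assumptions, not the dissertation's own formulation.

```python
def interval_relation(a, b):
    """Name the Allen relation that holds from interval a to interval b.

    Intervals are (start, end) pairs with start < end. The scheme is an
    assumption made for illustration; the abstract names no specific one.
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("each interval needs start < end")

    # Disjoint or touching intervals.
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"

    # Intervals that share at least one endpoint.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"

    # Strict containment, then partial overlap.
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    return "overlaps" if a_start < b_start else "overlapped-by"


# Example: two validity periods for the same entity.
relation = interval_relation((1, 5), (3, 8))
```

A temporal query processor could use such a label to decide how two tuples with intersecting validity periods should be combined or split.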

Glasgow Theses Service

Committee-Based Sample Selection for Probabilistic Classifiers

Author: Argamon-Engelson S.
Dagan I.
Publication venue: 'AI Access Foundation'
Publication date: 01/06/2011

In many real-world learning tasks, it is expensive to acquire a sufficient number of labeled examples for training. This paper investigates methods for reducing annotation cost by 'sample selection'. In this approach, during training the learning program examines many unlabeled examples and selects for labeling only those that are most informative at each stage. This avoids redundantly labeling examples that contribute little new information. Our work follows on previous research on Query By Committee, extending the committee-based paradigm to the context of probabilistic classification. We describe a family of empirical methods for committee-based sample selection in probabilistic classification models, which evaluate the informativeness of an example by measuring the degree of disagreement between several model variants. These variants (the committee) are drawn randomly from a probability distribution conditioned by the training set labeled so far. The method was applied to the real-world natural language processing task of stochastic part-of-speech tagging. We find that all variants of the method achieve a significant reduction in annotation cost, although their computational efficiency differs. In particular, the simplest variant, a two-member committee with no parameters to tune, gives excellent results. We also show that sample selection yields a significant reduction in the size of the model used by the tagger.
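The idea of selecting examples by committee disagreement can be sketched as below. This is a minimal illustration, not the paper's procedure: it assumes normalized vote entropy as the disagreement measure and a fixed threshold for selection, and it takes the committee as a given list of classifier functions instead of drawing the members from a distribution conditioned on the labeled data. All names and the toy taggers are hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the committee's label votes, scaled to the range [0, 1]."""
    k = len(votes)
    if k < 2:
        return 0.0
    counts = Counter(votes)
    entropy = sum(-(c / k) * math.log(c / k) for c in counts.values())
    return entropy / math.log(k)


def select_for_labeling(examples, committee, threshold=0.5):
    """Keep the examples on which the committee disagrees strongly.

    `committee` is a list of callables mapping an example to a label;
    `threshold` is an assumed cut-off on the normalized vote entropy.
    """
    selected = []
    for example in examples:
        votes = [member(example) for member in committee]
        if vote_entropy(votes) > threshold:
            selected.append(example)
    return selected


# Two toy "taggers", invented for illustration only.
def tagger_a(word):
    return "NOUN" if word.endswith("s") else "VERB"


def tagger_b(word):
    return "NOUN" if word[0].isupper() else "VERB"


to_label = select_for_labeling(["runs", "Paris", "eat"], [tagger_a, tagger_b])
```

With a two-member committee the measure reduces to a simple agree or disagree test, which matches the abstract's remark that the simplest variant has no parameters to tune beyond the selection rule itself.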

arXiv.org e-Print Archive

Crossref

Sentiment analysis and sentence classification in long book-search queries

Author: Bellot Patrice
Fournier Sébastien
HTAIT Amal
Publication venue: HAL CCSD
Publication date: 07/04/2019

Handling long queries can involve either reducing their size by retaining only useful sentences, or decomposing a long query into several short queries based on their content. Proper sentence classification improves the utility of these procedures. Can sentiment analysis play a role in sentence classification? This paper analyzes the correlation between sentiment analysis and sentence classification in long book-search queries. It also studies the similarity in writing style between book reviews and sentences in book-search queries. To accomplish this study, a semi-supervised method for sentiment intensity prediction and a language model based on book reviews are presented. The paper also provides graphical illustrations reflecting the results of this study, followed by interpretations and conclusions.