CHOOSING A DATABASE QUERY LANGUAGE
A methodology is presented for selecting query languages suitable for
certain user types. The method is based on a trend model of query
language development on the dimensions of functional capabilities and
usability. Expected developments are exemplified by the description
of "second generation" database query languages. From the trend model
are derived: a classification scheme for query languages; a
criterion hierarchy for query language evaluation; a comprehensive
classification scheme of query language users and their requirements;
and recommendations for allocating language classes to user types.
The method integrates the results of existing human factors studies
and provides a structured framework for future research.
Information Systems Working Papers Series
Investigation on Applying Modular Ontology to Statistical Language Model for Information Retrieval
The objective of this research is to provide a novel approach to improving retrieval performance by combining ontology with the statistical language model (SLM). The proposed methods consist of two major processes, namely ontology-based query expansion (OQE) and ontology-based document classification (ODC). The experiments required developing an independent search tool that combines OQE and ODC in a traditional SLM-based information retrieval (IR) process using a Web document collection.
This research considers the ongoing challenges of modular-ontology-enhanced SLM-based search and makes three contributions. The first concerns how to apply modular ontology to query expansion in a bespoke language model search tool (LMST). The second considers how to incorporate OQE into the language model to improve search performance. The third examines how to use such semantic-based document classification to improve smoothing accuracy. The role of ontology in the research is to provide formally described domains of interest that serve as context, to enhance system query effectiveness.
Conversational Financial Information Retrieval Model (ConFIRM)
With the exponential growth in large language models (LLMs), leveraging their
emergent properties for specialized domains like finance merits exploration.
However, regulated fields such as finance pose unique constraints, requiring
domain-optimized frameworks. We present ConFIRM, an LLM-based conversational
financial information retrieval model tailored for query intent classification
and knowledge base labeling.
ConFIRM comprises two modules:
1) a method to synthesize finance domain-specific question-answer pairs, and
2) evaluation of parameter efficient fine-tuning approaches for the query
classification task. We generate a dataset of over 4000 samples, assessing
accuracy on a separate test set.
ConFIRM achieved over 90% accuracy, essential for regulatory compliance.
ConFIRM provides a data-efficient solution to extract precise query intent for
financial dialog systems.
Comment: 10 pages, 2 figures, 2 tables, 2 appendices
Detecting missing content queries in an SMS-Based HIV/AIDS FAQ retrieval system
Automated Frequently Asked Question (FAQ) answering systems use pre-stored sets of question-answer pairs as an information source to answer natural language questions posed by users. The main problem with this kind of information source is that there is no guarantee that a relevant question-answer pair exists for every user query. In this paper, we propose to deploy a binary classifier in an existing SMS-Based HIV/AIDS FAQ retrieval system to detect user queries that have no relevant question-answer pair in the FAQ document collection. Before deploying such a classifier, we first evaluate different feature sets for training in order to determine which sets of features build the model that yields the best classification accuracy. We carry out our evaluation using seven different feature sets generated from a query log before and after retrieval by the FAQ retrieval system. Our results suggest that combining different feature sets markedly improves the classification accuracy.
A Critical Review of Temporal Database Management Systems
There has been significant research activity in Temporal Databases during the last decade. However, the development of a semantics of time, a temporal model for efficient database systems and temporal query languages still needs much study. Based on the research of the TDB group [Snodgrass 1987], the review of research about TDBMS in this dissertation mainly emphasises three aspects as follows.
1) The formulation of a semantics of time at the conceptual level. A topology of time and types of time attributes are introduced. A new taxonomy for time attributes is presented: assertion time, event time, and recording time.
2) The development of a model for TDBMS analogous to relational databases. Based on Snodgrass' classification, four kinds of databases (snapshot, rollback, historical and temporal) are discussed in depth. However, the discussion departs from the representation of the TDB model in some important ways: for most enterprises, a historical relation is an interval relation, not a sequence of snapshot slices indexed by valid time. The term "tuple" no longer simply refers to an entity as in traditional relational databases. It refers to different levels of representation of an object: entity, entity state, observation of entity, and observation of entity state in different types of databases.
3) The design of temporal query languages. We do not present a new temporal query language in this dissertation, but we discuss a Quel-like temporal query language, TQuel, in some depth. TQuel is compared with two other temporal query languages, TOSQL and Legol 2.0. We centre the main discussion on TQuel's semantics for tuple calculus. The classification of the relationships between overlapping intervals suggests an approach using temporal logic to classify the derived tuples in tuple calculus. Under such an approach, a new presentation for tuple modification calculus is proposed, not only for interval relations, but also for event relations.
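To illustrate what classifying the relationship between two valid-time intervals can look like, here is a minimal Python sketch. The abstract does not give the dissertation's own classification, so this sketch substitutes the well-known relations of Allen's interval algebra; the `Interval` type, the `relation` function, and the half-open integer representation are all assumptions made for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Hypothetical valid-time interval [start, end), assuming start < end."""
    start: int
    end: int


def relation(a: Interval, b: Interval) -> str:
    """Name the Allen-style relation of interval a with respect to interval b."""
    # Disjoint or merely touching intervals.
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if b.end < a.start:
        return "after"
    if b.end == a.start:
        return "met-by"
    # From here on the two intervals share at least one instant.
    if a == b:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished-by"
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a.start < b.start else "overlapped-by"


if __name__ == "__main__":
    print(relation(Interval(1, 5), Interval(3, 8)))
```

A query processor could use such a label to decide how two tuples with intersecting valid times should be combined, but how TQuel itself does this is not stated in the abstract.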
Committee-Based Sample Selection for Probabilistic Classifiers
In many real-world learning tasks, it is expensive to acquire a sufficient
number of labeled examples for training. This paper investigates methods for
reducing annotation cost by 'sample selection'. In this approach, during
training the learning program examines many unlabeled examples and selects for
labeling only those that are most informative at each stage. This avoids
redundantly labeling examples that contribute little new information. Our work
follows on previous research on Query By Committee, extending the
committee-based paradigm to the context of probabilistic classification. We
describe a family of empirical methods for committee-based sample selection in
probabilistic classification models, which evaluate the informativeness of an
example by measuring the degree of disagreement between several model variants.
These variants (the committee) are drawn randomly from a probability
distribution conditioned by the training set labeled so far. The method was
applied to the real-world natural language processing task of stochastic
part-of-speech tagging. We find that all variants of the method achieve a
significant reduction in annotation cost, although their computational
efficiency differs. In particular, the simplest variant, a two-member committee
with no parameters to tune, gives excellent results. We also show that sample
selection yields a significant reduction in the size of the model used by the
tagger.
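The two-member committee idea can be sketched in a few lines of Python. This is a simplified illustration, not the paper's tagger: it assumes a per-word multinomial over a hypothetical two-tag set, draws each committee member from a Dirichlet posterior over the tag counts labeled so far, and selects an example for labeling when the two members disagree. The names `TAGS`, `sample_prediction`, and `should_label`, and the prior value, are assumptions.

```python
import random
from collections import Counter, defaultdict

TAGS = ["NOUN", "VERB"]  # hypothetical tag set for illustration


def sample_prediction(counts, word, rng, prior=1.0):
    """Predict a tag for `word` using one randomly drawn committee member.

    The member's tag distribution is a sample from a Dirichlet posterior
    (obtained via Gamma draws) conditioned on the tag counts seen so far.
    Normalizing the draws is unnecessary because only the argmax is used.
    """
    draws = [rng.gammavariate(counts[word][tag] + prior, 1.0) for tag in TAGS]
    return TAGS[draws.index(max(draws))]


def should_label(counts, word, rng):
    """Two-member committee: request a label only if the members disagree."""
    first = sample_prediction(counts, word, rng)
    second = sample_prediction(counts, word, rng)
    return first != second


if __name__ == "__main__":
    rng = random.Random(0)
    counts = defaultdict(Counter)
    counts["run"]["NOUN"] = 3
    counts["run"]["VERB"] = 4
    print(should_label(counts, "run", rng))
```

Words with many consistent labels produce near-identical members and are rarely selected, while unseen or ambiguous words produce frequent disagreement, which is the behavior the abstract describes as selecting only the most informative examples.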
Analyse de sentiments et classification des phrases dans les longues requêtes de recherche de livres (Sentiment analysis and sentence classification in long book-search queries)
Handling long queries can involve either reducing their size by retaining only useful sentences, or decomposing a long query into several short queries based on their content. Proper sentence classification improves the utility of these procedures. Can sentiment analysis play a role in sentence classification? This paper analyzes the correlation between sentiment analysis and sentence classification in long book-search queries. It also studies the similarity in writing style between book reviews and sentences in book-search queries. To accomplish this study, a semi-supervised method for sentiment intensity prediction and a language model based on book reviews are presented. Graphical illustrations of the study's findings are also provided, followed by interpretations and conclusions.