271 research outputs found
Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey
The emergence of natural language processing has revolutionized the way users
interact with tabular data, enabling a shift from traditional query languages
and manual plotting to more intuitive, language-based interfaces. The rise of
large language models (LLMs) such as ChatGPT and its successors has further
advanced this field, opening new avenues for natural language processing
techniques. This survey presents a comprehensive overview of natural language
interfaces for tabular data querying and visualization, which allow users to
interact with data using natural language queries. We introduce the fundamental
concepts and techniques underlying these interfaces with a particular emphasis
on semantic parsing, the key technology facilitating the translation from
natural language to SQL queries or data visualization commands. We then delve
into the recent advancements in Text-to-SQL and Text-to-Vis problems from the
perspectives of datasets, methodologies, metrics, and system designs. This
includes a deep dive into the influence of LLMs, highlighting their strengths,
limitations, and potential for future improvements. Through this survey, we aim
to provide a roadmap for researchers and practitioners interested in developing
and applying natural language interfaces for data interaction in the era of
large language models.
Comment: 20 pages, 4 figures, 5 tables. Submitted to IEEE TKD
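The semantic-parsing step at the heart of this survey, mapping a natural-language question to a SQL query, can be illustrated with a deliberately tiny rule-based sketch. Modern systems use neural models and LLMs rather than hand-written patterns; the templates, table names, and column names below are hypothetical.

```python
import re
from typing import Optional

# Toy rule-based semantic parser: two illustrative templates, not a real
# system. The table and column names are hypothetical.
PATTERNS = [
    (re.compile(r"how many (\w+) are there", re.I),
     lambda m: f"SELECT COUNT(*) FROM {m.group(1)};"),
    (re.compile(r"list all (\w+) where (\w+) is (\w+)", re.I),
     lambda m: f"SELECT * FROM {m.group(1)} WHERE {m.group(2)} = '{m.group(3)}';"),
]

def parse_to_sql(question: str) -> Optional[str]:
    """Return SQL for the first matching template, or None if nothing matches."""
    for pattern, build in PATTERNS:
        match = pattern.search(question)
        if match:
            return build(match)
    return None

print(parse_to_sql("How many employees are there?"))  # SELECT COUNT(*) FROM employees;
print(parse_to_sql("What is the weather?"))           # None
```

Neural semantic parsers replace the hand-written templates with a learned mapping, but the interface, text in and SQL out, is the same.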
Translating Natural Language Queries to SQL Using the T5 Model
This paper presents the development process of a natural language to SQL
model using the T5 model as the basis. The models, developed in August 2022 for
an online transaction processing system and a data warehouse, achieve 73% and
84% exact-match accuracy, respectively. These models, in conjunction with other
work completed in the research project, were implemented for several companies
and used successfully on a daily basis. The approach used in the model
development could be implemented in a similar fashion for other database
environments and with a more powerful pre-trained language model.
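The exact-match metric cited above is, in its simplest form, a string comparison between predicted and gold SQL. The sketch below is illustrative: real benchmarks such as Spider use more careful component-level matching, and the normalization here is minimal.

```python
def exact_match_accuracy(predictions: list, references: list) -> float:
    """Fraction of predicted SQL strings identical to the gold SQL
    after trivial whitespace/semicolon/case normalization."""
    def norm(sql: str) -> str:
        return " ".join(sql.replace(";", " ").split()).lower()
    assert len(predictions) == len(references)
    matches = sum(norm(p) == norm(r) for p, r in zip(predictions, references))
    return matches / len(references)

preds = ["SELECT name FROM users;", "select * from orders"]
golds = ["select name from users", "SELECT id FROM orders"]
print(exact_match_accuracy(preds, golds))  # 0.5
```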
Demonstration of InsightPilot: An LLM-Empowered Automated Data Exploration System
Exploring data is crucial in data analysis, as it helps users understand and
interpret the data more effectively. However, performing effective data
exploration requires in-depth knowledge of the dataset and expertise in data
analysis techniques. Not being familiar with either can create obstacles that
make the process time-consuming and overwhelming for data analysts. To address
this issue, we introduce InsightPilot, an LLM (Large Language Model)-based,
automated data exploration system designed to simplify the data exploration
process. InsightPilot automatically selects appropriate analysis intents, such
as understanding, summarizing, and explaining. Then, these analysis intents are
concretized by issuing corresponding intentional queries (IQueries) to create a
meaningful and coherent exploration sequence. In brief, an IQuery is an
abstraction and automation of data analysis operations, which mimics the
approach of data analysts and simplifies the exploration process for users. By
employing an LLM to iteratively collaborate with a state-of-the-art insight
engine via IQueries, InsightPilot is effective in analyzing real-world
datasets, enabling users to gain valuable insights through natural language
inquiries. We demonstrate the effectiveness of InsightPilot in a case study,
showing how it can help users gain valuable insights from their datasets.
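The intent-to-IQuery loop described above might be sketched as follows. InsightPilot's actual interfaces are not public, so the types and names here are purely illustrative.

```python
from dataclasses import dataclass

# Hypothetical shape of an IQuery: an analysis intent concretized
# against a target column or subspace of the dataset.
@dataclass
class IQuery:
    intent: str      # e.g. "understand", "summarize", "explain"
    target: str      # column or subspace the query focuses on

def explore(columns: list, intents: list) -> list:
    """Produce a naive exploration sequence: one IQuery per (intent, column).
    A real system would let the LLM pick and order these iteratively."""
    sequence = []
    for intent in intents:
        for col in columns:
            sequence.append(IQuery(intent=intent, target=col))
    return sequence

seq = explore(["sales", "region"], ["summarize", "explain"])
print(len(seq))   # 4
print(seq[0])     # IQuery(intent='summarize', target='sales')
```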
Toward enhancement of deep learning techniques using fuzzy logic: a survey
Deep learning has recently emerged as a branch of artificial intelligence (AI) and machine learning (ML) that typically imitates how humans acquire particular kinds of knowledge. It is considered an essential element of data science, encompassing predictive modeling and statistics, and it makes collecting, interpreting, and analyzing big data easier and faster. Deep neural networks are ML models in which non-linear processing units are layered to extract particular features from the inputs. Training such networks is very expensive, however, and the outcome depends on the optimization method used, so optimal results may not be obtained; deep learning techniques are also vulnerable to noise in the data. For these reasons, fuzzy systems are used to improve the performance of deep learning algorithms, especially in combination with neural networks, and to improve the representation accuracy of deep learning models. This survey paper reviews deep learning models and techniques incorporating fuzzy logic that were presented and proposed in previous studies, where fuzzy logic is used to improve deep learning performance. The approaches are divided into two categories based on how the two paradigms are combined. Furthermore, the practicality of the models in real-world settings is discussed.
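One common way fuzzy systems are combined with neural models, as surveys in this area discuss, is to fuzzify crisp inputs through membership functions before further processing. Below is a minimal sketch using Gaussian membership functions; the linguistic labels and parameters are illustrative, not taken from the paper.

```python
import math

def gaussian_membership(x: float, center: float, sigma: float) -> float:
    """Degree of membership of x in a fuzzy set with the given center/width."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))

def fuzzify(x: float, sets: dict) -> dict:
    """Map a crisp input to membership degrees, one per linguistic label."""
    return {label: gaussian_membership(x, c, s) for label, (c, s) in sets.items()}

# Illustrative temperature fuzzy sets: (center, sigma) per label.
temp_sets = {"cold": (0.0, 10.0), "warm": (20.0, 10.0), "hot": (40.0, 10.0)}
degrees = fuzzify(25.0, temp_sets)
print(max(degrees, key=degrees.get))  # warm
```

The resulting membership vector can then serve as the input features of a neural network, which is one of the hybridization patterns such surveys categorize.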
DevSecOps for web applications: a case study
The DevOps paradigm streamlines the software delivery process, reducing the barriers between the teams involved in development and operations.
It relies on pipelines to structure the development process through to delivery. These structures enable the automation of many tasks, avoiding human error and freeing team members from slow, repetitive work. More predictable and accurate development allows teams to reduce the time required for software deliveries and to make them more frequent. Despite the paradigm's wide adoption, the increase in deliveries cannot compromise the security of the developed solutions; companies that neglect security factors may incur financial costs and tarnish their reputations. Joining security and DevOps gives rise to a new paradigm, DevSecOps. It aims to bring greater quality compliance and avoid risk by adding security considerations that uncover potential security defects before delivery. Web application architecture, being accessible by design, presents a large exposed area. This project presents a list of common security issues found during research in the web application security domain, analyses which tools are used to detect and solve these problems, what time implications they have for overall software delivery, and how effective they are at defect detection. It concludes by implementing a pipeline using the DevSecOps paradigm to establish its viability in improving software quality.
Knowledge Discovery and Data Mining (KDDM) survey report.
The large number of government and industry activities supporting the Unit of Action (UA), with attendant documents, reports, and briefings, can overwhelm decision-makers with an overabundance of information that hampers their ability to make quick decisions, often resulting in a form of gridlock. In particular, the large and rapidly increasing amounts of data and data formats stored on UA Advanced Collaborative Environment (ACE) servers have led to the realization that manual analysis leading to timely decisions has become impractical, even impossible. UA Program Management (PM UA) has recognized the need to implement a Decision Support System (DSS) on UA ACE. The objective of this document is to research the commercial Knowledge Discovery and Data Mining (KDDM) market and publish the results in a survey. Furthermore, a ranking mechanism based on UA ACE-specific criteria has been developed and applied to a representative set of commercially available KDDM solutions. In addition, an overview of four R&D areas identified as critical to the implementation of DSS on ACE is provided. Finally, a comprehensive database containing detailed information on surveyed KDDM tools has been developed and is available upon customer request.
Advances in Meta-Heuristic Optimization Algorithms in Big Data Text Clustering
This paper presents a comprehensive survey of meta-heuristic optimization algorithms for text clustering applications and highlights their main procedures. These Artificial Intelligence (AI) algorithms are recognized as promising swarm intelligence methods owing to their success in solving machine learning problems, especially text clustering. The paper reviews the relevant literature on meta-heuristic-based text clustering applications, including many variants, such as basic, modified, hybridized, and multi-objective methods. The main procedures of text clustering are also described and critically discussed. The review reports the advantages and disadvantages of these approaches and recommends potential future research paths. The main keywords considered in this paper are text, clustering, meta-heuristic, optimization, and algorithm.
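At its core, meta-heuristic text clustering treats cluster assignment as an optimization problem over an objective such as within-cluster distance. Below is a minimal sketch using a simple mutate-and-keep-improvements search; real methods use swarm-based operators, and the 2-D "document vectors" here are toy stand-ins for tf-idf features.

```python
import random

def within_cluster_cost(points, assignment, k):
    """Sum of squared distances from each point to its cluster centroid."""
    cost = 0.0
    for c in range(k):
        members = [p for p, a in zip(points, assignment) if a == c]
        if not members:
            continue
        centroid = [sum(dim) / len(members) for dim in zip(*members)]
        cost += sum(sum((x - m) ** 2 for x, m in zip(p, centroid))
                    for p in members)
    return cost

def random_search_clustering(points, k, iters=2000, seed=0):
    """Meta-heuristic baseline: mutate a random assignment, keep improvements."""
    rng = random.Random(seed)
    best = [rng.randrange(k) for _ in points]
    best_cost = within_cluster_cost(points, best, k)
    for _ in range(iters):
        cand = best[:]
        cand[rng.randrange(len(points))] = rng.randrange(k)  # single mutation
        cand_cost = within_cluster_cost(points, cand, k)
        if cand_cost < best_cost:
            best, best_cost = cand, cand_cost
    return best, best_cost

# Two obvious groups of toy "document vectors".
points = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0), (5.0, 5.1), (5.1, 5.0), (5.0, 5.0)]
labels, cost = random_search_clustering(points, k=2)
print("separated:", labels[0] != labels[3], "cost:", round(cost, 3))
```

Swarm-based variants differ mainly in how candidate assignments are generated and recombined, not in the objective being optimized.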
Enhanced Prediction of Network Attacks Using Incomplete Data
For years, intrusion detection has been considered a key component of many organizations’ network defense capabilities. Although a number of approaches to intrusion detection have been tried, few have been capable of providing security personnel responsible for the protection of a network with sufficient information to make adjustments and respond to attacks in real-time. Because intrusion detection systems rarely have complete information, false negatives and false positives are extremely common, and thus valuable resources are wasted responding to irrelevant events. In order to provide better actionable information for security personnel, a mechanism for quantifying the confidence level in predictions is needed. This work presents an approach which seeks to combine a primary prediction model with a novel secondary confidence level model which provides a measurement of the confidence in a given attack prediction being made. The ability to accurately identify an attack and quantify the confidence level in the prediction could serve as the basis for a new generation of intrusion detection devices, devices that provide earlier and better alerts for administrators and allow more proactive response to events as they are occurring
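The primary-plus-confidence architecture described above can be sketched as follows; the paper's actual models are not specified here, so the threshold classifier and the binned empirical-accuracy confidence model are illustrative stand-ins.

```python
# Illustrative pairing of a primary attack classifier with a secondary
# confidence model; both components are hypothetical stand-ins.
def primary_predict(score: float, threshold: float = 0.5) -> bool:
    """Primary model: flag an event as an attack if its anomaly score is high."""
    return score >= threshold

def build_confidence_model(history, bins: int = 5):
    """Secondary model: per score-bin empirical accuracy of past predictions.
    history holds (anomaly_score, prediction_was_correct) pairs."""
    counts = [[0, 0] for _ in range(bins)]   # [correct, total] per bin
    for score, correct in history:
        b = min(int(score * bins), bins - 1)
        counts[b][1] += 1
        if correct:
            counts[b][0] += 1
    def confidence(score: float) -> float:
        b = min(int(score * bins), bins - 1)
        correct, total = counts[b]
        return correct / total if total else 0.5   # uninformed prior
    return confidence

history = [(0.9, True), (0.85, True), (0.95, True),
           (0.1, True), (0.55, False), (0.6, True)]
confidence = build_confidence_model(history)
print(primary_predict(0.92), round(confidence(0.92), 2))  # True 1.0
```

Reporting the prediction alongside its confidence lets security personnel triage alerts: a low-confidence positive can be queued for review rather than trigger an immediate response.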
Agriculture 4.0 and beyond: Evaluating cyber threat intelligence sources and techniques in smart farming ecosystems
The digitisation of agriculture, integral to Agriculture 4.0, has brought significant benefits while simultaneously escalating cybersecurity risks. With the rapid adoption of smart farming technologies and infrastructure, the agricultural sector has become an attractive target for cyberattacks. This paper presents a systematic literature review that assesses the applicability of existing cyber threat intelligence (CTI) techniques within smart farming infrastructures (SFIs). We develop a comprehensive taxonomy of CTI techniques and sources, specifically tailored to the SFI context, addressing the unique cyber threat challenges in this domain. A crucial finding of our review is the identified need for a virtual Chief Information Security Officer (vCISO) in smart agriculture. While the concept of a vCISO is not yet established in the agricultural sector, our study highlights its potential significance. The implementation of a vCISO could play a pivotal role in enhancing cybersecurity measures by offering strategic guidance, developing robust security protocols, and facilitating real-time threat analysis and response strategies. This approach is critical for safeguarding the food supply chain against the evolving landscape of cyber threats. Our research underscores the importance of integrating a vCISO framework into smart farming practices as a vital step towards strengthening cybersecurity. This is essential for protecting the agriculture sector in the era of digital transformation, ensuring the resilience and sustainability of the food supply chain against emerging cyber risks