340 research outputs found

    ARTE: Automated Generation of Realistic Test Inputs for Web APIs

    Automated test case generation for web APIs is a thriving research topic, where test cases are frequently derived from the API specification. However, this process is only partially automated, since testers are usually obliged to manually set meaningful valid test inputs for each input parameter. In this article, we present ARTE, an approach for the automated extraction of realistic test data for web APIs from knowledge bases like DBpedia. Specifically, ARTE leverages the specification of the API parameters to automatically search for realistic test inputs using natural language processing, search-based, and knowledge extraction techniques. ARTE has been integrated into RESTest, an open-source testing framework for RESTful APIs, fully automating the test case generation process. Evaluation results on 140 operations from 48 real-world web APIs show that ARTE can efficiently generate realistic test inputs for 64.9% of the target parameters, outperforming the state-of-the-art approach SAIGEN (31.8%). More importantly, ARTE supported the generation of over twice as many valid API calls (57.3%) as random generation (20%) and SAIGEN (26%), leading to a higher failure detection capability and uncovering several real-world bugs. These results show the potential of ARTE for enhancing existing web API testing tools, achieving an unprecedented level of automation.
    Funding: Junta de Andalucía APOLO (US-1264651); Junta de Andalucía EKIPMENT-PLUS (P18-FR-2895); Ministerio de Ciencia, Innovación y Universidades RTI2018-101204-B-C21 (HORATIO); Ministerio de Ciencia, Innovación y Universidades RED2018-102472-
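    The core idea of mining a knowledge base for realistic parameter values can be sketched with a small query against DBpedia's public SPARQL endpoint. This is a minimal illustration only, not ARTE's actual pipeline: the mapping from a parameter name to a DBpedia class (here, "country" to dbo:Country) is assumed to be given, whereas ARTE derives it automatically with NLP and search-based techniques.

```python
# Illustrative sketch: fetch realistic test values for an API parameter
# by listing English labels of instances of a DBpedia class.
import requests

DBPEDIA_SPARQL = "https://dbpedia.org/sparql"

def realistic_inputs(dbpedia_class: str, limit: int = 10) -> list[str]:
    """Return English labels of instances of the given DBpedia ontology class."""
    query = f"""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX dbo:  <http://dbpedia.org/ontology/>
    SELECT DISTINCT ?label WHERE {{
        ?s a dbo:{dbpedia_class} ;
           rdfs:label ?label .
        FILTER (lang(?label) = "en")
    }} LIMIT {limit}
    """
    resp = requests.get(DBPEDIA_SPARQL, params={"query": query, "format": "json"}, timeout=30)
    resp.raise_for_status()
    return [b["label"]["value"] for b in resp.json()["results"]["bindings"]]

# Hypothetical mapping from a spec parameter named "country" to dbo:Country.
print(realistic_inputs("Country"))  # e.g. ['Albania', 'Algeria', ...]
```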

    Workshop sensing a changing world : proceedings workshop November 19-21, 2008


    A pattern-driven corpus to predictive analytics in mitigating SQL injection attack

    The back-end database provides accessible and structured storage for each web application's big-data internet traffic exchanges, from cloud-hosted web applications to Internet of Things (IoT) smart devices in emerging computing. Structured Query Language Injection Attack (SQLIA) remains an intruder's exploit of choice for stealing confidential information from the databases of vulnerable front-end web applications, with potentially damaging security ramifications.
    Existing solutions to SQLIA still follow the on-premise web application server hosting concept; they were primarily developed before the recent challenges of big data mining and as such lack the functionality and ability to cope with new attack signatures concealed in a large volume of web requests. Moreover, most organisations' databases and services infrastructure no longer reside on-premise: internet cloud-hosted applications and services are increasingly used, which limits existing Structured Query Language Injection (SQLI) detection and prevention approaches that rely on source code scanning. A bio-inspired approach such as Machine Learning (ML) predictive analytics provides functional and scalable mining of big data for the detection and prevention of SQLI when intercepting large volumes of web requests. Unfortunately, the lack of a robust, ready-made data set with patterns and historical data items to train a classifier is a well-known issue in SQLIA research applying ML in the field of Artificial Intelligence (AI). The purpose-built, competition-driven test case data sets are antiquated and not pattern-driven, and so cannot train a classifier for real-world application. Web application types are also too diverse for any all-purpose generic data set to serve ML SQLIA mitigation.
    This thesis addresses the lack of a pattern-driven data set by deriving one to predict SQLIA of any size, and by proposing a technique to obtain a data set on the fly, breaking the cycle of reliance on the few outdated competition-driven data sets, which are not meant to benchmark real-world SQLIA mitigation. As its contributions, the thesis derives a pattern-driven data set of related member strings that is used to train a supervised learning model, validated through the Receiver Operating Characteristic (ROC) curve and Confusion Matrix (CM), with results showing low false positives and false negatives. Cross-validation further yields a low variance in accuracy, indicating a successfully trained model, built from the derived pattern-driven data set, that generalises to unknown real-world data with reduced bias. We also demonstrate a proof of concept by implementing ML predictive analytics for SQLIA detection and prevention with this pattern-driven data set in a test web application. The experiments carried out in the course of this thesis show that a data set of related member strings can be generated from a web application's expected input data and SQL tokens, including known SQLI signatures. The data set extraction ontology proposed in this thesis for applied ML in SQLIA mitigation, in the context of the emerging computing of the big-data internet and cloud-hosted services, sets our proposal apart from existing approaches, which mostly rely on on-premise source code scanning and some form of query structure comparison.
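    As a rough illustration of the supervised-learning step described above, the following sketch trains a character n-gram classifier on a tiny hand-made set of benign inputs and known SQLI signatures and evaluates it with a confusion matrix. The example strings, the feature choice, and the model are all assumptions for demonstration; the thesis derives its pattern-driven data set far more systematically.

```python
# Illustrative sketch: classify benign vs. SQL-injection strings with a
# supervised model and report a confusion matrix, as in the validation
# described above. The tiny training set here is hand-made for demo only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

benign = ["john.doe@example.com", "O'Brien", "42 Main Street", "search term"]
sqli = ["' OR '1'='1", "1; DROP TABLE users--", "admin'--",
        "' UNION SELECT password FROM users--"]
X = benign + sqli
y = [0] * len(benign) + [1] * len(sqli)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

# Character n-grams capture SQLI token patterns (quotes, comments, keywords).
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression())
model.fit(X_train, y_train)
print(confusion_matrix(y_test, model.predict(X_test)))
```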

    Analysis and modeling a distributed co-operative multi agent system for scaling-up business intelligence

    Modeling a distributed co-operative multi-agent system in the area of Business Intelligence is a relatively new topic. During the work carried out, a software system, the Integrated Intelligent Advisory Model (IIAM), has been developed; it is a personal finance portfolio manager.

    Real-time classifiers from free-text for continuous surveillance of small animal disease

    A wealth of information of epidemiological importance is held within unstructured narrative clinical records. Text mining provides computational techniques for extracting usable information from the language used to communicate between humans, including the spoken and written word. The aim of this work was to develop text-mining methodologies capable of rendering the large volume of information within veterinary clinical narratives accessible for research and surveillance purposes. The free-text records collated within the dataset of the Small Animal Veterinary Surveillance Network formed the development material and target of this work. The efficacy of pre-existing clinician-assigned coding applied to the dataset was evaluated, and the nature of the notation and vocabulary used in documenting consultations was explored and described. Consultation records were pre-processed to improve human and software readability, and software was developed to redact incidental identifiers present within the free-text. An automated system able to classify for the presence of clinical signs, utilising only information present within the free-text record, was developed with the aim of facilitating timely detection of spatio-temporal trends in clinical signs.
    Clinician-assigned main-reason-for-visit coding provided a poor summary of the large quantity of information exchanged during a veterinary consultation, and the nature of the coding and questionnaire triggering further obfuscated information. Delineation of the previously undocumented veterinary clinical sublanguage identified common themes and their manner of documentation; this was key to the development of programmatic methods. A rule-based classifier using logically chosen dictionaries, sequential processing and data-masking redacted identifiers while maintaining the research usability of records. Highly sensitive and specific free-text classification was achieved by applying classifiers for individual clinical signs within a context-sensitive scaffold, which permitted or prohibited matching depending on the clinical context in which a clinical sign was documented. The mean sensitivity achieved within an unseen test dataset was 98.17% (74.47, 99.9) and the mean specificity 99.94% (77.1, 100.0). When used in combination to identify animals with any of a group of gastrointestinal clinical signs, the sensitivity achieved was 99.44% (95% CI: 98.57, 99.78) and the specificity 99.74% (95% CI: 99.62, 99.83). This work illustrates the importance, utility and promise of free-text classification of clinical records and provides a framework within which this is possible whilst respecting the confidentiality of client and clinician.
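    A minimal sketch of the context-sensitive, rule-based classification described above: a dictionary of terms for one clinical sign, plus a crude negation scaffold that prohibits matches in negated contexts. The terms, negation cues, and context window are illustrative assumptions, not the thesis's actual dictionaries or scaffold.

```python
# Illustrative sketch: dictionary-based detection of one clinical sign
# ("vomiting") in a free-text narrative, with a simple negation check
# that blocks matches documented in a negated clinical context.
import re

SIGN_TERMS = re.compile(r"\b(vomit\w*|emesis|haematemesis)\b", re.IGNORECASE)
NEGATION_CUES = re.compile(r"\b(no|not|without|denies|absence of)\b", re.IGNORECASE)
WINDOW = 30  # characters of preceding context inspected for negation

def has_sign(narrative: str) -> bool:
    """True if the clinical sign is mentioned in a non-negated context."""
    for match in SIGN_TERMS.finditer(narrative):
        context = narrative[max(0, match.start() - WINDOW):match.start()]
        if not NEGATION_CUES.search(context):
            return True  # affirmed mention found
    return False  # no mention, or all mentions negated

print(has_sign("Owner reports vomiting x3 since yesterday."))  # True
print(has_sign("Bright, alert. No vomiting or diarrhoea."))    # False
```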

    Digital Classical Philology

    The buzzwords “Information Society” and “Age of Access” suggest that information is now universally accessible without any form of hindrance. Indeed, the German constitution calls for all citizens to have open access to information. Yet in reality, there are multifarious hurdles to information access – whether physical, economic, intellectual, linguistic, political, or technical. Thus, while new methods and practices for making information accessible arise on a daily basis, we are nevertheless confronted by limitations to information access in various domains. This new book series assembles academics and professionals from different fields in order to illuminate the many dimensions of information's inaccessibility. While the series discusses principles and techniques for transcending the hurdles to information access, it also addresses necessary boundaries to accessibility. This book describes the state of the art of digital philology with a focus on ancient Greek and Latin. It addresses problems such as the accessibility of information about Greek and Latin sources, data entry, and the collection and analysis of Classical texts, and describes the fundamental role of libraries in building digital catalogs and developing machine-readable citation systems.
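    One widely used machine-readable citation scheme in digital classical philology is the Canonical Text Services (CTS) URN. A minimal sketch of splitting such a URN into its components follows; the example is the well-known Perseus identifier for Homer, Iliad 1.1, and the dictionary field names are my own labels, while real CTS URNs may also omit the passage component.

```python
# Illustrative sketch: split a CTS URN into its machine-readable parts.
def parse_cts_urn(urn: str) -> dict:
    prefix, protocol, namespace, work, passage = urn.split(":")
    assert (prefix, protocol) == ("urn", "cts"), "not a CTS URN"
    return {"namespace": namespace, "work": work, "passage": passage}

print(parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1"))
# {'namespace': 'greekLit', 'work': 'tlg0012.tlg001.perseus-grc2', 'passage': '1.1'}
```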