2,672 research outputs found
Mungo and StMungo: tools for typechecking protocols in Java
We present two tools that support static typechecking of communica- tion protocols in Java. Mungo associates Java classes with typestate specifications, which are state machines defining permitted sequences of method calls. StMungo translates a communication protocol specified in the Scribble protocol description language into a typestate specification for each role in the protocol by following the message sequence. Role implementations can be typechecked by Mungo to ensure that they satisfy their protocols, and then compiled as usual with javac. We demonstrate the Scribble, StMungo and Mungo toolchain via a typechecked POP3 client that can communicate with a real-world POP3 server
Integrating Distributed Sources of Information for Construction Cost Estimating using Semantic Web and Semantic Web Service technologies
A construction project requires collaboration of several organizations such as owner, designer, contractor, and material supplier organizations. These organizations need to exchange information to enhance their teamwork. Understanding the information received from other organizations requires specialized human resources. Construction cost estimating is one of the processes that requires information from several sources including a building information model (BIM) created by designers, estimating assembly and work item information maintained by contractors, and construction material cost data provided by material suppliers. Currently, it is not easy to integrate the information necessary for cost estimating over the Internet. This paper discusses a new approach to construction cost estimating that uses Semantic Web technology. Semantic Web technology provides an infrastructure and a data modeling format that enables accessing, combining, and sharing information over the Internet in a machine processable format. The estimating approach presented in this paper relies on BIM, estimating knowledge, and construction material cost data expressed in a web ontology language. The approach presented in this paper makes the various sources of estimating data accessible as Simple Protocol and Resource Description Framework Query Language (SPARQL) endpoints or Semantic Web Services. We present an estimating application that integrates distributed information provided by project designers, contractors, and material suppliers for preparing cost estimates. The purpose of this paper is not to fully automate the estimating process but to streamline it by reducing human involvement in repetitive cost estimating activities
You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models
RESTful APIs are popular web services, requiring documentation to ease their
comprehension, reusability and testing practices. The OpenAPI Specification
(OAS) is a widely adopted and machine-readable format used to document such
APIs. However, manually documenting RESTful APIs is a time-consuming and
error-prone task, resulting in unavailable, incomplete, or imprecise
documentation. As RESTful API testing tools require an OpenAPI specification as
input, insufficient or informal documentation hampers testing quality.
Recently, Large Language Models (LLMs) have demonstrated exceptional
abilities to automate tasks based on their colossal training data. Accordingly,
such capabilities could be utilized to assist the documentation and testing
process of RESTful APIs.
In this paper, we present RESTSpecIT, the first automated RESTful API
specification inference and black-box testing approach leveraging LLMs. The
approach requires minimal user input compared to state-of-the-art RESTful API
inference and testing tools; Given an API name and an LLM key, HTTP requests
are generated and mutated with data returned by the LLM. By sending the
requests to the API endpoint, HTTP responses can be analyzed for inference and
testing purposes. RESTSpecIT utilizes an in-context prompt masking strategy,
requiring no model fine-tuning. Our evaluation demonstrates that RESTSpecIT is
capable of: (1) inferring specifications with 85.05% of GET routes and 81.05%
of query parameters found on average, (2) discovering undocumented and valid
routes and parameters, and (3) uncovering server errors in RESTful APIs.
Inferred specifications can also be used as testing tool inputs
From Natural Language Specifications to Program Input Parsers
We present a method for automatically generating input parsers from English specifications of input file formats. We use a Bayesian generative model to capture relevant natural language phenomena and translate the English specification into a specification tree, which is then translated into a C++ input parser. We model the problem as a joint dependency parsing and semantic role labeling task. Our method is based on two sources of information: (1) the correlation between the text and the specification tree and (2) noisy supervision as determined by the success of the generated C++ parser in reading input examples.
Our results show that our approach achieves 80.0\% F-Score accuracy compared to an F-Score of 66.7\% produced by a state-of-the-art semantic parser on a dataset of input format specifications from the ACM International Collegiate Programming Contest (which were written in English for humans with no intention of providing support for automated processing)National Science Foundation (U.S.) (Grant IIS-0835652)Battelle Memorial Institute (PO #300662
Let's Discover More API Relations: A Large Language Model-based AI Chain for Unsupervised API Relation Inference
APIs have intricate relations that can be described in text and represented
as knowledge graphs to aid software engineering tasks. Existing relation
extraction methods have limitations, such as limited API text corpus and
affected by the characteristics of the input text.To address these limitations,
we propose utilizing large language models (LLMs) (e.g., GPT-3.5) as a neural
knowledge base for API relation inference. This approach leverages the entire
Web used to pre-train LLMs as a knowledge base and is insensitive to the
context and complexity of input texts. To ensure accurate inference, we design
our analytic flow as an AI Chain with three AI modules: API FQN Parser, API
Knowledge Extractor, and API Relation Decider. The accuracy of the API FQN
parser and API Relation Decider module are 0.81 and 0.83, respectively. Using
the generative capacity of the LLM and our approach's inference capability, we
achieve an average F1 value of 0.76 under the three datasets, significantly
higher than the state-of-the-art method's average F1 value of 0.40. Compared to
CoT-based method, our AI Chain design improves the inference reliability by
67%, and the AI-crowd-intelligence strategy enhances the robustness of our
approach by 26%
Learning Interpretable Spatial Operations in a Rich 3D Blocks World
In this paper, we study the problem of mapping natural language instructions
to complex spatial actions in a 3D blocks world. We first introduce a new
dataset that pairs complex 3D spatial operations to rich natural language
descriptions that require complex spatial and pragmatic interpretations such as
"mirroring", "twisting", and "balancing". This dataset, built on the simulation
environment of Bisk, Yuret, and Marcu (2016), attains language that is
significantly richer and more complex, while also doubling the size of the
original dataset in the 2D environment with 100 new world configurations and
250,000 tokens. In addition, we propose a new neural architecture that achieves
competitive results while automatically discovering an inventory of
interpretable spatial operations (Figure 5)Comment: AAAI 201
NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings
Current approaches for service composition (assemblies of atomic services)
require developers to use: (a) domain-specific semantics to formalize services
that restrict the vocabulary for their descriptions, and (b) translation
mechanisms for service retrieval to convert unstructured user requests to
strongly-typed semantic representations. In our work, we argue that effort to
developing service descriptions, request translations, and matching mechanisms
could be reduced using unrestricted natural language; allowing both: (1)
end-users to intuitively express their needs using natural language, and (2)
service developers to develop services without relying on syntactic/semantic
description languages. Although there are some natural language-based service
composition approaches, they restrict service retrieval to syntactic/semantic
matching. With recent developments in Machine learning and Natural Language
Processing, we motivate the use of Sentence Embeddings by leveraging richer
semantic representations of sentences for service description, matching and
retrieval. Experimental results show that service composition development
effort may be reduced by more than 44\% while keeping a high precision/recall
when matching high-level user requests with low-level service method
invocations.Comment: This paper will appear on SCC'19 (IEEE International Conference on
Services Computing) on July 1
AP: Artificial Programming
The ability to automatically discover a program consistent with a given user intent (specification) is the holy grail of Computer Science. While significant progress has been made on the so-called problem of Program Synthesis, a number of challenges remain; particularly for the case of synthesizing richer and larger programs. This is in large part due to the difficulty of search over the space of programs. In this paper, we argue that the above-mentioned challenge can be tackled by learning synthesizers automatically from a large amount of training data. We present a first step in this direction by describing our novel synthesis approach based on two neural architectures for tackling the two key challenges of Learning to understand partial input-output specifications and Learning to search programs. The first neural architecture called the Spec Encoder computes a continuous representation of the specification, whereas the second neural architecture called the Program Generator incrementally constructs programs in a hypothesis space that is conditioned by the specification vector. The key idea of the approach is to train these architectures using a large set of (spec,P) pairs, where P denotes a program sampled from the DSL L and spec denotes the corresponding specification satisfied by P. We demonstrate the effectiveness of our approach on two preliminary instantiations. The first instantiation, called Neural FlashFill, corresponds to the domain of string manipulation programs similar to that of FlashFill. The second domain considers string transformation programs consisting of composition of API functions. We show that a neural system is able to perform quite well in learning a large majority of programs from few input-output examples. We believe this new approach will not only dramatically expand the applicability and effectiveness of Program Synthesis, but also would lead to the coming together of the Program Synthesis and Machine Learning research disciplines
- …