6,823 research outputs found
Automated information extraction from web APIs documentation
A fundamental characteristic of Web APIs is the fact that, de facto, providers hardly follow any standard practices while implementing, publishing, and documenting their APIs. As a consequence, the discovery and use of these services by third parties is significantly hampered. In order to achieve further automation while exploiting Web APIs we present an approach for automatically extracting relevant technical information from the Web pages documenting them. In particular we have devised two algorithms that automatically extract technical details such as operation names, operation descriptions or URI templates from the documentation of Web APIs adopting either RPC or RESTful interfaces. The algorithms devised, which exploit advanced DOM processing as well as state of the art Information Extraction and Natural Language Processing techniques, have been evaluated against a detailed dataset exhibiting a high precision and recall–around 90% for both REST and RPC APIs outperforming state of the art information extraction algorithms
Statically Checking Web API Requests in JavaScript
Many JavaScript applications perform HTTP requests to web APIs, relying on
the request URL, HTTP method, and request data to be constructed correctly by
string operations. Traditional compile-time error checking, such as calling a
non-existent method in Java, are not available for checking whether such
requests comply with the requirements of a web API. In this paper, we propose
an approach to statically check web API requests in JavaScript. Our approach
first extracts a request's URL string, HTTP method, and the corresponding
request data using an inter-procedural string analysis, and then checks whether
the request conforms to given web API specifications. We evaluated our approach
by checking whether web API requests in JavaScript files mined from GitHub are
consistent or inconsistent with publicly available API specifications. From the
6575 requests in scope, our approach determined whether the request's URL and
HTTP method was consistent or inconsistent with web API specifications with a
precision of 96.0%. Our approach also correctly determined whether extracted
request data was consistent or inconsistent with the data requirements with a
precision of 87.9% for payload data and 99.9% for query data. In a systematic
analysis of the inconsistent cases, we found that many of them were due to
errors in the client code. The here proposed checker can be integrated with
code editors or with continuous integration tools to warn programmers about
code containing potentially erroneous requests.Comment: International Conference on Software Engineering, 201
Abmash: Mashing Up Legacy Web Applications by Automated Imitation of Human Actions
Many business web-based applications do not offer applications programming
interfaces (APIs) to enable other applications to access their data and
functions in a programmatic manner. This makes their composition difficult (for
instance to synchronize data between two applications). To address this
challenge, this paper presents Abmash, an approach to facilitate the
integration of such legacy web applications by automatically imitating human
interactions with them. By automatically interacting with the graphical user
interface (GUI) of web applications, the system supports all forms of
integrations including bi-directional interactions and is able to interact with
AJAX-based applications. Furthermore, the integration programs are easy to
write since they deal with end-user, visual user-interface elements. The
integration code is simple enough to be called a "mashup".Comment: Software: Practice and Experience (2013)
Developing a comprehensive framework for multimodal feature extraction
Feature extraction is a critical component of many applied data science
workflows. In recent years, rapid advances in artificial intelligence and
machine learning have led to an explosion of feature extraction tools and
services that allow data scientists to cheaply and effectively annotate their
data along a vast array of dimensions---ranging from detecting faces in images
to analyzing the sentiment expressed in coherent text. Unfortunately, the
proliferation of powerful feature extraction services has been mirrored by a
corresponding expansion in the number of distinct interfaces to feature
extraction services. In a world where nearly every new service has its own API,
documentation, and/or client library, data scientists who need to combine
diverse features obtained from multiple sources are often forced to write and
maintain ever more elaborate feature extraction pipelines. To address this
challenge, we introduce a new open-source framework for comprehensive
multimodal feature extraction. Pliers is an open-source Python package that
supports standardized annotation of diverse data types (video, images, audio,
and text), and is expressly with both ease-of-use and extensibility in mind.
Users can apply a wide range of pre-existing feature extraction tools to their
data in just a few lines of Python code, and can also easily add their own
custom extractors by writing modular classes. A graph-based API enables rapid
development of complex feature extraction pipelines that output results in a
single, standardized format. We describe the package's architecture, detail its
major advantages over previous feature extraction toolboxes, and use a sample
application to a large functional MRI dataset to illustrate how pliers can
significantly reduce the time and effort required to construct sophisticated
feature extraction workflows while increasing code clarity and maintainability
- …