Technical Issues in the Development of Knowledge-Based Services for the Semantic Web
The Semantic Web aims to extend the current Web with formal semantics in order to improve how users experience the Web, by ameliorating current activities and supporting the automation of some others. So far, Semantic Web prototypes have mostly aimed at collecting and exposing information. Still, a semantic layer can support applying Knowledge-Based Systems techniques to the development of brand-new, fully-fledged Knowledge-Based Services for the Web. In this paper, we present the technical issues that have to be faced in the development of this kind of application by presenting the Online Design of Events Application: a Semantic Web-based design support system that assists event organisers in the process of preparing events such as workshops and conferences, by effectively reasoning over an inter-organisational process across the Web.
Website Content Extraction Using Web Structure Analysis
The Web presents itself as the largest data repository ever available in the history of
humankind. Major efforts have been made to provide efficient access to relevant
information within this huge repository of data. Although several techniques have been
developed for the problem of Web data extraction, their use is still not widespread, mostly
because of the need for extensive human intervention and the low quality of the extraction
results. This project presents a domain-oriented approach to Web data extraction and discusses
its application to extracting news from Web sites. It uses the abstraction method to
identify important sections in a web document. Relevant information is taken into
account and highlighted in order to produce a focused web content output. The
facts and data for the project were gathered from various sources, such as
the internet and books. The methodology used is a Waterfall Model that involves several
phases: Planning, Analysis, Design and Implementation. The result of this
project is the display and review of web content extraction and of how it is currently
being developed, with the goal of giving web users greater usability and ease of
use.
Interoperability and FAIRness through a novel combination of Web technologies
Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo, Dataverse or EUDAT). These data have widely different levels of sensitivity and security considerations. For example, clinical observations about genetic mutations in patients are highly sensitive, while observations of species diversity are generally not. The lack of uniformity in data models from one repository to another, and in the richness and availability of metadata descriptions, makes integration and analysis of these data a manual, time-consuming task with no scalability. Here we explore a set of resource-oriented Web design patterns for data discovery, accessibility, transformation, and integration that can be implemented by any general- or special-purpose repository as a means to assist users in finding and reusing their data holdings. We show that by using off-the-shelf technologies, interoperability can be achieved at the level of an individual spreadsheet cell. We note that the behaviours of this architecture compare favourably to the desiderata defined by the FAIR Data Principles, and can therefore represent an exemplar implementation of those principles. The proposed interoperability design patterns may be used to improve discovery and integration of both new and legacy data, maximizing the utility of all scholarly outputs.
Approach for Unwrapping the Unstructured to Structured Data the Case of Classified Ads in HTML Format
Data sources in various forms and formats are available on the Internet. The data can be semi-structured or
unstructured. The objective of this research is to develop an approach for unwrapping unstructured data available on the Internet
into structured data (a database). The unstructured data used in this study are classified ads from an Indonesian website,
in HTML format. An illustration was made to test the approach. The test results show an
F-measure of 99.13%.
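For readers unfamiliar with the reported metric, the F-measure is commonly defined as below. This is a general reminder, not the paper's own derivation: the abstract does not say which variant was used, so the balanced form (often written F1) is an assumption, and TP, FP and FN (true positives, false positives, false negatives among extracted fields) are notation introduced here.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2\,P\,R}{P + R}
```

Under this definition, a value of 99.13% means that precision and recall are both very close to 1.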
Enhanced biomedical data extraction from scientific publications
The field of scientific research is constantly expanding, with thousands of new articles being published every day. As online databases grow, so does the need for technologies capable of navigating and extracting key information from the stored publications. In the biomedical field, these articles lay the foundation for advancing our understanding of human health and improving medical practices. With such a vast amount of data available, it can be difficult for researchers to quickly and efficiently extract the information they need. The challenge is compounded by the fact that many existing tools are expensive, hard to learn and not compatible with all article types. To address this, a prototype was developed. This prototype leverages the PubMed API to provide researchers access to the information in numerous open access articles. Features include the tracking of keywords and high-frequency words along with the possibility of extracting table content. The prototype is designed to streamline the process of extracting data from research articles, allowing researchers to more efficiently analyze and synthesize information from multiple sources.
Master's thesis in informatics (INF399, MAMN-INF, MAMN-PRO)
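As a rough illustration of what tracking keywords and high-frequency words could look like, the sketch below counts words in text that has already been retrieved. It is not the thesis prototype and does not call the PubMed API; the function names `top_words` and `keyword_hits` and the small stop-word list are hypothetical and introduced here only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "with"}


def top_words(text, n=5):
    """Return the n most frequent words in text, ignoring stop words and case."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


def keyword_hits(text, keywords):
    """Count whole-word, case-insensitive occurrences of each tracked keyword."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return {k: counts[k.lower()] for k in keywords}


if __name__ == "__main__":
    sample = "Protein folding and protein binding in the cell. The protein is stable."
    print(top_words(sample, 3))
    print(keyword_hits(sample, ["Protein", "cell", "gene"]))
```

Splitting on runs of letters keeps the sketch short, but it breaks hyphenated terms and identifiers that contain digits, which a tool for biomedical text would need to handle differently.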
The NASA Astrophysics Data System: Architecture
The powerful discovery capabilities available in the ADS bibliographic
services are possible thanks to the design of a flexible search and retrieval
system based on a relational database model. Bibliographic records are stored
as a corpus of structured documents containing fielded data and metadata, while
discipline-specific knowledge is segregated in a set of files independent of
the bibliographic data itself.
The creation and management of links to both internal and external resources
associated with each bibliography in the database is made possible by
representing them as a set of document properties and their attributes.
To improve global access to the ADS data holdings, a number of mirror sites
have been created by cloning the database contents and software on a variety of
hardware and software platforms.
The procedures used to create and manage the database and its mirrors have
been written as a set of scripts that can be run in either an interactive or
unsupervised fashion.
The ADS can be accessed at http://adswww.harvard.edu
Comment: 25 pages, 8 figures, 3 tables
Abmash: Mashing Up Legacy Web Applications by Automated Imitation of Human Actions
Many business web-based applications do not offer applications programming
interfaces (APIs) to enable other applications to access their data and
functions in a programmatic manner. This makes their composition difficult (for
instance to synchronize data between two applications). To address this
challenge, this paper presents Abmash, an approach to facilitate the
integration of such legacy web applications by automatically imitating human
interactions with them. By automatically interacting with the graphical user
interface (GUI) of web applications, the system supports all forms of
integrations including bi-directional interactions and is able to interact with
AJAX-based applications. Furthermore, the integration programs are easy to
write since they deal with end-user, visual user-interface elements. The
integration code is simple enough to be called a "mashup".
Comment: Software: Practice and Experience (2013)
KSNet-Approach to Knowledge Fusion from Distributed Sources
The speed of the decision-making process is an important factor in many areas of human life (business, healthcare, industry, military applications, etc.). Since the people responsible make decisions using the knowledge available to them, it is important for knowledge management systems to deliver the necessary information in a timely manner. Knowledge logistics is a new direction in knowledge management that addresses this need. The technology of knowledge fusion, based on the synergistic use of knowledge from multiple distributed sources, is a basis for these activities. The paper presents an overview of a Knowledge Source Network configuration approach (KSNet-approach) to knowledge fusion, a multi-agent architecture, and a research prototype of the KSNet knowledge fusion system based on this approach.