An infrastructure for building semantic web portals
In this paper, we present our KMi semantic web portal infrastructure, which supports two important tasks of semantic web portals, namely metadata extraction and data querying. Central to our infrastructure are three components: i) an automated metadata extraction tool, ASDI, which supports the extraction of high-quality metadata from heterogeneous sources, ii) an ontology-driven question answering tool, AquaLog, which makes use of the domain-specific ontology and the semantic metadata extracted by ASDI to answer questions in natural language format, and iii) a semantic search engine, which enhances traditional text-based searching by making use of the underlying ontologies and the extracted metadata. A semantic web portal application has been built, which illustrates the usage of this infrastructure.
Distributed Information Retrieval using Keyword Auctions
This report motivates the need for large-scale distributed approaches to information retrieval, and proposes solutions based on keyword auctions.
Impliance: A Next Generation Information Management Appliance
ably successful in building a large market and adapting to the changes of the
last three decades, its impact on the broader market of information management
is surprisingly limited. If we were to design an information management system
from scratch, based upon today's requirements and hardware capabilities, would
it look anything like today's database systems?" In this paper, we introduce
Impliance, a next-generation information management system consisting of
hardware and software components integrated to form an easy-to-administer
appliance that can store, retrieve, and analyze all types of structured,
semi-structured, and unstructured information. We first summarize the trends
that will shape information management for the foreseeable future. Those trends
imply three major requirements for Impliance: (1) to be able to store, manage,
and uniformly query all data, not just structured records; (2) to be able to
scale out as the volume of this data grows; and (3) to be simple and robust in
operation. We then describe four key ideas that are uniquely combined in
Impliance to address these requirements, namely the ideas of: (a) integrating
software and off-the-shelf hardware into a generic information appliance; (b)
automatically discovering, organizing, and managing all data - unstructured as
well as structured - in a uniform way; (c) achieving scale-out by exploiting
simple, massive parallel processing; and (d) virtualizing compute and storage
resources to unify, simplify, and streamline the management of Impliance.
Impliance is an ambitious, long-term effort to define simpler, more robust, and
more scalable information systems for tomorrow's enterprises.
Comment: This article is published under a Creative Commons License Agreement
(http://creativecommons.org/licenses/by/2.5/). You may copy, distribute,
display, and perform the work, make derivative works, and make commercial use
of the work, but you must attribute the work to the author and CIDR 2007.
3rd Biennial Conference on Innovative Data Systems Research (CIDR), January
7-10, 2007, Asilomar, California, US
Reasoning & Querying – State of the Art
Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet, where keyword search is used in many applications, e.g. search engines, has familiarized casual users with using keyword queries to retrieve information on the internet. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming at enabling simple querying of semi-structured data, which is relevant e.g. in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.
Innovation through pertinent patents research based on physical phenomena involved
One can find innovative solutions to complex industrial problems by looking for knowledge in patents. Traditional search using keywords in databases of patents has been widely used. Currently, different computational methods that limit human intervention have been developed. We aim to define a method to improve the search for relevant patents in order to solve industrial problems and specifically to deduce evolution opportunities. The non-automatic, semi-automatic, and automatic search methods use keywords. For a detailed keyword search, we propose as a basis the functional decomposition and the analysis of the physical phenomena involved in the achievement of the function to fulfill. The search for solutions to design a bi-phasic separator for deep offshore applications illustrates the method presented in this paper.
Data-driven Job Search Engine Using Skills and Company Attribute Filters
According to a report online, more than 200 million unique users search for
jobs online every month. This incredibly large and fast-growing demand has
enticed software giants such as Google and Facebook to enter this space, which
was previously dominated by companies such as LinkedIn, Indeed and
CareerBuilder. Recently, Google released their "AI-powered Jobs Search Engine",
"Google For Jobs" while Facebook released "Facebook Jobs" within their
platform. These current job search engines and platforms allow users to search
for jobs based on general narrow filters such as job title, date posted,
experience level, company and salary. However, they have severely limited
filters relating to skill sets such as C++, Python, and Java and company
related attributes such as employee size, revenue, technographics and
micro-industries. These specialized filters can help applicants and companies
connect at a very personalized, relevant and deeper level. In this paper we
present a framework that provides an end-to-end "Data-driven Jobs Search
Engine". In addition, users can also receive potential contacts of recruiters
and senior positions for connection and networking opportunities. The high
level implementation of the framework is described as follows: 1) Collect job
postings data in the United States, 2) Extract meaningful tokens from the
postings data using ETL pipelines, 3) Normalize the data set to link company
names to their specific company websites, 4) Extract and rank the skill
sets, 5) Link the company names and websites to their respective company level
attributes with the EVERSTRING Company API, 6) Run user-specific search queries
on the database to identify relevant job postings and 7) Rank the job search
results. This framework offers a highly customizable and highly targeted search
experience for end users.
Comment: 8 pages, 10 figures, ICDM 201
Enriching videos with light semantics
This paper describes an ongoing prototypical framework to annotate and retrieve web videos with light semantics. The proposed framework reuses many existing vocabularies along with a video model. The knowledge is captured from three different information spaces (media content, context, document). We also describe ways to extract the semantic content descriptions from the existing user-generated content using multiple approaches of linguistic processing and Named Entity Recognition, which are later identified with DBpedia resources to establish meanings for the tags. Finally, the implemented prototype is described with multiple search interfaces and retrieval processes. Evaluation on semantic enrichment shows a considerable (50% of videos) improvement in content description.
Utilising semantic technologies for intelligent indexing and retrieval of digital images
The proliferation of digital media has led to a huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches as they in principle rely on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this paper we present a semantically-enabled image annotation and retrieval engine that is designed to satisfy the requirements of the commercial image collections market in terms of both accuracy and efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matching the original query. We also show how our well-analysed and designed domain ontology contributes to the implicit expansion of user queries as well as the exploitation of lexical databases for explicit semantic-based query expansion.