
    Bridging the gap between the semantic web and big data: answering SPARQL queries over NoSQL databases

    Nowadays, the database field has become much more diverse, and as a result a variety of non-relational (NoSQL) databases have been created, including JSON-document databases and key-value stores, as well as extensible markup language (XML) and graph databases. Due to the emergence of this new generation of data services, some of the problems associated with big data have been resolved. However, in the haste to address the challenges of big data, NoSQL databases abandoned several core database features that make relational systems extremely efficient and functional, for instance the global view, which enables users to access data regardless of how it is logically structured or physically stored in its sources. In this article, we propose a method that allows us to query non-relational databases based on the ontology-based data access (OBDA) framework by delegating SPARQL Protocol and RDF Query Language (SPARQL) queries from the ontology to the NoSQL database. We applied the method to the popular Couchbase database and discuss the results obtained.
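    To make the delegation concrete, here is a minimal Python sketch of the idea, not the authors' implementation: a small mapping table rewrites a SPARQL basic graph pattern into a Couchbase N1QL query. The ontology IRIs, bucket name, and field names are invented for illustration.

    ```python
    # Minimal sketch of OBDA-style query delegation: a SPARQL basic graph
    # pattern is rewritten into a Couchbase N1QL query via a mapping table.
    # All mappings, bucket and field names here are illustrative assumptions.

    # Mapping from ontology properties to (bucket, JSON field) pairs.
    MAPPINGS = {
        "http://example.org/onto#name":  ("users", "name"),
        "http://example.org/onto#email": ("users", "email"),
    }

    def sparql_to_n1ql(triple_patterns):
        """Translate triple patterns (?s <prop> ?var) into one N1QL query."""
        bucket = None
        fields = []
        for _, prop, var in triple_patterns:
            b, field = MAPPINGS[prop]
            if bucket is None:
                bucket = b
            elif bucket != b:
                raise ValueError("this sketch handles a single bucket only")
            fields.append(f"`{field}` AS {var.lstrip('?')}")
        return f"SELECT {', '.join(fields)} FROM `{bucket}`"

    # Example: SELECT ?n ?e WHERE { ?s onto:name ?n . ?s onto:email ?e }
    query = sparql_to_n1ql([
        ("?s", "http://example.org/onto#name",  "?n"),
        ("?s", "http://example.org/onto#email", "?e"),
    ])
    print(query)  # SELECT `name` AS n, `email` AS e FROM `users`
    ```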

    Data-driven Job Search Engine Using Skills and Company Attribute Filters

    According to a report online, more than 200 million unique users search for jobs online every month. This incredibly large and fast-growing demand has enticed software giants such as Google and Facebook to enter this space, which was previously dominated by companies such as LinkedIn, Indeed and CareerBuilder. Recently, Google released its "AI-powered Jobs Search Engine", "Google For Jobs", while Facebook released "Facebook Jobs" within its platform. These current job search engines and platforms allow users to search for jobs based on broad filters such as job title, date posted, experience level, company and salary. However, they have severely limited filters relating to skill sets such as C++, Python, and Java, and to company-related attributes such as employee size, revenue, technographics and micro-industries. These specialized filters can help applicants and companies connect at a very personalized, relevant and deeper level. In this paper, we present a framework that provides an end-to-end "Data-driven Jobs Search Engine". In addition, users can also receive potential contacts of recruiters and senior positions for connection and networking opportunities. The high-level implementation of the framework is as follows: 1) collect job postings data in the United States, 2) extract meaningful tokens from the postings data using ETL pipelines, 3) normalize the data set to link company names to their specific company websites, 4) extract and rank the skill sets, 5) link the company names and websites to their respective company-level attributes with the EVERSTRING Company API, 6) run user-specific search queries on the database to identify relevant job postings, and 7) rank the job search results. This framework offers a highly customizable and highly targeted search experience for end users.
    Comment: 8 pages, 10 figures, ICDM 201
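    As a rough illustration of steps 2 and 4 (token extraction and skill ranking), the following Python sketch matches postings against a small skill vocabulary and ranks skills by frequency. The vocabulary and postings are invented, and the paper's actual ETL pipelines are certainly more elaborate.

    ```python
    # Hypothetical sketch of steps 2 and 4: extract skill tokens from job
    # postings against a known skill vocabulary and rank them by frequency.
    # The vocabulary and postings are illustrative, not the paper's data.
    import re
    from collections import Counter

    SKILL_VOCAB = {"c++", "python", "java", "sql", "spark"}

    def extract_skills(posting: str) -> list[str]:
        """Tokenize a posting and keep only tokens found in the vocabulary."""
        tokens = re.findall(r"[a-zA-Z+#]+", posting.lower())
        return [t for t in tokens if t in SKILL_VOCAB]

    postings = [
        "Backend engineer: Python, SQL and Spark experience required.",
        "Systems developer with strong C++ and Python skills.",
    ]

    ranked = Counter(s for p in postings for s in extract_skills(p))
    print(ranked.most_common())  # e.g. [('python', 2), ('sql', 1), ...]
    ```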

    Science Gateways with Embedded Ontology-based E-learning Support

    Science gateways are widely utilised in a range of scientific disciplines to provide user-friendly access to complex distributed computing infrastructures. The traditional approach in science gateway development is to concentrate on this simplified resource access and provide scientists with a graphical user interface to conduct their experiments and visualise the results. However, as the user communities behind these gateways are growing and opening their doors to less experienced scientists or even to the general public as "citizen scientists", there is an emerging need to extend these gateways with training and learning support capabilities. This paper describes a novel approach showing how science gateways can be extended with embedded e-learning support using an ontology-based learning environment called Knowledge Repository Exchange and Learning (KREL). The paper also presents a prototype implementation of a science gateway for analysing earthquake data and demonstrates how the KREL can extend this gateway with ontology-based embedded e-learning support.
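    As a hedged sketch of how ontology-based e-learning support might connect a gateway task to learning material (the predicates, IRIs, and the use of the rdflib library are assumptions for illustration, not KREL's actual design):

    ```python
    # Sketch: link a gateway's computational task to learning material
    # through a small RDF graph, then query it with SPARQL via rdflib.
    # All resource IRIs and predicates below are invented.
    from rdflib import Graph, Namespace, Literal, URIRef

    EX = Namespace("http://example.org/krel#")
    g = Graph()
    task = URIRef("http://example.org/tasks/earthquake-analysis")
    lesson = URIRef("http://example.org/lessons/seismic-data-basics")

    g.add((task, EX.requiresConcept, EX.SeismicWaveforms))
    g.add((lesson, EX.teachesConcept, EX.SeismicWaveforms))
    g.add((lesson, EX.title, Literal("Working with seismic waveform data")))

    # Find learning material relevant to the task the user is running.
    q = """
    SELECT ?lesson ?title WHERE {
      ?task   <http://example.org/krel#requiresConcept> ?c .
      ?lesson <http://example.org/krel#teachesConcept>  ?c ;
              <http://example.org/krel#title>           ?title .
    }
    """
    for row in g.query(q):
        print(row.lesson, row.title)
    ```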

    WikiSensing: A collaborative sensor management system with trust assessment for big data

    Big Data for sensor networks and collaborative systems has become ever more important in the digital economy and is a focal point of technological interest, while posing many noteworthy challenges. This research addresses some of the challenges in the areas of online collaboration and Big Data for sensor networks. It demonstrates WikiSensing (www.wikisensing.org), a high-performance, heterogeneous, collaborative data cloud for managing and analysing real-time sensor data. The system is based on a Big Data architecture with comprehensive functionalities for smart city sensor data integration and analysis. It is fully functional and served as the main data management platform for the 2013 UPLondon Hackathon. The system is unique in that it introduces a novel methodology incorporating online collaboration with sensor data. While other platforms are available for sensor data management, WikiSensing is one of the first that enables online collaboration by providing services to store and query dynamic sensor information without any restriction on the type and format of sensor data. An emerging challenge of collaborative sensor systems is modelling and assessing the trustworthiness of sensors and their measurements, which is directly relevant to WikiSensing as an open collaborative sensor data management system. If the trustworthiness of the sensor data can be accurately assessed, WikiSensing becomes more than a collaborative data management system for sensors: it is also a platform that informs its users about the validity of its data. Hence this research presents a new generic framework for capturing and analysing sensor trustworthiness, considering the different forms of evidence available to the user. It uses an extensible set of metrics that can represent such evidence and applies Bayesian analysis to develop a trust classification model. Several publications are based on this work, with others at the final stage of submission. Further improvements are also planned to make the platform a cloud service accessible to any online user, building up a community of collaborators for smart city research.
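    As a simple illustration of Bayesian trust assessment (a minimal sketch under assumed evidence semantics, not WikiSensing's actual model), a sensor's trustworthiness can be tracked as a Beta distribution updated by binary evidence such as agreement with neighbouring sensors' readings:

    ```python
    # Illustrative sketch: a Beta-Bernoulli Bayesian update of a sensor's
    # trustworthiness from binary evidence, e.g. whether a reading agrees
    # with neighbouring sensors. Evidence values here are hypothetical.

    def update_trust(alpha: float, beta: float, agrees: bool):
        """One Bayesian update step on a Beta(alpha, beta) trust prior."""
        return (alpha + 1, beta) if agrees else (alpha, beta + 1)

    alpha, beta = 1.0, 1.0                       # uniform prior: no evidence yet
    evidence = [True, True, False, True, True]   # hypothetical observations
    for agrees in evidence:
        alpha, beta = update_trust(alpha, beta, agrees)

    trust = alpha / (alpha + beta)               # posterior mean trust estimate
    label = "trusted" if trust > 0.7 else "uncertain"
    print(f"trust={trust:.2f} -> {label}")       # trust=0.71 -> trusted
    ```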

    Ideas 2019 Report Out

    No abstract available.

    Semantic Data Management in Data Lakes

    In recent years, data lakes emerged as a way to manage large amounts of heterogeneous data for modern data analytics. One way to prevent data lakes from turning into inoperable data swamps is semantic data management. Some approaches propose linking metadata to knowledge graphs based on the Linked Data principles to provide more meaning and semantics to the data in the lake. Such a semantic layer may be utilized not only for data management but also to tackle the problem of data integration from heterogeneous sources, in order to make data access more expressive and interoperable. In this survey, we review recent approaches with a specific focus on their application within data lake systems and scalability to Big Data. We classify the approaches into (i) basic semantic data management, (ii) semantic modeling approaches for enriching metadata in data lakes, and (iii) methods for ontology-based data access. In each category, we cover the main techniques and their background, and compare the latest research. Finally, we point out challenges for future work in this research area, which needs a closer integration of Big Data and Semantic Web technologies.
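    To illustrate category (ii), semantic metadata enrichment, the following Python sketch (using the rdflib library; the predicates, dataset names, and concept IRIs are assumptions for illustration) links a data lake column to a knowledge graph concept so that data can later be discovered by meaning rather than by column name:

    ```python
    # Minimal sketch of semantic metadata enrichment in a data lake:
    # annotate a dataset column with a knowledge graph concept, then
    # discover data by concept. All IRIs below are invented examples.
    from rdflib import Graph, Namespace, URIRef, Literal

    LAKE = Namespace("http://example.org/lake#")
    DBP = Namespace("http://dbpedia.org/resource/")

    g = Graph()
    dataset = URIRef("http://example.org/lake/datasets/sales_2024.csv")
    col = URIRef("http://example.org/lake/datasets/sales_2024.csv#customer")

    g.add((dataset, LAKE.hasColumn, col))
    g.add((col, LAKE.columnName, Literal("customer")))
    g.add((col, LAKE.refersTo, DBP.Customer))  # link into the knowledge graph

    # Discover every dataset column that refers to the Customer concept.
    for subject, _, _ in g.triples((None, LAKE.refersTo, DBP.Customer)):
        print(subject)
    ```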