
    Big Data Analysis

    The value of big data is predicated on the ability to detect trends and patterns and, more generally, to make sense of large volumes of data that often comprise a heterogeneous mix of formats, structures, and semantics. Big data analysis is the component of the big data value chain that focuses on transforming raw acquired data into a coherent, usable resource suitable for analysis. Drawing on a range of interviews with key stakeholders in small and large companies and academia, this chapter outlines key insights, the state of the art, emerging trends, future requirements, and sectorial case studies for data analysis.

    Linked Data Entity Summarization

    On the Web, the amount of structured and Linked Data about entities is constantly growing. Descriptions of single entities often include thousands of statements, and the data becomes difficult to comprehend unless a selection of the most relevant facts is provided. This doctoral thesis addresses the problem of Linked Data entity summarization. The contributions involve two entity summarization approaches, a common API for entity summarization, and an approach for entity data fusion.

    User interfaces supporting entity search for linked data

    One of the main goals of semantic search is to retrieve and connect information related to queries, offering users rich structured information about a topic instead of a set of documents relevant to the topic. Previous work reports that searching for information about individual entities such as persons, places and organisations is the most common form of Web search. Since the Semantic Web was first proposed, the amount of structured data on the Web has increased dramatically. This is particularly the case for what is known as Linked Data, information that has been published using Semantic Web standards such as RDF and OWL. Such structured data opens up new possibilities for improving entity search on the Web, integrating facts from independent sources, and presenting users with contextually-rich information about entities. This research focuses on entity search of Linked Data in terms of three different forms of search: structured queries, where users can use the SPARQL query language for manipulating data sources; exploratory search, where users can browse from one entity to another; and focused search, where users can input an entity query as a free text keyword search. We undertake a comparative study between two distinct information architectures for structured querying to manipulate Linked Data over the Web. Specifically, we evaluate some of the main operators in SPARQL using several datasets of Linked Data. We introduce a framework of five criteria to evaluate 15 current state-of-the-art semantic tools available for exploratory search of Linked Data, in order to establish how well these browsers make available the benefits of Linked Data and entity search for human users. We also use the criteria to determine the browsers that are best suited to entity exploration. Further, we propose a new model, the Attribute Importance Model, for entity-aggregated search, with the purpose of improving user experience when finding information about entities. 
    The model develops three techniques: (1) presenting entity type-based query suggestions; (2) clustering aggregated attributes; and (3) ranking attributes based on their importance to a given query. Together these constitute a model for developing more informative views and enhancing users’ understanding of entity descriptions on the Web. We then use our model to provide an interactive approach, with the Information Visualisation toolkit InfoVis, that enables users to visualise entity clusters generated by our Attribute Importance Model. Thus, this thesis addresses two challenges of searching Linked Data. The first challenge concerns the specific issue of information resolution during the search: the reduction of query ambiguity and of redundant results that contain irrelevant descriptions when searching for information about an entity. The second challenge concerns the more general problem of technical complexity, and addresses the limited adoption of Linked Data, which we ascribe to the lack of understanding of Semantic Web technologies and data structures among general users. These technologies pose new design problems for human interaction, such as data overload, navigation styles, and browsing mechanisms. The Attribute Importance Model addresses both of these challenges.