721 research outputs found

    Web Queries: From a Web of Data to a Semantic Web?

    Enhancing privacy and security within social networks

    The introduction of online social networking has allowed people across the world to share information with each other. It has risen to be one of the most popular forms of internet usage, with almost everyone having created one or more social network accounts. Along with its great benefits have come many concerns, though, chiefly that the overabundance of information has created a plethora of privacy issues, not only for the users of online social networks but also for the companies hosting them. Because of the wide range of privacy concerns associated with online social networks, it is incredibly difficult to tackle all of them. In this thesis, we propose works to tackle two privacy issues associated with online social networks: the friend search engine and image content. First, we introduce a new sub-graph approach to the friend search engine that removes the ability of attackers to gain more information about your friend list than your privacy settings allow. Second, we introduce a new privacy setting that allows users to define locations in which they do not wish their face to be seen in images. If an image is posted with their face in such a location, they will be privatized through facial replacement so that they are unrecognizable. The overall efficiency of these works will be tested so that their enhanced privacy does not cause usability issues if they are adopted by a social network site. These works allow users to remain more private while using social media and also help users to remain confident that their privacy is kept safe. These improvements not only help strengthen the privacy of users, but also help social network sites retain users who are more wary of privacy breaches online.

    Graph-Based Weakly-Supervised Methods for Information Extraction & Integration

    The variety and complexity of potentially related data resources available for querying (webpages, databases, data warehouses) have been growing ever more rapidly. There is a growing need to pose integrative queries across multiple such sources, exploiting foreign keys and other means of interlinking data to merge information from diverse sources. This has traditionally been the focus of research within the Information Extraction (IE) and Information Integration (II) communities, with IE focusing on converting unstructured sources into structured sources, and II focusing on providing a unified view of diverse structured data sources. However, most of the current IE and II methods, which can potentially be applied to the problem of integration across sources, require large amounts of human supervision, often in the form of annotated data. This need for extensive supervision makes existing methods expensive to deploy and difficult to maintain. In this thesis, we develop techniques that generalize from limited human input, via weakly-supervised methods for IE and II. In particular, we argue that graph-based representation of data and learning over such graphs can result in effective and scalable methods for large-scale Information Extraction and Integration. Within IE, we focus on the problem of assigning semantic classes to entities. First we develop a context pattern induction method to extend small initial entity lists of various semantic classes. We also demonstrate that features derived from such extended entity lists can significantly improve performance of state-of-the-art discriminative taggers. The output of pattern-based class-instance extractors is often high-precision and low-recall in nature, which is inadequate for many real-world applications. We use Adsorption, a graph-based label propagation algorithm, to significantly increase recall of an initial high-precision, low-recall pattern-based extractor by combining evidence from unstructured and structured text corpora. Building on Adsorption, we propose a new label propagation algorithm, Modified Adsorption (MAD), and demonstrate its effectiveness on various real-world datasets. Additionally, we show how class-instance acquisition performance in the graph-based SSL setting can be improved by incorporating additional semantic constraints available in independently developed knowledge bases. Within Information Integration, we develop a novel system, Q, which draws ideas from machine learning and databases to help a non-expert user construct data-integrating queries based on keywords (across databases) and interactive feedback on answers. We also present an information need-driven strategy for automatically incorporating new sources and their information in Q. We also demonstrate that Q's learning strategy is highly effective in combining the outputs of "black box" schema matchers and in re-weighting bad alignments. This removes the need to develop an expensive mediated schema, which has been necessary for most previous systems.

    Natural Language Processing: Emerging Neural Approaches and Applications

    This Special Issue highlights the most recent research being carried out in the NLP field and discusses related open issues, with a particular focus both on emerging approaches for language learning, understanding, production, and grounding, interactively or autonomously from data in cognitive and neural systems, and on their potential or real applications in different domains.

    Spatial Queries for Indoor Location-based Services

    Indoor Location-based Services (LBSs) assist people in indoor scenarios such as airports, train stations, shopping malls, and office buildings. Indoor spatial queries are the foundation of indoor LBSs. However, the existing techniques for indoor spatial queries offer limited support for more advanced queries that consider semantic information, temporal variations, and crowd influence. This work studies indoor spatial queries for indoor LBSs. Some typical proposals for indoor spatial queries are compared theoretically and experimentally. Then, it studies three advanced indoor spatial queries: a) Indoor Keyword-aware Routing Query, b) Indoor Temporal-variation aware Routing Query, and c) Indoor Crowd-aware Routing Query. A series of techniques are proposed to solve these problems.

    Usability and expressiveness in database keyword search : bridging the gap

    [no abstract]

    Model Transformation Languages with Modular Information Hiding

    Model transformations, together with models, form the principal artifacts in model-driven software development. Industrial practitioners report that transformations on larger models quickly become large and complex themselves. To alleviate the resulting maintenance effort, this thesis presents a modularity concept with explicit interfaces, complemented by software visualization and clustering techniques. All three approaches are tailored to the specific needs of the transformation domain.

    Doctor of Philosophy

    Linked data are the de facto standard in publishing and sharing data on the web. To date, we have been inundated with large amounts of ever-increasing linked data in constantly evolving structures. The proliferation of the data and the need to access and harvest knowledge from distributed data sources motivate us to revisit several classic problems in query processing and query optimization. The problem of answering queries over views is commonly encountered in a number of settings, including when enforcing security policies for access to linked data, or when integrating data from disparate sources. We approach this problem by efficiently rewriting queries over the views to equivalent queries over the underlying linked data, thus avoiding the costs entailed by view materialization and maintenance. An outstanding problem of query rewriting is that the number of rewritten queries is exponential in the size of the query and the views, which motivates us to study the problem of multiquery optimization in the context of linked data. Our solutions are declarative and make no assumptions about the underlying storage, i.e., they are store-independent. Unlike relational and XML data, linked data are schema-less. While tracking the evolution of schema for linked data is hard, keyword search is an ideal tool to perform data integration. Existing works make crippling assumptions about the data and hence fall short in handling massive linked data with tens to hundreds of millions of facts. Our study of keyword search on linked data brings together the classical techniques in the literature and our novel ideas, which leads to much better query efficiency and quality of the results. Linked data also contain rich temporal semantics. To cope with the ever-increasing data, we have investigated how to partition and store large temporal or multiversion linked data for distributed and parallel computation, in an effort to achieve load balancing and support scalable data analytics for massive linked data.

    The structure of broad topics on the web
