13,693 research outputs found

    Exploiting Cognitive Structure for Adaptive Learning

    Full text link
    Adaptive learning, also known as adaptive teaching, relies on learning path recommendation, which sequentially recommends personalized learning items (e.g., lectures, exercises) to satisfy the unique needs of each learner. Although it is well known that modeling the cognitive structure including knowledge level of learners and knowledge structure (e.g., the prerequisite relations) of learning items is important for learning path recommendation, existing methods for adaptive learning often separately focus on either knowledge levels of learners or knowledge structure of learning items. To fully exploit the multifaceted cognitive structure for learning path recommendation, we propose a Cognitive Structure Enhanced framework for Adaptive Learning, named CSEAL. By viewing path recommendation as a Markov Decision Process and applying an actor-critic algorithm, CSEAL can sequentially identify the right learning items to different learners. Specifically, we first utilize a recurrent neural network to trace the evolving knowledge levels of learners at each learning step. Then, we design a navigation algorithm on the knowledge structure to ensure the logicality of learning paths, which reduces the search space in the decision process. Finally, the actor-critic algorithm is used to determine what to learn next and whose parameters are dynamically updated along the learning path. Extensive experiments on real-world data demonstrate the effectiveness and robustness of CSEAL.Comment: Accepted by KDD 2019 Research Track. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD'19

    MinoanER: Schema-Agnostic, Non-Iterative, Massively Parallel Resolution of Web Entities

    Get PDF
    Entity Resolution (ER) aims to identify different descriptions in various Knowledge Bases (KBs) that refer to the same entity. ER is challenged by the Variety, Volume and Veracity of entity descriptions published in the Web of Data. To address them, we propose the MinoanER framework that simultaneously fulfills full automation, support of highly heterogeneous entities, and massive parallelization of the ER process. MinoanER leverages a token-based similarity of entities to define a new metric that derives the similarity of neighboring entities from the most important relations, as they are indicated only by statistics. A composite blocking method is employed to capture different sources of matching evidence from the content, neighbors, or names of entities. The search space of candidate pairs for comparison is compactly abstracted by a novel disjunctive blocking graph and processed by a non-iterative, massively parallel matching algorithm that consists of four generic, schema-agnostic matching rules that are quite robust with respect to their internal configuration. We demonstrate that the effectiveness of MinoanER is comparable to existing ER tools over real KBs exhibiting low Variety, but it outperforms them significantly when matching KBs with high Variety.Comment: Presented at EDBT 2001

    Counteracting Bloom Filter Encoding Techniques for Private Record Linkage

    Get PDF
    Record Linkage is a process of combining records representing same entity spread across multiple and different data sources, primarily for data analytics. Traditionally, this could be performed with comparing personal identifiers present in data (e.g., given name, surname, social security number etc.). However, sharing information across databases maintained by disparate organizations leads to exchange of personal information pertaining to an individual. In practice, various statutory regulations and policies prohibit the disclosure of such identifiers. Private record linkage (PRL) techniques have been implemented to execute record linkage without disclosing any information about other dissimilar records. Various techniques have been proposed to implement PRL, including cryptographically secure multi-party computational protocols. However, these protocols have been debated over the scalability factors as they are computationally extensive by nature. Bloom filter encoding (BFE) for private record linkage has become a topic of recent interest in the medical informatics community due to their versatility and ability to match records approximately in a manner that is (ostensibly) privacy-preserving. It also has the advantage of computing matches directly in plaintext space making them much faster than their secure mutli-party computation counterparts. The trouble with BFEs lies in their security guarantees: by their very nature BFEs leak information to assist in the matching process. Despite this known shortcoming, BFEs continue to be studied in the context of new heuristically designed countermeasures to address known attacks. A new class of set-intersection attack is proposed in this thesis which re-examines the security of BFEs by conducting experiments, demonstrating an inverse relationship between security and accuracy. With real-world deployment of BFEs in the health information sector approaching, the results from this work will generate renewed discussion around the security of BFEs as well as motivate research into new, more efficient multi-party protocols for private approximate matching

    Generative Filtering for Recursive Bayesian Inference with Streaming Data

    Full text link
    In the streaming data setting, where data arrive continuously or in frequent batches and there is no pre-determined amount of total data, Bayesian models can employ recursive updates, incorporating each new batch of data into the model parameters' posterior distribution. Filtering methods are currently used to perform these updates efficiently, however, they suffer from eventual degradation as the number of unique values within the filtered samples decreases. We propose Generative Filtering, a method for efficiently performing recursive Bayesian updates in the streaming setting. Generative Filtering retains the speed of a filtering method while using parallel updates to avoid degenerate distributions after repeated applications. We derive rates of convergence for Generative Filtering and conditions for the use of sufficient statistics instead of fully storing all past data. We investigate the alleviation of filtering degradation through simulation and Ecological species count data
    • …
    corecore