The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification
We present the Bayesian Case Model (BCM), a general framework for Bayesian
case-based reasoning (CBR) and prototype classification and clustering. BCM
brings the intuitive power of CBR to a Bayesian generative framework. The BCM
learns prototypes, the "quintessential" observations that best represent
clusters in a dataset, by performing joint inference on cluster labels,
prototypes and important features. Simultaneously, BCM pursues sparsity by
learning subspaces, the sets of features that play important roles in the
characterization of the prototypes. The prototype and subspace representation
provides quantitative benefits in interpretability while preserving
classification accuracy. Human subject experiments verify statistically
significant improvements to participants' understanding when using explanations
produced by BCM, compared to those given by prior art.
Comment: Published in Neural Information Processing Systems (NIPS) 2014.
BlogForever D2.6: Data Extraction Methodology
This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.