24,326 research outputs found

    Logical analysis of data as a tool for the analysis of probabilistic discrete choice behavior

    Probabilistic Discrete Choice Models (PDCM) have been extensively used to interpret the behavior of heterogeneous decision makers who face discrete alternatives. The classification approach of Logical Analysis of Data (LAD) uses discrete optimization to generate patterns, which are logic formulas characterizing the different classes. Patterns can be seen as rules explaining the phenomenon under analysis. In this work, we discuss how LAD can be used as the first phase of the specification of PDCM. Since the number of patterns generated in this task may be extremely large, and many of them may be nearly equivalent, additional processing is necessary to obtain practically meaningful information. Hence, we propose computationally viable techniques to obtain small sets of patterns that constitute meaningful representations of the phenomenon and allow the discovery of significant associations between subsets of explanatory variables and the output. We consider the complex socio-economic problem of analyzing the utilization of the Internet in Italy, using real data gathered by the Italian National Institute of Statistics.

    Movie Popularity Classification based on Inherent Movie Attributes using C4.5,PART and Correlation Coefficient

    The abundance of movie data across the internet makes it an obvious candidate for machine learning and knowledge discovery. However, most research is directed towards bi-polar classification of movies or the generation of movie recommendation systems based on reviews given by viewers on various internet sites. Classification of movie popularity based solely on the attributes of a movie, such as actor, actress, director rating, language, country and budget, has received less attention because of the large number of attributes associated with each movie and their differences in dimensions. In this paper, we propose a classification scheme for pre-release movie popularity based on inherent attributes using the C4.5 and PART classifier algorithms, and define the relation between attributes of post-release movies using the correlation coefficient. Comment: 6 pages

    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report then proposes a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

    Data Mining


    Email for clinical communication between healthcare professionals

    Email is one of the most widely used methods of communication, but its use in healthcare is still uncommon. Where email communication has been utilised in healthcare, its purposes have included clinical communication between healthcare professionals, but the effects of using email in this way are not well known. We updated a 2012 review of the use of email for two-way clinical communication between healthcare professionals.