Generalized pattern extraction from concept lattices

Author: Balamane A.
Kwuida Léonard
Missaoui R.
Vaillancourt J.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/03/2014

Expressive generalized itemsets

Author: Baralis Elena Maria
Cagliero Luca
Cerquitelli Tania
D’Elia V.
Garza Paolo
Publication venue: Elsevier
Publication date: 01/01/2014

Generalized itemset mining is a powerful tool to discover multiple-level correlations among the analyzed data. A taxonomy is used to aggregate data items into higher-level concepts and to discover frequent recurrences among data items at different granularity levels. However, since traditional high-level itemsets may also represent the knowledge covered by their lower-level frequent descendant itemsets, the expressiveness of high-level itemsets can be rather limited. To overcome this issue, this article proposes two novel itemset types, called Expressive Generalized Itemset (EGI) and Maximal Expressive Generalized Itemset (Max-EGI), in which the frequency of occurrence of a high-level itemset is evaluated only on the portion of data not yet covered by any of its frequent descendants. Specifically, EGIs represent, at a high level of abstraction, the knowledge associated with sets of infrequent itemsets, while Max-EGIs compactly represent all the infrequent descendants of a generalized itemset. Furthermore, we also propose an algorithm to discover Max-EGIs on top of the traditionally mined itemsets. Experiments, performed on both real and synthetic datasets, demonstrate the effectiveness, efficiency, and scalability of the proposed approach.
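
As a rough illustration of the idea in this abstract, the sketch below counts the support of a generalized itemset only on the transactions that none of its frequent descendants already cover. It is a toy reading of the abstract, not the authors' algorithm: the one-level taxonomy, the transactions, the `min_count` threshold, and every function name are assumptions made up for this example.

```python
from itertools import product

# Hypothetical one-level taxonomy: item -> parent category.
TAXONOMY = {"apple": "fruit", "pear": "fruit", "kiwi": "fruit"}

# Hypothetical transactions.
TRANSACTIONS = [{"apple"}, {"apple"}, {"apple"}, {"pear"}, {"pear"}, {"kiwi"}]


def matches(itemset, transaction, taxonomy):
    """True if every item of itemset occurs in the transaction,
    either directly or as the parent of a contained item."""
    extended = set(transaction) | {taxonomy[i] for i in transaction if i in taxonomy}
    return set(itemset) <= extended


def count(itemset, transactions, taxonomy):
    """Number of transactions that match the itemset."""
    return sum(matches(itemset, t, taxonomy) for t in transactions)


def descendants(itemset, taxonomy):
    """Itemsets obtained by replacing one or more items with a child."""
    options = [
        [item] + [c for c, p in taxonomy.items() if p == item]
        for item in sorted(itemset)
    ]
    result = {
        frozenset(combo)
        for combo in product(*options)
        if len(set(combo)) == len(itemset)
    }
    result.discard(frozenset(itemset))
    return result


def uncovered_support(itemset, transactions, taxonomy, min_count):
    """Support of itemset on data not covered by any frequent descendant."""
    frequent = [
        d for d in descendants(itemset, taxonomy)
        if count(d, transactions, taxonomy) >= min_count
    ]
    uncovered = [
        t for t in transactions
        if not any(matches(d, t, taxonomy) for d in frequent)
    ]
    return count(itemset, uncovered, taxonomy)


plain = count({"fruit"}, TRANSACTIONS, TAXONOMY)
residual = uncovered_support({"fruit"}, TRANSACTIONS, TAXONOMY, min_count=3)
print(plain, residual)
```

In this toy data, `{"fruit"}` matches every transaction, but the frequent descendant `{"apple"}` already covers three of them, so only the pear and kiwi transactions contribute to the residual count.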

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Data mining by means of generalized patterns

Author: Cagliero Luca
Publication date: 01/01/2012

The thesis focuses mainly on the study and application of pattern discovery algorithms that aggregate database knowledge to discover and exploit valuable correlations, hidden in the analyzed data, at different abstraction levels. The aim of the research effort described in this work is two-fold: the discovery of associations, in the form of generalized patterns, from large data collections and the inference of semantic models, i.e., taxonomies and ontologies, suitable for driving the mining process.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

ADBSCAN: Adaptive Density-Based Spatial Clustering of Applications with Noise for Identifying Clusters with Varying Densities

Author: Arif Rezoana Bente
Khan Mohammad Mahmudur Rahman
Oishe Mahjabin Rahman
Siddique Md. Abu Bakr
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 29/11/2018

Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm that performs well on datasets whose clusters have a constant density of data points. One of the significant attributes of this algorithm is noise cancellation. However, DBSCAN performs worse on clusters with different densities. Therefore, in this paper, an adaptive DBSCAN is proposed that works significantly better at identifying clusters with varying densities. Comment: To be published in the 4th IEEE International Conference on Electrical Engineering and Information & Communication Technology (iCEEiCT 2018)
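
To make the limitation named in this abstract concrete, here is a minimal sketch of standard DBSCAN, not the adaptive variant the paper proposes (the abstract does not describe it). The sample points and the `eps` and `min_pts` values are assumptions chosen for illustration.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks unclustered points."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing
    return labels


# A dense group near the origin and a sparse group near (10, 10).
POINTS = [
    (0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
]

print(dbscan(POINTS, eps=0.5, min_pts=3))
```

With a single `eps` tuned to the dense group, the sparse group falls below the density threshold and is labeled as noise, which is the kind of behavior the abstract attributes to clusters of different densities.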

arXiv.org e-Print Archive

Crossref

On the Representation and Use of Semantic Categories: A Survey and Prospectus

Author: Schatz Bruce R.
Publication venue: MIT Artificial Intelligence Laboratory
Publication date: 01/01/1976

This report describes research conducted at the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. Support for the Laboratory's artificial intelligence research is provided in part by the Advanced Research Projects Agency of the Department of Defense under Office of Naval Research contract number N00014-75-C-0643. This paper is intended as a brief introduction to several issues concerning semantic categories. These are the everyday, factual groupings of world knowledge according to some similarity in characteristics. Some psychological data concerning the structure, formation, and use of categories are surveyed. Then several psychological models (set-theoretic and network) are considered. Various artificial intelligence representations (concerning the symbol mapping and recognition problems) dealing with similar issues are also reviewed. It is argued that these data and representations approach semantic categories at too abstract a level, and a set of guidelines which may be helpful in constructing a microworld is given.

CiteSeerX

DSpace@MIT

Exploring Data Hierarchies to Discover Knowledge in Different Domains

Author: Ricupero Giuseppe
Publication venue: Politecnico di Torino

The abstract is in the attachment.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Machine learning: techniques and foundations

Author: Carbonell Jaime G.
Langley Pat
Publication venue: eScholarship, University of California
Publication date: 30/03/1987

The field of machine learning studies computational methods for acquiring new knowledge, new skills, and new ways to organize existing knowledge. In this paper we present some of the basic techniques and principles that underlie AI research on learning, including methods for learning from examples, learning in problem solving, learning by analogy, grammar acquisition, and machine discovery. In each case, we illustrate the techniques with paradigmatic examples.

eScholarship - University of California

Making RBAC Work in Dynamic, Fast-Changing Corporate Environments

Author: Dimov Ruslan Y
Publication venue: Dartmouth Digital Commons
Publication date: 02/06/2008

In large organizations with tens of thousands of employees, managing individual people's permissions is tedious and error prone, and thus a possible source of security risks. Role-Based Access Control (RBAC) addresses this problem by grouping users into roles, which reflect job functions in the corporation. Permissions are assigned to roles instead of directly to users, which means that all users assigned to a role have the same set of permissions with respect to that role. However, adoption of RBAC in organizations such as investment banks is hindered by two main factors. First, it is costly and time-consuming to define roles. Second, there are certain job functions (such as consultant) that cannot be expressed as RBAC roles, because their users need to have different permission sets. This thesis investigates whether roles can be applied to domains that exhibit the peculiarities of the investment bank example. We introduce a new framework for roles that allows us to separately represent what the role means as a job function and what permissions its individual users have. That way we maintain the key property of RBAC, that the number of roles is small, while allowing for variations among users. We have also investigated machine learning approaches to determine whether roles are concepts that can be learned or approximated by a function. We present our findings that certain learning schemes, such as Probably Approximately Correct (PAC) learning and instance-based learning, are not applicable to roles, while others, such as decision-tree learning, might be useful.
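
The basic RBAC mapping described in this abstract (permissions attached to roles, users attached to roles) can be sketched as follows. This shows plain RBAC only, not the thesis's extended role framework, and the role names, user names, and permission strings are all hypothetical.

```python
# Hypothetical role -> permissions table.
ROLE_PERMISSIONS = {
    "trader": {"view_positions", "place_order"},
    "auditor": {"view_positions", "view_audit_log"},
}

# Hypothetical user -> roles table.
USER_ROLES = {
    "alice": {"trader"},
    "bob": {"trader", "auditor"},
}


def permissions_for(user):
    """Union of the permissions of every role assigned to the user."""
    perms = set()
    for role in USER_ROLES.get(user, set()):
        perms |= ROLE_PERMISSIONS.get(role, set())
    return perms


def is_allowed(user, permission):
    """True if any of the user's roles grants the permission."""
    return permission in permissions_for(user)


print(is_allowed("alice", "place_order"))
print(is_allowed("alice", "view_audit_log"))
```

Because permissions hang off roles rather than users, every user holding `trader` gets exactly the same permissions from that role, which is the property the abstract says makes job functions like consultant hard to express.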

Dartmouth Digital Commons (Dartmouth College)

Relational clustering models for knowledge discovery and recommender systems

Author: Li Tao

Cluster analysis is a fundamental research field in Knowledge Discovery and Data Mining (KDD). It aims at partitioning a given dataset into some homogeneous clusters so as to reflect the natural hidden data structure. Various heuristic or statistical approaches have been developed for analyzing propositional datasets. Nevertheless, in relational clustering the existence of multi-type relationships will greatly degrade the performance of traditional clustering algorithms. This issue motivates us to find more effective algorithms to conduct the cluster analysis upon relational datasets. In this thesis we comprehensively study the idea of Representative Objects for approximating data distribution and then design a multi-phase clustering framework for analyzing relational datasets with high effectiveness and efficiency. The second task considered in this thesis is to provide some better data models for people as well as machines to browse and navigate a dataset. The hierarchical taxonomy is widely used for this purpose. Compared with manually created taxonomies, automatically derived ones are more appealing because of their low creation/maintenance cost and high scalability. Up to now, the taxonomy generation techniques are mainly used to organize document corpus. We investigate the possibility of utilizing them upon relational datasets and then propose some algorithmic improvements. Another non-trivial problem is how to assign suitable labels for the taxonomic nodes so as to credibly summarize the content of each node. Unfortunately, this field has not been investigated sufficiently to the best of our knowledge, and so we attempt to fill the gap by proposing some novel approaches. The final goal of our cluster analysis and taxonomy generation techniques is to improve the scalability of recommender systems that are developed to tackle the problem of information overload. 
Recent research in recommender systems exploits domain knowledge to improve recommendation quality, which, however, reduces the scalability of the whole system. We address this issue by applying the automatically derived taxonomy to preserve the pair-wise similarities between items, and then modeling the user visits by another hierarchical structure. Experimental results show that the computational complexity of the recommendation procedure can be greatly reduced, thereby improving system scalability.

Warwick Research Archives Portal Repository