46 research outputs found

    The Design and Implementation of Collaborative Filtering in Data Mining

    Data mining is the process of discovering explicit knowledge from large amounts of data stored in databases, data warehouses, or other repositories. Many data mining models have been studied, such as association rules and sequential patterns. Collaborative filtering is one such model. In this paper, we propose two approaches to the mining process of collaborative filtering. Finally, collaborative filtering mining is applied to a knowledge management system.

    Data mining techniques application for prediction in OLAP cube

    Data warehouses are collections of data organized to support decision making, and they provide an appropriate solution for managing large volumes of data. Online analytical processing (OLAP) is a technology that complements data warehouses, making data usable and understandable by providing tools for visualization, exploration, and navigation of data cubes. Data mining, on the other hand, extracts knowledge from data using different methods of description, classification, explanation, and prediction. In this work, we propose new ways to improve existing approaches to decision support. Continuing earlier work on coupling online analysis with data mining to integrate prediction into OLAP, we propose an approach based on machine learning with clustering that partitions an initial data cube into dense sub-cubes, which can serve as a learning set for building a prediction model. Regression trees are then applied to each sub-cube to predict the value of a cell.

    Fast Cartography for Data Explorers

    Análise exploratória de hierarquias em base de dados multidimensionais

    Increasingly, companies and organizations use multidimensional databases and OLAP tools to organize information from transactional systems, with the aim of accessing and analyzing data with a high level of flexibility and performance. In the OLAP data model, information is conceptually organized into cubes. Each dimension of a cube has an associated hierarchy, which allows data to be analyzed at different levels of aggregation. We present a methodology that exploits the different aggregation levels of the hierarchies for exploratory analysis as well as for forecasts over different time horizons. This methodology proved to be very efficient, giving better results than the usual forecasting techniques. The forecasting results are promising and consistent with the corresponding exploratory analysis.

    Model checking for data anomaly detection

    Data typically evolve according to specific processes, which makes it possible to identify a profile of evolution: the values the data may assume, the frequency at which they change, their temporal variation in relation to other data, or other constraints directly connected to the reference domain. A violation of these conditions could signal various threats to the system, such as a tampering attempt or cyber attack, a failure in system operation, or a bug in the applications that manage the data life cycle. Detecting such violations is not straightforward, as the processes may be unknown or hard to extract. In this paper we propose an approach to detecting data anomalies. We represent data user behaviours as labelled transition systems and, using model checking techniques, demonstrate that the proposed modeling can be exploited to detect data anomalies successfully.

    Text Assisted Insight Ranking Using Context-Aware Memory Network

    Extracting valuable facts or informative summaries from multi-dimensional tables, i.e. insight mining, is an important task in data analysis and business intelligence. However, ranking the importance of insights remains a challenging and unexplored task. The main challenge is that explicitly scoring or ranking an insight requires a thorough understanding of the tables and considerable manual effort, which leads to a lack of available training data for the insight ranking problem. In this paper, we propose an insight ranking model that consists of two parts: a neural ranking model that explores data characteristics, such as header semantics and statistical features of the data, and a memory network model that introduces table structure and context information into the ranking process. We also build a dataset with text assistance. Experimental results show that our approach substantially improves ranking precision across multiple evaluation metrics. Comment: Accepted to AAAI 201