6 research outputs found

    Privacy-Preserving OLAP-based monitoring of data streams: The PP-OMDS approach

    In this paper, we propose PP-OMDS (Privacy-Preserving OLAP-based Monitoring of Data Streams), an innovative framework for supporting the OLAP-based monitoring of data streams, which is relevant for a plethora of application scenarios (e.g., security, emergency management, and so forth), in a privacy-preserving manner. The paper describes motivations, principles and achievements of the PP-OMDS framework, along with technological advancements and innovations. We also incorporate a detailed comparative analysis with competitive frameworks, along with a trade-off analysis

    Parallel and distributed clustering framework for big spatial data mining

    Clustering techniques are very attractive for identifying and extracting patterns of interest from datasets. However, their application to very large spatial datasets presents numerous challenges, such as high dimensionality, heterogeneity, and the high complexity of some algorithms. Distributed clustering techniques constitute a very good response to the Big Data challenges (e.g., Volume, Variety, Veracity, and Velocity). In this paper, we developed and implemented a Dynamic Parallel and Distributed clustering (DPDC) approach that can analyse Big Data within a reasonable response time and produce accurate results by using existing computing and storage infrastructure, such as cloud computing. The DPDC approach consists of two phases. The first phase is fully parallel and generates local clusters; the second phase aggregates the local results to obtain global clusters. The aggregation phase is designed in such a way that the final clusters are compact and accurate while the overall process is efficient in time and memory allocation. DPDC was thoroughly tested and compared to the well-known clustering algorithms BIRCH and CURE. The results show that the approach not only produces high-quality results but also scales up very well by taking advantage of the Hadoop MapReduce paradigm or any distributed system.
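
The abstract above describes the two phases only at a high level, so the following Python sketch illustrates the general local-then-aggregate pattern and is not the authors' DPDC algorithm. The greedy "leader" clustering used locally, the centroid-distance merge rule, the `radius` parameter, and all function names are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist


def local_clusters(points, radius):
    """Phase 1 (per partition): greedy leader clustering (an assumed stand-in).

    A point joins the first cluster whose centroid lies within `radius`;
    otherwise it starts a new cluster. Returns (centroid, size) summaries.
    """
    clusters = []  # each entry: [sum_x, sum_y, count]
    for x, y in points:
        for c in clusters:
            centroid = (c[0] / c[2], c[1] / c[2])
            if dist(centroid, (x, y)) <= radius:
                c[0] += x
                c[1] += y
                c[2] += 1
                break
        else:
            clusters.append([x, y, 1])
    return [((sx / n, sy / n), n) for sx, sy, n in clusters]


def aggregate(summaries, radius):
    """Phase 2: merge local summaries whose centroids are within `radius`."""
    merged = []  # each entry: [weighted_sum_x, weighted_sum_y, count]
    for (cx, cy), n in summaries:
        for m in merged:
            centroid = (m[0] / m[2], m[1] / m[2])
            if dist(centroid, (cx, cy)) <= radius:
                m[0] += cx * n
                m[1] += cy * n
                m[2] += n
                break
        else:
            merged.append([cx * n, cy * n, n])
    return [((sx / n, sy / n), n) for sx, sy, n in merged]


def two_phase_clustering(partitions, radius):
    # Phase 1 runs independently on every partition, so it can be parallel.
    with ThreadPoolExecutor() as pool:
        local = list(pool.map(lambda p: local_clusters(p, radius), partitions))
    # Phase 2 only sees the compact summaries, not the raw points.
    flat = [summary for part in local for summary in part]
    return aggregate(flat, radius)


# Two hypothetical partitions, each holding points from two spatial groups.
partitions = [
    [(0.0, 0.0), (0.2, 0.0), (10.0, 10.0)],
    [(0.0, 0.2), (10.0, 10.2), (10.2, 10.0)],
]
global_clusters = two_phase_clustering(partitions, radius=1.0)
```

Only the per-partition summaries reach the aggregation step, which is what keeps the second phase cheap in this sketch; a real system would replace the thread pool with distributed workers.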

    Next Generation of Product Search and Discovery

    Online shopping has become an important part of people’s daily life with the rapid development of e-commerce. In some domains, such as books, electronics, and CDs/DVDs, online shopping has surpassed or even replaced the traditional shopping method. Compared with traditional retailing, e-commerce is information intensive. One of the key factors for success in e-business is making it easy for consumers to discover a product. Conventionally, a product search engine based on keyword search or a category browser is provided to help users find the product information they need. The general goal of a product search system is to enable users to quickly locate information of interest and to minimize their effort in search and navigation. In this process, human factors play a significant role. Finding product information can be a tricky task and may require intelligent use of search engines and non-trivial navigation of multilayer categories. Searching for useful product information can be frustrating for many users, especially inexperienced ones. This dissertation focuses on developing a new visual product search system that effectively extracts the properties of unstructured products and presents the items most likely to attract users, so that they can quickly locate the ones they would be most interested in. We designed and developed a feature extraction algorithm that retains product color and local pattern features, and the experimental evaluation on the benchmark dataset demonstrated that it is robust against common geometric and photometric visual distortions. Furthermore, instead of ignoring product text information, we investigated and developed a ranking model learned via a unified probabilistic hypergraph that is capable of capturing correlations between product visual content and textual content.
Moreover, we proposed and designed a fuzzy hierarchical co-clustering algorithm for collaborative filtering product recommendation. With this method, users can be automatically grouped into different interest communities based on their behaviors. Customized recommendations can then be made according to these implicitly detected relations. In summary, the developed search system performs much better in visual unstructured product search than state-of-the-art approaches. With the comprehensive ranking scheme and the collaborative filtering recommendation module, the user’s overhead in locating valuable information is reduced, and the user’s experience of seeking useful product information is improved.

    Cross domain recommender systems using matrix and tensor factorization

    Today, the amount and importance of the data available on the internet are growing exponentially. These digital data have become a primary source of information, and people’s lives are tightly bound to them. The data come in diverse shapes and from various sources, and users rely on them in almost all of their personal and social activities. However, selecting a desirable option from a huge list of available options can be frustrating and time-consuming. Recommender systems aim to ease this process by finding the items that are most likely to interest users. Undoubtedly, no social media platform or online service can work properly without recommender systems. On the other hand, almost all available recommendation techniques suffer from some common issues: the data sparsity, cold-start, and new-user problems. This thesis tackles these problems using different methods. While most recommender methods rely on single-domain information, the main focus of this thesis is on using multi-domain information to create cross-domain recommender systems. A cross-domain recommender system not only handles cold-start and new-user situations much better, but also helps to combine features exposed in diverse domains and to capture a better understanding of users’ preferences, which means producing more accurate recommendations. In this thesis, a pre-clustering stage is also proposed to reduce data sparsity. Various cross-domain knowledge-based recommender systems are suggested to recommend items in two popular social media platforms, Twitter and LinkedIn, using different information available in both domains. The state-of-the-art techniques in this field, namely matrix factorization and tensor decomposition, are implemented to develop cross-domain recommender systems.
The presented recommender systems, based on coupled nonnegative matrix factorization and PARAFAC-style tensor decomposition, are evaluated using real-world datasets, and it is shown that they are superior to the baseline matrix factorization collaborative filtering. In addition, network analysis is performed on the data extracted from Twitter and LinkedIn.

    Exploring New Computing Paradigms for Data-Intensive Applications

    The abstract is in the attachment