6 research outputs found
Privacy-Preserving OLAP-based monitoring of data streams: The PP-OMDS approach
In this paper, we propose PP-OMDS (Privacy-Preserving OLAP-based Monitoring of Data Streams), an innovative framework for supporting the OLAP-based monitoring of data streams in a privacy-preserving manner, which is relevant for a plethora of application scenarios (e.g., security, emergency management, and so forth). The paper describes the motivations, principles and achievements of the PP-OMDS framework, along with technological advancements and innovations. We also incorporate a detailed comparative analysis with competing frameworks, along with a trade-off analysis.
Parallel and distributed clustering framework for big spatial data mining
Clustering techniques are very attractive for identifying and extracting patterns of interest from datasets. However, their application to very large spatial datasets presents numerous challenges, such as high dimensionality, heterogeneity, and the high complexity of some algorithms. Distributed clustering techniques are a very good way to address Big Data challenges (e.g., Volume, Variety, Veracity, and Velocity). In this paper, we developed and implemented a Dynamic Parallel and Distributed clustering (DPDC) approach that can analyse Big Data within a reasonable response time and produce accurate results by using existing computing and storage infrastructure, such as cloud computing. The DPDC approach consists of two phases. The first phase is fully parallel and generates local clusters; the second phase aggregates the local results to obtain global clusters. The aggregation phase is designed so that the final clusters are compact and accurate while the overall process is efficient in time and memory allocation. DPDC was thoroughly tested and compared to the well-known clustering algorithms BIRCH and CURE. The results show that the approach not only produces high-quality results but also scales up very well by taking advantage of the Hadoop MapReduce paradigm or any distributed system.
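The two-phase structure described in this abstract (parallel local clustering, then aggregation into global clusters) can be illustrated with a minimal Python sketch. This is not the DPDC algorithm itself: the abstract does not specify the local clustering method or the aggregation rule, so the sketch assumes a simple greedy "leader" clustering in each partition and a centroid-distance merge in the aggregation step. The function names, the `radius` parameter, and the use of a thread pool in place of a distributed system are all hypothetical choices made for illustration.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def _greedy_merge(weighted_points, radius):
    """Greedy leader clustering over (point, weight) pairs.

    Each item joins the first cluster whose current centroid lies within
    `radius`; otherwise it starts a new cluster. Assumed rule, not DPDC's.
    """
    clusters = []  # each entry: [weighted_sum_x, weighted_sum_y, total_weight]
    for (x, y), w in weighted_points:
        for c in clusters:
            cx, cy = c[0] / c[2], c[1] / c[2]
            if math.hypot(x - cx, y - cy) <= radius:
                c[0] += x * w
                c[1] += y * w
                c[2] += w
                break
        else:
            clusters.append([x * w, y * w, w])
    # Summarise every cluster as (centroid, number_of_points).
    return [((sx / n, sy / n), n) for sx, sy, n in clusters]


def local_clusters(points, radius):
    """Phase 1: cluster one partition independently of the others."""
    return _greedy_merge([(p, 1) for p in points], radius)


def aggregate(summaries, radius):
    """Phase 2: merge local cluster summaries into global clusters."""
    return _greedy_merge(summaries, radius)


def two_phase_clustering(points, n_partitions, radius):
    """Split the data, cluster partitions in parallel, then aggregate."""
    partitions = [points[i::n_partitions] for i in range(n_partitions)]
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        per_partition = list(pool.map(lambda p: local_clusters(p, radius), partitions))
    summaries = [s for part in per_partition for s in part]
    return aggregate(summaries, radius)


# Toy data: two well-separated square blobs of 2-D points.
rng = random.Random(0)
blob_a = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(100)]
blob_b = [(10 + rng.uniform(-1, 1), 10 + rng.uniform(-1, 1)) for _ in range(100)]
global_clusters = two_phase_clustering(blob_a + blob_b, n_partitions=4, radius=3.0)
for centroid, size in global_clusters:
    print(centroid, size)
```

Only the small cluster summaries (centroid and size) cross partition boundaries, which is the property that makes this kind of design a natural fit for MapReduce-style systems: phase 1 corresponds to the map side and phase 2 to the reduce side.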
Next Generation of Product Search and Discovery
Online shopping has become an important part of people's daily life with the rapid development of e-commerce. In some domains such as books, electronics, and CD/DVDs, online shopping has surpassed or even replaced the traditional shopping method. Compared with traditional retailing, e-commerce is information intensive. One of the key factors for success in e-business is how to facilitate consumers' approaches to discovering a product. Conventionally, a product search engine based on a keyword search or category browser is provided to help users find the product information they need. The general goal of a product search system is to enable users to quickly locate information of interest and to minimize users' efforts in search and navigation. In this process, human factors play a significant role. Finding product information can be a tricky task and may require an intelligent use of search engines and a non-trivial navigation of multilayer categories. Searching for useful product information can be frustrating for many users, especially inexperienced ones.
This dissertation focuses on developing a new visual product search system that effectively extracts the properties of unstructured products and presents the possible items of attraction to users so that they can quickly locate the ones they would most likely be interested in. We designed and developed a feature extraction algorithm that retains product color and local pattern features, and the experimental evaluation on the benchmark dataset demonstrated that it is robust against common geometric and photometric visual distortions. In addition, instead of ignoring product text information, we investigated and developed a ranking model learned via a unified probabilistic hypergraph that is capable of capturing correlations among product visual content and textual content. Moreover, we proposed and designed a fuzzy hierarchical co-clustering algorithm for collaborative filtering product recommendation. Via this method, users can be automatically grouped into different interest communities based on their behaviors. A customized recommendation can then be performed according to these implicitly detected relations. In summary, the developed search system performs much better in visual unstructured product search than state-of-the-art approaches. With the comprehensive ranking scheme and the collaborative filtering recommendation module, the user's overhead in locating the information of value is reduced, and the user's experience of seeking useful product information is optimized.
Cross domain recommender systems using matrix and tensor factorization
Today, the amount and importance of available data on the internet are growing exponentially. These digital data have become a primary source of information, and people's lives are tightly bound to them. The data come in diverse shapes and from various resources, and users utilize them in almost all their personal or social activities. However, selecting a desirable option from the huge list of available options can be really frustrating and time-consuming. Recommender systems aim to ease this process by finding the items that are most likely to be of interest to users. Undoubtedly, there is not a single social media platform or online service that can continue to work properly without using recommender systems. On the other hand, almost all available recommendation techniques suffer from some common issues: the data sparsity, the cold-start, and the new-user problems.
This thesis tackles the mentioned problems using different methods. While most recommender methods rely on single-domain information, the main focus of this thesis is on using multi-domain information to create cross-domain recommender systems. A cross-domain recommender system not only handles the cold-start and new-user situations much better, but also helps to combine different features exposed in diverse domains and to capture a better understanding of the users' preferences, which means producing more accurate recommendations.
In this thesis, a pre-clustering stage is also proposed to reduce the data sparsity. Various cross-domain knowledge-based recommender systems are suggested to recommend items in two popular social media platforms, Twitter and LinkedIn, by using different information available in both domains. The state-of-the-art techniques in this field, namely matrix factorization and tensor decomposition, are implemented to develop cross-domain recommender systems. The presented recommender systems based on coupled nonnegative matrix factorization and PARAFAC-style tensor decomposition are evaluated using real-world datasets, and it is shown that they are superior to the baseline matrix factorization collaborative filtering. In addition, network analysis is performed on the data extracted from Twitter and LinkedIn.
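To make the idea of coupled nonnegative matrix factorization more concrete, here is a minimal pure-Python sketch. It is an assumption-laden illustration, not the thesis's implementation: it supposes two nonnegative user-by-item matrices `x1` and `x2` (one per domain) over the same users, factorizes them as `x1 ≈ w·h1` and `x2 ≈ w·h2` with a shared user factor `w`, and uses standard Lee-Seung style multiplicative updates for the squared error. All function names, the `rank`, `iters`, and `eps` parameters, and the toy matrices are hypothetical.

```python
import random


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def frob_error(x, w, h):
    """Squared Frobenius norm of x - w*h."""
    approx = matmul(w, h)
    return sum((xi - ai) ** 2 for xr, ar in zip(x, approx) for xi, ai in zip(xr, ar))


def coupled_nmf(x1, x2, rank, iters=200, seed=0, eps=1e-9):
    """Factorize x1 ~ w*h1 and x2 ~ w*h2 with one shared nonnegative w."""
    rng = random.Random(seed)

    def rand(rows, cols):
        return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]

    w = rand(len(x1), rank)
    h1 = rand(rank, len(x1[0]))
    h2 = rand(rank, len(x2[0]))
    for _ in range(iters):
        # Update each domain-specific factor while the shared factor is fixed.
        wt = transpose(w)
        wtw = matmul(wt, w)
        for x, h in ((x1, h1), (x2, h2)):
            num, den = matmul(wt, x), matmul(wtw, h)
            for r in range(rank):
                for j in range(len(h[0])):
                    h[r][j] *= num[r][j] / (den[r][j] + eps)
        # Update the shared factor using evidence from both domains.
        num = add(matmul(x1, transpose(h1)), matmul(x2, transpose(h2)))
        den = matmul(w, add(matmul(h1, transpose(h1)), matmul(h2, transpose(h2))))
        for i in range(len(w)):
            for r in range(rank):
                w[i][r] *= num[i][r] / (den[i][r] + eps)
    return w, h1, h2


# Toy example: 4 users, 3 items in one domain and 2 items in another.
x1 = [[5, 3, 0], [4, 0, 1], [1, 1, 5], [0, 1, 4]]
x2 = [[1, 0], [1, 0], [0, 1], [0, 1]]
w, h1, h2 = coupled_nmf(x1, x2, rank=2)
print(frob_error(x1, w, h1) + frob_error(x2, w, h2))
```

Because `w` is shared, a user with few interactions in one domain still receives a latent representation informed by the other domain, which is the intuition behind using cross-domain information against the cold-start and sparsity problems.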
Exploring New Computing Paradigms for Data-Intensive Applications
The abstract is in the attachment.
An Evaluation of Computational Methods to Support the Clinical Management of Chronic Disease Populations
Innovative primary care models that deliver comprehensive primary care to address medical and social needs are an established means of improving health outcomes and reducing healthcare costs among persons living with chronic disease. Care management is one such approach that requires providers to monitor their respective patient panels and intervene on patients requiring care. Health information technology (IT) has been established as a critical component of care management and similar care models. While there exist a plethora of health IT systems for facilitating primary care, there is limited research on their ability to support care management and its emphasis on monitoring panels of patients with complex needs. In this dissertation, I advance the understanding of how computational methods can better support clinicians delivering care management, and use the management of human immunodeficiency virus (HIV) as an example scenario of use.
The research described herein is segmented into 3 aims. The first was to understand the processes and barriers associated with care management and assess whether existing IT can support clinicians in this domain. The second and third aims focused on informing potential solutions to the technological shortcomings identified in the first aim. In the studies of the first aim, I conducted interviews and observations in two HIV primary care programs and analyzed the resulting data to create a conceptual framework of population monitoring and identify challenges faced by clinicians in delivering care management. In the studies of the second aim, I used computational methods to advance the science of extracting social and behavioral determinants of health (SBDH) from the patient record; these determinants are not easily accessible to clinicians and represent an important barrier to care management. In the third aim, I conducted a controlled experimental evaluation to assess whether data visualization can improve clinicians' ability to maintain awareness of their patient panels.