25 research outputs found

    MANET Mining: Mining Association Rules

    Knowledge extraction from courses and online learning activities

    Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics. Technological advancement has led to the increasing use of all types of electronic devices, which causes large volumes of data to be constantly generated and stored in repositories. This growth in data through Information Technology (IT) systems makes it necessary to keep exploring and analysing it to support institutions in their decision-making. Because of the importance of education in society, this field has been the subject of several studies over the years. Given that, and since association rules and regression analysis are among the most popular data mining techniques for finding hidden patterns in data, the purpose of this work is to find interesting trends across courses based on the students’ grades, and to study whether, and to what extent, students’ learning performance is related to their interaction in Moodle. The data were collected through the netp@ and Moodle systems and consist of all student learning data and activity/log history. The data belong to students of all master's programmes enrolled in the academic years from 2012-2013 to 2020-2021. We chose the Sample, Explore, Modify, Model, and Assess (SEMMA) methodology because its steps fit the study’s goals. The Partial Least Squares Regression (PLSR) algorithm showed that Gestão do Conhecimento, Metodologias de Investigação and Métodos Descritivos de Data Mining are the courses that most affect the grades of the Dissertation/Work Project/Internship Report in the Business Intelligence specialization. In addition, according to the predictive model, Metodologias de Investigação was the most important variable for predicting performance in the Dissertation/Work Project/Internship Report of the Information Systems and Technologies Management specialization.
Finally, the association rule algorithms used were Apriori, FP-Growth and Eclat. Their results show that courses with continuous assessment methods achieve better academic performance than the others. Furthermore, higher levels of online interaction are associated with better achievement.
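
As a rough illustration of the association rule mining mentioned in this abstract, the sketch below runs an Apriori-style level-wise search over a handful of made-up transactions and then derives rules from the frequent itemsets. The items `A` to `D`, the thresholds, and the function names `frequent_itemsets` and `association_rules` are assumptions chosen for the example; none of them comes from the dissertation's data or code.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Apriori-style level-wise search; returns {frozenset: support}."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    # Level 1 candidates: every single item that occurs anywhere.
    candidates = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep candidates whose support (fraction of transactions) is high enough.
        level = {}
        for c in candidates:
            support = sum(1 for t in sets if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and prune any
        # candidate with an infrequent k-subset (the Apriori property).
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent itemset is itself frequent,
                # so its support is always available in the dictionary.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append(
                        (antecedent, itemset - antecedent, support, confidence)
                    )
    return rules


# Hypothetical toy transactions, for illustration only.
transactions = [{"A", "B"}, {"A", "B", "C"}, {"C", "D"}, {"A", "C"}, {"B", "D"}]
frequent = frequent_itemsets(transactions, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.6)
for antecedent, consequent, support, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), support, round(confidence, 2))
```

FP-Growth and Eclat search for the same frequent itemsets with different data structures (a prefix tree and vertical transaction-ID lists, respectively), so the rule-generation step would stay the same.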

    On Privacy-Enhanced Distributed Analytics in Online Social Networks

    More than half of the world's population benefits from online social network (OSN) services. Many of these services rely on applying analytics to user data to infer users' preferences and enrich their experience accordingly. At the same time, service providers monetize user data to run their business models. Providers therefore tend to collect (personal) data about users extensively. However, this data is often used for various purposes without the users' informed consent. Providers share this data in different forms with third parties (e.g., data brokers). Moreover, sensitive user data has repeatedly been subject to unauthorized access by malicious parties. These issues have demonstrated the insufficient commitment of providers to user privacy and have consequently raised users' concerns. Despite the emergence of privacy regulations (e.g., GDPR and CCPA), recent studies show that the collection of personal user data and the sharing of sensitive data continue to increase. A number of privacy-friendly OSNs have been proposed to enhance user privacy by reducing the need for central service providers. However, this improvement in privacy protection usually comes at the cost of losing social connectivity and many of the analytics-based services of the widespread OSNs. This dissertation addresses this issue by first proposing an approach to privacy-friendly OSNs that maintains established social connections. Second, approaches that allow users to collaboratively apply distributed analytics while preserving their privacy are presented. Finally, the dissertation contributes to better assessment and mitigation of the risks associated with distributed analytics. These three research directions are treated through the following six contributions. Conceptualizing Hybrid Online Social Networks: We conceptualize a hybrid approach to privacy-friendly OSNs, HOSN. This approach combines the benefits of using COSNs and DOSNs.
Users can maintain their social experience in their preferred COSN while being provided with additional means to enhance their privacy. Users can seamlessly post public content or private content that is accessible only to authorized users (friends), beyond the reach of the service providers. Improving the Trustworthiness of HOSNs: We conceptualize software features to address users' privacy concerns in OSNs. We prototype these features in our HOSN approach and evaluate their impact on the privacy concerns and the trustworthiness of the approach. We also analyze the relationships between four important aspects that influence users' behavior in OSNs: privacy concerns, trust beliefs, risk beliefs, and the willingness to use. Privacy-Enhanced Association Rule Mining: We present an approach that enables users to efficiently apply privacy-enhanced association rule mining to distributed data. This approach can be employed in DOSNs and HOSNs to generate recommendations. We leverage a privacy-enhanced distributed graph sampling method to reduce the data required for the mining and to lower the communication and computational overhead. Then, we apply a distributed frequent itemset mining algorithm in a privacy-friendly manner. Privacy Enhancements on Federated Learning (FL): We identify several privacy-related issues in the emerging distributed machine learning technique, FL. These issues are mainly due to the centralized nature of this technique. We discuss tackling these issues by applying FL in a hierarchical architecture. The benefits of this approach include a reduction in the centralization of control and the ability to place defense and verification methods more flexibly and efficiently within the hierarchy. Systematic Analysis of Threats in Federated Learning: We conduct a critical study of the existing attacks in FL to better understand the actual risk of these attacks under real-world scenarios.
First, we structure the literature in this field and show the research foci and gaps. Then, we highlight a number of issues in (1) the assumptions commonly made by researchers and (2) the evaluation practices. Finally, we discuss the implications of these issues for the applicability of the proposed attacks and recommend several remedies. Label Leakage from Gradients: We identify a risk of information leakage when sharing gradients in FL. We demonstrate the severity of this risk by proposing a novel attack that extracts the user annotations that describe the data (i.e., ground-truth labels) from gradients. We show the high effectiveness of the attack under different settings, such as different datasets and model architectures. We also test several defense mechanisms to mitigate this attack and identify the effective ones.
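
To make the idea of label leakage from gradients concrete, here is a minimal sketch of the simplest textbook case, which is not the attack proposed in the dissertation. It assumes a single training sample, a linear softmax classifier, and cross-entropy loss. Under those assumptions the gradient with respect to the output bias equals `softmax(logits) - one_hot(label)`, so its only negative entry sits at the true class. The model sizes, the random weights, and the helper names are all hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bias_gradient(weights, bias, x, label):
    """Gradient of the cross-entropy loss w.r.t. the output bias, one sample."""
    logits = [
        sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    # d(loss)/d(bias_k) = p_k - 1 for the true class, p_k otherwise.
    return [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]


def infer_label(grad_bias):
    """Guess the label as the index of the most negative bias-gradient entry."""
    return min(range(len(grad_bias)), key=grad_bias.__getitem__)


# Hypothetical toy model: 4 classes, 3 input features, small random weights.
random.seed(0)
num_classes, num_features = 4, 3
weights = [
    [random.uniform(-1, 1) for _ in range(num_features)]
    for _ in range(num_classes)
]
bias = [0.0] * num_classes
x = [random.uniform(-1, 1) for _ in range(num_features)]

# An observer who sees only the shared gradient recovers each label.
recovered = [
    infer_label(bias_gradient(weights, bias, x, label))
    for label in range(num_classes)
]
print(recovered)
```

With batches of several samples the per-sample signs are averaged together, so this single-sample trick no longer applies directly; that harder setting needs more elaborate methods than the one sketched here.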

    Mobile Ad-Hoc Networks

    Being infrastructure-less and without central administrative control, wireless ad-hoc networking is playing an increasingly important role in extending the coverage of traditional wireless infrastructure (cellular networks, wireless LAN, etc.). This book includes state-of-the-art techniques and solutions for wireless ad-hoc networks. It focuses on the following topics in ad-hoc networks: vehicular ad-hoc networks, security and caching, TCP in ad-hoc networks, and emerging applications. It aims to provide network engineers and researchers with design guidelines for large-scale wireless ad-hoc networks.

    Efficient Learning Machines

    Computer scienc

    Cyber Security

    This open access book constitutes the refereed proceedings of the 16th International Annual Conference on Cyber Security, CNCERT 2020, held in Beijing, China, in August 2020. The 17 papers presented were carefully reviewed and selected from 58 submissions. The papers are organized according to the following topical sections: access control; cryptography; denial-of-service attacks; hardware security implementation; intrusion/anomaly detection and malware mitigation; social network security and privacy; systems security

    Geographic Feature Mining: Framework and Fundamental Tasks for Geographic Knowledge Discovery from User-generated Data

    We live in a data-rich environment where massive amounts of data such as text messages, articles, images, and search queries are continuously generated by users. In this environment, new opportunities to discover and utilize knowledge about the real world arise, such as the extraction and description of places and events from social media records, the organization of documents by spatio-temporal topics, and the prediction of epidemics from search engine queries. Major challenges addressed in these data- and application-specific works arise from the unstructured and complex nature of the data, and from the high level of uncertainty and sparsity of the attributes. Despite the evident progress in utilizing specific data sources for different applications, there remains a lack of common concepts and techniques for exploiting the data as high-quality sensors of geographic space in a general manner. Such a general point of view, however, makes it possible to address the common challenges and to define fundamental building blocks for problems in fields like information retrieval, recommender systems, market research, health surveillance, and the social sciences. In this thesis, we develop concepts and techniques to utilize various kinds of user-generated data as a steady source of information about geographic processes and entities (together called geographic phenomena). For this, we introduce a novel conceptual data mining framework, called geographic feature mining, that provides the foundation to discover and extract highly informative and discriminative dimensions of geographic space in a unifying and systematic fashion. This is achieved by representing the qualitative and geographic information in the records as geographic feature signals, each constituting a potential dimension to describe geographic space.
The mining process then determines highly informative features or feature combinations from the candidate sets that can be used as a steady source of auxiliary information for domain-specific applications. In developing the framework, we make contributions to several fundamental problems: (1) We introduce a novel probabilistic model to extract high-quality geographic feature signals. The signals are robust to noise and background distributions, and the model can exploit diverse kinds of qualitative and geographic information in the records. This flexibility is achieved by utilizing a Bayesian network model, and the robustness by choosing appropriate prior distributions. (2) We address the problem of categorizing and selecting geographic features based on their spatio-temporal type, such as feature signals having landmark, regional, or global semantics. For this, we introduce representations of the signals by interaction characteristics and evaluate their performance in clustering and data summarization tasks. (3) To extract a small number of highly informative feature combinations that reflect geographic phenomena, we introduce a model that extracts latent geographic features from the candidate signals using dimensionality reduction. We show that this model outperforms document-centric topic models with respect to the informativeness of the extracted phenomena, and we exhaustively evaluate how different statistical properties of the approaches affect the characteristics of the resulting feature combinations.

    Community-driven & Work-integrated Creation, Use and Evolution of Ontological Knowledge Structures

    Proceedings of the 6th Dutch-Belgian Information Retrieval Workshop
