
    Privacy-preserving distributed data mining

    This thesis is concerned with privacy-preserving distributed data mining algorithms. The main challenges in this setting are inference attacks and the formation of collusion groups. The inference problem is the reconstruction of sensitive data by attackers from non-sensitive sources, such as intermediate results, exchanged messages, or public information. Moreover, in a distributed scenario, malicious insiders can organize collusion groups to deploy more effective inference attacks. This thesis shows that existing privacy measures do not adequately protect privacy against inference and collusion. Therefore, new measures based on information theory are developed to overcome the identified limitations. Furthermore, a new distributed data clustering algorithm is presented. The clustering approach is based on an approximation of kernel density estimates that introduces a controlled amount of ambiguity into the density estimates and thereby protects the original data. The thesis also introduces the first privacy-preserving algorithms for frequent pattern discovery in distributed time series. Time series are transformed into a set of n-dimensional data points, and finding frequent patterns is reduced to finding local maxima in the n-dimensional density space. The proposed algorithms are linear in the size of the dataset and have low communication costs, as validated by experimental evaluation on different datasets.
This thesis deals with privacy-preserving data mining in distributed environments, focusing on selected N-agent attack scenarios for the inference problem in data clustering and time series analysis. These are attacks by individual agents or subgroups of agents within a distributed data mining group, or by a single agent outside this group. First, the thesis introduces two new privacy measures which, unlike existing ones, satisfy the privacy-preservation properties generally required in distributed data mining, and in which the measured degree of privacy relates to the data analysis method used and the number of attackers. For privacy-preserving distributed data clustering, a new kernel-density-estimation-based method called KDECS is presented. KDECS uses an approximation of the original local kernel density estimate, so that the original data of other agents in the data mining group can no longer be reconstructed with a probability higher than a prescribed value. The method is provably more secure than data clustering with generative mixture models and SMC-based secure k-means clustering. In addition, we present new methods, named DPD-TS, DPD-HE, and DPD-FS, for privacy-preserving distributed pattern discovery in time series, whose complexity and security level we analyze using the new privacy measures mentioned above. The minimum security level of DPD-TS and DPD-FS, which each agent in a data mining group can specify, depends only on the dimensionality reduction of the time series values and their discretization, and can be easily verified. The DPD-HE method offers even better protection of sensitive data by means of homomorphic encryption. In addition to the theoretical analysis, the developed methods were evaluated experimentally on several publicly available datasets.
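
The reduction described in this abstract (time series turned into n-dimensional points, frequent patterns read off as density maxima) can be illustrated with a minimal sketch. It is a plain, centralized, non-private illustration and not the thesis's algorithms; the function names, the Gaussian kernel, and the simplification of returning only the single densest observed window (rather than all local maxima) are assumptions made here.

```python
import math


def sliding_windows(series, n):
    """Turn a time series into overlapping n-dimensional points."""
    return [tuple(series[i:i + n]) for i in range(len(series) - n + 1)]


def kde(point, samples, bandwidth):
    """Gaussian kernel density estimate at `point` (unnormalized)."""
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(point, s))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / len(samples)


def densest_window(series, n, bandwidth):
    """Return the observed window with the highest estimated density.

    Simplification (assumption): only the global maximum among observed
    windows is returned, as a stand-in for a full local-maxima search.
    """
    windows = sliding_windows(series, n)
    return max(set(windows), key=lambda w: kde(w, windows, bandwidth))
```

For the series `[0, 1, 0, 1, 0, 1, 5, 9, 0, 1]` with `n = 2`, the window `(0, 1)` occurs most often, so it is the one that `densest_window` selects for a small bandwidth such as `0.5`.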

    Security Supports for Cyber-Physical System and its Communication Networks

    A cyber-physical system (CPS) is a sensing and communication platform that features tight integration of computation, networking, and physical processes. In such a system, embedded computers and networks monitor and control the physical processes through a feedback loop, in which physical processes affect computations and vice versa. In recent years, CPS has attracted much attention in many areas of research, such as security and privacy. In this dissertation, we focus on supporting security in CPS and its communication networks. First, we investigate the electric power system, which is an important CPS in modern society. As crucial and valuable infrastructure, the electric power system inevitably becomes the target of malicious users and attackers. In our work, we point out that the electric power system is vulnerable to potential cyber attacks, and we introduce a new type of attack model, in which an attack cannot be completely identified, even though its presence may be detected. To defend against such an attack, we present an efficient heuristic algorithm to narrow down the attack region, and then enumerate all feasible attack scenarios. Furthermore, based on the feasible attack scenarios, we design an optimization strategy to minimize the damage caused by the attack. Next, we study the security and privacy of cognitive radio networks, which are a typical communication network in CPS. As for the security of cognitive radio networks, we point out that a prominent existing algorithm in cooperative spectrum sensing works poorly under a certain attack model. To defend against this attack, we present a modified combinatorial optimization algorithm that uses the branch-and-bound method in a decision tree to identify all possible false data efficiently.
In regard to privacy in cognitive radio networks, we consider incentive-based cognitive radio transactions, where the primary users sell time slices of their licensed spectrum to secondary users in the network. There are two concerns in such a transaction. The first is the primary user's interest, and the second is the secondary user's privacy. To verify that the payment made by a secondary user is trustworthy, the primary user needs detailed spectrum utilization information from the secondary user. However, disclosing this detailed information compromises the secondary user's privacy. To solve this dilemma, we propose a privacy-preserving scheme that repeatedly uses a commitment scheme and a zero-knowledge proof scheme.
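
The abstract does not say which commitment scheme the dissertation uses. As a rough illustration of the commit-then-reveal idea only, the sketch below shows a generic hash-based commitment; the function names and the use of SHA-256 are assumptions, and the zero-knowledge proof part of the proposed scheme is not covered.

```python
import hashlib
import hmac
import secrets


def commit(value: bytes):
    """Commit to `value`; returns (commitment, nonce).

    The commitment can be published right away. The nonce stays secret
    until the committer chooses to reveal the value.
    """
    nonce = secrets.token_bytes(32)  # random blinding value
    commitment = hashlib.sha256(nonce + value).digest()
    return commitment, nonce


def verify(commitment: bytes, nonce: bytes, value: bytes) -> bool:
    """Check that a revealed (nonce, value) pair matches the commitment."""
    expected = hashlib.sha256(nonce + value).digest()
    return hmac.compare_digest(commitment, expected)
```

In a transaction like the one described, a secondary user could publish the commitment to its usage record first and reveal the nonce and value only if a dispute arises; a different value would then fail `verify`.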

    Machine learning and blockchain technologies for cybersecurity in connected vehicles

    Future connected and autonomous vehicles (CAVs) must be secured against cyberattacks for their everyday functions on the road so that the safety of passengers and vehicles can be ensured. This article presents a holistic review of cybersecurity attacks on sensors and threats regarding multi-modal sensor fusion. A comprehensive review of cyberattacks on intra-vehicle and inter-vehicle communications is presented afterward. Besides the analysis of conventional cybersecurity threats and countermeasures for CAV systems, a detailed review of modern machine learning, federated learning, and blockchain approaches for safeguarding CAVs is also conducted. Machine learning and data mining-aided intrusion detection systems and other countermeasures dealing with these challenges are elaborated at the end of the related section. In the last section, research challenges and future directions are identified.

    Database anonymization and protections of sensitive attributes

    Database anonymization has become increasingly critical for organizations that publish their databases to the public. Current security measures for anonymization each have drawbacks: k-anonymity is prone to many varieties of attack; ℓ-diversity does not work well with categorical or numerical attributes; t-closeness erases too much information in the database. Moreover, some measures of information loss are designed for anonymization measures, such as k-anonymity, in which sensitive attributes play no part in measuring the database's security. Not measuring the redistribution of sensitive attributes results in an underestimate of information loss for measures such as ℓ-diversity or t-closeness, which intentionally try to remove the association between non-sensitive and sensitive attributes to better protect individuals from being identified. This thesis provides a more generalized version of ℓ-diversity that better protects categorical and numerical attributes, and analyzes the effectiveness and complexity of the new security scheme. Another focus of this thesis is to design a better approach to measuring information loss and to lay down a new standard for evaluating information loss under security measures such as ℓ-diversity and t-closeness, quantifying the actual information loss from deliberately hiding relations between non-sensitive and sensitive attributes. This new standard should provide a better estimate of the data mining potential that remains in a generalized database. This thesis also proves that, unlike k-anonymity, which can be solved in polynomial time when k=2, ℓ-diversity remains NP-hard in the special case where ℓ=2, even when there are only 2 possible sensitive attributes in the alphabet.
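
For readers unfamiliar with the two baseline notions named in this abstract, the following minimal sketch checks k-anonymity and the simplest ("distinct") form of l-diversity on an already generalized table. It is not the generalized measure proposed in the thesis; the function names, the row-as-dictionary layout, and the choice of the distinct variant are assumptions.

```python
from collections import defaultdict


def group_by_quasi_identifiers(rows, quasi_ids):
    """Group rows into equivalence classes by their quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_ids)
        groups[key].append(row)
    return groups


def is_k_anonymous(rows, quasi_ids, k):
    """True if every equivalence class contains at least k rows."""
    groups = group_by_quasi_identifiers(rows, quasi_ids)
    return all(len(g) >= k for g in groups.values())


def is_distinct_l_diverse(rows, quasi_ids, sensitive, l):
    """True if every equivalence class has at least l distinct sensitive values."""
    groups = group_by_quasi_identifiers(rows, quasi_ids)
    return all(
        len({row[sensitive] for row in g}) >= l for g in groups.values()
    )
```

A table can pass the first check and fail the second: if every row in a class of size 2 shares the same sensitive value, the class is 2-anonymous but not 2-diverse, which is the kind of gap between the measures that the abstract refers to.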

    The Ethical Issues of Location-Based Services on Big Data and IoT

    Both the Internet of Things (IoT) and big data have been hot topics in recent years. They have changed business, promoted the progress of science and technology, and made people's lives easier. IoT creates the opportunity to connect every item to the Internet, and countless technologies have supported the achievement of this goal. Location-based services (LBS) are one of the indispensable technologies. They bring significant benefits to the business community, the individual, society, and national defense. However, at the same time, an individual's personal information is disclosed and even attacked by 'information thieves'. An inevitable reality is that the prerequisite for getting a location service is to expose your position first. Privacy-related ethical issues therefore arise, and the danger is imminent, although there are corresponding protective measures.

    Quantitative Evaluation and Reevaluation of Security in Services

    Services are software components or systems designed to support interoperable machine or application-oriented interaction over a network. The popularity of services grows because they are easily accessible, very flexible, provide rich functionality, and can be composed into more complex services. During service selection, the user considers not only the functional requirements for a service but also security requirements. The user would like to know that the security of the service satisfies the security requirements before starting the exploitation of the service, i.e., before the service is granted access to the user's assets. Moreover, the user wants to be sure that the security of the service satisfies the security requirements during the exploitation, which may last for a long period. Pursuing these two goals requires the security of the service to be evaluated before the exploitation and continuously reevaluated during the exploitation. This thesis aims at a framework consisting of several quantitative methods for evaluation and continuous reevaluation of security in services. The methods should help a user to select a service and to control the service security level during the exploitation. The thesis starts with a formal model for general quantitative security metrics and for risk that may be used for the evaluation of security in services. Next, we adjust the computation of security metrics with a refined model of an attacker. Then, the thesis proposes a general method for evaluating the security of a complex service composed of several simple services using different security metrics. The method helps to select the most secure design of the complex service. In addition, the thesis describes an approach based on the Usage Control (UCON) model for continuous reevaluation of security in services. Finally, the thesis discusses several strategies for cost-effective decision making in the UCON unde

    CARD: Concealed and remote discovery of IoT devices in victims' home networks

    Smart devices are becoming more common in the standard household. They range from lights to refrigerators, and their functionality and applications continue to grow with consumer demand. This increase in networked, complex devices has also brought an increase in vulnerabilities in the average consumer's home. There now exists an Internet of Things (IoT) ecosystem that creates new attack vectors for adversaries to spread malware, build botnets, and participate in other malicious activities. We give an overview of some of these new attack vectors and describe a framework that would allow an adversary to target a user's home network and any other networks that user may join --Abstract, page iii

    Message traceback systems dancing with the devil

    The research community has produced a great deal of work in recent years in the areas of IP, layer 2 and connection-chain traceback. We collectively designate these as message traceback systems, which invariably aim to locate the origin of network data in spite of any alterations made to that data (whether legitimately or fraudulently). This thesis provides a unifying definition of spoofing and a classification based on it that aims to encompass all streams of message traceback research. The feasibility of this classification is established through its application to our literature review of the numerous known message traceback systems. We propose two layer 2 (L2) traceback systems, switch-SPIE and COTraSE, which adopt different approaches to logging-based L2 traceback for switched Ethernet. Whilst message traceback in spite of spoofing is interesting and perhaps more challenging than it at first seems, one might say that it is rather academic. Logging of network data is a controversial and unpopular notion, and network administrators don't want the added installation and maintenance costs. However, European Parliament Directive 2006/24/EC requires that providers of publicly available electronic communications networks retain data in a form similar to mobile telephony call records, from April 2009 and for periods of up to 2 years. This thesis identifies the relevance of work in all areas of message traceback to the European data retention legislation. In the final part of this thesis we apply our experiences with L2 traceback, together with our definitions and classification of spoofing, to discuss the issues that EU data retention implementations should consider. It is possible to 'do logging right' and even safeguard user privacy. However, this can only occur if we fully understand the technical challenges, which requires much further work in all areas of logging-based message traceback systems.
We have no choice but to dance with the devil.

    User-Centric Security and Privacy Mechanisms in Untrusted Networking and Computing Environments

    Our modern society increasingly relies on the collection, processing, and sharing of digital information. There are two fundamental trends: (1) Enabled by the rapid developments in sensor, wireless, and networking technologies, communication and networking are becoming more and more pervasive and ad hoc. (2) Driven by the explosive growth of hardware and software capabilities, computation power is becoming a public utility, and information is often stored in centralized servers that facilitate ubiquitous access and sharing. Many emerging platforms and systems, such as e-healthcare and the smart grid, hinge on both dimensions. However, the majority of the information handled by these critical systems is sensitive and of high value, and various security breaches could compromise the social welfare of these systems. Thus there is an urgent need to develop security and privacy mechanisms to protect the authenticity, integrity and confidentiality of the collected data, and to control the disclosure of private information. In achieving that, two unique challenges arise: (1) There is a lack of centralized trusted parties in pervasive networking; (2) The remote data servers tend not to be trusted by system users in handling their data. These challenges make existing security solutions developed for traditional networked information systems unsuitable. To this end, in this dissertation we propose a series of user-centric security and privacy mechanisms that resolve these challenging issues in untrusted network and computing environments, spanning wireless body area networks (WBAN), mobile social networks (MSN), and cloud computing. The main contributions of this dissertation are fourfold. First, we propose a secure ad hoc trust initialization protocol for WBAN that does not rely on any pre-established security context among nodes, while defending against a powerful wireless attacker that may or may not compromise sensor nodes. The protocol is highly usable for a human user.
Second, we present novel schemes for sharing sensitive information among distributed mobile hosts in MSN that preserve user privacy, where the users neither need to fully trust each other nor rely on any central trusted party. Third, to realize owner-controlled sharing of sensitive data stored on untrusted servers, we put forward a data access control framework using Multi-Authority Attribute-Based Encryption (ABE) that supports scalable fine-grained access and on-demand user revocation, and is free of key escrow. Finally, we propose mechanisms for authorized keyword search over encrypted data on untrusted servers, with efficient multi-dimensional range, subset and equality query capabilities, and with enhanced search privacy. The common characteristic of our contributions is that they minimize the extent of trust that users must place in the corresponding network or computing environments, in a way that is user-centric, i.e., favoring individual owners/users.