17 research outputs found

    Secure Data Management: 10th VLDB Workshop, SDM 2013, Trento, Italy, August 30, 2013. Proceedings

    No full text
    This book constitutes the refereed proceedings of the 10th VLDB Workshop on Secure Data Management, held in Trento, Italy, on August 30, 2013. The 15 revised full papers and one keynote paper presented were carefully reviewed and selected from various submissions. The papers are organized into technical papers and 10 vision papers, which address key challenges in secure data management and point to interesting research questions.

    A Comprehensive Bibliometric Analysis on Social Network Anonymization: Current Approaches and Future Directions

    Full text link
    In recent decades, social network anonymization has become a crucial research field due to its pivotal role in preserving users' privacy. However, the high diversity of approaches introduced in relevant studies poses a challenge to gaining a profound understanding of the field. In response, the current study presents an exhaustive and well-structured bibliometric analysis of the social network anonymization field. To begin, related studies from the period 2007-2022 were collected from the Scopus database and then pre-processed. Following this, VOSviewer was used to visualize the network of authors' keywords. Subsequently, extensive statistical and network analyses were performed to identify the most prominent keywords and trending topics. Additionally, co-word analysis through SciMAT and an Alluvial diagram allowed us to explore the themes of social network anonymization and scrutinize their evolution over time. These analyses culminated in an innovative taxonomy of the existing approaches and an anticipation of potential trends in this domain. To the best of our knowledge, this is the first bibliometric analysis in the social network anonymization field; it offers a deeper understanding of the current state and an insightful roadmap for future research in this domain. Comment: 73 pages, 28 figures
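
    As a rough illustration of the co-word step described in this abstract, the sketch below builds a keyword co-occurrence network from a Scopus CSV export, the kind of weighted edge list that tools like VOSviewer and SciMAT visualize. The file name and the "Author Keywords" column are assumptions for this example, not details taken from the paper.

```python
# Illustrative sketch (not the paper's pipeline): counting author-keyword
# co-occurrences from a Scopus CSV export. The file name and column name
# are assumptions for this example.
import csv
import itertools
from collections import Counter

cooccurrence = Counter()
with open("scopus_export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        # Scopus separates author keywords with semicolons.
        keywords = sorted({k.strip().lower()
                           for k in row.get("Author Keywords", "").split(";")
                           if k.strip()})
        # Each unordered keyword pair in one paper is one co-occurrence.
        for a, b in itertools.combinations(keywords, 2):
            cooccurrence[(a, b)] += 1

# The most frequent pairs approximate the strongest edges in the co-word map.
for (a, b), count in cooccurrence.most_common(10):
    print(f"{a} -- {b}: {count}")
```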

    Efficient Sketching Algorithm for Sparse Binary Data

    Full text link
    Recent advancements in the WWW, IoT, social networks, e-commerce, etc. have generated a large volume of data. These datasets are mostly represented as high-dimensional and sparse datasets. Many fundamental subroutines of common data analytics tasks, such as clustering, classification, ranking, and nearest neighbour search, scale poorly with the dimension of the dataset. In this work, we address this problem and propose a sketching (alternatively, dimensionality reduction) algorithm, BinSketch (Binary Data Sketch), for sparse binary datasets. BinSketch preserves the binary nature of the dataset after sketching and maintains estimates for multiple similarity measures, such as Jaccard, cosine, and inner-product similarities and Hamming distance, on the same sketch. We present a theoretical analysis of our algorithm and complement it with extensive experimentation on several real-world datasets. We compare the performance of our algorithm with state-of-the-art algorithms on mean-square-error and ranking tasks. Our proposed algorithm offers comparable accuracy while providing a significant speedup in dimensionality reduction time with respect to the other candidate algorithms. Our proposal is simple and easy to implement, and can therefore be adopted in practice.
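
    To make the idea concrete, here is a minimal sketch of an OR-based bucket sketch for sparse binary vectors: each dimension is hashed to one of N buckets and a sketch bit is the OR of the bits mapped there, so the sketch itself stays binary, and quantities such as a vector's number of set bits can be estimated from bucket occupancy. This is only an illustration of the general principle; the paper's BinSketch construction and its estimators for Jaccard, cosine, inner-product, and Hamming distance are more refined.

```python
# A minimal sketch of the bucket-OR idea (simplified; the paper derives
# exact estimators for several similarity measures on the same sketch).
import math
import random

def make_mapping(dim, n_buckets, seed=42):
    # Fixed random map from original dimensions to buckets.
    rng = random.Random(seed)
    return [rng.randrange(n_buckets) for _ in range(dim)]

def binary_or_sketch(ones, mapping, n_buckets):
    """ones: indices of the 1-bits of a sparse binary vector."""
    sketch = [0] * n_buckets
    for i in ones:
        sketch[mapping[i]] = 1    # OR all bits that land in the bucket
    return sketch

def estimate_popcount(sketch, n_buckets):
    # E[#set buckets] = N * (1 - (1 - 1/N)^k) for k original 1-bits,
    # so k can be estimated by inverting the bucket occupancy.
    set_buckets = sum(sketch)
    if set_buckets >= n_buckets:
        return float("inf")
    return math.log(1 - set_buckets / n_buckets) / math.log(1 - 1 / n_buckets)

dim, n_buckets = 100_000, 2_000
mapping = make_mapping(dim, n_buckets)
ones = random.Random(0).sample(range(dim), 500)   # sparse vector with 500 ones
sketch = binary_or_sketch(ones, mapping, n_buckets)
print(f"true popcount: {len(ones)}, "
      f"estimated: {estimate_popcount(sketch, n_buckets):.1f}")
```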

    Data-driven conceptual modeling: how some knowledge drivers for the enterprise might be mined from enterprise data

    Get PDF
    As organizations perform their business, they analyze, design and manage a variety of processes, represented in models of different scope and complexity. Specifying these processes requires a certain level of modeling competence. However, this requirement is often not matched by the capability of the person(s) responsible for defining and modeling an organization's or enterprise's operations. On the other hand, an enterprise typically collects records of all events that occur during the operation of its processes. Records such as the start and end of tasks in a process instance, state transitions of objects impacted by process execution, and the messages exchanged during process execution are maintained in enterprise repositories as various logs: event logs, process logs, effect logs, message logs, etc. Furthermore, the volume of data generated by enterprise process execution has grown manyfold in just a few years. On top of this, models are often considered the dashboard view of an enterprise: they represent an abstraction of the underlying reality of an enterprise, and they serve as the knowledge drivers through which an enterprise can be managed. Data-driven extraction offers the capability to mine these knowledge drivers from enterprise data and to leverage the mined models to establish the set of enterprise data that conforms with the desired behaviour. This thesis aims to generate models, or knowledge drivers, from enterprise data to enable a dashboard view of the enterprise in support of analysts. The rationale stems from the requirement to improve an existing process or to create a new one; as noted above, models can also serve as a collection of effectors through which an organization or enterprise can be managed. The enterprise data referred to above are process logs, effect logs, message logs, and invocation logs. The approach in this thesis is to mine these logs to generate process, requirements, and enterprise architecture models, and to show how goals are fulfilled based on collected operational data. The research question is formulated as: is it possible to derive the knowledge drivers from the enterprise data that represent the running operation of the enterprise, or, in other words, can the data available in the enterprise repository be used to generate the knowledge drivers? Chapter 2 reviews the literature that provides the background needed to explore this research question. Chapter 3 presents how process semantics can be mined. Chapter 4 suggests a way to extract a requirements model. Chapter 5 presents a way to discover the underlying enterprise architecture, and Chapter 6 presents a way to mine how goals are orchestrated. Overall findings are discussed in Chapter 7 to derive conclusions.
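
    As a small illustration of mining process semantics from logs, in the spirit of the abstract's Chapter 3, the sketch below extracts the directly-follows relation from an event log, a standard first step in process discovery (it underlies, e.g., the alpha algorithm). The log schema and activity names are invented for this example and are not the thesis's actual data.

```python
# Illustrative sketch: the directly-follows relation from an event log.
# The (case_id, activity, timestamp) layout is an assumption for this example.
from collections import Counter, defaultdict

event_log = [
    ("case1", "receive order", 1), ("case1", "check stock", 2),
    ("case1", "ship goods", 3),
    ("case2", "receive order", 1), ("case2", "check stock", 2),
    ("case2", "reject order", 3),
]

# Group events into per-case traces, ordered by timestamp.
traces = defaultdict(list)
for case_id, activity, ts in sorted(event_log, key=lambda e: (e[0], e[2])):
    traces[case_id].append(activity)

# Count how often activity a is directly followed by activity b.
directly_follows = Counter()
for trace in traces.values():
    for a, b in zip(trace, trace[1:]):
        directly_follows[(a, b)] += 1

for (a, b), n in directly_follows.most_common():
    print(f"{a} -> {b}: {n}")
```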

    Trustworthiness in Mobile Cyber Physical Systems

    Get PDF
    Computing and communication capabilities are increasingly embedded in diverse objects and structures in the physical environment, linking the ‘cyberworld’ of computing and communications with the physical world. These applications are called cyber physical systems (CPS). The increased involvement of real-world entities naturally leads to a greater demand for trustworthy systems. Hence, we use "system trustworthiness" here to mean the ability to guarantee continuous service in the presence of internal errors or external attacks. Mobile CPS (MCPS) is a prominent subcategory of CPS in which the physical component has no permanent location. Mobile Internet devices already provide ubiquitous platforms for building novel MCPS applications. The objective of this Special Issue is to contribute to research on modern and future trustworthy MCPS, including their design, modeling, simulation, dependability, and so on. It is imperative to address the issues critical to their mobility, to report significant advances in the underlying science, and to discuss the challenges of development and implementation in various applications of MCPS.

    Untersuchungen zur Risikominimierungstechnik Stealth Computing für verteilte datenverarbeitende Software-Anwendungen mit nutzerkontrollierbar zusicherbaren Eigenschaften (Investigations into the risk-minimization technique Stealth Computing for distributed data-processing software applications with user-controllable, assurable properties)

    Get PDF
    The security and reliability of applications processing sensitive data can be significantly increased, and controlled by the user, through their protected relocation into the cloud using a combination of target-metric-dependent data coding, continuous multiple service selection, service-dependent optimized data distribution, and coding-dependent algorithms. Combining these techniques into an application-integrated stealth protection layer is a necessary foundation for constructing secure applications with assurable security properties within a correspondingly adapted software development process.
    Contents: 1 Problem statement (introduction; fundamental considerations; problem definition; classification and delimitation). 2 Approach and problem-solving methodology (assumptions and contributions; scientific methods; structure of the thesis). 3 Stealth coding for secured data use (data coding; data distribution; semantic linking of distributed coded data; processing of distributed coded data; summary of contributions). 4 Stealth concepts for reliable services and applications (overview of platform concepts and services; network multiplexer, file storage, database, stream storage service, and event processing interfaces; service integration; application development; platform-equivalent cloud integration of secure services and applications; summary of contributions). 5 Scenarios and application fields (online file storage with search function; personal data analysis; value-added services for the Internet of Things). 6 Validation (experiment infrastructure; experimental validation of data coding, data distribution, and data processing; functionality and properties of the storage service connection, storage service integration, data management, and data stream processing; integrated scenarios: online file storage, personal data analysis, and mobile applications for the Internet of Things). 7 Summary (summary of contributions; critical discussion and assessment; outlook). Appendices: lists of tables, figures, and listings; bibliography; symbols and notation; software contributions for native cloud applications; repositories with experiment data.
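
    As a toy illustration of the coding-plus-distribution principle (not the thesis's actual codecs or service selection), the sketch below splits a payload into XOR shares so that no single storage provider can reconstruct the data on its own, while the user controls where each share is placed.

```python
# Minimal sketch: n-of-n XOR secret sharing, so that no single storage
# service sees the plaintext. The thesis combines richer codings and
# service selection; this only illustrates the principle.
import os

def split_into_shares(data: bytes, n: int) -> list[bytes]:
    """Return n shares; all n are required to reconstruct the data."""
    shares = [os.urandom(len(data)) for _ in range(n - 1)]
    final = bytearray(data)
    for share in shares:
        for i, byte in enumerate(share):
            final[i] ^= byte          # fold each random share into the last one
    return shares + [bytes(final)]

def reconstruct(shares: list[bytes]) -> bytes:
    out = bytearray(len(shares[0]))
    for share in shares:
        for i, byte in enumerate(share):
            out[i] ^= byte
    return bytes(out)

secret = b"patient record #4711"
shares = split_into_shares(secret, 3)   # e.g., one share per cloud provider
assert reconstruct(shares) == secret
print("reconstructed:", reconstruct(shares))
```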

    Advances in Knowledge Discovery and Data Mining, Part II

    Get PDF
    19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II

    Hierarchical distributed fog-to-cloud data management in smart cities

    Get PDF
    There is a vast amount of data being generated every day in the world, in different formats, at different quality levels, etc. This new data, together with archived historical data, constitutes the seed for future knowledge discovery and value generation in several fields of science and in big data environments. Discovering value from data is a complex computing process in which data is the key resource, not only during its processing but during its entire life cycle. However, there is still huge concern about how to organize and manage this data in all fields for efficient usage and exploitation across the whole data life cycle. Although several specific Data LifeCycle (DLC) models have recently been defined for particular scenarios, we argue that there is no global, comprehensive DLC framework in wide use across different fields. In one such particular scenario, smart cities are the current technological solution to handle the challenges and complexity of growing urban density. Traditionally, smart city resource management relies on cloud-based solutions in which sensor data is collected to provide a centralized and rich set of open data. The advantages of cloud-based frameworks are their ubiquity and their (almost) unlimited resource capacity. However, accessing data from the cloud implies heavy network traffic and high latencies, usually not appropriate for real-time or critical solutions, as well as higher security risks. Alternatively, fog computing emerges as a promising technology to absorb these inconveniences: it proposes using devices at the edge to provide computing facilities closer to the data and, therefore, to reduce network traffic and cut latencies drastically while improving security. We have defined a new framework for data management in the context of a smart city through a global fog-to-cloud resource management architecture. This model has the advantages of both fog and cloud technologies, as it allows reduced latencies for critical applications while still being able to use the high computing capabilities of the cloud. In this thesis, we propose several novel ideas in the design of a novel F2C data management architecture for smart cities, as follows. First, we draw and describe a comprehensive, scenario-agnostic Data LifeCycle model that successfully addresses all challenges included in the 6Vs; it is not tailored to any specific environment but is easy to adapt to the requirements of any particular field. Then, we introduce the Smart City Comprehensive Data LifeCycle model, a data management architecture generated from the scenario-agnostic model and tailored to the particular scenario of smart cities. We define the management of each data life phase and explain its implementation in a smart city with Fog-to-Cloud (F2C) resource management. We then illustrate a novel architecture for data management in the context of a smart city through a global fog-to-cloud resource management architecture, and show that this model has the advantages of both fog and cloud. As a first experiment for the F2C data management architecture, a real smart city is analyzed, corresponding to the city of Barcelona, with special emphasis on the layers responsible for collecting the data generated by the deployed sensors. The amount of daily sensor data transmitted through the network has been estimated, and a rough projection has been made assuming an exhaustive deployment that fully covers the whole city. We provide solutions to both reduce data transmission and improve data management, and we use data filtering techniques (including data aggregation and data compression) to estimate the network traffic in this model during data collection and to compare it with a traditional real system. Finally, we estimate the total data storage sizes in the F2C scenario for the Barcelona smart city.
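
    As an illustration of the data aggregation techniques mentioned above, the sketch below summarizes raw sensor readings per time window at a fog node and forwards only the summaries to the cloud, one simple way the uplink traffic reduction can be estimated. The window length and record layout are assumptions for this example, not parameters of the Barcelona deployment.

```python
# Illustrative sketch of fog-side aggregation: instead of forwarding every
# sensor reading to the cloud, a fog node summarizes each time window and
# sends only the summary record upstream.
from statistics import mean

def aggregate_window(readings, window_s=60):
    """readings: list of (timestamp_s, value); returns one record per window."""
    windows = {}
    for ts, value in readings:
        windows.setdefault(ts // window_s, []).append(value)
    return [
        {"window_start": w * window_s,
         "min": min(vals), "max": max(vals), "mean": mean(vals), "n": len(vals)}
        for w, vals in sorted(windows.items())
    ]

# 60 raw readings over 5 minutes collapse into 5 one-minute summaries.
raw = [(t, 20.0 + (t % 7) * 0.1) for t in range(0, 300, 5)]
summaries = aggregate_window(raw)
print(f"{len(raw)} raw readings -> {len(summaries)} uplink records")
```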