Search CORE

390 research outputs found

ARCHANGEL: Tamper-proofing Video Archives using Temporal Content Hashes on the Blockchain

Author: Bell Mark
Brown Alan
Bui Tu
Collomosse John
Cooper Daniel
Das Arindra
Green Alex
Higgins Jez
Keller Jared
Sheridan John
Thereaux Olivier
Publication venue
Publication date: 01/01/2019
Field of study

We present ARCHANGEL; a novel distributed ledger based system for assuring the long-term integrity of digital video archives. First, we describe a novel deep network architecture for computing compact temporal content hashes (TCHs) from audio-visual streams with durations of minutes or hours. Our TCHs are sensitive to accidental or malicious content modification (tampering) but invariant to the codec used to encode the video. This is necessary due to the curatorial requirement for archives to format shift video over time to ensure future accessibility. Second, we describe how the TCHs (and the models used to derive them) are secured via a proof-of-authority blockchain distributed across multiple independent archives. We report on the efficacy of ARCHANGEL within the context of a trial deployment in which the national government archives of the United Kingdom, Estonia and Norway participated.Comment: Accepted to CVPR Blockchain Workshop 201

arXiv.org e-Print Archive

Surrey Research Insight

Incentivizing Private Data Sharing in Vehicular Networks: A Game-Theoretic Approach

Author: AlSaqabi Yousef
Krishnamachari Bhaskar
Publication venue
Publication date: 21/09/2023
Field of study

In the context of evolving smart cities and autonomous transportation systems, Vehicular Ad-hoc Networks (VANETs) and the Internet of Vehicles (IoV) are growing in significance. Vehicles are becoming more than just a means of transportation; they are collecting, processing, and transmitting massive amounts of data to make driving safer and more convenient. However, this advancement ushers in complex issues concerning the centralized structure of traditional vehicular networks and the privacy and security concerns around vehicular data. This paper offers a novel, game-theoretic network architecture to address these challenges. Our approach decentralizes data collection through distributed servers across the network, aggregating vehicular data into spatio-temporal maps via secure multi-party computation (SMPC). This strategy effectively reduces the chances of adversaries reconstructing a vehicle's complete path, increasing privacy. We also introduce an economic model grounded in game theory that incentivizes vehicle owners to participate in the network, balancing the owners' privacy concerns with the monetary benefits of data sharing. This model aims to maximize the data consumer's utility from the gathered sensor data by determining the most suitable payment to participating vehicles, the frequency in which these vehicles share their data, and the total number of servers in the network. We explore the interdependencies among these parameters and present our findings accordingly. To define meaningful utility and loss functions for our study, we utilize a real dataset of vehicular movement traces.Comment: To Appear in the Proceedings of The 2023 IEEE 98th Vehicular Technology Conference (VTC2023-Fall), 6 pages, 5 figure

arXiv.org e-Print Archive

SMAP: A Novel Heterogeneous Information Framework for Scenario-based Optimal Model Assignment

Author: Cheng Ke
Ji Zehua
Mao Yuhao
Qiu Zekun
Xie Zhipu
Publication venue
Publication date: 22/05/2023
Field of study

The increasing maturity of big data applications has led to a proliferation of models targeting the same objectives within the same scenarios and datasets. However, selecting the most suitable model that considers model's features while taking specific requirements and constraints into account still poses a significant challenge. Existing methods have focused on worker-task assignments based on crowdsourcing, they neglect the scenario-dataset-model assignment problem. To address this challenge, a new problem named the Scenario-based Optimal Model Assignment (SOMA) problem is introduced and a novel framework entitled Scenario and Model Associative percepts (SMAP) is developed. SMAP is a heterogeneous information framework that can integrate various types of information to intelligently select a suitable dataset and allocate the optimal model for a specific scenario. To comprehensively evaluate models, a new score function that utilizes multi-head attention mechanisms is proposed. Moreover, a novel memory mechanism named the mnemonic center is developed to store the matched heterogeneous information and prevent duplicate matching. Six popular traffic scenarios are selected as study cases and extensive experiments are conducted on a dataset to verify the effectiveness and efficiency of SMAP and the score function

arXiv.org e-Print Archive

Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

Author: Alam Mansaf
Ali Syed Arshad
Khan Samiya
Liu Xiufeng
Publication venue
Publication date: 01/01/2019
Field of study

Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for optimized solution to a specific real world problem, big data system are not an exception to any such rule. As far as the storage aspect of any big data system is concerned, the primary facet in this regard is a storage infrastructure and NoSQL seems to be the right technology that fulfills its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents feature and use case analysis and comparison of the four main data models namely document oriented, key value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings forth second facet of big data storage, big data file formats, into picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage and its challenges and future prospects have also been discussed

arXiv.org e-Print Archive

Intelligent Computing for Big Data

Author
Publication venue: 'MDPI AG'
Publication date: 06/12/2022
Field of study

Recent advances in artificial intelligence have the potential to further develop current big data research. The Special Issue on ‘Intelligent Computing for Big Data’ highlighted a number of recent studies related to the use of intelligent computing techniques in the processing of big data for text mining, autism diagnosis, behaviour recognition, and blockchain-based storage

Secure Data Hiding for Contact Tracing

Author: Gotsman Craig
Hormann Kai
Publication venue
Publication date: 14/08/2020
Field of study

Contact tracing is an effective tool in controlling the spread of infectious diseases such as COVID-19. It involves digital monitoring and recording of physical proximity between people over time with a central and trusted authority, so that when one user reports infection, it is possible to identify all other users who have been in close proximity to that person during a relevant time period in the past and alert them. One way to achieve this involves recording on the server the locations, e.g. by reading and reporting the GPS coordinates of a smartphone, of all users over time. Despite its simplicity, privacy concerns have prevented widespread adoption of this method. Technology that would enable the "hiding" of data could go a long way towards alleviating privacy concerns and enable contact tracing at a very large scale. In this article we describe a general method to hide data. By hiding, we mean that instead of disclosing a data value x, we would disclose an "encoded" version of x, namely E(x), where E(x) is easy to compute but very difficult, from a computational point of view, to invert. We propose a general construction of such a function E and show that it guarantees perfect recall, namely, all individuals who have potentially been exposed to infection are alerted, at the price of an infinitesimal number of false alarms, namely, only a negligible number of individuals who have not actually been exposed will be wrongly informed that they have

arXiv.org e-Print Archive

Distributed Spatial Data Sharing: a new era in sharing spatial data

Author: Hojati Majid
Publication venue: Scholars Commons @ Laurier
Publication date: 01/01/2023
Field of study

The advancements in information and communications technology, including the widespread adoption of GPS-based sensors, improvements in computational data processing, and satellite imagery, have resulted in new data sources, stakeholders, and methods of producing, using, and sharing spatial data. Daily, vast amounts of data are produced by individuals interacting with digital content and through automated and semi-automated sensors deployed across the environment. A growing portion of this information contains geographic information directly or indirectly embedded within it. The widespread use of automated smart sensors and an increased variety of georeferenced media resulted in new individual data collectors. This raises a new set of social concerns around individual geopricacy and data ownership. These changes require new approaches to managing, sharing, and processing geographic data. With the appearance of distributed data-sharing technologies, some of these challenges may be addressed. This can be achieved by moving from centralized control and ownership of the data to a more distributed system. In such a system, the individuals are responsible for gathering and controlling access and storing data. Stepping into the new area of distributed spatial data sharing needs preparations, including developing tools and algorithms to work with spatial data in this new environment efficiently. Peer-to-peer (P2P) networks have become very popular for storing and sharing information in a decentralized approach. However, these networks lack the methods to process spatio-temporal queries. During the first chapter of this research, we propose a new spatio-temporal multi-level tree structure, Distributed Spatio-Temporal Tree (DSTree), which aims to address this problem. DSTree is capable of performing a range of spatio-temporal queries. We also propose a framework that uses blockchain to share a DSTree on the distributed network, and each user can replicate, query, or update it. Next, we proposed a dynamic k-anonymity algorithm to address geoprivacy concerns in distributed platforms. Individual dynamic control of geoprivacy is one of the primary purposes of the proposed framework introduced in this research. Sharing data within and between organizations can be enhanced by greater trust and transparency offered by distributed or decentralized technologies. Rather than depending on a central authority to manage geographic data, a decentralized framework would provide a fine-grained and transparent sharing capability. Users can also control the precision of shared spatial data with others. They are not limited to third-party algorithms to decide their privacy level and are also not limited to the binary levels of location sharing. As mentioned earlier, individuals and communities can benefit from distributed spatial data sharing. During the last chapter of this work, we develop an image-sharing platform, aka harvester safety application, for the Kakisa indigenous community in northern Canada. During this project, we investigate the potential of using a Distributed Spatial Data sharing (DSDS) infrastructure for small-scale data-sharing needs in indigenous communities. We explored the potential use case and challenges and proposed a DSDS architecture to allow users in small communities to share and query their data using DSDS. Looking at the current availability of distributed tools, the sustainable development of such applications needs accessible technology. We need easy-to-use tools to use distributed technologies on community-scale SDS. In conclusion, distributed technology is in its early stages and requires easy-to-use tools/methods and algorithms to handle, share and query geographic information. Once developed, it will be possible to contrast DSDS against other data systems and thereby evaluate the practical benefit of such systems. A distributed data-sharing platform needs a standard framework to share data between different entities. Just like the first decades of the appearance of the web, these tools need regulations and standards. Such can benefit individuals and small communities in the current chaotic spatial data-sharing environment controlled by the central bodies

Differential Privacy for Industrial Internet of Things: Opportunities, Applications and Challenges

Author: Jiang Bin
Li Jianqiang
Song Houbing
Yue Guanghui
Publication venue: Scholarly Commons
Publication date: 04/02/2021
Field of study

The development of Internet of Things (IoT) brings new changes to various fields. Particularly, industrial Internet of Things (IIoT) is promoting a new round of industrial revolution. With more applications of IIoT, privacy protection issues are emerging. Specially, some common algorithms in IIoT technology such as deep models strongly rely on data collection, which leads to the risk of privacy disclosure. Recently, differential privacy has been used to protect user-terminal privacy in IIoT, so it is necessary to make in-depth research on this topic. In this paper, we conduct a comprehensive survey on the opportunities, applications and challenges of differential privacy in IIoT. We firstly review related papers on IIoT and privacy protection, respectively. Then we focus on the metrics of industrial data privacy, and analyze the contradiction between data utilization for deep models and individual privacy protection. Several valuable problems are summarized and new research ideas are put forward. In conclusion, this survey is dedicated to complete comprehensive summary and lay foundation for the follow-up researches on industrial differential privacy

arXiv.org e-Print Archive

Embry-Riddle Aeronautical University