397 research outputs found

On Collaborative Predictive Blacklisting

Author: De Cristofaro Emiliano
Melis Luca
Pyrgelis Apostolos
Publication date: 01/10/2018

Collaborative predictive blacklisting (CPB) makes it possible to forecast future attack sources based on logs and alerts contributed by multiple organizations. However, research on CPB has focused only on increasing the number of predicted attacks and has not considered the impact on false positives and false negatives. Moreover, sharing alerts is often hindered by confidentiality, trust, and liability issues, which motivates the need for privacy-preserving approaches to the problem. In this paper, we present a measurement study of state-of-the-art CPB techniques, aiming to shed light on the actual impact of collaboration. To this end, we reproduce and measure two systems: a non-privacy-friendly one that uses a trusted coordinating party with access to all alerts (Soldo et al., 2010) and a peer-to-peer one using privacy-preserving data sharing (Freudiger et al., 2015). We show that, while collaboration boosts the number of predicted attacks, it also yields high false positives, ultimately leading to poor accuracy. This motivates us to present a hybrid approach, using a semi-trusted central entity, aiming to increase utility from collaboration while limiting information disclosure and false positives. This leads to a better trade-off between true and false positive rates, while also addressing privacy concerns. Comment: A preliminary version of this paper appears in ACM SIGCOMM's Computer Communication Review (Volume 48 Issue 5, October 2018). This is the full versio
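The trade-off between true and false positive rates mentioned in the abstract above can be illustrated with a small sketch. It is not the paper's evaluation code: the function name `blacklist_rates`, the IP addresses, and the two example blacklists are all hypothetical, chosen only to show how a larger predicted blacklist can raise both rates at once.

```python
def blacklist_rates(predicted, attackers, candidates):
    """Return (true positive rate, false positive rate) for a predicted blacklist.

    predicted  -- sources the blacklist flags for the next time window
    attackers  -- sources that actually attacked in that window
    candidates -- all sources under consideration (attackers and benign)
    """
    predicted, attackers, candidates = set(predicted), set(attackers), set(candidates)
    benign = candidates - attackers
    true_positives = len(predicted & attackers)
    false_positives = len(predicted & benign)
    tpr = true_positives / len(attackers) if attackers else 0.0
    fpr = false_positives / len(benign) if benign else 0.0
    return tpr, fpr


# Hypothetical data: ten candidate sources, four of which really attack.
candidates = {"10.0.0.%d" % i for i in range(1, 11)}
attackers = {"10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"}

# A blacklist built from local logs only, and a larger collaborative one.
local_only = {"10.0.0.1", "10.0.0.2"}
collaborative = {"10.0.0.1", "10.0.0.2", "10.0.0.3",
                 "10.0.0.7", "10.0.0.8", "10.0.0.9"}

print(blacklist_rates(local_only, attackers, candidates))     # (0.5, 0.0)
print(blacklist_rates(collaborative, attackers, candidates))  # (0.75, 0.5)
```

In this made-up example the collaborative list catches one more attacker but also flags three benign sources, which is the kind of effect the abstract describes.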

arXiv.org e-Print Archive

UCL Discovery

Differentially Private Mixture of Generative Neural Networks

Author: Acs Gergely
Castelluccia Claude
De Cristofaro Emiliano
Melis Luca
Publication date: 18/11/2017

Generative models are used in a wide range of applications building on large amounts of contextually rich information. Due to possible privacy violations of the individuals whose data is used to train these models, however, publishing or sharing generative models is not always viable. In this paper, we present a novel technique for privately releasing generative models and entire high-dimensional datasets produced by these models. We model the generator distribution of the training data with a mixture of k generative neural networks. These are trained together and collectively learn the generator distribution of a dataset. Data is divided into k clusters, using a novel differentially private kernel k-means, then each cluster is given to separate generative neural networks, such as Restricted Boltzmann Machines or Variational Autoencoders, which are trained only on their own cluster using differentially private gradient descent. We evaluate our approach using the MNIST dataset, as well as call detail records and transit datasets, showing that it produces realistic synthetic samples, which can also be used to accurately compute an arbitrary number of counting queries. Comment: A shorter version of this paper appeared at the 17th IEEE International Conference on Data Mining (ICDM 2017). This is the full version, published in IEEE Transactions on Knowledge and Data Engineering (TKDE

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

ReMasker: Imputing Tabular Data with Masked Autoencoding

Author: Du Tianyu
Melis Luca
Wang Ting
Publication date: 24/09/2023

We present ReMasker, a new method of imputing missing values in tabular data by extending the masked autoencoding framework. Compared with prior work, ReMasker is both simple -- besides the missing values (i.e., naturally masked), we randomly "re-mask" another set of values, optimize the autoencoder by reconstructing this re-masked set, and apply the trained model to predict the missing values; and effective -- with extensive evaluation on benchmark datasets, we show that ReMasker performs on par with or outperforms state-of-the-art methods in terms of both imputation fidelity and utility under various missingness settings, while its performance advantage often increases with the ratio of missing data. We further explore theoretical justification for its effectiveness, showing that ReMasker tends to learn missingness-invariant representations of tabular data. Our findings indicate that masked modeling represents a promising direction for further research on tabular data imputation. The code is publicly available
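The re-masking step described in the abstract above can be sketched in a few lines. This is only an illustration of the idea, not the ReMasker implementation: the function name `remask`, the use of `None` for missing values, and the `ratio` parameter are assumptions made for the example, and the autoencoder itself is left out.

```python
import random


def remask(row, ratio, rng):
    """Split one table row into model inputs and reconstruction targets.

    None marks a naturally missing value. A fraction `ratio` of the observed
    positions is additionally hidden ("re-masked"); the hidden values are the
    targets an autoencoder would be trained to reconstruct.
    """
    observed = [i for i, value in enumerate(row) if value is not None]
    n_hide = round(ratio * len(observed))
    hidden = set(rng.sample(observed, n_hide))
    # Inputs keep the natural gaps and also blank out the re-masked positions.
    inputs = [None if i in hidden else value for i, value in enumerate(row)]
    # Targets map each re-masked position to the value that was hidden there.
    targets = {i: row[i] for i in sorted(hidden)}
    return inputs, targets


# Hypothetical row with two naturally missing entries (positions 1 and 4).
row = [0.2, None, 1.5, 3.1, None, 0.7]
inputs, targets = remask(row, ratio=0.5, rng=random.Random(0))
print(inputs)   # natural gaps plus two re-masked positions
print(targets)  # the two hidden values, keyed by column index
```

At prediction time no re-masking is applied: a trained model would receive the row with only its natural gaps and fill those in.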

arXiv.org e-Print Archive

Building and evaluating privacy-preserving data processing systems

Author: Melis Luca
Publication venue: UCL (University College London)
Publication date: 28/08/2018

Large-scale data processing prompts a number of important challenges, including guaranteeing that collected or published data is not misused, preventing disclosure of sensitive information, and deploying privacy protection frameworks that support usable and scalable services. In this dissertation, we study and build systems geared for privacy-friendly data processing, enabling computational scenarios and applications where potentially sensitive data can be used to extract useful knowledge, and which would otherwise be impossible without such strong privacy guarantees. For instance, we show how to privately and efficiently aggregate data from many sources and large streams, and how to use the aggregates to extract useful statistics and train simple machine learning models. We also present a novel technique for privately releasing generative machine learning models and entire high-dimensional datasets produced by these models. Finally, we demonstrate that the data used by participants in training generative and collaborative learning models may be vulnerable to inference attacks and discuss possible mitigation strategies

UCL Discovery

Novel homogeneous selective electrocatalysts for CO2 reduction: an electrochemical and computational study of cyclopentadienyl-phenylendiamino-cobalt complexes

Author: Melis Nicola
Mocci Francesca
Pilia Luca
Vacca Annalisa
Publication venue: Royal Society of Chemistry (RSC)
Publication date: 01/01/2020

Four cyclopentadienyl-phenylendiamino-cobalt complexes [CoCp(bqdi)] with different substituents (R) at the phenylene moiety (bqdi, I; o-perfluoro-bqdi, II; p-NO2-bqdi, III; p-COOH-bqdi, IV) have been studied to investigate their capability as catalysts for CO2 reduction. These compounds were characterized by cyclic voltammetry measurements under both nitrogen and CO2 atmospheres, showing an increase in the cathodic current ranging from 3.36 (III) to 5.59 times (II) that of the measurement under nitrogen. Moreover, with the addition of water, the current enhancement in the presence of CO2 reaches 31.07 times in the case of complex II. Interestingly, these complexes exhibit very good selectivity toward CO2 reduction irrespective of hydrogen even in the presence of water. The relative turnover frequencies were also estimated, giving values ranging from 3.23 (III) to 187.21 s−1 (II) in the presence of water. In addition, these results were analysed by means of density functional theory (DFT) calculations and Fukui function analysis. In particular, the DFT results clearly show the effects of the different substituents on the electrochemical properties of these compounds, whereas the Fukui function analysis indicates that the most favourable positions for an electrophilic attack on the reduced complex are the nitrogen and cobalt atoms

Archivio istituzionale della ricerca - Università di Cagliari

Sulphur vs NH Group: Effects on the CO2 Electroreduction Capability of Phenylenediamine-Cp Cobalt Complexes

Author: Melis Nicola
Mocci Francesca
Pilia Luca
Vacca Annalisa
Publication venue: MDPI AG
Publication date: 01/01/2023

The cobalt complex (I) with cyclopentadienyl and 2-aminothiophenolate ligands was investigated as a homogeneous catalyst for electrochemical CO2 reduction. By comparing its behavior with that of an analogous complex with phenylenediamine (II), the effect of the sulfur atom as a substituent has been evaluated. As a result, a positive shift of the reduction potential and the reversibility of the corresponding redox process have been observed, also suggesting a higher stability of the compound with sulfur. Under anhydrous conditions, complex I showed a higher current enhancement in the presence of CO2 (9.41) in comparison with II (4.12). Moreover, the presence of only one -NH group in I explained the difference in the observed increases in catalytic activity toward CO2 due to the presence of water, with current enhancements of 22.73 and 24.40 for I and II, respectively. DFT calculations confirmed the effect of sulfur on the lowering of the energy of the frontier orbitals of I, highlighted by electrochemical measurements. Furthermore, the condensed Fukui function f⁻ values agreed very well with the current enhancement observed in the absence of water

Archivio istituzionale della ricerca - Università di Cagliari

On the Use of Tri-Stereo Pleiades Images for the Morphometric Measurement of Dolines in the Basaltic Plateau of Azrou (Middle Atlas, Morocco)

Author: De Waele Jo
Melis Maria Teresa
Pisani Luca
Publication venue: MDPI AG
Publication date: 01/01/2021

Hundreds of large and deep collapse dolines dot the surface of the Quaternary basaltic plateau of Azrou, in the Middle Atlas of Morocco. In the absence of detailed topographic maps, the morphometric study of such a large number of features requires the use of remote sensing techniques. We present the processing, extraction, and validation of depth measurements of 89 dolines using tri-stereo Pleiades images acquired in 2018–2019 (the European Space Agency (ESA) © CNES 2018, distributed by Airbus DS). Satellite image-derived DEMs were field-verified using traditional mapping techniques, which showed very good agreement between field and remote sensing measurements. The high resolution of these tri-stereo images made it possible to automatically generate accurate morphometric datasets covering not only the planimetric parameters of the dolines (diameters, contours, orientation of long axes), but also their depth and altimetric profiles. Our study demonstrates the potential of using these types of images on rugged morphologies and for the measurement of steep depressions, where traditional remote sensing techniques may be hindered by shadow zones and blind portions. Tri-stereo images might also be suitable for the measurement of deep and steep depressions (skylights and collapses) on Martian and Lunar lava flows, suitable targets for future planetary cave exploration

Archivio istituzionale della ricerca - Università di Cagliari

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Evaluating Privacy Leakage in Split Learning

Author: Leontiadis Ilias
Melis Luca
Qiu Xinchi
Sablayrolles Alex
Stock Pierre
Publication date: 19/01/2024

Privacy-preserving machine learning (PPML) can help us train and deploy models that utilize private information. In particular, on-device machine learning allows us to avoid sharing raw data with a third-party server during inference. On-device models are typically less accurate than their server counterparts because (1) they typically rely only on a small set of on-device features and (2) they need to be small enough to run efficiently on end-user devices. Split Learning (SL) is a promising approach that can overcome these limitations. In SL, a large machine learning model is divided into two parts, with the bigger part residing on the server side and a smaller part executing on-device, aiming to incorporate the private features. However, end-to-end training of such models requires exchanging gradients at the cut layer, which might encode private features or labels. In this paper, we provide insights into potential privacy risks associated with SL. Furthermore, we also investigate the effectiveness of various mitigation strategies. Our results indicate that the gradients significantly improve the attackers' effectiveness in all tested datasets, reaching almost perfect reconstruction accuracy for some features. However, a small amount of differential privacy (DP) can effectively mitigate this risk without causing significant training degradation. Comment: 10 page
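One common way to apply differential privacy to a gradient before it is shared is to clip its norm and add Gaussian noise, as in DP-SGD. The sketch below shows that generic step applied to a cut-layer gradient. The abstract above does not say which mechanism the paper uses, so this is an assumption, and the names `privatize_gradient`, `clip_norm`, and `noise_multiplier` are hypothetical. The code makes no formal privacy claim, since that would also require accounting across training steps.

```python
import math
import random


def privatize_gradient(grad, clip_norm, noise_multiplier, rng):
    """Clip a gradient vector to L2 norm `clip_norm`, then add Gaussian noise.

    The noise standard deviation is noise_multiplier * clip_norm, the usual
    parameterisation of the Gaussian mechanism in DP-SGD-style training.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    # Scale down only when the gradient exceeds the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in grad]
    sigma = noise_multiplier * clip_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]


# Hypothetical cut-layer gradient for one example, sent between the two parts.
cut_layer_grad = [3.0, 4.0]  # L2 norm 5.0
noisy = privatize_gradient(cut_layer_grad, clip_norm=1.0,
                           noise_multiplier=0.5, rng=random.Random(0))
print(noisy)
```

Clipping bounds how much any single example can influence the shared gradient, and the noise scale is tied to that bound. A larger `noise_multiplier` gives stronger protection at the cost of noisier training.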

arXiv.org e-Print Archive

Moving towards happiness? Understanding travel moods through twitter data in Turin

Author: Melis Giulia
Rybarczyk Greg
Staricco Luca
VITALE BROVARONE Elisabetta
Publication venue: Universidade de Lisboa

The paper will address the following questions: does urban mobility matter for health, and mental health in particular? How does each transport mode relate to our level of stress/happiness? A previous study conducted on Turin (Melis et al. 2015) showed that among indicators related to urban structure and social composition, ‘accessibility by public transport’ seems to be the one with the strongest relation to mental health (depression) outcomes. Starting from these results, we decided to further explore this association through the use of data from social media. Recent trends in the use of social networks have opened up new opportunities in the field of urban and transport studies: the great amount of data coming from Twitter is an example, providing easily available, often geo-referenced, marginally costly datasets offering new insights on individual and collective life. The accuracy and reliability, as well as the representativeness, of the results coming from the use of this new source of data in the mobility and planning field are undoubtedly growing. The project uses Twitter data collected for the metropolitan area of Turin (IT) and analyses it using a Semantic Analysis algorithm to show spatiotemporal levels of happiness (valence) of users, related to the transport mode they have been using. Geographic Information Systems (GIS) and spatial analysis techniques are then used to visualize spatial patterns and associations among happiness levels and contextual variables, such as land use. From a methodological point of view, results can be compared to research conducted on US cities by Flint University (Rybarczyk and Banerjee 2015), as the method used is the same. The purpose of the study is exploratory, in order to understand what use can be made of such a rich data source as social media information. Therefore, the results may be used to promote the use of social media data by transportation planners and public health officials for developing more effective transportation plans and policies, as well as to understand the degree of satisfaction/stress linked to different transport modes

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)