Graph Embedded One-Class Classifiers for media data classification

Author: Iosifidis Alexandros
Mygdalis Vasileios
Pitas Ioannis
Tefas Anastasios
Publication date: 01/12/2016

Crossref

Explore Bristol Research

Social Fingerprinting: detection of spambot groups through DNA-inspired behavioral modeling

Author: Cresci Stefano
Di Pietro Roberto
Petrocchi Marinella
Spognardi Angelo
Tesconi Maurizio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Spambot detection in online social networks is a long-lasting challenge involving the study and design of detection techniques capable of efficiently identifying ever-evolving spammers. Recently, a new wave of social spambots has emerged, with advanced human-like characteristics that allow them to go undetected even by current state-of-the-art algorithms. In this paper, we show that efficient spambot detection can be achieved via an in-depth analysis of their collective behaviors, exploiting the digital DNA technique for modeling the behaviors of social network users. Inspired by its biological counterpart, the digital DNA representation encodes the behavioral lifetime of a digital account as a sequence of characters. We then define a similarity measure for such digital DNA sequences. We build upon digital DNA and the similarity between groups of users to characterize both genuine accounts and spambots. Leveraging this characterization, we design the Social Fingerprinting technique, which is able to discriminate between spambots and genuine accounts in both a supervised and an unsupervised fashion. Finally, we evaluate the effectiveness of Social Fingerprinting and compare it with three state-of-the-art detection algorithms. Among the peculiarities of our approach is the possibility to apply off-the-shelf DNA analysis techniques to study online users' behaviors and to efficiently rely on a limited number of lightweight account characteristics.
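
The encoding idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the alphabet or the similarity measure, so both are assumptions here: the action-to-character mapping `ACTION_TO_BASE` is hypothetical, and the longest common substring, normalized by the shorter sequence, stands in for the similarity measure.

```python
from typing import Iterable

# Hypothetical alphabet (an assumption): one character per action type.
ACTION_TO_BASE = {"tweet": "A", "reply": "C", "retweet": "T"}


def encode(actions: Iterable[str]) -> str:
    """Encode an account's chronological actions as a character sequence."""
    return "".join(ACTION_TO_BASE[action] for action in actions)


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous run shared by both sequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the common run that ended one position earlier.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def similarity(a: str, b: str) -> float:
    """Assumed similarity: shared run length relative to the shorter sequence."""
    if not a or not b:
        return 0.0
    return longest_common_substring(a, b) / min(len(a), len(b))
```

Under this assumption, accounts that repeat the same long run of actions score close to 1, while accounts with unrelated behavior score close to 0.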

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Online Research Database In Technology

Leak localization in water distribution networks using pressure and data-driven classifier approach

Author: Cembrano Gennari Gabriela
Parellada Calderer Benjamí
Puig Cayuela Vicenç
Sun Congcong
Publication venue: 'MDPI AG'
Publication date: 01/01/2019

Leaks in water distribution networks (WDNs) are one of the main reasons for water loss during fluid transportation. Considering the worldwide problem of water scarcity, added to the challenges that a growing population brings, minimizing water losses through timely and efficient leak detection and localization using advanced techniques is an urgent humanitarian need. Numerous methods are used to localize water leaks in WDNs by constructing hydraulic models or by analyzing flow/pressure deviations between the observed data and the estimated values. However, from the application perspective, it is more practical to implement an approach that has reasonable computational demands and does not rely heavily on measurements or complex models. In this context, this paper presents a novel method for leak localization which uses a data-driven approach based on limited pressure measurements in WDNs and comprises two stages: (1) two different machine learning classifiers, based on linear discriminant analysis (LDA) and neural networks (NNET), are developed to determine the probability of each node in a WDN having a leak; (2) Bayesian temporal reasoning is then applied to rescale the probabilities of each possible leak location at each time step after a leak is detected, with the aim of improving the localization accuracy. As an initial illustration, the hypothetical benchmark Hanoi district metered area (DMA) is used as the case study to test the performance of the proposed approach. Using the fitting accuracy and average topological distance (ATD) as performance indicators, the preliminary results reach more than 80% accuracy in the best cases.
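
The second stage described in this abstract can be sketched as a recursive Bayes update. This is an illustration under assumptions, not the authors' formulation: the abstract does not give the update rule, so the sketch treats each time step's classifier output as a likelihood and starts from a uniform prior over nodes. The function names `bayes_update` and `localize` are hypothetical.

```python
from typing import List, Sequence


def bayes_update(prior: Sequence[float], likelihood: Sequence[float]) -> List[float]:
    """Multiply prior by likelihood per node, then renormalize to sum to 1."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        # No node is supported by both terms; keep the previous belief.
        return list(prior)
    return [u / total for u in unnormalized]


def localize(per_step_probs: Sequence[Sequence[float]]) -> List[float]:
    """Fuse per-time-step classifier probabilities into one belief per node."""
    n_nodes = len(per_step_probs[0])
    belief = [1.0 / n_nodes] * n_nodes  # assumed uniform prior over nodes
    for probs in per_step_probs:
        belief = bayes_update(belief, probs)
    return belief


# Two time steps of (made-up) classifier outputs for three candidate nodes.
belief = localize([[0.5, 0.3, 0.2], [0.6, 0.3, 0.1]])
most_likely_node = belief.index(max(belief))
```

Repeating the update over time steps concentrates the belief on nodes that the classifier favors consistently, which is the intuition behind rescaling the probabilities after a leak is detected.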

UPCommons. Portal del coneixement obert de la UPC

Multiple Instance Learning: A Survey of Problem Characteristics and Applications

Author: Carbonneau Marc-André
Cheplygina Veronika
Gagnon Ghyslain
Granger Eric
Publication venue: 'Elsevier BV'
Publication date: 10/12/2016

Multiple instance learning (MIL) is a form of weakly supervised learning where training instances are arranged in sets, called bags, and a label is provided for the entire bag. This formulation is gaining interest because it naturally fits various problems and allows weakly labeled data to be leveraged. Consequently, it has been used in diverse application fields such as computer vision and document classification. However, learning from bags raises important challenges that are unique to MIL. This paper provides a comprehensive survey of the characteristics which define and differentiate the types of MIL problems. Until now, these problem characteristics have not been formally identified and described. As a result, the variations in performance of MIL algorithms from one data set to another are difficult to explain. In this paper, MIL problem characteristics are grouped into four broad categories: the composition of the bags, the types of data distribution, the ambiguity of instance labels, and the task to be performed. Methods specialized to address each category are reviewed. Then, the extent to which these characteristics manifest themselves in key MIL application areas is described. Finally, experiments are conducted to compare the performance of 16 state-of-the-art MIL methods on selected problem characteristics. This paper provides insight into how the problem characteristics affect MIL algorithms, recommendations for future benchmarking, and promising avenues for research.
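
The bag-level labeling described in this abstract can be made concrete with a small sketch. It assumes the standard MIL assumption, which the abstract does not state: a bag is positive if at least one of its instances is positive. The scorer `first_feature` and the threshold are hypothetical placeholders for a trained instance-level model.

```python
from typing import Callable, Sequence

Instance = Sequence[float]
Bag = Sequence[Instance]


def predict_bag(
    bag: Bag,
    score_instance: Callable[[Instance], float],
    threshold: float = 0.5,
) -> int:
    """Label a bag positive (1) if its highest-scoring instance clears the threshold."""
    return int(max(score_instance(x) for x in bag) >= threshold)


def first_feature(x: Instance) -> float:
    """Hypothetical instance scorer: uses the first feature as the score."""
    return x[0]


positive_bag = [[0.1, 0.0], [0.9, 0.2], [0.3, 0.7]]  # one strong instance
negative_bag = [[0.1, 0.4], [0.2, 0.8]]              # no strong instance
```

Max-pooling over instance scores is only one way to aggregate; which aggregation is appropriate depends on the problem characteristics the survey discusses.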

arXiv.org e-Print Archive

A cDNA Microarray Gene Expression Data Classifier for Clinical Diagnostics Based on Graph Theory

Author: Benso Alfredo
Di Carlo Stefano
Politano Gianfranco Michele Maria
Publication venue: IEEE Computer Society
Publication date: 01/01/2011

Despite great advances in discovering cancer molecular profiles, the proper application of microarray technology to routine clinical diagnostics is still a challenge. Current practices in the classification of microarray data show two main limitations: the reliability of the training data sets used to build the classifiers, and the classifiers' performance, especially when the sample to be classified does not belong to any of the available classes. In this case, state-of-the-art algorithms usually produce a high rate of false positives that, in real diagnostic applications, is unacceptable. To address this problem, this paper presents a new cDNA microarray data classification algorithm that is based on graph theory and is able to overcome most of the limitations of known classification methodologies. The classifier works by analyzing gene expression data organized in an innovative data structure based on graphs, where vertices correspond to genes and edges to gene expression relationships. To demonstrate the novelty of the proposed approach, the authors present an experimental performance comparison between the proposed classifier and several state-of-the-art classification algorithms.
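
The graph structure mentioned in this abstract, with genes as vertices and expression relationships as edges, can be sketched as follows. The abstract does not say how edges are defined, so this sketch makes a purely illustrative assumption: two genes are connected when the absolute Pearson correlation of their expression profiles reaches a threshold. The names `build_gene_graph` and `threshold` are hypothetical, and the expression values are made up.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, Sequence, Set


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def build_gene_graph(
    expression: Dict[str, Sequence[float]], threshold: float = 0.9
) -> Dict[str, Set[str]]:
    """Adjacency sets: genes are vertices, strongly correlated pairs are edges."""
    graph: Dict[str, Set[str]] = {gene: set() for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            graph[g1].add(g2)
            graph[g2].add(g1)
    return graph


# Made-up expression levels of three genes across four samples.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, -1.0, 1.0, -1.0],
}
graph = build_gene_graph(expression)
```

How such a graph is then used to classify a sample, or to reject one that belongs to none of the known classes, is not described in the abstract and is left out of the sketch.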

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)