3,817 research outputs found

    Social Fingerprinting: detection of spambot groups through DNA-inspired behavioral modeling

    Full text link
    Spambot detection in online social networks is a long-lasting challenge involving the study and design of detection techniques capable of efficiently identifying ever-evolving spammers. Recently, a new wave of social spambots has emerged, with advanced human-like characteristics that allow them to go undetected even by current state-of-the-art algorithms. In this paper, we show that efficient spambots detection can be achieved via an in-depth analysis of their collective behaviors exploiting the digital DNA technique for modeling the behaviors of social network users. Inspired by its biological counterpart, in the digital DNA representation the behavioral lifetime of a digital account is encoded in a sequence of characters. Then, we define a similarity measure for such digital DNA sequences. We build upon digital DNA and the similarity between groups of users to characterize both genuine accounts and spambots. Leveraging such characterization, we design the Social Fingerprinting technique, which is able to discriminate among spambots and genuine accounts in both a supervised and an unsupervised fashion. We finally evaluate the effectiveness of Social Fingerprinting and we compare it with three state-of-the-art detection algorithms. Among the peculiarities of our approach is the possibility to apply off-the-shelf DNA analysis techniques to study online users behaviors and to efficiently rely on a limited number of lightweight account characteristics

    Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare

    Full text link
    For the last years, time-series mining has become a challenging issue for researchers. An important application lies in most monitoring purposes, which require analyzing large sets of time-series for learning usual patterns. Any deviation from this learned profile is then considered as an unexpected situation. Moreover, complex applications may involve the temporal study of several heterogeneous parameters. In that paper, we propose a method for mining heterogeneous multivariate time-series for learning meaningful patterns. The proposed approach allows for mixed time-series -- containing both pattern and non-pattern data -- such as for imprecise matches, outliers, stretching and global translating of patterns instances in time. We present the early results of our approach in the context of monitoring the health status of a person at home. The purpose is to build a behavioral profile of a person by analyzing the time variations of several quantitative or qualitative parameters recorded through a provision of sensors installed in the home

    Querying recurrent convoys over trajectory data

    Get PDF
    National Research Foundation (NRF) Singapore under International Research Centres in Singapore Funding Initiativ

    Predicting Community Evolution in Social Networks

    Full text link
    Nowadays, sustained development of different social media can be observed worldwide. One of the relevant research domains intensively explored recently is analysis of social communities existing in social media as well as prediction of their future evolution taking into account collected historical evolution chains. These evolution chains proposed in the paper contain group states in the previous time frames and its historical transitions that were identified using one out of two methods: Stable Group Changes Identification (SGCI) and Group Evolution Discovery (GED). Based on the observed evolution chains of various length, structural network features are extracted, validated and selected as well as used to learn classification models. The experimental studies were performed on three real datasets with different profile: DBLP, Facebook and Polish blogosphere. The process of group prediction was analysed with respect to different classifiers as well as various descriptive feature sets extracted from evolution chains of different length. The results revealed that, in general, the longer evolution chains the better predictive abilities of the classification models. However, chains of length 3 to 7 enabled the GED-based method to almost reach its maximum possible prediction quality. For SGCI, this value was at the level of 3 to 5 last periods.Comment: Entropy 2015, 17, 1-x manuscripts; doi:10.3390/e170x000x 46 page

    The genome of the protozoan parasite Cystoisospora suis and a reverse vaccinology approach to identify vaccine candidates

    Get PDF
    Vaccine development targeting protozoan parasites remains challenging, partly due to the complex interactions between these eukaryotes and the host immune system. Reverse vaccinology is a promising approach for direct screening of genome sequence assemblies for new vaccine candidate proteins. Here, we applied this paradigm to Cystoisospora suis, an apicomplexan parasite that causes enteritis and diarrhea in suckling piglets and economic losses in pig production worldwide. Using Next Generation Sequencing we produced an ∌84 Mb sequence assembly for the C. suis genome, making it the first available reference for the genus Cystoisospora. Then, we derived a manually curated annotation of more than 11,000 protein-coding genes and applied the tool Vacceed to identify 1,168 vaccine candidates by screening the predicted C. suis proteome. To refine the set of candidates, we looked at proteins that are highly expressed in merozoites and specific to apicomplexans. The stringent set of candidates included 220 proteins, among which were 152 proteins with unknown function, 17 surface antigens of the SAG and SRS gene families, 12 proteins of the apicomplexan-specific secretory organelles including AMA1, MIC6, MIC13, ROP6, ROP12, ROP27, ROP32 and three proteins related to cell adhesion. Finally, we demonstrated in vitro the immunogenic potential of a C. suis-specific 42 kDa transmembrane protein, which might constitute an attractive candidate for further testing

    Behavioral Genetics Research and Criminal DNA Databases

    Get PDF
    Kaye discusses DNA databanks and the potential use of such databanks for behavioral genetics research. He addresses the concern that DNA databanks serve as a limitless repository for future research and that the samples used in the databanks could be used for research into a crime gene

    New directions in the analysis of movement patterns in space and time

    Get PDF

    KC Two-Way Clustering Algorithms For Multi-Child Semantic Maps In Image Mining

    Get PDF
    Image mining is now a thriving and expanding field of computer science research. Image mining is linked to the advancement of data mining in image preparation. Image mining is used to extract hidden information and in other situations where the photos do not clearly describe the situation. Image mining combines machine learning, data handling, application autonomy, and image preparation concepts. Semantic maps are used to visualize image data stored in image databases. We recommend using Multi-Child Semantic Maps to build semantic maps which fully display the image. In this study, we propose two path clustering on Multi-Child Semantic Maps (MCSM) using the K-C Means Clustering Algorithm, also known as the MCSMK-C algorithm. This algorithm causes image clustering and instructs the mining system to look at the image's top area. When mining, the MCSMK-C algorithm considers the X and Y coordinates. The system looks for groups by examining each object's territory in the database, and it saves a region if it contains more objects than the required number
    • 

    corecore