Social Fingerprinting: detection of spambot groups through DNA-inspired behavioral modeling
Spambot detection in online social networks is a long-lasting challenge
involving the study and design of detection techniques capable of efficiently
identifying ever-evolving spammers. Recently, a new wave of social spambots has
emerged, with advanced human-like characteristics that allow them to go
undetected even by current state-of-the-art algorithms. In this paper, we show
that efficient spambot detection can be achieved via an in-depth analysis of
their collective behaviors exploiting the digital DNA technique for modeling
the behaviors of social network users. Inspired by its biological counterpart,
in the digital DNA representation the behavioral lifetime of a digital account
is encoded in a sequence of characters. Then, we define a similarity measure
for such digital DNA sequences. We build upon digital DNA and the similarity
between groups of users to characterize both genuine accounts and spambots.
Leveraging such characterization, we design the Social Fingerprinting
technique, which is able to discriminate between spambots and genuine accounts in
both a supervised and an unsupervised fashion. We finally evaluate the
effectiveness of Social Fingerprinting and we compare it with three
state-of-the-art detection algorithms. Among the peculiarities of our approach
is the possibility to apply off-the-shelf DNA analysis techniques to study
online users' behaviors and to efficiently rely on a limited number of
lightweight account characteristics
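
The encoding and similarity steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three-letter alphabet, the action names, and the choice of longest common substring as the similarity measure are all assumptions made here, since the abstract does not specify them.

```python
from typing import Iterable, List

# Hypothetical alphabet: one character per action type.
# The abstract does not define the alphabet; this mapping is an assumption.
ALPHABET = {"tweet": "A", "reply": "C", "retweet": "T"}


def encode(actions: Iterable[str]) -> str:
    """Encode a chronological list of account actions as a character string."""
    return "".join(ALPHABET[action] for action in actions)


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous substring shared by two sequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def group_similarity(sequences: List[str]) -> int:
    """Length of the longest substring shared by every sequence in a group.

    Brute force over substrings of the shortest sequence; adequate for a
    small illustration, not for large collections of accounts.
    """
    if not sequences:
        return 0
    shortest = min(sequences, key=len)
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in seq for seq in sequences):
                return length
    return 0
```

Under these assumptions, a group of accounts that share an unusually long common substring behaves in a highly coordinated way, which is the kind of group-level signal the abstract attributes to spambots.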
Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare
In recent years, time-series mining has become a challenging issue for
researchers. An important application lies in monitoring, which often
requires analyzing large sets of time-series to learn usual patterns. Any
deviation from this learned profile is then considered an unexpected
situation. Moreover, complex applications may involve the temporal study of
several heterogeneous parameters. In this paper, we propose a method for mining
heterogeneous multivariate time-series for learning meaningful patterns. The
proposed approach allows for mixed time-series -- containing both pattern and
non-pattern data -- as well as for imprecise matches, outliers, stretching and
global translation of pattern instances in time. We present the early results
of our approach in the context of monitoring the health status of a person at
home. The purpose is to build a behavioral profile of a person by analyzing the
time variations of several quantitative or qualitative parameters recorded
through a set of sensors installed in the home
Querying recurrent convoys over trajectory data
National Research Foundation (NRF) Singapore under International Research Centres in Singapore Funding Initiativ
Predicting Community Evolution in Social Networks
Nowadays, sustained development of different social media can be observed
worldwide. One of the relevant research domains intensively explored recently
is analysis of social communities existing in social media as well as
prediction of their future evolution taking into account collected historical
evolution chains. The evolution chains proposed in the paper contain group
states in the previous time frames and their historical transitions, which were
identified using one of two methods: Stable Group Changes Identification
(SGCI) and Group Evolution Discovery (GED). Based on the observed evolution
chains of various length, structural network features are extracted, validated
and selected as well as used to learn classification models. The experimental
studies were performed on three real datasets with different profiles: DBLP,
Facebook and Polish blogosphere. The process of group prediction was analysed
with respect to different classifiers as well as various descriptive feature
sets extracted from evolution chains of different lengths. The results revealed
that, in general, the longer the evolution chains, the better the predictive
abilities of the classification models. However, chains of length 3 to 7 enabled the
GED-based method to almost reach its maximum possible prediction quality. For
SGCI, this value was at the level of the last 3 to 5 periods.
Comment: Entropy 2015, 17, 1-x manuscripts; doi:10.3390/e170x000x 46 page
The genome of the protozoan parasite Cystoisospora suis and a reverse vaccinology approach to identify vaccine candidates
Vaccine development targeting protozoan parasites remains challenging, partly due to the complex interactions between these eukaryotes and the host immune system. Reverse vaccinology is a promising approach for direct screening of genome sequence assemblies for new vaccine candidate proteins. Here, we applied this paradigm to Cystoisospora suis, an apicomplexan parasite that causes enteritis and diarrhea in suckling piglets and economic losses in pig production worldwide. Using Next Generation Sequencing we produced an ~84 Mb sequence assembly for the C. suis genome, making it the first available reference for the genus Cystoisospora. Then, we derived a manually curated annotation of more than 11,000 protein-coding genes and applied the tool Vacceed to identify 1,168 vaccine candidates by screening the predicted C. suis proteome. To refine the set of candidates, we looked at proteins that are highly expressed in merozoites and specific to apicomplexans. The stringent set of candidates included 220 proteins, among which were 152 proteins with unknown function, 17 surface antigens of the SAG and SRS gene families, 12 proteins of the apicomplexan-specific secretory organelles including AMA1, MIC6, MIC13, ROP6, ROP12, ROP27, ROP32 and three proteins related to cell adhesion. Finally, we demonstrated in vitro the immunogenic potential of a C. suis-specific 42 kDa transmembrane protein, which might constitute an attractive candidate for further testing
Behavioral Genetics Research and Criminal DNA Databases
Kaye discusses DNA databanks and the potential use of such databanks for behavioral genetics research. He addresses the concern that DNA databanks serve as a limitless repository for future research and that the samples used in the databanks could be used for research into a crime gene
KC Two-Way Clustering Algorithms For Multi-Child Semantic Maps In Image Mining
Image mining is now a thriving and expanding field of computer science research. Image mining is linked to the advancement of data mining in image preparation. Image mining is used to extract hidden information and in other situations where the photos do not clearly describe the situation. Image mining combines machine learning, data handling, application autonomy, and image preparation concepts. Semantic maps are used to visualize image data stored in image databases. We recommend using Multi-Child Semantic Maps to build semantic maps which fully display the image. In this study, we propose two-way clustering on Multi-Child Semantic Maps (MCSM) using the K-C Means Clustering Algorithm, also known as the MCSMK-C algorithm. This algorithm performs image clustering and instructs the mining system to look at the image's top area. When mining, the MCSMK-C algorithm considers the X and Y coordinates. The system looks for groups by examining each object's territory in the database, and it saves a region if it contains more objects than the required number