The detection and classification of blast cell in Leukaemia Acute Promyelocytic Leukaemia (AML M3) blood using simulated annealing and neural networks
This paper was delivered at AIME 2011: 13th Conference on Artificial Intelligence in Medicine. This paper presents a method for the detection and classification of blast cells in M3 and other sub-types using simulated annealing and neural networks. In this paper, we increased our test set from 10 images to 20 images. We applied Hill Climbing, Simulated Annealing and Genetic Algorithms to detect the blast cells. Simulated annealing proved to be the “best” heuristic search for detecting the leukaemia cells. After detection, we performed feature extraction on the blast cells and classified them as M3 or other sub-types using neural networks. We obtained convincing results, reaching around 97% in classifying M3 against other sub-types. Our results are based on real-world image data from a Haematology Department. Universiti Sains Islam Malaysia and the Ministry of Higher Education, Malaysia
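For readers unfamiliar with the heuristic named in the abstract above, the following is a minimal, generic sketch of simulated annealing. It is not the authors' detection method: the function name `simulated_annealing`, the parameters `t0`, `cooling` and `steps`, and the toy cost function are all assumptions chosen for illustration. Unlike plain hill climbing, the search sometimes accepts a worse candidate, with a probability that shrinks as the temperature falls.

```python
import math
import random


def simulated_annealing(cost, neighbor, start, t0=1.0, cooling=0.995,
                        steps=5000, seed=0):
    """Generic minimiser; all parameter names and defaults are illustrative."""
    rng = random.Random(seed)          # seeded so runs are repeatable
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t0
    for _ in range(steps):
        cand = neighbor(current, rng)
        cand_cost = cost(cand)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / t), which decreases as t cools.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= cooling                   # geometric cooling schedule
    return best, best_cost


def toy_cost(x):
    # Hypothetical bumpy 1-D function with several local minima.
    return x * x + 10.0 * math.sin(3.0 * x)


def toy_neighbor(x, rng):
    # Propose a small random step around the current point.
    return x + rng.uniform(-0.5, 0.5)


if __name__ == "__main__":
    best_x, best_val = simulated_annealing(toy_cost, toy_neighbor, start=4.0)
    print(best_x, best_val)
```

Because the best state seen so far is tracked separately, the returned cost is never worse than the cost of the starting point.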
Motif Clustering and Overlapping Clustering for Social Network Analysis
Motivated by applications in social network community analysis, we introduce
a new clustering paradigm termed motif clustering. Unlike classical clustering,
motif clustering aims to minimize the number of clustering errors associated
with both edges and certain higher order graph structures (motifs) that
represent "atomic units" of social organizations. Our contributions are
two-fold: We first introduce motif correlation clustering, in which the goal is
to agnostically partition the vertices of a weighted complete graph so that
certain predetermined "important" social subgraphs mostly lie within the same
cluster, while "less relevant" social subgraphs are allowed to lie across
clusters. We then proceed to introduce the notion of motif covers, in which the
goal is to cover the vertices of motifs via the smallest number of (near)
cliques in the graph. Motif cover algorithms provide a natural solution for
overlapping clustering and they also play an important role in latent feature
inference of networks. For both motif correlation clustering and its extension
introduced via the covering problem, we provide hardness results, algorithmic
solutions and community detection results for two well-studied social networks.
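One way to picture the kind of clustering error described in the abstract above is to count disagreements between a candidate partition and a set of labelled edges and triangles. The sketch below is an assumption-laden illustration, not the paper's objective: the names `inside` and `disagreement_cost`, the boolean "positive" labels, and the unit cost per error are all hypothetical. A "positive" structure counts as an error when it is split across clusters, and a "negative" one counts as an error when it lies entirely inside a single cluster.

```python
def inside(nodes, cluster_of):
    """True if every vertex of the edge or motif is in the same cluster."""
    return len({cluster_of[v] for v in nodes}) == 1


def disagreement_cost(labels, cluster_of):
    """Count labelled structures that the partition gets 'wrong'.

    labels maps a tuple of vertices (an edge or a triangle) to True if it
    should stay within one cluster and False if it may be split.
    This unit-cost formulation is an illustrative assumption.
    """
    cost = 0
    for nodes, positive in labels.items():
        if positive != inside(nodes, cluster_of):
            cost += 1
    return cost


if __name__ == "__main__":
    # Hypothetical 4-vertex example: vertices 0, 1, 2 together, 3 alone.
    cluster_of = {0: 0, 1: 0, 2: 0, 3: 1}
    labels = {
        (0, 1): True, (1, 2): True, (0, 2): True, (2, 3): True,
        (0, 3): False, (1, 3): False,
        (0, 1, 2): True, (0, 2, 3): True,
        (0, 1, 3): False, (1, 2, 3): False,
    }
    print(disagreement_cost(labels, cluster_of))
```

In this toy instance the split positive edge (2, 3) and the split positive triangle (0, 2, 3) are the only errors.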
Link Prediction by De-anonymization: How We Won the Kaggle Social Network Challenge
This paper describes the winning entry to the IJCNN 2011 Social Network
Challenge run by Kaggle.com. The goal of the contest was to promote research on
real-world link prediction, and the dataset was a graph obtained by crawling
the popular Flickr social photo sharing website, with user identities scrubbed.
By de-anonymizing much of the competition test set using our own Flickr crawl,
we were able to effectively game the competition. Our attack represents a new
application of de-anonymization to gaming machine learning contests, suggesting
changes in how future competitions should be run.
We introduce a new simulated annealing-based weighted graph matching
algorithm for the seeding step of de-anonymization. We also show how to combine
de-anonymization with link prediction---the latter is required to achieve good
performance on the portion of the test set not de-anonymized---for example by
training the predictor on the de-anonymized portion of the test set, and
combining probabilistic predictions from de-anonymization and link prediction.
Comment: 11 pages, 13 figures; submitted to IJCNN'201
A General Optimization Technique for High Quality Community Detection in Complex Networks
Recent years have witnessed the development of a large body of algorithms for
community detection in complex networks. Most of them are based upon the
optimization of objective functions, among which modularity is the most common,
though a number of alternatives have been suggested in the scientific
literature. We present here an effective general search strategy for the
optimization of various objective functions for community detection purposes.
When applied to modularity, on both real-world and synthetic networks, our
search strategy substantially outperforms the best existing algorithms in terms
of final scores of the objective function; for description length, its
performance is on par with the original Infomap algorithm. The execution time
of our algorithm is on par with non-greedy alternatives present in literature,
and networks of up to 10,000 nodes can be analyzed in time spans ranging from
minutes to a few hours on average workstations, making our approach readily
applicable to tasks which require the quality of partitioning to be as high as
possible, and are not limited by strict time constraints. Finally, based on the
most effective of the available optimization techniques, we compare the
performance of modularity and code length as objective functions, in terms of
the quality of the partitions one can achieve by optimizing them. To this end,
we evaluated the ability of each objective function to reconstruct the
underlying structure of a large set of synthetic and real-world networks.
Comment: Main text: 14 pages, 4 figures, 1 table. Supplementary information: 19 pages, 8 figures, 5 tables
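As background for the abstract above, the standard Newman-Girvan modularity of a partition can be written as $Q = \sum_c \left[ L_c/m - (d_c/2m)^2 \right]$, where $m$ is the number of edges, $L_c$ the number of edges inside community $c$, and $d_c$ the total degree of its vertices. The sketch below only evaluates this score for a given partition of an undirected, unweighted graph; it does not reproduce the paper's search strategy, and the function name `modularity` and the example graph are assumptions.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition (undirected, unweighted)."""
    m = len(edges)
    internal = defaultdict(int)    # L_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: sum of degrees of vertices in c
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


if __name__ == "__main__":
    # Hypothetical example: two triangles joined by a single edge.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    one_group = {v: "a" for v in range(6)}
    print(modularity(edges, two_groups))  # 5/14, about 0.357
    print(modularity(edges, one_group))   # 0.0
```

An optimiser of the kind the abstract describes would search over assignments like `two_groups` to maximise this score.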
A Hierarchical, Fuzzy Inference Approach to Data Filtration and Feature Prioritization in the Connected Manufacturing Enterprise
The current big data landscape is one in which the technology and capability to capture and store data have preceded and outpaced the corresponding capability to analyze and interpret it. This has led naturally to the development of elegant and powerful algorithms for data mining, machine learning, and artificial intelligence to harness the potential of the big data environment. A competing reality, however, is that limitations exist in how and to what extent human beings can process complex information. The convergence of these realities is a tension between the technical sophistication or elegance of a solution and its transparency or interpretability by the human data scientist or decision maker. This dissertation, contextualized in the connected manufacturing enterprise, presents an original Fuzzy Approach to Feature Reduction and Prioritization (FAFRAP) that is designed to assist the data scientist in filtering and prioritizing data for inclusion in supervised machine learning models. A set of sequential filters reduces the initial set of independent variables, and a fuzzy inference system outputs a crisp numeric value for each feature, which is used to rank and prioritize features for inclusion in model training. Additionally, the fuzzy inference system outputs a descriptive label to assist in the interpretation of the feature’s usefulness with respect to the problem of interest. The model is tested on three publicly available datasets from an online machine learning data repository and later applied to a case study in electronic assembly manufacture. Consistency of model results is experimentally verified using Fisher’s Exact Test, and results of filtered models are compared to results obtained with the unfiltered sets of features using a proposed novel metric, the performance-size ratio (PSR).