Unsupervised clustering of IoT signals through feature extraction and self organizing maps
The scope of this thesis is to build a clustering model that inspects the structural properties of a dataset of IoT signals and classifies them through unsupervised clustering algorithms. To this end, a feature-based representation of the signals is used. Different feature selection algorithms are then applied to obtain reduced feature spaces, so as to decrease the computational cost and the memory demand. The IoT signals are then clustered using Self-Organizing Maps (SOM) and evaluated.
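As an illustration of the SOM clustering step described in this abstract, the following is a minimal sketch and not the thesis's implementation. It assumes a one-dimensional map, a linear initialization between the data bounds, and linearly decaying learning rate and neighborhood width. The names `train_som` and `best_matching_unit` and the toy feature vectors are hypothetical and do not come from the source.

```python
import math

def best_matching_unit(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )

def train_som(data, n_nodes=4, epochs=50, lr0=0.5, sigma0=1.0):
    """Train a 1-D SOM (a chain of n_nodes >= 2 nodes) on feature vectors."""
    dim = len(data[0])
    lo = [min(v[j] for v in data) for j in range(dim)]
    hi = [max(v[j] for v in data) for j in range(dim)]
    # Assumed initialization: spread the nodes evenly between the data bounds.
    weights = [
        [lo[j] + (hi[j] - lo[j]) * i / (n_nodes - 1) for j in range(dim)]
        for i in range(n_nodes)
    ]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.1)  # neighborhood width shrinks
            bmu = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighborhood: nodes near the winner move more.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
                for j in range(dim):
                    w[j] += lr * h * (x[j] - w[j])
            step += 1
    return weights

# Hypothetical 2-D feature vectors forming two well-separated groups.
group_a = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2]]
group_b = [[5.0, 5.1], [5.2, 4.9], [4.9, 5.3], [5.3, 5.2]]
signals = [v for pair in zip(group_a, group_b) for v in pair]

weights = train_som(signals)
labels_a = {best_matching_unit(weights, v) for v in group_a}
labels_b = {best_matching_unit(weights, v) for v in group_b}
print(labels_a, labels_b)
```

Each signal is assigned to the node that wins it, so the node index acts as a cluster label. A real study would likely use a two-dimensional map and a dedicated library; the chain topology here only keeps the sketch short.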
Partition Decoupling for Multi-gene Analysis of Gene Expression Profiling Data
We present the extension and application of a new unsupervised statistical
learning technique--the Partition Decoupling Method--to gene expression data.
Because it has the ability to reveal non-linear and non-convex geometries
present in the data, the PDM is an improvement over typical gene expression
analysis algorithms, permitting a multi-gene analysis that can reveal
phenotypic differences even when the individual genes do not exhibit
differential expression. Here, we apply the PDM to publicly-available gene
expression data sets, and demonstrate that we are able to identify cell types
and treatments with higher accuracy than is obtained through other approaches.
By applying it in a pathway-by-pathway fashion, we demonstrate how the PDM may
be used to find sets of mechanistically-related genes that discriminate
phenotypes.
Data Management and Mining in Astrophysical Databases
We analyse the issues involved in the management and mining of astrophysical
data. The traditional approach to data management in the astrophysical field is
not able to keep up with the increasing size of the data gathered by modern
detectors. An essential role in the astrophysical research will be assumed by
automatic tools for information extraction from large datasets, i.e. data
mining techniques, such as clustering and classification algorithms. This asks
for an approach to data management based on data warehousing, emphasizing the
efficiency and simplicity of data access; efficiency is obtained using
multidimensional access methods and simplicity is achieved by properly handling
metadata. Clustering and classification techniques, on large datasets, pose
additional requirements: computational and memory scalability with respect to
the data size, interpretability and objectivity of clustering or classification
results. In this study we address some possible solutions.
Comment: 10 pages
DRBM-ClustNet: A Deep Restricted Boltzmann-Kohonen Architecture for Data Clustering
A Bayesian Deep Restricted Boltzmann-Kohonen architecture for data clustering
termed as DRBM-ClustNet is proposed. This core-clustering engine consists of a
Deep Restricted Boltzmann Machine (DRBM) for processing unlabeled data by
creating new features that are mutually uncorrelated and have large variance.
Next, the number of clusters is predicted using the Bayesian
Information Criterion (BIC), followed by a Kohonen Network-based clustering
layer. The processing of unlabeled data is done in three stages for efficient
clustering of the non-linearly separable datasets. In the first stage, DRBM
performs non-linear feature extraction by capturing the highly complex data
representation by projecting the feature vectors from their original
dimensionality into a new one. Most clustering algorithms require the number of clusters to be
decided a priori; hence, to automate the choice of the number of clusters, we
use BIC in the second stage. In the third stage, the number of clusters derived from BIC
forms the input for the Kohonen network, which performs clustering of the
feature-extracted data obtained from the DRBM. This method overcomes the
general disadvantages of clustering algorithms like the prior specification of
the number of clusters, convergence to local optima and poor clustering
accuracy on non-linear datasets. In this research we use two synthetic
datasets, fifteen benchmark datasets from the UCI Machine Learning repository,
and four image datasets to analyze the DRBM-ClustNet. The proposed framework is
evaluated based on clustering accuracy and ranked against other
state-of-the-art clustering methods. The obtained results demonstrate that the
DRBM-ClustNet outperforms state-of-the-art clustering algorithms.
Comment: 14 pages, 7 figures
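For reference, the model-selection step in the second stage can be written with the standard definition of BIC. This is a sketch under stated assumptions: the abstract does not say which likelihood model is fitted, and the symbols are introduced here for illustration, with $n$ the number of samples, $p_K$ the number of free parameters of a model with $K$ clusters, and $\hat{L}_K$ its maximized likelihood.

```latex
\mathrm{BIC}(K) = p_K \ln n - 2 \ln \hat{L}_K,
\qquad
K^{*} = \arg\min_{K} \mathrm{BIC}(K)
```

Under this convention, the value $K^{*}$ with the lowest BIC would be the cluster count passed to the Kohonen layer in the third stage.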
Medical imaging analysis with artificial neural networks
Given that neural networks have been widely reported in the medical imaging research community, we provide a focused literature survey of recent neural network developments in computer-aided diagnosis, in medical image segmentation and edge detection for visual content analysis, and in medical image registration for pre-processing and post-processing, with the aims of increasing awareness of how neural networks can be applied to these areas and providing a foundation for further research and practical development. Representative techniques and algorithms are explained in detail to provide examples illustrating: (i) how a known neural network with fixed structure and training procedure could be applied to resolve a medical imaging problem; (ii) how medical images could be analysed, processed, and characterised by neural networks; and (iii) how neural networks could be extended further to resolve problems relevant to medical imaging. The concluding section compares many neural network applications to provide a global view of computational intelligence with neural networks in medical imaging.
ANTIDS: Self-Organized Ant-based Clustering Model for Intrusion Detection System
The security of computers and the networks that connect them is becoming
increasingly significant. Computer security is defined as the protection
of computing systems against threats to confidentiality, integrity, and
availability. There are two types of intruders: the external intruders who are
unauthorized users of the machines they attack, and internal intruders, who
have permission to access the system with some restrictions. Because it is
increasingly unlikely that a system administrator can recognize an attack and
intervene manually to stop it, there is growing recognition that ID systems
have much to gain from basing their principles on the behavior of complex
natural systems, namely self-organization, which allows for a truly
distributed and collective perception of this phenomenon. With that aim in
mind, this work presents a
self-organized ant colony based intrusion detection system (ANTIDS) to detect
intrusions in a network infrastructure. Its performance is compared with that of
conventional soft computing paradigms such as Decision Trees, Support Vector
Machines and Linear Genetic Programming to model fast, online and efficient
intrusion detection systems.
Comment: 13 pages, 3 figures, Swarm Intelligence and Patterns (SIP) special
track at WSTST 2005, Muroran, Japan