Hierarchical Design Based Intrusion Detection System For Wireless Ad hoc Network
In recent years, wireless ad hoc sensor networks have become popular in both civil
and military applications. However, security is one of the significant challenges for
sensor networks because they are deployed in open and unprotected environments.
Because cryptographic mechanisms alone are not enough to protect a sensor network from
external attacks, an intrusion detection system needs to be introduced. Although
intrusion prevention is one of the major and efficient defenses
against attacks, there may be attacks for which no prevention method is
known. Besides protecting the system from some known attacks, an intrusion
detection system gathers information about attack techniques and
helps in the development of intrusion prevention systems. In addition to
reviewing the attacks currently known in wireless sensor networks, this paper
examines current efforts toward intrusion detection for wireless
sensor networks. We propose a hierarchical intrusion detection
architecture that fits the current demands and restrictions
of wireless ad hoc sensor networks. In the proposed
architecture we follow a clustering mechanism to build a four-level
hierarchical network, which enhances network scalability over large geographical
areas, and we use both anomaly and misuse detection techniques for intrusion
detection. We introduce a policy-based detection mechanism as well as intrusion
response, together with the GSM cell concept, for the intrusion detection architecture.
Comment: 16 pages, International Journal of Network Security & Its
Applications (IJNSA), Vol.2, No.3, July 2010. arXiv admin note: text overlap
with arXiv:1111.1933 by other author
apk2vec: Semi-supervised multi-view representation learning for profiling Android applications
Building behavior profiles of Android applications (apps) with holistic, rich
and multi-view information (e.g., incorporating several semantic views of an
app such as API sequences, system calls, etc.) would serve downstream
analytics tasks such as app categorization, recommendation and malware analysis
significantly better. Towards this goal, we design a semi-supervised
Representation Learning (RL) framework named apk2vec to automatically generate
a compact representation (aka profile/embedding) for a given app. More
specifically, apk2vec has the following three unique characteristics, which make
it an excellent choice for large-scale app profiling: (1) it encompasses
information from multiple semantic views such as API sequences, permissions,
etc., (2) being a semi-supervised embedding technique, it can make use of
labels associated with apps (e.g., malware family or app category labels) to
build high-quality app profiles, and (3) it combines RL and feature hashing,
which allows it to efficiently build profiles of apps that stream over time
(i.e., online learning). The resulting semi-supervised multi-view hash
embeddings of apps could then be used for a wide variety of downstream tasks
such as the ones mentioned above. Our extensive evaluations with more than
42,000 apps demonstrate that apk2vec's app profiles could significantly
outperform state-of-the-art techniques in four app analytics tasks, namely
malware detection, familial clustering, app clone detection and app
recommendation.
Comment: International Conference on Data Mining, 201
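The feature hashing mentioned in point (3) can be illustrated with a minimal sketch of the generic hashing trick. This is not apk2vec's implementation: the function name `hash_features`, the `view:token` naming scheme, the vector size and the use of MD5 are all assumptions made for illustration. The idea is that tokens from any view are mapped into a fixed-size vector, so apps arriving later never force the vocabulary to grow.

```python
import hashlib


def hash_features(tokens, dim=16):
    """Map an arbitrary list of string tokens to a fixed-size vector.

    Generic signed hashing trick (illustrative only, not apk2vec itself):
    each token picks a slot and a sign from its hash digest.
    """
    vec = [0.0] * dim
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % dim  # slot for this token
        sign = 1.0 if digest[8] & 1 == 0 else -1.0       # sign reduces collision bias
        vec[index] += sign
    return vec


# Hypothetical tokens from two views of one app, prefixed by view name.
app_tokens = [
    "api:getDeviceId",
    "api:sendTextMessage",
    "perm:READ_SMS",
    "perm:INTERNET",
]
profile = hash_features(app_tokens, dim=16)
```

Because the output length depends only on `dim`, a new app with previously unseen tokens can be hashed without rebuilding anything, which is what makes this kind of encoding suitable for streaming data.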
An Enhanced Spectral Clustering Algorithm with S-Distance
Calculating and monitoring customer churn metrics is important for companies to retain customers and earn more profit in business. In this study, a churn prediction framework is developed using a modified spectral clustering (SC). The similarity measure plays a crucial role in clustering when predicting churn accurately from industrial data. The linear Euclidean distance in the traditional SC is replaced by the non-linear S-distance (Sd). The Sd is deduced from the concept of S-divergence (SD). Several characteristics of Sd are discussed in this work. Experiments are conducted to validate the proposed clustering algorithm on four synthetic, eight UCI, two industrial databases and one telecommunications database related to customer churn. Three existing clustering algorithms (k-means, density-based spatial clustering of applications with noise, and conventional SC) are also implemented on the above-mentioned 15 databases.
The empirical outcomes show that the proposed clustering algorithm beats the three existing clustering algorithms in terms of Jaccard index, f-score, recall, precision and accuracy. Finally, we also test the significance of the clustering results with Wilcoxon's signed-rank test, Wilcoxon's rank-sum test, and sign tests. The comparative study shows that the outcomes of the proposed algorithm are interesting, especially in the case of clusters of arbitrary shape.
This work is partially supported by the project "Prediction of diseases through computer assisted diagnosis system using images captured by minimally-invasive and non-invasive modalities", Computer Science and Engineering, PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur India (under ID: SPARCMHRD-231). This work is also partially supported by the project "Smart Solutions in Ubiquitous Computing Environments", Grant Agency of Excellence, University of Hradec Kralove, Faculty of Informatics and Management, Czech Republic (under ID: UHK-FIM-GE-2204/2021); project at Universiti Teknologi Malaysia (UTM) under Research University Grant Vot-20H04, Malaysia Research University Network (MRUN) Vot 4L876 and the Fundamental Research Grant Scheme (FRGS) Vot5F073 supported by the Ministry of Education Malaysia for the completion of the research.
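To make the replacement of Euclidean distance concrete, the following sketch builds the affinity matrix that a spectral clustering step would consume. The abstract does not give the formula for the S-distance, so the definition used here is an assumption: the vector form commonly derived from the S-divergence for strictly positive vectors, d(x, y) = sqrt(sum_i [log((x_i + y_i)/2) - (log x_i + log y_i)/2]). The Gaussian kernel, the `sigma` parameter and both function names are likewise illustrative choices, not the authors' code.

```python
from math import exp, log, sqrt


def s_distance(x, y):
    """Assumed S-distance between two strictly positive vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if any(v <= 0 for v in list(x) + list(y)):
        raise ValueError("S-distance is only defined for positive entries")
    total = sum(log((a + b) / 2) - (log(a) + log(b)) / 2 for a, b in zip(x, y))
    return sqrt(max(total, 0.0))  # clamp tiny negative rounding errors


def affinity_matrix(points, sigma=1.0):
    """Gaussian affinity built on the S-distance instead of Euclidean distance."""
    return [
        [exp(-s_distance(p, q) ** 2 / (2 * sigma ** 2)) for q in points]
        for p in points
    ]


# Hypothetical positive feature vectors (e.g. rescaled usage metrics).
points = [[1.0, 2.0], [1.1, 2.1], [8.0, 9.0]]
W = affinity_matrix(points, sigma=0.5)
```

The resulting matrix `W` is symmetric with ones on the diagonal and could be passed to any spectral clustering routine that accepts a precomputed affinity.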
USER NAVIGATION DESIGN WITH DOCUMENT CLUSTERING FOR DIGITAL LEARNING RESOURCE IN INDONESIAN (CASE STUDY: Q-JOURNAL PT TELKOM INDONESIA)
Information technologies are growing rapidly today. Content providers also use them to deliver
web-based applications that help educational institutions maintain their knowledge resources,
such as e-journals. Many e-journal applications have been built and are easy to find, but
most of them host English-language journals. The few e-journal applications in Indonesian are
rarely well maintained. Q-Journal, developed by PT Telkom Indonesia, is an integrated digital platform
for discovering and publishing research results such as journals. To be competitive, an e-journal
system must provide a better experience to users by creating a learning journal contents system.
Learning contents means that the system is able to understand what users need by learning their
behavior and storing it in the form of a user profile. Once a user profile is created, the system suggests other
contents that match the profile on the next visit or search.
Those capabilities could only be implemented by using data mining techniques. Combined with suitable
algorithms and approaches, the application could be effective and efficient for both the system
itself and the user. In this research, the Hierarchical Agglomerative Clustering (HAC) method proved capable of
providing good accuracy in topic clustering, and so it generated learning content that serves as a good
user navigation system for digital learning resources. The clustering is validated with the cophenetic
correlation coefficient (CPCC).
As a result, the average CPCC of the hierarchical clustering results
is 0.91, which indicates that the clustering performs well. The implementation of the
proposed user navigation design proved able to provide better results for the existing e-journal
system. The topic-subtopic hierarchy can be organized with a keyword extraction approach
so that it becomes good user navigation. The proposed system can facilitate contextual browsing
by considering user information derived from users' interests. Implementing the user
navigation design makes it easier for users to get information relevant to their interests. The user navigation
generating system is intended to let a user find a learning path adjusted to the user's interest
in a particular learning topic.
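The CPCC validation described above can be sketched in a few lines. The abstract does not state which linkage criterion or document distance was used, so average linkage and the toy distance matrix below are assumptions, and the function names are illustrative. The CPCC is the Pearson correlation between the original pairwise distances and the cophenetic distances, that is, the heights at which each pair is first merged in the dendrogram.

```python
from itertools import combinations
from math import sqrt


def cophenetic_distances(dist):
    """Average-linkage HAC on a symmetric distance matrix.

    Returns a matrix whose (i, j) entry is the merge height at which
    items i and j first end up in the same cluster.
    """
    n = len(dist)
    clusters = {i: [i] for i in range(n)}
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            height = sum(dist[i][j] for i, j in pairs) / len(pairs)
            if best is None or height < best[0]:
                best = (height, a, b)
        height, a, b = best
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        clusters[a] = clusters[a] + clusters.pop(b)  # merge b into a
    return coph


def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def cpcc(dist):
    """Cophenetic correlation coefficient of the average-linkage dendrogram."""
    coph = cophenetic_distances(dist)
    pairs = list(combinations(range(len(dist)), 2))
    return pearson([dist[i][j] for i, j in pairs], [coph[i][j] for i, j in pairs])


# Toy example: four documents positioned at 0, 1, 10 and 11 on a line.
positions = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in positions] for p in positions]
score = cpcc(dist)
```

A value close to 1 means the dendrogram preserves the original pairwise distances well, which is the sense in which a CPCC of 0.91 indicates a good clustering.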
Can Clustering Improve Requirements Traceability? A Tracelab-enabled Study
Software permeates every aspect of our modern lives. In many applications, such as the software for airplane flight controls or nuclear power control systems, software failures can have catastrophic consequences. As we place so much trust in software, how can we know if it is trustworthy? Through software assurance, we can attempt to quantify just that.
Building complex, high-assurance software is no simple task. The difficult information landscape of a software engineering project can make verification and validation, the process by which the assurance of software is assessed, very difficult. In order to manage the inevitable information overload of complex software projects, we need software traceability, the ability to describe and follow the life of a requirement in both the forward and backward directions.
The Center of Excellence for Software Traceability (CoEST) has created a compelling research agenda with the goal of ubiquitous traceability by 2035. As part of this goal, they have developed TraceLab, a visual experimental workbench built to support design, implementation, and execution of traceability experiments. Through our collaboration with CoEST, we have made several contributions to TraceLab and its community.
This work contributes to the goals of the traceability research community. The three key contributions are (a) a machine learning component package for TraceLab featuring six (6) classifier algorithms, five (5) clustering algorithms, and a total of over 40 components for creating TraceLab experiments, built upon the WEKA machine learning package, as well as implementing methods outside of WEKA; (b) the design for an automated tracing system that uses clustering to decompose the task of tracing into many smaller tracing subproblems; and (c) an implementation of several key components of this tracing system using TraceLab and its experimental evaluation.
Clustering and Classification of Multi-domain Proteins
Rapid development of next-generation sequencing technology has led to an unprecedented growth in protein sequence data repositories over the last decade. The majority of these proteins lack structural and functional characterization. This necessitates the design and development of fast, efficient, and sensitive computational tools and algorithms that can classify these proteins into functionally coherent groups.
Domains are fundamental units of protein structure and function. Multi-domain proteins are extremely complex compared with proteins that have single or no domains. They exhibit network-like complex evolutionary events such as domain shuffling, domain loss, and domain gain. These events, therefore, cannot be represented by conventional protein clustering algorithms such as phylogenetic reconstruction and Markov clustering. In this thesis, a multi-domain protein classification system is developed primarily based on the domain composition of protein sequences. Using the principle of co-clustering (biclustering), both proteins and domains are simultaneously clustered, where each bicluster contains a subset of proteins and domains forming a complete bipartite graph. These clusters are then converted into a network of biclusters based on the domains shared between the clusters, thereby classifying the proteins into similar protein families.
We applied our biclustering network approach to a multi-domain protein family, Regulator of G-protein Signalling (RGS) proteins, where heterogeneous domain composition exists among subfamilies. Our approach showed mostly consistent clustering with the existing RGS subfamilies. The average maximum Jaccard Index scores for the clusters obtained by Markov Clustering and phylogenetic clustering methods against the biclusters were 0.64 and 0.60, respectively. Compared to other clustering methods, our approach uses auxiliary domain information of each protein, and therefore generates more functionally coherent protein clusters and differentiates the protein subfamilies from one another. Biclustered networks on nine complete proteomes showed that the number of multi-domain proteins included in connected biclusters rapidly increased with genome complexity, from 48.5% in bacteria to 80% in eukaryotes.
Protein clustering and classification that incorporates such a wealth of additional domain information on protein networks has wide applications and would impact functional analysis and characterization of novel proteins.
Advisers: Stephen D. Scott and Etsuko N. Moriyam
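The "average maximum Jaccard Index" comparison reported above can be read as follows: for each cluster from one method, take its best Jaccard match among the biclusters, then average those best scores. The abstract does not define the computation precisely, so this reading, the function names and the toy protein identifiers are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard index of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def average_max_jaccard(clusters, biclusters):
    """For each cluster, find its best-matching bicluster, then average."""
    best_scores = [max(jaccard(c, b) for b in biclusters) for c in clusters]
    return sum(best_scores) / len(best_scores)


# Hypothetical protein memberships: two clusters from a baseline method
# compared against the protein sets of two biclusters.
baseline_clusters = [{"p1", "p2", "p3"}, {"p4", "p5"}]
bicluster_proteins = [{"p1", "p2"}, {"p3", "p4", "p5"}]
score = average_max_jaccard(baseline_clusters, bicluster_proteins)
```

Here each baseline cluster shares two of three proteins with its best match, so both best scores are 2/3 and so is the average. A score of 1.0 would mean every baseline cluster is reproduced exactly by some bicluster.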