
    Hierarchical Design Based Intrusion Detection System For Wireless Ad hoc Network

    In recent years, wireless ad hoc sensor networks have become popular in both civil and military applications. However, security is one of the significant challenges for sensor networks because they are deployed in open and unprotected environments. As cryptographic mechanisms are not enough to protect sensor networks from external attacks, an intrusion detection system needs to be introduced. Although intrusion prevention is one of the major and efficient methods against attacks, there may be attacks for which no prevention method is known. Besides protecting the system from some known attacks, an intrusion detection system gathers information about attack techniques and helps in the development of intrusion prevention systems. In addition to reviewing the attacks currently present in wireless sensor networks, this paper examines current efforts toward intrusion detection systems for wireless sensor networks. We propose an intrusion detection system based on a hierarchical architectural design that fits the current demands and restrictions of wireless ad hoc sensor networks. In the proposed architecture, we follow a clustering mechanism to build a four-level hierarchical network, which enhances network scalability to large geographical areas, and we use both anomaly and misuse detection techniques for intrusion detection. We introduce a policy-based detection mechanism as well as intrusion response, together with the GSM cell concept, for the intrusion detection architecture. Comment: 16 pages, International Journal of Network Security & Its Applications (IJNSA), Vol.2, No.3, July 2010. arXiv admin note: text overlap with arXiv:1111.1933 by other author

    apk2vec: Semi-supervised multi-view representation learning for profiling Android applications

    Building behavior profiles of Android applications (apps) with holistic, rich and multi-view information (e.g., incorporating several semantic views of an app such as API sequences, system calls, etc.) would help serve downstream analytics tasks such as app categorization, recommendation and malware analysis significantly better. Towards this goal, we design a semi-supervised Representation Learning (RL) framework named apk2vec to automatically generate a compact representation (aka profile/embedding) for a given app. More specifically, apk2vec has the following three unique characteristics, which make it an excellent choice for large-scale app profiling: (1) it encompasses information from multiple semantic views such as API sequences, permissions, etc., (2) being a semi-supervised embedding technique, it can make use of labels associated with apps (e.g., malware family or app category labels) to build high-quality app profiles, and (3) it combines RL and feature hashing, which allows it to efficiently build profiles of apps that stream over time (i.e., online learning). The resulting semi-supervised multi-view hash embeddings of apps can then be used for a wide variety of downstream tasks such as the ones mentioned above. Our extensive evaluations with more than 42,000 apps demonstrate that apk2vec's app profiles can significantly outperform state-of-the-art techniques in four app analytics tasks, namely malware detection, familial clustering, app clone detection and app recommendation. Comment: International Conference on Data Mining, 201
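    The abstract above mentions feature hashing as the ingredient that lets profiles be built for apps that stream in over time. The following is a minimal sketch of the plain hashing trick only, not of apk2vec's learned embedding model. The function name `hash_embed`, the choice of MD5 as the hash, and the example tokens are all assumptions made for illustration.

```python
import hashlib


def hash_embed(tokens, dim=16):
    """Map a bag of string tokens to a fixed-length vector (hashing trick).

    Hypothetical helper for illustration; not taken from apk2vec.
    """
    vec = [0.0] * dim
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        # The first four bytes of the digest pick the bucket.
        index = int.from_bytes(digest[:4], "big") % dim
        # One further byte picks a sign, so collisions tend to cancel out.
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    return vec


# Two "views" of one app (made-up tokens), hashed separately and concatenated.
api_view = hash_embed(["getDeviceId", "sendTextMessage"], dim=8)
perm_view = hash_embed(["SEND_SMS", "READ_CONTACTS", "INTERNET"], dim=8)
profile = api_view + perm_view
```

    Because the vector length is fixed by `dim` rather than by the vocabulary, a token never seen before still lands in an existing bucket, which is what makes this kind of representation usable in an online setting.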

    An Enhanced Spectral Clustering Algorithm with S-Distance

    This work is partially supported by the project "Prediction of diseases through computer assisted diagnosis system using images captured by minimally-invasive and non-invasive modalities", Computer Science and Engineering, PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur India (under ID: SPARCMHRD-231). This work is also partially supported by the project "Smart Solutions in Ubiquitous Computing Environments", Grant Agency of Excellence, University of Hradec Kralove, Faculty of Informatics and Management, Czech Republic (under ID: UHK-FIM-GE-2204/2021); project at Universiti Teknologi Malaysia (UTM) under Research University Grant Vot-20H04, Malaysia Research University Network (MRUN) Vot 4L876 and the Fundamental Research Grant Scheme (FRGS) Vot5F073 supported by the Ministry of Education Malaysia for the completion of the research. Calculating and monitoring customer churn metrics is important for companies to retain customers and earn more profit. In this study, a churn prediction framework is developed using a modified spectral clustering (SC) algorithm. The similarity measure plays a crucial role in clustering when predicting churn accurately from industrial data. The linear Euclidean distance in traditional SC is replaced by the non-linear S-distance (Sd), which is derived from the concept of S-divergence (SD). Several characteristics of Sd are discussed in this work. Experiments are conducted to validate the proposed clustering algorithm on four synthetic, eight UCI, and two industrial databases, and one telecommunications database related to customer churn. Three existing clustering algorithms (k-means, density-based spatial clustering of applications with noise, and conventional SC) are also implemented on the above-mentioned 15 databases. The empirical outcomes show that the proposed clustering algorithm beats the three existing clustering algorithms in terms of Jaccard index, f-score, recall, precision and accuracy. Finally, we also test the significance of the clustering results with Wilcoxon's signed-rank test, Wilcoxon's rank-sum test, and sign tests. The comparative study shows that the outcomes of the proposed algorithm are interesting, especially in the case of clusters of arbitrary shape.

    USER NAVIGATION DESIGN WITH DOCUMENT CLUSTERING FOR DIGITAL LEARNING RESOURCE IN INDONESIAN (CASE STUDY: Q-JOURNAL PT TELKOM INDONESIA)

    Information technologies are growing rapidly today. They are also used by content providers to deliver web-based applications that help educational institutions maintain their knowledge resources, such as e-journals. Many e-journal applications have been built and are easy to find, but most of them host English-language journals. Of the few e-journal applications in Indonesian, most are rarely well maintained. Q-Journal, developed by PT Telkom Indonesia, is an integrated digital platform for discovering and publishing research results such as journals. To be competitive, an e-journal system must provide a better experience to users by becoming a learning content system. Learning content means that the system can understand what users need by learning their behavior and storing it in the form of a user profile. Once a user profile is created, the system suggests other content that matches the profile on the next visit or search. Those capabilities can only be implemented using data mining techniques. Combined with suitable algorithms and approaches, the application can be effective and efficient both for the system itself and for the user. In this research, the Hierarchical Agglomerative Clustering (HAC) method is shown to provide good accuracy in topic clustering, so it generates learning content that serves as a good user navigation system for digital learning resources. The clustering is validated with the cophenetic correlation coefficient (CPCC). The average CPCC of the hierarchical clustering results is 0.91, which indicates that the clustering performs well. The implementation of the proposed user navigation design has been shown to provide a better result than the existing e-journal system. The topic-subtopic hierarchy can be organized with a keyword extraction approach so that it becomes a good user navigation structure. The proposed system can facilitate contextual browsing by considering user information derived from users' interests. Implementing the user navigation design makes it easier for users to get information suited to their interests. The user navigation generating system is intended to let a user find a learning path adjusted to the user's interest in a particular learning topic
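    The abstract above validates its hierarchical clustering with the cophenetic correlation coefficient (CPCC). The following is a minimal sketch of how such a value can be computed, assuming single linkage and Euclidean distance on toy 2-D points; the abstract does not state which linkage or distance the authors used, and the function names are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def cophenetic_distances(points):
    """Single-linkage agglomerative clustering (an assumed linkage).

    Returns {(i, j): height at which points i and j first share a cluster}.
    """
    clusters = {i: [i] for i in range(len(points))}
    coph = {}
    while len(clusters) > 1:
        # Closest pair of clusters: smallest distance between any two members.
        (a, b), height = min(
            (
                ((a, b), min(dist(points[i], points[j])
                             for i in clusters[a] for j in clusters[b]))
                for a, b in combinations(clusters, 2)
            ),
            key=lambda item: item[1],
        )
        # Every cross pair is joined for the first time at this height.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[(min(i, j), max(i, j))] = height
        clusters[a] = clusters[a] + clusters.pop(b)
    return coph


def cpcc(points):
    """Pearson correlation between original and cophenetic distances."""
    coph = cophenetic_distances(points)
    pairs = sorted(coph)
    x = [dist(points[i], points[j]) for i, j in pairs]
    y = [coph[p] for p in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((u - mx) * (v - my) for u, v in zip(x, y))
    var_x = sum((u - mx) ** 2 for u in x)
    var_y = sum((v - my) ** 2 for v in y)
    return cov / sqrt(var_x * var_y)


# Two well-separated pairs of points: the dendrogram should fit closely.
score = cpcc([(0, 0), (0, 1), (5, 0), (5, 1)])
```

    A value close to 1 means the dendrogram's merge heights preserve the original pairwise distances well, which is the sense in which a CPCC of 0.91 is read as good clustering performance.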

    Can Clustering Improve Requirements Traceability? A Tracelab-enabled Study

    Software permeates every aspect of our modern lives. In many applications, such as the software for airplane flight controls or nuclear power control systems, software failures can have catastrophic consequences. As we place so much trust in software, how can we know if it is trustworthy? Through software assurance, we can attempt to quantify just that. Building complex, high-assurance software is no simple task. The difficult information landscape of a software engineering project can make verification and validation, the process by which the assurance of software is assessed, very difficult. In order to manage the inevitable information overload of complex software projects, we need software traceability, the ability to describe and follow the life of a requirement in both the forward and backward directions. The Center of Excellence for Software Traceability (CoEST) has created a compelling research agenda with the goal of ubiquitous traceability by 2035. As part of this goal, they have developed TraceLab, a visual experimental workbench built to support the design, implementation, and execution of traceability experiments. Through our collaboration with CoEST, we have made several contributions to TraceLab and its community. This work contributes to the goals of the traceability research community. The three key contributions are (a) a machine learning component package for TraceLab featuring six (6) classifier algorithms, five (5) clustering algorithms, and a total of over 40 components for creating TraceLab experiments, built upon the WEKA machine learning package, as well as implementing methods outside of WEKA; (b) the design for an automated tracing system that uses clustering to decompose the task of tracing into many smaller tracing subproblems; and (c) an implementation of several key components of this tracing system using TraceLab and its experimental evaluation

    Clustering and Classification of Multi-domain Proteins

    Rapid development of next-generation sequencing technology has led to unprecedented growth in protein sequence data repositories over the last decade. The majority of these proteins lack structural and functional characterization. This necessitates the design and development of fast, efficient, and sensitive computational tools and algorithms that can classify these proteins into functionally coherent groups. Domains are fundamental units of protein structure and function. Multi-domain proteins are extremely complex compared with proteins that have single or no domains. They exhibit network-like complex evolutionary events such as domain shuffling, domain loss, and domain gain. These events, therefore, cannot be represented by conventional protein clustering algorithms such as phylogenetic reconstruction and Markov clustering. In this thesis, a multi-domain protein classification system is developed based primarily on the domain composition of protein sequences. Using the principle of co-clustering (biclustering), both proteins and domains are clustered simultaneously, where each bicluster contains a subset of proteins and domains forming a complete bipartite graph. These clusters are then converted into a network of biclusters based on the domains shared between the clusters, thereby classifying the proteins into similar protein families. We applied our biclustering network approach to a multi-domain protein family, Regulator of G-protein Signalling (RGS) proteins, where heterogeneous domain composition exists among subfamilies. Our approach showed clustering mostly consistent with the existing RGS subfamilies. The average maximum Jaccard Index scores for the clusters obtained by Markov Clustering and phylogenetic clustering methods against the biclusters were 0.64 and 0.60, respectively. Compared to other clustering methods, our approach uses auxiliary domain information for each protein and therefore generates more functionally coherent protein clusters and differentiates the protein subfamilies from one another. Biclustered networks on nine complete proteomes showed that the proportion of multi-domain proteins included in connected biclusters increased rapidly with genome complexity, from 48.5% in bacteria to 80% in eukaryotes. Protein clustering and classification that incorporates such a wealth of additional domain information on protein networks has wide applications and would impact the functional analysis and characterization of novel proteins. Advisers: Stephen D. Scott and Etsuko N. Moriyam
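    The abstract above compares clusterings through an "average maximum Jaccard Index". The following is a minimal sketch of one common reading of that phrase: for each cluster in one clustering, take its best Jaccard match in the other clustering, then average those maxima. The direction of comparison, the function names, and the toy protein identifiers are assumptions, since the abstract does not spell them out.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections of items."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def average_max_jaccard(reference, candidate):
    """Mean, over reference clusters, of the best match among candidates."""
    best = [max(jaccard(r, c) for c in candidate) for r in reference]
    return sum(best) / len(best)


# Toy protein IDs (made up): two clusterings of the same five proteins.
biclusters = [{"p1", "p2", "p3"}, {"p4", "p5"}]
other_method = [{"p1", "p2"}, {"p3", "p4", "p5"}]
score = average_max_jaccard(biclusters, other_method)
```

    Here each reference cluster's best match shares two of three items, so the score is 2/3. Note that the measure is not symmetric in general, so swapping the two arguments can change the result.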
