Dark Web Data Classification Using Neural Network
There are several issues associated with mining Dark Web structural patterns, including large amounts of redundant and irrelevant information, which accompany many types of cybercrime such as illegal trade, forums, terrorist activity, and illegal online shopping. Understanding online criminal behavior is challenging because the data is available in vast amounts. An approach is required for learning criminal behavior in order to meet the recent demand for improving labeled data for user profiling; Dark Web structural-pattern mining on multidimensional data sets gives uncertain results. Uncertain classification results make it impossible to predict user behavior. Since multidimensional data mixes features, it has an adverse influence on classification. The flood of data associated with the Dark Web has prevented appropriate solutions from being delivered as needed. In the research design, a Fusion NN (Neural Network)-S3VM model for criminal network activity prediction is proposed based on the neural network; NN-S3VM can improve the prediction
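The paper's Fusion NN-S3VM implementation is not reproduced here, but the semi-supervised idea it builds on (exploiting unlabeled samples alongside a small labeled set) can be sketched with scikit-learn's self-training wrapper around an SVM, an S3VM-style transductive setup. The data and parameters below are illustrative assumptions, not the paper's code.

```python
# Illustrative sketch only: self-training with an SVM base learner,
# standing in for the (unavailable) Fusion NN-S3VM model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hide 80% of the labels; -1 marks a sample as unlabeled.
rng = np.random.RandomState(0)
unlabeled = rng.rand(len(y)) < 0.8
y_partial = y.copy()
y_partial[unlabeled] = -1

# Self-training: fit on labeled data, then iteratively pseudo-label
# the unlabeled samples the SVM is most confident about.
model = SelfTrainingClassifier(SVC(probability=True, random_state=0))
model.fit(X, y_partial)

# Evaluate on the samples whose labels were hidden during training.
acc = model.score(X[unlabeled], y[unlabeled])
```

The same loop works with any probabilistic base learner, including a neural network, which is the substitution the abstract describes.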
Data mining and predictive modeling of biomolecular network from biomedical literature databases
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 4(2): pp. 251-263. In this paper, we present a novel approach, Bio-IEDM (Biomedical Information Extraction and Data Mining), to integrate text mining and predictive modeling to analyze the biomolecular network from biomedical literature databases. Our method consists of two phases. In phase 1, we discuss an efficient semi-supervised learning approach to automatically extract biological relationships, such as protein-protein and protein-gene interactions, from biomedical literature databases to construct the biomolecular network. Our method automatically learns patterns based on a few user seed tuples and then extracts new tuples from the biomedical literature based on the discovered patterns. The derived biomolecular network forms a large scale-free network graph. In phase 2, we present a novel clustering algorithm that analyzes the biomolecular network graph to identify biologically meaningful subnetworks (communities). The clustering algorithm considers the characteristics of scale-free network graphs and is based on the local density of a vertex and its neighborhood functions, which can be used to find more meaningful clusters at different density levels. The experimental results indicate that our approach is very effective in extracting biological knowledge from a huge collection of biomedical literature. The integration of data mining and information extraction provides a promising direction for analyzing the biomolecular network
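The phase-1 bootstrapping step (seed tuples lead to patterns, patterns yield new tuples) can be sketched roughly as follows. The sentences, gene names, and the simple between-entity pattern heuristic are invented for illustration; Bio-IEDM's actual pattern learner is more elaborate.

```python
# Illustrative sketch of seed-tuple pattern bootstrapping.
import re

sentences = [
    "We found that RAD51 interacts with BRCA2 in vivo.",
    "Our assay shows that TP53 interacts with MDM2 in tumor cells.",
    "MYC is regulated by MAX under stress.",
]
seed = ("RAD51", "BRCA2")  # one user-supplied seed interaction

# Learn a pattern: the textual context between the two seed entities.
pattern = None
for s in sentences:
    if seed[0] in s and seed[1] in s:
        middle = s.split(seed[0], 1)[1].split(seed[1], 1)[0]
        pattern = re.compile(r"(\w+)" + re.escape(middle) + r"(\w+)")
        break

# Apply the learned pattern to extract previously unseen tuples.
new_tuples = []
for s in sentences:
    for m in pattern.finditer(s):
        if m.groups() != seed:
            new_tuples.append(m.groups())
print(new_tuples)  # → [('TP53', 'MDM2')]
```

Each newly extracted tuple becomes an edge in the growing biomolecular network graph, which phase 2 then clusters.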
Semi-supervised and Active Learning Models for Software Fault Prediction
As software continues to insinuate itself into nearly every aspect of our lives, software quality has become an extremely important issue. Software Quality Assurance (SQA) is a process that ensures the development of high-quality software. It concerns the important problem of maintaining, monitoring, and developing quality software. Accurate detection of fault-prone components in software projects is one of the most commonly practiced techniques that offers a path to high-quality products without excessive assurance expenditures. This type of quality modeling requires the availability of software modules with known fault content developed in a similar environment. However, collecting fault data at the module level, particularly in new projects, is expensive and time-consuming. Semi-supervised learning and active learning offer solutions to this problem by learning from limited labeled data while utilizing inexpensive unlabeled data. In this dissertation, we investigate semi-supervised learning and active learning approaches to the software fault prediction problem. The role of the base learner in semi-supervised learning is discussed using several state-of-the-art supervised learners. Our results showed that semi-supervised learning with an appropriate base learner leads to better performance in fault-proneness prediction than supervised learning. In addition, incorporating a pre-processing technique prior to semi-supervised learning provides a promising direction for further improving prediction performance. Active learning, which shares with semi-supervised learning the idea of utilizing unlabeled data, requires human effort for labeling fault proneness during its learning process. Empirical results showed that active learning supplemented by a dimensionality-reduction technique performs better than supervised learning on release-based data sets
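The active-learning loop described above (query the module the model is least sure about, have a human label it, retrain) can be sketched as follows. The synthetic data, logistic-regression learner, and query budget are illustrative assumptions, not the dissertation's setup.

```python
# Illustrative sketch: pool-based active learning with uncertainty sampling.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=400, n_features=8, random_state=1)

# Start with 5 labeled examples per class; the rest form the unlabeled pool.
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
pool = [i for i in range(len(y)) if i not in labeled]

clf = LogisticRegression(max_iter=1000)
for _ in range(20):  # 20 labeling rounds
    clf.fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])[:, 1]
    # Most uncertain sample: predicted probability closest to 0.5.
    query = pool[int(np.argmin(np.abs(proba - 0.5)))]
    labeled.append(query)  # in practice, ask a human for y[query]
    pool.remove(query)

acc = clf.score(X, y)
```

The point of uncertainty sampling is that each human-labeled module is chosen where it is expected to help most, which is why a small labeling budget can rival fully supervised training.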
Artificial Intelligence in Process Engineering
In recent years, the field of Artificial Intelligence (AI) has experienced a boom, driven by recent breakthroughs in computing power, AI techniques, and software architectures. Among the many fields affected by this paradigm shift, process engineering has benefited from AI. However, the published methods and applications in process engineering are diverse, and there is still much unexploited potential. Herein, a systematic overview of the current state of AI and its applications in process engineering is provided. Current applications are described and classified according to a broader systematics. Current techniques and types of AI, as well as pre- and postprocessing, are examined similarly and assigned to the previously discussed applications. Given the importance of mechanistic models in process engineering, as opposed to the purely black-box nature of most AI, reverse-engineering strategies as well as hybrid modeling are highlighted. Furthermore, a holistic strategy is formulated for the application of the current state of AI in process engineering
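The hybrid-modeling idea mentioned above, combining a mechanistic model with a data-driven correction, can be sketched as follows. The Arrhenius-style rate law, the unmodeled sinusoidal disturbance, and the residual learner are all invented for illustration.

```python
# Illustrative sketch of hybrid (grey-box) modeling: a mechanistic model
# plus an ML model trained on its residuals.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)
T = rng.uniform(300, 400, size=200)  # temperature [K]

# "Plant": mechanistic behavior plus an effect the physics misses.
true_rate = 1e3 * np.exp(-2000 / T) + 0.05 * np.sin(T / 10)

def mech(T):
    # Mechanistic part only (assumed first-order Arrhenius-style rate).
    return 1e3 * np.exp(-2000 / T)

# Train a black-box model on what the mechanistic model gets wrong.
residual_model = RandomForestRegressor(random_state=0)
residual_model.fit(T.reshape(-1, 1), true_rate - mech(T))

# Hybrid prediction = physics + learned correction.
hybrid = mech(T) + residual_model.predict(T.reshape(-1, 1))
mech_err = np.mean((true_rate - mech(T)) ** 2)
hybrid_err = np.mean((true_rate - hybrid) ** 2)
```

The design choice is that the mechanistic term keeps the model extrapolating plausibly, while the black-box term only has to learn the (smaller, smoother) residual.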
Applications of Multi-view Learning Approaches for Software Comprehension
Program comprehension concerns the ability of an individual to understand an existing software system in order to extend or transform it. Software systems comprise data that are noisy and incomplete, which makes program understanding even more difficult. A software system consists of various views, including the module dependency graph, execution logs, evolutionary information, and the vocabulary used in the source code, which collectively define the software system. Each of these views contains unique and complementary information, which together can more accurately describe the data. In this paper, we investigate various techniques for combining different sources of information to improve the performance of a program comprehension task. We employ state-of-the-art learning techniques to 1) find a suitable similarity function for each view, and 2) compare different multi-view learning techniques to decompose a software system into high-level units and give component-level recommendations for refactoring the system, as well as for cross-view source code search. Experiments conducted on 10 relatively large Java software systems show that by fusing knowledge from different views, we can guarantee a lower bound on the quality of the modularization and even improve upon it. We proceed by integrating different sources of information to give a set of high-level recommendations on how to refactor the software system. Furthermore, we demonstrate how learning a joint subspace allows cross-modal retrieval across views, yielding results that are more aligned with what the user intends by the query. The multi-view approaches outlined in this paper can be employed to address problems in software engineering that can be encoded as learning problems, such as software bug prediction and feature location
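One simple instance of such multi-view fusion, averaging per-view similarity matrices before clustering, might look as follows. The synthetic "views", the averaging rule, and spectral clustering are illustrative assumptions, not the paper's algorithm.

```python
# Illustrative sketch: fuse two views of the same entities into one
# affinity matrix, then cluster it to recover high-level modules.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.RandomState(0)

# 60 synthetic entities drawn from 3 latent modules, seen in two views
# (e.g. dependency-graph features and vocabulary features).
base = np.repeat(np.eye(3), 20, axis=0)
view_a = base + 0.1 * rng.randn(60, 3)
view_b = base + 0.1 * rng.randn(60, 3)

# Per-view similarity functions, fused by simple averaging.
affinity = (rbf_kernel(view_a) + rbf_kernel(view_b)) / 2

labels = SpectralClustering(
    n_clusters=3, affinity="precomputed", random_state=0
).fit_predict(affinity)
```

In practice, each view would get its own tuned similarity function (as the paper does), and the fusion step is where the complementary information from the views is combined.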
- …