Information Flow Model for Commercial Security
Information flow in Discretionary Access Control (DAC) is a well-known difficult problem. This paper formalizes the fundamental concepts and establishes a theory of information flow security. A DAC system is information flow secure (IFS) if data never flows into the hands of the owner’s enemies (the explicit denial access list).
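The IFS condition above can be read as a reachability question. The following is a minimal sketch, not the paper's formal model: it assumes information flows along the edges of a directed graph between users, and treats a system as secure when no enemy is reachable from the owner. The names `flows`, `reachable`, and `is_flow_secure` are hypothetical.

```python
from collections import deque


def reachable(flows, start):
    """Return every user that data starting at `start` can reach.

    `flows` maps a user to the users they may pass data to (an assumed
    representation, not taken from the paper).
    """
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in flows.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_flow_secure(flows, owner, enemies):
    """True if no user on the owner's denial list can ever receive the data."""
    return reachable(flows, owner).isdisjoint(enemies)


# Hypothetical example: alice shares with bob, and bob shares with carol.
flows = {"alice": ["bob"], "bob": ["carol"]}
print(is_flow_secure(flows, "alice", {"carol"}))
print(is_flow_secure(flows, "alice", {"dave"}))
```

The point of the sketch is that a direct access check is not enough: carol never receives access from alice, yet the data can still reach her through bob.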
Fuzzy-Granular Based Data Mining for Effective Decision Support in Biomedical Applications
Due to the complexity of biomedical problems, adaptive and intelligent knowledge discovery and data mining systems are highly needed to help humans understand the inherent mechanisms of diseases. For biomedical classification problems, it is typically impossible to build a perfect classifier with 100% prediction accuracy. Hence a more realistic target is to build an effective Decision Support System (DSS). In this dissertation, a novel adaptive Fuzzy Association Rules (FARs) mining algorithm, named FARM-DS, is proposed to build such a DSS for binary classification problems in the biomedical domain. Empirical studies show that FARM-DS is competitive with state-of-the-art classifiers in terms of prediction accuracy. More importantly, FARs can provide strong decision support for disease diagnosis due to their easy interpretability. This dissertation also proposes a fuzzy-granular method to select informative and discriminative genes from huge microarray gene expression data. With fuzzy granulation, information loss in the process of gene selection is decreased. As a result, more informative genes for cancer classification are selected and more accurate classifiers can be modeled. Empirical studies show that the proposed method is more accurate than traditional algorithms for cancer classification. Hence we expect that the selected genes can be more helpful for further biological studies.
Optimal Categorical Attribute Transformation for Granularity Change in Relational Databases for Binary Decision Problems in Educational Data Mining
This paper presents an approach for transforming data granularity in
hierarchical databases for binary decision problems by applying regression to
categorical attributes at the lower grain levels. Attributes from a lower
hierarchy entity in the relational database have their information content
optimized through regression on the categories histogram trained on a small
exclusive labelled sample, instead of the usual mode category of the
distribution. The paper validates the approach on a binary decision task for
assessing the quality of secondary schools, focusing on how logistic regression
transforms the student and teacher attributes into school attributes.
Experiments were carried out on public datasets of Brazilian schools via 10-fold
cross-validation, comparing the ranking scores, which were also produced by logistic
regression. The proposed approach achieved higher performance than the usual
distribution mode transformation and performance equal to the expert weighing approach,
as measured by the maximum Kolmogorov-Smirnov distance and the area under the ROC
curve at the 0.01 significance level.
Comment: 5 pages, 2 figures, 2 tables
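The contrast between the usual mode transformation and a category histogram can be illustrated with a small sketch. It only shows how lower-grain categorical values might be summarized per parent entity; the regression step the abstract describes is not reproduced. The function names and the sample categories are hypothetical.

```python
from collections import Counter


def mode_feature(values):
    """Summarize child records by their most frequent category (the usual mode)."""
    return Counter(values).most_common(1)[0][0]


def histogram_feature(values, categories):
    """Summarize child records by the relative frequency of each category.

    `categories` fixes the order of the output vector, so every parent
    entity yields a vector of the same length.
    """
    counts = Counter(values)
    total = len(values)
    return [counts[c] / total for c in categories]


# Hypothetical example: an attribute of four students in one school.
student_values = ["low", "low", "high", "mid"]
print(mode_feature(student_values))
print(histogram_feature(student_values, ["low", "mid", "high"]))
```

The mode keeps a single category and discards the rest of the distribution, whereas the histogram keeps the full distribution as a numeric vector that a downstream model such as logistic regression could weight.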
Data mining in soft computing framework: a survey
The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally, fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included.
LearnFCA: A Fuzzy FCA and Probability Based Approach for Learning and Classification
Formal concept analysis (FCA) is a mathematical theory based on lattice and order theory, used for data analysis and knowledge representation. Over the past several years, many extensions have been proposed and applied in several domains including data mining, machine learning, knowledge management, semantic web, software development, chemistry, biology, medicine, data analytics, and ontology engineering.
This thesis reviews the state of the art of the theory of Formal Concept Analysis (FCA) and its various extensions that have been developed and well studied in the past several years. We discuss their historical roots and reproduce the original definitions and derivations with illustrative examples. Further, we provide a literature review of its applications and the various approaches adopted by researchers in the areas of data analysis and knowledge management, with emphasis on data-learning and classification problems.
We propose LearnFCA, a novel approach based on FuzzyFCA and probability theory for learning and classification problems. LearnFCA uses an enhanced version of FuzzyLattice which has been developed to store class labels and probability vectors and has the capability to be used for classifying instances with encoded and unlabelled features. We evaluate LearnFCA on encodings from three datasets (MNIST, Omniglot, and cancer images) with interesting results and varying degrees of success.
Adviser: Dr Jitender Deogu
A GIS-based multi-criteria evaluation framework for uncertainty reduction in earthquake disaster management using granular computing
One of the most important steps in earthquake disaster management is the prediction of probable damage, which is called earthquake vulnerability assessment. Earthquake vulnerability assessment is a multi-criteria problem, and a number of multi-criteria decision making models have been proposed for it. Two main sources of uncertainty exist in the earthquake vulnerability assessment problem: uncertainty associated with the experts' points of view and uncertainty associated with attribute values. If the uncertainty in these two sources is not handled properly, the resulting seismic vulnerability map will be unreliable. The main objective of this research is to propose a reliable model for earthquake vulnerability assessment which is able to manage the uncertainty associated with the experts' opinions. Granular Computing (GrC) is able to extract a set of if-then rules with minimum incompatibility from an information table. An integration of Dempster-Shafer Theory (DST) and GrC is applied in the current research to minimize the entropy in experts' opinions. The accuracy of the model based on the integration of DST and GrC is 83%, while the accuracy of the single-expert model is 62%, which indicates the importance of uncertainty management in the seismic vulnerability assessment problem. Due to limited accessibility to current data, only six criteria are used in this model. However, the model is able to take into account both qualitative and quantitative criteria.
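To illustrate how opinions from two experts can be merged under Dempster-Shafer Theory, the sketch below implements the textbook Dempster rule of combination. The abstract does not say how DST is integrated with GrC, so this is not the authors' model. The mass values and the two-class frame ("high" and "low" vulnerability) are hypothetical.

```python
def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Mass on an empty intersection counts as conflict and is normalized away.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            both = a & b
            if both:
                combined[both] = combined.get(both, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the two sources cannot be combined")
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


# Hypothetical opinions of two experts about one urban block.
HIGH, LOW = frozenset({"high"}), frozenset({"low"})
EITHER = HIGH | LOW  # mass left on the whole frame expresses ignorance
expert_1 = {HIGH: 0.6, EITHER: 0.4}
expert_2 = {HIGH: 0.5, LOW: 0.3, EITHER: 0.2}

for hypothesis, mass in combine(expert_1, expert_2).items():
    print(sorted(hypothesis), round(mass, 3))
```

Mass assigned to the whole frame lets an expert state partial ignorance instead of forcing a choice, which is one reason DST is used for combining uncertain expert judgments.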
Growing a Tree in the Forest: Constructing Folksonomies by Integrating Structured Metadata
Many social Web sites allow users to annotate the content with descriptive
metadata, such as tags, and more recently to organize content hierarchically.
These types of structured metadata provide valuable evidence for learning how a
community organizes knowledge. For instance, we can aggregate many personal
hierarchies into a common taxonomy, also known as a folksonomy, that will aid
users in visualizing and browsing social content and help them in
organizing their own content. However, learning from social metadata presents
several challenges, since it is sparse, shallow, ambiguous, noisy, and
inconsistent. We describe an approach to folksonomy learning based on
relational clustering, which exploits structured metadata contained in personal
hierarchies. Our approach clusters similar hierarchies using their structure
and tag statistics, then incrementally weaves them into a deeper, bushier tree.
We study folksonomy learning using social metadata extracted from the
photo-sharing site Flickr, and demonstrate that the proposed approach addresses
the challenges. Moreover, compared to previous work, the approach produces
larger, more accurate folksonomies and, in addition, scales better.
Comment: 10 pages, To appear in the Proceedings of ACM SIGKDD Conference on
Knowledge Discovery and Data Mining (KDD) 201