
4,908 research outputs found

Information Flow Model for Commercial Security

Author: Pan Jene
Publication venue: SJSU ScholarWorks
Publication date: 01/01/2009

Information flow in Discretionary Access Control (DAC) is a well-known and difficult problem. This paper formalizes the fundamental concepts and establishes a theory of information flow security. A DAC system is information flow secure (IFS) if data never flows into the hands of the owner’s enemies (those on an explicit denial access list).
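The IFS condition stated in this abstract can be read as a reachability check. The sketch below is only an illustration under assumptions that are not taken from the paper: flows are modelled as a hypothetical `can_pass_to` mapping from each user to the users they may hand data to, and the denial list is a plain set of user names.

```python
from collections import deque


def reachable_holders(owner, can_pass_to):
    """Return every user that data starting at `owner` can eventually reach."""
    seen = {owner}
    queue = deque([owner])
    while queue:
        user = queue.popleft()
        # Follow each direct hand-off allowed for this user (assumed model).
        for nxt in can_pass_to.get(user, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_information_flow_secure(owner, can_pass_to, denial_list):
    """True when no user on the owner's denial list can ever hold the data."""
    return reachable_holders(owner, can_pass_to).isdisjoint(denial_list)


# Hypothetical example: alice -> bob -> carol.
flows = {"alice": ["bob"], "bob": ["carol"], "carol": []}
secure = is_information_flow_secure("alice", flows, {"dave"})
leaky = is_information_flow_secure("alice", flows, {"carol"})
```

Here `leaky` is false because carol receives the data indirectly through bob, which is the kind of transitive flow that makes the DAC case hard.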


Fuzzy-Granular Based Data Mining for Effective Decision Support in Biomedical Applications

Author: He Yuanchen
Publication venue: ScholarWorks @ Georgia State University
Publication date: 04/12/2006

Due to the complexity of biomedical problems, adaptive and intelligent knowledge discovery and data mining systems are needed to help humans understand the inherent mechanisms of diseases. For biomedical classification problems, it is typically impossible to build a perfect classifier with 100% prediction accuracy. Hence, a more realistic target is to build an effective Decision Support System (DSS). In this dissertation, a novel adaptive Fuzzy Association Rules (FARs) mining algorithm, named FARM-DS, is proposed to build such a DSS for binary classification problems in the biomedical domain. Empirical studies show that FARM-DS is competitive with state-of-the-art classifiers in terms of prediction accuracy. More importantly, FARs can provide strong decision support for disease diagnosis because they are easy to interpret. This dissertation also proposes a fuzzy-granular method to select informative and discriminative genes from huge microarray gene expression data. With fuzzy granulation, information loss in the process of gene selection is decreased. As a result, more informative genes for cancer classification are selected and more accurate classifiers can be modeled. Empirical studies show that the proposed method is more accurate than traditional algorithms for cancer classification. Hence, we expect the selected genes to be more helpful for further biological studies.


Optimal Categorical Attribute Transformation for Granularity Change in Relational Databases for Binary Decision Problems in Educational Data Mining

Author: Adeodato Paulo J. L.
Neto Rosalvo F. Oliveira
Pereira Fábio C.
Publication date: 28/02/2017

This paper presents an approach for transforming data granularity in hierarchical databases for binary decision problems by applying regression to categorical attributes at the lower grain levels. Attributes from a lower-hierarchy entity in the relational database have their information content optimized through regression on the category histogram, trained on a small exclusive labelled sample, instead of the usual mode category of the distribution. The paper validates the approach on a binary decision task for assessing the quality of secondary schools, focusing on how logistic regression transforms student and teacher attributes into school attributes. Experiments were carried out on public datasets of Brazilian schools via 10-fold cross-validation comparison of the ranking score, also produced by logistic regression. The proposed approach achieved higher performance than the usual distribution mode transformation and performance equal to the expert weighing approach, as measured by the maximum Kolmogorov-Smirnov distance and the area under the ROC curve at the 0.01 significance level. Comment: 5 pages, 2 figures, 2 tables

arXiv.org e-Print Archive

Crossref

Gaining insight into clinical pathway with process discovery techniques.

Author: Poelmans Jonas

Research Papers in Economics

Data mining in soft computing framework: a survey

Author: Mitra P.
Mitra S.
Pal S. K.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2002

The present article provides a survey of the available literature on data mining using soft computing. A categorization is provided based on the different soft computing tools and their hybridizations, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally, fuzzy sets are suitable for handling issues related to the understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included.

LearnFCA: A Fuzzy FCA and Probability Based Approach for Learning and Classification

Author: Samal Suraj Ketan
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 22/08/2019

Formal concept analysis (FCA) is a mathematical theory based on lattice and order theory used for data analysis and knowledge representation. Over the past several years, many of its extensions have been proposed and applied in several domains including data mining, machine learning, knowledge management, semantic web, software development, chemistry, biology, medicine, data analytics, and ontology engineering. This thesis reviews the state of the art of the theory of Formal Concept Analysis (FCA) and its various extensions that have been developed and well studied in the past several years. We discuss their historical roots and reproduce the original definitions and derivations with illustrative examples. Further, we provide a literature review of its applications and the various approaches adopted by researchers in the areas of data analysis and knowledge management, with emphasis on data-learning and classification problems. We propose LearnFCA, a novel approach based on FuzzyFCA and probability theory for learning and classification problems. LearnFCA uses an enhanced version of FuzzyLattice, which has been developed to store class labels and probability vectors and can be used for classifying instances with encoded and unlabelled features. We evaluate LearnFCA on encodings from three datasets (MNIST, Omniglot and cancer images) with interesting results and varying degrees of success. Adviser: Dr Jitender Deogu

DigitalCommons@University of Nebraska

A GIS-based multi-criteria evaluation framework for uncertainty reduction in earthquake disaster management using granular computing

Author: Delavar Mahmoud Reza
Khamespanah Fatemeh
Moradi Milad
Sheikhian Hossein
Publication venue: Vilnius Gediminas Technical University
Publication date: 22/06/2016

One of the most important steps in earthquake disaster management is the prediction of probable damage, which is called earthquake vulnerability assessment. Earthquake vulnerability assessment is a multi-criteria problem, and a number of multi-criteria decision making models have been proposed for it. Two main sources of uncertainty exist in the earthquake vulnerability assessment problem: the uncertainty associated with the experts’ points of view and the uncertainty associated with attribute values. If the uncertainty in these two sources is not handled properly, the resulting seismic vulnerability map will be unreliable. The main objective of this research is to propose a reliable model for earthquake vulnerability assessment that is able to manage the uncertainty associated with the experts’ opinions. Granular Computing (GrC) is able to extract a set of if-then rules with minimum incompatibility from an information table. An integration of Dempster-Shafer Theory (DST) and GrC is applied in the current research to minimize the entropy in the experts’ opinions. The accuracy of the model based on the integration of DST and GrC is 83%, while the accuracy of the single-expert model is 62%, which indicates the importance of uncertainty management in the seismic vulnerability assessment problem. Due to limited access to current data, only six criteria are used in this model. However, the model is able to take into account both qualitative and quantitative criteria.
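As a rough illustration of how Dempster-Shafer Theory can fuse two experts' opinions, the sketch below applies the standard Dempster rule of combination. It is not the paper's DST and GrC integration, which the abstract does not detail; the hypothesis names, the two mass functions, and the `combine` helper are all hypothetical.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule for two mass functions keyed by frozenset hypotheses."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of both hypotheses.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint hypotheses contribute to the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to one.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


# Hypothetical vulnerability classes and expert opinions (made-up numbers).
HIGH = frozenset({"high"})
LOW = frozenset({"low"})
EITHER = HIGH | LOW  # mass left on "don't know"

expert_1 = {HIGH: 0.6, EITHER: 0.4}
expert_2 = {HIGH: 0.5, LOW: 0.3, EITHER: 0.2}
fused = combine(expert_1, expert_2)
```

With these made-up inputs the fused opinion leaves less mass on `EITHER` than either expert did alone, which is the sense in which combining opinions can reduce uncertainty.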

VGTU Journals (Vilnius Gediminas Technical University - Vilnius Tech)

Growing a Tree in the Forest: Constructing Folksonomies by Integrating Structured Metadata

Author: Getoor Lise
Lerman Kristina
Plangprasopchok Anon
Publication date: 01/01/2010

Many social Web sites allow users to annotate content with descriptive metadata, such as tags, and more recently to organize content hierarchically. These types of structured metadata provide valuable evidence for learning how a community organizes knowledge. For instance, we can aggregate many personal hierarchies into a common taxonomy, also known as a folksonomy, that will aid users in visualizing and browsing social content and help them organize their own content. However, learning from social metadata presents several challenges, since it is sparse, shallow, ambiguous, noisy, and inconsistent. We describe an approach to folksonomy learning based on relational clustering, which exploits structured metadata contained in personal hierarchies. Our approach clusters similar hierarchies using their structure and tag statistics, then incrementally weaves them into a deeper, bushier tree. We study folksonomy learning using social metadata extracted from the photo-sharing site Flickr, and demonstrate that the proposed approach addresses these challenges. Moreover, compared to previous work, the approach produces larger, more accurate folksonomies and scales better. Comment: 10 pages, To appear in the Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 201

arXiv.org e-Print Archive

CiteSeerX