270 research outputs found

Certainty of outlier and boundary points processing in data mining

Author: Guo Yanhui
Minaei-bidgoli Behrouz
Norouzi Sanaz Saki
Rashno Elyas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/12/2018

Data certainty is an issue in real-world applications, caused by unwanted noise in data. Recently, more attention has been paid to overcoming this problem. We propose a new method based on neutrosophic set (NS) theory to detect boundary and outlier points, which are challenging points for clustering methods. First, a certainty value is assigned to each data point based on the proposed definition in NS. Then, a certainty set is presented for the proposed cost function in the NS domain by considering a set of main clusters and a noise cluster. After that, the proposed cost function is minimized by the gradient descent method. Data points are clustered based on their membership degrees. Outlier points are assigned to the noise cluster, and boundary points are assigned to main clusters with almost the same membership degrees. To show the effectiveness of the proposed method, two types of datasets are used: 3 Scatter-type datasets and 4 UCI-type datasets. Results demonstrate that the proposed cost function handles boundary and outlier points with more accurate membership degrees and outperforms existing state-of-the-art clustering methods. Comment: Conference paper, 6 pages.
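The behaviour described in this abstract (outliers drawn to a noise cluster, boundary points split almost evenly between main clusters) can be illustrated with a small sketch. This is not the paper's neutrosophic cost function or its gradient descent procedure; it swaps in the classical fuzzy membership formula with one extra noise cluster placed at a fixed distance. The function name `noise_memberships` and the parameters `noise_distance` and `m` are assumptions made for this illustration.

```python
import math


def noise_memberships(point, centers, noise_distance, m=2.0):
    """Return fuzzy memberships of `point` in each main cluster, followed by
    its membership in a single noise cluster.

    Assumption: the noise cluster sits at the same fixed distance
    (`noise_distance`) from every point. This is a stand-in technique,
    not the neutrosophic method from the abstract.
    """
    # Distances to the main cluster centers, then the fixed noise distance.
    dists = [math.dist(point, c) for c in centers] + [noise_distance]
    # Avoid division by zero when a point coincides with a center.
    dists = [max(d, 1e-12) for d in dists]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in dists]
    total = sum(inverse)
    # Normalise so that the memberships sum to 1.
    return [value / total for value in inverse]


centers = [(0.0, 0.0), (10.0, 0.0)]
boundary = noise_memberships((5.0, 0.0), centers, noise_distance=8.0)
outlier = noise_memberships((5.0, 40.0), centers, noise_distance=8.0)
print("boundary point:", boundary)
print("outlier point: ", outlier)
```

In this toy setup the point midway between the two centers receives equal membership in both main clusters, while the distant point receives most of its membership from the noise cluster.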

arXiv.org e-Print Archive

Crossref

Automated cleansing of POI databases

Author: A. Bronselaer
C. Baral
D. Dubois
G. Bordogna
G. Cooman De
G. Nachouki
G. Tré De
H. Foley
I. Bloch
I. Fellegi
J. Dujmović
J. Lin
L.A. Zadeh
M. Bright
M.A. Rodríguez
P. Carrara
R. Torres
R. Yager
R.W. Sinnott
S. Destercke
S. Konieczny
S. Rahimi
S. Sandri
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

Ghent University Academic Bibliography

The notion of H-IFS -an approach for enhancing query capabilities in Oracle10g

Author: Atanassov K.T.
Chountas P.
Mohammed S.
Rogova E.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

Query answering requirements for a knowledge-based treatment of user requests led us to introduce the concept of closure of an intuitionistic fuzzy set over a universe that has a hierarchical structure. We introduce the automatic analysis of queries according to concepts defined as part of a knowledge-based hierarchy, in order to guide query answering as part of an integrated database environment with the aid of hierarchical intuitionistic fuzzy sets (H-IFS). In this paper, based on the notion of H-IFS, we propose an ad hoc utility built on top of Oracle10g that allows us to enhance its query capabilities by providing better and more knowledgeable answers to users' requests. The theoretical aspects as well as the practical issues and achieved results are presented throughout the rest of the paper.

Crossref

WestminsterResearch

The notion of H-IFS in data modelling

Author: Atanassov K.T.
Chountas P.
Rogova E.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

In this paper we revise the context of "value imprecision" as part of a knowledge-based environment. We present our approach for including value imprecision as part of non-rigid hierarchical structures of organization. This led us to introduce the concept of closure of an intuitionistic fuzzy set over a universe that has a hierarchical structure. Intuitively, in the closure of this intuitionistic fuzzy set, the "kind of" relation is taken into account by propagating the degree associated with an element to its sub-elements in the hierarchy. We introduce the automatic analysis according to concepts defined as part of a knowledge hierarchy in order to guide query answering as part of an integrated database environment with the aid of hierarchical intuitionistic fuzzy sets.
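The propagation step in this abstract can be sketched in a few lines. The abstract does not say how degrees are combined, so this sketch assumes that each element ends up with the highest membership degree and the lowest non-membership degree found on itself or any of its ancestors, and that unmentioned elements default to (0.0, 1.0). The function name `closure`, the dictionary layout, and the wine hierarchy are illustrative assumptions, not taken from the paper.

```python
def closure(children, degrees, root):
    """Propagate (membership, non_membership) pairs down a "kind of" hierarchy.

    Assumptions for illustration only: a sub-element inherits the maximum
    membership and minimum non-membership seen along its path from the root,
    and elements with no stated degree start at (0.0, 1.0).
    """
    result = {}

    def visit(node, inherited):
        mu, nu = degrees.get(node, (0.0, 1.0))
        inherited_mu, inherited_nu = inherited
        combined = (max(mu, inherited_mu), min(nu, inherited_nu))
        result[node] = combined
        # Sub-elements are "kinds of" this node, so they inherit its degrees.
        for child in children.get(node, []):
            visit(child, combined)

    visit(root, (0.0, 1.0))
    return result


# Hypothetical hierarchy: Merlot and Pinot are kinds of Red, Red is a kind of Wine.
children = {"Wine": ["Red", "White"], "Red": ["Merlot", "Pinot"]}
degrees = {"Red": (0.8, 0.1), "Pinot": (0.9, 0.0)}

for element, pair in closure(children, degrees, "Wine").items():
    print(element, pair)
```

Here a preference stated for "Red" reaches "Merlot" even though "Merlot" was never mentioned, which is the effect a query over the hierarchy would rely on.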

Crossref

WestminsterResearch

Methods for fast and reliable clustering

Author: Kärkkäinen Ismo
Publication venue: University of Joensuu

UEF Electronic Publications

Flexible hierarchies and fuzzy knowledge-based OLAP

Author: Atanassov K.T.
Chountas P.
Rogova E.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

WestminsterResearch

Treatment of imprecision in data repositories with the aid of KNOLAP

Author: Rogova E.
Publication date: 01/01/2010

Traditional data repositories, introduced for the needs of business processing, typically focus on the storage and querying of crisp domains of data. As a result, current commercial data repositories have no facilities for either storing or querying imprecise or approximate data. No significant attempt has been made at a generic and application-independent representation of value imprecision, mainly as a property of axes of analysis and also as part of a dynamic environment, where potential users may wish to define their "own" axes of analysis for querying either precise or imprecise facts. In such cases, measured values and facts are characterised by descriptive values drawn from a number of dimensions, whereas values of a dimension are organised as hierarchical levels. A solution named H-IFS is presented that allows the representation of flexible hierarchies as part of the dimension structures. An extended multidimensional model named IF-Cube is put forward, which allows the representation of imprecision in facts and dimensions and the answering of queries based on imprecise hierarchical preferences. Based on the H-IFS and IF-Cube concepts, a post-relational OLAP environment is delivered, the implementation of which is DBMS independent and whose performance depends solely on the underlying DBMS engine.

WestminsterResearch

Information Integration - the process of integration, evolution and versioning

Author: Keijzer Ander de
Keulen Maurice van
Publication venue: University of Twente, Centre for Telematica and Information Technology (CTIT)
Publication date: 01/01/2005

At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those information sources. Gathering this information is a tedious and time-consuming job. Automating this process would assist users in their task. Integration of the information sources provides a global information source in which all the information needed is present. All of these information sources also change over time. With each change of an information source, the schema of this source can be changed as well. The data contained in the information source, however, cannot be changed every time, due to the huge amount of data that would have to be converted in order to conform to the most recent schema. In this report we describe the current methods for information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues when integrating XML data sources.

University of Twente Research Information

Proceedings of the first international VLDB workshop on Management of Uncertain Data

Author: Dekhtyar A.
Publication venue: Centre for Telematics and Information Technology (CTIT)
Publication date: 24/09/2007

University of Twente Research Information

Fuzzy clustering for content-based indexing in multimedia databases.

Author: Yue Ho-Yin
Publication venue: Chinese University of Hong Kong
Publication date: 01/01/2001

Yue Ho-Yin. Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. Includes bibliographical references (leaves 129-137). Abstracts in English and Chinese.

Contents:
Abstract
Acknowledgement
Chapter 1: Introduction
  1.1 Problem Definition
  1.2 Contributions
  1.3 Thesis Organization
Chapter 2: Literature Review
  2.1 Content-based Retrieval, Background and Indexing Problem
    2.1.1 Feature Extraction
    2.1.2 Nearest-neighbor Search
    2.1.3 Content-based Indexing Methods
  2.2 Indexing Problems
  2.3 Data Clustering Methods for Indexing
    2.3.1 Probabilistic Clustering
    2.3.2 Possibilistic Clustering
Chapter 3: Fuzzy Clustering Algorithms
  3.1 Fuzzy Competitive Clustering
  3.2 Sequential Fuzzy Competitive Clustering
  3.3 Experiments
    3.3.1 Experiment 1: Data set with different number of samples
    3.3.2 Experiment 2: Data set on different dimensionality
    3.3.3 Experiment 3: Data set with different number of natural clusters inside
    3.3.4 Experiment 4: Data set with different noise level
    3.3.5 Experiment 5: Clusters with different geometry size
    3.3.6 Experiment 6: Clusters with different number of data instances
    3.3.7 Experiment 7: Performance on real data set
  3.4 Discussion
    3.4.1 Differences Between FCC, SFCC, and Others Clustering Algorithms
    3.4.2 Variations on SFCC
    3.4.3 Why SFCC?
Chapter 4: Hierarchical Indexing based on Natural Clusters Information
  4.1 The Hierarchical Approach
  4.2 The Sequential Fuzzy Competitive Clustering Binary Tree (SFCC-b-tree)
    4.2.1 Data Structure of SFCC-b-tree
    4.2.2 Tree Building of SFCC-b-Tree
    4.2.3 Insertion of SFCC-b-tree
    4.2.4 Deletion of SFCC-b-Tree
    4.2.5 Searching in SFCC-b-Tree
  4.3 Experiments
    4.3.1 Experimental Setting
    4.3.2 Experiment 8: Test for different leaf node sizes
    4.3.3 Experiment 9: Test for different dimensionality
    4.3.4 Experiment 10: Test for different sizes of data sets
    4.3.5 Experiment 11: Test for different data distributions
  4.4 Summary
Chapter 5: A Case Study on SFCC-b-tree
  5.1 Introduction
  5.2 Data Collection
  5.3 Data Pre-processing
  5.4 Experimental Results
  5.5 Summary
Chapter 6: Conclusion
  6.1 An Efficiency Formula
    6.1.1 Motivation
    6.1.2 Regression Model
    6.1.3 Discussion
  6.2 Future Directions
  6.3 Conclusion
Bibliography

CUHK Digital Repository