Analysis and Implementation of the Rough Set Outlier Factor (RSetOF) for Outlier Detection
ABSTRACT: An outlier is a data object whose characteristics differ from those of most of the data. Outliers often contain unexpected knowledge, so in many knowledge discovery applications finding outliers is more interesting than finding inliers in a dataset. Outlier detection is a data mining functionality that aims to find the outliers in a dataset. Many outlier detection methods exist, but most have difficulty handling the scalability of the data. The scalability problem makes distance-based measures inappropriate for discovering outliers in high-dimensional data. RSetOF (Rough Set Outlier Factor) is a method for detecting outliers in high-dimensional datasets that uses the Non-Reduct concept from the Rough Set approach. An RSetOF value is calculated for each data object from the rules derived from the Non-Reducts and determines whether that object is an outlier. RSetOF detects outliers with fairly good accuracy in several test scenarios, based on the measurement parameters RSetOF value and top-n outliers and the evaluation parameters detection rate and false positive rate.
Keywords: outlier, RSetOF, data mining, scalability
Rough set theory applied to pattern recognition of partial discharge in noise affected cable data
This paper presents an effective Rough Set (RS) based pattern recognition method for rejecting interference signals and recognising Partial Discharge (PD) signals from different sources. Firstly, RS theory is presented in terms of Information System, Lower and Upper Approximation, Signal Discretisation and Attribute Reduction, together with a flowchart of the RS based pattern recognition method. Secondly, PD testing of five types of artificial defect in ethylene-propylene rubber (EPR) cable is carried out, and data pre-processing and feature extraction are employed to separate PD and interference signals. Thirdly, the RS based PD signal recognition method is applied to 4000 samples and is proven to have 99% accuracy. Fourthly, the RS based PD recognition method is applied to signals from five different sources, and an accuracy of more than 93% is attained when a combination of signal discretisation and attribute reduction methods is applied. Finally, Back-propagation Neural Network (BPNN) and Support Vector Machine (SVM) methods are studied and compared with the developed method. The proposed RS method is proven to have higher accuracy than SVM and BPNN and can be applied for on-line PD monitoring of cable systems after training with valid sample data.
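The lower and upper approximations named in this abstract are standard rough set constructions, and a small sketch may make them concrete. The code below is not the paper's method: the function name `approximations`, the toy table `signals` and its attribute values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def approximations(table, attributes, target):
    """Return the (lower, upper) approximations of `target`.

    table:      dict mapping object id -> dict of attribute values
    attributes: attribute names defining the indiscernibility relation
    target:     set of object ids, the concept to approximate
    """
    # Group objects that are indiscernible on the chosen attributes.
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attributes)].add(obj)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # block lies entirely inside the concept
            lower |= block
        if block & target:    # block overlaps the concept
            upper |= block
    return lower, upper


# Hypothetical, already discretised information system (illustrative values).
signals = {
    "s1": {"amplitude": "high", "rise_time": "short"},
    "s2": {"amplitude": "high", "rise_time": "short"},
    "s3": {"amplitude": "low", "rise_time": "long"},
    "s4": {"amplitude": "low", "rise_time": "short"},
}
pd_signals = {"s1", "s4"}  # assumed labels: objects that are PD signals

lower, upper = approximations(signals, ["amplitude", "rise_time"], pd_signals)
```

Here `s1` and `s2` cannot be told apart on the two attributes, so `s1` is only possibly a PD signal: it falls in the upper approximation but not in the lower one.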
Gabor Filter and Rough Clustering Based Edge Detection
This paper introduces an efficient edge detection method based on a Gabor
filter and rough clustering. The input image is smoothed by a Gabor function, and
the concept of rough clustering is used to focus on edge detection with a soft
computing approach. Hysteresis thresholding is used to get the actual
output, i.e. the edges of the input image. To show its effectiveness, the proposed
technique is compared with some other edge detection methods.
Comment: Proc. IEEE Conf. #30853, International Conference on Human Computer
Interactions (ICHCI'13), Chennai, India, 23-24 Aug. 2013
Locating Multiple Multi-scale Electromagnetic Scatterers by A Single Far-field Measurement
Two inverse scattering schemes were recently developed in
\cite{LiLiuShangSun} for locating multiple electromagnetic (EM) scatterers,
respectively, of small size and regular size compared to the detecting EM
wavelength. Both schemes make use of a single far-field measurement. The scheme
of locating regular-size scatterers requires a priori knowledge of
the possible shapes, orientations and sizes of the underlying scatterer
components. In this paper, we extend that imaging scheme to a much more
practical setting by relaxing the requirement on the orientations and sizes. We
also develop an imaging scheme of locating multiple multi-scale EM scatterers,
which may include at the same time, both components of regular size and small
size. For the second scheme, a novel local re-sampling technique is developed.
Furthermore, more robust and accurate reconstruction can be achieved for the
second scheme if an additional far-field measurement is used. Rigorous
mathematical justifications are provided and numerical results are presented to
demonstrate the effectiveness and the promising features of the proposed
imaging schemes.
The probability of default in internal ratings based (IRB) models in Basel II: an application of the rough sets methodology
The new Capital Accord of June 2004 (Basel II) opens the way for and encourages credit entities to implement
their own models for measuring financial risks. In this paper we focus on the use of internal ratings
based (IRB) models for the assessment of credit risk, and specifically on the approach to one of their
components: the probability of default (PD).
The traditional methods used to model credit risk, such as discriminant analysis and logit and probit
models, rest on a series of statistical restrictions. The rough sets methodology is presented as an
alternative to the classic statistical methods that overcomes their limitations.
In our study we apply the rough sets methodology to a database of 106 companies applying for
credit, with the aim of obtaining the ratios that best discriminate between healthy and failed companies,
together with a set of decision rules that will help to detect operations potentially in default, as a first step
in modelling the probability of default. Lastly, we compare the results with those obtained using
classic discriminant analysis and conclude that, in our case, the rough sets methodology gives better
classification results.
Junta de Andalucía P06-SEJ-0153
Modelling potential movement in constrained travel environments using rough space-time prisms
The widespread adoption of location-aware technologies (LATs) has afforded analysts new opportunities for efficiently collecting trajectory data of moving individuals. These technologies enable measuring trajectories as a finite sample set of time-stamped locations. The uncertainty related to both finite sampling and measurement errors often makes it difficult to reconstruct and represent the trajectory followed by an individual in space-time. Time geography offers an interesting framework for dealing with the potential path of an individual between two sample locations. Although this potential path may be easily delineated for travel along networks, doing so is less straightforward in environments that are not network-constrained. Current models, however, have mostly concentrated on network environments and do not account for the spatiotemporal uncertainties of the input data. This article addresses both issues simultaneously by developing a novel methodology to capture potential movement between uncertain space-time points in obstacle-constrained travel environments.
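For orientation, the potential path between two sample locations is classically bounded by a space-time prism: a location is reachable at time t only if it can be reached from the first sample, and the second sample can still be reached from it, without exceeding a maximum speed. The sketch below covers only this classical case in an unobstructed plane with exact sample points. It is not the article's rough prism, and the function name `in_prism`, the speed bound `v_max` and the sample values are assumptions made for illustration.

```python
import math


def in_prism(point, t, start, t_start, end, t_end, v_max):
    """Return True if `point` at time `t` lies inside the classical prism.

    start, end: (x, y) sample locations observed at t_start and t_end
    v_max:      assumed maximum travel speed (distance units per time unit)
    """
    if not (t_start <= t <= t_end):
        return False
    # Reachable from the first sample within the elapsed time.
    from_start = math.dist(start, point) <= v_max * (t - t_start)
    # The second sample is still reachable in the remaining time.
    to_end = math.dist(point, end) <= v_max * (t_end - t)
    return from_start and to_end


# Hypothetical samples: (0, 0) at t=0 and (10, 0) at t=10, max speed 2.
near = in_prism((5, 3), 5, (0, 0), 0, (10, 0), 10, 2.0)
far = in_prism((5, 12), 5, (0, 0), 0, (10, 0), 10, 2.0)
```

Obstacles and uncertain sample points, which are the article's actual subject, would change both distance tests and are deliberately left out of this sketch.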