Taming Wild High Dimensional Text Data with a Fuzzy Lash
The bag of words (BOW) represents a corpus as a matrix whose elements are word
frequencies. However, each row of the matrix is a very high-dimensional sparse
vector. Dimension reduction (DR) is a popular way to address sparsity and high
dimensionality. Among the strategies for developing DR methods, Unsupervised
Feature Transformation (UFT) is a popular one that maps all words onto a new
basis to represent the BOW. The recent growth of text data and its challenges
imply that the DR area still needs new perspectives. Although a wide range of
methods based on the UFT strategy has been developed, the fuzzy approach has
not been considered for DR based on this strategy. This research investigates
the application of fuzzy clustering as a DR method based on the UFT strategy
that collapses the BOW matrix to provide a lower-dimensional representation of
documents instead of the words in a corpus. The quantitative evaluation shows
that fuzzy clustering produces superior performance and features compared with
Principal Components Analysis (PCA) and Singular Value Decomposition (SVD), two
popular DR methods based on the UFT strategy.
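The idea of using fuzzy clustering for dimension reduction can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes standard fuzzy c-means with fuzzifier m = 2 and assumes that each document's membership vector serves as its reduced representation. The toy matrix, the function name `fuzzy_c_means`, and all parameter values are hypothetical.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (memberships, centers)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row is normalized to sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the documents.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every document to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

# Hypothetical toy BOW matrix: 4 documents x 6 vocabulary terms.
bow = np.array([
    [3, 2, 0, 0, 0, 1],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 3, 2, 1],
    [0, 1, 0, 2, 3, 0],
], dtype=float)

# Assumption: the membership matrix is used as the reduced representation.
reduced, centers = fuzzy_c_means(bow, n_clusters=2)
print(reduced.shape)  # (4, 2)
```

Each document is now described by two membership degrees instead of six term counts; in a real corpus the number of clusters would be far smaller than the vocabulary size.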
Gray Image extraction using Fuzzy Logic
Fuzzy systems provide a fundamental methodology for representing and processing
uncertainty and imprecision in linguistic information. Fuzzy systems that use
fuzzy rules to represent the domain knowledge of a problem are known as Fuzzy
Rule Base Systems (FRBS). Image segmentation and subsequent extraction from a
noise-affected background with the help of soft computing methods is, in turn,
a relatively new and quite popular topic. These methods include Artificial
Neural Network (ANN) models (primarily supervised in nature), Genetic Algorithm
(GA) based techniques, and intensity-histogram-based methods. Providing an
extraction solution that works in unsupervised mode is an even more interesting
problem, and the literature suggests that effort in this respect is still quite
rudimentary. In the present article, we propose a novel fuzzy-rule-guided
technique that requires no external intervention during execution. Experimental
results suggest that this approach is efficient in comparison with other
techniques extensively addressed in the literature. To demonstrate the superior
performance of the proposed technique over its competitors, we use metrics such
as Mean Squared Error (MSE), Mean Absolute Error (MAE), and Peak Signal to
Noise Ratio (PSNR).
Comment: 8 pages, 5 figures, Fuzzy Rule Base, Image Extraction, Fuzzy
Inference System (FIS), Membership Functions, Membership values, Image coding
and Processing, Soft Computing, Computer Vision. Accepted and published in
IEEE. arXiv admin note: text overlap with arXiv:1206.363
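The three error metrics named in the abstract above have standard definitions, sketched here for 8-bit grayscale images. This is an illustration only: the function names, the tiny example arrays, and the peak value of 255 are assumptions, and the abstract does not say how the authors computed the metrics.

```python
import numpy as np

def mse(ref, test):
    """Mean Squared Error between two images of equal shape."""
    diff = ref.astype(float) - test.astype(float)
    return float(np.mean(diff ** 2))

def mae(ref, test):
    """Mean Absolute Error between two images of equal shape."""
    diff = ref.astype(float) - test.astype(float)
    return float(np.mean(np.abs(diff)))

def psnr(ref, test, max_value=255.0):
    """Peak Signal to Noise Ratio in dB; max_value=255 assumes 8-bit images."""
    err = mse(ref, test)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / err))

# Hypothetical 2x2 reference image and an extraction with one pixel off by 10.
reference = np.array([[0, 255], [255, 0]], dtype=np.uint8)
extracted = np.array([[0, 255], [255, 10]], dtype=np.uint8)

print(mse(reference, extracted))   # 25.0
print(mae(reference, extracted))   # 2.5
print(round(psnr(reference, extracted), 2))  # 34.15
```

Casting to float before subtracting avoids the wrap-around that unsigned 8-bit arithmetic would otherwise produce.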
Fuzzy Clustering of Histopathological Images Using Deep Learning Embeddings
Metric learning is a machine learning approach that aims to learn a new distance metric by increasing the similarity of examples belonging to the same class and reducing the similarity of examples belonging to different classes. The output of these approaches is an embedding, onto which the input data are mapped to improve a crisp or fuzzy classification process. Deep metric learning approaches implement metric learning with deep neural networks. Such models have the advantage of discovering very representative nonlinear embeddings. In this work, we propose a triplet network deep metric learning approach, based on ResNet50, to find a representative embedding for the unsupervised fuzzy classification of benign and malignant histopathological images of breast cancer tissues. Experiments on the BreakHis benchmark dataset, using Fuzzy C-Means Clustering, show the benefit of using the very low-dimensional embeddings found by the deep metric learning approach.
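A triplet network is usually trained with a triplet loss, which can be sketched as below. This is a generic formulation and not necessarily the one used in the paper: the Euclidean distance, the margin of 1.0, and the hand-written 2-D vectors standing in for network embeddings are all assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Per-triplet hinge loss: max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = np.linalg.norm(anchor - positive, axis=-1)  # same-class distance
    d_neg = np.linalg.norm(anchor - negative, axis=-1)  # other-class distance
    return np.maximum(0.0, d_pos - d_neg + margin)

# Hypothetical embeddings for two triplets (rows are triplets).
anchor = np.array([[0.0, 0.0], [0.0, 0.0]])
positive = np.array([[0.0, 1.0], [0.0, 1.0]])
negative = np.array([[3.0, 4.0], [1.0, 0.0]])

losses = triplet_loss(anchor, positive, negative)
print(losses)  # first triplet: 0.0, second triplet: 1.0
```

The first triplet already satisfies the margin (the negative is far away), so it contributes no loss; the second does not, so training would push that negative further from the anchor.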
A Fully Unsupervised Texture Segmentation Algorithm
This paper presents a fully unsupervised texture segmentation algorithm that uses a modified discrete wavelet frames decomposition and a mean shift algorithm. By fully unsupervised, we mean that the algorithm requires no knowledge of the type of texture present or of the number of textures in the image to be segmented. The basic idea of the proposed method is to use the modified discrete wavelet frames to extract useful information from the image. Then, starting from the lowest level, the mean shift algorithm is used together with fuzzy c-means clustering to divide the data into an appropriate number of clusters. The data clustering process is then refined at every level by taking into account the data at that particular level. The final crisp segmentation is obtained at the root level. This approach is applied to segment a variety of composite texture images into homogeneous texture areas, and very good segmentation results are reported.
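One plausible reading of how mean shift can supply "an appropriate number of clusters" is that it locates the density modes of the feature data, and the mode count is then passed to the clustering step. The sketch below illustrates only that reading: the flat kernel, the bandwidth, the merge rule, and the 1-D toy features are assumptions and do not reproduce the paper's wavelet-based pipeline.

```python
import numpy as np

def mean_shift_modes(X, bandwidth, n_iter=50):
    """Flat-kernel mean shift; returns the distinct modes it converges to."""
    X = np.asarray(X, dtype=float)
    points = X.copy()
    for _ in range(n_iter):
        for i in range(len(points)):
            # Shift each point to the mean of the data within the bandwidth.
            near = np.linalg.norm(X - points[i], axis=1) <= bandwidth
            points[i] = X[near].mean(axis=0)
    # Merge shifted points that ended up closer than half a bandwidth.
    modes = []
    for p in points:
        if all(np.linalg.norm(p - q) >= bandwidth / 2 for q in modes):
            modes.append(p)
    return np.array(modes)

# Hypothetical 1-D texture features forming two well-separated groups.
features = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])

modes = mean_shift_modes(features, bandwidth=1.0)
n_clusters = len(modes)
print(n_clusters)  # 2
```

The resulting `n_clusters` could then be handed to a fuzzy c-means routine, so that no cluster count has to be specified by hand.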
A systematic review of data quality issues in knowledge discovery tasks
The volume of data is growing rapidly because organizations continuously capture data to support better decision-making. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks; however, much of the data is of poor quality. We present a systematic review of data quality issues in knowledge discovery tasks and a case study applied to the agricultural disease known as coffee rust.