Numerical Data Clustering Ontology Approach

Author: Grabusts Peter
Publication venue: 'Brno University of Technology'
Publication date: 01/06/2018

Clustering algorithms group objects defined by a set of numerical properties so that objects within a group are more similar to one another than to objects in other groups. All clustering algorithms share common parameters whose choice determines the effectiveness of clustering. The most important of these are the metric, the number of clusters, and the cluster validity criteria. Classic clustering algorithms ignore semantic knowledge, which makes the clustering results difficult to interpret. The use of ontologies is currently developing very rapidly; an ontology provides an explicit model for structuring concepts together with their interrelationships, which makes it possible to gain knowledge of a particular data model. Building on previously obtained results of a clustering study, the author attempts to create an ontology-based concept from numerical data using similarity measures, cluster numbers, cluster validity, and other characteristic features. The scientific novelty lies in combining classical data analysis with an ontological approach to structuring the data, which increases the efficiency of their use in engineering practice.

Directory of Open Access Journals

Digital library of Brno University of Technology

CLASS - A Study of methods for coarse phonetic classification

Author: Delmege James
Publication venue: RIT Scholar Works
Publication date: 01/01/1988

The objective of this thesis was to examine computer techniques for classifying speech signals into four coarse phonetic classes: vowel-like, strong fricative, weak fricative and silence. The study compared classification results from the K-means clustering algorithm using Euclidean distance measurements with classification using a multivariate maximum likelihood distance measure. In addition to the comparison of statistical methods, this study compared classification using several tree-structured decision-making processes. The system was trained on ten speakers using 98 utterances with both known and unknown speakers. Results showed very little difference between the Euclidean distance and maximum likelihood; however, the introduction of the tree structure on both systems had a positive influence on their performance.
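
The two distance measures compared in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the class models, the two features, and all function names are hypothetical, and the maximum likelihood distance is assumed to come from a Gaussian with diagonal covariance, which the abstract does not state.

```python
import math

# Hypothetical class models: (mean vector, per-feature variance).
# The two features are invented for illustration only.
MODELS = {
    "vowel-like": ((5.0, 1.0), (4.0, 0.25)),
    "silence": ((0.0, 0.0), (0.25, 0.25)),
}

def euclidean_distance(x, mean, var):
    """Squared Euclidean distance to the class mean; variances are ignored."""
    return sum((xi - mi) ** 2 for xi, mi in zip(x, mean))

def ml_distance(x, mean, var):
    """Twice the negative log-likelihood of a diagonal Gaussian, up to a constant."""
    return sum((xi - mi) ** 2 / vi + math.log(vi)
               for xi, mi, vi in zip(x, mean, var))

def classify(x, distance):
    """Return the class whose model is closest to x under the given distance."""
    return min(MODELS, key=lambda name: distance(x, *MODELS[name]))

frame = (2.0, 0.5)
by_euclid = classify(frame, euclidean_distance)
by_ml = classify(frame, ml_distance)
```

On this invented frame the two measures disagree, because the maximum likelihood distance accounts for the larger variance of the first class. The thesis reports that on its real speech data the two measures performed almost the same.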

RIT Scholar Works

Analyzing Destination Choices of Tourists and Residents from Location Based Social Media Data

Author: Hasnat Md Mehedi
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2018

The ubiquitous use of social media platforms on smartphones has created an opportunity to gather digital traces of individual activities at a large scale. Traditional travel surveys fall short in collecting longitudinal travel behavior data for a large number of people in a cost-effective way, especially for transient populations such as tourists. This study presents an innovative methodological framework, using machine learning and econometric approaches, to gather and analyze location-based social media (LBSM) data to understand individual destination choices. First, using Twitter's search interface, we have collected Twitter posts of nearly 156,000 users for the state of Florida. We have adopted several filtering techniques to create a reliable sample from noisy Twitter data. An ensemble classification technique is proposed to classify tourists and residents from user coordinates. The performance of the proposed classifier has been validated using manually labeled data and compared against the state-of-the-art classification methods. Second, using different clustering methods, we have analyzed the spatial distributions of destination choices of tourists and residents. The clusters from tourist destinations revealed the most popular tourist spots, including emerging tourist attractions in Florida. Third, to predict a tourist's next destination type, we have estimated a Conditional Random Field (CRF) model with reasonable accuracy. Fourth, to analyze resident destination choice behavior, this study proposes an extensive data merging operation among the collected Twitter data and several geographic databases from state-level data libraries. We have estimated a Panel Latent Segmentation Multinomial Logit (PLSMNL) model to find the characteristics affecting individual destination choices. The proposed PLSMNL model is found to better explain the effects of variables on destination choices compared to trip-specific Multinomial Logit Models.
The findings of this study show the potential of LBSM data in future transportation and planning studies where collecting individual activity data is expensive.

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Unsupervised Segmentation Method for Diseases of Soybean Color Image Based on Fuzzy Clustering

Author: Jiangsheng Gui
Li Hao
Shusen Sen
Wenshu Li
Yanfei Liu
Publication venue: IFSA Publishing, S.L.
Publication date: 01/11/2013

Color image segmentation based on Fuzzy C-Means (FCM) clustering is simple, intuitive, and easy to implement. However, its clustering performance is affected by the initialization of the cluster centers, high computational cost, and other issues. In this research, we propose a new unsupervised color image segmentation method based on fuzzy clustering. The method combines the advantages of the fuzzy C-means algorithm and an unsupervised clustering algorithm. First, by gradually changing the number of clusters c and evaluating a validity measure, it searches for the optimal c without supervision; then, to achieve higher clustering accuracy, the distance measure is improved. In our experiments, the method was applied to color image segmentation for three kinds of soybean diseases. The results show that the method segments the lesion area from the color image more accurately, and that the segmentation of soybean disease images is effective, robust, and highly accurate.
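
The search over the number of clusters c described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: it uses plain fuzzy C-means with Euclidean distance (not the paper's improved distance measure), it takes the partition coefficient as the validity measure (the abstract does not name one), and the pixel values and function names are invented.

```python
import random

def memberships(points, centers, m):
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    rows = []
    for p in points:
        # Squared Euclidean distances, floored to avoid division by zero.
        d2 = [max(sum((a - b) ** 2 for a, b in zip(p, ctr)), 1e-12)
              for ctr in centers]
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows

def fcm(points, c, m=2.0, iters=50, seed=0):
    """Plain fuzzy C-means with initial centers sampled from the data."""
    centers = random.Random(seed).sample(points, c)
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        centers = []
        for k in range(c):
            # Each center is the mean of all points weighted by membership^m.
            w = [row[k] ** m for row in u]
            total = sum(w)
            centers.append(tuple(
                sum(wi * p[j] for wi, p in zip(w, points)) / total
                for j in range(dim)))
    return centers, memberships(points, centers, m)

def partition_coefficient(u):
    """Assumed validity measure: mean of squared memberships (higher is crisper)."""
    return sum(v * v for row in u for v in row) / len(u)

def best_cluster_count(points, candidates):
    """Run FCM for each candidate c and keep the c with the highest validity."""
    scores = {c: partition_coefficient(fcm(points, c)[1]) for c in candidates}
    return max(scores, key=scores.get), scores

# Invented RGB-like pixel values forming three loose groups.
pixels = [(30, 120, 40), (35, 125, 42), (28, 118, 38),
          (150, 110, 60), (155, 115, 58), (148, 108, 62),
          (90, 60, 30), (95, 62, 33), (88, 58, 29)]
best_c, scores = best_cluster_count(pixels, range(2, 5))
```

Because the initial centers are sampled at random, the selected c can depend on the seed. That sensitivity to initialization is the weakness of FCM that the abstract mentions.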

Directory of Open Access Journals

Ontology Partitioning: Clustering Based Approach

Publication venue: 'MECS Publisher'

Crossref

Despliegue óptimo de redes inalámbricas para la infraestructura de medición inteligente de energía eléctrica [Optimal deployment of wireless networks for smart electricity metering infrastructure]

Author: Galiano Hernández José Andrés
Quel Novillo Santiago Xavier
Publication venue: Universidad Politécnica Salesiana. Carrera de Ingeniería Eléctrica. Sede Quito
Publication date: 01/11/2015

This work presents an optimization model for the base stations that serve as data collection points. Smart meters send their data to these stations in order to cover a group of users clustered in a residential area, and the base stations forward the collected data to the distribution companies, which track energy consumption in each area where users are located. The study proposes the ILP method and two heuristic user-clustering methods, K-Means and K-Medoids, for each base station that must be installed in the area. The three proposed algorithms are compared, with the aid of graphs, to determine which clustering method gives the lowest coverage error, the shortest execution time, and the best clustering, and therefore which of the three is the most suitable. Optimizing the base stations shows how many base stations (BS) will actually be installed in the area, discarding the other BS that were initially proposed as candidates. The result is lower installation costs and a smart network that is efficient, reliable, and economical, with the main objective of covering all users or inhabitants in the locality who benefit from the electrical grid.
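
The two heuristics named in this abstract differ mainly in how a cluster's representative point is chosen: K-Means uses the mean of the points, while K-Medoids must pick one of the points themselves. The sketch below shows that single step for one cluster and a simple coverage count. The meter coordinates, the coverage radius, and the function names are assumptions made for illustration; this is not the authors' model.

```python
import math

def centroid(points):
    """K-Means representative: the arithmetic mean, which may be an empty spot."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def medoid(points):
    """K-Medoids representative: the member with the smallest total distance."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

def uncovered(points, station, radius):
    """Number of points farther from the station than its coverage radius."""
    return sum(1 for p in points if math.dist(p, station) > radius)

# Invented smart-meter positions; one meter sits far from the rest.
meters = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (14.0, 0.0)]
RADIUS = 2.5  # hypothetical coverage radius of a base station

mean_site = centroid(meters)
medoid_site = medoid(meters)
missed_by_mean = uncovered(meters, mean_site, RADIUS)
missed_by_medoid = uncovered(meters, medoid_site, RADIUS)
```

In this toy case the outlying meter pulls the mean away from the dense group, so the medoid site leaves fewer meters uncovered. Whether that holds on real data is the kind of question the study's comparison addresses.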

Repositorio Digital Universidad Politécnica Salesiana

New machine-learning-based techniques for DNA microarray image segmentation.

Author: Qin Li
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2004

Microarray technology, which provides detailed and abundant information about biological experiments, is a significant achievement in the history of biology. One of the key issues in microarray processing is to extract quantitative information from the spots, which represent the genes in the experiments. The process of identifying the spots and separating the foreground from the background is known as microarray image segmentation. In this thesis, we present two methods for microarray image segmentation. First, we conduct an in-depth analysis of the influence of important factors on clustering-based microarray image segmentation algorithms. Based on our analysis, we present an optimized clustering-based algorithm for microarray image segmentation, which exploits more than one feature to gain better results compared to the state-of-the-art clustering-based algorithms. We also consider the fact that most of the spots in a microarray image are elliptical in shape, and hence introduce a novel adaptive ellipse method. This method shows various advantages when compared to the adaptive circle method, one of the most widely used approaches in microarray image segmentation. Simulations on real-life microarray images show that our method can extract information from the images that the traditional adaptive circle method ignores, and hence shows more flexibility. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .Q26. Source: Masters Abstracts International, Volume: 43-03, page: 0887. Adviser: Luis Rueda. Thesis (M.Sc.)--University of Windsor (Canada), 2004.

Scholarship at UWindsor

Machine Learning and Images for Malware Detection and Classification

Author: Kosmidis Konstantinos
Publication date: 06/04/2017

International Hellenic University: IHU Open Access Repository

Detecting Road Intersections from Coarse-gained GPS Traces Based on Clustering

Author: Cao
Chen
Fathi
Junwei Wu
Lei
Li
Liang Wang
Lin
Liu
MacQueen
Pelleg
Tao Ku
Worrall
Yunlong Zhu
Publication venue: 'Academy Publisher'

Crossref

Advances in Meta-Heuristic Optimization Algorithms in Big Data Text Clustering

Author: Abualigah L
Alshinwan M
Elaziz MA
Gandomi AH
Hamad HA
Khasawneh AM
Omari M
Publication venue: 'MDPI AG'
Publication date: 12/05/2021

This paper presents a comprehensive survey of meta-heuristic optimization algorithms for text clustering applications and highlights their main procedures. These Artificial Intelligence (AI) algorithms are recognized as promising swarm intelligence methods because of their success in solving machine learning problems, especially text clustering problems. This paper reviews all of the relevant literature on meta-heuristic-based text clustering applications, including many variants, such as basic, modified, hybridized, and multi-objective methods. The main procedures of text clustering and critical discussions are also given. The review reports the advantages and disadvantages of these methods and recommends potential future research paths. The main keywords considered in this paper are text, clustering, meta-heuristic, optimization, and algorithm.

OPUS - University of Technology Sydney