
    A Clustering-Based Algorithm for Data Reduction

    Finding an efficient data reduction method for large-scale problems is an imperative task. In this paper, we propose a similarity-based self-constructing fuzzy clustering algorithm that samples instances for the classification task. Instances that are similar to each other are grouped into the same cluster. Once all the instances have been fed in, a number of clusters have formed automatically. The statistical mean of each cluster is then taken to represent all the instances covered by that cluster. This approach has two advantages. One is that it can be faster and use less storage memory. The other is that the number of new representative instances need not be specified in advance by the user. Experiments on real-world datasets show that our method can run faster and obtain a better reduction rate than other methods.
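
    The one-pass idea in this abstract (feed instances in, let clusters form, keep each cluster mean as a representative) can be sketched roughly as follows. This is not the paper's algorithm: the Gaussian similarity, the fixed width `sigma`, the threshold `rho`, and the function name are all assumptions made for illustration, since the abstract does not give the actual similarity measure.

```python
import math

def self_constructing_reduce(instances, rho=0.5, sigma=1.0):
    """One-pass threshold clustering; returns (cluster means, cluster sizes).

    rho (similarity threshold) and sigma (Gaussian width) are hypothetical
    parameters of this sketch, not values taken from the paper.
    """
    means, counts = [], []
    for x in instances:
        # Find the existing cluster whose mean is most similar to x.
        best, best_sim = -1, 0.0
        for j, m in enumerate(means):
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, m))
            sim = math.exp(-sq_dist / (2.0 * sigma ** 2))
            if sim > best_sim:
                best, best_sim = j, sim
        if best >= 0 and best_sim >= rho:
            # Similar enough: absorb x and update the running mean.
            counts[best] += 1
            n = counts[best]
            means[best] = [m + (a - m) / n for a, m in zip(x, means[best])]
        else:
            # No sufficiently similar cluster: x starts a new one.
            means.append([float(a) for a in x])
            counts.append(1)
    return means, counts

# Four instances collapse into two representatives.
points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
representatives, sizes = self_constructing_reduce(points, rho=0.5, sigma=1.0)
```

    Note that the number of representatives is never passed in; it falls out of the threshold, which mirrors the second advantage claimed in the abstract. For a classification dataset, one plausible use (again an assumption) is to run the reduction separately on the instances of each class.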

    Vertical wind profile characterization and identification of patterns based on a shape clustering algorithm

    Wind power plants are becoming a generally accepted resource in the generation mix of many utilities. At the same time, the size and the power rating of individual wind turbines have increased considerably. Under these circumstances, the sector is increasingly demanding an accurate characterization of vertical wind speed profiles to properly estimate the incoming wind speed at the rotor swept area and, consequently, assess the potential of a wind power plant site. The present paper describes a shape-based clustering characterization and visualization of real vertical wind speed data. The proposed solution allows us to identify the most likely vertical wind speed patterns for a specific location based on real wind speed measurements. Moreover, this clustering approach also provides characterization and classification of such vertical wind profiles. The solution is well suited to the large amounts of data collected by remote sensing equipment, where wind speed values at different heights within the rotor swept area are available for subsequent analysis. The methodology is based on z-normalization, a shape-based distance metric, and the Ward hierarchical clustering method. Real vertical wind speed profile data from a Spanish wind power plant, collected with commercial Windcube equipment over several months, are used to assess the proposed characterization and clustering process, involving more than 100,000 wind speed data values. All analyses have been implemented using open-source R software. From the results, at least four different vertical wind speed patterns are identified to properly characterize over 90% of the wind speed data collected throughout the day. Therefore, alternative analytical function criteria should be subsequently proposed for vertical wind speed characterization purposes. The authors are grateful for the financial support from the Spanish Ministry of the Economy and Competitiveness and the European Union under ENE2016-78214-C2-2-R, and from the Spanish Education, Culture and Sport Ministry under FPU16/042
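
    The first two steps of the methodology named above, z-normalization and a shape-based distance, can be illustrated with a small sketch. The abstract does not define its distance, so this assumes the cross-correlation-based form (one minus the maximum normalized cross-correlation over all shifts); the function names and the sample profiles are made up for illustration, and the paper itself used R rather than Python.

```python
import math

def z_normalize(profile):
    """Rescale a profile to zero mean and unit (population) standard deviation."""
    n = len(profile)
    mean = sum(profile) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in profile) / n)
    if std == 0.0:
        return [0.0] * n  # flat profile: no shape information
    return [(v - mean) / std for v in profile]

def shape_based_distance(a, b):
    """1 - max normalized cross-correlation over all shifts (range 0 to 2)."""
    norm = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    if norm == 0.0:
        return 1.0  # convention of this sketch when a profile is flat
    n, m = len(a), len(b)
    best = -float("inf")
    for shift in range(-(m - 1), n):
        # Overlapping part of a and b when b is shifted by `shift` samples.
        cc = sum(a[i] * b[i - shift]
                 for i in range(max(0, shift), min(n, m + shift)))
        best = max(best, cc)
    return 1.0 - best / norm

# Hypothetical wind speeds (m/s) at four heights; values are illustrative only.
low = z_normalize([4.0, 5.0, 6.0, 7.0])
high = z_normalize([8.0, 10.0, 12.0, 14.0])     # same shape, larger magnitude
inverted = z_normalize([7.0, 6.0, 5.0, 4.0])    # speed decreasing with height

d_same = shape_based_distance(low, high)
d_inverted = shape_based_distance(low, inverted)
```

    After z-normalization the first two profiles coincide, so their distance is essentially zero, while the inverted profile stays clearly apart. The pairwise distances computed this way would then feed the Ward hierarchical clustering step, which is omitted here.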

    Personalized Fuzzy Text Search Using Interest Prediction and Word Vectorization

    In this paper we study the personalized text search problem. The keyword-based search method in conventional algorithms has a low efficiency in understanding users' intentions, since the semantic meaning, user profile, and user interests are not always considered. Firstly, we propose a novel text search algorithm using an inverse filtering mechanism that is very efficient for label-based item search. Secondly, we adopt a Bayesian network to implement user interest prediction for an improved personalized search. Given the user input, it searches for related items using keyword information and the predicted user interest. Thirdly, word vectorization is used to discover potential targets according to their semantic meaning. Experimental results show that the proposed search engine has improved efficiency and accuracy, and that it can operate on embedded devices with very limited computational resources.

    Unsupervised segmentation of natural images based on the adaptive integration of colour-texture descriptors
