3,939 research outputs found
A Clustering-Based Algorithm for Data Reduction
Finding an efficient data reduction method for large-scale
problems is an imperative task. In this paper, we propose a similarity-based self-constructing fuzzy clustering algorithm to do the sampling of instances for the classification task. Instances that are similar to each other are grouped into the same cluster. When all the instances have been fed in, a number of clusters are formed automatically. Then the statistical mean for each cluster will be regarded as representing all the instances covered in the cluster. This approach has two advantages. One is that it can be faster and uses less storage memory. The other is that the number of new representative instances need not be specified in advance by the user. Experiments on real-world datasets show that our method can run faster and obtain better reduction rate than other methods
Vertical wind profile characterization and identification of patterns based on a shape clustering algorithm
Wind power plants are becoming a generally accepted resource in the generation mix of many utilities. At the same time, the size and the power rating of individual wind turbines have increased considerably. Under these circumstances, the sector is increasingly demanding an accurate characterization of vertical wind speed profiles to estimate properly the incoming wind speed at the rotor swept area and, consequently, assess the potential for a wind power plant site. The present paper describes a shape-based clustering characterization and visualization of real vertical wind speed data. The proposed solution allows us to identify the most likely vertical wind speed patterns for a specific location based on real wind speed measurements. Moreover, this clustering approach also provides characterization and classification of such vertical wind profiles. This solution is highly suitable for a large amount of data collected by remote sensing equipment, where wind speed values at different heights within the rotor swept area are available for subsequent analysis. The methodology is based on z-normalization, shape-based distance metric solution and the Ward-hierarchical clustering method. Real vertical wind speed profile data corresponding to a Spanish wind power plant and collected by using a commercialWindcube equipment during several months are used to assess the proposed characterization and clustering process, involving more than 100000 wind speed data values. All analyses have been implemented using open-source R-software. From the results, at least four different vertical wind speed patterns are identified to characterize properly over 90% of the collected wind speed data along the day. Therefore, alternative analytical function criteria should be subsequently proposed for vertical wind speed characterization purposes.The authors are grateful for the financial support from the Spanish Ministry of the Economy and Competitiveness and the European Union —ENE2016-78214-C2-2-R—and the Spanish Education, Culture and Sport Ministry —FPU16/042
Personalized Fuzzy Text Search Using Interest Prediction and Word Vectorization
In this paper we study the personalized text search problem. The keyword
based search method in conventional algorithms has a low efficiency in
understanding users' intention since the semantic meaning, user profile, user
interests are not always considered. Firstly, we propose a novel text search
algorithm using a inverse filtering mechanism that is very efficient for label
based item search. Secondly, we adopt the Bayesian network to implement the
user interest prediction for an improved personalized search. According to user
input, it searches the related items using keyword information, predicted user
interest. Thirdly, the word vectorization is used to discover potential targets
according to the semantic meaning. Experimental results show that the proposed
search engine has an improved efficiency and accuracy and it can operate on
embedded devices with very limited computational resources
- …