4,144 research outputs found

    A new unsupervised feature selection method for text clustering based on genetic algorithms

    Nowadays a vast amount of textual information is collected and stored in various databases around the world, including the Internet as the largest database of all. This rapid growth of published text means that even the most avid reader cannot hope to keep up with all the reading in a field, and consequently nuggets of insight or new knowledge are at risk of languishing undiscovered in the literature. Text mining offers a solution to this problem by replacing or supplementing the human reader with automatic systems undeterred by the text explosion. It involves analyzing a large collection of documents to discover previously unknown information. Text clustering is one of the most important areas in text mining; it includes text preprocessing, dimension reduction by selecting some terms (features), and finally clustering using the selected terms. Feature selection appears to be the most important step in the process. Conventional unsupervised feature selection methods define a measure of the discriminating power of terms in order to select proper terms from the corpus. However, the evaluation of terms in groups has not yet been investigated in reported work. In this paper a new and robust unsupervised feature selection approach is proposed that evaluates terms in groups. In addition, a new Modified Term Variance measure is proposed for evaluating groups of terms. Furthermore, a genetic-based algorithm is designed and implemented for finding the most valuable groups of terms based on the new measure. These terms are then used to generate the final feature vector for the clustering process. In order to evaluate and justify our approach, the proposed method and a conventional term variance method are implemented and tested using the Reuters-21578 corpus collection. For a more accurate comparison, the methods were tested on three corpora; for each corpus the clustering task was run ten times and the results were averaged. The comparison of the two methods is very promising and shows that our method produces better average accuracy and F1-measure than the conventional term variance method.
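
    The conventional term variance baseline named in this abstract can be sketched as follows. This is a minimal illustration, assuming the common definition in which a term's score is the sum over documents of the squared deviation of its frequency from its mean frequency. The function names, the whitespace tokenizer and the toy documents are hypothetical, and the abstract does not specify the Modified Term Variance measure or the genetic search, so neither is shown.

```python
from collections import Counter


def term_variance(docs):
    """Score each term by sum over documents of (tf - mean tf)^2."""
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = set().union(*counts)
    n = len(counts)
    scores = {}
    for term in vocab:
        freqs = [c[term] for c in counts]  # Counter returns 0 for absent terms
        mean = sum(freqs) / n
        scores[term] = sum((f - mean) ** 2 for f in freqs)
    return scores


def select_top_terms(docs, k):
    """Return the k terms with the highest term variance."""
    scores = term_variance(docs)
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


# Hypothetical toy corpus.
docs = ["oil oil oil price", "price wheat", "price oil"]
top_terms = select_top_terms(docs, 2)
```

    In this toy corpus a term that occurs equally often in every document ("price") scores zero, which is why variance-based selection discards it as non-discriminating.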

    Modeling and control of complex dynamic systems: Applied mathematical aspects

    The concept of complex dynamic systems arises in many varieties, including the areas of energy generation, storage and distribution, ecosystems, gene regulation and health delivery, safety and security systems, telecommunications, transportation networks, and the rapidly emerging research topics seeking to understand and analyse them. Such systems are often concurrent and distributed, because they have to react to various kinds of events, signals, and conditions. They may be characterized by uncertainties, time delays, stochastic perturbations, hybrid dynamics, distributed dynamics, chaotic dynamics, and a large number of algebraic loops. This special issue provides a platform for researchers to report their recent results on various mathematical methods and techniques for modelling and control of complex dynamic systems and to identify critical issues and challenges for future investigation in this field. The special issue attracted a remarkable one hundred and eighteen submissions, of which twenty-eight were selected through a rigorous review procedure.

    Shifts in the suitable habitat available for brown trout (Salmo trutta L.) under short-term climate change scenarios

    The impact of climate change on the habitat suitability for large brown trout (Salmo trutta L.) was studied in a segment of the Cabriel River (Iberian Peninsula). The future flow and water temperature patterns were simulated at a daily time step with M5 model trees (NSE of 0.78 and 0.97 respectively) for two short-term scenarios (2011-2040) under the representative concentration pathways (RCP 4.5 and 8.5). An ensemble of five strongly regularized machine learning techniques (generalized additive models, multilayer perceptron ensembles, random forests, support vector machines and fuzzy rule-based systems) was used to model the microhabitat suitability (depth, velocity and substrate) during summertime and to evaluate several flows simulated with River2D©. The simulated flow rate and water temperature were combined with the microhabitat assessment to infer bivariate habitat duration curves (BHDCs) under historical conditions and climate change scenarios using either the weighted usable area (WUA) or the Boolean-based suitable area (SA). The forecasts for both scenarios jointly predicted a significant reduction in the flow rate and an increase in water temperature (mean rate of change of ca. −25% and +4% respectively). The five techniques converged on the modelled suitability and habitat preferences; large brown trout selected relatively high flow velocity, large depth and coarse substrate. However, the model developed with support vector machines presented a significantly trimmed output range (max.: 0.38), and thus its predictions were excluded from the WUA-based analyses. The BHDCs based on the WUA and the SA broadly matched, indicating an increase in the number of days with less suitable habitat available (WUA and SA) and/or with higher water temperature (trout will endure impoverished environmental conditions ca. 82% of the days). Finally, our results suggested the potential extirpation of the species from the study site during short time spans.
    The study has been partially funded by the IMPADAPT project (CGL2013-48424-C2-1-R) - Spanish MINECO (Ministerio de Economia y Competitividad) - and FEDER funds and by the Confederacion Hidrografica del Jucar (Spanish Ministry of Agriculture, Food and Environment). We are grateful to the colleagues who worked in the field and in the preliminary data analyses, especially Juan Diego Alcaraz-Henandez, David Argibay, Aina Hernandez and Marta Bargay. Thanks to Matthew J. Cashman for the academic review of English. Finally, the authors would also like to thank the Direccion General del Agua and INFRAECO for providing the trout data. The authors thank AEMET and UC for the data provided for this work (dataset Spain02).
    Muñoz Mas, R.; López Nicolás, AF.; Martinez-Capel, F.; Pulido-Velazquez, M. (2016). Shifts in the suitable habitat available for brown trout (Salmo trutta L.) under short-term climate change scenarios. Science of the Total Environment. 544:686-700. https://doi.org/10.1016/j.scitotenv.2015.11.147
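
    The two habitat indices named in this abstract can be illustrated with a short sketch. It assumes the usual definitions: the weighted usable area (WUA) sums each cell's area weighted by its suitability in [0, 1], while the Boolean-based suitable area (SA) sums the area of the cells classified as suitable. The 0.5 cut-off, the function names and the cell values are hypothetical and are not taken from the study.

```python
def weighted_usable_area(cells):
    """WUA: sum of cell area times cell suitability (suitability in [0, 1])."""
    return sum(area * suitability for area, suitability in cells)


def suitable_area(cells, threshold=0.5):
    """SA: total area of cells whose suitability reaches the threshold.

    Assumption: the threshold of 0.5 is illustrative, not the study's value.
    """
    return sum(area for area, suitability in cells if suitability >= threshold)


# Hypothetical cells as (area in m^2, suitability index).
cells = [(2.0, 0.5), (4.0, 0.25), (1.0, 1.0)]
wua = weighted_usable_area(cells)
sa = suitable_area(cells)
```

    A model whose output range is trimmed (for example, a maximum suitability of 0.38) lowers every weight in the WUA sum, which is consistent with the abstract's decision to leave such predictions out of the WUA-based analyses.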

    Application of artificial neural network in market segmentation: A review on recent trends

    Despite the significance of the Artificial Neural Network (ANN) algorithm to market segmentation, there is a need for a comprehensive literature review and a classification system to identify future trends in market segmentation research. The present work is the first identifiable academic literature review of the application of neural-network-based techniques to segmentation. Our study provides an academic database of literature for the period 2000-2010 and proposes a classification scheme for the articles. One thousand (1000) articles were identified, and around 100 relevant articles were subsequently reviewed and classified based on the major focus of each paper. The findings indicate that ANN-based applications receive the most research attention and that self-organizing-map-based applications rank second in use for segmentation. Commonly used models for market segmentation include data mining and intelligent systems. Our analysis furnishes a roadmap to guide future research and to aid the accretion and establishment of knowledge on the application of ANN-based techniques in market segmentation. The present work will thus contribute to both industry and academic research in business and marketing as a lasting source of knowledge on market segmentation and on the future trend of ANN application in segmentation.
    Comment: 24 pages, 7 figures, 3 tables

    Adaptive multimodal continuous ant colony optimization

    Seeking multiple optima simultaneously, which is the aim of multimodal optimization, has attracted increasing attention but remains challenging. Taking advantage of the ability of ant colony optimization algorithms to preserve high diversity, this paper extends ant colony optimization algorithms to deal with multimodal optimization. First, combined with current niching methods, an adaptive multimodal continuous ant colony optimization algorithm is introduced. In this algorithm, an adaptive parameter adjustment is developed, which takes the difference among niches into consideration. Second, to accelerate convergence, a differential evolution mutation operator is alternatively utilized to build base vectors for ants to construct new solutions. Then, to enhance exploitation, a local search scheme based on the Gaussian distribution is self-adaptively performed around the seeds of niches. Together, these components give the proposed algorithm a good balance between exploration and exploitation. Extensive experiments on 20 widely used benchmark multimodal functions are conducted to investigate the influence of each algorithmic component, and the results are compared with several state-of-the-art multimodal algorithms and with winners of competitions on multimodal optimization. These comparisons demonstrate the competitive efficiency and effectiveness of the proposed algorithm, especially in dealing with complex problems with high numbers of local optima.
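
    Two of the components mentioned in this abstract can be sketched in isolation. The abstract does not say which differential evolution variant is used or how the local search is adapted, so the sketch below assumes the classic DE/rand/1 mutation and a fixed-width Gaussian perturbation around a niche seed, with maximization as the goal. All names and parameter values (`f`, `sigma`, `trials`) are hypothetical, and the niching and ant colony parts are not shown.

```python
import random


def de_rand_1(x1, x2, x3, f=0.5):
    """DE/rand/1 mutation: base vector x1 plus a scaled difference of x2 and x3."""
    return [a + f * (b - c) for a, b, c in zip(x1, x2, x3)]


def gaussian_local_search(seed, objective, sigma=0.1, trials=20, rng=None):
    """Sample Gaussian perturbations around the best point found so far.

    Assumption: maximization, and a fixed sigma rather than a self-adaptive one.
    """
    rng = rng or random.Random()
    best, best_val = list(seed), objective(seed)
    for _ in range(trials):
        candidate = [x + rng.gauss(0.0, sigma) for x in best]
        val = objective(candidate)
        if val > best_val:  # keep the candidate only if it improves the objective
            best, best_val = candidate, val
    return best, best_val


# Hypothetical usage: one mutated base vector, then a local search on -(x^2).
base = de_rand_1([1.0, 2.0], [3.0, 5.0], [1.0, 1.0], f=0.5)
point, value = gaussian_local_search(
    [1.0], lambda x: -(x[0] ** 2), rng=random.Random(0)
)
```

    Because the local search accepts only improving candidates, the returned value is never worse than the value at the seed; how far it improves depends on `sigma` and the number of trials.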