1,428 research outputs found

Adaptive firefly algorithm for hierarchical text clustering

Author: Mohammed Athraa Jasim
Publication date: 01/01/2016

Text clustering is used by search engines to increase recall and precision in information retrieval. Because search engines operate on Internet content that is constantly being updated, there is a need for a clustering algorithm that groups items automatically without prior knowledge of the collection. Existing clustering methods have problems in determining the optimal number of clusters and in producing compact clusters. In this research, an adaptive hierarchical text clustering algorithm based on the Firefly Algorithm is proposed. The proposed Adaptive Firefly Algorithm (AFA) consists of three components: document clustering, cluster refining, and cluster merging. The first component introduces the Weight-based Firefly Algorithm (WFA), which automatically identifies initial centers and their clusters for any given text collection. To refine the obtained clusters, a second algorithm, termed Weight-based Firefly Algorithm with Relocate (WFAR), is proposed. This approach allows a pre-assigned document to be relocated into a newly created cluster. The third component, Weight-based Firefly Algorithm with Relocate and Merging (WFARM), aims to reduce the number of produced clusters by merging non-pure clusters into the pure ones. Experiments were conducted to compare the proposed algorithms against seven existing methods. The percentage of success in obtaining the optimal number of clusters by AFA is 100%, with purity and F-measure of 83% higher than the benchmarked methods. As for the entropy measure, the AFA produced the lowest value (0.78) when compared to existing methods. The results indicate that the Adaptive Firefly Algorithm can produce compact clusters. This research contributes to the text mining domain, as hierarchical text clustering facilitates the indexing of documents and information retrieval processes.

Universiti Utara Malaysia: UUM eTheses
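
The firefly clustering abstract above reports purity and entropy, two standard external measures of cluster quality. The sketch below shows one common way to compute them from the true class labels inside each cluster. It is not the thesis's evaluation code: the function names, the base-2 logarithm, the size-weighted averaging, and the toy labels are all assumptions made for illustration, and some papers additionally normalize entropy by the logarithm of the number of classes.

```python
import math
from collections import Counter

def purity(clusters):
    """Fraction of documents that belong to the majority class of their cluster.

    clusters: list of lists, each inner list holds the true labels of one cluster.
    """
    total = sum(len(c) for c in clusters)
    majority = sum(Counter(c).most_common(1)[0][1] for c in clusters if c)
    return majority / total

def entropy(clusters):
    """Size-weighted average of the label entropy of each cluster (log base 2)."""
    total = sum(len(c) for c in clusters)
    result = 0.0
    for c in clusters:
        if not c:
            continue
        counts = Counter(c)
        h = -sum((n / len(c)) * math.log2(n / len(c)) for n in counts.values())
        result += (len(c) / total) * h
    return result

# Hypothetical toy clustering: one mixed cluster and one pure cluster.
example = [["sport", "sport", "sport", "politics"], ["politics", "politics"]]
example_purity = purity(example)    # 5 of 6 documents sit with their majority class
example_entropy = entropy(example)  # only the mixed cluster contributes
```

Higher purity and lower entropy both indicate clusters that are dominated by a single class, which is the sense in which the abstract calls the resulting clusters compact.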

Tour recommendation for groups

Author: Anagnostopoulos Aris
Atassi Reem
Becchetti Luca
Fazzone Adriano
Silvestri Fabrizio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Consider a group of people who are visiting a major touristic city, such as NY, Paris, or Rome. It is reasonable to assume that each member of the group has his or her own interests or preferences about places to visit, which in general may differ from those of other members. Still, people almost always want to hang out together and so the following question naturally arises: What is the best tour that the group could perform together in the city? This problem underpins several challenges, ranging from understanding people’s expected attitudes towards potential points of interest, to modeling and providing good and viable solutions. Formulating this problem is challenging because of multiple competing objectives. For example, making the entire group as happy as possible in general conflicts with the objective that no member becomes disappointed. In this paper, we address the algorithmic implications of the above problem, by providing various formulations that take into account the overall group as well as the individual satisfaction and the length of the tour. We then study the computational complexity of these formulations, we provide effective and efficient practical algorithms, and, finally, we evaluate them on datasets constructed from real city data.

Archivio della ricerca- Università di Roma La Sapienza
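
The tension described in the tour recommendation abstract, between making the whole group as happy as possible and ensuring no member is disappointed, can be illustrated with a tiny brute-force example. Everything here is hypothetical: the preference scores, the names, and the use of the number of stops as a stand-in for tour length. The paper's own formulations and algorithms are not reproduced.

```python
from itertools import combinations

# Hypothetical preference scores: prefs[person][point_of_interest]
prefs = {
    "ana": {"museum": 5, "park": 1, "market": 2},
    "ben": {"museum": 0, "park": 4, "market": 3},
    "cy": {"museum": 4, "park": 2, "market": 3},
}

def satisfaction(person, tour):
    """Satisfaction of one member: sum of their scores over the visited places."""
    return sum(prefs[person][poi] for poi in tour)

def best_tour(k, objective):
    """Exhaustively pick the k-stop tour that maximises the given group objective."""
    pois = sorted(next(iter(prefs.values())))
    return max(
        combinations(pois, k),
        key=lambda tour: objective([satisfaction(m, tour) for m in prefs]),
    )

utilitarian = best_tour(2, sum)  # maximise total group satisfaction
egalitarian = best_tour(2, min)  # maximise the least-satisfied member
```

With these scores the two objectives select different tours: the sum favours the market and the museum, while the minimum favours the museum and the park because it protects the member who dislikes the museum. Exhaustive search is only viable for toy inputs.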

Search based software engineering: Trends, techniques and applications

Author: Adamopoulos K.
Afzal W.
Aguilar
Alander J. T.
Alba E.
Amoui M.
Antoniol G.
Arcuri A.
Aversano L.
Bodhuin T.
Bouktif S.
Canfora G.
Chang C. K.
Chao C.
Chicano F.
Clark J. A.
Cortellessa V.
Cowan G. S.
Dolado J. J.
Doval D.
Dozier G.
El-Faki H K.
Erformat M.
Evett M. P.
Fatiregun D.
Feather M. S.
Feldt R.
Ferreira M.
Funes P.
Gross H.-G.
Harman M.
Hart J.
He P.
Hodjat B.
Jaeger M. C.
Jarillo G.
Jiang H.
Joshi A. M.
Katz G.
Khoshgoftaar T. M.
Kirsopp C.
Lefley M.
Li C.
Liu Y.
Mahanti P. K.
Mahdavi K.
Mancoridis S.
Mark Harman
Minohara T.
Mitchell B. S.
Monnier Y.
Nguyen C.
Pohlheim H.
Raiha O.
Ruhe G.
S. Afshin Mansouri
Sahraoui H. A.
Shan Y.
Shepperd M.
Shyang W.
Simons C. L.
Stephenson M.
Su S.
van Belle T.
Van Den Akker M.
Vivanco R.
Wang Z.
Wegener J.
Yoo S.
Yuanyuan Zhang
Zhang X.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/11/2012

© ACM, 2012. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version is available from the link below. In the past five years there has been a dramatic increase in work on Search-Based Software Engineering (SBSE), an approach to Software Engineering (SE) in which Search-Based Optimization (SBO) algorithms are used to address problems in SE. SBSE has been applied to problems throughout the SE lifecycle, from requirements and project planning to maintenance and reengineering. The approach is attractive because it offers a suite of adaptive automated and semiautomated solutions in situations typified by large complex problem spaces with multiple competing and conflicting objectives. This article provides a review and classification of literature on SBSE. The work identifies research trends and relationships between the techniques applied and the applications to which they have been applied and highlights gaps in the literature and avenues for further research. EPSRC and E

Crossref

UCL Discovery

Brunel University Research Archive

A resource aware distributed LSI algorithm for scalable information retrieval

Author: Liu Yang
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2011

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Latent Semantic Indexing (LSI) is one of the popular techniques in the information retrieval field. Unlike traditional information retrieval techniques, LSI is not based simply on keyword matching; it uses statistics and algebraic computations. Based on Singular Value Decomposition (SVD), the higher-dimensional matrix is converted to a lower-dimensional approximate matrix, from which noise can be filtered. The issues of synonymy and polysemy in the traditional techniques can also be overcome by investigating the terms related to the documents. However, LSI suffers from a scalability issue due to the computational complexity of SVD. This thesis presents a resource-aware distributed LSI algorithm, MR-LSI, which can solve the scalability issue using the Hadoop framework, based on the distributed computing model MapReduce. It also solves the overhead issue caused by the involved clustering algorithm. The evaluations indicate that MR-LSI can gain significant enhancement compared to the other strategies when processing large-scale document collections. One remarkable advantage of Hadoop is that it supports heterogeneous computing environments, which highlights the issue of unbalanced load among nodes. Therefore, a load balancing algorithm based on a genetic algorithm is proposed for balancing load in a static environment. The results show that it can improve the performance of a cluster according to heterogeneity levels. For dynamic Hadoop environments, a dynamic load balancing strategy with varying window size is proposed. The algorithm relies on data selection decisions and on modeling Hadoop parameters and working mechanisms. Employing an improved genetic algorithm to achieve an optimized scheduler, the algorithm enhances the performance of a cluster with certain heterogeneity levels.

Brunel University Research Archive

Multi-Quality Auto-Tuning by Contract Negotiation

Author: Götz Sebastian
Publication date: 17/07/2013

A characteristic challenge of software development is the management of omnipresent change. Classically, this constant change is driven by customers changing their requirements. The wish to optimally leverage available resources opens another source of change: the software system's environment. Software is tailored to specific platforms (e.g., hardware architectures), resulting in many variants of the same software optimized for different environments. If the environment changes, a different variant is to be used, i.e., the system has to reconfigure to the variant optimized for the situation that has arisen. The automation of such adjustments is the subject of the self-adaptive systems research community. The basic principle is a control loop, as known from control theory. The system (and environment) is continuously monitored, the collected data is analyzed, and decisions for or against a reconfiguration are computed and realized. Central problems in this field, which are addressed in this thesis, are the management of interdependencies between non-functional properties of the system, the handling of multiple criteria subject to decision making, and scalability. In this thesis, a novel approach to self-adaptive software, Multi-Quality Auto-Tuning (MQuAT), is presented, which provides design and operation principles for software systems that automatically provide the best possible utility to the user while producing the least possible cost. For this purpose, a component model has been developed, enabling the software developer to design and implement self-optimizing software systems in a model-driven way. This component model allows for the specification of the structure as well as the behavior of the system and is capable of covering the runtime state of the system. The notion of quality contracts is utilized to cover the non-functional behavior and, especially, the dependencies between non-functional properties of the system.
At runtime the component model covers the runtime state of the system. This runtime model is used in combination with the contracts to generate optimization problems in different formalisms (Integer Linear Programming (ILP), Pseudo-Boolean Optimization (PBO), Ant Colony Optimization (ACO) and Multi-Objective Integer Linear Programming (MOILP)). Standard solvers are applied to derive solutions to these problems, which represent reconfiguration decisions if the identified configuration differs from the current one. Each approach is empirically evaluated in terms of its scalability, showing the feasibility of all approaches except ACO, the superiority of ILP over PBO, and the limits of all approaches: 100 component types for ILP, 30 for PBO, 10 for ACO and 30 for 2-objective MOILP. In the presence of more than two objective functions, the MOILP approach is shown to be infeasible.

Technische Universität Dresden: Qucosa

A New Manufacturing Service Selection and Composition Method Using Improved Flower Pollination Algorithm

Author: Dejian Yu
Shuai Zhang
Wenyu Zhang
Yangbing Xu
Yushu Yang
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2016

With an increasing number of manufacturing services, selecting and composing these services has become a challenging problem. It can be regarded as a multiobjective optimization problem that involves a variety of conflicting quality of service (QoS) attributes. In this study, a multiobjective optimization model of manufacturing service composition is presented that is based on QoS and an environmental index. Next, the skyline operator is applied to reduce the solution space. Then, a new method called improved Flower Pollination Algorithm (FPA) is proposed for solving the problem of manufacturing service selection and composition. The improved FPA enhances the performance of basic FPA by combining the latter with the crossover and mutation operators of the Differential Evolution (DE) algorithm. Finally, a case study is conducted to compare the proposed method with other evolutionary algorithms, including the Genetic Algorithm, DE, basic FPA, and extended FPA. The experimental results reveal that the proposed method performs best at solving the problem of manufacturing service selection and composition.

Crossref

Directory of Open Access Journals
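
The flower pollination abstract above says the improved FPA borrows the mutation and crossover operators of Differential Evolution but does not say which DE variant is used. As a minimal sketch, assuming the classic DE/rand/1 mutation and binomial crossover, the two operators look like this. The function names, the scale factor `f`, the crossover rate `cr`, and the sample vectors are illustrative assumptions, not details from the paper.

```python
import random

def de_mutation(a, b, c, f=0.5):
    """DE/rand/1 mutation: base vector a plus the scaled difference of b and c."""
    return [ai + f * (bi - ci) for ai, bi, ci in zip(a, b, c)]

def binomial_crossover(target, mutant, cr, rng):
    """Take each component from the mutant with probability cr.

    One randomly chosen position always comes from the mutant, so the trial
    vector is guaranteed to inherit at least one mutant component.
    """
    j_rand = rng.randrange(len(target))
    return [
        m if (rng.random() < cr or j == j_rand) else t
        for j, (t, m) in enumerate(zip(target, mutant))
    ]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
target = [1.0, 2.0, 3.0]
mutant = de_mutation([0.0, 0.0, 0.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0], f=0.5)
trial = binomial_crossover(target, mutant, cr=0.9, rng=rng)
```

In a full optimizer the trial vector would replace the target only if it scores at least as well on the objective; that selection step is omitted here.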

The multiple pheromone Ant clustering algorithm

Author: Chircop Jan

Ant Colony Optimisation algorithms mimic the way ants use pheromones for marking paths to important locations. Pheromone traces are followed and reinforced by other ants, but also evaporate over time. As a consequence, optimal paths attract more pheromone, whilst the less useful paths fade away. In the Multiple Pheromone Ant Clustering Algorithm (MPACA), ants detect features of objects represented as nodes within graph space. Each node has one or more ants assigned to each feature. Ants attempt to locate nodes with matching feature values, depositing pheromone traces on the way. This use of multiple pheromone values is a key innovation. Ants record other ant encounters, keeping a record of the features and colony membership of ants. The recorded values determine when ants should combine their features to look for conjunctions and whether they should merge into colonies. This ability to detect and deposit pheromone representative of feature combinations, and the resulting colony formation, renders the algorithm a powerful clustering tool. The MPACA operates as follows: (i) initially each node has ants assigned to each feature; (ii) ants roam the graph space searching for nodes with matching features; (iii) when departing matching nodes, ants deposit pheromones to inform other ants that the path goes to a node with the associated feature values; (iv) ant feature encounters are counted each time an ant arrives at a node; (v) if the feature encounters exceed a threshold value, feature combination occurs; (vi) a similar mechanism is used for colony merging. The model varies from traditional ACO in that: (i) a modified pheromone-driven movement mechanism is used; (ii) ants learn feature combinations and deposit multiple pheromone scents accordingly; (iii) ants merge into colonies, the basis of cluster formation. The MPACA is evaluated over synthetic and real-world datasets and its performance compares favourably with alternative approaches.

Aston Publications Explorer
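
The deposit-and-evaporate dynamic described in the MPACA abstract, with a separate pheromone scent per feature, can be sketched as a small bookkeeping class. This is a generic illustration and not the MPACA implementation: the class name, the multiplicative evaporation rule, the rate, and the node and feature labels are all assumptions.

```python
from collections import defaultdict

class PheromoneMap:
    """Tracks one pheromone level per (edge, feature) pair."""

    def __init__(self, evaporation=0.1):
        self.evaporation = evaporation    # fraction lost at every time step
        self.levels = defaultdict(float)  # (edge, feature) -> pheromone amount

    def deposit(self, edge, feature, amount=1.0):
        """An ant leaving a matching node reinforces the scent for its feature."""
        self.levels[(edge, feature)] += amount

    def evaporate(self):
        """All scents decay, so paths that are not reinforced fade away."""
        for key in list(self.levels):
            self.levels[key] *= 1.0 - self.evaporation

trail = PheromoneMap(evaporation=0.5)
trail.deposit(("n1", "n2"), "colour=red")
trail.deposit(("n1", "n2"), "colour=red")  # a second ant reinforces the path
trail.deposit(("n1", "n3"), "size=large")
trail.evaporate()
```

After one evaporation step the twice-reinforced path still carries twice the pheromone of the other one, which is the mechanism by which useful paths come to dominate.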

Graph partitioning algorithms for optimizing software deployment in mobile cloud computing

Author: De Turck Filip
Dhoedt Bart
Stevens Tim
Verbelen Tim
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

As cloud computing gains popularity, an important question is how to optimally deploy software applications on the infrastructure offered in the cloud. Especially in the context of mobile computing, where software components could be offloaded from the mobile device to the cloud, it is important to optimize the deployment by minimizing the network usage. Therefore, we have designed and evaluated graph partitioning algorithms that allocate software components to machines in the cloud while minimizing the required bandwidth. Contrary to the traditional graph partitioning problem, our algorithms are not restricted to balanced partitions and take into account infrastructure heterogeneity. To benchmark our algorithms, we evaluated their performance and found they produce 10 to 40% smaller graph cut sizes than METIS 4.0 for typical mobile computing scenarios.

CiteSeerX

Ghent University Academic Bibliography
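
The graph cut size used as the benchmark in the mobile cloud abstract above is the total weight of the edges whose endpoints are placed on different machines, which here stands for the bandwidth that must cross the network. The sketch below computes it for two candidate placements. The component names, edge weights, and placements are hypothetical, and the partitioning algorithms from the paper are not reproduced.

```python
def cut_size(edges, placement):
    """Sum the weights of edges whose two components run on different machines."""
    return sum(w for (u, v), w in edges.items() if placement[u] != placement[v])

# Hypothetical component graph: edge weight = bandwidth needed between components.
edges = {
    ("ui", "logic"): 5,
    ("logic", "db"): 20,
    ("ui", "cache"): 2,
    ("cache", "db"): 8,
}

# Two candidate deployments across a mobile device and a cloud machine.
offload_most = {"ui": "device", "logic": "cloud", "db": "cloud", "cache": "cloud"}
offload_db_only = {"ui": "device", "logic": "device", "db": "cloud", "cache": "device"}

cut_a = cut_size(edges, offload_most)     # crossing edges: ui-logic, ui-cache
cut_b = cut_size(edges, offload_db_only)  # crossing edges: logic-db, cache-db
```

The first placement is unbalanced, with three of four components in the cloud, yet it has the smaller cut. That matches the abstract's point that dropping the balance restriction can reduce the required bandwidth.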

Métaheuristiques pour la résolution de problème de covoiturage régulier de grande taille et d'une extension

Author: GONCALVES Gilles
GUO Yuhan
Publication date: 01/01/2012

The spatial dispersion of housing and activities over recent decades has contributed strongly to longer home-to-work distances and travel times. The result is increased use of private cars, particularly within and around large urban areas. To reduce the impact of growing road traffic, car pooling services, in which users with the same destination group into crews to travel together, have been set up all over the world. We present here our work on the regular car pooling problem. In this thesis, the regular car pooling problem was modeled and several metaheuristics for solving it were implemented, tested, and compared. The thesis is organized as follows. First, we present the definition and description of the problem together with the associated mathematical model. Next, several metaheuristics for solving the problem are presented. There are four such approaches: a variable neighborhood local search algorithm, an ant colony based algorithm, a guided genetic algorithm, and a self-adaptive genetic multi-agent system. Experiments were carried out to demonstrate the effectiveness of our approaches. We then continue with the presentation and resolution of an extension of the occasional car pooling problem involving several destinations. Finally, a test and analysis platform for evaluating our approaches and a car pooling platform are presented in the appendix.
Nowadays, increased human mobility combined with heavy use of private cars increases the load on the environment and raises issues about quality of life. The extensive use of private cars leads to high levels of air pollution, parking problems, traffic congestion and low transfer velocity. In order to ease these shortcomings, the car pooling program, where sets of car owners having the same travel destination share their vehicles, has emerged all around the world. We present here our research on the long-term car pooling problem. In this thesis, the long-term car pooling problem is modeled and metaheuristics for solving the problem are investigated. The thesis is organized as follows. First, the definition and description of the problem as well as its mathematical model are introduced. Then, several metaheuristics to effectively and efficiently solve the problem are presented. These approaches include a Variable Neighborhood Search Algorithm, a Clustering Ant Colony Algorithm, a Guided Genetic Algorithm and a Multi-agent Self-adaptive Genetic Algorithm. Experiments have been conducted to demonstrate the effectiveness of these approaches in solving the long-term car pooling problem. Afterwards, we extend our research to a multi-destination daily car pooling problem, which is introduced in detail along with its resolution method. Finally, an algorithm test and analysis platform for evaluating the algorithms and a car pooling platform are presented in the appendix.

OpenGrey Repository

Development of a R package to facilitate the learning of clustering techniques

Author: Ruiz Sabajanes Eduardo
Publication date: 01/01/2023

This project explores the development of a tool, in the form of an R package, to ease the process of learning clustering techniques, how they work, and what their pros and cons are. The tool should provide implementations of several different clustering techniques, with explanations, so that students can become familiar with the characteristics of each algorithm by testing it against several different datasets while deepening their understanding through the explanations. Additionally, these explanations should adapt to the input data, making the tool suitable not only for self-regulated learning but for teaching too. Grado en Ingeniería Informátic

e_Buah - Biblioteca Digital de la Universidad de Alcalá