11 research outputs found
Applying Genetic Algorithm In Query Improvement Problem
This paper presents an adaptive method that uses a genetic algorithm to modify users' queries based on relevance judgments. The algorithm was adapted to three well-known document collections (CISI, NLP and CACM). The method is shown to be applicable to large text collections, where the genetic modification presents more relevant documents to users. The algorithm demonstrates the effect of applying a GA to improve the effectiveness of queries in IR systems. Further studies are planned to tune the system parameters and improve its effectiveness. The goal is to retrieve the most relevant documents, with as few non-relevant documents as possible, for a user's query in an information retrieval system using a genetic algorithm.
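The query-modification loop this abstract describes can be sketched as a small genetic algorithm over binary term vectors scored against relevance judgments. Everything below is illustrative: the toy corpus, the precision-based fitness, and the GA parameters are assumptions, not the paper's actual collections or settings.

```python
import random

# Toy corpus: document id -> set of index terms (illustrative stand-ins,
# not the CISI/NLP/CACM data used in the paper).
DOCS = {
    1: {"genetic", "algorithm", "search"},
    2: {"query", "retrieval", "relevance"},
    3: {"genetic", "query", "retrieval"},
    4: {"database", "index"},
}
RELEVANT = {2, 3}  # assumed user relevance judgments
VOCAB = sorted({t for terms in DOCS.values() for t in terms})

def retrieve(query_terms, k=2):
    """Rank documents by term overlap with the query; return the top-k ids."""
    ranked = sorted(DOCS, key=lambda d: -len(DOCS[d] & query_terms))
    return set(ranked[:k])

def fitness(individual):
    """Precision of the retrieved set for the query encoded by `individual`."""
    terms = {t for t, bit in zip(VOCAB, individual) if bit}
    hits = retrieve(terms)
    return len(hits & RELEVANT) / len(hits)

def evolve(pop_size=20, generations=30, mutation_rate=0.1):
    """Truncation selection + one-point crossover + bit-flip mutation."""
    random.seed(0)
    pop = [[random.randint(0, 1) for _ in VOCAB] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(VOCAB))       # one-point crossover
            child = [1 - g if random.random() < mutation_rate else g
                     for g in a[:cut] + b[cut:]]         # bit-flip mutation
            children.append(child)
        pop = survivors + children
    best = max(pop, key=fitness)
    return {t for t, bit in zip(VOCAB, best) if bit}, fitness(best)
```

Here the GA searches for the term subset whose retrieved set has the highest precision; a real system would use a weighted vector-space query and a larger collection.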
A review on the application of evolutionary computation to information retrieval
This contribution reviews the different proposals found in the specialized literature for applying evolutionary computation to the field of information retrieval. To do so, the different kinds of IR problems that have been solved by evolutionary algorithms are analyzed. Several existing approaches to these problems are described in detail, and the results obtained are critically evaluated in order to give the reader a clear view of the topic. Supported by CICYT under project TIC2002-03276 and by the University of Granada under project "Mejora de Metaheurísticas mediante Hibridación y sus Aplicaciones".
State-of-the-art review on relevance of genetic algorithm to internet web search
People use search engines to find the information they desire, with the aim that their information needs will be met. Information retrieval (IR) is a field concerned primarily with searching for and retrieving information in documents, as well as searching search engines, online databases, and the Internet. Genetic algorithms (GAs) are robust and efficient optimization methods for a wide range of search problems, motivated by Darwin's principles of natural selection and survival of the fittest. This paper describes the components of information retrieval systems (IRS). It then examines how GAs can be applied in the field of IR, and specifically the relevance of genetic algorithms to Internet web search. Finally, the proposals surveyed show that GAs have been applied to diverse problem areas of Internet web search.
A Taxonomy of Information Retrieval Models and Tools
Information retrieval is attracting significant attention due to the exponential growth of the amount of information available in digital format. The proliferation of information retrieval objects, including algorithms, methods, technologies, and tools, makes it difficult to assess their capabilities and features and to understand the relationships that exist among them. In addition, the terminology is often confusing and misleading, as different terms are used to denote the same, or similar, tasks.
This paper proposes a taxonomy of information retrieval models and tools and provides precise definitions of the key terms. The taxonomy superimposes two views: a vertical taxonomy, which classifies IR models with respect to a set of basic features, and a horizontal taxonomy, which classifies IR systems and services with respect to the tasks they support.
The aim is to provide a framework for classifying existing information retrieval models and tools and a solid basis for assessing future developments in the field.
Local search: A guide for the information retrieval practitioner
There are a number of combinatorial optimisation problems in information retrieval for which local search methods are worthwhile. The purpose of this paper is to show how local search can be used to solve some well-known tasks in information retrieval (IR), to argue that previous research in the field is piecemeal, lacking in structure and methodologically flawed, and to suggest more rigorous ways of applying local search methods to IR problems. We provide a query-based taxonomy for analysing the use of local search in IR tasks, and an overview of issues such as fitness functions, statistical significance and test collections when conducting experiments on combinatorial optimisation problems. The paper gives a guide to the pitfalls and problems facing IR practitioners who wish to use local search in their research, and offers practical advice on the use of such methods. The query-based taxonomy is a novel structure that IR practitioners can use to examine the use of local search in IR.
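As a deliberately tiny illustration of the kind of IR task the paper has in mind, the sketch below applies greedy hill climbing, one classic local search method, to select query terms against relevance judgments. The corpus, the F1 fitness function, and the neighbourhood (single-term flips) are all assumptions for illustration, not taken from the paper.

```python
# Toy corpus and judgments — illustrative stand-ins for a real test collection.
DOCS = {
    1: {"local", "search", "optimisation"},
    2: {"query", "retrieval", "relevance"},
    3: {"local", "query", "retrieval"},
    4: {"database", "index"},
}
RELEVANT = {2, 3}
VOCAB = sorted({t for terms in DOCS.values() for t in terms})

def f1(query_terms):
    """F1 of the retrieved set: documents sharing at least one query term."""
    retrieved = {d for d, terms in DOCS.items() if terms & query_terms}
    tp = len(retrieved & RELEVANT)
    if tp == 0:
        return 0.0
    precision = tp / len(retrieved)
    recall = tp / len(RELEVANT)
    return 2 * precision * recall / (precision + recall)

def hill_climb(start=frozenset()):
    """Greedy hill climbing: flip any single term in or out of the query
    whenever it improves F1; stop at a local optimum."""
    current, best = set(start), f1(start)
    improved = True
    while improved:
        improved = False
        for term in VOCAB:
            neighbour = current ^ {term}        # flip one term in or out
            score = f1(neighbour)
            if score > best:
                current, best, improved = neighbour, score, True
    return current, best
```

The paper's point about rigour applies even here: the local optimum reached depends on the start state and the neighbourhood, which is why fitness functions and significance testing need careful treatment.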
A smart itsy bitsy spider for the Web
Artificial Intelligence Lab, Department of MIS, University of Arizona
As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent-agent approach to Web searching. In this experiment, we developed two personal Web spiders based on best-first search and genetic algorithm techniques, respectively. These personal spiders can dynamically take a user's selected starting homepages and search the Web for the most closely related homepages, based on links and keyword indexing. A graphical, dynamic, Java-based interface was developed and is available for Web access. A system architecture for implementing such an agent-based spider is presented, followed by detailed discussions of benchmark testing and user evaluation results. In benchmark testing, although the genetic algorithm spider did not outperform the best-first search spider, we found the results to be comparable and complementary. In user evaluation, the genetic algorithm spider obtained a significantly higher recall value than the best-first search spider; however, their precision values were not statistically different. The mutation process introduced in the genetic algorithm allows users to find other potentially relevant homepages that cannot be explored via a conventional local search process. In addition, we found the Java-based interface to be a necessary component in the design of a truly interactive and dynamic Web agent.
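The core loop of a best-first search spider can be sketched with a priority queue over outgoing links, always expanding the page whose keywords best match the query. The link graph, keyword sets, and scoring below are invented stand-ins for real Web pages, not the Arizona spider's actual implementation.

```python
import heapq

# Simulated link graph and page keywords (hypothetical stand-ins for the Web).
LINKS = {"start": ["a", "b"], "a": ["c"], "b": ["c", "d"], "c": [], "d": []}
KEYWORDS = {"start": {"ir"}, "a": {"ir", "search"}, "b": {"sports"},
            "c": {"ir", "search", "spider"}, "d": {"news"}}
QUERY = {"ir", "search", "spider"}

def score(page):
    """Keyword overlap with the query — the spider's heuristic merit."""
    return len(KEYWORDS[page] & QUERY)

def best_first_crawl(seed, budget=4):
    """Expand the most promising frontier page first, up to `budget` pages."""
    frontier = [(-score(seed), seed)]   # max-heap via negated scores
    visited, seen = [], {seed}
    while frontier and len(visited) < budget:
        _, page = heapq.heappop(frontier)
        visited.append(page)
        for nxt in LINKS[page]:
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (-score(nxt), nxt))
    return visited
```

Note how page "c" is expanded before "b" despite being discovered later: best-first search jumps to the highest-scoring frontier page, whereas the GA spider's mutation step (not shown) would also sample pages outside the link neighbourhood.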
A specification and discovery environment for reusing software components in the development of distributed software
Our work aims to develop an effective solution for the discovery and reuse of software components in existing, commonly used development environments. We propose an ontology for describing and discovering atomic software components. The description covers both the functional and the non-functional properties of the components, the latter expressed as QoS parameters. Our search process is based on a function that calculates the semantic distance between a component's interface signature and the signature of a given query, thus achieving an appropriate comparison. We also use the notion of "subsumption" to compare the inputs/outputs of the query with those of the components.
After selecting the appropriate components, the non-functional properties are used as a distinguishing factor to refine the search result. We propose an approach, based on the shared ontology, for discovering composite components when no atomic component is found. To integrate the resulting component into the project under development, we developed an integration ontology and two services, "input/output convertor" and "output Matching".
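The signature-matching idea, exact type match preferred and subsumption-compatible types ranked next, can be sketched as follows. The type hierarchy and the distance scoring are hypothetical illustrations, not the thesis' actual ontology or distance function.

```python
# Hypothetical type hierarchy (subsumption): type -> immediate supertype.
SUPERTYPE = {"int": "number", "float": "number", "number": "any", "string": "any"}

def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    while specific is not None:
        if specific == general:
            return True
        specific = SUPERTYPE.get(specific)
    return False

def signature_distance(query_sig, comp_sig):
    """Per-parameter score: 0 for an exact match, 1 when the component's
    type subsumes the query's (compatible), 2 otherwise. Illustrative only."""
    if len(query_sig) != len(comp_sig):
        return float("inf")             # arity mismatch: incomparable
    distance = 0
    for q, c in zip(query_sig, comp_sig):
        if q == c:
            continue
        elif subsumes(c, q):            # component accepts a more general type
            distance += 1
        else:
            distance += 2
    return distance

def rank_components(query_sig, components):
    """Order (name, signature) pairs by semantic distance to the query."""
    return sorted(components, key=lambda nc: signature_distance(query_sig, nc[1]))
```

In the full approach, this ranking over functional signatures would then be refined by the QoS parameters, which the sketch omits.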
Generation of artificial intelligence-based methods for the analysis of environmental data. Practical applications
Advisor: Sastre Merlín, Antonio.
In recent times, the great importance of data analysis for finding models and inferring new, relevant information has become evident. In the environmental sciences in particular, these analysis tasks are especially important because of the gradual environmental degradation our surroundings are suffering, which demands urgent and highly precise action. The research presented in this thesis is the result of integrating two well-known areas of knowledge, artificial intelligence and environmental sciences, with the aim of designing and developing analysis and model-inference methods that make it possible to explore new aspects of environmental problems from a set of observations. These problems usually exhibit great complexity, which in many cases limits the effectiveness of statistical inference techniques for extracting information or knowledge. The proposed methodology is intended as a useful complement to statistical studies. The thesis presents all the phases of the design and development of a Knowledge Discovery in Databases (KDD) system, implemented taking into account the particular characteristics of environmental data and sampling. Among its main contributions is a model-inference system that uses a machine learning procedure, specifically learning from examples. The system generates easily interpretable models, since the knowledge is represented as a set of if-then rules. Within this model-inference system, a genetic algorithm has been implemented as the method for searching for the best rule sets; it avoids the biased exploration of the space of possible solutions (models) exhibited by other search procedures.
In addition, as part of the developed KDD system, a tool has been implemented to support georeferenced field data collection; it stores the data, in real time, in a relational database in a format that allows the stored information to be processed later with a Geographic Information System. The set of tools developed is applied to an environmental problem: weed control in agricultural systems, one of the central lines of so-called precision agriculture, an area that, from both ecological and economic perspectives, seeks optimal management of the agrochemical products used in crop protection treatments. Specifically, the analysis presented in the thesis is aimed at obtaining, from a data set, rule-based models that explain, as a function of environmental parameters and within a single field, the presence of a greater quantity of weeds in some areas of the crop than in others. The knowledge contained in the extracted models provides useful information that can be expressed as a risk map to guide the precise application of herbicide only in the areas of the crop that require it, at a dose adjusted to each infestation situation. The data used to obtain the models come from several winter cereal plots located in the Comunidad de Madrid and in the province of Barcelona, and from two weed species (Avena sterilis L. and Lolium rigidum G.). The rule sets obtained with the proposed methodology have also been compared with the models generated, for the same data set, by commercial algorithms such as C&RT and C5.0; the result is an improvement in the quality of the models induced with the methods developed, that is, our models describe the original observations with greater accuracy and confidence.
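A genetic search over if-then rule sets, as the thesis describes, needs at minimum a rule encoding and a fitness function scored against field observations; the toy sketch below shows one such pairing. The attributes, thresholds, and data are entirely invented for illustration and are not the thesis' variables or models.

```python
# Toy observations: (soil_moisture, elevation) -> high weed density (bool).
# Purely illustrative values, not the thesis' field data.
DATA = [
    ((0.8, 10), True),
    ((0.7, 12), True),
    ((0.2, 40), False),
    ((0.3, 35), False),
]

def rule_predicts(rule, sample):
    """rule = (t_moisture, t_elevation):
    IF moisture >= t_moisture AND elevation <= t_elevation THEN high density."""
    t_moisture, t_elevation = rule
    moisture, elevation = sample
    return moisture >= t_moisture and elevation <= t_elevation

def fitness(rule):
    """Classification accuracy over the observations — the quantity a genetic
    algorithm would maximize while evolving the rule thresholds."""
    correct = sum(rule_predicts(rule, x) == y for x, y in DATA)
    return correct / len(DATA)
```

A GA individual would encode one or more such threshold tuples, with crossover and mutation perturbing the thresholds; rules that misclassify fewer observations survive selection.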
Inductive Query by Examples (IQBE): A Machine Learning Approach
Artificial Intelligence Lab, Department of MIS, University of Arizona
This paper presents an incremental, inductive learning approach to query-by-examples for information retrieval (IR) and database management systems (DBMS). After briefly reviewing conventional information retrieval techniques and the prevailing database query paradigms, we introduce the ID5R algorithm, previously developed by Utgoff, for "intelligent", system-supported query processing. We describe in detail how we adapted the ID5R algorithm for IR/DBMS applications, and we present two examples, one for IR applications and the other for DBMS applications, to demonstrate the feasibility of the approach. Using a larger test collection of about 1000 document records from the COMPEN CD-ROM computing literature database, and using recall as a performance measure, our experiment showed that the incremental ID5R performed significantly better than a batch inductive learning algorithm (called ID3) that we had developed earlier. Both algorithms, however, were robust and efficient in helping users develop abstract queries from examples. We believe this research sheds light on the feasibility and the novel characteristics of a new query paradigm, inductive query-by-examples (IQBE). Directions of our current research are summarized at the end of the paper.
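A drastically simplified, batch flavour of the query-by-examples idea is to intersect the attribute values shared by all positive example records; ID5R itself instead maintains a decision tree incrementally, which this sketch does not attempt. The records and attributes below are invented, not drawn from the COMPEN collection.

```python
# Toy document records with a relevance label (hypothetical data).
RECORDS = [
    ({"topic": "ir", "lang": "en", "year": "1995"}, True),
    ({"topic": "ir", "lang": "en", "year": "1994"}, True),
    ({"topic": "db", "lang": "en", "year": "1995"}, False),
    ({"topic": "ir", "lang": "fr", "year": "1995"}, False),
]

def induce_query(records):
    """Learn a conjunctive query: keep exactly the attribute=value pairs
    shared by all positive examples. A batch simplification, not ID5R."""
    positives = [rec for rec, label in records if label]
    query = dict(positives[0])
    for rec in positives[1:]:
        query = {k: v for k, v in query.items() if rec.get(k) == v}
    return query

def matches(query, record):
    """A record satisfies the query if it agrees on every queried attribute."""
    return all(record.get(k) == v for k, v in query.items())
```

From the four examples above, the induced query is {topic: ir, lang: en}: the year attribute drops out because the positives disagree on it. An incremental learner like ID5R would reach a comparable hypothesis while restructuring its tree one example at a time.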