648 research outputs found

    Enhancing Content-And-Structure Information Retrieval using a Native XML Database

    Three approaches to content-and-structure XML retrieval are analysed in this paper: first by using Zettair, a full-text information retrieval system; second by using eXist, a native XML database, and third by using a hybrid XML retrieval system that uses eXist to produce the final answers from likely relevant articles retrieved by Zettair. INEX 2003 content-and-structure topics can be classified in two categories: the first retrieving full articles as final answers, and the second retrieving more specific elements within articles as final answers. We show that for both topic categories our initial hybrid system improves the retrieval effectiveness of a native XML database. For ranking the final answer elements, we propose and evaluate a novel retrieval model that utilises the structural relationships between the answer elements of a native XML database and retrieves Coherent Retrieval Elements. The final results of our experiments show that when the XML retrieval task focusses on highly relevant elements our hybrid XML retrieval system with the Coherent Retrieval Elements module is 1.8 times more effective than Zettair and 3 times more effective than eXist, and yields an effective content-and-structure XML retrieval
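
    The hybrid pipeline described in this abstract can be pictured as two stages: a full-text engine first shortlists likely relevant articles, and a structure-aware step then extracts answer elements only from that shortlist. The following is a minimal sketch under stated assumptions, not the authors' system: it calls no Zettair or eXist API, scoring is plain term counting, element selection uses Python's standard `xml.etree.ElementTree`, and every name in it (`shortlist_articles`, `extract_elements`, `hybrid_search`, the sample articles) is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy collection standing in for the article collection.
ARTICLES = {
    "a1": "<article><sec><p>XML retrieval with native databases</p></sec></article>",
    "a2": "<article><sec><p>Cooking pasta</p></sec>"
          "<sec><p>XML is unrelated here</p></sec></article>",
    "a3": "<article><sec><p>Gardening tips</p></sec></article>",
}


def term_score(text, terms):
    """Count how often the query terms occur in the text (assumed toy scoring)."""
    words = text.lower().split()
    return sum(words.count(term) for term in terms)


def shortlist_articles(articles, terms, k):
    """Stage 1 (full-text stand-in): rank whole articles, keep the top k matches."""
    scored = []
    for doc_id, xml in articles.items():
        text = " ".join(ET.fromstring(xml).itertext())
        score = term_score(text, terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def extract_elements(articles, doc_ids, tag, terms):
    """Stage 2 (XML database stand-in): rank matching elements inside the shortlist."""
    hits = []
    for doc_id in doc_ids:
        root = ET.fromstring(articles[doc_id])
        for elem in root.iter(tag):
            text = " ".join(elem.itertext())
            score = term_score(text, terms)
            if score > 0:
                hits.append((score, doc_id, text))
    hits.sort(key=lambda hit: (-hit[0], hit[1], hit[2]))
    return [(doc_id, text) for _, doc_id, text in hits]


def hybrid_search(articles, terms, tag="sec", k=2):
    """Run stage 1, then stage 2 restricted to the shortlisted articles."""
    terms = [term.lower() for term in terms]
    shortlisted = shortlist_articles(articles, terms, k)
    return extract_elements(articles, shortlisted, tag, terms)
```

    Calling `hybrid_search(ARTICLES, ["XML", "retrieval"])` returns section-level answers drawn only from the shortlisted articles. The sketch does not model the Coherent Retrieval Elements ranking that the abstract proposes.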

    A Field Guide to Genetic Programming

    xiv, 233 p. : ill. ; 23 cm. Electronic book. A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The authors [...]
    Chapters: Introduction -- Representation, initialisation and operators in Tree-based GP -- Getting ready to run genetic programming -- Example genetic programming run -- Alternative initialisations and operators in Tree-based GP -- Modular, grammatical and developmental Tree-based GP -- Linear and graph genetic programming -- Probabilistic genetic programming -- Multi-objective genetic programming -- Fast and distributed genetic programming -- GP theory and its applications -- Applications -- Troubleshooting GP -- Conclusions.
    Contents:
    1 Introduction: 1.1 Genetic Programming in a Nutshell; 1.2 Getting Started; 1.3 Prerequisites; 1.4 Overview of this Field Guide
    I Basics
    2 Representation, Initialisation and Operators in Tree-based GP: 2.1 Representation; 2.2 Initialising the Population; 2.3 Selection; 2.4 Recombination and Mutation
    3 Getting Ready to Run Genetic Programming: 3.1 Step 1: Terminal Set; 3.2 Step 2: Function Set (3.2.1 Closure; 3.2.2 Sufficiency; 3.2.3 Evolving Structures other than Programs); 3.3 Step 3: Fitness Function; 3.4 Step 4: GP Parameters; 3.5 Step 5: Termination and solution designation
    4 Example Genetic Programming Run: 4.1 Preparatory Steps; 4.2 Step-by-Step Sample Run (4.2.1 Initialisation; 4.2.2 Fitness Evaluation; 4.2.3 Selection, Crossover and Mutation; 4.2.4 Termination and Solution Designation)
    II Advanced Genetic Programming
    5 Alternative Initialisations and Operators in Tree-based GP: 5.1 Constructing the Initial Population (5.1.1 Uniform Initialisation; 5.1.2 Initialisation may Affect Bloat; 5.1.3 Seeding); 5.2 GP Mutation (5.2.1 Is Mutation Necessary?; 5.2.2 Mutation Cookbook); 5.3 GP Crossover; 5.4 Other Techniques
    6 Modular, Grammatical and Developmental Tree-based GP: 6.1 Evolving Modular and Hierarchical Structures (6.1.1 Automatically Defined Functions; 6.1.2 Program Architecture and Architecture-Altering); 6.2 Constraining Structures (6.2.1 Enforcing Particular Structures; 6.2.2 Strongly Typed GP; 6.2.3 Grammar-based Constraints; 6.2.4 Constraints and Bias); 6.3 Developmental Genetic Programming; 6.4 Strongly Typed Autoconstructive GP with PushGP
    7 Linear and Graph Genetic Programming: 7.1 Linear Genetic Programming (7.1.1 Motivations; 7.1.2 Linear GP Representations; 7.1.3 Linear GP Operators); 7.2 Graph-Based Genetic Programming (7.2.1 Parallel Distributed GP (PDGP); 7.2.2 PADO; 7.2.3 Cartesian GP; 7.2.4 Evolving Parallel Programs using Indirect Encodings)
    8 Probabilistic Genetic Programming: 8.1 Estimation of Distribution Algorithms; 8.2 Pure EDA GP; 8.3 Mixing Grammars and Probabilities
    9 Multi-objective Genetic Programming: 9.1 Combining Multiple Objectives into a Scalar Fitness Function; 9.2 Keeping the Objectives Separate (9.2.1 Multi-objective Bloat and Complexity Control; 9.2.2 Other Objectives; 9.2.3 Non-Pareto Criteria); 9.3 Multiple Objectives via Dynamic and Staged Fitness Functions; 9.4 Multi-objective Optimisation via Operator Bias
    10 Fast and Distributed Genetic Programming: 10.1 Reducing Fitness Evaluations/Increasing their Effectiveness; 10.2 Reducing Cost of Fitness with Caches; 10.3 Parallel and Distributed GP are Not Equivalent; 10.4 Running GP on Parallel Hardware (10.4.1 Master–slave GP; 10.4.2 GP Running on GPUs; 10.4.3 GP on FPGAs; 10.4.4 Sub-machine-code GP); 10.5 Geographically Distributed GP
    11 GP Theory and its Applications: 11.1 Mathematical Models; 11.2 Search Spaces; 11.3 Bloat (11.3.1 Bloat in Theory; 11.3.2 Bloat Control in Practice)
    III Practical Genetic Programming
    12 Applications: 12.1 Where GP has Done Well; 12.2 Curve Fitting, Data Modelling and Symbolic Regression; 12.3 Human Competitive Results – the Humies; 12.4 Image and Signal Processing; 12.5 Financial Trading, Time Series, and Economic Modelling; 12.6 Industrial Process Control; 12.7 Medicine, Biology and Bioinformatics; 12.8 GP to Create Searchers and Solvers – Hyper-heuristics; 12.9 Entertainment and Computer Games; 12.10 The Arts; 12.11 Compression
    13 Troubleshooting GP: 13.1 Is there a Bug in the Code?; 13.2 Can you Trust your Results?; 13.3 There are No Silver Bullets; 13.4 Small Changes can have Big Effects; 13.5 Big Changes can have No Effect; 13.6 Study your Populations; 13.7 Encourage Diversity; 13.8 Embrace Approximation; 13.9 Control Bloat; 13.10 Checkpoint Results; 13.11 Report Well; 13.12 Convince your Customers
    14 Conclusions
    IV Tricks of the Trade
    A Resources: A.1 Key Books; A.2 Key Journals; A.3 Key International Meetings; A.4 GP Implementations; A.5 On-Line Resources
    B TinyGP: B.1 Overview of TinyGP; B.2 Input Data Files for TinyGP; B.3 Source Code; B.4 Compiling and Running TinyGP
    Bibliography
    Index
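
    The evolutionary loop summarised in this entry (a random initial population, fitness-based selection, crossover and mutation) can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions and is not the book's TinyGP: the function and terminal sets, tournament size, operator rates, the node limit `MAX_NODES` and all function names are choices made here for the example.

```python
import operator
import random

FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0, 2.0]
MAX_NODES = 50  # assumed size cap, a crude form of bloat control


def random_tree(rng, depth):
    """Grow a random expression: a terminal, or a tuple (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Evaluate an expression tree for a given value of the variable x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def fitness(tree, cases):
    """Sum of absolute errors over the fitness cases (lower is better)."""
    return sum(abs(evaluate(tree, x) - y) for x, y in cases)


def subtrees(tree, path=()):
    """Yield (path, subtree) for every node of the tree."""
    yield path, tree
    if isinstance(tree, tuple):
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))


def size(tree):
    return sum(1 for _ in subtrees(tree))


def replace_at(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace_at(tree[path[0]], path[1:], new)
    return tuple(parts)


def crossover(rng, a, b):
    """Subtree crossover: put a random subtree of b at a random point of a."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace_at(a, path, donor)


def mutate(rng, tree):
    """Subtree mutation: replace a random node with a fresh random subtree."""
    path, _ = rng.choice(list(subtrees(tree)))
    return replace_at(tree, path, random_tree(rng, 2))


def tournament(rng, population, cases, k=3):
    return min(rng.sample(population, k), key=lambda t: fitness(t, cases))


def evolve(cases, generations=20, pop_size=60, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    best = min(population, key=lambda t: fitness(t, cases))
    for _ in range(generations):
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            parent = tournament(rng, population, cases)
            if rng.random() < 0.9:
                child = crossover(rng, parent, tournament(rng, population, cases))
            else:
                child = mutate(rng, parent)
            # Oversized children are discarded in favour of their parent.
            children.append(child if size(child) <= MAX_NODES else parent)
        population = children
        best = min(population, key=lambda t: fitness(t, cases))
    return best
```

    For example, with `cases = [(x, x * x + x + 1.0) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]`, `evolve(cases)` returns the best expression tree found. Because of elitism its error is never worse than that of the best initial individual, but nothing guarantees that an exact solution is found.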

    Improving the geospatial consistency of digital libraries metadata

    Consistency is an essential aspect of the quality of metadata. Inconsistent metadata records are harmful: given a themed query, the set of retrieved metadata records would contain descriptions of unrelated or irrelevant resources, and may not even contain some resources considered obvious. This is even worse when the description of the location is inconsistent. Inconsistent spatial descriptions may yield invisible or hidden geographical resources that cannot be retrieved by means of spatially themed queries. Therefore, ensuring spatial consistency should be a primary goal when reusing, sharing and developing georeferenced digital collections. We present a methodology able to detect geospatial inconsistencies in metadata collections based on the combination of spatial ranking, reverse geocoding, geographic knowledge organization systems and information-retrieval techniques. This methodology has been applied to a collection of metadata records describing maps and atlases belonging to the Library of Congress. The proposed approach was able to automatically identify inconsistent metadata records (870 out of 10,575) and propose fixes to most of them (91.5%). These results support the ability of the proposed methodology to assess the impact of spatial inconsistency on the retrievability and visibility of metadata records and to improve their spatial consistency
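
    One simple way to picture a geospatial consistency check is to compare the bounding box stored in a record with a gazetteer footprint for the place the record names. The sketch below is only an illustration under stated assumptions and is far simpler than the methodology in the abstract (no spatial ranking, no reverse geocoding): the `BBox` type, the `check_record` function, the record layout and the rough footprint values are all hypothetical, and boxes crossing the antimeridian are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in decimal degrees."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other):
        """True when the two boxes share at least one point."""
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


# Hypothetical gazetteer with rough, illustrative footprints.
GAZETTEER = {
    "Iceland": BBox(-25.0, 63.0, -13.0, 67.0),
    "Spain": BBox(-10.0, 36.0, 4.5, 44.0),
}


def check_record(record, gazetteer):
    """Return (status, proposed_fix) for one metadata record.

    A record is flagged as inconsistent when its stored bounding box does
    not overlap the gazetteer footprint of the place it names; the footprint
    is then offered as a candidate fix.
    """
    footprint = gazetteer.get(record["place"])
    if footprint is None:
        return ("unknown place", None)
    if record["bbox"].intersects(footprint):
        return ("consistent", None)
    return ("inconsistent", footprint)
```

    A record naming "Iceland" but carrying coordinates in central Spain would be flagged as inconsistent, with the Iceland footprint proposed as the replacement box.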

    Context-Aware and Adaptable eLearning Systems

    The full text file attached to this record contains a copy of the thesis without the author's publications attached. The list of publications that are attached to the complete thesis can be found on pages 6-7 of the thesis. This thesis proposes solutions to some shortcomings of current eLearning architectures. The proposed DeLC architecture supports context-aware and adaptable provision of eLearning services and electronic content. The architecture is fully distributed and integrates service-oriented development with agent technology. Central to this architecture is the eLearning node, the unit of computation, which can have a purely service-oriented architecture, an agent-oriented architecture or a mixed architecture. Three eLearning nodes have been implemented in order to demonstrate the vitality of the DeLC concept. The Mobile eLearning Node uses a three-level communication network, called the InfoStations network, supporting mobile service provision. The services deployed on this node are aware of their context, gather the required learning material and adapt it to the learner's request. This is supported through a multi-layered hybrid (service- and agent-oriented) architecture whose kernel is implemented as middleware. A simulation environment has been developed for testing the middleware. In addition, the DeLC development approach is proposed. The second eLearning node has been implemented as the Education Portal. The architecture of this node is purely service-oriented and it adopts a client-server architecture. The Education Portal incorporates education services and system services, called engines. The electronic content is kept in Digital Libraries. Furthermore, to support content creators in DeLC, the Selbo2 environment was developed. The environment allows for creating new content, editing available content, as well as generating educational units out of preexisting standardized elements. 
For the last two years, the portal has been used in actual teaching at the Faculty of Mathematics and Informatics, University of Plovdiv. The third eLearning node, known as Agent Village, exhibits a purely agent-oriented architecture. The purpose of this node is to provide intelligent assistance to the services deployed on the Education Portal. Currently, two kinds of assistants are implemented in the node: eTesting Assistants and the Refactoring eLearning Environment (ReLE). A more complex architecture, known as the Education Cluster, is presented in this thesis as well. The Education Cluster incorporates two eLearning nodes, namely the Education Portal and the Agent Village. eLearning services and intelligent agents interact in the cluster

    WeaFQAs: A Software Product Line Approach for Customizing and Weaving Efficient Functional Quality Attributes

    Thesis defence date: 10 July 2018. Functional quality attributes (FQAs) are those that have a clear implication for the functionality of the system; that is, specific components must be incorporated into the system's software architecture to satisfy its quality attribute requirements. Examples of FQAs are security, usability and persistence. Modelling these attributes is a complex task. On the one hand, they are composed of many related features; for example, security comprises, among others, authentication, confidentiality and encryption. They also have dependencies among them; for example, security may be required by usability or persistence. On the other hand, they have many variability points: one application may require authentication and access control while another may need only encryption. In addition, their functionality is usually scattered, affecting several components of the system under development. The aim of this thesis is to define an aspect-oriented software product line that makes it possible to: (1) model the commonalities and variability of FQAs from the early stages of the development process, (2) manage the dependencies between FQAs, (3) decouple the modelling of FQAs from the architecture of the affected application, (4) configure FQAs according to the requirements of each application, also taking into account non-functional properties such as the performance and energy consumption of each solution, (5) incorporate the configurations into the application architecture automatically; and (6) manage the evolution of FQAs when requirements change in the future. As a result, WeaFQAs has been defined, a software process for managing FQAs that covers all the points mentioned. 
Two instantiations of WeaFQAs have been carried out and compared using different variability and modelling languages, together with tool support based on the CVL language

    Biodiversity and Biocollections: Problem of Correspondence

    This text is an English translation of those sections of the original paper in Russian in which collection-related issues are considered. The full citation of the original paper is as follows: Pavlinov I.Ya. 2016. [Bioraznoobrazie i biokollektsii: problema sootvetstvia]. In: Pavlinov I.Ya. (comp.). Aspects of Biodiversity. Archives of Zoological Museum of Lomonosov Moscow State University, Vol. 54, Pp. 733–786. The orientation of biology, as a natural science, towards the study and explanation of the similarities and differences between organisms led in the second half of the 20th century to the recognition of a specific subject area of biological explorations, viz. biodiversity (BD). One of the important general scientific prerequisites for this shift was the understanding that (at the level of ontology) the structured diversity of living nature is its fundamental property, equivalent to the subjection of some of its manifestations to certain laws. At the level of epistemology, this led to acknowledging that the “diversificationary” approach to the description of living beings is as justifiable as the previously dominant “unificationary” one. This general trend has led to a significant increase in the attention paid to BD. From a pragmatic perspective, its leitmotif was the conservation of BD as a renewable resource, while from a scientific perspective the leitmotif was studying BD as a specific natural phenomenon. These two points of view are united by recognition of the need for scientific substantiation of BD conservation strategy, which implies the need for a detailed study of BD itself. At the level of ontology, one of the key problems in the study of BD (leaving aside the question of its genesis) is the determination of its structure, which is interpreted as a manifestation of the structure of the Earth’s biota itself. 
With this, it is acknowledged that the subject area of empirical explorations is not BD as a whole (“Umgebung”) but its particular manifestations (“Umwelts”). It is proposed herewith to recognize, within the latter: fragments of BD (especially taxa and ecosystems), hierarchical levels of BD (primarily within- and interorganismal ones), and aspects of BD (above all taxonomic and meronomic ones). Attention is drawn to a new interpretation of bioinformatics as a discipline that studies the information support of BD explorations. An important part of this support is biocollections. The scientific value of collections means that they make possible both the empirical inference and the testing (verification) of knowledge about BD. This makes biocollections, in their epistemological status, equivalent to experiments, and so makes studies of BD fully scientific. It is emphasized that the natural objects (naturalia) that are permanently kept in collections contain primary (objective) information about BD, while information retrieved from them in any way is secondary (subjective). A collection, as an information resource, serves as a research sample in the studies of BD. The collection pool, as the totality of all collection materials kept in repositories according to certain standards, can be treated as a general sample, and every single collection as a local sample. The main characteristic of a collection-as-sample is its representativeness; so the basic strategy for developing the collection pool is to maximize its representativeness as a means to ensure correspondence of the structure of the biocollection pool to that of BD itself. The most fundamental characteristic of a collection, as an information resource, is its scientific significance. 
The following three main groups of more particular characteristics are distinguished: (1) the “proper” characteristics of every collection are its meaningfulness, informativeness, reliability, adequacy, documentation, systematicity, volume, structure, uniqueness, stability and lability; (2) the “external” characteristics of a collection are resolution, usability and the ethical component; (3) the “service” characteristics of a collection are its museofication, storage system security, inclusion in metastructure and cost. In the contemporary world, development of the biocollection pool, as a specific resource for BD research, requires considerable organizational efforts, including work on its “information support” aimed at demonstrating the necessity of the existence of biocollections

    Der Lehrstuhl Datenbank- und Informationssysteme der Universität Rostock

    In 2014 the Chair of Database and Information Systems (Lehrstuhl Datenbank- und Informationssysteme, LS DBIS) at the University of Rostock celebrated its twentieth anniversary. For the anniversary event with former and current students, staff, colleagues and cooperation partners, a variety of material from 20 years was prepared. Drawing on that material, this article gives a retrospective of 20 years of research and teaching in the field of database and information systems, as well as an insight into and an outlook on current research work

    Performance Optimization Strategies for Transactional Memory Applications

    This thesis presents tools for Transactional Memory (TM) applications that cover multiple TM systems (software, hardware, and hybrid TM) and use information from all layers of the TM software stack. To that end, the thesis addresses a number of challenges in extracting static information, information about run-time behavior, and expert-level knowledge in order to develop these new methods and strategies for the optimization of TM applications