
    The Family of MapReduce and Large Scale Data Processing Systems

    In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data, which has called for a paradigm shift in computing architecture and large-scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables the easy development of scalable parallel applications that process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program, such as data distribution, scheduling, and fault tolerance. However, the original implementation of the MapReduce framework had some limitations, which many research efforts have tackled in follow-up work since its introduction. This article provides a comprehensive survey of a family of approaches and mechanisms for large-scale data processing that build on the original idea of the MapReduce framework and are currently gaining momentum in both the research and industrial communities. We also cover a set of systems that provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large-scale data processing systems that resemble some of the ideas of the MapReduce framework but target different purposes and application scenarios. Finally, we discuss some future research directions for implementing the next generation of MapReduce-like solutions.
    Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other authors.
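
    To make the programming model concrete, here is a minimal single-process sketch of the canonical word-count example in Python; the shuffle step that a real MapReduce framework performs across the cluster is simulated here with an in-memory dictionary:

        from collections import defaultdict

        def map_phase(document):
            # Mapper: emit a (word, 1) pair for every word in the input split.
            for word in document.split():
                yield word.lower(), 1

        def reduce_phase(word, counts):
            # Reducer: aggregate all counts emitted for one key.
            return word, sum(counts)

        def word_count(documents):
            # Shuffle: group intermediate pairs by key, as the framework
            # would do transparently across the cluster.
            grouped = defaultdict(list)
            for doc in documents:
                for word, count in map_phase(doc):
                    grouped[word].append(count)
            return dict(reduce_phase(w, c) for w, c in grouped.items())

        print(word_count(["the quick fox", "the lazy dog"]))
        # {'the': 2, 'quick': 1, 'fox': 1, 'lazy': 1, 'dog': 1}

    The appeal of the model is visible even in this toy version: the application author writes only map_phase and reduce_phase, while distribution, scheduling, and fault tolerance stay inside the framework.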

    Modern Trends in the Automatic Generation of Content for Video Games

    Attractive and realistic content has always played a crucial role in the penetration and popularity of digital games, virtual environments, and other multimedia applications. Procedural content generation enables the automated production of any type of game content, including not only landscapes and narratives but also game mechanics and even whole games. The article offers a comparative analysis of approaches to the automatic generation of content for video games proposed in the last five years. It suggests a new typology of the use of procedurally generated game content, comprising categories structured into three groups: content nature, generation process, and game dependence. Together with two other taxonomies, one of content type and the other of methods for content generation, this typology is used for comparing and discussing specific approaches to procedural content generation in three promising research directions based on applying personalization and adaptation, descriptive languages, and semantic specifications.
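
    As a flavour of the kind of technique such systems build on, the following is a hypothetical minimal sketch (not taken from the article) of one classic procedural method, midpoint displacement, generating a one-dimensional terrain heightmap:

        import random

        def midpoint_displacement(heights, roughness):
            # One refinement pass: insert a randomly displaced midpoint
            # between every pair of neighbouring samples.
            refined = []
            for a, b in zip(heights, heights[1:]):
                refined += [a, (a + b) / 2 + random.uniform(-roughness, roughness)]
            refined.append(heights[-1])
            return refined

        def generate_terrain(iterations=6, initial_roughness=1.0, decay=0.5):
            heights = [0.0, 0.0]              # flat endpoints to start from
            roughness = initial_roughness
            for _ in range(iterations):
                heights = midpoint_displacement(heights, roughness)
                roughness *= decay            # finer detail at finer scales
            return heights

        print(generate_terrain())             # 65 height samples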

    Declarative Rules for Annotated Expert Knowledge in Change Management

    In this paper, we use declarative and domain-specific languages for representing expert knowledge in the field of change management in organisational psychology. Expert rules obtained in practical case studies are represented as declarative rules in a deductive database. The expert rules are annotated with information describing their provenance and confidence. Additional provenance information for the whole rule base, or parts of it, can be given by ontologies. Deductive databases allow for declaratively defining the semantics of the expert knowledge with rules; the evaluation of the rules can be optimised, and the inference mechanisms can be changed, since they are specified in an abstract way. As the logical syntax of rules had been a problem in previous applications of deductive databases, we use specially designed domain-specific languages to make the rule syntax easier for non-programmers. The semantics of the whole knowledge base is declarative. The rules are written declaratively in datalogs, an extension of the well-known deductive database language Datalog, on the data level, and additional datalogs rules can configure the processing of the annotated rules and the ontologies.
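
    A hypothetical Python sketch of the core idea (the paper itself works in a Datalog extension, not Python, and all names below are illustrative): facts and rules carry provenance and confidence annotations, and the annotations are propagated during inference:

        # Facts annotated with (provenance, confidence).
        facts = {("resists_change", "team_a"): ("case_study_3", 0.9)}

        # needs_intervention(X) <- resists_change(X), annotated with the
        # expert it came from and a confidence value.
        rules = [("needs_intervention", "resists_change",
                  "expert_interview_7", 0.8)]

        def forward_chain(facts, rules):
            derived = dict(facts)
            for head, body, prov, conf in rules:
                for (pred, arg), (f_prov, f_conf) in list(derived.items()):
                    if pred == body:
                        # Combine provenance; propagate the weakest confidence.
                        derived[(head, arg)] = (prov + "+" + f_prov,
                                                min(conf, f_conf))
            return derived

        for atom, note in forward_chain(facts, rules).items():
            print(atom, note)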

    Data Models in Neuroinformatics

    Advancements in integrated neuroscience are often characterized by data-driven approaches to discovery; these progressions are the result of continuous efforts to develop integrated frameworks for investigating neuronal dynamics at increasing resolution and varying scales. Since insights from integrated neuronal models frequently rely on both experimental and computational approaches, simulation and data modeling play inimitable roles. Moreover, data sharing across the neuroscientific community has become an essential component of data-driven neuroscience, as is evident from the number and scale of ongoing national and multinational projects engaging scientists from diverse branches of knowledge. In this heterogeneous environment, the need to share neuroscientific data, and to utilize it across different simulation environments, drove the momentum for standardizing data models for neuronal morphologies, biophysical properties, and connectivity schemes. Here, I review existing data models in neuroinformatics, ranging from flat to hybrid object-hierarchical approaches, and suggest a framework with which these models can be linked to experimental data as well as to established records in existing databases. Linking neuronal models and experimental results with data on relevant articles, genes, proteins, diseases, etc., might open a new dimension for data-driven neuroscience.
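
    As a concrete example of a flat data model for morphologies, the sketch below parses the widely used SWC convention (one sample per line: id, structure type, x, y, z, radius, parent id) into a parent-indexed tree; the code is illustrative, not from the article:

        from dataclasses import dataclass

        @dataclass
        class Node:
            # One reconstruction sample in the SWC convention.
            id: int
            type: int        # 1 = soma, 2 = axon, 3 = basal dendrite, ...
            x: float
            y: float
            z: float
            radius: float
            parent: int      # -1 marks the root sample

        def load_swc(lines):
            # Parse a flat SWC morphology into a dict keyed by sample id.
            nodes = {}
            for line in lines:
                if line.startswith("#") or not line.strip():
                    continue
                i, t, x, y, z, r, p = line.split()
                nodes[int(i)] = Node(int(i), int(t), float(x), float(y),
                                     float(z), float(r), int(p))
            return nodes

        print(load_swc(["1 1 0.0 0.0 0.0 5.0 -1"])[1])

    Hybrid object-hierarchical models, by contrast, would layer richer objects (sections, channels, annotations) over such a flat record, which is what makes linking to external databases attractive.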

    Algorithm-aided Information Design: Hybrid Design approach on the edge of associative methodologies in AEC

    Master's dissertation in the European Master in Building Information Modelling.
    The last three decades have brought colossal progress to design methodologies in the common pursuit of a seamless fusion between the digital and physical worlds, augmented by the growth of computational power and network coverage. In this historically short period, two generations of methodologies and tools have emerged: the Additive generation and the parametric Associative generation of CAD. Currently, designers worldwide are engaged in new forms of design exploration. From this race, two prominent methodologies have developed from the Associative Design approach: Object-Oriented Design (OOD) and Algorithm-Aided Design (AAD). The primary research objective is to investigate, examine, and push the boundaries between OOD and AAD to determine a new design space, where the advantages of both design methods are fused into a new-generation methodology, called in the present study AID (Algorithm-aided Information Design).
    The study methodology is structured into two flows. In the first flow, existing CAD methodologies are investigated, a conceptual framework is extracted from a state-of-the-art analysis, and the analysed data is synthesized into the subject proposal. In the second flow, tools and workflows are elaborated and examined in practice to confirm the subject proposal. Accordingly, the research consists of a theoretical and a practical part. In the theoretical part, a literature review is conducted, and assumptions are made about the AID methodology, its tools, and its possible advantages and drawbacks. Next, case studies are performed following the sequential stages of digital design, through the lens of practical AID implementation. The case studies cover design aspects such as model and documentation generation, design automation, interoperability, manufacturing control, performance analysis, and optimization; a sketch of what such algorithm-aided generation can look like follows below. Ultimately, a set of test projects is developed with the AID methodology applied. After the practical part, the research returns to theory, where analytical information is gathered from the literature review, the conceptual framework, and the experimental practice reports. In summary, the study synthesizes the AID methodology as part of Hybrid Design, which enables the creative use of tools and the elaboration of agile design systems integrating the additive and associative methodologies of Digital Design. In general, the study is based on agile methods and cyclic research development, alternating between practice and theory, to achieve a comprehensive vision of the subject.
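
    A hypothetical minimal sketch of algorithm-aided generation of information-rich models (names and parameters are illustrative, not from the dissertation): geometry and the information attached to it are produced together from input parameters rather than drawn by hand:

        def column_grid(bays_x, bays_y, spacing, section="HEB300"):
            # Generate a structural column grid; each column carries both
            # its position (geometry) and its metadata (information).
            columns = []
            for i in range(bays_x + 1):
                for j in range(bays_y + 1):
                    columns.append({
                        "id": f"C{i}-{j}",
                        "position": (i * spacing, j * spacing),
                        "section": section,
                    })
            return columns

        for col in column_grid(2, 1, 6.0):
            print(col)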

    Workflow models for heterogeneous distributed systems

    The role of data in modern scientific workflows is becoming more and more crucial. The unprecedented amount of data available in the digital era, combined with recent advancements in Machine Learning and High-Performance Computing (HPC), has let computers surpass human performance in a wide range of fields, such as Computer Vision, Natural Language Processing, and Bioinformatics. However, a solid data management strategy is crucial for key aspects like performance optimisation, privacy preservation, and security. Most modern programming paradigms for Big Data analysis adhere to the principle of data locality: moving computation closer to the data to remove transfer-related overheads and risks. Still, there are scenarios in which it is worthwhile, or even unavoidable, to transfer data between different steps of a complex workflow. The contribution of this dissertation is twofold. First, it defines a novel methodology for distributed modular applications, allowing topology-aware scheduling and data management while separating business logic, data dependencies, parallel patterns, and execution environments. In addition, it introduces computational notebooks as a high-level and user-friendly interface to this new kind of workflow, aiming to flatten the learning curve and improve the adoption of the methodology. Each of these contributions is accompanied by a full-fledged, open-source implementation, which has been used for evaluation purposes and allows the interested reader to experience the related methodology first-hand. The validity of the proposed approaches has been demonstrated on five real scientific applications in the domains of Deep Learning, Bioinformatics, and Molecular Dynamics simulation, executing them on large-scale mixed cloud-HPC infrastructures.
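
    A hypothetical sketch of the separation the dissertation argues for (illustrative only, not the dissertation's actual implementation): each step declares its data dependencies and a target execution location, while the business logic stays a plain function that a topology-aware scheduler can dispatch:

        from dataclasses import dataclass, field

        @dataclass
        class Step:
            name: str
            func: callable                          # business logic only
            inputs: list = field(default_factory=list)
            location: str = "cloud"                 # e.g. "cloud" or "hpc"

        def run(steps):
            # Toy scheduler: resolve dependencies in order and pretend to
            # dispatch each step to its declared execution location.
            results = {}
            for s in steps:
                args = [results[i] for i in s.inputs]
                print(f"running {s.name} on {s.location}")
                results[s.name] = s.func(*args)
            return results

        pipeline = [
            Step("load", lambda: [1, 2, 3], location="cloud"),
            Step("train", lambda d: sum(d), inputs=["load"], location="hpc"),
        ]
        print(run(pipeline)["train"])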

    Meta-environment and executable meta-language using smalltalk: an experience report

    Object-oriented modelling languages such as EMOF are often used to specify domain-specific meta-models. However, these modelling languages lack the ability to describe behavior or operational semantics. Several approaches have used a subset of Java mixed with OCL as an executable meta-language. In this experience report, we show how we use Smalltalk as an executable meta-language in the context of the Moose reengineering environment. We present how we implemented EMOF and its behavioral aspects. Over the last decade, we validated this approach by incrementally building a meta-described reengineering environment. Such an approach bridges the gap between a code-oriented view and a meta-model-driven one. It avoids the creation of yet another language and reuses the infrastructure and run-time of the underlying implementation language. It offers a uniform way of letting developers focus on their tasks while at the same time allowing them to meta-describe their domain model. The advantage of our approach is that developers use the same tools and environment they use for their regular tasks. Still, the approach is not Smalltalk-specific but can be applied to languages offering an introspective API, such as Ruby, Python, CLOS, Java, and C#.
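
    The reflective idea can be sketched in Python, one of the introspective languages the authors name (this example is illustrative, not the paper's Smalltalk implementation): a minimal EMOF-like class description is derived from the live domain class itself, so no separate modelling language is needed:

        import inspect

        class Person:
            def __init__(self, name: str, age: int):
                self.name = name
                self.age = age

        def describe(cls):
            # Build a minimal class description from the live class,
            # using the language's own introspective API.
            sig = inspect.signature(cls.__init__)
            attrs = {n: p.annotation.__name__
                     for n, p in sig.parameters.items() if n != "self"}
            return {"class": cls.__name__, "attributes": attrs}

        print(describe(Person))
        # {'class': 'Person', 'attributes': {'name': 'str', 'age': 'int'}}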