2,835 research outputs found

    NL4Py: Agent-Based Modeling in Python with Parallelizable NetLogo Workspaces

    Full text link
    NL4Py is a NetLogo controller software for Python, for the rapid, parallel execution of NetLogo models. NL4Py provides both headless (no graphical user interface) and GUI NetLogo workspace control through Python. Spurred on by the increasing availability of open-source computation and machine learning libraries on the Python package index, there is an increasing demand for such rapid, parallel execution of agent-based models through Python. NetLogo, being the language of choice for a majority of agent-based modeling driven research projects, requires an integration to Python for researchers looking to perform statistical analyses of agent-based model output using these libraries. Unfortunately, until the recent introduction of PyNetLogo, and now NL4Py, such a controller was unavailable. This article provides a detailed introduction into the usage of NL4Py and explains its client-server software architecture, highlighting architectural differences to PyNetLogo. A step-by-step demonstration of global sensitivity analysis and parameter calibration of the Wolf Sheep Predation model is then performed through NL4Py. Finally, NL4Py's performance is benchmarked against PyNetLogo and its combination with IPyParallel, and shown to provide significant savings in execution time over both configurations

    SKIRT: hybrid parallelization of radiative transfer simulations

    Full text link
    We describe the design, implementation and performance of the new hybrid parallelization scheme in our Monte Carlo radiative transfer code SKIRT, which has been used extensively for modeling the continuum radiation of dusty astrophysical systems including late-type galaxies and dusty tori. The hybrid scheme combines distributed memory parallelization, using the standard Message Passing Interface (MPI) to communicate between processes, and shared memory parallelization, providing multiple execution threads within each process to avoid duplication of data structures. The synchronization between multiple threads is accomplished through atomic operations without high-level locking (also called lock-free programming). This improves the scaling behavior of the code and substantially simplifies the implementation of the hybrid scheme. The result is an extremely flexible solution that adjusts to the number of available nodes, processors and memory, and consequently performs well on a wide variety of computing architectures.Comment: 21 pages, 20 figure

    Improving IRWLS algorithm for GLM with Intel Xeon Family

    Get PDF
    This study investigates utilizing the characteristics of Intel Xeon to improve the performance of training generalized linear models. The classic approach to fnd the maximum likelihood estimation of linear model requires loading entire data into memory for computation which is infeasible when data size is bigger than memory size. With the approach analyzed by Zhang and Yang (2017), the process of model fitting will be achieved iteratively through iterating each row. However, one limitation of this approach could be the iterative manner will impact performance when implementing it on Intel Xeon processor which delivers parallelism and vectorization. The study will focus on the tuning of application process and configuration on Xeon family processor based on the architecture of GLM model fitting algorithm

    Automating the multiprocessing environment

    Get PDF
    An approach to automate the programming and operation of tree-structured networks of multiprocessor systems is discussed. A conceptual, knowledge-based operating environment is presented, and requirements for two major technology elements are identified as follows: (1) An intelligent information translator is proposed for implementating information transfer between dissimilar hardware and software, thereby enabling independent and modular development of future systems and promoting a language-independence of codes and information; (2) A resident system activity manager, which recognizes the systems capabilities and monitors the status of all systems within the environment, is proposed for integrating dissimilar systems into effective parallel processing resources to optimally meet user needs. Finally, key computational capabilities which must be provided before the environment can be realized are identified

    Enhanced transformer long short-term memory framework for datastream prediction

    Get PDF
    In machine learning, datastream prediction is a challenging issue, particularly when dealing with enormous amounts of continuous data. The dynamic nature of data makes it difficult for traditional models to handle and sustain real-time prediction accuracy. This research uses a multi-processor long short-term memory (MPLSTM) architecture to present a unique framework for datastream regression. By employing several central processing units (CPUs) to divide the datastream into multiple parallel chunks, the MPLSTM framework illustrates the intrinsic parallelism of long short-term memory (LSTM) networks. The MPLSTM framework ensures accurate predictions by skillfully learning and adapting to changing data distributions. Extensive experimental assessments on real-world datasets have demonstrated the clear superiority of the MPLSTM architecture over previous methods. This study uses the transformer, the most recent deep learning breakthrough technology, to demonstrate how well it can handle challenging tasks and emphasizes its critical role as a cutting-edge approach to raising the bar for machine learning
    corecore