NL4Py: Agent-Based Modeling in Python with Parallelizable NetLogo Workspaces
NL4Py is NetLogo controller software for Python, for the rapid, parallel
execution of NetLogo models. NL4Py provides both headless (no graphical user
interface) and GUI NetLogo workspace control through Python. Spurred on by the
increasing availability of open-source computation and machine learning
libraries on the Python package index, there is an increasing demand for such
rapid, parallel execution of agent-based models through Python. NetLogo, being
the language of choice for a majority of agent-based modeling driven research
projects, requires an integration to Python for researchers looking to perform
statistical analyses of agent-based model output using these libraries.
Unfortunately, until the recent introduction of PyNetLogo, and now NL4Py, such
a controller was unavailable.
This article provides a detailed introduction to the usage of NL4Py and
explains its client-server software architecture, highlighting architectural
differences from PyNetLogo. A step-by-step demonstration of global sensitivity
analysis and parameter calibration of the Wolf Sheep Predation model is then
performed through NL4Py. Finally, NL4Py's performance is benchmarked against
PyNetLogo and its combination with IPyParallel, and shown to provide
significant savings in execution time over both configurations.
SKIRT: hybrid parallelization of radiative transfer simulations
We describe the design, implementation and performance of the new hybrid
parallelization scheme in our Monte Carlo radiative transfer code SKIRT, which
has been used extensively for modeling the continuum radiation of dusty
astrophysical systems including late-type galaxies and dusty tori. The hybrid
scheme combines distributed memory parallelization, using the standard Message
Passing Interface (MPI) to communicate between processes, and shared memory
parallelization, providing multiple execution threads within each process to
avoid duplication of data structures. The synchronization between multiple
threads is accomplished through atomic operations without high-level locking
(also called lock-free programming). This improves the scaling behavior of the
code and substantially simplifies the implementation of the hybrid scheme. The
result is an extremely flexible solution that adjusts to the number of
available nodes, processors and memory, and consequently performs well on a
wide variety of computing architectures.
Comment: 21 pages, 20 figures
Improving IRWLS algorithm for GLM with Intel Xeon Family
This study investigates how the characteristics of Intel Xeon processors can be used to improve the performance of training generalized linear models (GLMs). The classic approach to finding the maximum likelihood estimate of a linear model requires loading the entire dataset into memory for computation, which is infeasible when the data are larger than memory. With the approach analyzed by Zhang and Yang (2017), the model is instead fitted iteratively by processing the data one row at a time. One limitation of this approach is that the row-by-row iteration may hurt performance on an Intel Xeon processor, which delivers parallelism and vectorization. The study focuses on tuning the application process and configuration on Xeon family processors based on the architecture of the GLM model-fitting algorithm.
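For orientation, the row-wise idea can be sketched as follows. This is a minimal, generic iteratively reweighted least squares (IRWLS) loop for a one-feature logistic GLM, not the algorithm of Zhang and Yang (2017) or the tuned Xeon implementation from the study; the function name `irwls_logistic` and the toy data are illustrative assumptions. Each pass accumulates the weighted normal equations one row at a time, so only a handful of running sums are held in memory.

```python
import math


def irwls_logistic(rows, n_iter=25, tol=1e-10):
    """Fit y ~ sigmoid(b0 + b1 * x) by IRWLS, visiting one row at a time.

    `rows` must be re-iterable (e.g. a list), because every IRWLS
    iteration makes a fresh pass over the data.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        # Running sums for the 2x2 system (X'WX) b = X'Wz.
        s00 = s01 = s11 = t0 = t1 = 0.0
        for x, y in rows:
            eta = b0 + b1 * x                   # linear predictor
            mu = 1.0 / (1.0 + math.exp(-eta))   # fitted probability
            w = mu * (1.0 - mu)                 # IRWLS weight
            # w * z with working response z = eta + (y - mu) / w,
            # expanded so that no division by w is needed.
            wz = w * eta + (y - mu)
            s00 += w
            s01 += w * x
            s11 += w * x * x
            t0 += wz
            t1 += wz * x
        # Solve the 2x2 system in closed form.
        det = s00 * s11 - s01 * s01
        new_b0 = (s11 * t0 - s01 * t1) / det
        new_b1 = (s00 * t1 - s01 * t0) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


# Toy, non-separable data: (x, y) pairs (hypothetical values).
data = [(0, 0), (1, 0), (2, 1), (3, 0), (4, 1), (5, 1)]
b0, b1 = irwls_logistic(data)
```

The inner loop is strictly sequential over rows, which illustrates the concern raised in the abstract: without batching rows into blocks, there is little for vector units or multiple cores to exploit.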
Automating the multiprocessing environment
An approach to automate the programming and operation of tree-structured networks of multiprocessor systems is discussed. A conceptual, knowledge-based operating environment is presented, and requirements for two major technology elements are identified as follows: (1) An intelligent information translator is proposed for implementing information transfer between dissimilar hardware and software, thereby enabling independent and modular development of future systems and promoting language independence of codes and information; (2) A resident system activity manager, which recognizes the systems' capabilities and monitors the status of all systems within the environment, is proposed for integrating dissimilar systems into effective parallel processing resources to optimally meet user needs. Finally, key computational capabilities which must be provided before the environment can be realized are identified.
Enhanced transformer long short-term memory framework for datastream prediction
In machine learning, datastream prediction is a challenging issue, particularly when dealing with enormous amounts of continuous data. The dynamic nature of data makes it difficult for traditional models to handle and sustain real-time prediction accuracy. This research uses a multi-processor long short-term memory (MPLSTM) architecture to present a unique framework for datastream regression. By employing several central processing units (CPUs) to divide the datastream into multiple parallel chunks, the MPLSTM framework illustrates the intrinsic parallelism of long short-term memory (LSTM) networks. The MPLSTM framework ensures accurate predictions by skillfully learning and adapting to changing data distributions. Extensive experimental assessments on real-world datasets have demonstrated the clear superiority of the MPLSTM architecture over previous methods. This study uses the transformer, the most recent deep learning breakthrough technology, to demonstrate how well it can handle challenging tasks and emphasizes its critical role as a cutting-edge approach to raising the bar for machine learning.