GAIA: Generic Adaptive Interaction Architecture
A migration-based middleware can adaptively optimize the simulation execution by reallocating the simulated entities across the distributed simulation. This dynamic reallocation can reduce the communication overhead and improve the computation load balancing, which translates into a reduction of the Wall-Clock Time (WCT) needed to complete parallel and distributed simulation runs.
The Generic Adaptive Interaction Architecture (GAIA) is a migration-based framework built on top of the ARTÌS middleware. The basic task of GAIA is to monitor the communication pattern of each simulated entity throughout the simulation execution. A set of heuristics evaluates these patterns and triggers the reallocation of entities to reduce communication costs and improve the load balancing of the execution architecture [MSWIM2004]. GAIA clusters highly interacting simulated entities within the same execution unit, reducing costly network communication and increasing the share of low-cost local communication [DSRT2004].
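As an illustration only, the kind of heuristic described above can be sketched in a few lines of Python. This is not the GAIA API: the function name `migration_candidate`, the `migration_factor` threshold, and the input format (one destination execution unit per message sent in an observation window) are all assumptions made for this example.

```python
from collections import Counter


def migration_candidate(local_unit, destinations, migration_factor=1.5):
    """Return the execution unit an entity should move to, or None to stay.

    local_unit: id of the unit currently hosting the entity (hypothetical).
    destinations: one unit id per message the entity sent in the observation
        window (hypothetical input format, not a GAIA data structure).
    migration_factor: assumed threshold; remote traffic must exceed local
        traffic by this factor before a migration is proposed.
    """
    counts = Counter(destinations)
    local = counts.get(local_unit, 0)
    remote = {unit: n for unit, n in counts.items() if unit != local_unit}
    if not remote:
        return None  # the entity only talks to local entities
    # Most-contacted remote unit; ties go to the lowest unit id.
    best_unit = max(sorted(remote), key=remote.get)
    if remote[best_unit] > migration_factor * local:
        return best_unit
    return None
```

With these assumptions, an entity on unit 0 that sent three messages to unit 1 and one message locally would be proposed for migration to unit 1, while an entity that communicates mostly locally stays where it is. A real heuristic also has to weigh the cost of the migration itself and the resulting load balance, which this sketch ignores.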
An enhanced version of the GAIA framework, called GAIA+, has been designed and implemented to support distributed simulation over shared Commercial Off-the-Shelf (COTS) clusters. It improves load balancing and reduces communication overhead in the presence of massive models of dynamically interacting simulated entities, heterogeneous execution architectures, and unpredictable background computation and communication loads. The adaptive load-balancing mechanisms can improve resource utilization and simulation execution by dynamically tuning the simulation load while also reducing synchronization and communication overheads. One of the main goals of GAIA+ is to improve simulation execution on clusters of heterogeneous units connected by a computer network. Heterogeneity here refers to CPU performance characteristics, available resources, and background load.
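To make the idea of load balancing over heterogeneous units concrete, the following Python sketch splits a population of simulated entities in proportion to each unit's capacity. It is a simplified illustration, not the mechanism implemented in GAIA+: the function `target_allocation` and the notion of a single "capacity" number per unit (for example, CPU speed scaled down by the observed background load) are assumptions made for this example.

```python
def target_allocation(total_entities, capacities):
    """Split entities across units in proportion to their capacity.

    total_entities: number of simulated entities to place.
    capacities: one positive number per execution unit (hypothetical
        measure, e.g. relative CPU speed reduced by background load).
    Returns a list of per-unit entity counts that sums to total_entities.
    """
    total_capacity = sum(capacities)
    if total_capacity <= 0:
        raise ValueError("at least one unit must have positive capacity")
    # Ideal fractional share for each unit.
    shares = [total_entities * c / total_capacity for c in capacities]
    targets = [int(s) for s in shares]
    # Hand out the entities lost to rounding, largest remainder first.
    leftover = total_entities - sum(targets)
    by_remainder = sorted(
        range(len(shares)), key=lambda i: shares[i] - targets[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        targets[i] += 1
    return targets
```

For instance, with 100 entities and three units where the third is twice as fast as the other two, the sketch assigns 25, 25, and 50 entities. An adaptive scheme would recompute such targets at runtime as the background load changes, and would combine them with the communication-driven clustering so that balancing the computation does not split tightly interacting entities.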
Features
- Multi-Agent System (MAS) paradigm
- Communication load-balancing
- Computation load-balancing
- Adaptive partitioning of the simulation model at runtime
- Many self-clustering strategies
- Automatically reacts to communication and computation imbalances in both the execution architecture and the simulation model
Download
Documentation
- More in-depth information about GAIA and GAIA+ can be found in SIMPAT2017 and IJSPM09.
- Some videos that show how GAIA works can be found on this page.
- For more information on the ARTÌS/GAIA installation and usage please see the ARTÌS & GAIA HOWTO.
Work in progress
- We are currently working on an extended version of GAIA (called ReliableGAIA, R-GAIA) that aims to add fault tolerance to the simulation execution. This will make it possible to run simulations on top of unreliable execution platforms such as the public cloud. For more information please see HPCS11.
- Furthermore, in the PArallel Graph Algorithms (PAGA) and Adaptive Parallel And Distributed Simulation on HPC (HPC-PADS) research projects, we are working on porting ARTÌS/GAIA to the Blue Gene/Q architecture.
Citation
To cite the GAIA/GAIA+ software use:
- gda-simpat-2017.txt
Gabriele D'Angelo. The Simulation Model Partitioning Problem: an Adaptive Solution Based on Self-Clustering. Simulation Modelling Practice and Theory, Elsevier, vol. 70 (January 2017). ISSN: 1569-190X
If you use BibTeX for LaTeX, use:
- gda-simpat-2017.tex
@article{gda-simpat-2017,
  author = {D'Angelo, Gabriele},
  title = {The Simulation Model Partitioning Problem: an Adaptive Solution Based on Self-Clustering},
  journal = {Simulation Modelling Practice and Theory (SIMPAT)},
  issn = "1569-190X",
  doi = "10.1016/j.simpat.2016.10.001",
  volume = "70",
  number = "",
  pages = "1 - 20",
  year = "2017",
  url = "http://www.sciencedirect.com/science/article/pii/S1569190X16302350",
  publisher = {Elsevier}
}