Active Replication Management Layer (ARML)
Overview
ARML is a middleware layer allowing transparent management of
(diversity-based) active replicas of the same federate simulator in the
context of advanced simulation systems relying on the
High Level Architecture (HLA) IEEE interoperability standard. The objective
is to exploit the best "instant responsiveness" among the
replicas in different phases of the simulation run, in order to improve
the timeliness of the production of simulation output.
ARML v2.0 targets off-the-shelf SMP computing systems and
has been implemented by relying exclusively on C and
standard POSIX APIs, hence it is portable across any
POSIX-compliant operating system (e.g. UNIX/Linux systems). Also,
the implementation has been tailored for integration with
the Georgia Tech B-RTI package, an open source release of
an HLA-compliant Run-Time Infrastructure (RTI) middleware component.
Details
As shown in the figure below, the classical interaction between a
federate simulator and the underlying RTI takes place via a set of
call/callback services specified by the HLA standard.
When ARML is deployed, the software architecture changes as shown
below. A federate can be present within the whole federation as multiple
(diversity-based) active replicas. ARML exposes to these replicas the same
interface exposed by the RTI. At the same time, ARML interacts with
the underlying RTI via that same interface.
ARML performs the following tasks:
- It intercepts all the instances of a call to a given RTI service performed by
the different federate replicas, and forwards a single one of those calls to the
underlying RTI. The forwarded call is the earliest one issued, i.e. the one
coming from the replica that is currently the most responsive.
- As soon as the RTI returns to ARML from a previously issued call,
ARML delivers the return (and the return value, if any) to
all the overlying replicas. If some federate replica has not yet
executed the call to the corresponding RTI service (this might happen
because the replicas are allowed to execute asynchronously), ARML
keeps the return value buffered until that call is issued, and then
immediately returns the buffered value to that federate
replica.
- ARML intercepts each callback from the RTI and delivers it to all the overlying
replicas.
- In case one replica cannot yet accept the callback (e.g. because,
acting asynchronously, it has not yet executed all the RTI calls preceding the
delivery of that callback), ARML simply delays the callback execution on that
replica until it is ready to accept it.
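
The call-forwarding and return-buffering tasks listed above can be pictured with the following minimal sketch. It is not the ARML implementation: the names arml_sketch, arml_sketch_call and stub_rti_service are hypothetical, the RTI service is reduced to a single function taking and returning an int, and everything runs in one thread, whereas ARML runs replicas and RTI concurrently. The point is only the bookkeeping: the first replica to issue its k-th call triggers the real RTI call, and every later replica gets the buffered result.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REPLICAS 4   /* illustrative limits, not ARML parameters */
#define MAX_CALLS    64

/* Hypothetical signature standing in for one RTI service. */
typedef int (*rti_service_fn)(int arg);

typedef struct {
    rti_service_fn rti;           /* underlying RTI service */
    int results[MAX_CALLS];       /* buffered return values, by call index */
    size_t forwarded;             /* calls already forwarded to the RTI */
    size_t issued[MAX_REPLICAS];  /* calls issued so far by each replica */
} arml_sketch;

/* Stub RTI used for illustration; counts how often it is really invoked. */
static int stub_rti_calls = 0;
int stub_rti_service(int arg) {
    stub_rti_calls++;
    return arg * 2;
}

void arml_sketch_init(arml_sketch *a, rti_service_fn rti) {
    a->rti = rti;
    a->forwarded = 0;
    for (size_t i = 0; i < MAX_REPLICAS; i++)
        a->issued[i] = 0;
}

/* Called by a replica in place of the RTI service. */
int arml_sketch_call(arml_sketch *a, size_t replica, int arg) {
    assert(replica < MAX_REPLICAS);
    size_t k = a->issued[replica]++;   /* index of this replica's call */
    assert(k < MAX_CALLS);
    if (k == a->forwarded) {
        /* This replica is the first to reach call k: forward it once. */
        a->results[k] = a->rti(arg);
        a->forwarded++;
    }
    /* Slower replicas land here directly and receive the buffered value. */
    return a->results[k];
}
```

In this sketch a replica that issues its call late never reaches the RTI at all, which mirrors the buffering behaviour described above. Callback delivery and its per-replica delaying are left out for brevity.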
Given that ARML is designed to improve the simulation system
performance, its effectiveness requires the execution of each replica
and the RTI in real concurrency, i.e. as different threads/processes
on the SMP computing system. This is because the RTI must be able to
process requests according to an interleaved stream determined by ARML
via the selection of requests from one or another replica, depending
on instant responsiveness of each of those replicas. Hence it must be
able to proceed in parallel with all of the involved replicas. At the
same time, each replica must not affect the execution speed of the other
replicas and of the RTI due to resource (e.g. CPU) contention. In the
design of ARML v2.0, this requirement has been satisfied while keeping
the presence of the ARML middleware layer completely transparent.
This means that neither the federate nor the B-RTI needs any
significant modification (e.g. no addition of mechanisms, such as
encapsulation, for protecting global data against critical races
within the replication scheme) in order to be integrated with the
ARML layer (the "User Guide" clarifies the very minimal modifications
required at the federate level for allowing integration with ARML).
To achieve this objective, ARML has been organized into:
- A library to be linked to the federate code, called
libARML.a. This library exposes to the federate the same interface
offered by the B-RTI, and requires from the federate the same
callback interface required by the B-RTI.
- A set of modules that, once linked to the B-RTI, constitute a
run-time environment, called ARML, which, beyond supporting
replication-specific tasks, also performs the startup of the B-RTI in a
separate UNIX process.
The interaction between the ARML run-time environment and libARML.a
(once the federate is activated in a different process) takes place via
ad-hoc mechanisms relying on standard POSIX shared memory and
synchronization (via spinlocking) facilities.
Correctness of simulation output
The use of ARML requires that all the federate replicas be
Piece-Wise-Deterministic (PWD), meaning that they must
exhibit the same external interactions with the RTI under the same input
conditions (e.g. the same callbacks from the RTI). If the PWD
assumption does not hold, ARML might compromise
the correctness of simulation output.
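
One way to picture the PWD requirement is as a check on the traces of RTI interactions produced by two replicas fed with the same callbacks. The sketch below is purely illustrative and is not part of ARML: the rti_interaction record and the traces_match function are hypothetical, and an interaction is reduced to a service identifier plus one integer argument. Since replicas run asynchronously, one trace may be shorter than the other, so only the common prefix is compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one call issued by a replica to the RTI. */
typedef struct {
    int service_id;   /* which RTI service was invoked */
    int argument;     /* simplified stand-in for the call parameters */
} rti_interaction;

/* Returns true if the two traces agree on every interaction both
   replicas have issued so far (i.e. on their common prefix). */
bool traces_match(const rti_interaction *a, size_t a_len,
                  const rti_interaction *b, size_t b_len) {
    assert(a != NULL || a_len == 0);
    assert(b != NULL || b_len == 0);
    size_t common = a_len < b_len ? a_len : b_len;
    for (size_t i = 0; i < common; i++) {
        if (a[i].service_id != b[i].service_id ||
            a[i].argument != b[i].argument)
            return false;   /* replicas diverged: PWD is violated */
    }
    return true;
}
```

A faster replica whose trace merely extends that of a slower one is consistent with PWD in this sketch, while any mismatch at the same position is not.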
Related Publications
F. Quaglia,
Software Diversity-Based Active Replication as an
Approach for Enhancing the Performance of
Advanced Simulation Systems,
International Journal of Foundations of Computer Science, 2007, pending revision.
F. Quaglia,
Enhancing the Performance of HLA-Based Simulation Systems via Software Diversity and Active Replication,
Proc. 20th IEEE International Parallel and Distributed
Processing Symposium - APDCM Workshop (IPDPS), Rhodes Island, Greece, IEEE Computer
Society Press, April 2006.
Download, Installation and User's Guide (by version)