|
5 | 5 | <h1>Execution Model</h1> |
6 | 6 |
|
7 | 7 |
|
| 8 | +A basic unit of parallel computation in Charm++ programs is a chare. A chare |
| 9 | +is similar to a process, an actor, an ADA task, etc. At its most basic level, |
| 10 | +it is just a C++ object. A Charm++ computation consists of a large number of |
| 11 | +chares distributed on available processors of the system, and interacting with |
| 12 | +each other via asynchronous method invocations. Asynchronously invoking a |
| 13 | +method on a remote object can also be thought of as sending a “message” to it. |
| 14 | +These method invocations are therefore sometimes referred to as messages (and, |
| 15 | +in the implementation, method invocations are in fact packaged as messages). |
| 16 | +Chares can be created dynamically. |
| 17 | + |
| 18 | +Conceptually, the system maintains a “work-pool” consisting of seeds for new |
| 19 | +chares, and messages for existing chares. The Charm++ runtime system (Charm |
| 20 | +RTS) may pick multiple items, non-deterministically, from this pool and |
| 21 | +execute them, with the proviso that two different methods cannot be |
| 22 | +simultaneously executing on the same chare object (say, on different |
| 23 | +processors). Although one can define a reasonable theoretical operational |
| 24 | +semantics of Charm++ in this fashion, a more practical description of |
| 25 | +execution is useful to understand Charm++. A Charm++ application’s execution |
| 26 | +is distributed among Processing Elements (PEs), which are OS threads or |
| 27 | +processes depending on the selected Charm++ build options. On each PE, there |
| 28 | +is a scheduler operating with its own private pool of messages. Each |
| 29 | +instantiated chare has one PE which is where it currently resides. The pool on |
| 30 | +each PE includes messages meant for chares residing on that PE, and seeds for |
| 31 | +new chares that are tentatively meant to be instantiated on that PE. The |
| 32 | +scheduler picks a message, creates a new chare if the message is a seed (i.e. |
| 33 | +a constructor invocation) for a new chare, and invokes the method specified by |
| 34 | +the message. When the method returns control to the scheduler, the scheduler |
| 35 | +repeats the cycle; that is, other invocations are never scheduled pre-emptively. |
| 36 | + |
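The scheduler cycle described above can be sketched as a toy model in plain C++. This is only an illustration under stated assumptions: `ToyPE`, `ToyMessage`, and `demoSchedule` are hypothetical names invented here, not Charm++ APIs, and the FIFO pick order is an arbitrary choice for the sketch (the text above does not say which message the scheduler picks). What it shows is that a method runs to completion before the next message is picked, even when it enqueues new invocations while running.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

class ToyPE;  // hypothetical stand-in for one processing element

// Hypothetical message: either a seed (constructor invocation) or a
// method invocation aimed at an existing chare on this PE.
struct ToyMessage {
    int chareId;                       // target chare
    bool isSeed;                       // true: instantiate the chare first
    std::string name;                  // label used in the trace
    std::function<void(ToyPE&)> body;  // work done by the method (may be empty)
};

class ToyPE {
public:
    void enqueue(ToyMessage m) { pool_.push_back(std::move(m)); }

    // Pick a message, instantiate the chare if the message is a seed,
    // run the method to completion, then repeat. Nothing interrupts a
    // running method: invocations it creates simply wait in the pool.
    std::vector<std::string> run() {
        std::vector<std::string> trace;
        while (!pool_.empty()) {
            ToyMessage m = std::move(pool_.front());
            pool_.pop_front();
            if (m.isSeed) live_.insert(m.chareId);
            if (live_.count(m.chareId) == 0) continue;  // unknown chare: drop
            trace.push_back("begin " + m.name);
            if (m.body) m.body(*this);
            trace.push_back("end " + m.name);
        }
        return trace;
    }

private:
    std::deque<ToyMessage> pool_;  // this PE's private pool (FIFO in this sketch)
    std::set<int> live_;           // ids of chares instantiated on this PE
};

// Usage sketch: the constructor enqueues a follow-up invocation on the
// same chare; it only starts after the constructor has returned.
std::vector<std::string> demoSchedule() {
    ToyPE pe;
    pe.enqueue({0, true, "ctor", [](ToyPE& self) {
        self.enqueue({0, false, "work", nullptr});
    }});
    return pe.run();
}
```

Calling `demoSchedule()` yields a trace in which "end ctor" appears before "begin work", which is the non-preemptive behaviour the paragraph describes.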
| 37 | +When a chare method executes, it may create method invocations for other |
| 38 | +chares. The Charm Runtime System (RTS, sometimes referred to as the Chare |
| 39 | +Kernel in the manual) locates the PE where the targeted chare resides, and |
| 40 | +delivers the invocation to the scheduler on that PE. |
| 41 | + |
| 42 | +Methods of a chare that can be remotely invoked are called entry methods. |
| 43 | +Entry methods may take serializable parameters, or a pointer to a message |
| 44 | +object. Since chares can be created on remote processors, at least one |
| 45 | +constructor of a chare needs to be an entry method. Ordinary entry methods [1] |
| 46 | +are completely non-preemptive: Charm++ will not interrupt an executing method |
| 47 | +to start any other work, and all calls made are asynchronous. |
| 48 | + |
| 49 | +Charm++ provides dynamic seed-based load balancing, so the location (processor |
| 50 | +number) need not be specified when creating a remote chare; the Charm RTS |
| 51 | +places the remote chare on a suitable processor. One can therefore imagine |
| 52 | +chare creation as generating only a seed for the new chare, which may take |
| 53 | +root on some specific processor at a later time. |
| 54 | + |
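To make the idea of a seed "taking root" concrete, here is a minimal sketch in plain C++. The placement policy (put each seed on the PE that currently holds the fewest chares) is an assumption chosen purely for illustration; the text above does not say how the Charm RTS decides what a "suitable processor" is, and `placeSeed` is a hypothetical helper, not a Charm++ call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical policy for illustration only: place a seed on the PE with
// the fewest resident chares (ties go to the lowest PE number).
// Precondition: charesPerPE is non-empty.
std::size_t placeSeed(std::vector<int>& charesPerPE) {
    auto least = std::min_element(charesPerPE.begin(), charesPerPE.end());
    ++*least;  // the seed takes root here and becomes a resident chare
    return static_cast<std::size_t>(least - charesPerPE.begin());
}
```

The caller never names a PE; it only hands over a seed and learns afterwards where the chare ended up, which mirrors the creation model described above.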
| 55 | +Chares can be grouped into collections. The types of collections of chares |
| 56 | +supported in Charm++ are: chare-arrays, chare-groups, and chare-nodegroups, |
| 57 | +referred to as arrays, groups, and nodegroups throughout this manual for |
| 58 | +brevity. A chare-array is a collection of an arbitrary number of migratable |
| 59 | +chares, indexed by some index type, and mapped to processors according to a |
| 60 | +user-defined map group. A group (nodegroup) is a collection of chares, with |
| 61 | +exactly one member element on each PE (“node”). |
| 62 | + |
| 63 | +Charm++ does not allow global variables, except readonly variables. A chare |
| 64 | +can normally only access its own data directly. However, each chare is |
| 65 | +accessible by a globally valid name. So, one can think of Charm++ as |
| 66 | +supporting a global object space. |
| 67 | + |
| 68 | +Every Charm++ program must have at least one mainchare. Each mainchare is |
| 69 | +created by the system on processor 0 when the Charm++ program starts up. |
| 70 | +Execution of a Charm++ program begins with the Charm Kernel constructing all |
| 71 | +the designated mainchares. For a mainchare named X, execution starts at |
| 72 | +constructor X() or X(CkArgMsg *), which are equivalent. Typically, the |
| 73 | +mainchare constructor starts the computation by creating arrays, other chares, |
| 74 | +and groups. It can also be used to initialize shared readonly objects. |
| 75 | + |
| 76 | +Charm++ program execution is terminated by the CkExit call. Like the exit |
| 77 | +system call, CkExit never returns, and it optionally accepts an integer value |
| 78 | +to specify the exit code that is returned to the calling shell. If no exit |
| 79 | +code is specified, a value of zero (indicating successful execution) is |
| 80 | +returned. The Charm RTS ensures that no more messages are processed and no |
| 81 | +entry methods are called after a CkExit. CkExit need not be called on all |
| 82 | +processors; it is enough to call it from just one processor at the end of the |
| 83 | +computation. |
| 84 | + |
| 85 | +As described so far, the execution of individual chares is “reactive”: when |
| 86 | +method A is invoked, the chare executes that method's code, and so on. But very often, |
| 87 | +chares have specific life-cycles, and the sequence of entry methods they |
| 88 | +execute can be specified in a structured manner, while allowing for some |
| 89 | +localized non-determinism (e.g. a pair of methods may execute in any order, |
| 90 | +but when they both finish, the execution continues in a pre-determined manner, |
| 91 | +say executing a third entry method). To simplify expression of such control |
| 92 | +structures, Charm++ provides two mechanisms. The first is the structured dagger |
| 93 | +notation, which is the main notation we recommend you use. Alternatively, you |
| 94 | +may use threaded entry methods, in combination with futures and sync methods. The |
| 95 | +threaded methods run in light-weight user-level threads, and can block waiting |
| 96 | +for data in a variety of ways. Again, only the particular thread of a |
| 97 | +particular chare is blocked, while the PE continues executing other chares. |
| 98 | + |
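For context, the "pair of methods in any order, then a third" pattern mentioned above looks roughly like this when written by hand in a purely reactive style. This is a plain C++ sketch, not structured dagger syntax and not Charm++ code; `ToyJoin` and its members are hypothetical names. It shows the kind of flag bookkeeping that a structured notation is meant to spare the programmer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical chare-like object: methods A and B may arrive in either
// order; step C runs exactly once, after both have arrived.
class ToyJoin {
public:
    void recvA() { gotA_ = true; log_.push_back("A"); maybeContinue(); }
    void recvB() { gotB_ = true; log_.push_back("B"); maybeContinue(); }
    const std::vector<std::string>& log() const { return log_; }

private:
    // Hand-written join: continue only once both inputs have been seen.
    void maybeContinue() {
        if (gotA_ && gotB_ && !done_) {
            done_ = true;
            log_.push_back("C");  // the pre-determined continuation
        }
    }

    bool gotA_ = false;
    bool gotB_ = false;
    bool done_ = false;
    std::vector<std::string> log_;
};
```

Whichever of `recvA` and `recvB` is called first, the log ends with "C" after both have been recorded.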
| 99 | +The normal entry methods, being asynchronous, are not allowed to return any |
| 100 | +value, and are declared with a void return type. However, the sync methods are |
| 101 | +an exception to this. They must be called from a threaded method, and so are |
| 102 | +allowed to return (certain types of) values. |