Module for Message Passing Interface (MPI) parallelization utilities.
This module handles the initialization of the MPI environment and assigns the cpus their indices. Parallelization is done by distributing atoms among the processors, and a routine for doing this randomly is provided. Tools for monitoring the loads of all the cpus and redistributing them are also implemented.
- all_atoms¶
integer scalar
the total number of atoms
- all_loads¶
double precision allocatable size(:)
list of the loads of all cpus
- atom_buffer¶
integer allocatable size(:)
list used for passing atom indices during load balancing
- cpu_id¶
integer scalar
identification number for the cpu, from 0 to \(n_\mathrm{cpus}-1\)
- is_my_atom¶
logical allocatable size(:)
logical array, true for the indices of the atoms that are distributed to this cpu
- load_length¶
integer scalar
the number of times loads have been recorded
- loadout¶
integer scalar
initial value = 2352
unit number of the output channel for loads
- loads_mask¶
logical allocatable size(:)
logical array used in load rebalancing, true for cpus whose loads have not yet been balanced
- mpi_atoms_allocated¶
logical scalar
initial value = .false.
logical switch for denoting that the mpi allocatable arrays have been allocated
- mpistat¶
integer scalar
mpi return value
- mpistatus¶
integer size(mpi_status_size)
array for storing the mpi status return values
- my_atoms¶
integer scalar
the number of atoms distributed to this cpu
- my_load¶
double precision scalar
storage for the load of this particular cpu
- n_cpus¶
integer scalar
number of cpus, \(n_\mathrm{cpus}\)
- stopwatch¶
double precision scalar
cpu time storage
- track_loads¶
logical scalar parameter
initial value = .false.
logical switch, if true, the loads of cpus are written to a file during run
- balance_loads()¶
Load balancing.
The loads are gathered from all cpus and sorted. Then load (atoms) is passed from the most loaded cpus to the least loaded ones.
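The pairing idea behind balance_loads() can be illustrated with a small, MPI-free sketch. It is not the module's actual routine: the function name `plan_transfers`, the outermost-pairs matching, and the rule of moving half the load difference (assuming load is roughly proportional to atom count) are all assumptions made for illustration.

```python
def plan_transfers(loads, atoms_per_cpu):
    """Plan atom transfers from heavily loaded cpus to lightly loaded ones.

    Hypothetical helper: pairs the most loaded cpu with the least loaded one,
    the second most loaded with the second least loaded, and so on.
    Returns a list of (donor, receiver, n_atoms) tuples.
    """
    # cpu indices sorted from least to most loaded
    order = sorted(range(len(loads)), key=lambda c: loads[c])
    transfers = []
    lo, hi = 0, len(order) - 1
    while lo < hi:
        receiver, donor = order[lo], order[hi]
        diff = loads[donor] - loads[receiver]
        if loads[donor] > 0:
            # Assumption: load scales with atom count, so moving the atoms
            # that account for half the difference evens out the pair.
            n_move = int(atoms_per_cpu[donor] * diff / (2.0 * loads[donor]))
            if n_move > 0:
                transfers.append((donor, receiver, n_move))
        lo += 1
        hi -= 1
    return transfers


if __name__ == "__main__":
    for donor, receiver, n in plan_transfers([4.0, 1.0, 2.0, 3.0], [40, 10, 20, 30]):
        print(f"cpu {donor} passes {n} atoms to cpu {receiver}")
```

In the real module, the atom indices to be moved would travel in a buffer such as atom_buffer, and a mask such as loads_mask would track which cpus are still unbalanced; the sketch only computes the plan.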
- close_loadmonitor()¶
Closes the output for writing workload data
- initialize_load(reallocate)¶
Initializes the load monitoring arrays.
- reallocate: logical intent(in) scalar
- Logical switch for reallocating the arrays. If true, the related arrays are allocated. Otherwise only the load counters are set to zero.
- mpi_distribute(n_atoms)¶
distributes atoms among processors
- n_atoms: integer intent(in) scalar
- number of atoms
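As a rough illustration of a random distribution, the sketch below shuffles the atom indices and deals them out to the cpus in turn, producing one ownership mask per cpu in the spirit of is_my_atom. The function name `distribute_atoms`, the `seed` argument, and the round-robin dealing are assumptions; the module's own scheme may differ.

```python
import random


def distribute_atoms(n_atoms, n_cpus, seed=0):
    """Return one boolean mask per cpu; masks[c][i] is True if cpu c owns atom i.

    Hypothetical sketch: every cpu would have to use the same seed so that
    all of them arrive at the same assignment.
    """
    rng = random.Random(seed)
    indices = list(range(n_atoms))
    rng.shuffle(indices)
    masks = [[False] * n_atoms for _ in range(n_cpus)]
    # deal the shuffled atoms to the cpus one at a time
    for position, atom in enumerate(indices):
        masks[position % n_cpus][atom] = True
    return masks


if __name__ == "__main__":
    masks = distribute_atoms(n_atoms=10, n_cpus=3)
    for cpu, mask in enumerate(masks):
        print(f"cpu {cpu}: my_atoms = {sum(mask)}")
```

Each atom ends up on exactly one cpu, and the per-cpu counts differ by at most one.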
- mpi_finish()¶
closes the mpi framework
- mpi_initialize()¶
initializes the mpi framework
- mpi_master_bcast_int(sync_int)¶
the master cpu broadcasts an integer value to all other cpus
- sync_int: integer intent(inout) scalar
- the broadcast integer
- mpi_stack(list, items, depth, length, width)¶
stacks the “lists” from all cpus together according to the lengths given in “items” and gathers the complete list to cpu 0. For example:

    cpu 0          cpu 1          cpu 0
    abc....        12.....        abc12..
    de.....        3456...   ->   de3456.
    fghij..        78.....        fghij78

The stacking is done for the second array index: list(1,:,1). The stacking works so that first every cpu 2n+1 sends its data to cpu 2n, then 2*(2n+1) sends data to 2*2n, and so on, until the final cpu 2^m sends its data to cpu 0:

    cpu
    0 1 2 3 4 5 6 7 8 9 10
    |-/ |-/ |-/ |-/ |-/ |
    |---/   |---/   |---/
    |-------/       |
    |---------------/
    x

Parameters:
- list: INTEGER intent() size(:, :, :)
- 3d arrays containing lists to be stacked
- items: INTEGER intent() size(:)
- the numbers of items to be stacked in each list
- depth: INTEGER intent() scalar
- dimensionality of the stacked objects (size of list(:,1,1))
- length: INTEGER intent() scalar
- the number of lists (size of list(1,1,:))
- width: INTEGER intent() scalar
- max size of lists (size of list(1,:,1))
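The pairwise send pattern described for mpi_stack can be checked with a small simulation that needs no MPI at all. The names `stacking_schedule` and `stack_lists` are hypothetical, and plain Python lists stand in for the Fortran arrays; the sketch only reproduces the order in which cpus hand over their data.

```python
def stacking_schedule(n_cpus):
    """Return the rounds of (sender, receiver) pairs for the stacking tree."""
    rounds = []
    step = 1
    while step < n_cpus:
        pairs = []
        # receivers are multiples of 2*step; each receives from receiver+step
        for receiver in range(0, n_cpus, 2 * step):
            sender = receiver + step
            if sender < n_cpus:
                pairs.append((sender, receiver))
        rounds.append(pairs)
        step *= 2
    return rounds


def stack_lists(rows_per_cpu):
    """Simulate the stacking; rows_per_cpu[c][r] is row r held by cpu c.

    Returns the rows gathered on cpu 0.
    """
    data = [[list(row) for row in rows] for rows in rows_per_cpu]
    for pairs in stacking_schedule(len(data)):
        for sender, receiver in pairs:
            # append the sender's items to the end of the receiver's rows
            for r, row in enumerate(data[sender]):
                data[receiver][r].extend(row)
    return data[0]


if __name__ == "__main__":
    for pairs in stacking_schedule(11):
        print(pairs)
    stacked = stack_lists([["abc", "de", "fghij"], ["12", "3456", "78"]])
    print(["".join(row) for row in stacked])
```

Because each receiver appends data from a higher-numbered cpu to the end of its own rows, the final list on cpu 0 keeps the items in cpu order, and the number of rounds grows only logarithmically with the number of cpus.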
- mpi_sync()¶
syncs the cpus by calling mpi_barrier
- mpi_wall_clock(clock)¶
returns the global time through mpi_wtime
- clock: double precision intent(out) scalar
- the measured time
- open_loadmonitor()¶
Opens the output for writing workload data to a file called “mpi_load.out”
- record_load(amount)¶
Saves the given load.
- amount: double precision intent(in) scalar
- the load to be stored
- timer(stamp)¶
reads the wall clock time elapsed since the timer was last started (saved in stopwatch) and then restarts the timer.
- stamp: double precision intent(inout) scalar
- the elapsed real time
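The read-and-restart behaviour of timer() can be sketched as follows. The class name `Stopwatch` and the injectable `clock` argument are assumptions introduced for illustration; the module itself keeps the start time in the stopwatch variable and, per mpi_wall_clock, reads the time through mpi_wtime.

```python
import time


class Stopwatch:
    """Hypothetical stand-in for the module-level stopwatch variable."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._started = clock()  # time of the latest (re)start

    def timer(self):
        """Return the time elapsed since the last start and restart the timer."""
        now = self._clock()
        elapsed = now - self._started
        self._started = now
        return elapsed


if __name__ == "__main__":
    watch = Stopwatch()
    time.sleep(0.01)
    print(f"elapsed: {watch.timer():.4f} s")
```

Since every call restarts the timer, consecutive calls measure back-to-back intervals with no gaps between them.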
- write_loadmonitor()¶
Routine for writing force calculation workload analysis data to a file called “mpi_load.out”