Distributed Calculation

You can calculate much faster using the new calculation option Distributed calculation. With this option, your project is calculated by a group of machines (including yours).

How it works

When you start a distributed calculation, the SLIP instance you are running plays the role of a client. The client subdivides the calculation job into several tasks, each of which consists of calculating a small region (bounding box or extent) that contains some of the immission points to be calculated. For a big project, the client usually specifies many hundreds of such tasks. The data needed for the job and the specifications of the tasks are written to a globally accessible blackboard (essentially, a directory in the network).
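
The subdivision step can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the names Task and subdivide, the square tiling, and the tile_size parameter are hypothetical and are not taken from SLIP, whose actual partitioning scheme is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One unit of work (hypothetical): a bounding box and the points inside it."""
    box: tuple     # (xmin, ymin, xmax, ymax)
    points: tuple  # immission points (x, y) that fall inside the box


def subdivide(points, tile_size):
    """Group points into square tiles of side tile_size.

    Assumption: a regular grid is used. One task is created per non-empty tile.
    """
    tiles = defaultdict(list)
    for x, y in points:
        key = (int(x // tile_size), int(y // tile_size))
        tiles[key].append((x, y))

    tasks = []
    for (i, j), pts in sorted(tiles.items()):
        box = (i * tile_size, j * tile_size,
               (i + 1) * tile_size, (j + 1) * tile_size)
        tasks.append(Task(box=box, points=tuple(pts)))
    return tasks


if __name__ == "__main__":
    demo = subdivide([(0, 0), (5, 5), (15, 5), (25, 25)], tile_size=10)
    for task in demo:
        print(task.box, len(task.points))
```

Smaller tiles give more tasks and better load balancing across workers, at the cost of more files on the blackboard.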

These tasks are then processed by workers, which run on several machines in the network. Roughly, a worker runs a loop, executing the following: (1) read a task specification from the blackboard, (2) calculate the points in the wanted box, and (3) write the results to the blackboard. The SLIP instance playing the role of the client also plays the role of a worker; this ensures that there is always at least one worker working on your job. All other workers run only on machines that have been user-idle for a few minutes (this is configurable; overall best results are obtained with values between 8 and 30 min).
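
The three-step worker loop described above could look roughly like the following Python sketch. Everything specific in it is an assumption made for illustration: the pending/claimed/results folder layout, the JSON task files, the claim-by-rename trick, and the function names claim_task and worker_loop are hypothetical and do not describe SLIP's actual file format or protocol.

```python
import json
import os
from pathlib import Path


def claim_task(board: Path, worker_id: str):
    """Try to claim one pending task by renaming its file.

    Assumption: a rename succeeds for only one worker, so it acts as a lock.
    Returns the path of the claimed file, or None if nothing is pending.
    """
    for spec in sorted((board / "pending").glob("*.json")):
        claimed = board / "claimed" / f"{spec.stem}.{worker_id}.json"
        try:
            os.rename(spec, claimed)
            return claimed
        except OSError:
            continue  # another worker claimed this task first
    return None


def worker_loop(board: Path, worker_id: str, calculate, is_user_idle=lambda: True):
    """Run until no tasks are left or the machine's user comes back.

    Returns the number of tasks this worker completed.
    """
    done = 0
    while is_user_idle():
        claimed = claim_task(board, worker_id)       # (1) read a task specification
        if claimed is None:
            break
        task = json.loads(claimed.read_text())
        levels = [calculate(p) for p in task["points"]]  # (2) calculate the points
        result = {"id": task["id"], "levels": levels}
        tmp = board / "results" / f"{task['id']}.tmp"
        tmp.write_text(json.dumps(result))
        # (3) publish the result; the final name appears only once the file is complete
        os.replace(tmp, board / "results" / f"{task['id']}.json")
        done += 1
    return done
```

The is_user_idle callback stands in for the idle check that remote workers perform; the client's own worker would simply pass a function that always returns True. Whether a rename is really atomic depends on the network file system, so a production implementation would need to verify this for its environment.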

When all tasks are done, the client simply fetches the results and stores them in your project.

  Notes:
  • You can use the multicore calculation and distributed calculation options together to accelerate your calculations. Note that workers running on other (idle) machines always use the multicore option. They do so even if you do not activate the multicore option for your own computer.

  Details:
  • The system works with any type of source and receiver (including both types of surface receivers).
  • Multi-selection (or selection-list) calculation is also supported.
  • Even if you selected the distributed-calculation option, the program will not distribute the calculation if it's not "necessary" (if there is too little work to do).
  • The system is highly transparent to the user using the "client".
  • It is also transparent to the users of the machines where "workers" are installed; these run at low priority, and only when the computer is idle (and only if there are enough resources on the computer). Suppose a worker installed on an (idle) machine is working on a task, and the user of this machine comes back (moves the mouse or presses some key). When this happens, the calculation stops and the worker exits (usually within 3 seconds).
  • It is possible to start several workers per machine (makes sense for multicore machines).
  • The number of workers can be changed (increased or decreased) dynamically, even while calculations are being performed.
  • It supports an unlimited number of simultaneous clients (should remain efficient for up to mathematical expression simultaneous clients, at least).
  • The current version should efficiently support up to mathematical expression workers.
  • The system is robust: in particular, the calculation will not stop even if a worker is killed (even if its machine explodes), or even if, in the middle of a calculation, the network path where the task agenda is stored becomes permanently inaccessible.
  • Note that the distributed calculation works even if only the client is running, because the client is also a worker. Also, in contrast to other workers, the client only works for its own job.
  • The full installation of the distributed calculation system only requires simple settings. (Details will be provided in a later update.)
  • Depending on where a blackboard is in the network, the efficiency of the system might vary, especially that of saving the job data, which includes the acoustically relevant elements of the job. Note that the job data is not yet compressed; this will be improved in a later update (which will reduce the initial overhead).
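
The text above does not say how the robustness against killed workers is achieved. One common approach, shown here purely as an assumption and not as SLIP's actual mechanism, is to treat a task as abandoned when it was claimed long ago but never produced a result, and to hand it out again. The function name requeue_stale, the claims dictionary, and the timeout are all hypothetical.

```python
def requeue_stale(claims, now, timeout_s):
    """Reset tasks whose claim is older than timeout_s and has no result yet.

    claims maps a task id to {"claimed_at": seconds or None, "done": bool}
    (a hypothetical in-memory view of the blackboard).
    Stale entries are reset in place so that another worker can pick them up.
    Returns the sorted list of task ids that were put back.
    """
    stale = sorted(
        tid for tid, claim in claims.items()
        if not claim["done"]
        and claim["claimed_at"] is not None
        and now - claim["claimed_at"] > timeout_s
    )
    for tid in stale:
        claims[tid]["claimed_at"] = None  # task is pending again
    return stale
```

With such a scheme, a worker that disappears only delays its current task by the timeout; since the client is itself a worker, the job can still finish even if every other worker is gone.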

Distributed calculation: two-level distribution

SLIP's distributed calculations are based on blackboards (exchange folders that can be reached by all the machines involved in such calculations). In particular, blackboards are used to exchange task definitions and results.

In this update, the distributed-calculation implementation has been extended to support two blackboards: a primary and a secondary one. As explained next, having two levels is useful; in particular, it considerably reduces unnecessary communication with remote sites.

The main purpose of the primary blackboard is to synchronize "local" machines (the ones at your site). This blackboard is used for every distributed calculation and is accessed quite intensively. Thus, your machine should be able to access it reasonably fast. For example, if you are working in G+P Zurich, you should choose a primary blackboard that is also in G+P Zurich.

The purpose of the secondary blackboard is to synchronize your machine with more remote machines (not at your site). SLIP uses this blackboard only for rather big tasks (for example, it is not used for calculations that take less than 30 minutes). This blackboard can be anywhere (for example, in another G+P site; it will soon be possible to use web-folders for this blackboard). Access to this blackboard is "nice" in the sense that files are transferred without using too much bandwidth (other users are not disturbed too much by such transfers).

Note that, to avoid excessive communication among sites, workers are currently configured to work only for their local blackboard. Thus, if you use a local blackboard as the primary one (this is usually the most convenient setting), and only the primary blackboard is used (you can choose not to use a secondary blackboard at all), then only workers at your site will work on your distributed calculation. But when the secondary blackboard is also used (and it is set to be at some other site), the calculation gets even more distributed: workers at other sites can also participate, which further speeds up your calculation.
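
The two-level rule can be summarized in a few lines of Python. This is a sketch under assumptions: the function name blackboards_for_job and the idea of an estimated duration are hypothetical, and the 30-minute threshold is taken from the example given above rather than from a documented setting.

```python
def blackboards_for_job(primary, secondary, estimated_minutes, threshold_minutes=30):
    """Return the blackboards a job is published to.

    The primary blackboard is always used. The secondary one is added only
    when it is configured (not None) and the job is big enough.
    """
    boards = [primary]
    if secondary is not None and estimated_minutes >= threshold_minutes:
        boards.append(secondary)
    return boards


if __name__ == "__main__":
    print(blackboards_for_job("G+P ZURICH", "G+P BERN", estimated_minutes=5))
    print(blackboards_for_job("G+P ZURICH", "G+P BERN", estimated_minutes=120))
```

Because workers only serve their local blackboard, a job published to both blackboards is picked up by workers at both sites, while a small job stays entirely local.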

You will find the corresponding two new settings under settings / calculation / distributed calculation / .... Please review these settings. For simplicity, I have created a few aliases, and for now you can only choose among them (using a drop-down list; you will have more freedom in the near future). Be careful not to set the secondary blackboard to be identical to (refer to the same folder as) the primary one. For the moment, the following choices are recommended, depending on where (at which G+P site) your SLIP copy is running:

Site      Primary blackboard    Secondary blackboard
Aarau     G+P AARAU             G+P BERN
Bern      G+P BERN              G+P AARAU
Zurich    G+P ZURICH            G+P BERN
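
For illustration, the recommended choices and the "primary and secondary must differ" rule can be written down as a small Python check. The alias strings are the ones listed above; the names RECOMMENDED and check_blackboards, and the case-insensitive comparison, are assumptions made for this sketch and are not part of SLIP.

```python
# Recommended (primary, secondary) aliases per site, as listed above.
RECOMMENDED = {
    "Aarau": ("G+P AARAU", "G+P BERN"),
    "Bern": ("G+P BERN", "G+P AARAU"),
    "Zurich": ("G+P ZURICH", "G+P BERN"),
}


def check_blackboards(primary, secondary):
    """Reject a configuration whose secondary blackboard equals the primary one.

    secondary may be None, meaning that no secondary blackboard is used.
    Assumption: aliases are compared ignoring case and surrounding spaces.
    """
    if secondary is not None and secondary.strip().lower() == primary.strip().lower():
        raise ValueError("secondary blackboard must differ from the primary one")
    return primary, secondary


if __name__ == "__main__":
    for site, (primary, secondary) in RECOMMENDED.items():
        print(site, check_blackboards(primary, secondary))
```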

See also Reflexionen (Berechnungsoptionen).