Distributed calculation: including computing power
This option speeds up calculations considerably, because many computers in the local network can take part in the calculation process.
Details
You can calculate much faster using the new calculation option Distributed calculation. With this option, your project is calculated by a group of machines (including yours).
How it works
When you start a distributed calculation, the SLIP instance you are running plays the role of a client. The client subdivides the calculation job into several tasks, each of which consists of calculating a small region (bounding box or extent) that contains some of the immission points to be calculated. For a big project, the client usually specifies many hundreds of such tasks. The data needed for the job and the specifications of the tasks are written to a globally accessible blackboard (essentially, a directory in the network).
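The subdivision step can be illustrated with a minimal Python sketch. It assumes square tiles of a fixed size and 2D points; the names `Task`, `split_into_tasks`, and `tile_size` are hypothetical and do not come from SLIP, whose actual subdivision strategy is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import floor


@dataclass(frozen=True)
class Task:
    """One unit of work: a bounding box plus the points that fall inside it."""
    task_id: int
    box: tuple      # (xmin, ymin, xmax, ymax)
    points: tuple   # ((x, y), ...)


def split_into_tasks(points, tile_size):
    """Group points into square tiles; create one task per non-empty tile."""
    tiles = defaultdict(list)
    for x, y in points:
        # Integer tile indices identify the tile that contains the point.
        key = (floor(x / tile_size), floor(y / tile_size))
        tiles[key].append((x, y))

    tasks = []
    for task_id, ((ix, iy), pts) in enumerate(sorted(tiles.items())):
        box = (ix * tile_size, iy * tile_size,
               (ix + 1) * tile_size, (iy + 1) * tile_size)
        tasks.append(Task(task_id, box, tuple(pts)))
    return tasks


if __name__ == "__main__":
    demo = [(1.0, 1.0), (2.0, 3.0), (12.0, 1.0), (25.0, 25.0)]
    for task in split_into_tasks(demo, tile_size=10.0):
        print(task.task_id, task.box, len(task.points))
```

Empty tiles produce no task, so the number of tasks depends on where the immission points actually lie rather than on the size of the project area.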
These tasks are then processed by workers, which run on several machines in the network. Roughly, a worker runs a loop, executing the following: (1) read a task specification from the blackboard, (2) calculate the points in the wanted box, and (3) write the results to the blackboard. The SLIP instance playing the role of the client also plays the role of a worker; this ensures that there is always at least one worker working on your job. All other workers run only on machines that have been user-idle for a few minutes (this is configurable; overall best results are obtained with values between 8 and 30 min).
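The worker loop described above might look roughly like the following Python sketch. It assumes a blackboard directory with `pending`, `claimed`, and `results` subfolders and JSON task files; this layout, the function names, and the rename-based claim are illustrative assumptions, not SLIP's actual file format or protocol. Note also that rename semantics on network shares vary, so a real system would need to verify them.

```python
import json
import os
from pathlib import Path


def claim_task(board: Path, worker_id: str):
    """Try to claim one pending task by renaming it; return its path or None."""
    for pending in sorted((board / "pending").glob("*.json")):
        claimed = board / "claimed" / f"{pending.stem}.{worker_id}.json"
        try:
            # If another worker renamed the file first, the source is gone
            # and this call raises an OSError.
            os.rename(pending, claimed)
        except OSError:
            continue
        return claimed
    return None


def run_worker(board: Path, worker_id: str, calculate, should_stop=lambda: False):
    """Loop: (1) read a task, (2) calculate its points, (3) write the results."""
    done = 0
    while not should_stop():  # e.g. stop when the machine's user comes back
        claimed = claim_task(board, worker_id)
        if claimed is None:
            break  # nothing left to do on this blackboard
        task = json.loads(claimed.read_text())
        result = {
            "task_id": task["task_id"],
            "values": [calculate(p) for p in task["points"]],
        }
        out = board / "results" / f"{task['task_id']}.json"
        tmp = out.with_suffix(".tmp")
        tmp.write_text(json.dumps(result))
        os.replace(tmp, out)  # publish the result file in one step
        claimed.unlink()
        done += 1
    return done
```

The `should_stop` callback is where an idle check would plug in: a worker on a shared machine could pass a function that returns true as soon as user activity is detected.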
When all tasks are done, the client simply fetches the results and stores them in your project.
| ❑ | Details: - The system works with any type of source and receiver (including both types of surface receivers).
- Multi-selection (or selection-list) calculation is also supported.
- Even if you selected the distributed-calculation option, the program will not distribute the calculation if it's not "necessary" (if there is too little work to do).
- The system is highly transparent to the user using the "client".
- It is also transparent to the users of the machines on which "workers" are installed; workers run at low priority, and only when the computer is idle (and only if the computer has enough resources). Suppose a worker installed on an (idle) machine is working on a task, and the user of this machine comes back (moves the mouse or presses a key). When this happens, the calculation stops and the worker exits (usually within 3 seconds).
- It is possible to start several workers per machine (makes sense for multicore machines).
- The number of workers can be changed (increased or decreased) dynamically, even while calculations are being performed.
- It supports an unlimited number of simultaneous clients (should remain efficient for up to […] simultaneous clients, at least).
- The current version should efficiently support up to […] workers.
- The system is robust: in particular, the calculation will not stop even if a worker is killed (even if its machine explodes), or even if, in the middle of a calculation, the network path where the task agenda is stored becomes permanently inaccessible.
- Note that the distributed calculation works even if only the client is running, because the client is also a worker. Also, in contrast to other workers, the client only works for its own job.
- The full installation of the distributed calculation system only requires simple settings.
- Depending on where a blackboard is located in the network, the efficiency of the system may vary, especially when saving the job data, which includes the acoustic elements that are relevant for the job.
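The text states that a calculation survives a killed worker but does not say how. One common way to get this property, shown here purely as an assumption and not as SLIP's mechanism, is to treat a claim as a lease: a task that has stayed claimed for too long is put back among the pending tasks. The directory layout (`pending`, `claimed`) and all names below are hypothetical.

```python
import os
import time
from pathlib import Path


def requeue_stale_tasks(board: Path, lease_seconds: float, now=None) -> int:
    """Move tasks whose claim is older than lease_seconds back to pending.

    Assumes claimed files are named '<task>.<worker>.json' and that a worker
    refreshes the file's modification time when it claims the task.
    """
    now = time.time() if now is None else now
    requeued = 0
    for claimed in (board / "claimed").glob("*.json"):
        if now - claimed.stat().st_mtime < lease_seconds:
            continue  # the lease is still valid; the worker may be busy
        task_name = claimed.name.split(".")[0] + ".json"
        try:
            os.rename(claimed, board / "pending" / task_name)
        except OSError:
            continue  # already finished or requeued by someone else
        requeued += 1
    return requeued
```

With such a scheme, the client (which is always a worker for its own job) could call `requeue_stale_tasks` periodically, so that abandoned tasks are eventually picked up again.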
You will find the two corresponding new settings under settings / calculation / distributed calculation / .... Please review these settings. Currently, environment variables are used to specify the primary and secondary blackboards. For now, the secondary blackboard is disabled, to avoid excessive network traffic.
Version info-center
- Even more parallelism.
As you already know, you can use the multicore calculation and distributed calculation options simultaneously to accelerate your calculations. Now, workers running on other (idle) machines themselves use the multicore option. (Note that they do so even if you do not activate the multicore option on your own computer.)
- Distributed calculation: two-level distribution.
As explained in a previous update, SLIP's distributed calculations are based on blackboards (think of them as exchange folders that can be reached by all the machines involved in such calculations). In particular, blackboards are used to exchange task definitions and results. In this update, the distributed-calculation implementation has been extended to support two blackboards: a primary and a secondary one. As explained next, having two levels considerably reduces unnecessary communication with remote sites.
The main purpose of the primary blackboard is to synchronize "local" machines (the ones at your site). This blackboard is used for every distributed calculation and is accessed quite intensively. Thus, your machine should be able to access it reasonably fast (typically, it should be in your local-area network).
The purpose of the secondary blackboard is to synchronize your machine with more remote machines (not at your site). SLIP uses this blackboard only for rather big jobs (for example, it is not used for calculations that take less than 5 minutes). This blackboard can, for example, be at another G+P site, and it will soon be possible to use web folders for it. Access to this blackboard is "nice" in the sense that files are transferred without using too much bandwidth (other users are not disturbed too much by such transfers).
Note that, to avoid excessive communication among sites, workers are configured to work only for their local blackboard. Thus, if you use a local blackboard as the primary one (this is usually the most convenient setting), and only the primary blackboard is used (you can choose not to use a secondary blackboard at all), then only workers at your site will work on your distributed calculation.
When the secondary blackboard is also used (and it is set to be at some other site), the calculation becomes even more distributed: workers at other sites can also participate, which further speeds up your calculation. [In addition, workers will soon be able to work for a remote blackboard when there is nothing to do in their local one, and within some specified time periods.]
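The two-level rule can be summarized in a short Python sketch. The environment variable names `SLIP_PRIMARY_BLACKBOARD` and `SLIP_SECONDARY_BLACKBOARD` are invented for illustration (the actual variable names are not given here), and the 5-minute threshold is taken from the example in the text rather than from a documented setting.

```python
import os

# Example threshold from the text: the secondary blackboard is not used
# for calculations that take less than 5 minutes.
SECONDARY_MIN_SECONDS = 5 * 60


def select_blackboards(estimated_seconds, env=None):
    """Return the blackboard paths to use for a job of the given estimated duration."""
    env = os.environ if env is None else env

    # Hypothetical variable names, chosen only for this sketch.
    primary = env.get("SLIP_PRIMARY_BLACKBOARD")
    if not primary:
        raise ValueError("no primary blackboard configured")

    boards = [primary]  # the primary blackboard is used for every job
    secondary = env.get("SLIP_SECONDARY_BLACKBOARD")
    if secondary and estimated_seconds >= SECONDARY_MIN_SECONDS:
        boards.append(secondary)  # only for rather big jobs
    return boards
```

Leaving the secondary variable unset corresponds to the case described above in which only workers at your own site take part in the calculation.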
|
For details, see Calcul Distribué.