********* Launchers ********* SmartSim interfaces with a number of backends called `launchers` that are responsible for constructing jobs based on run parameters and launching them onto a system. The `launchers` allow SmartSim users to interact with their system programmatically through a python interface. Because of this, SmartSim users do not have to leave the Jupyter Notebook, Python REPL, or Python script to launch, query, and interact with their jobs. SmartSim currently supports 5 `launchers`: 1. ``local``: for single-node, workstation, or laptop 2. ``slurm``: for systems using the Slurm scheduler 3. ``pbs``: for systems using the PBSpro scheduler 4. ``cobalt``: for systems using the Cobalt scheduler 5. ``lsf``: for systems using the LSF scheduler 6. ``auto``: have SmartSim auto-detect the launcher to use. To specify a specific launcher, one argument needs to be provided to the ``Experiment`` initialization. .. code-block:: python from smartsim import Experiment exp = Experiment("name-of-experiment", launcher="local") # local launcher exp = Experiment("name-of-experiment", launcher="slurm") # Slurm launcher exp = Experiment("name-of-experiment", launcher="pbs") # PBSpro launcher exp = Experiment("name-of-experiment", launcher="cobalt") # Cobalt launcher exp = Experiment("name-of-experiment", launcher="lsf") # LSF launcher exp = Experiment("name-of-experiment", launcher="auto") # auto-detect launcher ------------------------------------------------------------------------- Local ===== The local launcher can be used on laptops, workstations and single nodes of supercomputer and cluster systems. Through launching locally, users can prototype workflows and quickly scale them to larger systems with minimal changes. As with all launchers in SmartSim, the local launcher supports asynchronous execution meaning once entities have been launched the main thread of execution is not blocked. Daemon threads that manage currently running jobs will be created when active jobs are present within SmartSim. .. _psutil: https://github.com/giampaolo/psutil The local launcher uses the `psutil`_ library to execute and monitor user-created jobs. Running Locally --------------- The local launcher supports the base :ref:`RunSettings API ` which can be used to run executables as well as run executables with arbitrary launch binaries like `mpiexec`. The local launcher is the default launcher for all ``Experiment`` instances. The local launcher does not support batch launching. Ensembles are always executed in parallel but launched sequentially. ---------------------------------------------------------------------- Slurm ===== The Slurm launcher works directly with the Slurm scheduler to launch, query, monitor and stop applications. During the course of an ``Experiment``, launched entities can be queried for status, completion, and errors. The amount of communication between SmartSim and Slurm can be tuned for specific guidelines of different sites by setting the value for ``jm_interval`` in the SmartSim configuration file. To use the Slurm launcher, specify at ``Experiment`` initialization: .. code-block:: python from smartsim import Experiment exp = Experiment("NAMD-worklfow", launcher="slurm") Running on Slurm ---------------- The Slurm launcher supports three types of ``RunSettings``: 1. :ref:`SrunSettings ` 2. :ref:`MpirunSettings ` 3. :ref:`MpiexecSettings ` As well as batch settings for ``sbatch`` through: 1. :ref:`SbatchSettings ` Both supported ``RunSettings`` types above can be added to a ``SbatchSettings`` batch workload through ``Ensemble`` creation. Getting Allocations ------------------- Slurm supports a number of user facing features that other schedulers do not. For this reason, an extra module :ref:`smartsim.slurm ` can be used to obtain allocations to launch on and release them after ``Experiment`` completion. .. code-block:: python from smartsim.wlm import slurm alloc = slurm.get_allocation(nodes=1) The ID of the allocation is returned as a string to the user so that they can specify what entities should run on which allocations obtained by SmartSim. Additional arguments that would have been passed to the ``salloc`` command can be passed through the ``options`` argument in a dictionary. Anything passed to the options will be processed as a Slurm argument and appended to the salloc command with the appropriate prefix (e.g. `-` or `--`). For arguments without a value, pass None as the value: - `exclusive=None` .. code-block:: python from smartsim.wlm import slurm salloc_options = { "C": "haswell", "partition": "debug", "exclusive": None } alloc_id = slurm.get_slurm_allocation(nodes=128, time="10:00:00", options=salloc_options) The above code would generate a ``salloc`` command like: .. code-block:: bash salloc -N 5 -C haswell --partition debug --time 10:00:00 --exclusive Releasing Allocations --------------------- The :ref:`smartsim.slurm ` interface also supports releasing allocations obtained in an experiment. The example below releases the allocation in the example above. .. code-block:: python from smartsim.wlm import slurm salloc_options = { "C": "haswell", "partition": "debug", "exclusive": None } alloc_id = slurm.get_slurm_allocation(nodes=128, time="10:00:00", options=salloc_options) # slurm.release_slurm_allocation(alloc_id) ------------------------------------------------------------------- PBSPro ====== Like the Slurm launcher, the PBSPro launcher works directly with the PBSPro scheduler to launch, query, monitor and stop applications. The amount of communication between SmartSim and PBSPro can be tuned for specific guidelines of different sites by setting the value for ``jm_interval`` in the SmartSim configuration file. To use the PBSpro launcher, specify at ``Experiment`` initialization: .. code-block:: python from smartsim import Experiment exp = Experiment("LAMMPS-melt", launcher="pbs") Running on PBSpro ----------------- The PBSpro launcher supports three types of ``RunSettings``: 1. :ref:`AprunSettings ` 2. :ref:`MpirunSettings ` 3. :ref:`MpiexecSettings ` As well as batch settings for ``qsub`` through: 1. :ref:`QsubBatchSettings ` Both supported ``RunSettings`` types above can be added to a ``QsubBatchSettings`` batch workload through ``Ensemble`` creation. --------------------------------------------------------------------- Cobalt ====== The Cobalt Launcher works just like the PBSPro launcher and is compatible with ALPS and OpenMPI workloads as well. To use the Cobalt launcher, specify at ``Experiment`` initialization: .. code-block:: python from smartsim import Experiment exp = Experiment("MOM6-double-gyre", launcher="cobalt") Running on Cobalt ----------------- The Cobalt launcher supports three types of ``RunSettings``: 1. :ref:`AprunSettings ` 2. :ref:`MpirunSettings ` 3. :ref:`MpiexecSettings ` As well as batch settings for ``qsub`` through: 1. :ref:`CobaltBatchSettings ` Both supported ``RunSettings`` types above can be added to a ``CobaltBatchSettings`` batch workload through ``Ensemble`` creation. --------------------------------------------------------------------- LSF === The LSF Launcher works like the PBSPro and Cobalt launchers and is compatible with LSF and OpenMPI workloads. To use the LSF launcher, specify at ``Experiment`` initialization: .. code-block:: python from smartsim import Experiment exp = Experiment("MOM6-double-gyre", launcher="lsf") Running on LSF -------------- The LSF launcher supports three types of ``RunSettings``: 1. :ref:`JsrunSettings ` 2. :ref:`MpirunSettings ` 3. :ref:`MpiexecSettings ` As well as batch settings for ``bsub`` through: 1. :ref:`BsubBatchSettings ` Both supported ``RunSettings`` types above can be added to a ``BsubBatchSettings`` batch workload through ``Ensemble`` creation.