Ensemble#
Overview#
A SmartSim Ensemble enables users to run a group of computational tasks together in an
Experiment workflow. An Ensemble is comprised of multiple Model objects,
where each Ensemble member (SmartSim Model) represents an individual application.
An Ensemble can be managed as a single entity and
launched with other Models and Orchestrators to construct AI-enabled workflows.
The Ensemble API offers key features, including methods to:
Attach Configuration Files for use at Ensemble runtime.
Load AI Models (TF, TF-lite, PT, or ONNX) into the Orchestrator at Ensemble runtime.
Load TorchScripts into the Orchestrator at Ensemble runtime.
Prevent Data Collisions within the Ensemble, which allows for reuse of application code.
To create a SmartSim Ensemble, use the Experiment.create_ensemble API function. When
initializing an Ensemble, consider one of the three creation strategies explained
in the Initialization section.
SmartSim manages Ensemble instances through the Experiment API by providing functions to
launch, monitor, and stop applications.
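As a quick orientation, the following is a minimal sketch of that lifecycle: it creates a small Ensemble of replicas, launches it, queries the status of its members, and stops it. The entity name and executable path are placeholders, and the snippet is an illustrative sketch of the Experiment functions named above rather than a complete workflow.
from smartsim import Experiment

# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")

# Describe how each Ensemble member is executed (placeholder executable path)
rs = exp.create_run_settings(exe="path/to/example_simulation_program")

# Create a small Ensemble of two identical members
ensemble = exp.create_ensemble("overview-ensemble", run_settings=rs, replicas=2)

# Launch the Ensemble and wait for all members to finish
exp.start(ensemble, block=True)

# Monitor: query the status of each Ensemble member
print(exp.get_status(ensemble))

# Stop the Ensemble (a no-op here since the members already completed)
exp.stop(ensemble)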
Initialization#
Overview#
The Experiment API is responsible for initializing all workflow entities.
An Ensemble is created using the Experiment.create_ensemble factory method, and users can customize the
Ensemble creation via the factory method parameters.
The factory method arguments for Ensemble creation can be found in the Experiment API
under the create_ensemble docstring.
By using specific combinations of the factory method arguments, users can tailor
the creation of an Ensemble to align with one of the following creation strategies:
Parameter Expansion: Generate a variable-sized set of unique simulation instances configured with user-defined input parameters.
Replica Creation: Generate a specified number of Model replicas.
Manually: Attach pre-configured Models to an Ensemble to manage as a single unit.
Parameter Expansion#
Parameter expansion is a technique that allows users to set parameter values per Ensemble member.
This is done by specifying input to the params and perm_strategy factory method arguments during
Ensemble creation (Experiment.create_ensemble). Users may control how the params values
are applied to the Ensemble through the perm_strategy argument. The perm_strategy argument
accepts three values listed below.
Parameter Expansion Strategy Options:
"all_perm": Generate all possible parameter permutations for an exhaustive exploration. This means that every possible combination of parameters will be used in the Ensemble.
"step": Create parameter sets by collecting identically indexed values across parameter lists. This allows for discrete combinations of parameters for Models.
"random": Enable random selection from predefined parameter spaces, offering a stochastic approach. This means that the parameters will be chosen randomly for each Model, which can be useful for exploring a wide range of possibilities. A hedged sketch of this strategy follows below.
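The sketch below illustrates the "random" strategy under stated assumptions: the parameter values, entity name, and executable path are placeholders, and the n_models keyword used to cap the number of randomly drawn members is our assumption about the keyword forwarded to the random strategy; consult the create_ensemble docstring for the exact argument.
from smartsim import Experiment

exp = Experiment("getting-started", launcher="auto")
rs = exp.create_run_settings(exe="path/to/example_simulation_program")

# Parameter space to sample from (placeholder values)
params = {
    "thermo": [10, 20, 30, 40],
    "steps": [100, 200, 300, 400]
}

# Draw four random parameter combinations (n_models keyword assumed)
ensemble = exp.create_ensemble(
    "random-ensemble",
    run_settings=rs,
    params=params,
    perm_strategy="random",
    n_models=4
)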
Examples#
This subsection contains two examples of Ensemble parameter expansion. The
first example illustrates parameter expansion using two parameters
while the second example demonstrates parameter expansion with two
parameters along with the launch of the Ensemble as a batch workload.
Example 1 : Parameter Expansion Using all_perm Strategy
In this example an Ensemble of four Model entities is created by expanding two parameters using the all_perm strategy. All of the Models in the Ensemble share the same RunSettings and only differ in the value of the params assigned to each member. The source code example is available in the dropdown below for convenient execution and customization.

Example Driver Script Source Code

from smartsim import Experiment

# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")

# Initialize a RunSettings
rs = exp.create_run_settings(exe="path/to/example_simulation_program")

# Create the parameters to expand to the Ensemble members
params = {
    "name": ["Ellie", "John"],
    "parameter": [2, 11]
}

# Initialize the Ensemble by specifying RunSettings, the params and "all_perm"
ensemble = exp.create_ensemble("model_member", run_settings=rs, params=params, perm_strategy="all_perm")

Begin by initializing a RunSettings object to apply to all Ensemble members:

# Initialize a RunSettings
rs = exp.create_run_settings(exe="path/to/example_simulation_program")

Next, define the parameters that will be applied to the Ensemble:

# Create the parameters to expand to the Ensemble members
params = {
    "name": ["Ellie", "John"],
    "parameter": [2, 11]
}

Finally, initialize an Ensemble by specifying the RunSettings, params and perm_strategy="all_perm":

# Initialize the Ensemble by specifying RunSettings, the params and "all_perm"
ensemble = exp.create_ensemble("model_member", run_settings=rs, params=params, perm_strategy="all_perm")

By specifying perm_strategy="all_perm", all permutations of the params will be calculated and distributed across Ensemble members. Here there are four permutations of the params values:

ensemble member 1: ["Ellie", 2]
ensemble member 2: ["Ellie", 11]
ensemble member 3: ["John", 2]
ensemble member 4: ["John", 11]
Example 2 : Parameter Expansion Using step Strategy with the Ensemble Configured For Batch Launching
In this example an Ensemble of two Model entities is created by expanding two parameters using the step strategy. All of the Models in the Ensemble share the same RunSettings and only differ in the value of the params assigned to each member. Lastly, the Ensemble is submitted as a batch workload. The source code example is available in the dropdown below for convenient execution and customization.

Example Driver Script Source Code

from smartsim import Experiment

# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")

# Initialize a BatchSettings
bs = exp.create_batch_settings(nodes=2, time="10:00:00")

# Initialize and configure RunSettings
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")
rs.set_nodes(1)

# Create the parameters to expand to the Ensemble members
params = {
    "name": ["Ellie", "John"],
    "parameter": [2, 11]
}

# Initialize the Ensemble by specifying RunSettings, BatchSettings, the params and "step"
ensemble = exp.create_ensemble("ensemble", run_settings=rs, batch_settings=bs, params=params, perm_strategy="step")

Begin by initializing and configuring a BatchSettings object to run the Ensemble instance:

# Initialize a BatchSettings
bs = exp.create_batch_settings(nodes=2, time="10:00:00")

The above BatchSettings object will instruct SmartSim to run the Ensemble on two nodes with a timeout of 10 hours.

Next, initialize a RunSettings object to apply to all Ensemble members:

# Initialize and configure RunSettings
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")
rs.set_nodes(1)

Next, define the parameters to include in the Ensemble:

# Create the parameters to expand to the Ensemble members
params = {
    "name": ["Ellie", "John"],
    "parameter": [2, 11]
}

Finally, initialize an Ensemble by passing in the RunSettings, BatchSettings, params and perm_strategy="step":

# Initialize the Ensemble by specifying RunSettings, BatchSettings, the params and "step"
ensemble = exp.create_ensemble("ensemble", run_settings=rs, batch_settings=bs, params=params, perm_strategy="step")

When specifying perm_strategy="step", the params sets are created by collecting identically indexed values across the param value lists.

ensemble member 1: ["Ellie", 2]
ensemble member 2: ["John", 11]
Replicas#
A replica strategy involves the creation of identical Models within an Ensemble.
This strategy is particularly useful for applications that have some inherent randomness.
Users may use the replicas factory method argument to create a specified number of identical
Model members during Ensemble creation (Experiment.create_ensemble).
Examples#
This subsection contains two examples of using the replicas creation strategy. The
first example illustrates creating four Ensemble member clones
while the second example demonstrates creating four Ensemble
member clones along with the launch of the Ensemble as a batch workload.
Example 1 : Ensemble creation with replicas strategy
In this example an Ensemble of four identical Model members is created by specifying the number of clones to create via the replicas argument. All of the Models in the Ensemble share the same RunSettings. The source code example is available in the dropdown below for convenient execution and customization.

Example Driver Script Source Code

from smartsim import Experiment

# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")

# Initialize a RunSettings object
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")

# Initialize the Ensemble by specifying the number of replicas and RunSettings
ensemble = exp.create_ensemble("ensemble-replica", replicas=4, run_settings=rs)

To create an Ensemble of identical Models, begin by initializing a RunSettings object:

# Initialize a RunSettings object
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")

Initialize the Ensemble by specifying the RunSettings object and the number of clones to replicas:

# Initialize the Ensemble by specifying the number of replicas and RunSettings
ensemble = exp.create_ensemble("ensemble-replica", replicas=4, run_settings=rs)

By passing in replicas=4, four identical Ensemble members will be initialized.
Example 2 : Ensemble Creation with Replicas Strategy and Ensemble Batch Launching
In this example an Ensemble of four Model entities is created by specifying the number of clones to create via the replicas argument. All of the Models in the Ensemble share the same RunSettings and the Ensemble is submitted as a batch workload. The source code example is available in the dropdown below for convenient execution and customization.

Example Driver Script Source Code

from smartsim import Experiment

# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")

# Initialize a BatchSettings object
bs = exp.create_batch_settings(nodes=4, time="10:00:00")

# Initialize and configure a RunSettings object
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")
rs.set_nodes(4)

# Initialize an Ensemble
ensemble = exp.create_ensemble("ensemble-replica", replicas=4, run_settings=rs, batch_settings=bs)

To launch the Ensemble of identical Models as a batch job, begin by initializing a BatchSettings object:

# Initialize a BatchSettings object
bs = exp.create_batch_settings(nodes=4, time="10:00:00")

The above BatchSettings object will instruct SmartSim to run the Ensemble on four nodes with a timeout of 10 hours.

Next, create a RunSettings object to apply to all Model replicas:

# Initialize and configure a RunSettings object
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")
rs.set_nodes(4)

Initialize the Ensemble by specifying the RunSettings object, BatchSettings object and the number of clones to replicas:

# Initialize an Ensemble
ensemble = exp.create_ensemble("ensemble-replica", replicas=4, run_settings=rs, batch_settings=bs)

By passing in replicas=4, four identical Ensemble members will be initialized.
Manually Append#
Manually appending Models to an Ensemble offers an in-depth level of customization in Ensemble design.
This approach is favorable when users have distinct requirements for individual Models, such as variations
in parameters, run settings, or different types of simulations.
Examples#
This subsection contains an example of creating an Ensemble by manually appending Models.
The example illustrates attaching two SmartSim Models to the Ensemble.
The Ensemble is submitted as a batch workload.
Example 1 : Append Models to an Ensemble and Launch as a Batch Job
In this example, we append Models to an Ensemble for batch job execution. To do this, we first initialize an Ensemble with a BatchSettings object. Then, we manually create Models and add each to the Ensemble using the Ensemble.add_model function. The source code example is available in the dropdown below for convenient execution and customization.

Example Driver Script Source Code

from smartsim import Experiment

# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")

# Initialize BatchSettings
bs = exp.create_batch_settings(nodes=10, time="01:00:00")

# Initialize Ensemble
ensemble = exp.create_ensemble("ensemble-append", batch_settings=bs)

# Initialize RunSettings for Model 1
srun_settings_1 = exp.create_run_settings(exe="python", exe_args="path/to/application_script_1.py")
# Initialize RunSettings for Model 2
srun_settings_2 = exp.create_run_settings(exe="python", exe_args="path/to/application_script_2.py")

# Initialize Model 1 with RunSettings 1
model_1 = exp.create_model(name="model_1", run_settings=srun_settings_1)
# Initialize Model 2 with RunSettings 2
model_2 = exp.create_model(name="model_2", run_settings=srun_settings_2)

# Add Model member to Ensemble
ensemble.add_model(model_1)
# Add Model member to Ensemble
ensemble.add_model(model_2)

To create an empty Ensemble to append Models to, initialize the Ensemble with a batch settings object:

# Initialize BatchSettings
bs = exp.create_batch_settings(nodes=10, time="01:00:00")

# Initialize Ensemble
ensemble = exp.create_ensemble("ensemble-append", batch_settings=bs)

Next, create the Models to append to the Ensemble:

# Initialize RunSettings for Model 1
srun_settings_1 = exp.create_run_settings(exe="python", exe_args="path/to/application_script_1.py")
# Initialize RunSettings for Model 2
srun_settings_2 = exp.create_run_settings(exe="python", exe_args="path/to/application_script_2.py")

# Initialize Model 1 with RunSettings 1
model_1 = exp.create_model(name="model_1", run_settings=srun_settings_1)
# Initialize Model 2 with RunSettings 2
model_2 = exp.create_model(name="model_2", run_settings=srun_settings_2)

Finally, append the Model objects to the Ensemble:

# Add Model member to Ensemble
ensemble.add_model(model_1)
# Add Model member to Ensemble
ensemble.add_model(model_2)

The new Ensemble is comprised of the two appended Model members.
Files#
Overview#
Ensemble members often depend on external files (e.g., training datasets, evaluation datasets)
to operate as intended. Users can instruct SmartSim to copy, symlink, or manipulate external files
prior to an Ensemble launch via the Ensemble.attach_generator_files function. Attached files
will be applied to all Ensemble members.
Note
Multiple calls to Ensemble.attach_generator_files will overwrite previous file configurations
on the Ensemble.
To attach a file to an Ensemble for use at runtime, provide one of the following arguments to the
Ensemble.attach_generator_files function:
to_copy (t.Optional[t.List[str]] = None): Files that are copied into the path of the Ensemble members.
to_symlink (t.Optional[t.List[str]] = None): Files that are symlinked into the path of the Ensemble members. A symlink, or symbolic link, is a file that points to another file or directory, allowing you to access that file as if it were located in the same directory as the symlink.
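As a brief, hedged illustration of the two arguments above, the sketch below copies a small input file and symlinks a larger dataset into the run directory of every Ensemble member. The file paths, entity name, and executable path are placeholders.
from smartsim import Experiment

exp = Experiment("getting-started", launcher="auto")
rs = exp.create_run_settings(exe="path/to/example_simulation_program")
ensemble = exp.create_ensemble("file-ensemble", run_settings=rs, replicas=2)

# Copy a small input file and symlink a large dataset into each member's directory
ensemble.attach_generator_files(
    to_copy=["path/to/input_settings.json"],
    to_symlink=["path/to/large_training_dataset"]
)

# The attached files are placed when the member directories are generated
exp.generate(ensemble)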
To specify a template file in order to programmatically replace specified parameters during generation
of Ensemble member directories, pass the following value to the Ensemble.attach_generator_files function:
to_configure (t.Optional[t.List[str]] = None): This parameter is designed for text-based Ensemble member input files. During directory generation for Ensemble members, the attached files are parsed and the tagged parameters are replaced with the params values applied to each Ensemble member, following the Ensemble creation strategy. Tagged parameters are placeholders in the text that are replaced with the actual parameter values during the directory generation process. The default tag is a semicolon (e.g., THERMO = ;THERMO;).
In the Example subsection, we provide an example using the to_configure parameter
of Ensemble.attach_generator_files.
See also
To add a file to a single Model that will be appended to an Ensemble, refer to the Files
section of the Model documentation.
Example#
This example demonstrates how to attach a text file to an Ensemble for parameter replacement.
This is accomplished using the params parameter of
the Experiment.create_ensemble factory function and the to_configure parameter
of Ensemble.attach_generator_files. The source code example is available in the dropdown below for
convenient execution and customization.
Example Driver Script Source Code
from smartsim import Experiment
# Initialize the Experiment
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="python", exe_args="/path/to/application.py")
# Initialize an Ensemble object via replicas strategy
example_ensemble = exp.create_ensemble("ensemble", run_settings=ensemble_settings, replicas=2, params={"THERMO":1})
# Attach the file to the Ensemble instance
example_ensemble.attach_generator_files(to_configure="path/to/params_inputs.txt")
# Generate the Ensemble directory
exp.generate(example_ensemble)
# Launch the Ensemble
exp.start(example_ensemble)
In this example, we have a text file named params_inputs.txt. Within the text is the parameter THERMO
that is required by each Ensemble member at runtime:
THERMO = ;THERMO;
In order to have the tagged parameter ;THERMO; replaced with a usable value at runtime, two steps are required:
1. The THERMO variable must be included in the Experiment.create_ensemble factory method as part of the params parameter.
2. The file containing the tagged parameter ;THERMO;, params_inputs.txt, must be attached to the Ensemble via the Ensemble.attach_generator_files method as part of the to_configure parameter.
To encapsulate our application within an Ensemble, we must create an Experiment instance
to gain access to the Experiment factory method that creates the Ensemble.
Begin by importing the Experiment module and initializing an Experiment:
from smartsim import Experiment

# Initialize the Experiment
exp = Experiment("getting-started", launcher="auto")
To create our Ensemble, we are using the replicas initialization strategy.
Begin by creating a simple RunSettings object that specifies the executable
to run and the path to the application script:
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="python", exe_args="/path/to/application.py")
Next, initialize an Ensemble object with Experiment.create_ensemble
by passing in ensemble_settings, params={"THERMO":1} and replicas=2:
# Initialize an Ensemble object via replicas strategy
example_ensemble = exp.create_ensemble("ensemble", run_settings=ensemble_settings, replicas=2, params={"THERMO":1})
We now have an Ensemble instance named example_ensemble. Attach the above text file
to the Ensemble for use at entity runtime. To do so, we use the
Ensemble.attach_generator_files function and specify the to_configure
parameter with the path to the text file, params_inputs.txt:
# Attach the file to the Ensemble instance
example_ensemble.attach_generator_files(to_configure="path/to/params_inputs.txt")
To create an isolated directory for the Ensemble member outputs and configuration files, invoke Experiment.generate via the
Experiment instance exp with example_ensemble as an input parameter:
# Generate the Ensemble directory
exp.generate(example_ensemble)
After invoking Experiment.generate, the attached generator files will be available for the
application when exp.start(example_ensemble) is called.
# Launch the Ensemble
exp.start(example_ensemble)
The contents of params_inputs.txt after Ensemble completion are:
THERMO = 1
ML Models and Scripts#
Overview#
SmartSim users have the capability to load ML models and TorchScripts into an Orchestrator
within the Experiment script for use within Ensemble members. Functions
accessible through an Ensemble object support loading ML models (TensorFlow, TensorFlow-lite,
PyTorch, and ONNX) and TorchScripts into standalone or colocated Orchestrators before
application runtime.
See also
To add an ML model or TorchScript to a single Model that will be appended to an
Ensemble, refer to the ML Models and Scripts
section of the Model documentation.
Depending on the planned storage method of the ML model, there are two distinct
approaches to load it into the Orchestrator: from memory or from file.
Warning
Uploading an ML model from memory is solely supported for
standalone Orchestrators. To upload an ML model to a colocated Orchestrator, users
must save the ML model to disk and upload from file.
Depending on the planned storage method of the TorchScript, there are three distinct
approaches to load it into the Orchestrator: from memory, from file, or from string.
Warning
Uploading a TorchScript from memory is solely supported for
standalone Orchestrators. To upload a TorchScript to a colocated Orchestrator, users
must upload it from file or from string.
Once an ML model or TorchScript is loaded into the Orchestrator, Ensemble members can
leverage ML capabilities by utilizing the SmartSim client (SmartRedis)
to execute the stored ML models or TorchScripts.
AI Models#
When configuring an Ensemble, users can instruct SmartSim to load
Machine Learning (ML) models dynamically to the Orchestrator (colocated or standalone). ML models added
are loaded into the Orchestrator prior to the execution of the Ensemble. To load an ML model
to the Orchestrator, SmartSim users can serialize and provide the ML model in-memory or specify the file path
via the Ensemble.add_ml_model function. The supported ML frameworks are TensorFlow,
TensorFlow-lite, PyTorch, and ONNX.
Users must serialize TensorFlow ML models before sending to an Orchestrator from memory
or from file. To save a TensorFlow model to memory, SmartSim offers the serialize_model
function. This function returns the TF model as a byte string with the names of the
input and output layers, which will be required upon uploading. To save a TF model to disk,
SmartSim offers the freeze_model function which returns the path to the serialized
TF model file with the names of the input and output layers. Additional TF model serialization
information and examples can be found in the ML Features section of SmartSim.
Note
Uploading an ML model from memory is only supported for standalone Orchestrators.
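For orientation, the sketch below shows the two serialization helpers side by side on a small placeholder Keras model, mirroring the calls used in the examples that follow; the model definition, output directory, and file name are placeholders.
from tensorflow import keras
from smartsim.ml.tf import serialize_model, freeze_model

# A tiny placeholder Keras model
model = keras.Sequential([keras.layers.Dense(4, input_shape=(4,))])

# In-memory: returns the model as a byte string plus the input/output layer names
model_bytes, inputs, outputs = serialize_model(model)

# On disk: writes a frozen model file and returns its path plus the layer names
model_path, inputs, outputs = freeze_model(model, "path/to/output_dir", "model.pb")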
When attaching an ML model using Ensemble.add_ml_model, the
following arguments are offered to customize storage and execution:
name (str): name to reference the ML model in the Orchestrator.
backend (str): name of the backend (TORCH, TF, TFLITE, ONNX).
model (t.Optional[str] = None): An ML model in memory (only supported for non-colocated Orchestrators).
model_path (t.Optional[str] = None): path to the serialized ML model file.
device (t.Literal[“CPU”, “GPU”] = “CPU”): name of device for execution, defaults to “CPU”.
devices_per_node (int = 1): The number of GPU devices available on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
first_device (int = 0): The first GPU device to use on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
batch_size (int = 0): batch size for execution, defaults to 0.
min_batch_size (int = 0): minimum batch size for ML model execution, defaults to 0.
min_batch_timeout (int = 0): time to wait for minimum batch size, defaults to 0.
tag (str = “”): additional tag for ML model information, defaults to “”.
inputs (t.Optional[t.List[str]] = None): ML model inputs (TF only), defaults to None.
outputs (t.Optional[t.List[str]] = None): ML model outputs (TF only), defaults to None.
See also
To add an ML model to a single Model that will be appended to an
Ensemble, refer to the AI Models
section of the Model documentation.
Example: Attach an In-Memory ML Model#
This example demonstrates how to attach an in-memory ML model to a SmartSim Ensemble
to load into an Orchestrator at Ensemble runtime. The source code example is
available in the dropdown below for convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, Input
class Net(keras.Model):
def __init__(self):
super(Net, self).__init__(name="cnn")
self.conv = Conv2D(1, 3, 1)
def call(self, x):
y = self.conv(x)
return y
def create_tf_cnn():
"""Create an in-memory Keras CNN for example purposes
"""
from smartsim.ml.tf import serialize_model
n = Net()
input_shape = (3,3,1)
inputs = Input(input_shape)
outputs = n(inputs)
model = keras.Model(inputs=inputs, outputs=outputs, name=n.name)
return serialize_model(model)
# Serialize and save TF model
model, inputs, outputs = create_tf_cnn()
# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="path/to/example_simulation_program")
# Initialize an Ensemble object
ensemble_instance = exp.create_ensemble("ensemble_name", run_settings=ensemble_settings)
# Attach the in-memory ML model to the SmartSim Ensemble
ensemble_instance.add_ml_model(name="cnn", backend="TF", model=model, device="GPU", devices_per_node=2, first_device=0, inputs=inputs, outputs=outputs)
Note
This example assumes:
an Orchestrator is launched prior to the Ensemble execution
an initialized Ensemble named ensemble_instance exists within the Experiment workflow
a TensorFlow-based ML model was serialized using serialize_model, which returns the ML model as a byte string with the names of the input and output layers
Attach the ML Model to a SmartSim Ensemble
In this example, we have a serialized TensorFlow-based ML model that was saved to a byte string stored under model.
Additionally, the serialize_model function returned the names of the input and output layers stored under
inputs and outputs. Assuming an initialized Ensemble named ensemble_instance exists, we add the byte string TensorFlow model using
Ensemble.add_ml_model:
# Attach the in-memory ML model to the SmartSim Ensemble
ensemble_instance.add_ml_model(name="cnn", backend="TF", model=model, device="GPU", devices_per_node=2, first_device=0, inputs=inputs, outputs=outputs)
In the above ensemble_instance.add_ml_model code snippet, we offer the following arguments:
name ("cnn"): A name to reference the ML model in the Orchestrator.
backend ("TF"): Indicating that the ML model is a TensorFlow model.
model (model): The in-memory representation of the TensorFlow model.
device (“GPU”): Specifying the device for ML model execution.
devices_per_node (2): Use two GPUs per node.
first_device (0): Start with 0 index GPU.
inputs (inputs): The name of the ML model input nodes (TensorFlow only).
outputs (outputs): The name of the ML model output nodes (TensorFlow only).
Warning
Calling exp.start(ensemble_instance) prior to the launch of an Orchestrator will result in
a failed attempt to load the ML model to a non-existent standalone Orchestrator.
When the Ensemble is started via Experiment.start, the ML model will be loaded to the
launched standalone Orchestrator. The ML model can then be executed on the Orchestrator via a SmartSim
client (SmartRedis) within the application code.
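The following is a hedged sketch of what that application-side execution might look like with SmartRedis: it places an input tensor, runs the ML model stored under the key "cnn", and retrieves the result. The tensor keys and the input shape (chosen to match the example's 3x3x1 CNN) are placeholders.
from smartredis import Client
import numpy as np

# Connect to the standalone Orchestrator (cluster=False for a single shard)
client = Client(cluster=False)

# Place an input tensor shaped for the example CNN (batch of 1, 3x3x1)
client.put_tensor("cnn_input", np.random.rand(1, 3, 3, 1).astype(np.float32))

# Execute the ML model stored under the key "cnn" and fetch the output
client.run_model("cnn", inputs=["cnn_input"], outputs=["cnn_output"])
prediction = client.get_tensor("cnn_output")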
Example: Attach an ML Model From File#
This example demonstrates how to attach a ML model from file to a SmartSim Ensemble
to load into an Orchestrator at Ensemble runtime. The source code example is
available in the dropdown below for convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, Input
class Net(keras.Model):
def __init__(self):
super(Net, self).__init__(name="cnn")
self.conv = Conv2D(1, 3, 1)
def call(self, x):
y = self.conv(x)
return y
def save_tf_cnn(path, file_name):
"""Create a Keras CNN and save to file for example purposes"""
from smartsim.ml.tf import freeze_model
n = Net()
input_shape = (3, 3, 1)
n.build(input_shape=(None, *input_shape))
inputs = Input(input_shape)
outputs = n(inputs)
model = keras.Model(inputs=inputs, outputs=outputs, name=n.name)
return freeze_model(model, path, file_name)
# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="path/to/example_simulation_program")
# Initialize an Ensemble object
ensemble_instance = exp.create_ensemble("ensemble_name", run_settings=ensemble_settings)
# Serialize and save TF model to file
model_file, inputs, outputs = save_tf_cnn(ensemble_instance.path, "model.pb")
# Attach ML model file to Ensemble
ensemble_instance.add_ml_model(name="cnn", backend="TF", model_path=model_file, device="GPU", devices_per_node=2, first_device=0, inputs=inputs, outputs=outputs)
Note
This example assumes:
a standalone Orchestrator is launched prior to Ensemble execution
an initialized Ensemble named ensemble_instance exists within the Experiment workflow
a TensorFlow-based ML model was serialized using freeze_model, which returns the path to the serialized model file and the names of the input and output layers
Attach the ML Model to a SmartSim Ensemble
In this example, we have a serialized TensorFlow-based ML model that was saved to disk, with the file path stored under model_file.
Additionally, the freeze_model function returned the names of the input and output layers stored under
inputs and outputs. Assuming an initialized Ensemble named ensemble_instance exists, we add a TensorFlow model using
the Ensemble.add_ml_model function and specify the ML model path via the parameter model_path:
# Attach ML model file to Ensemble
ensemble_instance.add_ml_model(name="cnn", backend="TF", model_path=model_file, device="GPU", devices_per_node=2, first_device=0, inputs=inputs, outputs=outputs)
In the above ensemble_instance.add_ml_model code snippet, we offer the following arguments:
name ("cnn"): A name to reference the ML model in the Orchestrator.
backend ("TF"): Indicating that the ML model is a TensorFlow model.
model_path (model_file): The path to the serialized ML model file.
device (“GPU”): Specifying the device for ML model execution.
devices_per_node (2): Use two GPUs per node.
first_device (0): Start with 0 index GPU.
inputs (inputs): The name of the ML model input nodes (TensorFlow only).
outputs (outputs): The name of the ML model output nodes (TensorFlow only).
Warning
Calling exp.start(ensemble_instance) prior to instantiation of an Orchestrator will result in
a failed attempt to load the ML model to a non-existent Orchestrator.
When the Ensemble is started via Experiment.start, the ML model will be loaded to the
launched Orchestrator. The ML model can then be executed on the Orchestrator via a SmartSim
client (SmartRedis) within the application executable.
TorchScripts#
When configuring an Ensemble, users can instruct SmartSim to load TorchScripts dynamically
to the Orchestrator. The TorchScripts become available for each Ensemble member upon being loaded
into the Orchestrator prior to the execution of the Ensemble. SmartSim users may upload
a single TorchScript function via Ensemble.add_function or alternatively upload a script
containing multiple functions via Ensemble.add_script. To load a TorchScript to the
Orchestrator, SmartSim users can follow one of the following processes:
- Define a TorchScript Function In-Memory: Use Ensemble.add_function to instruct SmartSim to load an in-memory TorchScript to the Orchestrator.
- Define Multiple TorchScript Functions From File: Provide a file path to Ensemble.add_script to instruct SmartSim to load the TorchScript from file to the Orchestrator.
- Define a TorchScript Function as String: Provide a function string to Ensemble.add_script to instruct SmartSim to load a raw string as a TorchScript function to the Orchestrator.
Note
Uploading a TorchScript from memory using Ensemble.add_function
is only supported for standalone Orchestrators. Users uploading
TorchScripts to colocated Orchestrators should instead use the function Ensemble.add_script
to upload from file or as a string.
Each function also provides flexible device selection, allowing users to choose the device on which the TorchScript is executed, "GPU" or "CPU". In environments with multiple devices, the number of devices and the first device to use can be specified via the devices_per_node and first_device parameters.
Note
If device=GPU is specified when attaching a TorchScript function to an Ensemble, SmartSim
will execute the TorchScript on a GPU. However, TorchScripts loaded to an Orchestrator are
executed on the Orchestrator compute resources. Therefore, users must make sure that the device
specified is available on the Orchestrator compute resources. For example, if a user
specifies device=GPU but launches the Orchestrator on CPU-only nodes,
the TorchScript will not be executed on a GPU as instructed.
Continue or select the respective process link to learn more on how each function (Ensemble.add_script and Ensemble.add_function)
dynamically loads TorchScripts to the Orchestrator.
See also
To add a TorchScript to a single Model that will be appended to an
Ensemble, refer to the TorchScripts
section of the Model documentation.
Attach an In-Memory TorchScript#
Users can define TorchScript functions within the Experiment driver script
to attach to an Ensemble. This feature is supported by Ensemble.add_function.
Warning
Ensemble.add_function does not support loading in-memory TorchScript functions to a colocated Orchestrator.
If you would like to load a TorchScript function to a colocated Orchestrator, define the function
as a raw string or load from file.
When specifying an in-memory TorchScript function using Ensemble.add_function, the
following arguments are offered:
name (str): reference name for the script inside of the Orchestrator.
function (t.Optional[str] = None): TorchScript function code.
device (t.Literal[“CPU”, “GPU”] = “CPU”): device for script execution, defaults to “CPU”.
devices_per_node (int = 1): The number of GPU devices available on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
first_device (int = 0): The first GPU device to use on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
Example: Load an In-Memory TorchScript Function#
This example walks through the steps of instructing SmartSim to load an in-memory TorchScript function
to a standalone Orchestrator. The source code example is available in the dropdown below for
convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
def timestwo(x):
return 2*x
# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="path/to/example_simulation_program")
# Initialize an Ensemble object
ensemble_instance = exp.create_ensemble("ensemble_name", run_settings=ensemble_settings)
# Attach TorchScript to Ensemble
ensemble_instance.add_function(name="example_func", function=timestwo, device="GPU", devices_per_node=2, first_device=0)
Note
The example assumes:
a standalone Orchestrator is launched prior to Ensemble execution
an initialized Ensemble named ensemble_instance exists within the Experiment workflow
Define an In-Memory TorchScript Function
To begin, define an in-memory TorchScript function within the Python driver script. For the purpose of the example, we add a simple TorchScript function, timestwo:
def timestwo(x):
    return 2*x
Attach the In-Memory TorchScript Function to a SmartSim Ensemble
We use the Ensemble.add_function function to instruct SmartSim to load the TorchScript function timestwo
onto the launched standalone Orchestrator. Specify the function timestwo to the function
parameter:
# Attach TorchScript to Ensemble
ensemble_instance.add_function(name="example_func", function=timestwo, device="GPU", devices_per_node=2, first_device=0)
In the above ensemble_instance.add_function code snippet, we offer the following arguments:
name ("example_func"): A name to uniquely identify the TorchScript within the Orchestrator.
function (timestwo): The TorchScript function defined in the Python driver script.
device (“GPU”): Specifying the device for TorchScript execution.
devices_per_node (2): Use two GPUs per node.
first_device (0): Start with 0 index GPU.
Warning
Calling exp.start(ensemble_instance) prior to instantiation of an Orchestrator will result in
a failed attempt to load the TorchScript to a non-existent Orchestrator.
When the Ensemble is started via Experiment.start, the TorchScript function will be loaded to the
standalone Orchestrator. The function can then be executed on the Orchestrator via a SmartSim
client (SmartRedis) within the application code.
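The following is a hedged sketch of that application-side execution with SmartRedis: the script is referenced by the key "example_func" and the function by its name, timestwo. The tensor keys and values are placeholders.
from smartredis import Client
import numpy as np

client = Client(cluster=False)

# Place an input tensor for the TorchScript function
client.put_tensor("script_input", np.array([1.0, 2.0, 3.0], dtype=np.float32))

# Run the function "timestwo" stored under the key "example_func"
client.run_script("example_func", "timestwo", inputs=["script_input"], outputs=["script_output"])
doubled = client.get_tensor("script_output")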
Attach a TorchScript From File#
Users can attach TorchScript functions from a file to an Ensemble and upload them to a
colocated or standalone Orchestrator. This functionality is supported by the Ensemble.add_script
function’s script_path parameter.
When specifying a TorchScript using Ensemble.add_script, the
following arguments are offered:
name (str): Reference name for the script inside of the Orchestrator.
script (t.Optional[str] = None): TorchScript code (only supported for non-colocated Orchestrators).
script_path (t.Optional[str] = None): path to TorchScript code.
device (t.Literal[“CPU”, “GPU”] = “CPU”): device for script execution, defaults to “CPU”.
devices_per_node (int = 1): The number of GPU devices available on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
first_device (int = 0): The first GPU device to use on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
Example: Loading a TorchScript From File#
This example walks through the steps of instructing SmartSim to load a TorchScript from file
to an Orchestrator. The source code example is available in the dropdown below for
convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="path/to/example_simulation_program")
# Initialize an Ensemble object
ensemble_instance = exp.create_ensemble("ensemble_name", run_settings=ensemble_settings)
# Attach TorchScript to Ensemble
ensemble_instance.add_script(name="example_script", script_path="path/to/torchscript.py", device="GPU", devices_per_node=2, first_device=0)
Note
This example assumes:
an Orchestrator is launched prior to Ensemble execution
an initialized Ensemble named ensemble_instance exists within the Experiment workflow
Define a TorchScript Script
For the example, we create the Python script torchscript.py. The file contains multiple simple torch functions, shown below:
def negate(x):
return torch.neg(x)
def random(x, y):
return torch.randn(x, y)
def pos(z):
return torch.positive(z)
Attach the TorchScript Script to a SmartSim Ensemble
Assuming an initialized Ensemble named ensemble_instance exists, we add a TorchScript script using
the Ensemble.add_script function and specify the script path via the parameter script_path:

# Attach TorchScript to Ensemble
ensemble_instance.add_script(name="example_script", script_path="path/to/torchscript.py", device="GPU", devices_per_node=2, first_device=0)

In the above ensemble_instance.add_script code snippet, we offer the following arguments:
name ("example_script"): Reference name for the script inside of the Orchestrator.
script_path ("path/to/torchscript.py"): Path to the script file.
device (“GPU”): device for script execution.
devices_per_node (2): Use two GPUs per node.
first_device (0): Start with 0 index GPU.
Warning
Calling exp.start(ensemble_instance) prior to instantiation of an Orchestrator will result in
a failed attempt to load the TorchScript to a non-existent Orchestrator.
When ensemble_instance is started via Experiment.start, the TorchScript will be loaded from file to the
Orchestrator that is launched prior to the start of ensemble_instance.
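Similarly, the hedged sketch below shows an application calling one of the functions defined in torchscript.py: the script is referenced by the key "example_script" and the function by its name, here negate. The tensor keys and values are placeholders.
from smartredis import Client
import numpy as np

client = Client(cluster=False)

# Input tensor for the "negate" function defined in torchscript.py
client.put_tensor("neg_input", np.array([1.0, -2.0, 3.0], dtype=np.float32))

# Run "negate" from the script stored under the key "example_script"
client.run_script("example_script", "negate", inputs=["neg_input"], outputs=["neg_output"])
negated = client.get_tensor("neg_output")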
Define TorchScripts as Raw String#
Users can upload TorchScript functions from string to send to a colocated or
standalone Orchestrator. This feature is supported by the
Ensemble.add_script function’s script parameter.
When specifying a TorchScript using Ensemble.add_script, the
following arguments are offered:
name (str): Reference name for the script inside of the Orchestrator.
script (t.Optional[str] = None): String of function code (e.g. TorchScript code string).
script_path (t.Optional[str] = None): path to TorchScript code.
device (t.Literal[“CPU”, “GPU”] = “CPU”): device for script execution, defaults to “CPU”.
devices_per_node (int = 1): The number of GPU devices available on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
first_device (int = 0): The first GPU device to use on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
Example: Load a TorchScript From String#
This example walks through the steps of instructing SmartSim to load a TorchScript function
from string to an Orchestrator before the execution of the associated Ensemble.
The source code example is available in the dropdown below for convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="path/to/executable/simulation")
# Initialize an Ensemble object
ensemble_instance = exp.create_ensemble("ensemble_name", run_settings=ensemble_settings)
# TorchScript string
torch_script_str = "def negate(x):\n\treturn torch.neg(x)\n"
# Attach TorchScript to Ensemble
ensemble_instance.add_script(name="example_script", script=torch_script_str, device="GPU", devices_per_node=2, first_device=0)
Note
This example assumes:
an Orchestrator is launched prior to Ensemble execution
an initialized Ensemble named ensemble_instance exists within the Experiment workflow
Define a String TorchScript
Define the TorchScript code as a variable in the Python driver script:
# TorchScript string
torch_script_str = "def negate(x):\n\treturn torch.neg(x)\n"
Attach the TorchScript Function to a SmartSim Ensemble
Assuming an initialized Ensemble named ensemble_instance exists, we add a TorchScript using
the Ensemble.add_script function and specify the variable torch_script_str to the parameter
script:
# Attach TorchScript to Ensemble
ensemble_instance.add_script(name="example_script", script=torch_script_str, device="GPU", devices_per_node=2, first_device=0)
In the above ensemble_instance.add_script code snippet, we offer the following arguments:
name (“example_script”): key to store script under.
script (torch_script_str): TorchScript code.
device (“GPU”): device for script execution.
devices_per_node (2): Use two GPUs per node.
first_device (0): Start with 0 index GPU.
Warning
Calling exp.start(ensemble_instance) prior to instantiation of an Orchestrator will result in
a failed attempt to load the TorchScript to a non-existent Orchestrator.
When the Ensemble is started via Experiment.start, the TorchScript will be loaded to the
Orchestrator that is launched prior to the start of the Ensemble.
Data Collision Prevention#
Overview#
When multiple Ensemble members use the same code to send and access their respective data
in the Orchestrator, key overlapping can occur, leading to inadvertent data access
between Ensemble members. To address this, SmartSim supports key prefixing
through Ensemble.enable_key_prefixing which enables key prefixing for all
Ensemble members. For example, during an Ensemble simulation with prefixing enabled, SmartSim will add
the Ensemble member name as a prefix to the keys sent to the Orchestrator.
Enabling key prefixing eliminates issues related to key overlapping, allowing Ensemble
members to use the same code without issue.
The key components of SmartSim Ensemble prefixing functionality include:
Sending Data to the Orchestrator: Users can send data to an Orchestrator with the Ensemble member name prepended to the data name by utilizing SmartSim Ensemble functions.
Retrieving Data From the Orchestrator: Users can instruct a Client to prepend an Ensemble member name to a key during data retrieval, polling, or checks for existence on the Orchestrator through SmartRedis Client functions. However, the entity interaction must be registered using Ensemble or Model functions.
See also
For information on prefixing Client functions, visit the Client functions page of the Model
documentation.
For example, assume you have an Ensemble that was initialized using the replicas creation strategy.
Two identical Models, named ensemble_0 and ensemble_1, were created that use the same executable application
within an Ensemble named ensemble. In the application code you use the function Client.put_tensor("tensor_0", data).
Without key prefixing enabled, the slower member will overwrite the data from the faster simulation.
With Ensemble key prefixing turned on, ensemble_0 and ensemble_1 can access
their tensor “tensor_0” by name without overwriting or accessing the other Model’s “tensor_0” tensor.
In this scenario, the two tensors placed in the Orchestrator are named ensemble_0.tensor_0 and ensemble_1.tensor_0.
Ensemble Functions#
An Ensemble object supports two prefixing functions: Ensemble.enable_key_prefixing and
Ensemble.register_incoming_entity. For more information on each function, reference the
Ensemble API docs.
To enable prefixing on an Ensemble, users must use the Ensemble.enable_key_prefixing
function in the Experiment driver script. This function activates prefixing for tensors,
Datasets, and lists sent to an Orchestrator for all Ensemble members. This function
also enables access to prefixing Client functions within the Ensemble members, with the exception of
the Client.set_data_source function, for which enable_key_prefixing is not required.
Note
ML model and script prefixing is not automatically enabled through Ensemble.enable_key_prefixing.
Prefixing must be enabled within the Ensemble by calling the use_model_ensemble_prefix method
on the Client embedded within the member application.
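Below is a minimal sketch of what that looks like inside an Ensemble member's application code, assuming key prefixing was enabled on the Ensemble in the driver script:
from smartredis import Client

# Initialize a Client inside the member application
client = Client(cluster=False)

# Also prefix ML model and TorchScript keys with this member's name
client.use_model_ensemble_prefix(True)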
Users can enable the SmartRedis Client to interact with prefixed data, ML models and TorchScripts
using Client.set_data_source. However, for SmartSim to recognize the producer entity name
passed to the function within an application, the producer entity must be registered on the consumer
entity using Ensemble.register_incoming_entity.
If a consumer Ensemble member requests data sent to the Orchestrator by other Ensemble members, the producer members must be
registered on the consumer member. To access Ensemble members, SmartSim offers the Ensemble.models attribute, which returns
a list of Ensemble members. Below we demonstrate registering producer members on a consumer member:
# Names of the producer Ensemble members
list_of_ensemble_names = ["producer_0", "producer_1", "producer_2"]

# Ensemble.models returns a list of members, so build a name-to-member lookup
members_by_name = {member.name: member for member in ensemble.models}

# Grab the consumer Ensemble member
ensemble_member = members_by_name["producer_3"]

# Register the producer members on the consumer member
for name in list_of_ensemble_names:
    ensemble_member.register_incoming_entity(members_by_name[name])
For examples demonstrating how to retrieve data within the entity application that produced
the data, visit the Model Copy/Rename/Delete Operations subsection.
Example: Ensemble Key Prefixing#
In this example, we create an Ensemble comprised of two Models that use identical code
to send data to a standalone Orchestrator. To prevent key collisions and ensure data
integrity, we enable key prefixing on the Ensemble, which automatically
prepends the Ensemble member name to the data sent to the Orchestrator. After the
Ensemble completes, we launch a consumer Model within the Experiment driver script
to demonstrate accessing prefixed data sent to the Orchestrator by Ensemble members.
This example consists of three Python scripts:
Application Producer Script: This script is encapsulated in a SmartSim Ensemble within the Experiment driver script. Prefixing is enabled on the Ensemble. The producer script puts NumPy tensors on an Orchestrator launched in the Experiment driver script. The Ensemble creates two identical Ensemble members. The producer script is executed in both Ensemble members to send two prefixed tensors to the Orchestrator. The source code example is available in the dropdown below for convenient customization.
Application Producer Script Source Code
from smartredis import Client
import numpy as np
# Initialize a Client
client = Client(cluster=False)
# Create NumPy array
array = np.array([1, 2, 3, 4])
# Use SmartRedis Client to place tensor in standalone Orchestrator
client.put_tensor("tensor", array)
Application Consumer Script: This script is encapsulated within a SmartSim Model in the Experiment driver script. The script requests the prefixed tensors placed by the producer script. The source code example is available in the dropdown below for convenient customization.
Application Consumer Script Source Code
from smartredis import Client, LLInfo
# Initialize a Client
client = Client(cluster=False)
# Set the data source
client.set_data_source("producer_0")
# Check if the tensor exists
tensor_1 = client.poll_tensor("tensor", 100, 100)
# Set the data source
client.set_data_source("producer_1")
# Check if the tensor exists
tensor_2 = client.poll_tensor("tensor", 100, 100)
client.log_data(LLInfo, f"producer_0.tensor was found: {tensor_1}")
client.log_data(LLInfo, f"producer_1.tensor was found: {tensor_2}")
Experiment Driver Script: The driver script launches the Orchestrator, the Ensemble (which sends prefixed keys to the Orchestrator), and the Model (which requests prefixed keys from the Orchestrator). The Experiment driver script is the centralized spot that controls the workflow. The source code example is available in the dropdown below for convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
from smartsim.log import get_logger
logger = get_logger("Experiment Log")
# Initialize the Experiment
exp = Experiment("getting-started", launcher="auto")
# Initialize a standalone Orchestrator
standalone_orch = exp.create_database(db_nodes=1)
# Initialize a RunSettings object for Ensemble
ensemble_settings = exp.create_run_settings(exe="/path/to/executable_producer_simulation")
# Initialize Ensemble
producer_ensemble = exp.create_ensemble("producer", run_settings=ensemble_settings, replicas=2)
# Enable key prefixing for Ensemble members
producer_ensemble.enable_key_prefixing()
# Initialize a RunSettings object for Model
model_settings = exp.create_run_settings(exe="/path/to/executable_consumer_simulation")
# Initialize Model
consumer_model = exp.create_model("consumer", model_settings)
# Generate SmartSim entity folder tree
exp.generate(standalone_orch, producer_ensemble, consumer_model, overwrite=True)
# Launch Orchestrator
exp.start(standalone_orch, summary=True)
# Launch Ensemble
exp.start(producer_ensemble, block=True, summary=True)
# Register Ensemble members on consumer Model
for model in producer_ensemble:
consumer_model.register_incoming_entity(model)
# Launch consumer Model
exp.start(consumer_model, block=True, summary=True)
# Clobber Orchestrator
exp.stop(standalone_orch)
The Application Producer Script#
In the Experiment driver script, we instruct SmartSim to create an Ensemble comprised of
two duplicate members that execute this producer script. In the producer script, a SmartRedis Client sends a
tensor to the Orchestrator. Since the Ensemble members are identical and therefore use the same
application code, two tensors are sent to the Orchestrator. Without prefixing enabled on the Ensemble,
the keys would collide and one tensor would overwrite the other. To prevent this, we enable key prefixing on the Ensemble in the driver script
via Ensemble.enable_key_prefixing. When the producer script is executed by each Ensemble member, a
tensor is sent to the Orchestrator with the Ensemble member name prepended to the tensor name.
Here we provide the producer script that is applied to the Ensemble members:
from smartredis import Client
import numpy as np

# Initialize a Client
client = Client(cluster=False)

# Create NumPy array
array = np.array([1, 2, 3, 4])
# Use SmartRedis Client to place tensor in standalone Orchestrator
client.put_tensor("tensor", array)
After the completion of Ensemble members producer_0 and producer_1, the contents of the Orchestrator are:
1) "producer_0.tensor"
2) "producer_1.tensor"
The Application Consumer Script#
In the Experiment driver script, we initialize a consumer Model that encapsulates
the consumer application to request the tensors produced from the Ensemble. To do
so, we use SmartRedis key prefixing functionality to instruct the SmartRedis Client
to append the name of an Ensemble member to the key name.
See also
For more information on Client prefixing functions, visit the Client functions
subsection of the Model documentation.
To begin, specify the imports and initialize a SmartRedis Client:
from smartredis import Client, LLInfo

# Initialize a Client
client = Client(cluster=False)
To retrieve the tensor from the first Ensemble member, named producer_0, use
Client.set_data_source. Specify the name of the first Ensemble member
as an argument to the function. This instructs SmartSim to prepend the Ensemble member name to the key
when searching for data on the Orchestrator. When Client.poll_tensor is executed,
the SmartRedis client will poll for the key producer_0.tensor:
# Set the data source
client.set_data_source("producer_0")
# Check if the tensor exists
tensor_1 = client.poll_tensor("tensor", 100, 100)
Follow the same steps as above, but change the data source name to the name
of the second Ensemble member (producer_1):
# Set the data source
client.set_data_source("producer_1")
# Check if the tensor exists
tensor_2 = client.poll_tensor("tensor", 100, 100)
We log the boolean return values to verify that the tensors were found:
client.log_data(LLInfo, f"producer_0.tensor was found: {tensor_1}")
client.log_data(LLInfo, f"producer_1.tensor was found: {tensor_2}")
When the Experiment driver script is executed, the following output will appear in consumer.out:
Default@11-46-05:producer_0.tensor was found: True
Default@11-46-05:producer_1.tensor was found: True
Warning
For SmartSim to recognize the Ensemble member names as a valid data source
to Client.set_data_source, you must register each Ensemble member
on the consumer Model in the driver script via Model.register_incoming_entity.
We demonstrate this in the Experiment driver script section of the example.
The Experiment Script#
The Experiment driver script manages all workflow components and utilizes the producer and consumer
application scripts. In the example, the Experiment:
launches a standalone Orchestrator
launches an Ensemble via the replicas initialization strategy
launches a consumer Model
clobbers the Orchestrator
To begin, add the necessary imports, initialize an Experiment instance and initialize the
standalone Orchestrator:
from smartsim import Experiment
from smartsim.log import get_logger

logger = get_logger("Experiment Log")
# Initialize the Experiment
exp = Experiment("getting-started", launcher="auto")

# Initialize a standalone Orchestrator
standalone_orch = exp.create_database(db_nodes=1)
We are now set up to discuss key prefixing within the Experiment driver script.
To create an Ensemble using the replicas strategy, begin by initializing a RunSettings
object to apply to all Ensemble members. Specify the path to the application
producer script:
# Initialize a RunSettings object for Ensemble
ensemble_settings = exp.create_run_settings(exe="/path/to/executable_producer_simulation")
Next, initialize an Ensemble by specifying ensemble_settings and the number of Model replicas to create:
# Initialize Ensemble
producer_ensemble = exp.create_ensemble("producer", run_settings=ensemble_settings, replicas=2)
Instruct SmartSim to prefix all tensors sent to the Orchestrator from the Ensemble via Ensemble.enable_key_prefixing:
# Enable key prefixing for Ensemble members
producer_ensemble.enable_key_prefixing()
Next, initialize the consumer Model. The consumer Model application requests
the prefixed tensors produced by the Ensemble:
# Initialize a RunSettings object for Model
model_settings = exp.create_run_settings(exe="/path/to/executable_consumer_simulation")
# Initialize Model
consumer_model = exp.create_model("consumer", model_settings)
Next, organize the SmartSim entity output files into a single Experiment folder:
# Generate SmartSim entity folder tree
exp.generate(standalone_orch, producer_ensemble, consumer_model, overwrite=True)
Launch the Orchestrator:
# Launch Orchestrator
exp.start(standalone_orch, summary=True)
Launch the Ensemble:
# Launch Ensemble
exp.start(producer_ensemble, block=True, summary=True)
Set block=True so that Experiment.start waits until the last Ensemble member has finished before continuing.
The consumer Model application script uses Client.set_data_source which
accepts the Ensemble member names when searching for prefixed
keys in the Orchestrator. In order for SmartSim to recognize the Ensemble
member names as a valid data source in the consumer Model, we must register
the entity interaction:
# Register Ensemble members on consumer Model
for model in producer_ensemble:
    consumer_model.register_incoming_entity(model)
Launch the consumer Model:
# Launch consumer Model
exp.start(consumer_model, block=True, summary=True)
To finish, tear down the standalone Orchestrator:
# Clobber Orchestrator
exp.stop(standalone_orch)