Ensemble#
Overview#
A SmartSim Ensemble enables users to run a group of computational tasks together in an Experiment workflow. An Ensemble is comprised of multiple Model objects, where each Ensemble member (a SmartSim Model) represents an individual application. An Ensemble can be managed as a single entity and launched alongside other Models and Orchestrators to construct AI-enabled workflows.
The Ensemble API offers key features, including methods to:
- Attach Configuration Files for use at Ensemble runtime.
- Load AI Models (TF, TF-lite, PT, or ONNX) into the Orchestrator at Ensemble runtime.
- Load TorchScripts into the Orchestrator at Ensemble runtime.
- Prevent Data Collisions within the Ensemble, which allows for reuse of application code.
To create a SmartSim Ensemble, use the Experiment.create_ensemble API function. When initializing an Ensemble, consider one of the three creation strategies explained in the Initialization section.
SmartSim manages Ensemble instances through the Experiment API, which provides functions to launch, monitor, and stop applications.
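For orientation, here is a minimal sketch of that lifecycle. The executable path and entity names are placeholders, and the status/stop calls are shown only to illustrate the Experiment-level controls:

from smartsim import Experiment

# Initialize an Experiment (auto-detect the launcher)
exp = Experiment("ensemble-overview", launcher="auto")

# RunSettings shared by every Ensemble member (placeholder executable)
rs = exp.create_run_settings(exe="path/to/example_simulation_program")

# Create an Ensemble of two identical members via the replicas strategy
ensemble = exp.create_ensemble("overview-ensemble", run_settings=rs, replicas=2)

# Launch, monitor, and stop the Ensemble through the Experiment API
exp.start(ensemble, block=False)
print(exp.get_status(ensemble))
exp.stop(ensemble)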
Initialization#
Overview#
The Experiment API is responsible for initializing all workflow entities.
An Ensemble is created using the Experiment.create_ensemble factory method, and users can customize Ensemble creation via the factory method parameters. The factory method arguments for Ensemble creation can be found in the Experiment API under the create_ensemble docstring.
By using specific combinations of the factory method arguments, users can tailor the creation of an Ensemble to align with one of the following creation strategies:
- Parameter Expansion: Generate a variable-sized set of unique simulation instances configured with user-defined input parameters.
- Replica Creation: Generate a specified number of Model replicas.
- Manually: Attach pre-configured Models to an Ensemble to manage as a single unit.
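As a quick orientation, the hedged sketch below shows one create_ensemble call per strategy. It assumes an Experiment named exp plus RunSettings (rs) and BatchSettings (bs) objects already exist, and all entity names are placeholders:

# Parameter expansion: params plus a permutation strategy
param_ensemble = exp.create_ensemble("param-ensemble", run_settings=rs,
                                     params={"STEPS": [10, 20]}, perm_strategy="all_perm")

# Replica creation: a fixed number of identical members
replica_ensemble = exp.create_ensemble("replica-ensemble", run_settings=rs, replicas=4)

# Manual: an empty Ensemble (batch settings only) that Models are appended to
manual_ensemble = exp.create_ensemble("manual-ensemble", batch_settings=bs)
manual_ensemble.add_model(exp.create_model(name="member", run_settings=rs))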
Parameter Expansion#
Parameter expansion is a technique that allows users to set parameter values per Ensemble member. This is done by specifying input to the params and perm_strategy factory method arguments during Ensemble creation (Experiment.create_ensemble). Users may control how the params values are applied to the Ensemble through the perm_strategy argument. The perm_strategy argument accepts the three values listed below.
Parameter Expansion Strategy Options:
- "all_perm": Generate all possible parameter permutations for an exhaustive exploration. This means that every possible combination of parameters will be used in the Ensemble.
- "step": Create parameter sets by collecting identically indexed values across the parameter lists. This allows for discrete combinations of parameters across Models.
- "random": Enable random selection from predefined parameter spaces, offering a stochastic approach. This means that the parameters will be chosen randomly for each Model, which can be useful for exploring a wide range of possibilities.
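The "random" strategy is not covered by the examples that follow, so here is a minimal sketch of its use. It assumes an Experiment exp and RunSettings rs already exist; the n_models keyword, which caps how many randomly drawn members are generated, is an assumption about how the strategy is configured rather than something shown elsewhere on this page:

# Randomly sample two parameter combinations from the parameter space
params = {"name": ["Ellie", "John"], "parameter": [2, 11]}
random_ensemble = exp.create_ensemble("random-ensemble", run_settings=rs,
                                      params=params, perm_strategy="random", n_models=2)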
Examples#
This subsection contains two examples of Ensemble parameter expansion. The first example illustrates parameter expansion using two parameters, while the second example demonstrates parameter expansion with two parameters along with the launch of the Ensemble as a batch workload.
Example 1: Parameter Expansion Using all_perm Strategy
In this example, an Ensemble of four Model entities is created by expanding two parameters using the all_perm strategy. All of the Models in the Ensemble share the same RunSettings and only differ in the value of the params assigned to each member. The source code example is available in the dropdown below for convenient execution and customization.
Example Driver Script Source Code
from smartsim import Experiment

# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")

# Initialize a RunSettings
rs = exp.create_run_settings(exe="path/to/example_simulation_program")

# Create the parameters to expand to the Ensemble members
params = {
    "name": ["Ellie", "John"],
    "parameter": [2, 11]
}

# Initialize the Ensemble by specifying RunSettings, the params and "all_perm"
ensemble = exp.create_ensemble("model_member", run_settings=rs, params=params, perm_strategy="all_perm")

Begin by initializing a RunSettings object to apply to all Ensemble members:

# Initialize a RunSettings
rs = exp.create_run_settings(exe="path/to/example_simulation_program")

Next, define the parameters that will be applied to the Ensemble:

# Create the parameters to expand to the Ensemble members
params = {
    "name": ["Ellie", "John"],
    "parameter": [2, 11]
}

Finally, initialize an Ensemble by specifying the RunSettings, params and perm_strategy="all_perm":

# Initialize the Ensemble by specifying RunSettings, the params and "all_perm"
ensemble = exp.create_ensemble("model_member", run_settings=rs, params=params, perm_strategy="all_perm")

By specifying perm_strategy="all_perm", all permutations of the params will be calculated and distributed across Ensemble members. Here there are four permutations of the params values:

ensemble member 1: ["Ellie", 2]
ensemble member 2: ["Ellie", 11]
ensemble member 3: ["John", 2]
ensemble member 4: ["John", 11]
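To confirm how the expanded values were assigned, a small sketch like the one below can be added to the driver script. It assumes that iterating an Ensemble yields its Model members and that each member exposes its assigned values through a params attribute:

# Inspect the parameter set assigned to each Ensemble member
for member in ensemble:
    print(member.name, member.params)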
Example 2: Parameter Expansion Using step Strategy with the Ensemble Configured for Batch Launching
In this example, an Ensemble of two Model entities is created by expanding two parameters using the step strategy. All of the Models in the Ensemble share the same RunSettings and only differ in the value of the params assigned to each member. Lastly, the Ensemble is submitted as a batch workload. The source code example is available in the dropdown below for convenient execution and customization.
Example Driver Script Source Code

from smartsim import Experiment

# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")

# Initialize a BatchSettings
bs = exp.create_batch_settings(nodes=2, time="10:00:00")

# Initialize and configure RunSettings
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")
rs.set_nodes(1)

# Create the parameters to expand to the Ensemble members
params = {
    "name": ["Ellie", "John"],
    "parameter": [2, 11]
}

# Initialize the Ensemble by specifying RunSettings, BatchSettings, the params and "step"
ensemble = exp.create_ensemble("ensemble", run_settings=rs, batch_settings=bs, params=params, perm_strategy="step")

Begin by initializing and configuring a BatchSettings object to run the Ensemble instance:

# Initialize a BatchSettings
bs = exp.create_batch_settings(nodes=2, time="10:00:00")

The above BatchSettings object will instruct SmartSim to run the Ensemble on two nodes with a timeout of 10 hours.
Next, initialize a RunSettings object to apply to all Ensemble members:

# Initialize and configure RunSettings
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")
rs.set_nodes(1)

Next, define the parameters to include in the Ensemble:

# Create the parameters to expand to the Ensemble members
params = {
    "name": ["Ellie", "John"],
    "parameter": [2, 11]
}

Finally, initialize an Ensemble by passing in the RunSettings, BatchSettings, params and perm_strategy="step":

# Initialize the Ensemble by specifying RunSettings, BatchSettings, the params and "step"
ensemble = exp.create_ensemble("ensemble", run_settings=rs, batch_settings=bs, params=params, perm_strategy="step")

When specifying perm_strategy="step", the params sets are created by collecting identically indexed values across the params value lists.

ensemble member 1: ["Ellie", 2]
ensemble member 2: ["John", 11]
Replicas#
A replica strategy involves the creation of identical Models within an Ensemble. This strategy is particularly useful for applications that have some inherent randomness. Users may use the replicas factory method argument to create a specified number of identical Model members during Ensemble creation (Experiment.create_ensemble).
Examples#
This subsection contains two examples of using the replicas creation strategy. The first example illustrates creating four Ensemble member clones, while the second example demonstrates creating four Ensemble member clones along with the launch of the Ensemble as a batch workload.
Example 1: Ensemble Creation with Replicas Strategy
In this example, an Ensemble of four identical Model members is created by specifying the number of clones to create via the replicas argument. All of the Models in the Ensemble share the same RunSettings. The source code example is available in the dropdown below for convenient execution and customization.
Example Driver Script Source Code

from smartsim import Experiment

# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")

# Initialize a RunSettings object
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")

# Initialize the Ensemble by specifying the number of replicas and RunSettings
ensemble = exp.create_ensemble("ensemble-replica", replicas=4, run_settings=rs)

To create an Ensemble of identical Models, begin by initializing a RunSettings object:

# Initialize a RunSettings object
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")

Initialize the Ensemble by specifying the RunSettings object and the number of clones via the replicas argument:

# Initialize the Ensemble by specifying the number of replicas and RunSettings
ensemble = exp.create_ensemble("ensemble-replica", replicas=4, run_settings=rs)

By passing in replicas=4, four identical Ensemble members will be initialized.
Example 2: Ensemble Creation with Replicas Strategy and Ensemble Batch Launching
In this example, an Ensemble of four Model entities is created by specifying the number of clones to create via the replicas argument. All of the Models in the Ensemble share the same RunSettings, and the Ensemble is submitted as a batch workload. The source code example is available in the dropdown below for convenient execution and customization.
Example Driver Script Source Code

from smartsim import Experiment

# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")

# Initialize a BatchSettings object
bs = exp.create_batch_settings(nodes=4, time="10:00:00")

# Initialize and configure a RunSettings object
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")
rs.set_nodes(4)

# Initialize an Ensemble
ensemble = exp.create_ensemble("ensemble-replica", replicas=4, run_settings=rs, batch_settings=bs)

To launch the Ensemble of identical Models as a batch job, begin by initializing a BatchSettings object:

# Initialize a BatchSettings object
bs = exp.create_batch_settings(nodes=4, time="10:00:00")

The above BatchSettings object will instruct SmartSim to run the Ensemble on four nodes with a timeout of 10 hours.
Next, create a RunSettings object to apply to all Model replicas:

# Initialize and configure a RunSettings object
rs = exp.create_run_settings(exe="python", exe_args="path/to/application_script.py")
rs.set_nodes(4)

Initialize the Ensemble by specifying the RunSettings object, BatchSettings object and the number of clones via the replicas argument:

# Initialize an Ensemble
ensemble = exp.create_ensemble("ensemble-replica", replicas=4, run_settings=rs, batch_settings=bs)

By passing in replicas=4, four identical Ensemble members will be initialized.
Manually Append#
Manually appending Models to an Ensemble offers an in-depth level of customization in Ensemble design. This approach is favorable when users have distinct requirements for individual Models, such as variations in parameters, run settings, or types of simulations.
Examples#
This subsection contains an example of creating an Ensemble by manually appending Models. The example illustrates attaching two SmartSim Models to the Ensemble. The Ensemble is submitted as a batch workload.
Example 1: Append Models to an Ensemble and Launch as a Batch Job
In this example, we append Models to an Ensemble for batch job execution. To do this, we first initialize an Ensemble with a BatchSettings object. Then, we manually create Models and add each to the Ensemble using the Ensemble.add_model function. The source code example is available in the dropdown below for convenient execution and customization.
Example Driver Script Source Code

from smartsim import Experiment

# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")

# Initialize BatchSettings
bs = exp.create_batch_settings(nodes=10, time="01:00:00")

# Initialize Ensemble
ensemble = exp.create_ensemble("ensemble-append", batch_settings=bs)

# Initialize RunSettings for Model 1
srun_settings_1 = exp.create_run_settings(exe="python", exe_args="path/to/application_script_1.py")
# Initialize RunSettings for Model 2
srun_settings_2 = exp.create_run_settings(exe="python", exe_args="path/to/application_script_2.py")

# Initialize Model 1 with RunSettings 1
model_1 = exp.create_model(name="model_1", run_settings=srun_settings_1)
# Initialize Model 2 with RunSettings 2
model_2 = exp.create_model(name="model_2", run_settings=srun_settings_2)

# Add Model member to Ensemble
ensemble.add_model(model_1)
# Add Model member to Ensemble
ensemble.add_model(model_2)

To create an empty Ensemble to append Models to, initialize the Ensemble with a batch settings object:

# Initialize BatchSettings
bs = exp.create_batch_settings(nodes=10, time="01:00:00")

# Initialize Ensemble
ensemble = exp.create_ensemble("ensemble-append", batch_settings=bs)

Next, create the Models to append to the Ensemble:

# Initialize RunSettings for Model 1
srun_settings_1 = exp.create_run_settings(exe="python", exe_args="path/to/application_script_1.py")
# Initialize RunSettings for Model 2
srun_settings_2 = exp.create_run_settings(exe="python", exe_args="path/to/application_script_2.py")
# Initialize Model 1 with RunSettings 1
model_1 = exp.create_model(name="model_1", run_settings=srun_settings_1)
# Initialize Model 2 with RunSettings 2
model_2 = exp.create_model(name="model_2", run_settings=srun_settings_2)

Finally, append the Model objects to the Ensemble:

# Add Model member to Ensemble
ensemble.add_model(model_1)
# Add Model member to Ensemble
ensemble.add_model(model_2)

The new Ensemble is comprised of two appended Model members.
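The example stops short of submitting the batch job. A minimal continuation of the driver script above, using the same Experiment API calls shown elsewhere on this page, would be:

# Generate the Ensemble directory tree and submit the batch workload
exp.generate(ensemble)
exp.start(ensemble, block=True, summary=True)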
Files#
Overview#
Ensemble members often depend on external files (e.g. training datasets, evaluation datasets, etc.) to operate as intended. Users can instruct SmartSim to copy, symlink, or manipulate external files prior to an Ensemble launch via the Ensemble.attach_generator_files function. Attached files will be applied to all Ensemble members.
Note
Multiple calls to Ensemble.attach_generator_files will overwrite previous file configurations on the Ensemble.
To attach a file to an Ensemble for use at runtime, provide one of the following arguments to the Ensemble.attach_generator_files function:
- to_copy (t.Optional[t.List[str]] = None): Files that are copied into the path of the Ensemble members.
- to_symlink (t.Optional[t.List[str]] = None): Files that are symlinked into the path of the Ensemble members. A symlink, or symbolic link, is a file that points to another file or directory, allowing you to access that file as if it were located in the same directory as the symlink.
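Only to_configure is demonstrated in the Example below, so here is a minimal sketch of to_copy and to_symlink; the file and directory paths are placeholders, and an Ensemble named ensemble is assumed to exist:

# Copy a small input deck into every member directory and symlink a large
# dataset instead of duplicating it
ensemble.attach_generator_files(
    to_copy=["path/to/input_deck.in"],
    to_symlink=["path/to/large_dataset_dir"])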
To specify a template file in order to programmatically replace specified parameters during generation of Ensemble member directories, pass the following value to the Ensemble.attach_generator_files function:
- to_configure (t.Optional[t.List[str]] = None): This parameter is designed for text-based Ensemble member input files. During directory generation for Ensemble members, the attached files are parsed and their tagged parameters are replaced with the params values applied to each Ensemble member. The Ensemble creation strategy is considered when replacing the tagged parameters in the input files. These tagged parameters are placeholders in the text that are replaced with the actual parameter values during the directory generation process. The default tag is a semicolon (e.g., THERMO = ;THERMO;).
In the Example subsection, we provide an example using the to_configure parameter of Ensemble.attach_generator_files.
See also
To add a file to a single Model that will be appended to an Ensemble, refer to the Files section of the Model documentation.
Example#
This example demonstrates how to attach a text file to an Ensemble for parameter replacement. This is accomplished using the params function parameter in the Experiment.create_ensemble factory function and the to_configure function parameter in Ensemble.attach_generator_files. The source code example is available in the dropdown below for convenient execution and customization.
Example Driver Script Source Code
from smartsim import Experiment
# Initialize the Experiment
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="python", exe_args="/path/to/application.py")
# Initialize an Ensemble object via replicas strategy
example_ensemble = exp.create_ensemble("ensemble", ensemble_settings, replicas=2, params={"THERMO":1})
# Attach the file to the Ensemble instance
example_ensemble.attach_generator_files(to_configure="path/to/params_inputs.txt")
# Generate the Ensemble directory
exp.generate(example_ensemble)
# Launch the Ensemble
exp.start(example_ensemble)
In this example, we have a text file named params_inputs.txt. Within the text is the parameter THERMO that is required by each Ensemble member at runtime:
THERMO = ;THERMO;
In order to have the tagged parameter ;THERMO; replaced with a usable value at runtime, two steps are required:
- The THERMO variable must be included in the Experiment.create_ensemble factory method as part of the params parameter.
- The file containing the tagged parameter ;THERMO;, params_inputs.txt, must be attached to the Ensemble via the Ensemble.attach_generator_files method as part of the to_configure parameter.
To encapsulate our application within an Ensemble, we must create an Experiment instance to gain access to the Experiment factory method that creates the Ensemble. Begin by importing the Experiment module and initializing an Experiment:
from smartsim import Experiment

# Initialize the Experiment
exp = Experiment("getting-started", launcher="auto")
To create our Ensemble, we use the replicas initialization strategy. Begin by creating a simple RunSettings object that specifies the path to the simulation executable:
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="python", exe_args="/path/to/application.py")
Next, initialize an Ensemble object with Experiment.create_ensemble by passing in ensemble_settings, params={"THERMO":1} and replicas=2:
# Initialize an Ensemble object via replicas strategy
example_ensemble = exp.create_ensemble("ensemble", ensemble_settings, replicas=2, params={"THERMO":1})
We now have an Ensemble instance named example_ensemble. Attach the above text file to the Ensemble for use at entity runtime. To do so, we use the Ensemble.attach_generator_files function and specify the to_configure parameter with the path to the text file, params_inputs.txt:
# Attach the file to the Ensemble instance
example_ensemble.attach_generator_files(to_configure="path/to/params_inputs.txt")
To create an isolated directory for the Ensemble member outputs and configuration files, invoke Experiment.generate via the Experiment instance exp with example_ensemble as an input parameter:
# Generate the Ensemble directory
exp.generate(example_ensemble)
After invoking Experiment.generate, the attached generator files will be available for the application when exp.start(example_ensemble) is called.
# Launch the Ensemble
exp.start(example_ensemble)
The contents of params_inputs.txt after Ensemble completion are:
THERMO = 1
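Each member receives its own configured copy of the file in its generated directory, so the member application can simply open it by name. A minimal sketch of what the application side might do (the file name comes from this example; the parsing shown is purely illustrative):

# Inside an Ensemble member application: read the configured input file
with open("params_inputs.txt") as input_file:
    for line in input_file:
        print(line.strip())   # e.g. "THERMO = 1"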
ML Models and Scripts#
Overview#
SmartSim users have the capability to load ML models and TorchScripts into an Orchestrator within the Experiment script for use within Ensemble members. Functions accessible through an Ensemble object support loading ML models (TensorFlow, TensorFlow-lite, PyTorch, and ONNX) and TorchScripts into standalone or colocated Orchestrators before application runtime.
See also
To add an ML model or TorchScript to a single Model that will be appended to an Ensemble, refer to the ML Models and Scripts section of the Model documentation.
Depending on the planned storage method of the ML model, there are two distinct approaches to load it into the Orchestrator:
- From memory: provide a serialized, in-memory ML model to Ensemble.add_ml_model.
- From file: provide the path to a saved ML model file to Ensemble.add_ml_model.
Warning
Uploading an ML model from memory is solely supported for standalone Orchestrators. To upload an ML model to a colocated Orchestrator, users must save the ML model to disk and upload from file.
Depending on the planned storage method of the TorchScript, there are three distinct approaches to load it into the Orchestrator:
- From memory: provide a TorchScript function defined in the driver script to Ensemble.add_function.
- From file: provide the path to a TorchScript file to Ensemble.add_script.
- From string: provide the TorchScript code as a raw string to Ensemble.add_script.
Warning
Uploading a TorchScript from memory is solely supported for standalone Orchestrators. To upload a TorchScript to a colocated Orchestrator, users must upload from file or from string.
Once an ML model or TorchScript is loaded into the Orchestrator, Ensemble members can leverage ML capabilities by utilizing the SmartSim client (SmartRedis) to execute the stored ML models or TorchScripts.
AI Models#
When configuring an Ensemble, users can instruct SmartSim to load Machine Learning (ML) models dynamically to the Orchestrator (colocated or standalone). ML models added are loaded into the Orchestrator prior to the execution of the Ensemble. To load an ML model to the Orchestrator, SmartSim users can serialize and provide the ML model in-memory or specify the file path via the Ensemble.add_ml_model function. The supported ML frameworks are TensorFlow, TensorFlow-lite, PyTorch, and ONNX.
Users must serialize TensorFlow ML models before sending them to an Orchestrator from memory or from file. To save a TensorFlow model to memory, SmartSim offers the serialize_model function. This function returns the TF model as a byte string along with the names of the input and output layers, which will be required upon uploading. To save a TF model to disk, SmartSim offers the freeze_model function, which returns the path to the serialized TF model file along with the names of the input and output layers. Additional TF model serialization information and examples can be found in the ML Features section of SmartSim.
Note
Uploading an ML model from memory is only supported for standalone Orchestrators.
When attaching an ML model using Ensemble.add_ml_model, the following arguments are offered to customize storage and execution:
- name (str): name to reference the ML model in the Orchestrator.
- backend (str): name of the backend (TORCH, TF, TFLITE, ONNX).
- model (t.Optional[str] = None): an ML model in memory (only supported for non-colocated Orchestrators).
- model_path (t.Optional[str] = None): path to a serialized ML model file.
- device (t.Literal["CPU", "GPU"] = "CPU"): name of device for execution, defaults to "CPU".
- devices_per_node (int = 1): the number of GPU devices available on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
- first_device (int = 0): the first GPU device to use on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
- batch_size (int = 0): batch size for execution, defaults to 0.
- min_batch_size (int = 0): minimum batch size for ML model execution, defaults to 0.
- min_batch_timeout (int = 0): time to wait for minimum batch size, defaults to 0.
- tag (str = ""): additional tag for ML model information, defaults to "".
- inputs (t.Optional[t.List[str]] = None): ML model inputs (TF only), defaults to None.
- outputs (t.Optional[t.List[str]] = None): ML model outputs (TF only), defaults to None.
See also
To add an ML model to a single Model that will be appended to an Ensemble, refer to the AI Models section of the Model documentation.
Example: Attach an In-Memory ML Model#
This example demonstrates how to attach an in-memory ML model to a SmartSim Ensemble to load into an Orchestrator at Ensemble runtime. The source code example is available in the dropdown below for convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, Input
class Net(keras.Model):
    def __init__(self):
        super(Net, self).__init__(name="cnn")
        self.conv = Conv2D(1, 3, 1)

    def call(self, x):
        y = self.conv(x)
        return y

def create_tf_cnn():
    """Create an in-memory Keras CNN for example purposes
    """
    from smartsim.ml.tf import serialize_model
    n = Net()
    input_shape = (3,3,1)
    inputs = Input(input_shape)
    outputs = n(inputs)
    model = keras.Model(inputs=inputs, outputs=outputs, name=n.name)

    return serialize_model(model)
# Serialize and save TF model
model, inputs, outputs = create_tf_cnn()
# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="path/to/example_simulation_program")
# Initialize an Ensemble object
ensemble_instance = exp.create_ensemble("ensemble_name", ensemble_settings)
# Attach the in-memory ML model to the SmartSim Ensemble
ensemble_instance.add_ml_model(name="cnn", backend="TF", model=model, device="GPU", devices_per_node=2, first_device=0, inputs=inputs, outputs=outputs)
Note
This example assumes:
- an Orchestrator is launched prior to the Ensemble execution
- an initialized Ensemble named ensemble_instance exists within the Experiment workflow
- a TensorFlow-based ML model was serialized using serialize_model, which returns the ML model as a byte string with the names of the input and output layers
Attach the ML Model to a SmartSim Ensemble
In this example, we have a serialized TensorFlow-based ML model that was saved to a byte string stored under model. Additionally, the serialize_model function returned the names of the input and output layers stored under inputs and outputs. Assuming an initialized Ensemble named ensemble_instance exists, we add the byte string TensorFlow model using Ensemble.add_ml_model:
# Attach the in-memory ML model to the SmartSim Ensemble
ensemble_instance.add_ml_model(name="cnn", backend="TF", model=model, device="GPU", devices_per_node=2, first_device=0, inputs=inputs, outputs=outputs)
In the above ensemble_instance.add_ml_model code snippet, we offer the following arguments:
- name ("cnn"): A name to reference the ML model in the Orchestrator.
- backend ("TF"): Indicating that the ML model is a TensorFlow model.
- model (model): The in-memory representation of the TensorFlow model.
- device ("GPU"): Specifying the device for ML model execution.
- devices_per_node (2): Use two GPUs per node.
- first_device (0): Start with the GPU at index 0.
- inputs (inputs): The names of the ML model input nodes (TensorFlow only).
- outputs (outputs): The names of the ML model output nodes (TensorFlow only).
Warning
Calling exp.start(ensemble_instance) prior to the launch of an Orchestrator will result in a failed attempt to load the ML model to a non-existent standalone Orchestrator.
When the Ensemble is started via Experiment.start, the ML model will be loaded to the launched standalone Orchestrator. The ML model can then be executed on the Orchestrator via a SmartSim client (SmartRedis) within the application code.
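A hedged sketch of that application-side step is shown below. It runs the "cnn" model loaded by the driver script above; the tensor keys and the input shape are illustrative assumptions, not values defined by this example:

from smartredis import Client
import numpy as np

# Connect to the standalone Orchestrator from inside an Ensemble member
client = Client(cluster=False)

# Stage an input tensor, run the stored "cnn" model, and fetch the output
client.put_tensor("cnn_input", np.random.rand(1, 3, 3, 1).astype(np.float32))
client.run_model("cnn", inputs=["cnn_input"], outputs=["cnn_output"])
prediction = client.get_tensor("cnn_output")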
Example: Attach an ML Model From File#
This example demonstrates how to attach an ML model from file to a SmartSim Ensemble to load into an Orchestrator at Ensemble runtime. The source code example is available in the dropdown below for convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, Input
class Net(keras.Model):
    def __init__(self):
        super(Net, self).__init__(name="cnn")
        self.conv = Conv2D(1, 3, 1)

    def call(self, x):
        y = self.conv(x)
        return y

def save_tf_cnn(path, file_name):
    """Create a Keras CNN and save to file for example purposes"""
    from smartsim.ml.tf import freeze_model
    n = Net()
    input_shape = (3, 3, 1)
    n.build(input_shape=(None, *input_shape))
    inputs = Input(input_shape)
    outputs = n(inputs)
    model = keras.Model(inputs=inputs, outputs=outputs, name=n.name)

    return freeze_model(model, path, file_name)
# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="path/to/example_simulation_program")
# Initialize an Ensemble object
ensemble_instance = exp.create_ensemble("ensemble_name", ensemble_settings)
# Serialize and save TF model to file
model_file, inputs, outputs = save_tf_cnn(ensemble_instance.path, "model.pb")
# Attach ML model file to Ensemble
ensemble_instance.add_ml_model(name="cnn", backend="TF", model_path=model_file, device="GPU", devices_per_node=2, first_device=0, inputs=inputs, outputs=outputs)
Note
This example assumes:
- a standalone Orchestrator is launched prior to Ensemble execution
- an initialized Ensemble named ensemble_instance exists within the Experiment workflow
- a TensorFlow-based ML model was serialized using freeze_model, which returns the path to the serialized model file and the names of the input and output layers
Attach the ML Model to a SmartSim Ensemble
In this example, we have a serialized TensorFlow-based ML model that was saved to disk, with its path stored under model_file. Additionally, the freeze_model function returned the names of the input and output layers stored under inputs and outputs. Assuming an initialized Ensemble named ensemble_instance exists, we add a TensorFlow model using the Ensemble.add_ml_model function and specify the ML model path via the parameter model_path:
# Attach ML model file to Ensemble
ensemble_instance.add_ml_model(name="cnn", backend="TF", model_path=model_file, device="GPU", devices_per_node=2, first_device=0, inputs=inputs, outputs=outputs)
In the above ensemble_instance.add_ml_model code snippet, we offer the following arguments:
- name ("cnn"): A name to reference the ML model in the Orchestrator.
- backend ("TF"): Indicating that the ML model is a TensorFlow model.
- model_path (model_file): The path to the saved ML model file.
- device ("GPU"): Specifying the device for ML model execution.
- devices_per_node (2): Use two GPUs per node.
- first_device (0): Start with the GPU at index 0.
- inputs (inputs): The names of the ML model input nodes (TensorFlow only).
- outputs (outputs): The names of the ML model output nodes (TensorFlow only).
Warning
Calling exp.start(ensemble_instance) prior to instantiation of an Orchestrator will result in a failed attempt to load the ML model to a non-existent Orchestrator.
When the Ensemble is started via Experiment.start, the ML model will be loaded to the launched Orchestrator. The ML model can then be executed on the Orchestrator via a SmartSim client (SmartRedis) within the application executable.
TorchScripts#
When configuring an Ensemble, users can instruct SmartSim to load TorchScripts dynamically to the Orchestrator. The TorchScripts become available for each Ensemble member upon being loaded into the Orchestrator prior to the execution of the Ensemble. SmartSim users may upload a single TorchScript function via Ensemble.add_function or alternatively upload a script containing multiple functions via Ensemble.add_script. To load a TorchScript to the Orchestrator, SmartSim users can follow one of the following processes:
- Define a TorchScript Function In-Memory
  Use Ensemble.add_function to instruct SmartSim to load an in-memory TorchScript to the Orchestrator.
- Define Multiple TorchScript Functions From File
  Provide a file path to Ensemble.add_script to instruct SmartSim to load the TorchScript from file to the Orchestrator.
- Define a TorchScript Function as String
  Provide a function string to Ensemble.add_script to instruct SmartSim to load a raw string as a TorchScript function to the Orchestrator.
Note
Uploading a TorchScript from memory using Ensemble.add_function is only supported for standalone Orchestrators. Users uploading TorchScripts to colocated Orchestrators should instead use the Ensemble.add_script function to upload from file or as a string.
Each function also provides flexible device selection, allowing users to choose the device the TorchScript is executed on, "GPU" or "CPU". In environments with multiple devices, the number of devices to use can be set with the devices_per_node parameter and the starting device with the first_device parameter.
Note
If device=GPU is specified when attaching a TorchScript function to an Ensemble, this instructs SmartSim to execute the TorchScript on GPU nodes. However, TorchScripts loaded to an Orchestrator are executed on the Orchestrator compute resources. Therefore, users must make sure that the device specified is included in the Orchestrator compute resources. For example, if a user specifies device=GPU but initializes the Orchestrator on CPU-only nodes, the TorchScript will not run on GPUs as intended.
Continue reading or select the respective process link to learn more about how each function (Ensemble.add_script and Ensemble.add_function) dynamically loads TorchScripts to the Orchestrator.
See also
To add a TorchScript to a single Model that will be appended to an Ensemble, refer to the TorchScripts section of the Model documentation.
Attach an In-Memory TorchScript#
Users can define TorchScript functions within the Experiment driver script to attach to an Ensemble. This feature is supported by Ensemble.add_function.
Warning
Ensemble.add_function does not support loading in-memory TorchScript functions to a colocated Orchestrator. If you would like to load a TorchScript function to a colocated Orchestrator, define the function as a raw string or load it from file.
When specifying an in-memory TorchScript function using Ensemble.add_function, the following arguments are offered:
- name (str): reference name for the script inside of the Orchestrator.
- function (t.Optional[str] = None): TorchScript function code.
- device (t.Literal["CPU", "GPU"] = "CPU"): device for script execution, defaults to "CPU".
- devices_per_node (int = 1): The number of GPU devices available on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
- first_device (int = 0): The first GPU device to use on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
Example: Load an In-Memory TorchScript Function#
This example walks through the steps of instructing SmartSim to load an in-memory TorchScript function to a standalone Orchestrator. The source code example is available in the dropdown below for convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
def timestwo(x):
    return 2*x
# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="path/to/example_simulation_program")
# Initialize an Ensemble object
ensemble_instance = exp.create_ensemble("ensemble_name", ensemble_settings)
# Attach TorchScript to Ensemble
ensemble_instance.add_function(name="example_func", function=timestwo, device="GPU", devices_per_node=2, first_device=0)
Note
The example assumes:
- a standalone Orchestrator is launched prior to Ensemble execution
- an initialized Ensemble named ensemble_instance exists within the Experiment workflow
Define an In-Memory TorchScript Function
To begin, define an in-memory TorchScript function within the Python driver script. For the purpose of the example, we add a simple TorchScript function, timestwo:
def timestwo(x):
    return 2*x
Attach the In-Memory TorchScript Function to a SmartSim Ensemble
We use the Ensemble.add_function function to instruct SmartSim to load the TorchScript function timestwo onto the launched standalone Orchestrator. Specify the function timestwo to the function parameter:
# Attach TorchScript to Ensemble
ensemble_instance.add_function(name="example_func", function=timestwo, device="GPU", devices_per_node=2, first_device=0)
In the above ensemble_instance.add_function code snippet, we offer the following arguments:
- name ("example_func"): A name to uniquely identify the TorchScript within the Orchestrator.
- function (timestwo): The TorchScript function defined in the Python driver script.
- device ("GPU"): Specifying the device for TorchScript execution.
- devices_per_node (2): Use two GPUs per node.
- first_device (0): Start with the GPU at index 0.
Warning
Calling exp.start(ensemble_instance) prior to instantiation of an Orchestrator will result in a failed attempt to load the TorchScript to a non-existent Orchestrator.
When the Ensemble is started via Experiment.start, the TorchScript function will be loaded to the standalone Orchestrator. The function can then be executed on the Orchestrator via a SmartSim client (SmartRedis) within the application code.
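A hedged application-side sketch is shown below. It calls the function loaded above under the key example_func; the tensor keys are illustrative, and it assumes the callable inside the stored script keeps its Python name, timestwo:

from smartredis import Client
import numpy as np

# Connect to the standalone Orchestrator from inside an Ensemble member
client = Client(cluster=False)

# Stage an input tensor, call the stored TorchScript function, and fetch the result
client.put_tensor("script_input", np.array([1.0, 2.0], dtype=np.float32))
client.run_script("example_func", "timestwo", inputs=["script_input"], outputs=["script_output"])
doubled = client.get_tensor("script_output")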
Attach a TorchScript From File#
Users can attach TorchScript functions from a file to an Ensemble and upload them to a colocated or standalone Orchestrator. This functionality is supported by the Ensemble.add_script function's script_path parameter.
When specifying a TorchScript using Ensemble.add_script, the following arguments are offered:
- name (str): Reference name for the script inside of the Orchestrator.
- script (t.Optional[str] = None): TorchScript code (only supported for non-colocated Orchestrators).
- script_path (t.Optional[str] = None): path to TorchScript code.
- device (t.Literal["CPU", "GPU"] = "CPU"): device for script execution, defaults to "CPU".
- devices_per_node (int = 1): The number of GPU devices available on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
- first_device (int = 0): The first GPU device to use on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
Example: Loading a TorchScript From File#
This example walks through the steps of instructing SmartSim to load a TorchScript from file to an Orchestrator. The source code example is available in the dropdown below for convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="path/to/example_simulation_program")
# Initialize an Ensemble object
ensemble_instance = exp.create_ensemble("ensemble_name", ensemble_settings)
# Attach TorchScript to Ensemble
ensemble_instance.add_script(name="example_script", script_path="path/to/torchscript.py", device="GPU", devices_per_node=2, first_device=0)
Note
This example assumes:
- an Orchestrator is launched prior to Ensemble execution
- an initialized Ensemble named ensemble_instance exists within the Experiment workflow
Define a TorchScript Script
For the example, we create the Python script torchscript.py. The file contains multiple simple torch functions, shown below:
def negate(x):
    return torch.neg(x)

def random(x, y):
    return torch.randn(x, y)

def pos(z):
    return torch.positive(z)
Attach the TorchScript Script to a SmartSim Ensemble
Assuming an initialized Ensemble named ensemble_instance exists, we add a TorchScript script using the Ensemble.add_script function and specify the script path via the parameter script_path:

# Attach TorchScript to Ensemble
ensemble_instance.add_script(name="example_script", script_path="path/to/torchscript.py", device="GPU", devices_per_node=2, first_device=0)
In the above ensemble_instance.add_script code snippet, we offer the following arguments:
- name ("example_script"): Reference name for the script inside of the Orchestrator.
- script_path ("path/to/torchscript.py"): Path to the script file.
- device ("GPU"): Device for script execution.
- devices_per_node (2): Use two GPUs per node.
- first_device (0): Start with the GPU at index 0.
Warning
Calling exp.start(ensemble_instance) prior to instantiation of an Orchestrator will result in a failed attempt to load the TorchScript to a non-existent Orchestrator.
When ensemble_instance is started via Experiment.start, the TorchScript will be loaded from file to the Orchestrator that is launched prior to the start of ensemble_instance.
Define TorchScripts as Raw String#
Users can upload TorchScript functions from a string to a colocated or standalone Orchestrator. This feature is supported by the Ensemble.add_script function's script parameter.
When specifying a TorchScript using Ensemble.add_script, the following arguments are offered:
- name (str): Reference name for the script inside of the Orchestrator.
- script (t.Optional[str] = None): String of function code (e.g. a TorchScript code string).
- script_path (t.Optional[str] = None): path to TorchScript code.
- device (t.Literal["CPU", "GPU"] = "CPU"): device for script execution, defaults to "CPU".
- devices_per_node (int = 1): The number of GPU devices available on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
- first_device (int = 0): The first GPU device to use on the host. This parameter only applies to GPU devices and will be ignored if device is specified as CPU.
Example: Load a TorchScript From String#
This example walks through the steps of instructing SmartSim to load a TorchScript function from a string to an Orchestrator before the execution of the associated Ensemble. The source code example is available in the dropdown below for convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
# Initialize the Experiment and set the launcher to auto
exp = Experiment("getting-started", launcher="auto")
# Initialize a RunSettings object
ensemble_settings = exp.create_run_settings(exe="path/to/executable/simulation")
# Initialize an Ensemble object
ensemble_instance = exp.create_ensemble("ensemble_name", ensemble_settings)
# TorchScript string
torch_script_str = "def negate(x):\n\treturn torch.neg(x)\n"
# Attach TorchScript to Ensemble
ensemble_instance.add_script(name="example_script", script=torch_script_str, device="GPU", devices_per_node=2, first_device=0)
Note
This example assumes:
- an Orchestrator is launched prior to Ensemble execution
- an initialized Ensemble named ensemble_instance exists within the Experiment workflow
Define a String TorchScript
Define the TorchScript code as a variable in the Python driver script:
# TorchScript string
torch_script_str = "def negate(x):\n\treturn torch.neg(x)\n"
Attach the TorchScript Function to a SmartSim Ensemble
Assuming an initialized Ensemble named ensemble_instance exists, we add a TorchScript using the Ensemble.add_script function and specify the variable torch_script_str to the parameter script:
# Attach TorchScript to Ensemble
ensemble_instance.add_script(name="example_script", script=torch_script_str, device="GPU", devices_per_node=2, first_device=0)
In the above ensemble_instance.add_script code snippet, we offer the following arguments:
- name ("example_script"): Key to store the script under.
- script (torch_script_str): The TorchScript code string.
- device ("GPU"): Device for script execution.
- devices_per_node (2): Use two GPUs per node.
- first_device (0): Start with the GPU at index 0.
Warning
Calling exp.start(ensemble_instance) prior to instantiation of an Orchestrator will result in a failed attempt to load the TorchScript to a non-existent Orchestrator.
When the Ensemble is started via Experiment.start, the TorchScript will be loaded to the Orchestrator that is launched prior to the start of the Ensemble.
Data Collision Prevention#
Overview#
When multiple Ensemble members use the same code to send and access their respective data in the Orchestrator, key overlapping can occur, leading to inadvertent data access between Ensemble members. To address this, SmartSim supports key prefixing through Ensemble.enable_key_prefixing, which enables key prefixing for all Ensemble members. For example, during an Ensemble simulation with prefixing enabled, SmartSim will add the Ensemble member name as a prefix to the keys sent to the Orchestrator. Enabling key prefixing eliminates issues related to key overlapping, allowing Ensemble members to use the same code without issue.
The key components of SmartSim Ensemble prefixing functionality include:
- Sending Data to the Orchestrator: Users can send data to an Orchestrator with the Ensemble member name prepended to the data name by utilizing SmartSim Ensemble functions.
- Retrieving Data From the Orchestrator: Users can instruct a Client to prepend an Ensemble member name to a key during data retrieval, polling, or checks for existence on the Orchestrator through SmartRedis Client functions. However, the entity interaction must be registered using Ensemble or Model functions.
See also
For information on prefixing Client functions, visit the Client functions page of the Model documentation.
For example, assume you have an Ensemble that was initialized using the replicas creation strategy. Two identical Models named ensemble_0 and ensemble_1 were created that use the same executable application within an Ensemble named ensemble. In the application code you use the function Client.put_tensor("tensor_0", data). Without key prefixing enabled, the slower member will overwrite the data from the faster simulation. With Ensemble key prefixing turned on, ensemble_0 and ensemble_1 can access their tensor "tensor_0" by name without overwriting or accessing the other Model's "tensor_0" tensor. In this scenario, the two tensors placed in the Orchestrator are named ensemble_0.tensor_0 and ensemble_1.tensor_0.
Ensemble Functions#
An Ensemble object supports two prefixing functions: Ensemble.enable_key_prefixing and Ensemble.register_incoming_entity. For more information on each function, reference the Ensemble API docs.
To enable prefixing on an Ensemble, users must use the Ensemble.enable_key_prefixing function in the Experiment driver script. This function activates prefixing for tensors, Datasets, and lists sent to an Orchestrator for all Ensemble members. This function also enables access to prefixing Client functions within the Ensemble members, with the exception of the Client.set_data_source function, for which enable_key_prefixing is not required.
Note
ML model and script prefixing is not automatically enabled through Ensemble.enable_key_prefixing. Prefixing must be enabled within the Ensemble by calling the use_model_ensemble_prefix method on the Client embedded within the member application.
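A minimal sketch of that application-side call, assuming a standalone Orchestrator and using the SmartRedis method named in the note above:

from smartredis import Client

# Inside an Ensemble member application: opt in to prefixing for ML models
# and scripts (tensor and Dataset prefixing is already handled by
# Ensemble.enable_key_prefixing in the driver script)
client = Client(cluster=False)
client.use_model_ensemble_prefix(True)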
Users can enable the SmartRedis Client to interact with prefixed data, ML models, and TorchScripts using Client.set_data_source. However, for SmartSim to recognize the producer entity name passed to the function within an application, the producer entity must be registered on the consumer entity using Ensemble.register_incoming_entity.
If a consumer Ensemble member requests data sent to the Orchestrator by other Ensemble members, the producer members must be registered on the consumer member. To access Ensemble members, SmartSim offers the Ensemble.models attribute, which returns a list of Ensemble members. Below we demonstrate registering producer members on a consumer member:
# Names of the producer Ensemble members
list_of_ensemble_names = ["producer_0", "producer_1", "producer_2"]

# Build a lookup of Ensemble members by name (Ensemble.models is a list)
members_by_name = {model.name: model for model in ensemble.models}

# Grab the consumer Ensemble member
ensemble_member = members_by_name["producer_3"]

# Register the producer members on the consumer member
for name in list_of_ensemble_names:
    ensemble_member.register_incoming_entity(members_by_name[name])
For examples demonstrating how to retrieve data within the entity application that produced the data, visit the Model Copy/Rename/Delete Operations subsection.
Example: Ensemble Key Prefixing#
In this example, we create an Ensemble comprised of two Models that use identical code to send data to a standalone Orchestrator. To prevent key collisions and ensure data integrity, we enable key prefixing on the Ensemble, which automatically prepends the Ensemble member name to the data sent to the Orchestrator. After the Ensemble completes, we launch a consumer Model within the Experiment driver script to demonstrate accessing prefixed data sent to the Orchestrator by Ensemble members.
This example consists of three Python scripts:
- Application Producer Script: This script is encapsulated in a SmartSim Ensemble within the Experiment driver script. Prefixing is enabled on the Ensemble. The producer script puts NumPy tensors on an Orchestrator launched in the Experiment driver script. The Ensemble creates two identical Ensemble members. The producer script is executed in both Ensemble members to send two prefixed tensors to the Orchestrator. The source code example is available in the dropdown below for convenient customization.
Application Producer Script Source Code
from smartredis import Client
import numpy as np
# Initialize a Client
client = Client(cluster=False)
# Create NumPy array
array = np.array([1, 2, 3, 4])
# Use SmartRedis Client to place tensor in standalone Orchestrator
client.put_tensor("tensor", array)
- Application Consumer Script: This script is encapsulated within a SmartSim Model in the Experiment driver script. The script requests the prefixed tensors placed by the producer script. The source code example is available in the dropdown below for convenient customization.
Application Consumer Script Source Code
from smartredis import Client, LLInfo
# Initialize a Client
client = Client(cluster=False)
# Set the data source
client.set_data_source("producer_0")
# Check if the tensor exists
tensor_1 = client.poll_tensor("tensor", 100, 100)
# Set the data source
client.set_data_source("producer_1")
# Check if the tensor exists
tensor_2 = client.poll_tensor("tensor", 100, 100)
client.log_data(LLInfo, f"producer_0.tensor was found: {tensor_1}")
client.log_data(LLInfo, f"producer_1.tensor was found: {tensor_2}")
- Experiment Driver Script: The driver script launches the Orchestrator, the Ensemble (which sends prefixed keys to the Orchestrator), and the Model (which requests prefixed keys from the Orchestrator). The Experiment driver script is the centralized spot that controls the workflow. The source code example is available in the dropdown below for convenient execution and customization.
Experiment Driver Script Source Code
from smartsim import Experiment
from smartsim.log import get_logger
logger = get_logger("Experiment Log")
# Initialize the Experiment
exp = Experiment("getting-started", launcher="auto")
# Initialize a standalone Orchestrator
standalone_orch = exp.create_database(db_nodes=1)
# Initialize a RunSettings object for Ensemble
ensemble_settings = exp.create_run_settings(exe="/path/to/executable_producer_simulation")
# Initialize Ensemble
producer_ensemble = exp.create_ensemble("producer", run_settings=ensemble_settings, replicas=2)
# Enable key prefixing for Ensemble members
producer_ensemble.enable_key_prefixing()
# Initialize a RunSettings object for Model
model_settings = exp.create_run_settings(exe="/path/to/executable_consumer_simulation")
# Initialize Model
consumer_model = exp.create_model("consumer", model_settings)
# Generate SmartSim entity folder tree
exp.generate(standalone_orch, producer_ensemble, consumer_model, overwrite=True)
# Launch Orchestrator
exp.start(standalone_orch, summary=True)
# Launch Ensemble
exp.start(producer_ensemble, block=True, summary=True)
# Register Ensemble members on consumer Model
for model in producer_ensemble:
    consumer_model.register_incoming_entity(model)
# Launch consumer Model
exp.start(consumer_model, block=True, summary=True)
# Clobber Orchestrator
exp.stop(standalone_orch)
The Application Producer Script#
In the Experiment driver script, we instruct SmartSim to create an Ensemble comprised of two duplicate members that execute this producer script. In the producer script, a SmartRedis Client sends a tensor to the Orchestrator. Since the Ensemble members are identical and therefore use the same application code, two tensors are sent to the Orchestrator. Without prefixing enabled on the Ensemble, the keys can be overwritten. To prevent this, we enable key prefixing on the Ensemble in the driver script via Ensemble.enable_key_prefixing. When the producer script is executed by each Ensemble member, a tensor is sent to the Orchestrator with the Ensemble member name prepended to the tensor name.
Here we provide the producer script that is applied to the Ensemble members:
from smartredis import Client
import numpy as np

# Initialize a Client
client = Client(cluster=False)

# Create NumPy array
array = np.array([1, 2, 3, 4])
# Use SmartRedis Client to place tensor in standalone Orchestrator
client.put_tensor("tensor", array)
After the completion of Ensemble members producer_0 and producer_1, the contents of the Orchestrator are:
1) "producer_0.tensor"
2) "producer_1.tensor"
The Application Consumer Script#
In the Experiment driver script, we initialize a consumer Model that encapsulates the consumer application to request the tensors produced by the Ensemble. To do so, we use SmartRedis key prefixing functionality to instruct the SmartRedis Client to prepend the name of an Ensemble member to the key name.
See also
For more information on Client prefixing functions, visit the Client functions subsection of the Model documentation.
To begin, specify the imports and initialize a SmartRedis Client:

from smartredis import Client, LLInfo

# Initialize a Client
client = Client(cluster=False)
To retrieve the tensor from the first Ensemble member, named producer_0, use Client.set_data_source. Specify the name of the first Ensemble member as an argument to the function. This instructs SmartSim to prepend the Ensemble member name to the data search on the Orchestrator. When Client.poll_tensor is executed, the SmartRedis client will poll for the key producer_0.tensor:
# Set the data source
client.set_data_source("producer_0")
# Check if the tensor exists
tensor_1 = client.poll_tensor("tensor", 100, 100)
Follow the same steps as above, but change the data source name to the name of the second Ensemble member (producer_1):
# Set the data source
client.set_data_source("producer_1")
# Check if the tensor exists
tensor_2 = client.poll_tensor("tensor", 100, 100)
We print the boolean return to verify that the tensors were found:
client.log_data(LLInfo, f"producer_0.tensor was found: {tensor_1}")
client.log_data(LLInfo, f"producer_1.tensor was found: {tensor_2}")
When the Experiment driver script is executed, the following output will appear in consumer.out:
Default@11-46-05:producer_0.tensor was found: True
Default@11-46-05:producer_1.tensor was found: True
Warning
For SmartSim to recognize the Ensemble member names as valid data sources for Client.set_data_source, you must register each Ensemble member on the consumer Model in the driver script via Model.register_incoming_entity. We demonstrate this in the Experiment driver script section of the example.
The Experiment Script#
The Experiment driver script manages all workflow components and utilizes the producer and consumer application scripts. In the example, the Experiment:
- launches a standalone Orchestrator
- launches an Ensemble via the replicas initialization strategy
- launches a consumer Model
- clobbers the Orchestrator
To begin, add the necessary imports, initialize an Experiment instance, and initialize the standalone Orchestrator:
from smartsim import Experiment
from smartsim.log import get_logger

logger = get_logger("Experiment Log")
# Initialize the Experiment
exp = Experiment("getting-started", launcher="auto")

# Initialize a standalone Orchestrator
standalone_orch = exp.create_database(db_nodes=1)
We are now set up to discuss key prefixing within the Experiment driver script.
To create an Ensemble using the replicas strategy, begin by initializing a RunSettings object to apply to all Ensemble members. Specify the path to the application producer script:
# Initialize a RunSettings object for Ensemble
ensemble_settings = exp.create_run_settings(exe="/path/to/executable_producer_simulation")
Next, initialize an Ensemble by specifying ensemble_settings and the number of Model replicas to create:
# Initialize Ensemble
producer_ensemble = exp.create_ensemble("producer", run_settings=ensemble_settings, replicas=2)
Instruct SmartSim to prefix all tensors sent to the Orchestrator from the Ensemble via Ensemble.enable_key_prefixing:
# Enable key prefixing for Ensemble members
producer_ensemble.enable_key_prefixing()
Next, initialize the consumer Model. The consumer Model application requests the prefixed tensors produced by the Ensemble:
# Initialize a RunSettings object for Model
model_settings = exp.create_run_settings(exe="/path/to/executable_consumer_simulation")
# Initialize Model
consumer_model = exp.create_model("consumer", model_settings)
Next, organize the SmartSim entity output files into a single Experiment folder:
# Generate SmartSim entity folder tree
exp.generate(standalone_orch, producer_ensemble, consumer_model, overwrite=True)
Launch the Orchestrator:
# Launch Orchestrator
exp.start(standalone_orch, summary=True)
Launch the Ensemble:
# Launch Ensemble
exp.start(producer_ensemble, block=True, summary=True)
Set block=True so that Experiment.start waits until the last Ensemble member has finished before continuing.
The consumer Model application script uses Client.set_data_source, which accepts the Ensemble member names when searching for prefixed keys in the Orchestrator. In order for SmartSim to recognize the Ensemble member names as valid data sources in the consumer Model, we must register the entity interaction:
# Register Ensemble members on consumer Model
for model in producer_ensemble:
    consumer_model.register_incoming_entity(model)
Launch the consumer Model:
# Launch consumer Model
exp.start(consumer_model, block=True, summary=True)
To finish, tear down the standalone Orchestrator:
# Clobber Orchestrator
exp.stop(standalone_orch)