Changelog#
Listed here are the changes between each release of SmartSim, SmartRedis and SmartDashboard.
SmartSim#
To be released at some point in the future
Description
Implement workaround for Tensorflow that allows RedisAI to build with GCC-14
Add instructions for installing SmartSim on PML’s Scylla
Fix typos in documentation
Detailed Notes
In libtensorflow, the input argument to TF_SessionRun seems to be mistyped to TF_Output instead of TF_Input. These two types differ only in name. GCC-14 catches this and throws an error, even though earlier versions allow this. To solve this problem, patches are applied to the Tensorflow backend in RedisAI. Future versions of Tensorflow may fix this problem, but for now this seems to be the best workaround. (SmartSim-PR738)
PML’s Scylla is still under development. The usual SmartSim build instructions do not apply because the GPU dependencies have yet to be installed at a system-wide level. Scylla has its own entry in the documentation. (SmartSim-PR733)
Fix typos in the train_surrogate tutorial documentation
0.8.0#
Released on 27 September, 2024
Description
Add instructions for Frontier to set the MIOPEN cache
Refine Frontier documentation for proper use of miniforge3
Refactor the RedisAI build to allow more flexibility in versions and sources of ML backends
Add Dockerfiles with GPU support
Fine-grained build support for GPUs
Update Torch to 2.1.0, Tensorflow to 2.15.0
Better error messages in build process
Allow specifying Model and Ensemble parameters with number-like types (e.g. numpy types)
Pin watchdog to 4.x
Update codecov to 4.5.0
Remove build of Redis from setup.py
Mitigate dependency installation issues
Fix internal host name representation for Dragon backend
Make dependencies more discoverable in setup.py
Add hardware pinning capability when using dragon
Pin NumPy version to 1.x
New launcher support for SGE (and similar derivatives)
Fix test outputs being created in incorrect directory
Improve support for building SmartSim without ML backends
Update packaging dependency
Remove broken oss.redis.com URI blocking documentation generation
Detailed Notes
On Frontier, the MIOPEN cache may need to be set prior to using RedisAI in smart validate. The instructions for Frontier have been updated accordingly. (SmartSim-PR727)
On Frontier, the recommended way to activate conda environments is to go through source activate. This also means that conda init is not needed. The instructions for Frontier have been updated to reflect this. (SmartSim-PR719)
The RedisAIBuilder class was completely overhauled to allow users to express a wider range of support for hardware/software stacks. This will be extended to support ROCm, CUDA-11, and CUDA-12. (SmartSim-PR669)
Versions for each of these packages are no longer specified in an internal class. Instead a default set of JSON files specifies the sources and versions. Users can specify their own custom specifications at smart build time. (SmartSim-PR669)
Because all build configuration has been moved to static files and all backends are compiled during smart build, SmartSim can now be shipped as a pure Python wheel. (SmartSim-PR728)
Two new Dockerfiles are now provided (one each for CUDA 11.8 and 12.1) that can be used to build a container to run the tutorials. No HPC support should be expected at this time. (SmartSim-PR669)
As a result of the previous change, SmartSim now requires C++17 and a minimum CUDA version of 11.8 in order to build Torch 2.1.0. (SmartSim-PR669)
Error messages were not being interpolated correctly. This has been addressed to provide more context when exposing error messages to users. (SmartSim-PR669)
The serializer would fail if a parameter for a Model or Ensemble was specified as a numpy dtype. The constructors for these classes now validate that the input is number-like and convert it to a string. (SmartSim-PR676)
Pin watchdog to 4.x because v5 introduces new types and requires updates to the type-checking (SmartSim-PR690)
Update codecov to 4.5.0 to mitigate GitHub action failure (SmartSim-PR657)
The builder module was included in setup.py to allow us to ship the main Redis binaries (not RedisAI) with installs from PyPI. To allow easier maintenance of this file and enable future complexity this has been removed. The Redis binaries will thus be built by users during the smart build step.
Installation of mypy or dragon in separate build actions caused some dependencies (typing_extensions, numpy) to be upgraded and caused runtime failures. The build actions were tweaked to include all optional dependencies to be considered by pip during resolution. Additionally, the numpy version was capped on dragon installations. (SmartSim-PR653)
setup.py used to define dependencies in a way that was not amenable to code scanning tools. Direct dependencies now appear directly in the setup call and the definition of the SmartRedis version has been removed (SmartSim-PR635)
The separate definition of dependencies for the docs in requirements-doc.txt is now defined as an extra. (SmartSim-PR635)
The new major version release of Numpy is incompatible with modules compiled against Numpy 1.x. For both SmartSim and SmartRedis we request a 1.x version of numpy. This is needed in SmartSim because some of the downstream dependencies request NumPy (SmartSim-PR623)
SGE is now a supported launcher for SmartSim. Users can now define BatchSettings which will be monitored by the TaskManager. Additionally, if the MPI implementation was built with SGE support, Orchestrators can use mpirun without needing to specify the hosts (SmartSim-PR610)
Ensure outputs from tests are written to temporary tests/test_output directory
Fix an error that would prevent smart build from moving a successfully compiled RedisAI shared object to the install location expected by SmartSim if no ML backend installations were found. Previously, this would effectively require users to build and install an ML backend to use the SmartSim orchestrator even if it was not necessary for their workflow. Users can install SmartSim without ML backends by running smart build --no_tf --no_pt and the RedisAI shared object will now be placed in the expected location. (SmartSim-PR601)
Fix packaging failures due to deprecated pkg_resources. (SmartSim-PR598)
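The number-like parameter handling noted above (SmartSim-PR676) can be pictured with a short sketch. The helper name stringify_params is hypothetical and this is not SmartSim's implementation; it only illustrates accepting any value registered as a number and storing it as a string. NumPy scalar types are expected to pass the same check because NumPy registers them with the numbers abstract base classes, but only standard-library types are exercised here.

```python
import numbers
from fractions import Fraction


def stringify_params(params):
    """Hypothetical helper: validate values and convert number-like ones to strings."""
    converted = {}
    for name, value in params.items():
        if isinstance(value, str):
            # Strings are passed through unchanged
            converted[name] = value
        elif isinstance(value, numbers.Number):
            # int, float, Fraction, and other registered numeric types
            converted[name] = str(value)
        else:
            raise TypeError(f"parameter {name!r} must be a string or number-like")
    return converted


params = stringify_params({"steps": 10, "dt": 0.5, "ratio": Fraction(1, 3), "tag": "a"})
print(params)
```

Anything that is neither a string nor number-like, such as a list, raises TypeError in this sketch.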
0.7.0#
Released on 14 May, 2024
Description
Update tutorials and tutorial containers
Improve Dragon server shutdown
Add dragon runtime installer
Add launcher based on Dragon
Reuse Orchestrators within the testing suite to improve performance.
Fix building of documentation
Preview entities on experiment before start
Update authentication in release workflow
Auto-generate type-hints into documentation
Auto-post release PR to develop
Bump manifest.json to version 0.0.4
Fix symlinking batch ensemble and model bug
Fix noisy failing WLM test
Remove defensive regexp in .gitignore
Upgrade ubuntu to 22.04
Remove helper function init_default
Fix telemetry monitor logging errors for task history
Change default path for entities
Drop Python 3.8 support
Update watchdog dependency
Historical output files stored under .smartsim directory
Fixes unfalsifiable test that tests SmartSim’s custom SIGINT signal handler
Add option to build Torch backend without the Intel Math Kernel Library
Fix ReadTheDocs build issue
Disallow uninitialized variable use
Promote device options to an Enum
Update telemetry monitor, add telemetry collectors
Add method to specify node features for a Slurm job
Colo Orchestrator setup now blocks application start until setup finished
Refactor areas of the code where mypy reports potential errors
Minor enhancements to test suite
ExecArgs handling correction
ReadTheDocs config file added and enabled on PRs
Enforce changelog updates
Fix Jupyter notebook math expressions
Remove deprecated SmartSim modules
SmartSim Documentation refactor
Promote SmartSim statuses to a dedicated type
Update the version of Redis from 7.0.4 to 7.2.4
Increase disk space in doc builder container
Update Experiment API typing
Prevent duplicate entity names
Fix publishing of development docs
Detailed Notes
The tutorials are up to date with SmartSim and SmartRedis APIs. Additionally, the tutorial containers’ Docker files are updated. (SmartSim-PR589)
The Dragon server will now terminate any process which is still running when a request of an immediate shutdown is sent. (SmartSim-PR582)
Add --dragon option to smart build. Install appropriate Dragon runtime from Dragon GitHub release assets. (SmartSim-PR580)
Add new launcher, based on Dragon. The new launcher is compatible with the Slurm and PBS schedulers and can be selected by specifying launcher="dragon" when creating an Experiment, or by using DragonRunSettings to launch a job. The Dragon launcher is at an early stage of development: early adopters are referred to the dedicated documentation section to learn more about it. (SmartSim-PR580)
Tests may now request a given configuration and will reconnect to the existing orchestrator instead of building up and tearing down a new one each test. (SmartSim-PR567)
Manually ensure that typing_extensions==4.6.1 in Dockerfile used to build docs. This fixes the deploy_dev_docs Github action (SmartSim-PR564)
Added preview functionality to Experiment, including preview of all entities, active infrastructure and client configuration. (SmartSim-PR525)
Replace the developer created token with the GH_TOKEN environment variable. (SmartSim-PR570)
Add extension to auto-generate function type-hints into documentation. (SmartSim-PR561)
Extend the GitHub release workflow to auto-generate a pull request from master into develop for release. (SmartSim-PR566)
The manifest.json version needs to match the SmartDashboard version, which is 0.0.4 in the upcoming release. (SmartSim-PR563)
Properly symlinks batch ensembles and batch models. (SmartSim-PR547)
Remove defensive regexp in .gitignore and ensure tests write to test_output. (SmartSim-PR560)
After dropping support for Python 3.8, ubuntu needs to be upgraded. (SmartSim-PR558)
Remove helper function init_default and replace with traditional type narrowing. (SmartSim-PR545)
Ensure the telemetry monitor does not track a task_id for a managed task. (SmartSim-PR557)
The default path for an entity is now the path to the experiment / the entity name. create_database and create_ensemble now have path arguments. All path arguments are compatible with relative paths. Relative paths are relative to the CWD. (SmartSim-PR533)
Python 3.8 is reaching its end-of-life in October, 2024, so it will no longer continue to be supported. (SmartSim-PR544)
Update watchdog dependency from 3.x to 4.x, fix new type issues (SmartSim-PR540)
The dashboard needs to display historical logs, so log files are written out under the .smartsim directory and files under the experiment directory are symlinked to them. (SmartSim-PR532)
Add an option to smart build, --torch_with_mkl/--no_torch_with_mkl, to prevent Torch from trying to link in the Intel Math Kernel Library. This is needed because on machines that have the Intel compilers installed, Torch will unconditionally try to link in this library but fails because the linking flags are incorrect. (SmartSim-PR538)
Change typing_extensions and pydantic versions in readthedocs environment to enable docs build. (SmartSim-PR537)
Promote devices to a dedicated Enum type throughout the SmartSim code base. (SmartSim-PR527)
Update the telemetry monitor to enable retrieval of metrics on a scheduled interval. Switch basic experiment tracking telemetry to default to on. Add database metric collectors. Improve telemetry monitor logging. Create telemetry subpackage at smartsim._core.utils.telemetry. Refactor telemetry monitor entrypoint. (SmartSim-PR460)
Users can now specify node features for a Slurm job through SrunSettings.set_node_feature. The method accepts a string or list of strings. (SmartSim-PR529)
The request to the colocated entrypoints file within the shell script is now a blocking process. Once the Orchestrator is set up, it returns, which moves the process to the background and allows the application to start. This prevents the application from requesting a ML model or script that has not been uploaded to the Orchestrator yet. (SmartSim-PR522)
Add checks and tests to ensure SmartSim users cannot initialize run settings with a list of lists as the exe_args argument. (SmartSim-PR517)
Add readthedocs configuration file and enable readthedocs builds on pull requests. Additionally added robots.txt file generation when readthedocs environment detected. (SmartSim-PR512)
Add Github Actions workflow that checks if changelog is edited on pull requests into develop. (SmartSim-PR518)
Add path to MathJax.js file so that Sphinx can use it to render math expressions. (SmartSim-PR516)
Removed deprecated SmartSim modules: slurm and mpirunSettings. (SmartSim-PR514)
Implemented new structure of SmartSim documentation. Added examples images and further detail of SmartSim components. (SmartSim-PR463)
Promote SmartSim statuses to a dedicated type named SmartSimStatus. (SmartSim-PR509)
Update Redis version to 7.2.4. This change fixes an issue in the Redis build scripts causing failures on Apple Silicon hosts. (SmartSim-PR507)
The container which builds the documentation for every merge to develop was failing due to a lack of space within the container. This was fixed by including an additional Github action that removes some unneeded software and files that come from the default Github Ubuntu container. (SmartSim-PR504)
Update the generic t.Any typehints in Experiment API. (SmartSim-PR501)
The CI will fail static analysis if common erroneous truthy checks are detected. (SmartSim-PR524)
Prevent the launch of duplicate named entities. Allow completed entities to run. (SmartSim-PR480)
The CI will fail static analysis if a local variable is used while potentially undefined. (SmartSim-PR521)
Remove previously deprecated behavior present in test suite on machines with Slurm and Open MPI. (SmartSim-PR520)
Experiments in the WLM tests are given explicit paths to prevent unexpected directory creation. Ensure databases are not left open on test suite failures. Update path to pickle file in tests/full_wlm/test_generic_orc_launch_batch.py::test_launch_cluster_orc_reconnect to conform with changes made in (SmartSim-PR533). (SmartSim-PR559)
When calling Experiment.start SmartSim would register a signal handler that would capture an interrupt signal (^C) to kill any jobs launched through its JobManager. This would replace the default (or user defined) signal handler. SmartSim will now attempt to kill any launched jobs before calling the previously registered signal handler. (SmartSim-PR535)
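The signal handler behavior described in SmartSim-PR535 follows a common chaining pattern. A minimal sketch using only the standard signal module, not SmartSim's own code: the names install_chained_handler, killed_jobs, and user_handler are hypothetical, and the handler is invoked directly rather than by sending a real interrupt.

```python
import signal

killed_jobs = []      # stand-in for jobs tracked by a job manager
previous_calls = []   # records calls made to the pre-existing handler


def user_handler(signum, frame):
    """A user-defined handler that was registered before ours."""
    previous_calls.append(signum)


def install_chained_handler(kill_jobs):
    """Install a SIGINT handler that runs kill_jobs, then the previous handler."""
    previous = signal.getsignal(signal.SIGINT)

    def handler(signum, frame):
        kill_jobs()
        # SIG_DFL and SIG_IGN are not callable, so they are skipped here
        if callable(previous):
            previous(signum, frame)

    signal.signal(signal.SIGINT, handler)
    return handler


signal.signal(signal.SIGINT, user_handler)
chained = install_chained_handler(lambda: killed_jobs.append("job-0"))
chained(signal.SIGINT, None)  # call directly instead of sending a real ^C
```

In this sketch the cleanup runs first and the previously registered handler still sees the signal, so it is not silently replaced.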
0.6.2#
Released on 16 February, 2024
Description
Patch SmartSim dependency version
Detailed Notes
A critical performance concern was identified and addressed in SmartRedis. A patch fix was deployed, and SmartSim was updated to ensure users do not inadvertently pull the unpatched version of SmartRedis. (SmartSim-PR493)
0.6.1#
Released on 15 February, 2024
Description
Duplicate for DBModel/Script prevented
Update license to include 2024
Telemetry monitor is now active by default
Add support for Mac OSX on Apple Silicon
Remove Torch warnings during testing
Validate Slurm timing format
Expose Python Typehints
Fix test_logs to prevent generation of directory
Fix Python Typehint for colocated database settings
Python 3.11 Support
Quality of life smart validate improvements
Remove Cobalt support
Enrich logging through context variables
Upgrade Machine Learning dependencies
Override sphinx-tabs background color
Add concurrency group to test workflow
Fix index when installing torch through smart build
Detailed Notes
Modify the git clone for both Redis and RedisAI to set the line endings to unix-style line endings when using MacOS on ARM. (SmartSim-PR482)
Separate install instructions are now provided for Mac OSX on x64 vs ARM64 (SmartSim-PR479)
Prevent duplicate ML model and script names being added to an Ensemble member if the names already exist. (SmartSim-PR475)
Updates Copyright (c) 2021-2023 to Copyright (c) 2021-2024 in all of the necessary files. (SmartSim-PR485)
Fix a bug that prevented the expected behavior when the SMARTSIM_LOG_LEVEL environment variable was set to developer. (SmartSim-PR473)
Sets the default value of the “enable telemetry” flag to on. Bumps the output manifest.json version number to match that of smartdashboard and pins a watchdog version to avoid build errors. (SmartSim-PR477)
Refactor logic of Manifest.has_db_objects to remove excess branching and improve readability/maintainability. (SmartSim-PR476)
SmartSim can now be built and used on platforms using Apple Silicon (ARM64). Currently, only the PyTorch backend is supported. Note that libtorch will be downloaded from a CrayLabs github repo. (SmartSim-PR465)
Tests that were saving Torch models were emitting warnings. These warnings were addressed by updating the model save test function. (SmartSim-PR472)
Validate the timing format when requesting a slurm allocation. (SmartSim-PR471)
Add and ship py.typed marker to expose inline type hints. Fix type errors related to SmartRedis. (SmartSim-PR468)
Fix the test_logs.py::test_context_leak test that was erroneously creating a directory named "some value" in SmartSim’s root directory. (SmartSim-PR467)
Add Python type hinting to colocated settings. (SmartSim-PR462)
Add github actions for running black and isort checks. (SmartSim-PR464)
Relax the required version of typing_extensions. (SmartSim-PR459)
Addition of Python 3.11 to SmartSim. (SmartSim-PR461)
Quality of life smart validate improvements such as setting CUDA_VISIBLE_DEVICES environment variable within smart validate prior to importing any ML deps to prevent false negatives on multi-GPU systems. Additionally, move SmartRedis logs from standard out to dedicated log file in the validation temporary directory as well as suppress sklearn deprecation warning by pinning KMeans constructor argument. Lastly, move TF test to last as TF may reserve the GPUs it uses. (SmartSim-PR458)
Some actions in the current GitHub CI/CD workflows were outdated. They were replaced with the latest versions. (SmartSim-PR446)
As the Cobalt workload manager is not used on any system we are aware of, its support in SmartSim was terminated and classes such as CobaltLauncher have been removed. (SmartSim-PR448)
Experiment logs are written to a file that can be read by the dashboard. (SmartSim-PR452)
Updated SmartSim’s machine learning backends to PyTorch 2.0.1, Tensorflow 2.13.1, ONNX 1.14.1, and ONNX Runtime 1.16.1. As a result of this change, there is now an available ONNX wheel for use with Python 3.10, and wheels for all of SmartSim’s machine learning backends with Python 3.11. (SmartSim-PR451) (SmartSim-PR461)
The sphinx-tabs documentation extension uses a white background for the tabs component. A custom CSS for those components to inherit the overall theme color has been added. (SmartSim-PR453)
Add concurrency groups to GitHub’s CI/CD workflows, preventing multiple workflows from the same PR from being launched concurrently. (SmartSim-PR439)
Torch changed their preferred indexing when trying to install their provided wheels. Updated the pip install command within smart build to ensure that the appropriate packages can be found. (SmartSim-PR449)
0.6.0#
Released on 18 December, 2023
Description
Conflicting directives in the SmartSim packaging instructions were fixed
sacct and sstat errors are now fatal for Slurm-based workflow executions
Added documentation section about ML features and TorchScript
Added TorchScript functions to Online Analysis tutorial
Added multi-DB example to documentation
Improved test stability on HPC systems
Added support for producing & consuming telemetry outputs
Split tests into groups for parallel execution in CI/CD pipeline
Change signature of Experiment.summary()
Expose first_device parameter for scripts, functions, models
Added support for MINBATCHTIMEOUT in model execution
Remove support for RedisAI 1.2.5, use RedisAI 1.2.7 commit
Add support for multiple databases
Detailed Notes
Several conflicting directives between the setup.py and the setup.cfg were fixed to mitigate warnings issued when building the pip wheel. (SmartSim-PR435)
When the Slurm functions sacct and sstat returned an error, it would be ignored and SmartSim’s state could become inconsistent. To prevent this, errors raised by sacct or sstat now result in an exception. (SmartSim-PR392)
A section named ML Features was added to documentation. It contains multiple examples of how ML models and functions can be added to and executed on the DB. TorchScript-based post-processing was added to the Online Analysis tutorial (SmartSim-PR411)
An example of how to use multiple Orchestrators concurrently was added to the documentation (SmartSim-PR409)
The test infrastructure was improved. Tests on HPC systems are now stable, and issues such as non-stopped Orchestrators or experiments created in the wrong paths have been fixed (SmartSim-PR381)
A telemetry monitor was added to check updates and produce events for SmartDashboard (SmartSim-PR426)
Split tests into group_a, group_b, slow_tests for parallel execution in CI/CD pipeline (SmartSim-PR417, SmartSim-PR424)
Change format argument to style in Experiment.summary(); this is an API break (SmartSim-PR391)
Added support for first_device parameter for scripts, functions, and models. This causes them to be loaded to the first num_devices beginning with first_device (SmartSim-PR394)
Added support for MINBATCHTIMEOUT in model execution, which caps the delay waiting for a minimum number of model execution operations to accumulate before executing them as a batch (SmartSim-PR387)
RedisAI 1.2.5 is not supported anymore. The only RedisAI version is now 1.2.7. Since the officially released RedisAI 1.2.7 has a bug which breaks the build process on Mac OSX, it was decided to use commit 634916c from RedisAI’s GitHub repository, where this bug has been fixed. This applies to all operating systems. (SmartSim-PR383)
Add support for creation of multiple databases with unique identifiers. (SmartSim-PR342)
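The first_device note above (SmartSim-PR394) describes loading to the first num_devices devices beginning with first_device. A minimal sketch of that selection rule follows. The function name target_devices and the "GPU:<index>" label format are assumptions made for illustration and are not taken from the SmartSim source.

```python
def target_devices(device, first_device=0, num_devices=1):
    """Hypothetical helper: list the device labels an object would be loaded to."""
    kind = device.upper()
    if kind == "CPU":
        # A CPU target is treated as a single device in this sketch
        return ["CPU"]
    # Consecutive device indices starting at first_device
    return [f"{kind}:{index}" for index in range(first_device, first_device + num_devices)]


print(target_devices("GPU", first_device=2, num_devices=3))
```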
0.5.1#
Released on 14 September, 2023
Description
Add typehints throughout the SmartSim codebase
Provide support for Slurm heterogeneous jobs
Provide better support for PalsMpiexecSettings
Allow for easier inspection of SmartSim entities
Log ignored error messages from sacct
Fix colocated db preparation bug when using JsrunSettings
Fix bug when users specify CPU and devices greater than 1
Fix bug when get_allocation called with reserved keywords
Enabled mypy in CI for better type safety
Mitigate additional suppressed pylint errors
Update linting support and apply to existing errors
Various improvements to the smart CLI
Various documentation improvements
Various test suite improvements
Detailed Notes
Add methods to allow users to inspect files attached to models and ensembles. (SmartSim-PR352)
Add a smart info target to provide rudimentary information about the SmartSim installation. (SmartSim-PR350)
Remove unnecessary generation producing unexpected directories in the test suite. (SmartSim-PR349)
Add support for heterogeneous jobs to SrunSettings by allowing users to set the --het-group parameter. (SmartSim-PR346)
Provide clearer guidelines on how to contribute to SmartSim. (SmartSim-PR344)
Integrate PalsMpiexecSettings into the Experiment factory methods when using the "pals" launcher. (SmartSim-PR343)
Create public properties where appropriate to mitigate protected-access errors. (SmartSim-PR341)
Fix a failure to execute _prep_colocated_db due to incorrect named attr check. (SmartSim-PR339)
Enabled and mitigated mypy disallow_any_generics and warn_return_any. (SmartSim-PR338)
Add a smart validate target to provide a simple smoke test to assess a SmartSim build. (SmartSim-PR336, SmartSim-PR351)
Add typehints to smartsim._core.launcher.step.*. (SmartSim-PR334)
Log errors reported from slurm WLM when attempts to retrieve status fail. (SmartSim-PR331, SmartSim-PR332)
Fix incorrectly formatted positional arguments in log format strings. (SmartSim-PR330)
Ensure that launchers pass environment variables to unmanaged job steps. (SmartSim-PR329)
Add additional tests surrounding the RAI_PATH configuration environment variable. (SmartSim-PR328)
Remove unnecessary execution of unescaped shell commands. (SmartSim-PR327)
Add error if user calls get_allocation with reserved keywords in slurm get_allocation. (SmartSim-PR325)
Add error when user requests CPU with devices greater than 1 within add_ml_model and add_script. (SmartSim-PR324)
Update documentation surrounding ensemble key prefixing. (SmartSim-PR322)
Fix formatting of the Frontier site installation. (SmartSim-PR321)
Update pylint dependency, update .pylintrc, mitigate non-breaking issues, suppress api breaks. (SmartSim-PR311)
Refactor the smart CLI to use subparsers for better documentation and extension. (SmartSim-PR308)
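The CPU device check noted above (SmartSim-PR324) amounts to a small validation step. A minimal sketch, assuming parameter names device and devices_per_node; the function name check_device_request is hypothetical, and the actual check inside add_ml_model and add_script may differ.

```python
def check_device_request(device, devices_per_node=1):
    """Hypothetical check: more than one device per node only makes sense for non-CPU devices."""
    if device.upper() == "CPU" and devices_per_node > 1:
        raise ValueError("devices_per_node must be 1 when device is CPU")
    return device.upper(), devices_per_node


print(check_device_request("gpu", devices_per_node=2))
```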
0.5.0#
Released on 6 July, 2023
Description
A full list of changes and detailed notes can be found below:
Update SmartRedis dependency to v0.4.1
Fix tests for db models and scripts
Fix add_ml_model() and add_script() documentation, tests, and code
Remove requirements.txt and other places where dependencies were defined
Replace limit_app_cpus with limit_db_cpus for co-located orchestrators
Remove wait time associated with Experiment launch summary
Update and rename Redis conf file
Migrate from redis-py-cluster to redis-py
Update full test suite to not require a TF wheel at test time
Update doc strings
Remove deprecated code
Relax the coloredlogs version
Update Fortran tutorials for SmartRedis
Add support for multiple network interface binding in Orchestrator and Colocated DBs
Add typehints and static analysis
Detailed notes
Updates SmartRedis to the most current release (SmartSim-PR316)
Fixes and enhancements to documentation (SmartSim-PR317, SmartSim-PR314, SmartSim-PR287)
Various fixes and enhancements to the test suite (SmartSim-PR315, SmartSim-PR312, SmartSim-PR310, SmartSim-PR302, SmartSim-PR283)
Fix a defect in the tests related to database models and scripts that was causing key collisions when testing on workload managers (SmartSim-PR313)
Remove requirements.txt and other places where dependencies were defined. (SmartSim-PR307)
Fix defect where dictionaries used to create run settings can be changed unexpectedly due to copy-by-ref (SmartSim-PR305)
The underlying code for Model.add_ml_model() and Model.add_script() was fixed to correctly handle multi-GPU configurations. Tests were updated to run on non-local launchers. Documentation was updated and fixed. Also, the default testing interface has been changed to lo instead of ipogif. (SmartSim-PR304)
Typehints have been added. A makefile target make check-mypy executes static analysis with mypy. (SmartSim-PR295, SmartSim-PR301, SmartSim-PR303)
Replace limit_app_cpus with limit_db_cpus for co-located orchestrators. This resolves some incorrect behavior/assumptions about how the application would be pinned. Instead, users should directly specify the binding options in their application using the options appropriate for their launcher (SmartSim-PR306)
Simplify code in random_permutations parameter generation strategy (SmartSim-PR300)
Remove wait time associated with Experiment launch summary (SmartSim-PR298)
Update Redis conf file to conform with Redis v7.0.5 conf file (SmartSim-PR293)
Migrate from redis-py-cluster to redis-py for cluster status checks (SmartSim-PR292)
Update full test suite to no longer require a tensorflow wheel to be available at test time. (SmartSim-PR291)
Correct spelling of colocated in doc strings (SmartSim-PR290)
Deprecated launcher-specific orchestrators, constants, and ML utilities were removed. (SmartSim-PR289)
Relax the coloredlogs version to be greater than 10.0 (SmartSim-PR288)
Update the Github Actions runner image from macos-10.15 to macos-12. The former began deprecation in May 2022 and was finally removed in May 2023. (SmartSim-PR285)
The Fortran tutorials had not been fully updated to show how to handle return/error codes. These have now all been updated. (SmartSim-PR284)
Orchestrator and Colocated DB now accept a list of interfaces to bind to. The argument name is still interface for backward compatibility reasons. (SmartSim-PR281)
Typehints have been added to public APIs. A makefile target to execute static analysis with mypy is available: make check-mypy. (SmartSim-PR295)
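The copy-by-ref defect noted above (SmartSim-PR305) is a general Python pitfall: an object that stores the caller's dictionary directly shares it with every other holder of that dictionary. A minimal sketch of the defensive-copy pattern follows. SettingsSketch is a hypothetical stand-in, not SmartSim's run settings class, and the actual fix may have been implemented differently.

```python
import copy


class SettingsSketch:
    """Hypothetical stand-in for a run settings object."""

    def __init__(self, run_args=None):
        # Deep-copy the input so edits on this object never reach the caller's dict
        self.run_args = copy.deepcopy(run_args) if run_args else {}


shared_args = {"nodes": 1}
first = SettingsSketch(run_args=shared_args)
second = SettingsSketch(run_args=shared_args)

first.run_args["nodes"] = 4  # only affects `first`
print(shared_args, second.run_args)
```

Without the copy, the assignment on first would also change shared_args and second.run_args, since all three names would refer to the same dictionary.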
0.4.2#
Released on April 12, 2023
Description
This release of SmartSim had a focus on polishing and extending existing features already provided by SmartSim. Most notably, this release provides support to allow users to colocate their models with an orchestrator using Unix domain sockets and support for launching models as batch jobs.
Additionally, SmartSim has updated its tool chains to provide a better user experience. Notably, SmartSim can now be used with Python 3.10, Redis 7.0.5, and RedisAI 1.2.7. Furthermore, SmartSim now utilizes SmartRedis’s aggregation lists to streamline the use and extension of ML data loaders, making working with popular machine learning frameworks in SmartSim a breeze.
A full list of changes and detailed notes can be found below:
Add support for colocating an orchestrator over UDS
Add support for Python 3.10, deprecate support for Python 3.7 and RedisAI 1.2.3
Drop support for Ray
Update ML data loaders to make use of SmartRedis’s aggregation lists
Allow for models to be launched independently as batch jobs
Update Redis to version 7.0.5
Add support for RedisAI 1.2.7, PyTorch 1.11.0, Tensorflow 2.8.0, ONNXRuntime 1.11.1
Fix bug in colocated database entrypoint when loading PyTorch models
Fix test suite behavior with environment variables
Detailed Notes
Running some tests could result in some SmartSim-specific environment variables being set. Such environment variables are now reset after each test execution. Also, a warning for environment variable usage in Slurm was added, to make the user aware in case an environment variable will not be assigned the desired value with --export. (SmartSim-PR270)
The PyTorch and TensorFlow data loaders were updated to make use of aggregation lists. This breaks their API, but makes them easier to use. (SmartSim-PR264)
The support for Ray was dropped, as its most recent versions caused problems when deployed through SmartSim. We plan to release a separate add-on library to accomplish the same results. If you are interested in getting the Ray launch functionality back in your workflow, please get in touch with us! (SmartSim-PR263)
Update from Redis version 6.0.8 to 7.0.5. (SmartSim-PR258)
Adds support for Python 3.10 without the ONNX machine learning backend. Deprecates support for Python 3.7 as it will stop receiving security updates. Deprecates support for RedisAI 1.2.3. Update the build process to be able to correctly fetch supported dependencies. If a user attempts to build an unsupported dependency, an error message is shown highlighting the discrepancy. (SmartSim-PR256)
Models were given a [batch_settings]{.title-ref} attribute. When launching a model through [Experiment.start]{.title-ref} the [Experiment]{.title-ref} will first check for a non-nullish value at that attribute. If the check is satisfied, the [Experiment]{.title-ref} will attempt to wrap the underlying run command in a batch job using the object referenced at [Model.batch_settings]{.title-ref} as the batch settings for the job. If the check is not satisfied, the [Model]{.title-ref} is launched in the traditional manner as a job step. (SmartSim-PR245)
Fix bug in colocated database entrypoint stemming from uninitialized variables. This bug affects PyTorch models being loaded into the database. (SmartSim-PR237)
The release of RedisAI 1.2.7 allows us to update support for recent versions of PyTorch, Tensorflow, and ONNX (SmartSim-PR234)
Make installation of correct Torch backend more reliable according to instruction from PyTorch
In addition to TCP, add UDS support for colocating an orchestrator with models. Methods [Model.colocate_db_tcp]{.title-ref} and [Model.colocate_db_uds]{.title-ref} were added to expose this functionality. The [Model.colocate_db]{.title-ref} method remains and uses TCP for backward compatibility (SmartSim-PR246)
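The batch_settings check described in the note for SmartSim-PR245 above can be pictured with a small stand-alone sketch. It does not import SmartSim: SketchModel, launch_plan, and the command strings are hypothetical stand-ins chosen for illustration, and only the decision itself (wrap the run command in a batch job when batch_settings is set, otherwise launch as a job step) is taken from the note.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchModel:
    """Hypothetical stand-in for a SmartSim Model (not the real class)."""

    name: str
    run_command: str
    # Placeholder for a batch settings object; None means "not set".
    batch_settings: Optional[dict] = None


def launch_plan(model: SketchModel) -> str:
    """Describe how the model would be launched under the rule in the note."""
    if model.batch_settings is not None:
        # batch_settings is present: wrap the run command in a batch job.
        return f"batch job {model.batch_settings} wrapping: {model.run_command}"
    # No batch settings: launch in the traditional manner, as a job step.
    return f"job step: {model.run_command}"
```

In this sketch, a model built without batch_settings yields a "job step" plan, while passing any non-None settings object switches the same run command to a "batch job" plan.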
0.4.1#
Released on June 24, 2022
Description: This release of SmartSim introduces a new experimental feature to help make SmartSim workflows more portable: the ability to run simulation models in a container via Singularity. This feature has been tested on a small number of platforms and we encourage users to provide feedback on its use.
We have also made improvements in a variety of areas: new utilities to load scripts and machine learning models into the database directly from SmartSim driver scripts and install-time choice to use either [KeyDB]{.title-ref} or [Redis]{.title-ref} for the Orchestrator. The [RunSettings]{.title-ref} API is now more consistent across subclasses. Another key focus of this release was to aid new SmartSim users by including more extensive tutorials and improving the documentation. The docker image containing the SmartSim tutorials now also includes a tutorial on online training.
Launcher improvements
New methods for specifying [RunSettings]{.title-ref} parameters (SmartSim-PR166) (SmartSim-PR170)
Better support for [mpirun]{.title-ref}, [mpiexec]{.title-ref}, and [orterun]{.title-ref} as launchers (SmartSim-PR186)
Experimental: add support for running models via Singularity (SmartSim-PR204)
Documentation and tutorials
Tutorial updates (SmartSim-PR155) (SmartSim-PR203) (SmartSim-PR208)
Add SmartSim Zoo info to documentation (SmartSim-PR175)
New tutorial for demonstrating online training (SmartSim-PR176) (SmartSim-PR188)
General improvements and bug fixes
Set models and scripts at the driver level (SmartSim-PR185)
Optionally use KeyDB for the orchestrator (SmartSim-PR180)
Ability to specify system-level libraries (SmartSim-PR154) (SmartSim-PR182)
Fix the handling of LSF gpus_per_shard (SmartSim-PR164)
Fix error when re-running [smart build]{.title-ref} (SmartSim-PR165)
Fix generator hanging when tagged configuration variables are missing (SmartSim-PR177)
Dependency updates
CMake version from 3.10 to 3.13 (SmartSim-PR152)
Update click to 8.0.2 (SmartSim-PR200)
0.4.0#
Released on Feb 11, 2022
Description: In this release SmartSim continues to promote ease of use. To this end SmartSim has introduced new portability features that allow users to abstract away their targeted hardware, while providing even more compatibility with existing libraries.
A new feature, co-located orchestrator deployments, has been added. It provides scalable online inference capabilities that overcome previous performance limitations in separated orchestrator/application deployments. For more information on advantages of co-located deployments, see the Orchestrator section of the SmartSim documentation.
The SmartSim build was significantly improved to increase customization of the build toolchain, and the smart command line interface was expanded.
Additional tweaks and upgrades have also been made to ensure an optimal experience. Here is a comprehensive list of changes made in SmartSim 0.4.0.
Orchestrator Enhancements:
Add Orchestrator Co-location (SmartSim-PR139)
Add Orchestrator configuration file edit methods (SmartSim-PR109)
Emphasize Driver Script Portability:
Add ability to create run settings through an experiment (SmartSim-PR110)
Add ability to create batch settings through an experiment (SmartSim-PR112)
Add automatic launcher detection to experiment portability functions (SmartSim-PR120)
Expand Machine Learning Library Support:
Data loaders for online training in Keras/TF and Pytorch (SmartSim-PR115) (SmartSim-PR140)
ML backend versions updated with expanded support for multiple versions (SmartSim-PR122)
Launch Ray internally using RunSettings (SmartSim-PR118)
Add Ray cluster setup and deployment to SmartSim (SmartSim-PR50)
Expand Launcher Setting Options:
Add ability to use base RunSettings on Slurm or PBS launchers (SmartSim-PR90)
Add ability to use base RunSettings on LSF launcher (SmartSim-PR108)
Deprecations and Breaking Changes
Orchestrator classes combined into single implementation for portability (SmartSim-PR139)
smartsim.constants changed to smartsim.status (SmartSim-PR122)
smartsim.tf migrated to smartsim.ml.tf (SmartSim-PR115) (SmartSim-PR140)
TOML configuration option removed in favor of environment variable approach (SmartSim-PR122)
General Improvements and Bug Fixes:
Improve and extend parameter handling (SmartSim-PR107) (SmartSim-PR119)
Abstract away non-user facing implementation details (SmartSim-PR122)
Add various dimensions to the CI build matrix for SmartSim testing (SmartSim-PR130)
Add missing functions to LSFSettings API (SmartSim-PR113)
Add RedisAI checker for installed backends (SmartSim-PR137)
Remove heavy and unnecessary dependencies (SmartSim-PR116) (SmartSim-PR132)
Fix LSFLauncher and LSFOrchestrator (SmartSim-PR86)
Fix over greedy Workload Manager Parsers (SmartSim-PR95)
Fix Slurm handling of comma-separated env vars (SmartSim-PR104)
Fix internal method calls (SmartSim-PR138)
Documentation Updates:
Updates to documentation build process (SmartSim-PR133) (SmartSim-PR143)
Updates to documentation content (SmartSim-PR96) (SmartSim-PR129) (SmartSim-PR136) (SmartSim-PR141)
Update SmartSim Examples (SmartSim-PR68) (SmartSim-PR100)
0.3.2#
Released on August 10, 2021
Description:
Upgraded RedisAI backend to 1.2.3 (SmartSim-PR69)
PyTorch 1.7.1, TF 2.4.2, and ONNX 1.6-7 (SmartSim-PR69)
LSF launcher for IBM machines (SmartSim-PR62)
Improved code coverage by adding more unit tests (SmartSim-PR53)
Orchestrator methods to get address and check status (SmartSim-PR60)
Added Manifest object that tracks deployables in Experiments (SmartSim-PR61)
Bug fixes (SmartSim-PR52) (SmartSim-PR58) (SmartSim-PR67) (SmartSim-PR73)
Updated documentation and examples (SmartSim-PR51) (SmartSim-PR57) (SmartSim-PR71)
Improved IP address acquisition (SmartSim-PR72)
Binding database to network interfaces
0.3.1#
Released on May 5, 2021
Description: This release was dedicated to making the install process easier. SmartSim can be installed from PyPI now and the smart cli tool makes installing the machine learning runtimes much easier.
Pip install (SmartSim-PR42)
smart cli tool for ML backends (SmartSim-PR42)
Build Documentation for updated install (SmartSim-PR43)
Migrate from Jenkins to Github Actions CI (SmartSim-PR42)
Bug fix for setup.cfg (SmartSim-PR35)
0.3.0#
Released on April 1, 2021
Description:
initial 0.3.0 (first public) release of SmartSim
SmartRedis#
0.6.1#
Released on 27 September, 2024
Description
Fix RedisAI build to allow for compilation with GCC-14
Fix a memory leak in the Fortran Dataset implementation
Detailed Notes
Fix RedisAI build to allow for compilation with GCC-14. Also, we only use the Torch backend and change the compilation of RedisAI to use CMake (like SmartSim) (PR518)
The dataset object, if used in a loop, would leave memory dangling. To alleviate this, a final procedure has been implemented. Fortran compilers, however, are notoriously bad at detecting when an object goes out of scope and destroying it automatically. We thus also provide an explicit destructor procedure. (PR514)
0.6.0#
Released on 25 September, 2024
Description
Fix instructions for including SmartRedis as an ExternalProject in CMake-based projects
Include algorithm import in rediscluster for gcc-14 and updated github artifact version
Touch-up outdated information in README.md
Update codecov to v4.5.0 for github actions
Remove broken oss.redis.com URLs from documentation
Add option to allow SmartRedis Fortran library to retain the path to the main client library
Update examples and tests to use find_package(smartredis)
Generate config files necessary to allow CMake projects to add SmartRedis via find_package
Allow users to specify install location of SmartRedis libraries
Streamline compilation of SmartRedis dependencies
Pin NumPy version to 1.x
Detailed Notes
Instructions for including SmartRedis as a CMake ExternalProject had a couple of missing closing parentheses and a typo in the definition of the libsmartredis-fortran block (PR503)
Include algorithm import in rediscluster.h to satisfy gcc-14 compilation error. (PR505)
Update github actions to upload-artifact@v3 and download-artifact@v3 (PR505) (PR511) (PR512)
Update links to install documentation and remove outdated version numbers in the README.md (PR501)
Update codecov to v4.5.0 for github actions (PR502)
As part of this cleanup, some behaviors of how the libraries were named have been removed. The testing suite now distinguishes between various build types (e.g. Debug, Coverage, etc.) by specifying the CMAKE_INSTALL_PREFIX instead of appending it as part of the name of the library itself. (PR497)
The SmartRedis Fortran library now by default will retain the path to the SmartRedis C/C++ library. This should avoid occasional problems where users were getting “library not found” errors if they had moved libraries post-installation (PR497)
All the examples and tests now use the find_package functionality to set up linking flags (PR497)
The install process now generates package configuration files for the C/C++ SmartRedis library and the Fortran SmartRedis library. Users can use the find_package() command in their CMakeLists.txt to set up the linking and include flags automatically (PR497)
The CMakeLists.txt for SmartRedis now includes the install commands which allow users to specify the specific install prefix to install the SmartRedis libraries, header files, and Fortran .mod files (PR497)
hiredis, redis++, and pybind are now retrieved and installed in CMakeLists.txt instead of in the Makefile. This decouples the user-facing side of SmartRedis from the Makefile, which now can be used purely as a convenient interface to compile SmartRedis with various options and coordinate testing (PR497)
The new major version release of Numpy is incompatible with modules compiled against Numpy 1.x. For both SmartSim and SmartRedis we request a 1.x version of numpy. This is needed in SmartSim because some of the downstream dependencies request NumPy. (PR498)
Ensure errors raised from client include details
0.5.3#
Released on 14 May 2024
Description
Improve client error logging
Fix pylint regression error
Fix build wheel error
Fix header styling issue
Correct changelog indention
Automate the creation of release notes
Auto-post release PR to develop from master
Upgrade ubuntu to 22.04 and gcc to 11
Drop Python 3.8 support
Fix C++ cosmetic defects leading to compiler warnings
Re-enable SR_PEDANTIC for the Makefile targets
Enforce changelog updates
Removed unused TensorBase constructor parameter
Remove unused parameter in internal redis cluster method
Enforce matching TensorType for DataSet::unpack_tensor()
Update CI for Intel suite
Add socket time out environment variable
Fix inconsistency in C-API ConfigOptions is_configured() parameters
Bump redis dep to 7.2.4
Fix widths field for list-table in install documentation
Remove a vestigial requirements.txt file
Detailed Notes
Ensure errors raised from client include details (PR485)
Pin pylint to fix regression error (PR492)
Add cstdint import to fix ubuntu with gcc wheel build (PR491)
Incorrect lineup of the changelog page index. This fixes the header sizes to avoid this issue. (PR489)
After converting from rst to md, readthedocs began throwing indention errors in old release info. This fixes the styling. (PR488)
Add a configuration file to the root of .github/ to configure the generated release notes. (PR487)
Add to github release workflow to auto generate a pull request from master into develop for release. (PR486)
After dropping support for Python 3.8, ubuntu and gcc need to be upgraded. (PR484)
Python 3.8 is reaching its end-of-life in October, 2024, so it will no longer continue to be supported. (PR482)
Fixes some mainly cosmetic defects in the C++ client that were leading to warnings when pedantic compiler flags were enabled (PR476)
Re-enable SR_PEDANTIC for the [test-lib]{.title-ref} and [test-lib-with-fortran]{.title-ref} Makefile targets (PR476)
Add Github Actions workflow that checks if changelog is edited on pull requests into develop. (PR480)
The TensorBase constructor SRMemoryLayout parameter was removed because it was not used. It is not needed as a member variable because all Tensor<T> objects store internal representations in contiguous memory. (PR479)
Client::unpack_tensor() enforces that the user-provided TensorType matches the known tensor type. Now DataSet::unpack_tensor() enforces the same condition. (PR478)
Removes an unused parameter in the RedisCluster::_get_model_script_db() method. (PR477)
Version numbers changed for the Intel Compiler chain, which led to the C and C++ compilers not being available. Now, the entirety of the Base and HPC kits are installed to ensure consistent versions. (PR475)
Add the socket timeout parameter as a user-configurable option via environment variables. (PR474)
Fix an inconsistency in the C-API ConfigOptions is_configured() parameter names. (PR471)
Fix an issue where incorrect compiler flags are defined and result in build failures due to the redis_fstat macro. (PR470)
Fix wrong widths value which was preventing table from displaying. (PR468)
The requirements.txt file was unused and has been removed. (PR462)
0.5.2#
Released on February 16, 2024
Description
Fixed bug which was sending tensors to the database twice (Python Client)
Detailed Notes
A previous bug fix for the Python client which addressed a problem when sending numpy views inadvertently kept the original put_tensor call in place. This essentially doubles the cost of the operation. (PR464)
0.5.1#
Released on February 15, 2024
Description
Fix bug when sending an array view
Add concurrency groups for Github Action testing
Update license to include 2024
Increase build space for Github Actions
Update README python versions
Expose Typehints
Update supported python versions [Add 3.11, remove 3.7]
Tweak the build system to enable building SmartRedis with Nvidia’s NVHPC toolchain
Improvements/upgrades to the container used for Github actions
Code updates to avoid compiler warnings
Added developer documentation on how to run a single test case and eliminated duplicative environment variables
Resolve a linting issue with pybind-to-python error propagation
Use mutable fields to enable Dataset get methods that store memory to be marked const
Detailed Notes
Detect whether the tensor the user is sending is a view and if so, make an explicit copy. (PR453)
Add support for concurrency groups in the [run_tests]{.title-ref} workflow. (PR456)
Update license to include 2024. (PR454)
Add new Github Action that removes unneeded packages and resizes the root disk space. (PR455)
Update developer documentation to reflect newly supported versions of Python (PR450) (PR452)
Add and ship [py.typed]{.title-ref} marker to expose inline type hints (PR451)
Deprecate support for Python 3.7 by removing from the allowed Python versions (PR450)
Update Python package dependencies to add support for Python 3.11 (PR450)
Change the order of arguments in our Makefile to ensure that all dependencies are compiled with GCC (PR448)
Add new user-configurable parameters DEP_CC, DEP_CXX to control which compiler is used to build dependencies (PR448)
Ameliorate some compiler warnings that were flagged in GCC 12 (unreachable code blocks, signed/unsigned mismatches) (PR448)
CI/CD: Bump the container version used in Github Actions to Ubuntu 22.04 to be able to start testing GCC 12 (PR448)
CI/CD: Bump the versions of GCC used in testing to the currently maintained versions (PR448)
CI/CD: Add NVHPC to the testing matrix (PR448)
CI/CD: Test the shared/static compilations and examples with all compilers (PR448)
CI/CD: Compile Redis and RedisAI and use those versions in testing instead of extracting from a container (PR448)
CI/CD: Bump the version of Redis used in testing to 7.0.5, the same version as we use with SmartSim (PR448)
CI/CD: Pin the Torch version to 1.11.0, the same as supported in SmartSim (PR448)
Added developer documentation on how to run a single test case with the new test/build system and eliminated use of SMARTREDIS_TEST_DEVICE and SMARTREDIS_TEST_CLUSTER environment variables (PR445)
Resolve a linting issue with pybind-to-python error propagation by changing import format and narrowing the lookup of pybind error names to the error module (PR444)
Use mutable fields to enable Dataset get methods that store memory to be marked const (PR443)
0.5.0#
Released on December 18, 2023
Description
Unpin the Intel Fortran compiler in CI/CD
Added a missing space in an error message
Improved consistency of namespace declarations for C++ pybind interface
Improved const correctness of C++ Client
Improved const correctness of C++ Dataset
Updated documentation
Added test cases for all Client construction parameter combinations
Centralized dependency tracking to setup.cfg
Improved robustness of Python client construction
Updated Client and Dataset documentation
Expanded list of allowed characters in the SSDB address
Added coverage to SmartRedis Python API functions
Improved responsiveness of library when attempting connection to missing backend database
Moved testing of examples to on-commit testing in CI/CD pipeline
Added name retrieval function to the DataSet object
Updated RedisAI version used in post-commit check-in testing in Github pipeline
Allow strings in Python interface for Client.run_script, Client.run_script_multiGPU
Improved support for model execution batching
Added support for model chunking
Updated the third-party RedisAI component
Updated the third-party lcov component
Added link to contributing guidelines
Added support for multiple backend databases via a new Client constructor that accepts a ConfigOptions object
Detailed Notes
Unpin the Intel Fortran compiler in CI/CD. This requires running the compiler setup script twice, once for Fortran and once for other languages, since they’re on different releases (PR436)
Added a missing space in an error message (PR435)
Made the declaration of the py namespace in py*.h consistently outside the SmartRedis namespace declaration (PR434)
Fields in several C++ API methods are now properly marked as const (PR430)
The Dataset add_tensor method is now const correct, as are all the internal methods it calls (PR427)
Some broken links in the documentation were fixed, and the instructions to run the tests were updated (PR423)
Added test cases for all Client construction parameter combinations (PR422)
Merged dependency lists from requirements.txt and requirements-dev.txt into setup.cfg to have only one set of dependencies going forward (PR420)
Improved robustness of Python client construction by adding detection of invalid kwargs (PR419), (PR421)
Updated the Client and Dataset API documentation to clarify which interacts with the backend db (PR416)
The SSDB address can now include ‘-’ and ‘_’ as special characters in the name. This gives users more options for naming the UDS socket file (PR415)
Added tests to increase Python code coverage
Employed a Redis++ ConnectionsObject in the connection process to establish a TCP timeout of 100ms during connection attempts (PR413)
Moved testing of examples to on-commit testing in CI/CD pipeline (PR412)
Added a function to the DataSet class and added a test
Updated RedisAI version used in post-commit check-in testing in Github pipeline to a version that supports fetch of model chunking size (PR408)
Allow users to pass single keys for the inputs and outputs parameters as a string for Python run_script and run_script_multigpu
Exposed access to the Redis.AI MINBATCHTIMEOUT parameter, which limits the delay in model execution when trying to accumulate multiple executions in a batch (PR406)
Models will now be automatically chunked when sent to/received from the backend database. This allows use of models greater than 511MB in size. (PR404)
Updated from RedisAI v1.2.3 (test target)/v1.2.4 and v1.2.5 (CI/CD pipeline) to v1.2.7 (PR402)
Updated lcov from version 1.15 to 2.0 (PR396)
Create CONTRIBUTIONS.md file that points to the contribution guideline for both SmartSim and SmartRedis (PR395)
Migrated to ConfigOptions-based Client construction, adding multiple database support (PR353)
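The model chunking mentioned in the note for PR404 above follows a general pattern: split a large binary blob into fixed-size pieces for transfer, then join the pieces on the other side. The sketch below illustrates only that pattern. It is not SmartRedis code; the function names and the tiny CHUNK_SIZE are hypothetical, and the chunk size the library actually uses is not stated in these notes.

```python
from typing import List

# Hypothetical chunk size, kept tiny so the behavior is easy to inspect.
CHUNK_SIZE = 4


def split_into_chunks(blob: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Split a serialized model into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]


def reassemble(chunks: List[bytes]) -> bytes:
    """Join chunks back into the original blob, preserving their order."""
    return b"".join(chunks)
```

Because the last chunk may be shorter than the others, the receiver only needs the chunks in order to recover the original bytes exactly.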
0.4.2#
Released on September 13, 2023
Description
Reduced number of suppressed lint errors
Expanded documentation of aggregation lists
Updated third-party software dependencies to current versions
Updated post-merge tests in CI/CD to work with new test system
Enabled static builds of SmartRedis
Improve robustness of test runs
Fixed installation link
Updated supported languages documentation
Removed obsolete files
Added pylint to CI/CD pipeline and mitigate existing errors
Improved clustered redis initialization
Detailed Notes
Refactor factory for ConfigOptions to avoid using protected member outside an instance (PR393)
Added a new advanced topics documentation page with a section on aggregation lists (PR390)
Updated pybind (2.10.3 => 2.11.1), hiredis (1.1.0 => 1.2.0), and redis++ (1.3.5 => 1.3.10) dependencies to current versions (PR389)
Post-merge tests in CI/CD have been updated to interface cleanly with the new test system that was deployed in the previous release (PR388)
Static builds of SmartRedis can now work with Linux platforms. Fortran is tested with GNU, PGI, Intel compilers (PR386)
Preserve the shell output of test runs while making sure that server shutdown happens unconditionally (PR381)
Fix incorrect link to installation documentation (PR380)
Update language support matrix in documentation to reflect updates from the last release (PR379)
Fix typo causing startup failure in utility script for unit tests (PR378)
Update pylint configuration and version, mitigate most errors, execute in CI/CD pipeline (PR371, PR382)
Deleted obsolete build and testing files that are no longer needed with the new build and test system (PR366)
Reuse existing redis connection when mapping the Redis cluster (PR364)
0.4.1#
Released on July 5, 2023
Description
This release revamps the build and test systems for SmartRedis as well as improving compatibility with different Fortran compilers and laying the groundwork for future support for interacting with multiple concurrent backend databases:
Documentation improvements
Improved compatibility of type hints with third-party software
Added type hints to the Python interface layer
Add support for Python 3.10
Updated setup.py to work with the new build system
Remove unneeded method from Python SRObject class
Fixed a memory leak in the C layer
Revamp SmartRedis test system
Remove debug output in pybind layer
Update Hiredis version to 1.1.0
Enable parallel build for the SmartRedis examples
Experimental support for Nvidia toolchain
Major revamp of build and test systems for SmartRedis
Refactor Fortran methods to return default logical kind
Update CI/CD tests to use a modern version of MacOS
Fix the spelling of the Dataset destructor’s C interface (now DeallocateDataSet)
Update Redis++ version to 1.3.8
Refactor third-party software dependency installation
Add pip-install target to Makefile to automate this process going forward (note: this was later removed)
Added infrastructure for multiDB support
Detailed Notes
Assorted updates and clarifications to the documentation (PR367)
Turn [ParamSpec]{.title-ref} usage into forward references to not require [typing-extensions]{.title-ref} at runtime (PR365)
Added type hints to the Python interface layer (PR361)
List Python 3.10 support and loosen PyTorch requirement to allow for versions that support Python 3.10 (PR360)
Streamlined setup.py to simplify Python install (PR359)
Remove from_pybind() from Python SRObject class as it’s not needed and didn’t work properly anyway (PR358)
Fixed memory leaked from the C layer when calling get_string_option() (PR357)
Major revamp to simplify use of SmartRedis test system, automating most test processes (PR356)
Remove debug output in pybind layer associated with put_dataset (PR352)
Updated to the latest version of Hiredis (1.1.0) (PR351)
Enable parallel build for the SmartRedis examples by moving utility Fortran code into a small static library (PR349)
For the NVidia toolchain only: Replaces the assumed rank feature of F2018 used in the Fortran client with assumed shape arrays, making it possible to compile SmartRedis with the Nvidia toolchain. (PR346)
Rework the build and test system to improve maintainability of the library. There have been several significant changes, including that Python and Fortran clients are no longer built by default and that there are Make variables that customize the build process. Please review the build documentation and make help to see all that has changed. (PR341)
Many Fortran routines were returning logical kind = c_bool which turns out not to be the same default kind of most Fortran compilers. These have now been refactored so that users need not import [iso_c_binding]{.title-ref} in their own applications (PR340)
Update MacOS version in CI/CD tests from 10.15 to 12.0 (PR339)
Correct the spelling of the C DataSet destruction interface from DeallocateeDataSet to DeallocateDataSet (PR338)
Updated the version of Redis++ to v1.3.8 to pull in a change that ensures the redis++.pc file properly points to the generated libraries (PR334)
Third-party software dependency installation is now handled in the Makefile instead of separate scripts
New pip-install target in Makefile will be a dependency of the lib target going forward so that users don’t have to manually pip install SmartRedis in the future (PR330)
Added ConfigOptions class and API, which will form the backbone of multiDB support (PR303)
0.4.0#
Released on April 12, 2023
Description
This release provides a variety of features to improve usability and debugging of the SmartRedis library, notably including Unix domain socket support, logging, the ability to print a textual representation of a string or dataset, dataset inspection, documentation updates, fixes to the multi-GPU support, and much more:
Prepare 0.4.0 release
Disable codecov CI tests
Improved error message in to_string methods in C interface
Streamlined PyBind interface layer
Updated Python API documentation
Streamlined C interface layer
Improved performance of get, put, and copy dataset methods
Fix a bug which prevented multi-GPU model set in some cases
Streamline pipelined execution of tasks for backend database
Enhance code coverage to include all 4 languages supported by SmartRedis
Fix a bug which resulted in wrong key prefixing when retrieving aggregation lists in ensembles
Correct assorted API documentation errors and omissions
Improve documentation of exception handling in Redis server classes
Improve error handling for setting of scripts and models
Add support to inspect the dimensions of a tensor via get_tensor_dims()
Split dataset prefixing control from use_tensor_ensemble_prefix() to use_dataset_ensemble_prefix()
Update to the latest version of redis-plus-plus
Update to the latest version of PyBind
Change documentation theme to sphinx_book_theme and fix doc strings
Add print capability for Client and DataSet
Add support for inspection of tensors and metadata inside datasets
Add support for user-directed logging for Python clients, using Client, Dataset, or LogContext logging methods
Add support for user-directed logging for C and Fortran clients without a Client or Dataset context
Additional error reporting for connections to and commands run against Redis databases
Improved error reporting capabilities for Fortran clients
Python error messages from SmartRedis contain more information
Added logging functionality to the SmartRedis library
A bug related to thread pool initialization was fixed.
This version adds new functionality in the form of support for Unix Domain Sockets.
Fortran client can now be optionally built with the rest of the library
Initial support for dataset conversions, specifically Xarray.
Detailed Notes
Update docs and version numbers in preparation for version 0.4.0. Clean up duplicate marking of numpy dependency (PR321)
Remove codecov thresholds to avoid commits being marked as ‘failed’ due to coverage variance (PR317)
Corrected the error message in to_string methods in C interface to not overwrite the returned error message and to name the function (PR320)
Streamlined PyBind interface layer to reduce repetitive boilerplate code (PR315)
Updated Python API summary table to include new methods (PR313)
Streamlined C interface layer to reduce repetitive boilerplate code (PR312)
Leveraged Redis pipelining to improve performance of get, put, and copy dataset methods (PR311)
Redis::set_model_multigpu() will now upload the correct model to all GPUs (PR310)
RedisCluster::_run_pipeline() will no longer unconditionally apply a retry wait before returning (PR309)
Expand code coverage to all four languages and make the CI/CD more efficient (PR308)
An internal flag was set incorrectly, which resulted in wrong key prefixing when accessing (retrieving or querying) lists created in ensembles (PR306)
Corrected a variety of Doxygen errors and omissions in the API documentation (PR305)
Added throw documentation for exception handling in redis.h, redisserver.h, rediscluster.h (PR301)
Added error handling for a rare edge condition when setting scripts and models (PR300)
Added support to inspect the dimensions of a tensor via new get_tensor_dims() method (PR299)
The use_tensor_ensemble_prefix() API method no longer controls whether datasets are prefixed. A new API method, use_dataset_ensemble_prefix() now manages this. (PR298)
Updated from redis-plus-plus v1.3.2 to v1.3.5 (PR296)
Updated from PyBind v2.6.2 to v2.10.3 (PR295)
Change documentation theme to sphinx_book_theme to match SmartSim documentation theme and fix Python API doc string errors (PR294)
Added print capability for Client and DataSet to give detailed diagnostic information for debugging (PR293)
Added support for retrieval of names and types of tensors and metadata inside datasets (PR291)
Added support for user-directed logging for Python clients via {Client, Dataset, LogContext}.{log_data, log_warning, log_error} methods (PR289)
Added support for user-directed logging without a Client or Dataset context to C and Fortran clients via _string() methods (PR288)
Added logging to capture transient errors that arise in the _run() and _connect() methods of the Redis and RedisCluster classes (PR287)
Tweak direct testing of Redis and RedisCluster classes (PR286)
Resolve a disparity in the construction of Python client and database classes (PR285)
Fortran clients can now access error text and source location (PR284)
Add exception location information from CPP code to Python exceptions (PR283)
Added client activity and manual logging for developer use (PR281)
Fix thread pool error (PR280)
Update library linking instructions and update Fortran tester build process (PR277)
Added add_metadata_for_xarray and transform_to_xarray methods in the DatasetConverter class for initial support with Xarray (PR262)
Change Dockerfile to use Ubuntu 20.04 LTS image (PR276)
Implemented support for Unix Domain Sockets, including refactoring of server address code, test cases, and check-in tests. (PR252)
A new make target, make lib-with-fortran, now compiles the Fortran client and dataset into its own library which applications can link against (PR245)
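The RedisCluster::_run_pipeline() entry above (PR309) describes a retry wait that is no longer applied unconditionally. The following is a minimal generic sketch of that pattern, not the SmartRedis implementation: the names run_with_retries and TransientError, and the default attempt count and wait time, are all hypothetical. It waits only between a failed attempt and the next one, so neither a successful call nor the final failure pays for a sleep.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a recoverable backend error."""


def run_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 3,
    wait_seconds: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, waiting only between a failure and the next attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            # Success returns immediately: no wait is applied on this path.
            return operation()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: re-raise without a pointless final wait.
                raise
            sleep(wait_seconds)
    raise AssertionError("unreachable")
```

The sleep function is injectable so that the wait behavior can be checked in a test without actually pausing.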
0.3.1#
Released on June 24, 2022
Description
Version 0.3.1 adds new functionality in the form of DataSet aggregation lists for pipelined retrieval of data, convenient support for multiple GPUs, and the ability to delete scripts and models from the backend database. It also introduces multithreaded execution for certain tasks that span multiple shards of a clustered database, and it incorporates a variety of internal improvements that will enhance the library going forward.
Detailed Notes
Implemented DataSet aggregation lists in all client languages, for pipelined retrieval of data across clustered and non-clustered backend databases. (PR258) (PR257) (PR256) (PR248) New commands are:
append_to_list()
delete_list()
copy_list()
rename_list()
get_list_length()
poll_list_length()
poll_list_length_gte()
poll_list_length_lte()
get_datasets_from_list()
get_dataset_list_range()
use_list_ensemble_prefix()
Implemented multithreaded execution for parallel dataset list retrieval on clustered databases. The number of threads devoted to this purpose is controlled by the new environment variable SR_THREAD_COUNT. The value defaults to 4, but may be any positive integer or the special value zero, which causes the SmartRedis runtime to allocate one thread for each available hardware context. (PR251) (PR246)
Augmented support for GPUs by implementing multi-GPU convenience functions for all client languages. (PR254) (PR250) (PR244) New commands are:
set_model_from_file_multigpu()
set_model_multigpu()
set_script_from_file_multigpu()
set_script_multigpu()
run_model_multigpu()
run_script_multigpu()
delete_model_multigpu()
delete_script_multigpu()
Added API calls for all clients to delete models and scripts from the backend database. (PR240) New commands are:
delete_script()
delete_model()
Updated the use of backend RedisAI API calls to discontinue use of deprecated methods for model selection (AI.MODELSET) and execution (AI.MODELRUN) in favor of current methods AI.MODELSTORE and AI.MODELEXECUTE, respectively. (PR234)
SmartRedis will no longer call the C runtime method srand() to ensure that it does not interfere with random number generation in client code. It now uses a separate instance of the C++ random number generator. (PR233)
Updated the definition of the Fortran enum_kind type in the fortran_c_interop module to better comply with the Fortran standard and not interfere with GCC 6.3.0. (PR231)
Corrected the spelling of the word “command” in a few error message strings. (PR221)
SmartRedis now requires a CMake version 3.13 or later in order to utilize the add_link_options CMake command. (PR217)
Updated and improved the documentation of the SmartRedis library. In particular, a new SmartRedis Integration Guide provides an introduction to using the SmartRedis library and integrating it with existing software. (PR261) (PR260) (PR259) (SSPR214)
Added clustered Redis testing to automated GitHub check-in testing. (PR239)
Updated the SmartRedis internal API for building commands for the backend database. (PR223) This change should not be visible to clients.
The SmartRedis example code is now validated through the automated GitHub check-in process. This will help ensure that the examples do not fall out of date. (PR220)
Added missing copyright statements to CMakeLists.txt and the SmartRedis examples. (PR219)
Updated the C++ test coverage to ensure that all test files are properly executed when running “make test”. (PR218)
Fixed an internal naming conflict between a local variable and a class member variable in the DataSet class. (PR215) This should not be visible to clients.
Updated the internal documentation of methods in SmartRedis C++ classes with the override keyword to improve compliance with the latest C++ standards. (PR214) This change should not be visible to clients.
Renamed variables internally to more cleanly differentiate between names that are given to clients for tensors, models, scripts, datasets, etc., and the keys that are used when storing them in the backend database. (PR213) This change should not be visible to clients.
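The thread-count rule described in the multithreaded execution note above (default of 4, any positive integer, or zero for one thread per hardware context) can be sketched as follows. This is an illustration only, not the SmartRedis runtime code: the function name resolve_thread_count is hypothetical, the variable is assumed to be spelled SR_THREAD_COUNT, and raising ValueError on negative or non-numeric input is an assumption, since the notes do not say how invalid values are handled.

```python
import os
from typing import Mapping, Optional

DEFAULT_THREAD_COUNT = 4  # default stated in the release notes


def resolve_thread_count(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "SR_THREAD_COUNT",  # assumed spelling of the variable
) -> int:
    """Return the number of worker threads implied by the environment."""
    env = os.environ if env is None else env
    raw = env.get(var_name)
    if raw is None or raw.strip() == "":
        # Variable unset or empty: fall back to the documented default.
        return DEFAULT_THREAD_COUNT
    try:
        value = int(raw)
    except ValueError:
        # Assumption: reject non-numeric input instead of guessing.
        raise ValueError(f"{var_name} must be an integer, got {raw!r}") from None
    if value < 0:
        # Assumption: negative counts are treated as an error.
        raise ValueError(f"{var_name} must be zero or positive, got {value}")
    if value == 0:
        # Zero means one thread per available hardware context.
        return os.cpu_count() or 1
    return value
```

Passing a plain dictionary as env keeps the function easy to test; calling it with no arguments reads the real process environment.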
0.3.0#
Released on February 11, 2022
Description
Improve error handling across all SmartRedis clients (PR159) (PR191) (PR199) (PR205) (PR206). Includes changes to C and Fortran function prototypes that are not backwards compatible. Includes changes to error class names and enum type names that are not backwards compatible.
Add poll_dataset functionality to all SmartRedis clients (PR184). Due to other breaking changes made in this release, applications using methods other than poll_dataset to check for the existence of a dataset should now use poll_dataset.
Add environment variables to control client connection and command timeout behavior (PR194)
Add AI.INFO command to retrieve statistics on scripts and models via Python and C++ clients (PR197)
Create a Dockerfile for SmartRedis (PR180)
Update redis-plus-plus version to 1.3.2 (PR162)
Internal client performance and API improvements (PR138) (PR141) (PR163) (PR203)
Expose Redis FLUSHDB, CONFIG GET, CONFIG SET, and SAVE commands to the Python client (PR139) (PR160)
Extend inverse CRC16 prefixing to all hash slots (PR161)
Improve backend dataset representation to enable performance optimization (PR195)
Simplify SmartRedis build process (PR189)
Fix zero-length array transfer in Fortran convert_char_array_to_c (PR170)
Add continuous integration for all SmartRedis tests (PR165) (PR173) (PR177)
Update SmartRedis documentation and examples (PR202) (PR208) (PR210)
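For background on the CRC16 prefixing entry above (PR161): Redis Cluster assigns a key to one of 16384 hash slots by taking the CRC16 (XMODEM variant) of the key modulo 16384, and when the key contains a non-empty {tag}, only the tag is hashed. The sketch below computes a slot and then searches for a tag that lands in a chosen slot. Note that it uses a plain brute-force search in place of the inverse CRC16 computation named in the entry, and the function names key_slot and find_prefix_for_slot are hypothetical.

```python
import binascii
from typing import Optional

HASH_SLOTS = 16384  # number of hash slots in a Redis cluster


def key_slot(key: bytes) -> int:
    """Return the Redis Cluster hash slot for `key`, honoring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag replaces the full key as the hashed content.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC16-XMODEM variant.
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


def find_prefix_for_slot(
    target_slot: int, max_tries: int = 1_000_000
) -> Optional[bytes]:
    """Brute-force a `{tag}` prefix whose keys hash to `target_slot`.

    This is a search, not an inverse CRC16: it tries decimal strings
    until one lands in the requested slot, or returns None on give-up.
    """
    if not 0 <= target_slot < HASH_SLOTS:
        raise ValueError("target_slot must be in [0, 16384)")
    for i in range(max_tries):
        tag = str(i).encode("ascii")
        if key_slot(tag) == target_slot:
            return b"{" + tag + b"}"
    return None
```

Any key that starts with the returned prefix, for example the prefix followed by a tensor name, maps to the same slot, which is what allows related keys to be placed on a chosen shard.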
0.2.0#
Released on August 5, 2021
Description
Improved tensor memory management in the Python client (PR70)
Improved metadata serialization and removed protobuf dependency (PR61)
Added unit testing infrastructure for the C++ client (PR96)
Improve command execution fault handling (PR65) (PR97) (PR105)
Added copy, rename, and delete tensor and DataSet commands in the Python client (PR66)
Upgrade to RedisAI 1.2.3 (PR101)
Fortran and C interface improvements (PR93) (PR94) (PR95) (PR99)
Add Redis INFO command execution to the Python client (PR83)
Add Redis CLUSTER INFO command execution to the Python client (PR105)
0.1.1#
Released on May 5, 2021
Description
0.1.0#
Released on April 1, 2021
Description
Initial 0.1.0 release of SmartRedis
SmartDashboard#
0.0.4#
Released on 14 May 2024
Description
Fix header styling issue (SmartDashboard-PR55)
Automate the creation of release notes (SmartDashboard-PR54)
Add database telemetry documentation. (SmartDashboard-PR52)
Auto-post release PR to develop from master (SmartDashboard-PR53)
Decrease the pinned version of Pydantic (SmartDashboard-PR51)
Bump version to 0.0.4, exclude streamlit version 1.31.X (SmartDashboard-PR50)
Drop Python 3.8 support, add 3.11 support. (SmartDashboard-PR49)
Add Database Telemetry page. (SmartDashboard-PR38)
Add GitHub Actions workflow that checks if changelog is edited on pull requests into develop. (SmartDashboard-PR47)
Add manifest file tracking. (SmartDashboard-PR46)
0.0.3#
Released on 15 February 2024
Description
Added defined schemas for entity objects. (SmartDashboard-PR31)
Added experiment level logs to the dashboard. (SmartDashboard-PR37)
0.0.2#
Released on 14 December 2023
Description
The initial release of SmartDashboard includes capabilities for viewing experiment entity properties and statuses.