SmartSim


Last updated: 24 Apr 2024
Version: 0.6.2 (released 16 Feb 2024)
https://github.com/CrayLabs/SmartSim

SmartSim is a workflow library that makes it easier to use common machine learning (ML) libraries, such as PyTorch and TensorFlow, in combination with High Performance Computing (HPC) simulations and applications. SmartSim launches ML infrastructure on HPC systems alongside user workloads and supports most HPC workload managers (e.g. Slurm, PBSPro, LSF). SmartSim also provides a set of client libraries in Python, C++, C, and Fortran. These client libraries let user applications send data to, and receive data from, the ML infrastructure. The client APIs also enable ML tasks such as inference and online training to be executed from within user code. Both the exchange of data and the execution of ML tasks are orchestrated by a high-performance in-memory database that SmartSim launches and manages.
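The data flow described above can be pictured with a small, self-contained sketch. It does not use SmartSim or its client libraries: `InMemoryStore` is a hypothetical in-process stand-in for the in-memory database, and its method names (`put_tensor`, `get_tensor`, `set_model`, `run_model`) are chosen only to suggest the put/get/run pattern, not to reproduce the real client signatures. The "model" is an ordinary Python function, assumed here purely for illustration.

```python
from typing import Callable, Dict, List

Tensor = List[float]


class InMemoryStore:
    """Hypothetical stand-in for the shared in-memory database (not the SmartSim API)."""

    def __init__(self) -> None:
        self._tensors: Dict[str, Tensor] = {}
        self._models: Dict[str, Callable[[Tensor], Tensor]] = {}

    def put_tensor(self, key: str, data: Tensor) -> None:
        # Store a copy so later changes to the caller's list do not leak in.
        self._tensors[key] = list(data)

    def get_tensor(self, key: str) -> Tensor:
        # Return a copy so the caller cannot mutate stored data in place.
        return list(self._tensors[key])

    def set_model(self, name: str, fn: Callable[[Tensor], Tensor]) -> None:
        # Register a callable that plays the role of an ML model.
        self._models[name] = fn

    def run_model(self, name: str, input_key: str, output_key: str) -> None:
        # Run the "model" next to the data and store the result under a new key.
        self._tensors[output_key] = self._models[name](self._tensors[input_key])


store = InMemoryStore()

# ML side: register a toy model (assumed for illustration: doubles every value).
store.set_model("double", lambda xs: [2.0 * x for x in xs])

# Simulation side: publish state, request inference, read the result back.
store.put_tensor("sim_state", [1.0, 2.5, 4.0])
store.run_model("double", "sim_state", "prediction")
prediction = store.get_tensor("prediction")
print(prediction)
```

In this sketch the simulation never calls the model directly; it only names keys in the shared store, which is the decoupling the paragraph above attributes to the database-mediated design.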

Execution Environment

  • User Interfaces
      • Python API
      • Python Client API
      • C++ Client API
      • C Client API
      • Fortran Client API

  • Resource Managers
      • Slurm
      • PBSPro
      • LSF
      • Linux/MacOS

  • Transfer Protocols
      • TCP/IP
      • Unix Domain Sockets (UDS)

Contributors

35   |   212   |   40   |   BSD-2-Clause

al-rigazziSparteeMattToastEricGustinastatideashaoankonamellis13amandarichardsonnjedwards4bctandon11billschereriiiAlyssaCotejuliaputkoben-albrechtbenjamin-robbinsrickybalin