GPU Acceleration

mythos can use GPUs at three levels: the simulation backend (oxDNA CUDA), the JAX runtime (for energy evaluation and gradient computation), and the Ray scheduler (for distributing GPU resources across workers). This page covers configuration for each.

oxDNA CUDA Backend

The oxDNA simulator supports a CUDA backend that runs the simulation on an NVIDIA GPU. This can dramatically accelerate individual simulations, especially for large systems.

Enabling the CUDA backend

Set the backend to CUDA in your oxDNA input file(s):

backend = CUDA
CUDA_device = <device_id>

When mythos detects backend = CUDA in the input configuration, it automatically passes -DCUDA=ON to CMake during the oxDNA build step. CUDA_device is an index relative to the GPUs visible to the process, not a physical device ID. When running with the RayOptimizer it can generally be set to 0, because Ray assigns a GPU to the task through scheduler hints (see GPU Allocation with Ray Scheduler Hints below).

These options can be set directly in the input file, as above, or overridden through the oxdna.oxDNASimulator constructor using input_overrides:

simulator = oxdna.oxDNASimulator(
    ...,
    input_overrides={
        "backend": "CUDA",
        "CUDA_device": 0,  # optional: specify which GPU to use
    },
)

Depending on the simulation type, the CUDA backend may require additional input parameters, and some simulation types do not support CUDA at all. See the oxDNA documentation.
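To make the effect of input_overrides concrete, the sketch below merges a dictionary of overrides into oxDNA-style `key = value` input text. The helper apply_overrides is hypothetical and written only for illustration; it is not part of mythos, and mythos may apply overrides differently internally.

```python
def apply_overrides(input_text: str, overrides: dict) -> str:
    """Return oxDNA-style input text with `key = value` lines overridden.

    Hypothetical helper for illustration only; not a mythos API.
    """
    remaining = dict(overrides)
    out_lines = []
    for line in input_text.splitlines():
        stripped = line.strip()
        # Leave blank lines, comments, and non-assignment lines untouched.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            out_lines.append(line)
            continue
        key = stripped.split("=", 1)[0].strip()
        if key in remaining:
            # Replace the existing value with the override.
            out_lines.append(f"{key} = {remaining.pop(key)}")
        else:
            out_lines.append(line)
    # Keys that were not present in the file are appended at the end.
    out_lines.extend(f"{key} = {value}" for key, value in remaining.items())
    return "\n".join(out_lines) + "\n"


base = "backend = CPU\nsteps = 1000\n"
patched = apply_overrides(base, {"backend": "CUDA", "CUDA_device": 0})
print(patched)
```

Here the existing backend line is rewritten in place and CUDA_device, which was absent from the base input, is appended.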

Build requirements

The CUDA backend requires:

  • An NVIDIA GPU with a supported compute capability

  • The CUDA toolkit installed and available on the build node (nvcc on PATH)

  • A compatible C++ compiler (e.g., gcc)

If you are building on an HPC cluster, you will typically need to load CUDA and compiler modules before running your optimization:

module load gcc/9.3.0 cuda/11.8 cmake/3.27.9
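After loading modules, it can help to confirm that the build tools are actually on PATH before starting a long optimization. The following is a minimal sketch using only the Python standard library; the function name find_build_tools is an assumption made for this example, and the check says nothing about whether the tool versions are mutually compatible.

```python
import shutil


def find_build_tools(tools=("nvcc", "gcc", "cmake")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


status = find_build_tools()
for tool, path in status.items():
    print(f"{tool}: {path or 'NOT FOUND'}")
```

If nvcc is reported as not found, the oxDNA CUDA build is likely to fail on that node.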

Note

When using the CUDA backend with the RayOptimizer, ensure that CUDA is available on every worker node that may run an oxDNA simulation task. See the oxDNA documentation for full build instructions and supported GPU architectures.

JAX GPU Usage

JAX can use GPUs for energy function evaluation and gradient computation (DiffTRe reweighting). A GPU-enabled JAX installation uses a GPU by default whenever one is visible to the process.

Installing JAX with GPU support

The default jax package is CPU-only. To enable GPU support, install the CUDA-enabled variant:

pip install "jax[cuda12]"

See the JAX installation guide for other CUDA versions, ROCm support, and troubleshooting.
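One way to confirm which backend an installation selects is to query JAX directly. This is a small sketch, assuming it is run in the same environment (and on the same node type) as your workers; jax.default_backend() and jax.devices() are standard JAX calls, while jax_backend_summary is a name chosen for this example.

```python
def jax_backend_summary():
    """Return (backend_name, device_names), or None if JAX is not installed."""
    try:
        import jax
    except ImportError:
        return None
    return jax.default_backend(), [str(device) for device in jax.devices()]


summary = jax_backend_summary()
if summary is None:
    print("JAX is not installed in this environment")
else:
    backend, devices = summary
    # The backend name is "gpu" when JAX has selected a GPU, "cpu" otherwise.
    print(f"default backend: {backend}")
    print(f"devices: {devices}")
```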

Controlling the JAX platform

If a GPU-enabled JAX installation is present and a GPU is allocated via SchedulerHints, JAX will automatically use the GPU; no extra configuration is needed.

To force JAX onto the CPU instead (useful when GPU memory is limited and you want to reserve it entirely for the simulation backend), set JAX_PLATFORM_NAME in the Ray worker environment. Environment variables are required here because jax.config calls made in the driver process have no effect inside workers. The example below also sets JAX_ENABLE_X64 so that workers run in 64-bit precision:

import ray

ray.init(
    runtime_env={
        "env_vars": {
            "JAX_ENABLE_X64": "True",     # use 64-bit precision in workers
            "JAX_PLATFORM_NAME": "cpu",   # keep JAX off the GPU
        },
    },
)
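If you switch between CPU and GPU placement for JAX across runs, the environment dictionary can be built by a small function. The sketch below is illustrative only: worker_env_vars is a hypothetical helper, not a mythos or Ray API, and it uses the same two environment variables as the snippet above.

```python
def worker_env_vars(jax_on_cpu: bool, enable_x64: bool = True) -> dict:
    """Build the env_vars mapping passed to Ray workers (hypothetical helper)."""
    env = {}
    if enable_x64:
        env["JAX_ENABLE_X64"] = "True"
    if jax_on_cpu:
        # Omit this key entirely to let JAX pick the GPU when one is allocated.
        env["JAX_PLATFORM_NAME"] = "cpu"
    return env


runtime_env = {"env_vars": worker_env_vars(jax_on_cpu=True)}
print(runtime_env)
# Pass the result to Ray with: ray.init(runtime_env=runtime_env)
```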

Tip

A common pattern is to run the oxDNA simulation on the GPU (CUDA backend) while running the JAX gradient computation on the CPU. This avoids competition for GPU memory between the simulation binary and JAX’s autodiff graph.

Other Simulators

GROMACS

GROMACS supports GPU acceleration when built with CUDA or OpenCL. Since mythos invokes the gmx binary directly, GPU usage depends on how GROMACS was built and configured on your system. Consult the GROMACS installation guide for building with GPU support. Once installed, GPU offloading is typically controlled via gmx mdrun flags (e.g., -nb gpu, -pme gpu).

LAMMPS

LAMMPS supports GPU acceleration through accelerator packages such as GPU and KOKKOS. As with GROMACS, mythos calls the lmp binary directly, so GPU support depends on your LAMMPS build. See the LAMMPS GPU documentation for build instructions and runtime configuration.

For both GROMACS and LAMMPS, use num_gpus in SchedulerHints to ensure Ray allocates GPU resources appropriately for these tasks.

GPU Allocation with Ray Scheduler Hints

The RayOptimizer uses SchedulerHints to tell Ray how many GPUs each task requires. Ray uses this information to partition available GPUs across workers and set CUDA_VISIBLE_DEVICES accordingly.
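Because Ray restricts each task to its assigned GPUs through CUDA_VISIBLE_DEVICES, device indices inside a task are relative to that list. The sketch below illustrates the mapping with a hand-written environment dictionary; the function names are assumptions made for this example, and the value "2" stands in for whatever physical GPU Ray happens to assign.

```python
import os


def visible_gpu_ids(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES.

    An empty list means the variable is unset or empty; this sketch does not
    try to enumerate GPUs in that case.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


def physical_id(relative_index, env=None):
    """Translate a relative device index into the id from the visible list."""
    return visible_gpu_ids(env)[relative_index]


# Example: a task that was assigned physical GPU 2 on a multi-GPU node.
example_env = {"CUDA_VISIBLE_DEVICES": "2"}
print(physical_id(0, example_env))
```

In this example, relative index 0 resolves to physical GPU 2, which is why CUDA_device = 0 is usually the right setting for a task that has been given a single GPU.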

Setting num_gpus

Specify GPU requirements per simulator or objective:

from mythos.utils.scheduler import SchedulerHints

simulator = oxdna.oxDNASimulator(
    ...,
    scheduler_hints=SchedulerHints(
        num_cpus=4,
        num_gpus=1,       # reserve 1 GPU for this task
        mem_mb=8192,
    ),
)

Fractional GPU sharing

If your simulations are small enough that multiple can share a single GPU, use fractional values:

scheduler_hints=SchedulerHints(
    num_gpus=0.5,  # two tasks can share one GPU
)

Ray will schedule up to two tasks with num_gpus=0.5 on the same GPU.

Note

Fractional GPU sharing relies on tasks fitting within GPU memory simultaneously. If tasks exceed the GPU’s memory when co-scheduled, you will see CUDA out-of-memory errors. Use num_gpus=1 to guarantee exclusive GPU access per task.
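The sharing arithmetic can be checked up front. The following is a rough sketch, assuming you already know approximately how much GPU memory one task uses; tasks_per_gpu and fits_in_memory are hypothetical helpers, and the memory figures are placeholder values, not measurements.

```python
from fractions import Fraction


def tasks_per_gpu(num_gpus):
    """Number of tasks with this num_gpus share that fit on one GPU."""
    share = Fraction(str(num_gpus))  # exact arithmetic, avoids float rounding
    if share <= 0 or share > 1:
        raise ValueError("expected a share in the range (0, 1]")
    return int(1 // share)


def fits_in_memory(num_gpus, task_mem_mb, gpu_mem_mb):
    """True if all co-scheduled tasks fit in GPU memory at the same time."""
    return tasks_per_gpu(num_gpus) * task_mem_mb <= gpu_mem_mb


print(tasks_per_gpu(0.5))
print(fits_in_memory(0.5, task_mem_mb=6000, gpu_mem_mb=16000))
print(fits_in_memory(0.5, task_mem_mb=9000, gpu_mem_mb=16000))
```

With the placeholder numbers above, two 6000 MB tasks fit on a 16000 MB GPU but two 9000 MB tasks do not, so the second case would call for num_gpus=1.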

For full details on scheduler hints, including mem_mb, max_retries, and custom options, see the Ray Optimizer page.

Slurm GPU Partitions

When running on an HPC cluster with GPU nodes, request a GPU partition and allocate GPUs in your sbatch script:

#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8

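Ensure that the CUDA toolkit is loaded and that the num_gpus values in your scheduler_hints are consistent with the number of GPUs allocated per node: the GPUs requested by tasks running concurrently on a node should not exceed that allocation. See Running on Slurm HPC Systems for the full Slurm setup guide.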
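A quick consistency check can be run from inside the job. This sketch assumes Slurm exports SLURM_GPUS_ON_NODE for GPU allocations, which may depend on your Slurm version and site configuration; the helper names are hypothetical and the example environment is hand-written.

```python
import os


def gpus_allocated_on_node(env=None):
    """Read the per-node GPU count from the Slurm job environment, if present."""
    env = os.environ if env is None else env
    value = env.get("SLURM_GPUS_ON_NODE")
    return int(value) if value else None


def hints_fit_allocation(num_gpus_per_task, concurrent_tasks, env=None):
    """True/False if the request fits the allocation, None if it is unknown."""
    allocated = gpus_allocated_on_node(env)
    if allocated is None:
        return None
    return num_gpus_per_task * concurrent_tasks <= allocated


# Example matching the sbatch header above (--gres=gpu:1).
job_env = {"SLURM_GPUS_ON_NODE": "1"}
print(hints_fit_allocation(1, 1, job_env))
print(hints_fit_allocation(1, 2, job_env))
```

With one GPU allocated, a single task with num_gpus=1 fits, while two such tasks running at once do not.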