NCCL


Description

The NVIDIA Collective Communication Library (NCCL) implements multi-GPU and multi-node communication primitives optimized for NVIDIA GPUs and networking. NCCL provides routines such as all-gather, all-reduce, broadcast, reduce, and reduce-scatter, as well as point-to-point send and receive, all optimized to achieve high bandwidth and low latency over PCIe and NVLink high-speed interconnects within a node and over NVIDIA Mellanox networking across nodes.
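
To illustrate how these primitives are invoked, the following is a minimal sketch of a single-process all-reduce across two GPUs, following the single-thread, multi-device pattern from the NCCL documentation. The file name, buffer sizes, and compile line are illustrative assumptions, and error checking is omitted for brevity.

/*
 * Minimal sketch: single-process all-reduce across two GPUs, after the
 * single-thread / multi-device example in the NCCL documentation.
 * Illustrative compile line: nvcc allreduce_example.c -lnccl
 */
#include <stdio.h>
#include <cuda_runtime.h>
#include <nccl.h>

#define NDEV  2          /* number of GPUs */
#define COUNT (1 << 20)  /* floats per buffer */

int main(void) {
  int devs[NDEV] = {0, 1};
  ncclComm_t comms[NDEV];
  float *sendbuff[NDEV], *recvbuff[NDEV];
  cudaStream_t streams[NDEV];

  /* One buffer pair and one stream per GPU */
  for (int i = 0; i < NDEV; ++i) {
    cudaSetDevice(devs[i]);
    cudaMalloc((void **)&sendbuff[i], COUNT * sizeof(float));
    cudaMalloc((void **)&recvbuff[i], COUNT * sizeof(float));
    cudaMemset(sendbuff[i], 1, COUNT * sizeof(float));
    cudaStreamCreate(&streams[i]);
  }

  /* One communicator per device, all in this process */
  ncclCommInitAll(comms, NDEV, devs);

  /* Sum the send buffers across GPUs; group the calls because a
     single thread is driving several devices */
  ncclGroupStart();
  for (int i = 0; i < NDEV; ++i)
    ncclAllReduce(sendbuff[i], recvbuff[i], COUNT, ncclFloat, ncclSum,
                  comms[i], streams[i]);
  ncclGroupEnd();

  /* Wait for completion, then release resources */
  for (int i = 0; i < NDEV; ++i) {
    cudaSetDevice(devs[i]);
    cudaStreamSynchronize(streams[i]);
    cudaFree(sendbuff[i]);
    cudaFree(recvbuff[i]);
    ncclCommDestroy(comms[i]);
  }
  printf("all-reduce done on %d GPUs\n", NDEV);
  return 0;
}

Each GPU contributes one send buffer, and after the ncclAllReduce call every recvbuff holds the element-wise sum across all devices.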

Environment setup and version

ml nvidia/nccl

Available version: 2.18.1
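
Here, ml is the Lmod shorthand for module load; other installed versions, if any, can be listed with module avail nvidia/nccl.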

Tutorial

Run NCCL tests on a GPU server.

  • Create a file called nccl_test.sh with the following contents (the benchmark flags are explained below, after the launch step):
#!/bin/bash
#SBATCH --job-name=nccl_test
#SBATCH --partition=bigpu
#SBATCH --gres=gpu:2
#SBATCH --time=0:10:00
#SBATCH --output=job-%j.out
#SBATCH --nodes=1

# Load the NCCL environment
ml nvidia/nccl

# Fetch and build the official NCCL benchmark suite
git clone https://github.com/NVIDIA/nccl-tests.git
cd nccl-tests
make

# Benchmark all-reduce from 8 B to 256 MB, doubling the size each step, on 2 GPUs
./build/all_reduce_perf -b 8 -e 256M -f 2 -g 2
# Generic form: ./build/all_reduce_perf -b 8 -e 256M -f 2 -g <ngpus>
  • Launch the job
sbatch nccl_test.sh
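
The all_reduce_perf options follow the nccl-tests README: -b and -e set the minimum and maximum message sizes, -f is the multiplication factor between successive sizes, and -g is the number of GPUs used by the benchmark. For each message size, the job output reports the operation time together with the algorithm bandwidth (algbw) and bus bandwidth (busbw).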

Documentation
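
  • NCCL documentation: https://docs.nvidia.com/deeplearning/nccl/
  • NCCL tests repository: https://github.com/NVIDIA/nccl-tests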