The NVIDIA Collective Communication Library (NCCL) implements multi-GPU and multi-node communication primitives optimized for NVIDIA GPUs and networking. NCCL provides routines such as all-gather, all-reduce, broadcast, reduce, and reduce-scatter, as well as point-to-point send and receive, that are optimized to achieve high bandwidth and low latency over PCIe and NVLink high-speed interconnects within a node and over NVIDIA Mellanox networking across nodes.
Set up the environment and version
Available version: 2.18.1
Run NCCL tests on a GPU server.
- Create a file called `nccl_test.sh` with the following content:
```shell
#!/bin/sh
#SBATCH --job-name=nccl_test
#SBATCH --partition=bigpu
#SBATCH --gres=gpu:2
#SBATCH --time=0:10:00
#SBATCH --output=job-%j.out
#SBATCH --nodes=1

ml nvidia/nccl

git clone https://github.com/NVIDIA/nccl-tests.git
cd nccl-tests
make

./build/all_reduce_perf -b 8 -e 256M -f 2 -g 2
#./build/all_reduce_perf -b 8 -e 256M -f 2 -g <ngpus>
```
- Launch the job with `sbatch nccl_test.sh`.
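Each results line of `all_reduce_perf` reports, for one message size, out-of-place and in-place measurements of time, algorithm bandwidth (`algbw`), and bus bandwidth (`busbw`); `busbw` is the usual figure of merit. A minimal sketch of extracting the out-of-place `busbw` from such a line with awk (the sample line and its numbers are illustrative, not real measurements, and the column layout assumes the current nccl-tests output format):

```shell
# Illustrative nccl-tests result line (NOT a real measurement); in practice
# this would come from the job's output file written by #SBATCH --output.
line="134217728 33554432 float sum -1 1234.5 108.7 203.8 0 1230.1 109.1 204.5 0"
# Assumed fields: size count type redop root |
#   out-of-place: time algbw busbw #wrong | in-place: time algbw busbw #wrong
busbw=$(echo "$line" | awk '{print $8}')
echo "out-of-place busbw: $busbw GB/s"
```

The same awk field selection can be applied to every line of the job output to track how bandwidth scales with message size.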