
Solving "Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions"

When fine-tuning a model, you may encounter an error message reporting CUDA Out of Memory (OOM) with the following detail:

```
RuntimeError: CUDA error: out of memory; Compile with TORCH_USE_CUDA_DSA to enable device-side assertions
```
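
Before blaming the training script itself, it can help to confirm whether the GPU is even reachable and how much memory is actually free. The sketch below is a minimal diagnostic using plain PyTorch (it assumes nothing beyond `torch` being installed); if the tiny test allocation already fails, the problem is the driver/runtime pairing rather than your batch size:

```python
import torch

# Basic sanity checks: is CUDA visible to this torch build at all?
print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())

if torch.cuda.is_available():
    print("CUDA runtime version:", torch.version.cuda)
    print("device:", torch.cuda.get_device_name(0))

    # Free / total device memory in bytes
    free, total = torch.cuda.mem_get_info()
    print(f"free: {free / 1024**3:.2f} GiB / total: {total / 1024**3:.2f} GiB")

    # Try a tiny allocation; a broken driver/runtime combination often
    # fails here immediately rather than later inside the training loop.
    x = torch.ones(1, device="cuda")
    print("test allocation OK:", x.item())
```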

In my case, the problem was caused by upgrading the NVIDIA driver from version 525 to 530.

So the solution was to downgrade the NVIDIA driver back to version 525 and reinstall the latest Transformers and Torch, as described in https://www.yodiw.com/install-transformers-pytorch-tensorflow-ubuntu-2023/.
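
After downgrading the driver and reinstalling, a quick smoke test like the one below confirms that Transformers can place a model on the GPU again. This is only a sketch; `gpt2` is used purely as an example of a small model, not the model from the fine-tuning run:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any small model works for this smoke test; gpt2 is just an example.
model_name = "gpt2"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to("cuda")

# Run a short generation on the GPU to confirm allocations work end to end.
inputs = tokenizer("Hello, CUDA!", return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=5)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If this runs without the OOM error, the driver downgrade has resolved the issue and you can return to the original fine-tuning setup.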
