Difficult Nvidia/CUDA library install on WSL2/Ubuntu 20.04
CUDA is used by all the NVIDIA AI stuff. In my research into artificial intelligence (AI) I’ve been held back because I frequently can’t get the code examples that accompany the papers to compile. So I can read the paper, but I can’t run the examples, and today I decided to try to figure out “why” (i.e. I’m going to focus on this piece until it’s working OR I’ve run out of ideas). This is not my first attempt: I’ve attempted this at least a dozen times over the past few years. It is a constant source of “why isn’t this working” and “why can’t this be easier” for me.
So basically, I’m having the issue here:
https://github.com/NVIDIA/nvidia-docker/issues/1225
which is that Docker crashes on any container image that tries to access CUDA.
I was writing these notes on GitHub, but I think it’s better to put them in my own space.
So, nothing has improved. I’ll post my troubleshooting.
https://askubuntu.com/questions/1289811/cant-install-nvidia-driver-toolkit-on-ubuntu-20-04-lts-needs-uninstallable-pa?newreg=47efd436901f4d1d971e4ff45479313a
suggests this course of action:
$ apt-get upgrade
$ sudo apt-cache policy | grep "nvidia"
500 https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64 Packages
release v=1.0,o=https://nvidia.github.io/nvidia-docker,a=bionic,n=bionic,l=NVIDIA CORPORATION <cudatools@nvidia.com>,c=
origin nvidia.github.io
500 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64 Packages
release v=1.0,o=https://nvidia.github.io/nvidia-container-runtime,a=bionic,n=bionic,l=NVIDIA CORPORATION <cudatools@nvidia.com>,c=
origin nvidia.github.io
500 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 Packages
release v=1.0,o=https://nvidia.github.io/libnvidia-container,a=bionic,n=bionic,l=NVIDIA CORPORATION <cudatools@nvidia.com>,c=
origin nvidia.github.io
500 https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/amd64 Packages
release v=1.0,o=https://nvidia.github.io/libnvidia-container/experimental,a=bionic,n=experimental,l=NVIDIA CORPORATION <cudatools@nvidia.com>,c=
origin nvidia.github.io
500 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Packages
origin developer.download.nvidia.com
$ sudo apt-get --purge remove "*cublas*" "cuda*" "*nvidia*"
….
$ apt-get update
$ sudo apt-cache policy
$ apt-get install nvidia-cuda-toolkit
Eureka, it looks like I have
$ sudo apt-cache policy
500 https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64 Packages
release v=1.0,o=https://nvidia.github.io/nvidia-docker,a=bionic,n=bionic,l=NVIDIA CORPORATION <cudatools@nvidia.com>,c=
origin nvidia.github.io
500 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64 Packages
release v=1.0,o=https://nvidia.github.io/nvidia-container-runtime,a=bionic,n=bionic,l=NVIDIA CORPORATION <cudatools@nvidia.com>,c=
origin nvidia.github.io
500 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 Packages
release v=1.0,o=https://nvidia.github.io/libnvidia-container,a=bionic,n=bionic,l=NVIDIA CORPORATION <cudatools@nvidia.com>,c=
origin nvidia.github.io
500 https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/amd64 Packages
release v=1.0,o=https://nvidia.github.io/libnvidia-container/experimental,a=bionic,n=experimental,l=NVIDIA CORPORATION <cudatools@nvidia.com>,c=
origin nvidia.github.io
500 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Packages
release o=NVIDIA,l=NVIDIA CUDA,c=
origin developer.download.nvidia.com
So it looks like I still have Ubuntu 18.04 (bionic) release files configured, even though I’m on 20.04.
I’m trying all the add-apt-repository commands to figure out how to get these ancient packages out of my operating system. Crazy stuff like:
sudo add-apt-repository --remove ppa:developer.download.nvidia.com/ppa
Nothing is working. I found some files in /etc/apt/sources.list.d:
cd /etc/apt/sources.list.d
rm -f *cuda* *nvidia*
But of course “sudo apt-cache policy” still shows the ancient repositories. apt-cache policy displays the priorities of package sources; the suggestions I found were “sudo apt-get clean” or “sudo apt-get autoclean” and variations on “add-apt-repository --delete”, with no luck. Ultimately I went into /etc/apt/sources.list.d and removed the nvidia-docker.list* files.
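The cleanup above comes down to finding every apt source file that still points at the wrong Ubuntu release. Here is a minimal sketch of that search, assuming the stale entries can be recognised by "ubuntu18.04", "ubuntu1804" or "bionic" in their URLs. The function name find_stale_sources and its arguments are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list apt source files that mention a stale release.
# $1 = directory to scan (defaults to /etc/apt/sources.list.d)
# $2 = extended regex for the stale release (assumed default: 18.04 / bionic)
find_stale_sources() {
    dir="${1:-/etc/apt/sources.list.d}"
    pattern="$2"
    [ -n "$pattern" ] || pattern='ubuntu18\.?04|bionic'
    # -l prints only matching file names, -E enables extended regex,
    # -r recurses into the directory
    grep -lEr -- "$pattern" "$dir" 2>/dev/null | sort
}

# Example: find_stale_sources /etc/apt/sources.list.d
```

After removing the files it reports, the package index most likely needs a refresh with "sudo apt-get update" before "apt-cache policy" stops listing the old sources, since that command reads the cached index rather than the .list files directly.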
Okay so now it’s time for the reinstall …
https://github.com/NVIDIA/nvidia-docker/issues/1204
which ultimately links here:
I altered the driver version to match mine (DRIVER_VERSION=461.72), which didn’t work, so I went with the default 450.80.02
https://us.download.nvidia.com/XFree86/Linux-x86_64/460.56/NVIDIA-Linux-x86_64-460.56.run
I’m not sure that will ultimately work, since I’m running on WSL, not Ubuntu LTS. Update: it didn’t work.
I searched for CUDA on WSL and found this:
https://docs.nvidia.com/cuda/wsl-user-guide/index.html
which sent me here:
It has a note:
New developer drivers are planned to be made available this coming Monday 03/22/21, thanks for your patience.
I did a bit of browsing of the NVIDIA packages repository and found this:
http://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/
apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/7fa2af80.pub
sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
apt-get update
apt-get install -y cuda-toolkit-11-2
This downloaded 4 GB of CUDA tools, and then I got this:
/usr/local/cuda/samples/4_Finance/BlackScholes# ./BlackScholes
[./BlackScholes] - Starting...
CUDA error at ../../common/inc/helper_cuda.h:779 code=35(cudaErrorInsufficientDriver) "cudaGetDeviceCount(&device_count)"
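Error code 35 (cudaErrorInsufficientDriver) means the CUDA runtime found a driver that is older than it needs, or no usable driver at all. On WSL2 the driver comes from the Windows host rather than from the Ubuntu packages, so one quick sanity check is whether the host driver is visible inside the distro. The sketch below assumes that a GPU-enabled WSL2 exposes a /dev/dxg device and mounts libcuda under /usr/lib/wsl/lib. Both paths are assumptions not taken from the post, and check_wsl_gpu is a made-up name.

```shell
#!/bin/sh
# Hypothetical check: does this WSL2 instance see the host GPU driver?
# $1 = optional prefix to inspect instead of the real root (for testing)
check_wsl_gpu() {
    root="${1:-}"
    # Assumed path: paravirtualised GPU device node
    if [ ! -e "$root/dev/dxg" ]; then
        echo "missing /dev/dxg"
        return 1
    fi
    # Assumed path: driver libraries mounted in from the Windows host
    if ! ls "$root"/usr/lib/wsl/lib/libcuda.so* >/dev/null 2>&1; then
        echo "missing libcuda in /usr/lib/wsl/lib"
        return 1
    fi
    echo "GPU driver visible"
}

# Example: check_wsl_gpu
```

If either piece is missing, installing more CUDA packages inside Ubuntu is unlikely to help, because the problem sits on the Windows side.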
It turns out to be a known issue that this doesn’t work on my build of Windows. According to
https://github.com/NVIDIA/nvidia-docker/issues/1437
“you need to register with the Insider program and build version 20145 or higher in order for this to work.”
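The build requirement quoted above can be checked before spending hours on packages. This is a rough sketch that parses the output of the Windows "ver" command, assuming it looks like "Microsoft Windows [Version 10.0.19041.867]" (an example string, not the build from this post). The function names are hypothetical, and 20145 is the threshold from the quoted issue.

```shell
#!/bin/sh
# Hypothetical helper: extract the Windows build number from `ver` output
# such as "Microsoft Windows [Version 10.0.19041.867]".
windows_build() {
    printf '%s\n' "$1" | sed -n 's/.*Version [0-9]*\.[0-9]*\.\([0-9]*\).*/\1/p'
}

# Succeeds when the build found in $1 is at least $2
# (default 20145, the number quoted in the GitHub issue).
build_is_at_least() {
    build=$(windows_build "$1")
    [ -n "$build" ] && [ "$build" -ge "${2:-20145}" ]
}

# Example, assuming cmd.exe is reachable from inside WSL:
#   build_is_at_least "$(cmd.exe /c ver)" && echo "build is new enough"
```

Parsing the string separately from fetching it keeps the comparison easy to try out with a pasted version string, even on a machine without cmd.exe on the PATH.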
So I did, in the process agreeing to tell Microsoft about every app that is installed on my PC _AND_ my web browsing history (no clue what my frequent visits to lemonparty.org are gonna tell them!)
Of course, after all that, the 2021-03 Cumulative Update for Windows 10 Version 2004 for x64-based Systems (KB5000802) gets to 20% installation and halts.
(to be continued I suppose)
…