Numba is a Python compiler that can compile Python code for execution on CUDA-capable GPUs or multicore CPUs. In this blog, we show the steps needed to add this capability to Origin’s embedded Python.
Step 1. Find out the CUDA version of your GPU
First, we need to find out the CUDA version. Open the Windows Control Panel (type Control Panel into the Windows search box), then find NVIDIA Control Panel and open it.
Click the System Information link at the bottom left, select the Components tab, and find NVCUDA64.DLL in the table. The CUDA version is listed in that row; in this case it should be 11.1.1 or higher.
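As an alternative to the Control Panel, the version can also be read from a script. The sketch below assumes the `nvidia-smi` command-line tool that ships with the NVIDIA driver is on the PATH, and that its header contains a `CUDA Version: X.Y` field. The function names are hypothetical, and the value it reports should correspond to the one shown for NVCUDA64.DLL.

```python
import re
import shutil
import subprocess

def parse_cuda_version(text):
    """Extract 'X.Y' from a 'CUDA Version: X.Y' field, or return None."""
    match = re.search(r"CUDA Version:\s*(\d+(?:\.\d+)*)", text)
    return match.group(1) if match else None

def driver_cuda_version():
    """Return the CUDA version reported by nvidia-smi, or None if unavailable."""
    exe = shutil.which("nvidia-smi")  # assumption: the tool is on PATH
    if exe is None:
        return None
    try:
        result = subprocess.run([exe], capture_output=True, text=True)
    except OSError:
        return None
    return parse_cuda_version(result.stdout)

print("CUDA version reported by the driver:", driver_cuda_version())
```

If the script prints None, the tool was not found or its output had an unexpected format; in that case, use the Control Panel route described above.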
Step 2. Download CUDA toolkit
Go to https://developer.nvidia.com/cuda-toolkit-archive and download the toolkit that matches your CUDA version. Run the downloaded .exe file to install the toolkit.
The toolkit should be installed in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1. Open the Windows System Properties dialog and verify that the environment variable CUDA_PATH is set to this path; if the variable does not exist, create it and set it to this path.
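The same check can be made from Python. This is a minimal sketch using only the standard library; `check_cuda_path` is a hypothetical helper name, and it only confirms that the variable exists and points to an existing folder, not that the toolkit inside it is complete.

```python
import os

def check_cuda_path(env=None):
    """Return (ok, message) describing the CUDA_PATH environment variable."""
    env = os.environ if env is None else env
    path = env.get("CUDA_PATH")
    if not path:
        return False, "CUDA_PATH is not set"
    if not os.path.isdir(path):
        return False, f"CUDA_PATH points to a missing folder: {path}"
    return True, f"CUDA_PATH = {path}"

ok, message = check_cuda_path()
print(message)
```

Note that a program started before the variable was changed does not see the new value, so restart Origin after editing CUDA_PATH.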
Step 3. Install the Python package numba into Origin’s embedded Python
Launch Origin and select Connectivity: Python Packages… from the menu. If the installed numpy version is 1.21 or newer, uninstall it first, because numba requires numpy >=1.17 and <1.21. To do so, select numpy in the table and click the Uninstall button (numpy 1.21.4 in this example).
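To decide whether numpy must be uninstalled, the version constraint can be checked in code. The sketch below uses the bounds stated above (>=1.17 and <1.21), which may differ for other numba releases; `version_tuple` and `numpy_ok_for_numba` are hypothetical helper names.

```python
def version_tuple(version):
    """Turn '1.21.4' into (1, 21, 4), keeping only leading numeric parts."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def numpy_ok_for_numba(version, minimum=(1, 17), below=(1, 21)):
    """True if minimum <= version < below (bounds taken from the text above)."""
    return minimum <= version_tuple(version) < below

try:
    import numpy
    print(numpy.__version__, "compatible:", numpy_ok_for_numba(numpy.__version__))
except ImportError:
    print("numpy is not installed")
```

With these bounds, 1.21.4 is rejected and 1.20.3 is accepted.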
Click the Install… button, type numba in the Package Names box, and click OK to install the numba package. After the installation finishes, restart Origin.
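After the restart, a quick check can confirm that numba imports and can see the GPU. This is a minimal sketch that relies on `numba.cuda.is_available()`; the helper name `cuda_status` is hypothetical.

```python
def cuda_status():
    """Return a short description of whether numba can reach a CUDA GPU."""
    try:
        from numba import cuda
    except ImportError:
        return "numba is not installed"
    if cuda.is_available():
        return "numba is installed and a CUDA GPU is available"
    return "numba is installed, but no CUDA GPU is available"

print(cuda_status())
```

If the script reports that no GPU is available, recheck the driver, the toolkit installation, and CUDA_PATH from the earlier steps.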
Step 4. Test Python code
Select Connectivity: Open untitled.py… from the menu and paste in the following code. Press F5 to run it.
from numba import cuda
import numpy as np

@cuda.jit
def cudakernel0(array):
    for i in range(array.size):
        array[i] += 0.5

array = np.array([0, 1], np.float32)
print('Initial array:', array)
print('Kernel launch: cudakernel0[1, 1](array)')
cudakernel0[1, 1](array)
print('Updated array:', array)
You should see the following output, which indicates that the CUDA kernel was launched properly.
Initial array: [0. 1.]
Kernel launch: cudakernel0[1, 1](array)
Updated array: [0.5 1.5]
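The test kernel is launched as [1, 1], that is, one block containing one thread, which loops over the whole array. For larger arrays, a more typical pattern assigns one thread per element. The sketch below illustrates that pattern using `cuda.grid`, `cuda.to_device` and `copy_to_host`. The names `launch_config` and `add_half_on_gpu` are hypothetical, and the block size of 128 is an arbitrary assumption rather than a tuned value.

```python
import math

def launch_config(n, threads_per_block=128):
    """Return (blocks_per_grid, threads_per_block) so that all n elements are covered."""
    blocks_per_grid = math.ceil(n / threads_per_block)
    return blocks_per_grid, threads_per_block

def add_half_on_gpu(values):
    """Add 0.5 to every element, using one GPU thread per element."""
    # Imported here so launch_config can be used without numba installed.
    from numba import cuda
    import numpy as np

    @cuda.jit
    def add_half(array):
        i = cuda.grid(1)        # absolute index of this thread in the 1D grid
        if i < array.size:      # threads past the end of the array do nothing
            array[i] += 0.5

    host = np.asarray(values, dtype=np.float32)
    device = cuda.to_device(host)               # explicit copy to the GPU
    blocks, threads = launch_config(host.size)
    add_half[blocks, threads](device)
    return device.copy_to_host()                # copy the result back

# Usage on a machine with a CUDA GPU:
#     print(add_half_on_gpu([0, 1]))
```

The bounds check inside the kernel matters because the grid is rounded up to a whole number of blocks, so some threads may have no element to process.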