Directly check the location of libcuda.so where the linker expects to find it #185
Open
ocaisa wants to merge 1 commit into EESSI:main from
Tested this on Vega, where the drivers for 2023.06 are available but not for 2025.06:
[eualano@gn06 ~]$ source /cvmfs/software.eessi.io/versions/2025.06/init/lmod/bash
Modules purged before initialising EESSI
Module for EESSI/2025.06 loaded successfully
EESSI has selected x86_64/amd/zen2 as the compatible CPU target for EESSI/2025.06
EESSI has selected accel/nvidia/cc80 as the compatible accelerator target for EESSI/2025.06
(for debug information when loading the EESSI module, set the environment variable EESSI_MODULE_DEBUG_INIT)
# Without the PR it (incorrectly) loads the module
{EESSI/2025.06} [eualano@gn06 ~]$ module load OSU-Micro-Benchmarks/7.5-gompi-2024a-CUDA-12.6.0
# Enable the PR
{EESSI/2025.06} [eualano@gn06 ~]$ export LMOD_PACKAGE_PATH=$PWD/software-layer-scripts/generate/.lmod
{EESSI/2025.06} [eualano@gn06 ~]$ module load OSU-Micro-Benchmarks/7.5-gompi-2024a-CUDA-12.6.0
Lmod has detected the following error:
You requested to load UCX-CUDA which relies on the CUDA runtime environment and driver libraries. In order to be able to use the module, you will need to make sure EESSI can find
the GPU driver libraries on your host system. The file being checked for on your system is
/cvmfs/software.eessi.io/versions/2025.06/compat/linux/x86_64/lib/nvidia/libcuda.so
You can override this check by setting the environment variable EESSI_OVERRIDE_GPU_CHECK but the loaded application will not be able to execute on your system.
For more information on how to do this, see https://www.eessi.io/docs/site_specific_config/gpu/.
While processing the following module(s):
...
{EESSI/2025.06} [eualano@gn06 ~]$ module purge
# This resets LMOD_PACKAGE_PATH
[eualano@gn06 ~]$ module load EESSI/2023.06
Module for EESSI/2023.06 loaded successfully
{EESSI/2023.06} [eualano@gn06 ~]$ module load OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0
# Still works with the PR enabled
{EESSI/2023.06} [eualano@gn06 ~]$ export LMOD_PACKAGE_PATH=$PWD/software-layer-scripts/generate/.lmod
{EESSI/2023.06} [eualano@gn06 ~]$ module load OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0
{EESSI/2023.06} [eualano@gn06 ~]$
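For readers who want to reproduce the check by hand, here is a minimal shell sketch of the behaviour the error message above describes. It is not the Lmod hook from this PR: the function name `check_gpu_driver` is hypothetical, and two details are assumptions, namely that any non-empty value of `EESSI_OVERRIDE_GPU_CHECK` counts as an override and that a dangling symlink counts as a missing library.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): roughly mirrors the check
# described in the Lmod error message for a given libcuda.so path.
check_gpu_driver() {
    local libcuda_path="$1"

    # Assumption: any non-empty value of the override variable skips the check.
    if [ -n "${EESSI_OVERRIDE_GPU_CHECK:-}" ]; then
        return 0
    fi

    # `-e` follows symlinks, so a dangling symlink is treated as missing.
    if [ -e "$libcuda_path" ]; then
        return 0
    fi

    echo "GPU driver library not found at: $libcuda_path" >&2
    return 1
}
```

On a node like the one in the session above, `check_gpu_driver /cvmfs/software.eessi.io/versions/2025.06/compat/linux/x86_64/lib/nvidia/libcuda.so` would be expected to return non-zero when the drivers for that EESSI version are not set up, and zero once they are or once the override variable is set.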
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx
Fixes #184