Common HPC Dependencies
BLT creates named targets for the dependencies that most HPC projects need, such as MPI, CUDA, HIP, and OpenMP. BLT also helps its users get these dependencies to interoperate within the same library or executable.
As previously mentioned in Adding Tests, BLT also provides bundled versions of GoogleTest, GoogleMock, GoogleBenchmark, and FRUIT. Not only is the source for these included, but BLT also provides named CMake targets for them.
BLT’s blt::mpi, blt::cuda, blt::cuda_runtime, blt::hip, blt::hip_runtime, and blt::openmp targets are all defined via the blt_import_library macro. This creates a true CMake imported target that is inherited properly through CMake’s dependency graph.
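To make the role of blt_import_library more concrete, here is a minimal sketch of using the same macro for a library of your own. The names mylib, uses_mylib, and MYLIB_DIR are hypothetical placeholders, not something BLT defines; only the macro and its arguments come from BLT.

```cmake
# Hypothetical example: 'mylib' and MYLIB_DIR are placeholders, not part of BLT.
blt_import_library( NAME       mylib
                    INCLUDES   ${MYLIB_DIR}/include
                    LIBRARIES  ${MYLIB_DIR}/lib/libmylib.a
                    DEFINES    USE_MYLIB
                    TREAT_INCLUDES_AS_SYSTEM ON
                    EXPORTABLE ON )

# The imported target is consumed through DEPENDS_ON, the same way
# the built-in blt:: targets are.
blt_add_library( NAME       uses_mylib
                 SOURCES    uses_mylib.cpp
                 DEPENDS_ON mylib )
```

Anything that later depends on uses_mylib should pick up the include directories, definitions, and link line of mylib through the dependency graph.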
Note
BLT also supports exporting its third-party targets via the BLT_EXPORT_THIRDPARTY option. See Exporting Targets for more information.
You have already seen one use of DEPENDS_ON for a BLT dependency, gtest, in test_1:
blt_add_executable( NAME       test_1
                    SOURCES    test_1.cpp
                    DEPENDS_ON calc_pi gtest)
MPI
Our next example, test_2, builds and tests the calc_pi_mpi library, which uses MPI to parallelize the calculation over the integration intervals.
To enable MPI, we set ENABLE_MPI, MPI_C_COMPILER, and MPI_CXX_COMPILER in our host config file. Here is a snippet with these settings for LLNL’s Lassen Cluster:
set(ENABLE_MPI ON CACHE BOOL "")
set(MPI_HOME "/usr/tce/packages/mvapich2/mvapich2-2.3.6-${GCC_VERSION}" CACHE PATH "")
set(MPI_C_COMPILER "${MPI_HOME}/bin/mpicc" CACHE PATH "")
set(MPI_CXX_COMPILER "${MPI_HOME}/bin/mpicxx" CACHE PATH "")
set(MPI_Fortran_COMPILER "${MPI_HOME}/bin/mpif90" CACHE PATH "")
set(MPIEXEC "/usr/bin/srun" CACHE PATH "")
set(MPIEXEC_NUMPROC_FLAG "-n" CACHE STRING "")
Here, you can see how calc_pi_mpi and test_2 use DEPENDS_ON:
blt_add_library( NAME       calc_pi_mpi
                 HEADERS    calc_pi_mpi.hpp calc_pi_mpi_exports.h
                 SOURCES    calc_pi_mpi.cpp
                 DEPENDS_ON blt::mpi)
if(WIN32 AND BUILD_SHARED_LIBS)
    target_compile_definitions(calc_pi_mpi PUBLIC WIN32_SHARED_LIBS)
endif()
blt_add_executable( NAME       test_2
                    SOURCES    test_2.cpp
                    DEPENDS_ON calc_pi calc_pi_mpi gtest)
For MPI unit tests, you also need to specify the number of MPI tasks to launch. We use the NUM_MPI_TASKS argument to the blt_add_test macro.
blt_add_test( NAME          test_2
              COMMAND       test_2
              NUM_MPI_TASKS 2) # number of mpi tasks to use
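Since the MPI settings live in the host config, a project that must also build on machines without MPI may want to guard its MPI targets. The following is a minimal sketch of one way to do that, assuming the guard is placed on the ENABLE_MPI option; it reuses the target names from this tutorial and is not necessarily how the tutorial's own CMakeLists.txt is organized.

```cmake
# Sketch: only define MPI-dependent targets when MPI support is turned on.
if(ENABLE_MPI)
    blt_add_library( NAME       calc_pi_mpi
                     HEADERS    calc_pi_mpi.hpp calc_pi_mpi_exports.h
                     SOURCES    calc_pi_mpi.cpp
                     DEPENDS_ON blt::mpi )

    blt_add_executable( NAME       test_2
                        SOURCES    test_2.cpp
                        DEPENDS_ON calc_pi calc_pi_mpi gtest )

    blt_add_test( NAME          test_2
                  COMMAND       test_2
                  NUM_MPI_TASKS 2 )
endif()
```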
As mentioned in Adding Tests, GoogleTest provides a default main() driver that will execute all unit tests defined in the source. To test MPI code, we need to create a main that initializes and finalizes MPI in addition to GoogleTest. test_2.cpp provides an example driver for MPI with GoogleTest.
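// main driver that allows using mpi w/ GoogleTest
#include "gtest/gtest.h"
#include <mpi.h>

int main(int argc, char * argv[])
{
    int result = 0;
    // GoogleTest parses and removes its own command-line flags
    ::testing::InitGoogleTest(&argc, argv);
    MPI_Init(&argc, &argv);
    // run the tests between MPI initialization and finalization
    result = RUN_ALL_TESTS();
    MPI_Finalize();
    return result;
}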
Note
While we have tried to ensure that BLT chooses the correct setup information for MPI, there are several niche cases where the default behavior is insufficient. For these cases, BLT provides the following override variables:
BLT_MPI_COMPILE_FLAGS
BLT_MPI_INCLUDES
BLT_MPI_LIBRARIES
BLT_MPI_LINK_FLAGS
BLT also has the variable ENABLE_FIND_MPI. Setting it to OFF turns off all of CMake’s FindMPI logic; BLT then relies on the MPI compiler wrappers directly, which requires that you provide them as the default compilers.
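As an illustration of the two escape hatches described in this note, here is a hedged host-config sketch. Every path below is a placeholder, and which variables you actually need depends on your MPI installation; typically you would use one approach or the other rather than both.

```cmake
set(ENABLE_MPI ON CACHE BOOL "")

# Approach 1: keep CMake's FindMPI, but override selected results.
# Placeholder paths and flags; set only the variables you need.
set(BLT_MPI_INCLUDES   "/path/to/mpi/include"       CACHE STRING "")
set(BLT_MPI_LIBRARIES  "/path/to/mpi/lib/libmpi.so" CACHE STRING "")
set(BLT_MPI_LINK_FLAGS "-Wl,-rpath,/path/to/mpi/lib" CACHE STRING "")

# Approach 2 (alternative): skip FindMPI and build with the MPI wrappers
# as the default compilers.
set(ENABLE_FIND_MPI    OFF CACHE BOOL "")
set(CMAKE_C_COMPILER   "/path/to/mpi/bin/mpicc"  CACHE PATH "")
set(CMAKE_CXX_COMPILER "/path/to/mpi/bin/mpicxx" CACHE PATH "")
```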
CUDA
Finally, test_3 builds and tests the calc_pi_cuda library, which uses CUDA to parallelize the calculation over the integration intervals.
To enable CUDA, we set ENABLE_CUDA, CMAKE_CUDA_COMPILER, CMAKE_CUDA_ARCHITECTURES, and CUDA_TOOLKIT_ROOT_DIR in our host config file. Also, before enabling the CUDA language in CMake, you need to set CMAKE_CUDA_HOST_COMPILER in CMake 3.9+ or CUDA_HOST_COMPILER in previous versions. If you do not call enable_language(CUDA) yourself, BLT will set the appropriate host compiler variable for you and enable the CUDA language.
Note
The BLT_CXX_STD variable is useful to set the C++ and CUDA language standard to the same level. For example, c++17 will set both to C++17.
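As a small sketch of the note above, the variable can be set in the host-config file alongside the CUDA settings (the choice of C++17 here is just an example value):

```cmake
# Applies one language standard to both the C++ and the CUDA sources.
set(BLT_CXX_STD "c++17" CACHE STRING "")
```

The same value can also be passed on the command line as -DBLT_CXX_STD=c++17.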
Here is a snippet with these settings for LLNL’s Lassen Cluster:
set(ENABLE_CUDA ON CACHE BOOL "")
set(CUDA_TOOLKIT_ROOT_DIR "/usr/tce/packages/cuda/cuda-11.2.0" CACHE PATH "")
set(CMAKE_CUDA_COMPILER "${CUDA_TOOLKIT_ROOT_DIR}/bin/nvcc" CACHE PATH "")
set(CMAKE_CUDA_HOST_COMPILER "${CMAKE_CXX_COMPILER}" CACHE PATH "")
set(CMAKE_CUDA_ARCHITECTURES "70" CACHE STRING "")
set(CMAKE_CUDA_FLAGS "-restrict --expt-extended-lambda -G" CACHE STRING "")
set(CUDA_SEPARABLE_COMPILATION ON CACHE BOOL "" )
Here, you can see how calc_pi_cuda and test_3 use DEPENDS_ON:
blt_add_library( NAME       calc_pi_cuda
                 HEADERS    calc_pi_cuda.hpp calc_pi_cuda_exports.h
                 SOURCES    calc_pi_cuda.cpp
                 DEPENDS_ON blt::cuda)
if(WIN32 AND BUILD_SHARED_LIBS)
    target_compile_definitions(calc_pi_cuda PUBLIC WIN32_SHARED_LIBS)
endif()
blt_add_executable( NAME       test_3
                    SOURCES    test_3.cpp
                    DEPENDS_ON calc_pi calc_pi_cuda gtest)
blt_add_test( NAME    test_3
              COMMAND test_3)
The blt::cuda dependency for calc_pi_cuda is a little special. Along with adding the normal CUDA library and headers to your library or executable, it also tells BLT that this target’s C/C++/CUDA source files need to be compiled via nvcc or cuda-clang. If this is not a requirement, you can use the dependency blt::cuda_runtime, which also adds the CUDA runtime library and headers but will not compile each source file with nvcc.
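To show where blt::cuda_runtime fits, here is a minimal sketch of a library whose sources are plain host C++ that only call the CUDA runtime API. The target and file names (device_query and friends) are hypothetical and not part of the tutorial.

```cmake
# Hypothetical target: host-only C++ that calls CUDA runtime functions
# but contains no device code, so the sources stay with the host compiler.
blt_add_library( NAME       device_query
                 HEADERS    device_query.hpp
                 SOURCES    device_query.cpp
                 DEPENDS_ON blt::cuda_runtime )
```

If a source file in the target does contain kernels or other device code, blt::cuda is the dependency to use instead.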
The host-config snippet shown earlier also sets CMAKE_CUDA_FLAGS and CUDA_SEPARABLE_COMPILATION. Another useful setting when building GoogleTest-based tests with CUDA is:
# nvcc does not like gtest's 'pthreads' flag
set(gtest_disable_pthreads ON CACHE BOOL "")
OpenMP
To enable OpenMP, set ENABLE_OPENMP in your host-config file or before loading SetupBLT.cmake. Once OpenMP is enabled, simply add blt::openmp to your library or executable’s DEPENDS_ON list.
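The two places where the option can be set might look like the following sketch. It assumes BLT_SOURCE_DIR points at your copy of BLT; adjust the include path to match how your project locates BLT.

```cmake
# Option A: in the host-config file passed to CMake with -C
set(ENABLE_OPENMP ON CACHE BOOL "")

# Option B: in the top-level CMakeLists.txt, before BLT is loaded.
# BLT_SOURCE_DIR is assumed to point at your copy of BLT.
set(ENABLE_OPENMP ON CACHE BOOL "")
include(${BLT_SOURCE_DIR}/SetupBLT.cmake)
```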
Here is an example of how to add an OpenMP-enabled executable:
blt_add_executable(NAME       blt_openmp_smoke
                   SOURCES    blt_openmp_smoke.cpp
                   OUTPUT_DIR ${TEST_OUTPUT_DIRECTORY}
                   DEPENDS_ON blt::openmp
                   FOLDER     blt/tests )
Here is an example of how to add an OpenMP-enabled test that sets the number of threads used:
blt_add_test(NAME            blt_openmp_smoke
             COMMAND         blt_openmp_smoke
             NUM_OMP_THREADS 4)
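Because each dependency is just another entry in DEPENDS_ON, combining them in one target is straightforward. The following sketch shows a hybrid MPI + OpenMP executable and test; hybrid_smoke is a hypothetical name, and the example assumes both ENABLE_MPI and ENABLE_OPENMP are ON.

```cmake
# Hypothetical hybrid target; requires ENABLE_MPI and ENABLE_OPENMP to be ON.
blt_add_executable( NAME       hybrid_smoke
                    SOURCES    hybrid_smoke.cpp
                    DEPENDS_ON blt::mpi blt::openmp )

# Launch 2 MPI tasks, each running with 4 OpenMP threads.
blt_add_test( NAME            hybrid_smoke
              COMMAND         hybrid_smoke
              NUM_MPI_TASKS   2
              NUM_OMP_THREADS 4 )
```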
HIP
BLT’s AMD HIP support is very similar to its CUDA support, with one caveat: our HIP support was implemented before CMake had full HIP language support and therefore requires that the HIP compilers be set as the main compilers. This will change soon.
Important Setup Variables
ENABLE_HIP: Enables HIP support in BLT
HIP_ROOT_DIR: Root directory for HIP installation
CMAKE_HIP_ARCHITECTURES: GPU architecture to use when generating HIP/ROCm code
BLT Targets
blt::hip: Adds include directories and HIP runtime libraries, and compiles source with hipcc
blt::hip_runtime: Adds include directories and HIP runtime libraries
Note
The BLT_CXX_STD variable is useful to set the C++ and HIP language standard to the same level. For example, c++17 will set both to C++17.
The following two code snippets show an example of a basic host-config with HIP enabled for the toss_4_x86_64_ib_cray platform:
set(_compiler_root "/opt/rocm-5.6.0/llvm")
set(CMAKE_C_COMPILER "${_compiler_root}/bin/amdclang" CACHE PATH "")
set(CMAKE_CXX_COMPILER "${_compiler_root}/bin/amdclang++" CACHE PATH "")
set(CMAKE_Fortran_COMPILER "${_compiler_root}/bin/amdflang" CACHE PATH "")
set(CMAKE_Fortran_FLAGS "-Mfreeform" CACHE STRING "")
set(ENABLE_FORTRAN ON CACHE BOOL "")
set(_rocm_root "/opt/rocm-5.6.0")
set(ENABLE_HIP ON CACHE BOOL "")
set(HIP_ROOT_DIR "${_rocm_root}/hip" CACHE STRING "")
set(CMAKE_HIP_ARCHITECTURES "gfx90a" CACHE STRING "")
set(CMAKE_EXE_LINKER_FLAGS "-Wl,--disable-new-dtags -L${_rocm_root}/hip/../llvm/lib -L${_rocm_root}/hip/lib -Wl,-rpath,${_rocm_root}/hip/../llvm/lib:${_rocm_root}/hip/lib -lpgmath -lflang -lflangrti -lompstub -lamdhip64 -L${_rocm_root}/hip/../lib64 -Wl,-rpath,${_rocm_root}/hip/../lib64 -L${_rocm_root}/hip/../lib -Wl,-rpath,${_rocm_root}/hip/../lib -lamd_comgr -lhsa-runtime64 " CACHE STRING "")
Here is an example of using the BLT HIP target to create an executable:
blt_add_executable(NAME       blt_hip_smoke
                   SOURCES    blt_hip_smoke.cpp
                   OUTPUT_DIR ${TEST_OUTPUT_DIRECTORY}
                   DEPENDS_ON blt::hip
                   FOLDER     blt/tests )
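To round out the HIP targets listed above, here is a short sketch that registers the smoke executable as a test and uses blt::hip_runtime for a host-only library. The hip_device_query target and its file names are hypothetical, chosen only for illustration.

```cmake
# Register the executable above as a test.
blt_add_test( NAME    blt_hip_smoke
              COMMAND blt_hip_smoke )

# Hypothetical host-only library: it calls the HIP runtime API but has
# no device code, so it only needs the headers and runtime libraries.
blt_add_library( NAME       hip_device_query
                 HEADERS    hip_device_query.hpp
                 SOURCES    hip_device_query.cpp
                 DEPENDS_ON blt::hip_runtime )
```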