External Dependencies¶
One key goal for BLT is to simplify the use of external dependencies when building your libraries and executables. To accomplish this, BLT provides a DEPENDS_ON option for the blt_add_library() and blt_add_executable() macros that supports both CMake targets and external dependencies registered using the blt_register_library() macro.
The blt_register_library() macro allows you to collect all of the information needed for an external dependency under a single name. This includes any include directories, libraries, compile flags, link flags, defines, etc. You can also suppress any warnings generated by the dependency's headers by setting the TREAT_INCLUDES_AS_SYSTEM argument.
For example, to find and register the external dependency axom as a BLT-registered library, you can simply use:

# FindAxom.cmake takes in AXOM_DIR, which is an installed Axom build, and
# sets the variables AXOM_INCLUDES and AXOM_LIBRARIES
include(FindAxom.cmake)
blt_register_library(NAME      axom
                     TREAT_INCLUDES_AS_SYSTEM ON
                     DEFINES   HAVE_AXOM=1
                     INCLUDES  ${AXOM_INCLUDES}
                     LIBRARIES ${AXOM_LIBRARIES})
Then axom is available to be used in the DEPENDS_ON list of subsequent blt_add_executable() or blt_add_library() calls.

This is especially helpful for external libraries that are not built with CMake and don't provide CMake-friendly imported targets. Our ultimate goal is to use blt_register_library() to import all external dependencies as first-class imported CMake targets so we can take full advantage of CMake's dependency lattice. MPI, CUDA, and OpenMP are all registered via blt_register_library(); you can see how in blt/thirdparty_builtin/CMakeLists.txt.
BLT also supports using blt_register_library() to provide additional options for existing CMake targets. The implementation does not modify the properties of the existing targets; it just exposes these options via BLT's support for DEPENDS_ON.
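For instance (a hypothetical sketch, not from the tutorial sources; mylib stands in for any existing CMake target), you could attach an extra define and system-include treatment to everything that depends on the target through BLT:

```cmake
# Hypothetical: "mylib" is an existing CMake target (built in this
# project or imported via find_package). Registering it with BLT does
# not modify the target itself; it only attaches extra options that BLT
# applies to targets listing "mylib" in DEPENDS_ON.
blt_register_library(NAME    mylib
                     DEFINES HAVE_MYLIB=1
                     TREAT_INCLUDES_AS_SYSTEM ON)
```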
blt_register_library
A macro to register external libraries and dependencies with BLT. The named target can be added to the DEPENDS_ON argument of other BLT macros, like blt_add_library() and blt_add_executable().
You have already seen one use of DEPENDS_ON for a BLT-registered dependency in test_1: gtest

blt_add_executable( NAME       test_1
                    SOURCES    test_1.cpp
                    DEPENDS_ON calc_pi gtest)

gtest is the name for the Google Test dependency registered in BLT via blt_register_library(). Even though Google Test is built in and uses CMake, blt_register_library() allows us to easily set defines needed by all dependent targets.
MPI Example¶
Our next example, test_2, builds and tests the calc_pi_mpi library, which uses MPI to parallelize the calculation over the integration intervals.
To enable MPI, we set ENABLE_MPI, MPI_C_COMPILER, and MPI_CXX_COMPILER in our host-config file. Here is a snippet with these settings for LLNL's Surface cluster:
set(ENABLE_MPI ON CACHE BOOL "")
set(MPI_C_COMPILER       "/usr/local/tools/mvapich2-gnu-2.0/bin/mpicc"  CACHE PATH "")
set(MPI_CXX_COMPILER     "/usr/local/tools/mvapich2-gnu-2.0/bin/mpicxx" CACHE PATH "")
set(MPI_Fortran_COMPILER "/usr/local/tools/mvapich2-gnu-2.0/bin/mpif90" CACHE PATH "")
Here, you can see how calc_pi_mpi and test_2 use DEPENDS_ON:
blt_add_library( NAME       calc_pi_mpi
                 HEADERS    calc_pi_mpi.hpp calc_pi_mpi_exports.h
                 SOURCES    calc_pi_mpi.cpp
                 DEPENDS_ON mpi)

if(WIN32 AND BUILD_SHARED_LIBS)
    target_compile_definitions(calc_pi_mpi PUBLIC WIN32_SHARED_LIBS)
endif()

blt_add_executable( NAME       test_2
                    SOURCES    test_2.cpp
                    DEPENDS_ON calc_pi calc_pi_mpi gtest)
For MPI unit tests, you also need to specify the number of MPI tasks to launch. We use the NUM_MPI_TASKS argument to the blt_add_test() macro.
blt_add_test( NAME          test_2
              COMMAND       test_2
              NUM_MPI_TASKS 2) # number of MPI tasks to use
As mentioned in Unit Testing, Google Test provides a default main() driver that will execute all unit tests defined in the source. To test MPI code, we need to create a main() that initializes and finalizes MPI in addition to Google Test. test_2.cpp provides an example driver for MPI with Google Test.
// main driver that allows using MPI with Google Test
int main(int argc, char* argv[])
{
    int result = 0;
    ::testing::InitGoogleTest(&argc, argv);
    MPI_Init(&argc, &argv);
    result = RUN_ALL_TESTS();
    MPI_Finalize();
    return result;
}
Note
While we have tried to ensure that BLT chooses the correct setup information for MPI, there are several niche cases where the default behavior is insufficient. We have provided several available override variables:
BLT_MPI_COMPILE_FLAGS
BLT_MPI_INCLUDES
BLT_MPI_LIBRARIES
BLT_MPI_LINK_FLAGS
BLT also has the variable ENABLE_FIND_MPI; setting it to OFF disables all of CMake's FindMPI logic, and the MPI compiler wrappers are then used directly when you provide them as the default compilers.
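These overrides are plain cache variables, so they can be set in a host-config file like the other settings shown in this tutorial. A hypothetical sketch (all paths and flags below are placeholders, not values from the tutorial):

```cmake
# Hypothetical host-config overrides for when BLT's detected MPI
# settings are insufficient; replace the placeholder paths and flags
# with values for your MPI installation.
set(BLT_MPI_INCLUDES      "/path/to/mpi/include"        CACHE PATH   "")
set(BLT_MPI_LIBRARIES     "/path/to/mpi/lib/libmpi.so"  CACHE PATH   "")
set(BLT_MPI_COMPILE_FLAGS "-pthread"                    CACHE STRING "")
set(BLT_MPI_LINK_FLAGS    "-Wl,-rpath,/path/to/mpi/lib" CACHE STRING "")
```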
CUDA Example¶
Finally, test_3 builds and tests the calc_pi_cuda library, which uses CUDA to parallelize the calculation over the integration intervals.

To enable CUDA, we set ENABLE_CUDA, CMAKE_CUDA_COMPILER, and CUDA_TOOLKIT_ROOT_DIR in our host-config file. Also, before enabling the CUDA language in CMake, you need to set CMAKE_CUDA_HOST_COMPILER in CMake 3.9+ or CUDA_HOST_COMPILER in previous versions. If you do not call enable_language(CUDA) yourself, BLT will set the appropriate host compiler variable for you and enable the CUDA language.
Here is a snippet with these settings for LLNL’s Surface Cluster:
set(ENABLE_CUDA ON CACHE BOOL "")
set(CUDA_TOOLKIT_ROOT_DIR "/opt/cudatoolkit-8.0" CACHE PATH "")
set(CMAKE_CUDA_COMPILER "/opt/cudatoolkit-8.0/bin/nvcc" CACHE PATH "")
set(CMAKE_CUDA_HOST_COMPILER "${CMAKE_CXX_COMPILER}" CACHE PATH "")
set(CUDA_SEPARABLE_COMPILATION ON CACHE BOOL "")
Here, you can see how calc_pi_cuda and test_3 use DEPENDS_ON:
# avoid warnings about sm_20 being deprecated
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS};-arch=sm_30)

blt_add_library( NAME       calc_pi_cuda
                 HEADERS    calc_pi_cuda.hpp calc_pi_cuda_exports.h
                 SOURCES    calc_pi_cuda.cpp
                 DEPENDS_ON cuda)

if(WIN32 AND BUILD_SHARED_LIBS)
    target_compile_definitions(calc_pi_cuda PUBLIC WIN32_SHARED_LIBS)
endif()

blt_add_executable( NAME       test_3
                    SOURCES    test_3.cpp
                    DEPENDS_ON calc_pi calc_pi_cuda gtest cuda_runtime)

blt_add_test( NAME    test_3
              COMMAND test_3)
The cuda dependency for calc_pi_cuda is a little special: along with adding the normal CUDA library and headers to your library or executable, it also tells BLT that this target's C/C++/CUDA source files need to be compiled via nvcc or cuda-clang. If this is not a requirement, you can use the dependency cuda_runtime, which also adds the CUDA runtime library and headers but will not compile each source file with nvcc.
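For instance (a hypothetical sketch following the tutorial's conventions; the target and source names are illustrative), a host-only tool that merely calls into the CUDA runtime can depend on cuda_runtime so its C++ sources are built by the host compiler rather than nvcc:

```cmake
# Hypothetical host-side executable: it links against the CUDA runtime
# and sees its headers, but because it depends on cuda_runtime (not
# cuda), its sources are compiled by the host C++ compiler, not nvcc.
blt_add_executable( NAME       query_devices
                    SOURCES    query_devices.cpp
                    DEPENDS_ON cuda_runtime)
```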
Some other useful CUDA flags are:

# Enable separable compilation of all CUDA files for the given target or all following targets
set(CUDA_SEPARABLE_COMPILATION ON CACHE BOOL "")
set(CUDA_ARCH "sm_60" CACHE STRING "")
set(CMAKE_CUDA_FLAGS "-restrict -arch ${CUDA_ARCH} -std=c++11" CACHE STRING "")
set(CMAKE_CUDA_LINK_FLAGS "-Xlinker -rpath -Xlinker /path/to/mpi" CACHE STRING "")
# Needed when you have CUDA decorations exposed in libraries
set(CUDA_LINK_WITH_NVCC ON CACHE BOOL "")
OpenMP¶
To enable OpenMP, set ENABLE_OPENMP in your host-config file or before loading SetupBLT.cmake. Once OpenMP is enabled, simply add openmp to your library's or executable's DEPENDS_ON list.
Here is an example of how to add an OpenMP-enabled executable:

blt_add_executable( NAME       blt_openmp_smoke
                    SOURCES    blt_openmp_smoke.cpp
                    OUTPUT_DIR ${TEST_OUTPUT_DIRECTORY}
                    DEPENDS_ON openmp
                    FOLDER     blt/tests)
Note
While we have tried to ensure that BLT chooses the correct compile and link flags for OpenMP, there are several niche cases where the default options are insufficient. For example, linking with NVCC requires linking in the OpenMP libraries directly instead of relying on the compile and link flags returned by CMake's FindOpenMP package. An example of this is in host-configs/llnl/blueos_3_ppc64le_ib_p9/clang@upstream_link_with_nvcc.cmake.
We provide two variables to override BLT’s OpenMP flag logic:
BLT_OPENMP_COMPILE_FLAGS
BLT_OPENMP_LINK_FLAGS
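Like the MPI overrides above, these are ordinary cache variables set in a host-config file. A hypothetical sketch (flags and paths are placeholders for your compiler and OpenMP runtime, not values from the tutorial):

```cmake
# Hypothetical overrides for a toolchain where FindOpenMP's flags are
# insufficient (e.g. the nvcc link case above); placeholders only.
set(BLT_OPENMP_COMPILE_FLAGS "-fopenmp"                 CACHE STRING "")
set(BLT_OPENMP_LINK_FLAGS    "-L/path/to/omp/lib -lomp" CACHE STRING "")
```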
Here is an example of how to add an OpenMP-enabled test that sets the number of threads used:

blt_add_test( NAME            blt_openmp_smoke
              COMMAND         blt_openmp_smoke
              NUM_OMP_THREADS 4)
Example Host-configs¶
Here are the full example host-config files for LLNL's Surface, Ray, and Quartz clusters.
llnl-surface-chaos_5_x86_64_ib-gcc@4.9.3.cmake
llnl/blueos_3_ppc64le_ib_p9/clang@upstream_nvcc_xlf
llnl/toss_3_x86_64_ib/gcc@4.9.3.cmake
Note
Quartz does not have GPUs, so CUDA is not enabled in the Quartz host-config.
Here is a full example host-config file for an OSX laptop, using a set of dependencies built with spack.
Building and testing on Surface¶
Here is how you can use the host-config file to configure a build of the calc_pi project with MPI and CUDA enabled on Surface:
# load new cmake b/c default on surface is too old
ml cmake/3.9.2
# create build dir
mkdir build
cd build
# configure using host-config
cmake -C ../../host-configs/other/llnl-surface-chaos_5_x86_64_ib-gcc@4.9.3.cmake \
-DBLT_SOURCE_DIR=../../../../blt ..
After building (make), you can run make test on a batch node (where the GPUs reside) to run the unit tests that use MPI and CUDA:
bash-4.1$ salloc -A <valid bank>
bash-4.1$ make
bash-4.1$ make test
Running tests...
Test project blt/docs/tutorial/calc_pi/build
Start 1: test_1
1/8 Test #1: test_1 ........................... Passed 0.01 sec
Start 2: test_2
2/8 Test #2: test_2 ........................... Passed 2.79 sec
Start 3: test_3
3/8 Test #3: test_3 ........................... Passed 0.54 sec
Start 4: blt_gtest_smoke
4/8 Test #4: blt_gtest_smoke .................. Passed 0.01 sec
Start 5: blt_fruit_smoke
5/8 Test #5: blt_fruit_smoke .................. Passed 0.01 sec
Start 6: blt_mpi_smoke
6/8 Test #6: blt_mpi_smoke .................... Passed 2.82 sec
Start 7: blt_cuda_smoke
7/8 Test #7: blt_cuda_smoke ................... Passed 0.48 sec
Start 8: blt_cuda_runtime_smoke
8/8 Test #8: blt_cuda_runtime_smoke ........... Passed 0.11 sec
100% tests passed, 0 tests failed out of 8
Total Test time (real) = 6.80 sec
Building and testing on Ray¶
Here is how you can use the host-config file to configure a build of the calc_pi project with MPI and CUDA enabled on the blue_os Ray cluster:
# load new cmake b/c default on ray is too old
ml cmake
# create build dir
mkdir build
cd build
# configure using host-config
cmake -C ../../host-configs/llnl/blueos_3_ppc64le_ib_p9/clang@upstream_nvcc_xlf.cmake \
-DBLT_SOURCE_DIR=../../../../blt ..
And here is how to build and test the code on Ray:
bash-4.2$ lalloc 1 -G <valid group>
bash-4.2$ make
bash-4.2$ make test
Running tests...
Test project projects/blt/docs/tutorial/calc_pi/build
Start 1: test_1
1/7 Test #1: test_1 ........................... Passed 0.01 sec
Start 2: test_2
2/7 Test #2: test_2 ........................... Passed 1.24 sec
Start 3: test_3
3/7 Test #3: test_3 ........................... Passed 0.17 sec
Start 4: blt_gtest_smoke
4/7 Test #4: blt_gtest_smoke .................. Passed 0.01 sec
Start 5: blt_mpi_smoke
5/7 Test #5: blt_mpi_smoke .................... Passed 0.82 sec
Start 6: blt_cuda_smoke
6/7 Test #6: blt_cuda_smoke ................... Passed 0.15 sec
Start 7: blt_cuda_runtime_smoke
7/7 Test #7: blt_cuda_runtime_smoke ........... Passed 0.04 sec
100% tests passed, 0 tests failed out of 7
Total Test time (real) = 2.47 sec