mkn.gpu CUDA/HIP C++20 convenience wrappers ====== Whether you are using CUDA or ROCM, we attempt to deduce from available headers. If automatic detection fails, specify appropriate define like `-DMKN_GPU_CUDA=1` See: inc/mkn/gpu/defines.hpp Compile argument switches Key MKN_GPU_CUDA Type bool Default 0 Description activate CUDA as impl of mkn::gpu::* Key MKN_GPU_ROCM Type bool Default 0 Description activate ROCM as impl of mkn::gpu::* Key MKN_GPU_FN_PER_NS Type bool Default 0 Description expose functions explicitly via mkn::gpu::hip::* mkn::gpu::cuda::* Key _MKN_GPU_WARP_SIZE_ Type uint Default use manufacturer provided (eg warpSize), usually 32 Description override use if defined Key _MKN_GPU_THREADED_STREAM_LAUNCHER_WAIT_MS_ Type uint Default 1 Description Initial wait time in milliseconds for polling active jobs for completion Key _MKN_GPU_THREADED_STREAM_LAUNCHER_WAIT_MS_ADD_ Type uint Default 10 Description Additional wait time in milliseconds for polling active jobs for completion when no job is finished. Key _MKN_GPU_THREADED_STREAM_LAUNCHER_WAIT_MS_MAX_ Type uint Default 100 Description Max wait time in milliseconds for polling active jobs for completion when no job is finished. Key MKN_CPU_DO_NOT_DEFINE_DIM3 Type bool Default false Description if true, skips defining dim3 struct which is usually provided by gpu headers