cuBLASLt Library API examples

Description

This folder demonstrates cuBLASLt library API usage.

Samples

  • LtDgemmPresetAlgo

    Sample wrapper that executes a double-precision gemm with a predefined algorithm using cublasLtMatmul. It is nearly a drop-in replacement for cublasDgemm, with the addition of a workspace to support split-K algorithms.

  • LtHSHgemmStridedBatchSimple

    Sample wrapper that executes a mixed-precision gemm with cublasLtMatmul. It is nearly a drop-in replacement for cublasGemmEx, with the addition of a workspace to support split-K algorithms.

  • LtIgemmTensor

    Use cublasLtMatmul to perform tensor-op Igemm with memory order transforms on all buffers.

  • LtPlanarComplex

    Use cublasLtMatmul to perform tensor-op Cgemm using planar complex memory layout and half-precision inputs.

  • LtSgemm

    Sample wrapper that executes a single-precision gemm with cublasLtMatmul. It is nearly a drop-in replacement for cublasSgemm, with the addition of a workspace to support split-K algorithms.

  • LtSgemmCustomFind

    Sample wrapper that runs through multiple combinations of algorithm and configuration attributes for a single-precision gemm using the cublasLt low-level API.

  • LtSgemmSimpleAutoTuning

    Sample wrapper that auto-tunes a single-precision gemm: it queries the cublasLt heuristics for the best candidate algorithms, iterates over the results, and picks the algorithm that has the best performance for the given problem.
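
Several of the samples above mention a workspace that supports split-K algorithms. As an illustration only, the following CPU sketch shows the general idea: the K dimension is cut into slices, each slice writes a partial result into scratch memory (the workspace), and a final pass reduces the partial results. This is not cuBLASLt code and does not reflect how the library implements split-K internally; the function names `gemm_reference` and `gemm_split_k` are hypothetical, and column-major storage is assumed to match the cuBLAS convention.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reference GEMM on the CPU: C = alpha * A * B + beta * C.
// Column-major storage; A is m x k, B is k x n, C is m x n.
void gemm_reference(int m, int n, int k, float alpha,
                    const std::vector<float>& A, const std::vector<float>& B,
                    float beta, std::vector<float>& C) {
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                acc += A[i + p * m] * B[p + j * k];
            }
            C[i + j * m] = alpha * acc + beta * C[i + j * m];
        }
    }
}

// Split-K sketch: the k dimension is cut into `splits` slices. Each slice
// stores its partial m x n product in its own region of `workspace`
// (on a GPU the slices could run concurrently), then a reduction pass
// sums the partial products and applies alpha and beta.
void gemm_split_k(int m, int n, int k, int splits, float alpha,
                  const std::vector<float>& A, const std::vector<float>& B,
                  float beta, std::vector<float>& C,
                  std::vector<float>& workspace) {
    assert(splits > 0);
    const std::size_t tile = static_cast<std::size_t>(m) * n;
    workspace.assign(tile * splits, 0.0f);        // one partial C per slice
    const int chunk = (k + splits - 1) / splits;  // slice length, rounded up

    for (int s = 0; s < splits; ++s) {
        const int p_begin = s * chunk;
        const int p_end = std::min(k, p_begin + chunk);
        for (int j = 0; j < n; ++j) {
            for (int i = 0; i < m; ++i) {
                float acc = 0.0f;
                for (int p = p_begin; p < p_end; ++p) {
                    acc += A[i + p * m] * B[p + j * k];
                }
                workspace[s * tile + i + j * m] = acc;
            }
        }
    }

    // Reduction: combine the partial products into the final result.
    for (std::size_t idx = 0; idx < tile; ++idx) {
        float sum = 0.0f;
        for (int s = 0; s < splits; ++s) {
            sum += workspace[s * tile + idx];
        }
        C[idx] = alpha * sum + beta * C[idx];
    }
}
```

In this sketch the scratch buffer grows with the number of slices (`m * n * splits` floats), which is why a plain gemm signature without a workspace argument cannot accommodate this kind of algorithm. The real workspace size that cuBLASLt needs depends on the algorithm it selects.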

Supported SM Architectures

SM 5.0, SM 5.2, SM 5.3, SM 6.0, SM 6.1, SM 6.2, SM 7.0, SM 7.2, SM 7.5, SM 8.0

Supported OSes

Linux & Windows

Supported CPU Architecture

x86_64

Tested Platforms:

  • Windows 10
  • Ubuntu 18.04

cuBLAS Documentation

cuBLAS

Prerequisites

  • A Linux or Windows system with NVIDIA driver version 450.41 or above
  • CUDA 11.0 toolkit
  • CMake 3.10 or above
  • A compiler that supports C++11 or later

Build Steps

    git clone https://github.com/NVIDIA/CUDALibrarySamples.git
    cd CUDALibrarySamples/cuBLASLt
    mkdir build
    cd build
    cmake .. -DCMAKE_BUILD_TYPE=Release
    make -j

On Windows, instead of running the last build step (make -j), open the Visual Studio solution that CMake created and build it from there.