Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARSE] Update oneMKL backends to match new sparse API #500

Merged
merged 41 commits into from
Sep 6, 2024
Merged
Show file tree
Hide file tree
Changes from 35 commits
Commits
Show all changes
41 commits
Select commit Hold shift + click to select a range
1f53e3e
[SPARSE] Update oneMKL backends to match new sparse API
Rbiessy Jun 7, 2024
3a967be
Do not allow changing data types of dense handles
Rbiessy Jun 7, 2024
9667522
Add check container type is not changed
Rbiessy Jun 7, 2024
c64b57c
Fix is_ptr_accessible_on_host
Rbiessy Jun 7, 2024
02483e7
Check workspace container is compatible with the handles
Rbiessy Jun 7, 2024
5cb4518
Fix example static_cast
Rbiessy Jun 7, 2024
7190c6a
Disallow symmetric/hermitian conjtrans configurations for spmv
Rbiessy Jun 7, 2024
d68155b
Remove enable_if from template instantiations
Rbiessy Jun 13, 2024
31d0a5f
More generic exception message for unimplemented exceptions
Rbiessy Jun 27, 2024
42dba2e
Force at least one element in random sparse matrices
Rbiessy Jun 27, 2024
3436abe
Test more sizes with spmm
Rbiessy Jun 27, 2024
b0ecc40
Use default beta for spmv with long indices
Rbiessy Jun 27, 2024
84fa7f8
Fix nnz in tests resetting data
Rbiessy Jun 27, 2024
a79c5af
Fix invalid accesses in tests
Rbiessy Jun 27, 2024
d452846
Test scalars on device memory
Rbiessy Jun 28, 2024
002d788
Add documentation for alpha and beta limitations
Rbiessy Jul 2, 2024
5c37ee3
Reword and format mkl_handles comments
Rbiessy Jul 2, 2024
9dbd67b
Replace __FUNCTION__ with __func__
Rbiessy Jul 2, 2024
c2b89f5
Allow to access host USM allocations on the host
Rbiessy Jul 3, 2024
2e9fb7c
Merge branch 'develop' into romain/update_sparse_mkl
Rbiessy Jul 4, 2024
d109332
Remove version from known limitations
Rbiessy Jul 4, 2024
9b9548a
Disable spsv symmetric conjtrans
Rbiessy Jul 4, 2024
82566e5
Test symmetric with complex types and hermitian and conjtrans with re…
Rbiessy Jul 9, 2024
0bac3d4
Merge operations in one file
Rbiessy Jul 12, 2024
d04452a
Make get_data_type constexpr
Rbiessy Jul 12, 2024
e8eac87
Remove unused macro TEST_RUN_CT_SELECT
Rbiessy Jul 15, 2024
df584ca
Merge branch 'develop' into romain/update_sparse_mkl
Rbiessy Jul 17, 2024
2a59f22
clang-format
Rbiessy Jul 17, 2024
43f4669
Take string as reference
Rbiessy Jul 18, 2024
2f59edc
Reduce number of calls to get_pointer_type
Rbiessy Jul 19, 2024
cd1a71a
Wait before freeing USM pointers
Rbiessy Jul 29, 2024
95a902a
Move example static_cast
Rbiessy Aug 16, 2024
d76ca03
Make buffer optimize functions asynchronous
Rbiessy Aug 20, 2024
0c34cfc
Remove fill_buffer_to_0
Rbiessy Aug 16, 2024
aff9ee2
format with clang-format-9
Rbiessy Aug 20, 2024
c7a4420
Throw unsupported for spmv using symmetric or hermitian + conjtrans
Rbiessy Aug 26, 2024
5f8e183
Add checks that buffer_size and optimize functions are called before …
Rbiessy Aug 26, 2024
6a533df
clang-format-9
Rbiessy Aug 27, 2024
4f39b22
Move check for incompatible container earlier
Rbiessy Aug 27, 2024
0ecb032
Reword exception
Rbiessy Sep 2, 2024
b6f5a3a
Improve function name in exceptions
Rbiessy Sep 4, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 36 additions & 0 deletions docs/domains/sparse_linear_algebra.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
.. _onemkl_sparse_linear_algebra:

Sparse Linear Algebra
=====================

See the latest specification for the sparse domain `here
<https://oneapi-spec.uxlfoundation.org/specifications/oneapi/latest/elements/onemkl/source/domains/spblas/spblas>`_.

This page documents implementation specific or backend specific details of the
sparse domain.

OneMKL Intel CPU and GPU backends
---------------------------------

Currently known limitations:

- All operations' algorithms except ``no_optimize_alg`` map to the default
algorithm.
- The required external workspace size is always 0 bytes.
- ``oneapi::mkl::sparse::set_csr_data`` and
``oneapi::mkl::sparse::set_coo_data`` functions cannot be used on a handle
that has already been used for an operation or its optimize function. Doing so
will throw an ``oneapi::mkl::unimplemented`` exception.
- Using ``spsv`` with the ``oneapi::mkl::sparse::spsv_alg::no_optimize_alg`` and
a sparse matrix that does not have the
``oneapi::mkl::sparse::matrix_property::sorted`` property will throw an
``oneapi::mkl::unimplemented`` exception.
- Using ``spmm`` on Intel GPU with a sparse matrix that is
``oneapi::mkl::transpose::conjtrans`` and has the
``oneapi::mkl::sparse::matrix_property::symmetric`` property will throw an
``oneapi::mkl::unimplemented`` exception.
- Using ``spsv`` on Intel GPU with a sparse matrix that is
``oneapi::mkl::transpose::conjtrans`` and will throw an
``oneapi::mkl::unimplemented`` exception.
- Scalar parameters ``alpha`` and ``beta`` should be host pointers to prevent
synchronizations and copies to the host.
1 change: 1 addition & 0 deletions docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -33,4 +33,5 @@ Contents

onemkl-datatypes.rst
domains/dense_linear_algebra.rst
domains/sparse_linear_algebra.rst
create_new_backend.rst
Original file line number Diff line number Diff line change
Expand Up @@ -27,8 +27,8 @@ endif()
include(WarningsUtils)

foreach(backend ${SPARSE_BLAS_BACKENDS})
set(EXAMPLE_NAME example_sparse_blas_gemv_usm_${backend})
add_executable(${EXAMPLE_NAME} sparse_blas_gemv_usm_${backend}.cpp)
set(EXAMPLE_NAME example_sparse_blas_spmv_usm_${backend})
add_executable(${EXAMPLE_NAME} sparse_blas_spmv_usm_${backend}.cpp)
target_include_directories(${EXAMPLE_NAME}
PUBLIC ${PROJECT_SOURCE_DIR}/examples/include
PUBLIC ${PROJECT_SOURCE_DIR}/include
Expand All @@ -39,6 +39,6 @@ foreach(backend ${SPARSE_BLAS_BACKENDS})
target_link_libraries(${EXAMPLE_NAME} PRIVATE ONEMKL::SYCL::SYCL onemkl_sparse_blas_${backend})

# Register example as ctest
add_test(NAME sparse_blas/EXAMPLE/CT/sparse_blas_gemv_usm_${backend} COMMAND ${EXAMPLE_NAME})
add_test(NAME sparse_blas/EXAMPLE/CT/sparse_blas_spmv_usm_${backend} COMMAND ${EXAMPLE_NAME})
endforeach(backend)

Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@
/*
*
* Content:
* This example demonstrates use of DPCPP API oneapi::mkl::sparse::gemv
* This example demonstrates use of DPCPP API oneapi::mkl::sparse::spmv
* using unified shared memory to perform general sparse matrix-vector
* multiplication on a INTEL CPU SYCL device.
*
Expand All @@ -32,7 +32,7 @@
*
*
* This example demonstrates only single precision (float) data type for
* gemv matrix data
* spmv matrix data
*
*
*******************************************************************************/
Expand Down Expand Up @@ -77,7 +77,7 @@ int run_sparse_matrix_vector_multiply_example(const sycl::device &cpu_dev) {
}
catch (sycl::exception const &e) {
std::cout << "Caught asynchronous SYCL "
"exception during sparse::gemv:\n"
"exception during sparse::spmv:\n"
<< e.what() << std::endl;
}
}
Expand Down Expand Up @@ -128,7 +128,10 @@ int run_sparse_matrix_vector_multiply_example(const sycl::device &cpu_dev) {
//

oneapi::mkl::transpose transA = oneapi::mkl::transpose::nontrans;
std::cout << "\n\t\tsparse::gemv parameters:\n";
oneapi::mkl::sparse::spmv_alg alg = oneapi::mkl::sparse::spmv_alg::default_alg;
oneapi::mkl::sparse::matrix_view A_view;

std::cout << "\n\t\tsparse::spmv parameters:\n";
std::cout << "\t\t\ttransA = "
<< (transA == oneapi::mkl::transpose::nontrans
? "nontrans"
Expand All @@ -137,23 +140,49 @@ int run_sparse_matrix_vector_multiply_example(const sycl::device &cpu_dev) {
std::cout << "\t\t\tnrows = " << nrows << std::endl;
std::cout << "\t\t\talpha = " << alpha << ", beta = " << beta << std::endl;

// create and initialize handle for a Sparse Matrix in CSR format
oneapi::mkl::sparse::matrix_handle_t handle = nullptr;

oneapi::mkl::sparse::init_matrix_handle(cpu_selector, &handle);

auto ev_set = oneapi::mkl::sparse::set_csr_data(cpu_selector, handle, nrows, nrows, nnz,
oneapi::mkl::index_base::zero, ia, ja, a);

auto ev_opt = oneapi::mkl::sparse::optimize_gemv(cpu_selector, transA, handle, { ev_set });

auto ev_gemv =
oneapi::mkl::sparse::gemv(cpu_selector, transA, alpha, handle, x, beta, y, { ev_opt });

auto ev_release =
oneapi::mkl::sparse::release_matrix_handle(cpu_selector, &handle, { ev_gemv });

ev_release.wait_and_throw();
// Create and initialize handle for a Sparse Matrix in CSR format
oneapi::mkl::sparse::matrix_handle_t A_handle = nullptr;
oneapi::mkl::sparse::init_csr_matrix(cpu_selector, &A_handle, nrows, nrows, nnz,
oneapi::mkl::index_base::zero, ia, ja, a);

// Create and initialize dense vector handles
oneapi::mkl::sparse::dense_vector_handle_t x_handle = nullptr;
oneapi::mkl::sparse::dense_vector_handle_t y_handle = nullptr;
oneapi::mkl::sparse::init_dense_vector(cpu_selector, &x_handle, sizevec, x);
oneapi::mkl::sparse::init_dense_vector(cpu_selector, &y_handle, sizevec, y);

// Create operation descriptor
oneapi::mkl::sparse::spmv_descr_t descr = nullptr;
oneapi::mkl::sparse::init_spmv_descr(cpu_selector, &descr);

// Allocate external workspace
std::size_t workspace_size = 0;
oneapi::mkl::sparse::spmv_buffer_size(cpu_selector, transA, &alpha, A_view, A_handle, x_handle,
&beta, y_handle, alg, descr, workspace_size);
void *workspace = sycl::malloc_device(workspace_size, cpu_queue);
Rbiessy marked this conversation as resolved.
Show resolved Hide resolved

// Optimize spmv
auto ev_opt =
oneapi::mkl::sparse::spmv_optimize(cpu_selector, transA, &alpha, A_view, A_handle, x_handle,
&beta, y_handle, alg, descr, workspace);

Rbiessy marked this conversation as resolved.
Show resolved Hide resolved
// Run spmv
auto ev_spmv = oneapi::mkl::sparse::spmv(cpu_selector, transA, &alpha, A_view, A_handle,
x_handle, &beta, y_handle, alg, descr, { ev_opt });

// Release handles and descriptor
std::vector<sycl::event> release_events;
release_events.push_back(
oneapi::mkl::sparse::release_dense_vector(cpu_selector, x_handle, { ev_spmv }));
release_events.push_back(
oneapi::mkl::sparse::release_dense_vector(cpu_selector, y_handle, { ev_spmv }));
release_events.push_back(
oneapi::mkl::sparse::release_sparse_matrix(cpu_selector, A_handle, { ev_spmv }));
release_events.push_back(
oneapi::mkl::sparse::release_spmv_descr(cpu_selector, descr, { ev_spmv }));
for (auto event : release_events) {
event.wait_and_throw();
}

//
// Post Processing
Expand Down Expand Up @@ -181,7 +210,7 @@ int run_sparse_matrix_vector_multiply_example(const sycl::device &cpu_dev) {
good &= check_result(res[row], z[row], nrows, row);
}

std::cout << "\n\t\t sparse::gemv example " << (good ? "passed" : "failed") << "\n\tFinished"
std::cout << "\n\t\t sparse::spmv example " << (good ? "passed" : "failed") << "\n\tFinished"
<< std::endl;

free_vec(fp_ptr_vec, cpu_queue);
Expand Down Expand Up @@ -211,7 +240,7 @@ void print_example_banner() {
std::cout << "# and alpha, beta are floating point type precision scalars." << std::endl;
std::cout << "# " << std::endl;
std::cout << "# Using apis:" << std::endl;
std::cout << "# sparse::gemv" << std::endl;
std::cout << "# sparse::spmv" << std::endl;
std::cout << "# " << std::endl;
std::cout << "# Using single precision (float) data type" << std::endl;
std::cout << "# " << std::endl;
Expand All @@ -232,22 +261,22 @@ int main(int /*argc*/, char ** /*argv*/) {
// TODO: Add cuSPARSE compile-time dispatcher in this example once it is supported.
sycl::device cpu_dev(sycl::cpu_selector_v);

std::cout << "Running Sparse BLAS GEMV USM example on CPU device." << std::endl;
std::cout << "Running Sparse BLAS SPMV USM example on CPU device." << std::endl;
std::cout << "Device name is: " << cpu_dev.get_info<sycl::info::device::name>()
<< std::endl;
std::cout << "Running with single precision real data type:" << std::endl;

run_sparse_matrix_vector_multiply_example<float, std::int32_t>(cpu_dev);
std::cout << "Sparse BLAS GEMV USM example ran OK." << std::endl;
std::cout << "Sparse BLAS SPMV USM example ran OK." << std::endl;
}
catch (sycl::exception const &e) {
std::cerr << "Caught synchronous SYCL exception during Sparse GEMV:" << std::endl;
std::cerr << "Caught synchronous SYCL exception during Sparse SPMV:" << std::endl;
std::cerr << "\t" << e.what() << std::endl;
std::cerr << "\tSYCL error code: " << e.code().value() << std::endl;
return 1;
}
catch (std::exception const &e) {
std::cerr << "Caught std::exception during Sparse GEMV:" << std::endl;
std::cerr << "Caught std::exception during Sparse SPMV:" << std::endl;
std::cerr << "\t" << e.what() << std::endl;
return 1;
}
Expand Down
2 changes: 1 addition & 1 deletion examples/sparse_blas/run_time_dispatching/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@
include(WarningsUtils)

# Build object from all example sources
set(SPARSE_BLAS_RT_SOURCES "sparse_blas_gemv_usm")
set(SPARSE_BLAS_RT_SOURCES "sparse_blas_spmv_usm")
# Set up for the right backend for run-time dispatching examples
# If users build more than one backend (i.e. mklcpu and mklgpu, or mklcpu and CUDA), they may need to
# overwrite ONEAPI_DEVICE_SELECTOR in their environment to run on the desired backend
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@
/*
*
* Content:
* This example demonstrates use of DPCPP API oneapi::mkl::sparse::gemv
* This example demonstrates use of DPCPP API oneapi::mkl::sparse::spmv
* using unified shared memory to perform general sparse matrix-vector
* multiplication on a SYCL device (HOST, CPU, GPU) that is selected
* during runtime.
Expand All @@ -33,7 +33,7 @@
*
*
* This example demonstrates only single precision (float) data type for
* gemv matrix data
* spmv matrix data
*
*
*******************************************************************************/
Expand Down Expand Up @@ -78,7 +78,7 @@ int run_sparse_matrix_vector_multiply_example(const sycl::device &dev) {
}
catch (sycl::exception const &e) {
std::cout << "Caught asynchronous SYCL "
"exception during sparse::gemv:\n"
"exception during sparse::spmv:\n"
<< e.what() << std::endl;
}
}
Expand All @@ -93,6 +93,7 @@ int run_sparse_matrix_vector_multiply_example(const sycl::device &dev) {
std::size_t sizeja = static_cast<std::size_t>(27 * nrows);
std::size_t sizeia = static_cast<std::size_t>(nrows + 1);
std::size_t sizevec = static_cast<std::size_t>(nrows);
auto sizevec_i64 = static_cast<std::int64_t>(sizevec);

ia = (intType *)sycl::malloc_shared(sizeia * sizeof(intType), main_queue);
ja = (intType *)sycl::malloc_shared(sizeja * sizeof(intType), main_queue);
Expand Down Expand Up @@ -128,7 +129,10 @@ int run_sparse_matrix_vector_multiply_example(const sycl::device &dev) {
//

oneapi::mkl::transpose transA = oneapi::mkl::transpose::nontrans;
std::cout << "\n\t\tsparse::gemv parameters:\n";
oneapi::mkl::sparse::spmv_alg alg = oneapi::mkl::sparse::spmv_alg::default_alg;
oneapi::mkl::sparse::matrix_view A_view;

std::cout << "\n\t\tsparse::spmv parameters:\n";
std::cout << "\t\t\ttransA = "
<< (transA == oneapi::mkl::transpose::nontrans
? "nontrans"
Expand All @@ -137,22 +141,49 @@ int run_sparse_matrix_vector_multiply_example(const sycl::device &dev) {
std::cout << "\t\t\tnrows = " << nrows << std::endl;
std::cout << "\t\t\talpha = " << alpha << ", beta = " << beta << std::endl;

// create and initialize handle for a Sparse Matrix in CSR format
oneapi::mkl::sparse::matrix_handle_t handle = nullptr;

oneapi::mkl::sparse::init_matrix_handle(main_queue, &handle);

auto ev_set = oneapi::mkl::sparse::set_csr_data(main_queue, handle, nrows, nrows, nnz,
oneapi::mkl::index_base::zero, ia, ja, a);

auto ev_opt = oneapi::mkl::sparse::optimize_gemv(main_queue, transA, handle, { ev_set });

auto ev_gemv =
oneapi::mkl::sparse::gemv(main_queue, transA, alpha, handle, x, beta, y, { ev_opt });

auto ev_release = oneapi::mkl::sparse::release_matrix_handle(main_queue, &handle, { ev_gemv });

ev_release.wait_and_throw();
// Create and initialize handle for a Sparse Matrix in CSR format
oneapi::mkl::sparse::matrix_handle_t A_handle = nullptr;
oneapi::mkl::sparse::init_csr_matrix(main_queue, &A_handle, nrows, nrows, nnz,
oneapi::mkl::index_base::zero, ia, ja, a);

// Create and initialize dense vector handles
oneapi::mkl::sparse::dense_vector_handle_t x_handle = nullptr;
oneapi::mkl::sparse::dense_vector_handle_t y_handle = nullptr;
oneapi::mkl::sparse::init_dense_vector(main_queue, &x_handle, sizevec_i64, x);
oneapi::mkl::sparse::init_dense_vector(main_queue, &y_handle, sizevec_i64, y);

// Create operation descriptor
oneapi::mkl::sparse::spmv_descr_t descr = nullptr;
oneapi::mkl::sparse::init_spmv_descr(main_queue, &descr);

// Allocate external workspace
std::size_t workspace_size = 0;
oneapi::mkl::sparse::spmv_buffer_size(main_queue, transA, &alpha, A_view, A_handle, x_handle,
&beta, y_handle, alg, descr, workspace_size);
void *workspace = sycl::malloc_device(workspace_size, main_queue);

// Optimize spmv
auto ev_opt =
oneapi::mkl::sparse::spmv_optimize(main_queue, transA, &alpha, A_view, A_handle, x_handle,
&beta, y_handle, alg, descr, workspace);

// Run spmv
auto ev_spmv = oneapi::mkl::sparse::spmv(main_queue, transA, &alpha, A_view, A_handle, x_handle,
&beta, y_handle, alg, descr, { ev_opt });

// Release handles and descriptor
std::vector<sycl::event> release_events;
release_events.push_back(
oneapi::mkl::sparse::release_dense_vector(main_queue, x_handle, { ev_spmv }));
release_events.push_back(
oneapi::mkl::sparse::release_dense_vector(main_queue, y_handle, { ev_spmv }));
release_events.push_back(
oneapi::mkl::sparse::release_sparse_matrix(main_queue, A_handle, { ev_spmv }));
release_events.push_back(
oneapi::mkl::sparse::release_spmv_descr(main_queue, descr, { ev_spmv }));
for (auto event : release_events) {
event.wait_and_throw();
}

//
// Post Processing
Expand Down Expand Up @@ -180,7 +211,7 @@ int run_sparse_matrix_vector_multiply_example(const sycl::device &dev) {
good &= check_result(res[row], z[row], nrows, row);
}

std::cout << "\n\t\t sparse::gemv example " << (good ? "passed" : "failed") << "\n\tFinished"
std::cout << "\n\t\t sparse::spmv example " << (good ? "passed" : "failed") << "\n\tFinished"
<< std::endl;

free_vec(fp_ptr_vec, main_queue);
Expand Down Expand Up @@ -210,7 +241,7 @@ void print_example_banner() {
std::cout << "# and alpha, beta are floating point type precision scalars." << std::endl;
std::cout << "# " << std::endl;
std::cout << "# Using apis:" << std::endl;
std::cout << "# sparse::gemv" << std::endl;
std::cout << "# sparse::spmv" << std::endl;
std::cout << "# " << std::endl;
std::cout << "# Using single precision (float) data type" << std::endl;
std::cout << "# " << std::endl;
Expand All @@ -234,28 +265,28 @@ int main(int /*argc*/, char ** /*argv*/) {
sycl::device dev = sycl::device();

if (dev.is_gpu()) {
std::cout << "Running Sparse BLAS GEMV USM example on GPU device." << std::endl;
std::cout << "Running Sparse BLAS SPMV USM example on GPU device." << std::endl;
std::cout << "Device name is: " << dev.get_info<sycl::info::device::name>()
<< std::endl;
}
else {
std::cout << "Running Sparse BLAS GEMV USM example on CPU device." << std::endl;
std::cout << "Running Sparse BLAS SPMV USM example on CPU device." << std::endl;
std::cout << "Device name is: " << dev.get_info<sycl::info::device::name>()
<< std::endl;
}
std::cout << "Running with single precision real data type:" << std::endl;

run_sparse_matrix_vector_multiply_example<float, std::int32_t>(dev);
std::cout << "Sparse BLAS GEMV USM example ran OK." << std::endl;
std::cout << "Sparse BLAS SPMV USM example ran OK." << std::endl;
}
catch (sycl::exception const &e) {
std::cerr << "Caught synchronous SYCL exception during Sparse GEMV:" << std::endl;
std::cerr << "Caught synchronous SYCL exception during Sparse SPMV:" << std::endl;
std::cerr << "\t" << e.what() << std::endl;
std::cerr << "\tSYCL error code: " << e.code().value() << std::endl;
return 1;
}
catch (std::exception const &e) {
std::cerr << "Caught std::exception during Sparse GEMV:" << std::endl;
std::cerr << "Caught std::exception during Sparse SPMV:" << std::endl;
std::cerr << "\t" << e.what() << std::endl;
return 1;
}
Expand Down
Loading
Loading