[oneMKL][DFT] Specification for external workspaces for DFTs #509

Merged
merged 12 commits into from
Dec 19, 2023
@@ -0,0 +1,86 @@
.. SPDX-FileCopyrightText: Codeplay Software
..
.. SPDX-License-Identifier: CC-BY-4.0

.. _onemkl_dft_config_workspace_placement:

Workspace placement
--------------------------------------

DFT implementations often require temporary storage for intermediate data whilst computing DFTs.
This temporary storage is referred to as a *workspace*.
Whilst this is managed automatically by default (``config_value::WORKSPACE_AUTOMATIC``),
it may be preferable to provide an external workspace (``config_value::WORKSPACE_EXTERNAL``) for the following reasons:

* To reduce the number of GPU mallocs / frees
* To reduce memory consumption

A typical workflow for using ``config_value::WORKSPACE_EXTERNAL`` is given in the section :ref:`onemkl_dft_typical_usage_of_workspace_external`.
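
As a rough illustration of how an external workspace could address both reasons listed above, the sketch below sizes a single allocation that several committed descriptors might share, provided their compute calls never overlap in time. This sharing pattern is an assumption made for illustration and is not stated on this page. The helper name ``shared_workspace_bytes`` is hypothetical, and the input sizes stand in for values queried from each descriptor.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of oneMKL): given the workspace sizes, in
// bytes, reported by several committed descriptors, return the size of one
// allocation that is large enough for any single one of them.
// Assumes every entry is non-negative.
inline std::int64_t shared_workspace_bytes(const std::vector<std::int64_t>& required) {
    std::int64_t largest = 0;
    for (std::int64_t bytes : required) {
        largest = std::max(largest, bytes);
    }
    return largest;
}
```

Under that assumption, one allocation of the returned size replaces one allocation per descriptor, so the memory needed is the maximum of the individual requirements rather than their sum.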

WORKSPACE_PLACEMENT
+++++++++++++++++++

For ``config_param::WORKSPACE_PLACEMENT``, valid configuration values are ``config_value::WORKSPACE_AUTOMATIC`` and ``config_value::WORKSPACE_EXTERNAL``.

.. container:: section

.. _onemkl_dft_config_value_workspace_automatic:

.. rubric:: WORKSPACE_AUTOMATIC

Contributor:

Is the indentation intentional? It results in funky indents in the html page

Contributor Author:

Yes for _onemkl_dft_config_value_workspace_automatic - otherwise I get WARNING: Content block expected for the "container" directive; none found. The indent of ..rubric: doesn't give any difference in the page that I can see, nor does it affect warnings.

I'm new to writing in restructured text (thank you for your comments!), so do you think I should be formatting this differently? Perhaps I should get rid of the .. container:: sections, in which case no indents are needed?

Contributor:

I do see a slight difference in the sub-title indentation when removing the indent before ..rubric:: on my end, but it might be browser-dependent: I honestly don't know.
I would not claim any expertise in restructured text myself, my own efforts were heavily guided by trial-and-error attempts and very specific web searches. I must say that the purpose of the .. container:: section directives remains very unclear to me. I opted for removing them in my own changes when that did not affect the end result, personally.

Contributor Author:

I've removed the ..container in e139d35, to follow the style of your previous PR. Hopefully this also removes the indentation.


The default value for ``config_param::WORKSPACE_PLACEMENT`` is ``config_value::WORKSPACE_AUTOMATIC``.

When set to ``config_value::WORKSPACE_AUTOMATIC``, the user does not need to provide an external workspace. The workspace will be automatically managed by the backend library.

.. container:: section

.. _onemkl_dft_config_value_workspace_external:

.. rubric:: WORKSPACE_EXTERNAL

The configuration ``config_param::WORKSPACE_PLACEMENT`` can be set to
``config_value::WORKSPACE_EXTERNAL`` to allow the workspace to be set manually.

When a descriptor is committed with ``config_value::WORKSPACE_EXTERNAL`` set,
the user must provide an external workspace.
See :ref:`onemkl_dft_descriptor_set_workspace` and :ref:`onemkl_dft_typical_usage_of_workspace_external`.

.. _onemkl_dft_typical_usage_of_workspace_external:

Typical usage of ``WORKSPACE_EXTERNAL``
+++++++++++++++++++++++++++++++++++++++

Usage of ``WORKSPACE_EXTERNAL`` typically involves the following order of operations:

Contributor:

Suggested change
- Usage of ``WORKSPACE_EXTERNAL`` typically involves the following order of operations:
+ Usage of ``config_value::WORKSPACE_EXTERNAL`` typically involves the following order of operations.

Contributor Author:

Use of namespace done in 749f614.

I've kept the colon as advised in this english stack-exchange question.

Contributor:

Thanks for the stack-exchange reference: it's always helpful for non-native speakers (and appreciated) :-)
My motivation - biased from my own native language - was that list elements following a colon are not supposed to be capitalized and should end with a semi-colon. However, I understand that it is not necessarily transferable to English as your link points out. That being said, I had suggested changes in another bullet list on the same page, above: I'll let you judge what's best in your opinion in either case but my personal preference is to use the same convention for similar cases.


#. ``WORKSPACE_EXTERNAL`` is set for the uncommitted descriptor.
#. The descriptor is committed.
#. The required workspace size is queried.
#. A workspace of sufficient size is provided to the descriptor.
#. Compute functions matching the type of the external workspace provided (buffer or USM) are called.
#. The user is responsible for freeing the external workspace.

This is shown in the following example code:

.. code-block:: cpp

   // Create a descriptor
   mkl::dft::descriptor<mkl::dft::precision::SINGLE, dom> desc(n);
   // 1. Set the workspace placement to WORKSPACE_EXTERNAL
   desc.set_value(mkl::dft::config_param::WORKSPACE_PLACEMENT,
                  mkl::dft::config_value::WORKSPACE_EXTERNAL);
   // Set further configuration parameters
   // ...
   // 2. Commit the descriptor
   desc.commit(myQueue);
   // 3. Query the required workspace size
   std::int64_t workspaceBytes{0};
   desc.get_value(mkl::dft::config_param::WORKSPACE_EXTERNAL_BYTES_REQUIRED, &workspaceBytes);
   // Obtain a sufficiently large USM allocation or buffer. For this example, a USM allocation is used.
   float* workspaceUsm = sycl::malloc_device<float>(workspaceBytes / sizeof(float), myQueue);
   // 4. Set the workspace
   desc.set_workspace(workspaceUsm);
   // 5. Now USM compute functions can be called.
   // ...
   // 6. Once all compute calls using the workspace have completed, free it.
   sycl::free(workspaceUsm, myQueue);
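
The division ``workspaceBytes / sizeof(float)`` in the example truncates, which would under-allocate if the reported byte count were ever not a multiple of ``sizeof(float)``. Whether that can happen is not stated here, so the following sketch simply rounds up defensively. The helper name ``workspace_elements`` is hypothetical and not part of the oneMKL interface.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of oneMKL): the number of elements of type T
// needed to cover at least `bytes` bytes, computed by ceiling division.
// Assumes bytes >= 0.
template <typename T>
constexpr std::size_t workspace_elements(std::int64_t bytes) {
    return (static_cast<std::size_t>(bytes) + sizeof(T) - 1) / sizeof(T);
}
```

The result could be passed as the element count to ``sycl::malloc_device<float>`` in place of the plain division.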


**Parent topic:** :ref:`onemkl_dft_enums`

82 changes: 82 additions & 0 deletions source/elements/oneMKL/source/domains/dft/descriptor.rst
@@ -75,6 +75,9 @@ The ``descriptor`` class is defined in the ``oneapi::mkl::dft`` namespace.
void set_value(oneapi::mkl::dft::config_param param, ...);

void get_value(oneapi::mkl::dft::config_param param, ...);

void set_workspace(sycl::buffer<scalar_type, 1> &workspaceBuf);
void set_workspace(scalar_type* workspaceUSM);

void commit(sycl::queue &queue);

@@ -129,6 +132,8 @@ The ``descriptor`` class is defined in the ``oneapi::mkl::dft`` namespace.
* - :ref:`onemkl_dft_descriptor_get_value`
- Queries the configuration value associated with a particular
configuration parameter.
* - :ref:`onemkl_dft_descriptor_set_workspace`
- Sets the external workspace to use when ``config_param::WORKSPACE_PLACEMENT`` is ``config_value::WORKSPACE_EXTERNAL``.
* - :ref:`onemkl_dft_descriptor_commit`
- Commits the ``descriptor`` object to enqueue the operations relevant
to the (batched) DFT(s) it determines to a given, user-provided
@@ -417,6 +422,83 @@ type ``oneapi::mkl::dft::domain``, ``oneapi::mkl::dft::precision``,

**Descriptor class member table:** :ref:`onemkl_dft_descriptor_member_table`

.. _onemkl_dft_descriptor_set_workspace:

set_workspace
+++++++++++++

Sets the workspace to use when ``config_param::WORKSPACE_PLACEMENT`` is set to ``config_value::WORKSPACE_EXTERNAL``.

.. rubric:: Description

This function sets the external workspace to be used when computing DFTs.
This function may only be called after the descriptor has been committed.
The size of the provided workspace must be equal to or larger than the required
workspace size obtained by calling ``descriptor<prec, dom>::get_value(config_param::WORKSPACE_EXTERNAL_BYTES_REQUIRED, &workspaceBytes)``.

A descriptor for which ``config_value::WORKSPACE_EXTERNAL`` is specified is not valid
for compute calls until this function has been called successfully.

The type of workspace must match the compute calls for which it is used.
That is, if the workspace is provided as a ``sycl::buffer``, the compute
calls must also use ``sycl::buffer`` for their arguments. Likewise, a USM
allocated workspace must only be used with USM compute calls.
Failing to do this will result in an invalid descriptor for compute calls.

If the workspace is a USM allocation, the user must not use it for other purposes
in parallel whilst a DFT ``compute_forward`` or ``compute_backward`` call is in progress.

This function can be called on committed descriptors where the workspace placement
is not ``config_value::WORKSPACE_EXTERNAL``. The provided workspace may or may not
be used in compute calls. However, the aforementioned restrictions will still apply.
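
The validity rules described above can be summarised as a small state model. The sketch below is only an illustration of those rules under one reading of this section; every name in it (``placement``, ``memory_kind``, ``descriptor_state``, ``valid_for_compute``) is hypothetical and none of it is oneMKL API. It also assumes that compute calls require a committed descriptor.

```cpp
// Hypothetical model of the rules above; none of these names are oneMKL API.
enum class placement { automatic, external };
enum class memory_kind { none, buffer, usm };

struct descriptor_state {
    placement where = placement::automatic;      // WORKSPACE_PLACEMENT value
    bool committed = false;                      // has commit() been called?
    memory_kind workspace = memory_kind::none;   // kind passed to set_workspace, if any
};

// Returns true if a compute call taking `call` arguments (buffer or usm) is
// permitted for a descriptor in state `d` under the rules described above.
inline bool valid_for_compute(const descriptor_state& d, memory_kind call) {
    if (!d.committed || call == memory_kind::none) {
        return false;  // assumed: compute calls need a committed descriptor
    }
    if (d.workspace == memory_kind::none) {
        // No workspace was set: acceptable only with automatic placement.
        return d.where == placement::automatic;
    }
    // A workspace was set: its kind must match the compute call, even when
    // placement is automatic and the workspace may end up unused.
    return d.workspace == call;
}
```

For instance, a committed descriptor with external placement is rejected by this model until a workspace is set, and a USM workspace then permits only USM compute calls.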

.. rubric:: Syntax (buffer workspace)

.. code-block:: cpp

   namespace oneapi::mkl::dft {

      template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
      void descriptor<prec,dom>::set_workspace(sycl::buffer<scalar_type, 1> &workspaceBuf);

   }

.. rubric:: Syntax (USM workspace)

.. code-block:: cpp

   namespace oneapi::mkl::dft {

      template <oneapi::mkl::dft::precision prec, oneapi::mkl::dft::domain dom>
      void descriptor<prec,dom>::set_workspace(scalar_type* workspaceUSM);

   }

.. container:: section

.. rubric:: Input Parameters

workspaceBuf
A workspace buffer where ``scalar_type`` is the floating point type according to ``prec``. This buffer must be sufficiently large or an exception will be thrown.

workspaceUSM
A workspace USM allocation where ``scalar_type`` is the floating point type according to ``prec``. This allocation must be accessible on the device on which the descriptor is committed. It is assumed that this USM allocation is sufficiently large. The pointer is expected to be aligned to ``scalar_type``.
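
Because the size and alignment of a USM workspace are assumed rather than checked, a caller may wish to verify the alignment before calling ``set_workspace``. The sketch below shows one way this could be done in plain C++. The helper name ``is_aligned_for`` is hypothetical, and the check only inspects the pointer value, so it never dereferences device memory.

```cpp
#include <cstdint>

// Hypothetical helper (not part of oneMKL): true if `ptr` is non-null and its
// address is a multiple of the alignment required by type T.
template <typename T>
bool is_aligned_for(const void* ptr) {
    return ptr != nullptr &&
           reinterpret_cast<std::uintptr_t>(ptr) % alignof(T) == 0;
}
```

With ``scalar_type`` being ``float``, a call such as ``is_aligned_for<float>(workspaceUSM)`` could guard the call to ``set_workspace``.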

.. container:: section

.. rubric:: Throws

The ``descriptor::set_workspace()`` routine shall throw the following exceptions if the associated condition is detected. An implementation may throw additional implementation-specific exception(s) in case of error conditions not covered here:

:ref:`oneapi::mkl::invalid_argument()<onemkl_exception_invalid_argument>`
If the provided buffer ``workspaceBuf`` is not sufficiently large, or if the provided USM allocation ``workspaceUSM`` is ``nullptr`` when an external workspace of size greater than zero is required.

:ref:`oneapi::mkl::uninitialized()<onemkl_exception_uninitialized>`
If ``set_workspace`` is called before the descriptor is committed.
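
The conditions listed above can be sketched as two precondition checks, one per overload. This is an illustration only: the exception types below are stand-ins defined locally so that the sketch compiles without oneMKL, the function names are hypothetical, and checking the committed state first is an assumption about ordering that this page does not specify.

```cpp
#include <cstdint>
#include <stdexcept>

// Local stand-ins for oneapi::mkl::uninitialized and
// oneapi::mkl::invalid_argument; they are NOT the oneMKL exception classes.
struct uninitialized_error : std::logic_error {
    using std::logic_error::logic_error;
};
struct invalid_argument_error : std::invalid_argument {
    using std::invalid_argument::invalid_argument;
};

// Hypothetical check for the buffer overload: the buffer size is known, so an
// undersized buffer can be rejected.
inline void check_buffer_workspace(bool committed, std::int64_t required_bytes,
                                   std::int64_t provided_bytes) {
    if (!committed) {
        throw uninitialized_error("set_workspace called before commit");
    }
    if (provided_bytes < required_bytes) {
        throw invalid_argument_error("workspace buffer is too small");
    }
}

// Hypothetical check for the USM overload: the allocation size cannot be
// inspected, so only a null pointer can be rejected.
inline void check_usm_workspace(bool committed, std::int64_t required_bytes,
                                const void* workspace) {
    if (!committed) {
        throw uninitialized_error("set_workspace called before commit");
    }
    if (required_bytes > 0 && workspace == nullptr) {
        throw invalid_argument_error("null USM workspace but a workspace is required");
    }
}
```

Note the asymmetry this models: an undersized buffer is detectable, whereas an undersized USM allocation is not and remains the caller's responsibility.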


**Descriptor class member table:** :ref:`onemkl_dft_descriptor_member_table`

.. _onemkl_dft_descriptor_commit:

commit
@@ -142,7 +142,10 @@ the :ref:`descriptor<onemkl_dft_descriptor>` class.
OUTPUT_STRIDES, // deprecated

FWD_DISTANCE,
BWD_DISTANCE,

WORKSPACE_PLACEMENT,
WORKSPACE_EXTERNAL_BYTES_REQUIRED
};

Configuration parameters represented by ``config_param::FORWARD_DOMAIN`` and
@@ -262,6 +265,16 @@ the :ref:`descriptor<onemkl_dft_descriptor>` class.
:math:`M > 1`.
- | ``std::int64_t``
| [0]
* - :ref:`WORKSPACE_PLACEMENT<onemkl_dft_config_workspace_placement>`
- Some FFT algorithm computation steps require a scratch space for permutations or other purposes.
This parameter controls whether this scratch space is automatically allocated or provided by the user.
- | :ref:`onemkl_dft_config_workspace_placement` (possible values are ``config_value::WORKSPACE_AUTOMATIC`` or ``config_value::WORKSPACE_EXTERNAL``).
| [``config_value::WORKSPACE_AUTOMATIC``]
* - WORKSPACE_EXTERNAL_BYTES_REQUIRED
- The required external workspace size in bytes when ``WORKSPACE_PLACEMENT`` is set to :ref:`config_value::WORKSPACE_EXTERNAL<onemkl_dft_config_value_workspace_external>`.
A read-only value.
- | ``std::int64_t``


.. _onemkl_dft_enum_config_value:

@@ -288,6 +301,10 @@ values associated with some
// for config_param::PLACEMENT
INPLACE,
NOT_INPLACE,

// For config_param::WORKSPACE_PLACEMENT
WORKSPACE_AUTOMATIC,
WORKSPACE_EXTERNAL,
};

**Parent topic:** :ref:`onemkl_dft`
@@ -297,3 +314,4 @@ values associated with some

config_params/data_layouts
config_params/storage_formats
config_params/workspace_placement