Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize implementation of cuda::std::conditional_t #2779

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

miscco
Copy link
Collaborator

@miscco miscco commented Nov 12, 2024

The current implementation always instantiates conditional at least once.

However, we can get aways with a total of 2 class instantiations by using the implementation of the internal _If alias template.

This not only makes conditional_t more efficient but also simplifies usage, because no one has to think about what to take

The current implementation always instantiates `conditional` at least once.

However, we can get aways with a total of 2 class instantiations by using the implementation of the internal `_If`

This not only makes `conditional_t` more efficient but also simplifies usage, because no one has to think about what to take
Comment on lines +458 to +461
using BlevBufferSrcsOutT =
typename ::cuda::std::conditional_t<IsMemcpy, const void*, cub::detail::value_t<InputBufferIt>>;
using BlevBufferDstOutT =
typename ::cuda::std::conditional_t<IsMemcpy, void*, cub::detail::value_t<OutputBufferIt>>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
using BlevBufferSrcsOutT =
typename ::cuda::std::conditional_t<IsMemcpy, const void*, cub::detail::value_t<InputBufferIt>>;
using BlevBufferDstOutT =
typename ::cuda::std::conditional_t<IsMemcpy, void*, cub::detail::value_t<OutputBufferIt>>;
using BlevBufferSrcsOutT =
::cuda::std::conditional_t<IsMemcpy, const void*, cub::detail::value_t<InputBufferIt>>;
using BlevBufferDstOutT =
::cuda::std::conditional_t<IsMemcpy, void*, cub::detail::value_t<OutputBufferIt>>;

@@ -178,8 +178,7 @@ struct csr_matrix

void print_internals(std::ostream& out) const
{
out << (HostStorage ? "host" : "device") << "_csr_matrix"
<< "(" << m_num_rows << ", " << m_num_columns << ")\n"
out << (HostStorage ? "host" : "device") << "_csr_matrix" << "(" << m_num_rows << ", " << m_num_columns << ")\n"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder why this changed

struct __conditional_impl<false>
{
template <class _If, class _Else>
using type _CCCL_NODEBUG_ALIAS = _Else;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't much matter in this case, but our metaprogramming utilities like to call this nested alias template __call. then conditional_t could be an alias for __type_call2<__conditional_impl<_Cond>, _If, _Else>.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants