Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Get/Set for vectors and use them to implement Concat* operators for RVV #2362

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

lsrcz
Copy link
Contributor

@lsrcz lsrcz commented Oct 23, 2024

This pull request

  1. implements the Get and Set operators for RVV, which correspond to the __riscv_vget* and __riscv_vset* intrinsics, respectively.
  2. optimizes the Concat* operators with Get and Set.

There are new type categories introduced (_GET_SET, _GET_SET_VIRT, _GET_SET_SMALLEST). The reason why new categories are introduces is:

  1. RVV only provide vget and vset intrinsics for non-fractional vector types as the whole register movement instructions are only available for LMUL >= 1. This corresponds to the _GET_SET category. We cannot simply reuse the _TRUNC category as it contains fractional vector types.
  2. The _GET_SET_VIRT category simulates the intrinsics using instructions other than whole register movement. The result is a register with half in LMUL.
  3. Similar to LowerHalf, it is useful to have Get and Set also for the smallest LMUL for each SEW. This is the _GET_SET_SMALLEST category, and the result would be the same as the original register type for this category.

For the Get operator for fractional vector types, we use Trunc, which is effectively a no-op, to extract the lower part, and we use SlideDown for the upper parts.

For the Set operator for fractional vector types, we use vmv with the tail undisturbed configuration to set the lower part. We use SlideUp for the upper parts.

The Concat operator can then be optimized with Get and Set. For operators other than ConcatLowerUpper, we need a Get and a Set, and the program should be optimal for all SEW/LMUL combinations. For the ConcatLowerUpper operator, we will need two Gets and two Sets. This would be optimal only when the Get operation is always a no-op, meaning that the LMUL for the extracted part must be greater or equal to zero.

@lsrcz lsrcz marked this pull request as draft October 25, 2024 22:48
@johnplatts
Copy link
Contributor

@lsrcz Could you please go ahead and merge the changes that I made in the lsrcz_hwy_concat_fixes_120224 branch of the https://github.com/johnplatts/jep_google_highway repository (which can be found at johnplatts@914cb69) into this pull request?

Here are the git commands for merging the changes in commit johnplatts@914cb69:

git remote add jep_hwy https://github.com/johnplatts/jep_google_highway.git
git fetch jep_hwy lsrcz_hwy_concat_fixes_120224
git merge jep_hwy/lsrcz_hwy_concat_fixes_120224

@lsrcz
Copy link
Contributor Author

lsrcz commented Dec 2, 2024

Hi @johnplatts, thanks for the changes! I am super busy at this time, but I will try to have a look at it as soon as possible.

@lsrcz
Copy link
Contributor Author

lsrcz commented Dec 18, 2024

Hi @johnplatts,

I finally got time to look at it -- thank you for your excellent work!

I think your solution addresses the issue in the original code where partial vectors weren't handled correctly (related to #2345), which I hadn't had time to resolve. It's an elegant approach, and I'll submit the code for review.

@lsrcz lsrcz marked this pull request as ready for review December 18, 2024 16:53
Copy link
Member

@jan-wassenberg jan-wassenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thank you @johnplatts and @lsrcz :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants