Extracting all lanes from a v128 #1529

Open
verbessern opened this issue Sep 16, 2024 · 1 comment

verbessern commented Sep 16, 2024

At the moment (as far as I can see), the only way to extract more than one lane from a v128 value is to do the following:

(local $t v128)
...
  local.tee $t
  f32x4.extract_lane 0
  local.get $t
  f32x4.extract_lane 1
  local.get $t
  f32x4.extract_lane 2
  local.get $t
  f32x4.extract_lane 3
...

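This seems quite inefficient: every lane after the first needs its own local.get plus an extract_lane, which also gives the sequence a large storage footprint. I'm wondering whether there is a need for an extract_all instruction for each lane shape ([f32x4,f64x2,...].extract_all).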
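For reference, here is a minimal sketch of the sequence above as a complete module. It assumes an engine with the SIMD and multi-value features enabled. The names `$extract_all_f32x4` and `demo` are hypothetical and only illustrate what a single extract_all would have to return; they are not part of any proposal.

```wat
(module
  ;; Hypothetical helper: returns all four f32 lanes of a v128
  ;; using only the existing extract_lane instruction.
  (func $extract_all_f32x4 (param $v v128) (result f32 f32 f32 f32)
    local.get $v
    f32x4.extract_lane 0
    local.get $v
    f32x4.extract_lane 1
    local.get $v
    f32x4.extract_lane 2
    local.get $v
    f32x4.extract_lane 3)

  ;; Hypothetical entry point: v128 cannot cross the JS boundary,
  ;; so the vector is built from a constant inside the module.
  (func (export "demo") (result f32 f32 f32 f32)
    v128.const f32x4 1.0 2.0 3.0 4.0
    call $extract_all_f32x4))
```

In the binary format, each f32x4.extract_lane takes three bytes (SIMD prefix, opcode, lane index) and each local.get or local.tee takes two bytes when the local index is below 128, so by my count the original four-lane sequence is about 20 bytes.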

@sunfishcode
Member

Some things that would be useful to motivate this issue are:

  • How often does this sequence occur in realistic applications?
  • What CPU architectures have SIMD instructions that could be used to optimize extract_all better than 4 separate extracts?
