Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opinion on Refactoring Ulysses #6843

Open
Eugene29 opened this issue Dec 9, 2024 · 0 comments
Open

Opinion on Refactoring Ulysses #6843

Eugene29 opened this issue Dec 9, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@Eugene29
Copy link
Contributor

Eugene29 commented Dec 9, 2024

Is your feature request related to a problem? Please describe.
Hi, I am suggesting my ideas on refactoring Ulysses.

  1. post_all2all_func and .permute() seems unnecessary if we opt out for regular all_to_all function instead of all_to_all_single. After all_to_all, we can torch.cat along the gather_idx dimension to concatenate agnostically.
  2. Double nested if loops due to batch_dim_idx makes code very hard to read. I'm thinking that splitting input tensor agnostic to batch_dim_idx and using all_to_all can also alleviate this?

Questions from Commit 17ed7c7

  1. So interestingly, fusing QKV actually degrades performance if concat and slices are necessary?
  2. Do you think you can elaborate on pipelining QKV comm and GEMM? Is this referring to doing GEMM with data that stays during all2all?

Appreciate your feedback!

@Eugene29 Eugene29 added the enhancement New feature or request label Dec 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant