Is your feature request related to a problem? Please describe.
Hi, I am suggesting my ideas on refactoring Ulysses.
`post_all2all_func` and `.permute()` seem unnecessary if we opt for the regular `all_to_all` function instead of `all_to_all_single`. After `all_to_all`, we can `torch.cat` the received chunks along the `gather_idx` dimension, which works regardless of the tensor layout.
The doubly nested `if` statements on `batch_dim_idx` make the code very hard to read. I think splitting the input tensor in a way that is agnostic to `batch_dim_idx` and then using `all_to_all` could also alleviate this?
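
To make the idea concrete, here is a minimal sketch of what I have in mind. It is not the current implementation: the function names and the `scatter_idx` parameter are my own assumptions for illustration, and it assumes every rank holds an input of the same shape whose scatter dimension is divisible by the world size. `simulate_seq_all_to_all` is only a single-process stand-in so the shape logic can be checked without a process group.

```python
import torch
import torch.distributed as dist


def split_for_all_to_all(x, world_size, scatter_idx):
    """Split x into one contiguous chunk per rank along scatter_idx."""
    if x.shape[scatter_idx] % world_size != 0:
        raise ValueError("scatter dimension must be divisible by world size")
    return [c.contiguous() for c in torch.chunk(x, world_size, dim=scatter_idx)]


def seq_all_to_all(x, scatter_idx, gather_idx, group=None):
    """Hypothetical layout-agnostic variant (requires an initialized group).

    No permute or post-processing hook: the list-based all_to_all keeps
    each received chunk in its original layout, so one torch.cat along
    gather_idx is enough.
    """
    world_size = dist.get_world_size(group)
    send = split_for_all_to_all(x, world_size, scatter_idx)
    # Assumes all ranks send equally shaped chunks.
    recv = [torch.empty_like(c) for c in send]
    dist.all_to_all(recv, send, group=group)
    return torch.cat(recv, dim=gather_idx)


def simulate_seq_all_to_all(per_rank_inputs, scatter_idx, gather_idx):
    """Single-process stand-in: returns the per-rank outputs as a list."""
    world_size = len(per_rank_inputs)
    send = [split_for_all_to_all(x, world_size, scatter_idx)
            for x in per_rank_inputs]
    # Rank r receives chunk r from every source rank s, in rank order.
    return [torch.cat([send[s][r] for s in range(world_size)], dim=gather_idx)
            for r in range(world_size)]
```

The same call covers both layouts, for example `scatter_idx=2, gather_idx=1` for `[batch, seq, heads, dim]` and `scatter_idx=2, gather_idx=0` for `[seq, batch, heads, dim]`, so no branching on `batch_dim_idx` is needed. The `.contiguous()` copies and the final `torch.cat` are not free, so whether this matches the performance of `all_to_all_single` would need to be measured.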
Questions from Commit 17ed7c7
Appreciate your feedback!