
Support for Llama-3_1-Nemotron-51B #10669

Open

ymcki wants to merge 1 commit into base: master
Conversation

@ymcki commented Dec 5, 2024

Make sure to read the contributing guidelines before submitting a PR

More details are here:
#10648

It seems like my changes in vocab.py don't actually break the CI tests.

It might be a better idea not to modify vocab.py, but instead ask the user to fix tokenizer_config.json. In that case, you can ignore the changes I made to vocab.py.

@github-actions github-actions bot added the python python script changes label Dec 5, 2024
@bartowski1182 (Contributor)

I wonder if it's also a better idea not to group this with the normal llama archs, since it requires so many changes; it may be better to make it its own model type?

@ymcki (Author) commented Dec 6, 2024

> I wonder if it's also a better idea not to group this with the normal llama archs since it requires so many changes, may be better to make it its own model type?

I think src/llama.cpp doesn't change that much but convert_hf_to_gguf.py does have more changes. Anyway, I can make another fork to make it a separate model type and submit another pull request.

What do you think about the vocab.py problem? Should I just leave the original vocab.py as is and ask people to fix tokenizer_config.json instead?

@ymcki (Author) commented Dec 6, 2024

Created a separate Deci model type. This version doesn't change vocab.py and relies on people manually fixing the 51B model's tokenizer_config.json.

@ymcki (Author) commented Dec 11, 2024

Any updates?

src/llama.cpp Outdated
Comment on lines 11063 to 11065
    if (n_head == 0) // attention-free layer of Llama-3_1-Nemotron-51B
        cur = inpL;
    else {
Owner
Suggested change:

-    if (n_head == 0) // attention-free layer of Llama-3_1-Nemotron-51B
-        cur = inpL;
-    else {
+    if (n_head == 0) { // attention-free layer of Llama-3_1-Nemotron-51B
+        cur = inpL;
+    } else {

src/llama.cpp Outdated
Comment on lines 11076 to 11078
    } else if (n_head > 0)
    // self-attention
    {
Owner
Suggested change:

-    } else if (n_head > 0)
-    // self-attention
-    {
+    } else if (n_head > 0) {
+        // self-attention

@ngxson (Collaborator) commented Dec 12, 2024

To fix the editorconfig / flake8 checks, you need to modify your source code to remove trailing spaces and add a final newline.

And to fix the server CI, you need to merge the latest commits from the master branch.

@ymcki (Author) commented Dec 15, 2024

Can someone approve the workflows?

@ymcki ymcki reopened this Dec 19, 2024
@ymcki (Author) commented Dec 19, 2024

Yay! Finally passed all checks! :)

Labels
python python script changes
4 participants