Support for Llama-3_1-Nemotron-51B #10669
base: master
Conversation
I wonder if it's a better idea not to group this with the normal llama archs. Since it requires so many changes, it may be better to make it its own model type?
I think src/llama.cpp doesn't change that much, but convert_hf_to_gguf.py does have more changes. Anyway, I can make another fork to make it a separate model type and submit another pull request. What do you think about the vocab.py problem? Should I just leave the original vocab.py as is and ask people to fix tokenizer_config.json instead?
Created a separate Deci model type. This version doesn't change vocab.py and relies on people manually fixing the 51B model's tokenizer_config.json.
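For anyone who ends up doing that fix by hand, it only amounts to editing one JSON file. A minimal Python sketch is below; the local path and the field name are placeholders, since the exact entry to correct isn't spelled out in this thread:

```python
# Illustrative only: patch a single entry in tokenizer_config.json by hand.
# The path and the field name below are placeholders, not the actual values
# from the Nemotron-51B release.
import json

path = "Llama-3_1-Nemotron-51B-Instruct/tokenizer_config.json"

with open(path) as f:
    config = json.load(f)

config["some_offending_field"] = "corrected_value"  # placeholder edit

with open(path, "w") as f:
    json.dump(config, f, indent=2, ensure_ascii=False)
```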
Any updates?
src/llama.cpp
Outdated
if (n_head == 0) // attention-free layer of Llama-3_1-Nemotron-51B
    cur = inpL;
else {
Suggested change:

    if (n_head == 0) { // attention-free layer of Llama-3_1-Nemotron-51B
        cur = inpL;
    } else {
src/llama.cpp
Outdated
} else if (n_head > 0)
// self-attention
{
Suggested change:

    } else if (n_head > 0) {
        // self-attention
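For context on why n_head can be zero at all: in Llama-3_1-Nemotron-51B some decoder blocks are attention-free, so the attention sub-layer is skipped and the block input flows straight through to the FFN. A rough Python sketch of the idea (illustrative pseudocode with made-up names, not the actual llama.cpp graph-building code):

```python
# Conceptual sketch only (not llama.cpp code): what the n_head == 0 branch
# means for one decoder block of Llama-3_1-Nemotron-51B.
def decoder_block(x, layer):
    if layer.n_head == 0:
        # attention-free layer: the input passes through unchanged,
        # no attention weights and no KV cache for this block
        attn_out = x
    else:
        # regular block: pre-norm self-attention with a residual connection
        attn_out = x + layer.self_attention(layer.attn_norm(x))
    # the FFN part of the block is applied as usual afterwards
    return attn_out + layer.ffn(layer.ffn_norm(attn_out))
```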
To fix the editorconfig / flake8 checks, you need to modify your source code to remove trailing spaces and add a final newline. And to fix the server CI, you need to merge the latest commits from the master branch.
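If it helps, the whitespace part can be automated with a few lines of Python (purely illustrative, not part of this PR; the file list is just an example):

```python
# Illustrative helper: strip trailing whitespace and make sure each file ends
# with exactly one newline, which is what editorconfig / flake8 complain about.
from pathlib import Path

def fix_whitespace(path: str) -> None:
    text = Path(path).read_text()
    lines = [line.rstrip() for line in text.splitlines()]
    Path(path).write_text("\n".join(lines) + "\n")

fix_whitespace("src/llama.cpp")
fix_whitespace("convert_hf_to_gguf.py")
```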
Can someone approve the workflows?
Yay! Finally passed all checks! :)
Make sure to read the contributing guidelines before submitting a PR
More details are here:
#10648
It seems my changes in vocab.py don't really break the CI tests.
Still, it might be a better idea not to modify vocab.py and instead ask the user to fix tokenizer_config.json. In that case, you can ignore the changes I made to vocab.py.