
Eval bug: llama-3.2-3B got error loading model: done_getting_tensors: wrong number of tensors; expected 255, got 254 #10759

franklyd opened this issue Dec 10, 2024 · 1 comment

@franklyd
Name and Version

version: 3411 (e02b597)
built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu

Operating systems

Linux

GGML backends

CPU

Hardware

NA

Models

llama-3.2-3B

Problem description & steps to reproduce

When I run ./llama-server -m <model_name>, I get the following error:

llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 255, got 254

The model was converted using python convert_hf_to_gguf.py.
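The mismatch reported in the error comes from the GGUF header itself: the file declares how many tensors it contains, and the loader compares that against the number of tensors it actually mapped. As a way to check what the converted file declares, here is a minimal sketch (not part of llama.cpp; the helper name `gguf_tensor_count` is made up for illustration) that reads the GGUF v2/v3 header fields directly:

```python
import struct

def gguf_tensor_count(path):
    # GGUF v2/v3 header layout: magic "GGUF" (4 bytes),
    # version (uint32 LE), tensor_count (uint64 LE), kv_count (uint64 LE).
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError("not a GGUF file")
        (version,) = struct.unpack("<I", f.read(4))
        if version < 2:
            raise ValueError("GGUF v1 uses 32-bit counts; not handled here")
        (tensor_count,) = struct.unpack("<Q", f.read(8))
        return version, tensor_count
```

If the file reports 255 tensors (as in the log above) while an older llama.cpp build only expects 254 for this architecture, that supports the explanation in the comment below that the build predates a newly added tensor.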

First Bad Commit

No response

Relevant log output

llama_model_loader: - type  f32:   58 tensors
llama_model_loader: - type  f16:  197 tensors
llm_load_vocab: special tokens cache size = 256
llm_load_vocab: token to piece cache size = 0.7999 MB
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = BPE
llm_load_print_meta: n_vocab          = 128256
llm_load_print_meta: n_merges         = 280147
llm_load_print_meta: vocab_only       = 0
llm_load_print_meta: n_ctx_train      = 131072
llm_load_print_meta: n_embd           = 3072
llm_load_print_meta: n_layer          = 28
llm_load_print_meta: n_head           = 24
llm_load_print_meta: n_head_kv        = 8
llm_load_print_meta: n_rot            = 128
llm_load_print_meta: n_swa            = 0
llm_load_print_meta: n_embd_head_k    = 128
llm_load_print_meta: n_embd_head_v    = 128
llm_load_print_meta: n_gqa            = 3
llm_load_print_meta: n_embd_k_gqa     = 1024
llm_load_print_meta: n_embd_v_gqa     = 1024
llm_load_print_meta: f_norm_eps       = 0.0e+00
llm_load_print_meta: f_norm_rms_eps   = 1.0e-05
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale    = 0.0e+00
llm_load_print_meta: n_ff             = 8192
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: causal attn      = 1
llm_load_print_meta: pooling type     = 0
llm_load_print_meta: rope type        = 0
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 500000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn  = 131072
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: ssm_d_conv       = 0
llm_load_print_meta: ssm_d_inner      = 0
llm_load_print_meta: ssm_d_state      = 0
llm_load_print_meta: ssm_dt_rank      = 0
llm_load_print_meta: model type       = ?B
llm_load_print_meta: model ftype      = F16
llm_load_print_meta: model params     = 3.21 B
llm_load_print_meta: model size       = 5.98 GiB (16.00 BPW)
llm_load_print_meta: general.name     = 0cb88a4f764b7a12671c53f0838cd831a0843b95
llm_load_print_meta: BOS token        = 128000 '<|begin_of_text|>'
llm_load_print_meta: EOS token        = 128009 '<|eot_id|>'
llm_load_print_meta: LF token         = 128 'Ä'
llm_load_print_meta: EOT token        = 128009 '<|eot_id|>'
llm_load_print_meta: max token length = 256
llm_load_tensors: ggml ctx size =    0.12 MiB
llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 255, got 254
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/0cb88a4f764b7a12671c53f0838cd831a0843b95-3.2B-0cb88a4f764b7a12671c53f0838cd831a0843b95-F16.gguf'
 ERR [              load_model] unable to load model | tid="139850917829632" timestamp=1733839943 model="./models/0cb88a4f764b7a12671c53f0838cd831a0843b95-3.2B-0cb88a4f764b7a12671c53f0838cd831a0843b95-F16.gguf"
Segmentation fault (core dumped)
@bartowski1182 (Contributor)

That is quite an old build; I'd highly recommend updating. I assume this build predates the addition of the rope tensor; it's a common error with old builds.
