
Eval bug: llama-3.2-3B got error loading model: done_getting_tensors: wrong number of tensors; expected 255, got 254 #10759

franklyd opened this issue Dec 10, 2024 · 1 comment

@franklyd
Name and Version

version: 3411 (e02b597)
built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu

Operating systems

Linux

GGML backends

CPU

Hardware

NA

Models

llama-3.2-3B

Problem description & steps to reproduce

When I run ./llama-server -m <model_name>, I get the following error:

llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 255, got 254

The model was converted using python convert_hf_to_gguf.py.
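The mismatch reported in the error comes from the GGUF header itself: the file declares how many tensors it contains, and the loader compares that against the number of tensors it actually mapped. As a way to check what the converted file declares, here is a minimal sketch (not part of llama.cpp; the helper name `gguf_tensor_count` is made up for illustration) that reads the GGUF v2/v3 header fields directly:

```python
import struct

def gguf_tensor_count(path):
    # GGUF v2/v3 header layout: magic "GGUF" (4 bytes),
    # version (uint32 LE), tensor_count (uint64 LE), kv_count (uint64 LE).
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError("not a GGUF file")
        (version,) = struct.unpack("<I", f.read(4))
        if version < 2:
            raise ValueError("GGUF v1 uses 32-bit counts; not handled here")
        (tensor_count,) = struct.unpack("<Q", f.read(8))
        return version, tensor_count
```

If the file reports 255 tensors (as in the log above) while an older llama.cpp build only expects 254 for this architecture, that supports the explanation in the comment below that the build predates a newly added tensor.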

First Bad Commit

No response

Relevant log output

llama_model_loader: - type  f32:   58 tensors
llama_model_loader: - type  f16:  197 tensors
llm_load_vocab: special tokens cache size = 256
llm_load_vocab: token to piece cache size = 0.7999 MB
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = BPE
llm_load_print_meta: n_vocab          = 128256
llm_load_print_meta: n_merges         = 280147
llm_load_print_meta: vocab_only       = 0
llm_load_print_meta: n_ctx_train      = 131072
llm_load_print_meta: n_embd           = 3072
llm_load_print_meta: n_layer          = 28
llm_load_print_meta: n_head           = 24
llm_load_print_meta: n_head_kv        = 8
llm_load_print_meta: n_rot            = 128
llm_load_print_meta: n_swa            = 0
llm_load_print_meta: n_embd_head_k    = 128
llm_load_print_meta: n_embd_head_v    = 128
llm_load_print_meta: n_gqa            = 3
llm_load_print_meta: n_embd_k_gqa     = 1024
llm_load_print_meta: n_embd_v_gqa     = 1024
llm_load_print_meta: f_norm_eps       = 0.0e+00
llm_load_print_meta: f_norm_rms_eps   = 1.0e-05
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale    = 0.0e+00
llm_load_print_meta: n_ff             = 8192
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: causal attn      = 1
llm_load_print_meta: pooling type     = 0
llm_load_print_meta: rope type        = 0
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 500000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn  = 131072
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: ssm_d_conv       = 0
llm_load_print_meta: ssm_d_inner      = 0
llm_load_print_meta: ssm_d_state      = 0
llm_load_print_meta: ssm_dt_rank      = 0
llm_load_print_meta: model type       = ?B
llm_load_print_meta: model ftype      = F16
llm_load_print_meta: model params     = 3.21 B
llm_load_print_meta: model size       = 5.98 GiB (16.00 BPW)
llm_load_print_meta: general.name     = 0cb88a4f764b7a12671c53f0838cd831a0843b95
llm_load_print_meta: BOS token        = 128000 '<|begin_of_text|>'
llm_load_print_meta: EOS token        = 128009 '<|eot_id|>'
llm_load_print_meta: LF token         = 128 'Ä'
llm_load_print_meta: EOT token        = 128009 '<|eot_id|>'
llm_load_print_meta: max token length = 256
llm_load_tensors: ggml ctx size =    0.12 MiB
llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 255, got 254
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/0cb88a4f764b7a12671c53f0838cd831a0843b95-3.2B-0cb88a4f764b7a12671c53f0838cd831a0843b95-F16.gguf'
 ERR [              load_model] unable to load model | tid="139850917829632" timestamp=1733839943 model="./models/0cb88a4f764b7a12671c53f0838cd831a0843b95-3.2B-0cb88a4f764b7a12671c53f0838cd831a0843b95-F16.gguf"
Segmentation fault (core dumped)
@bartowski1182 (Contributor)

That is quite an old build; I'd highly recommend updating. I assume this build predates the addition of the rope tensor; it's a common error with old builds.
