llama-run : improve progress bar (#10821)
Set the default progress bar width to the width of the terminal. Also fixed a
small bug around the default n_gpu_layers value.

Signed-off-by: Eric Curtin <[email protected]>
ericcurtin authored Dec 19, 2024
1 parent 9177484 commit 7909e85
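
The gist of the width change is to query the terminal for its column count and size the progress bar to match. As a minimal sketch of that technique on POSIX systems (an illustration of the general approach only, not the code from this commit), using ioctl(TIOCGWINSZ) with a fallback when stdout is not a terminal:

```cpp
#include <sys/ioctl.h>
#include <unistd.h>

#include <algorithm>
#include <cstdio>
#include <string>

// Query the terminal width, falling back to 80 columns when stdout
// is not a terminal (e.g. when output is piped).
static int get_terminal_width() {
    winsize ws{};
    if (isatty(STDOUT_FILENO) && ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) == 0 && ws.ws_col > 0) {
        return ws.ws_col;
    }
    return 80;
}

// Redraw a progress bar sized to the full terminal width.
static void draw_progress(float fraction) {
    const int width     = get_terminal_width();
    const int bar_width = std::max(width - 8, 10); // leave room for " 100%"
    const int filled    = static_cast<int>(bar_width * fraction);

    std::string bar(static_cast<size_t>(filled), '#');
    bar.resize(static_cast<size_t>(bar_width), ' ');
    std::printf("\r[%s] %3d%%", bar.c_str(), static_cast<int>(fraction * 100.0f));
    std::fflush(stdout);
}

int main() {
    for (int i = 0; i <= 100; ++i) {
        draw_progress(i / 100.0f);
        usleep(20 * 1000); // simulate work
    }
    std::printf("\n");
    return 0;
}
```

On Windows, the equivalent query would go through GetConsoleScreenBufferInfo rather than ioctl.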
Showing 3 changed files with 304 additions and 126 deletions.
README.md (2 changes: 1 addition & 1 deletion)
@@ -448,7 +448,7 @@ To learn more about model quantization, [read this documentation](examples/quant
</details>
-[^3]: [https://github.com/containers/ramalama](RamaLama)
+[^3]: [RamaLama](https://github.com/containers/ramalama)
## [`llama-simple`](examples/simple)
examples/run/README.md (10 changes: 6 additions & 4 deletions)
@@ -4,7 +4,7 @@ The purpose of this example is to demonstrate a minimal usage of llama.cpp for r

```bash
llama-run granite-code
...
```

```bash
llama-run -h
@@ -19,6 +19,8 @@ Options:
Context size (default: 2048)
-n, --ngl <value>
Number of GPU layers (default: 0)
+-v, --verbose, --log-verbose
+Set verbosity level to infinity (i.e. log all messages, useful for debugging)
-h, --help
Show help message

@@ -42,6 +44,6 @@ Examples:
llama-run https://example.com/some-file1.gguf
llama-run some-file2.gguf
llama-run file://some-file3.gguf
-llama-run --ngl 99 some-file4.gguf
-llama-run --ngl 99 some-file5.gguf Hello World
+llama-run --ngl 999 some-file4.gguf
+llama-run --ngl 999 some-file5.gguf Hello World
```
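
The commit message also notes a small fix around the default n_gpu_layers value. As a hedged sketch of the general pattern only (the Options struct and parsing loop below are hypothetical, not taken from llama-run's sources), an explicit member default keeps -n/--ngl at 0 when the flag is absent, matching the help text above:

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical option struct; the field and flag names mirror the help
// text above but are not llama-run's actual internals.
struct Options {
    int n_gpu_layers = 0; // explicit default, matching "default: 0" above
};

static Options parse_args(const std::vector<std::string> & args) {
    Options opt;
    for (size_t i = 0; i < args.size(); ++i) {
        if ((args[i] == "-n" || args[i] == "--ngl") && i + 1 < args.size()) {
            opt.n_gpu_layers = std::atoi(args[++i].c_str());
        }
    }
    return opt;
}

int main(int argc, char ** argv) {
    const Options opt = parse_args({argv + 1, argv + argc});
    std::printf("n_gpu_layers = %d\n", opt.n_gpu_layers);
    return 0;
}
```

Run without -n the sketch prints the default 0; with "-n 999" it follows the flag, which is the behavior the help text documents.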