informatique:ai_lm:gpu_bench
Differences
Below are the differences between two revisions of the page.
| informatique:ai_lm:gpu_bench [11/06/2026 13:46] – [Avec vrai PCIe ✅] cyrille | informatique:ai_lm:gpu_bench [25/06/2026 18:18] (current version) – [Nemotron-Cascade-2-30B-A3B] cyrille | ||
|---|---|---|---|
| Line 338: | Line 338: | ||
| build: e25a32e98 (9584) | build: e25a32e98 (9584) | ||
| + | </ | ||
| + | |||
| + | === gemma-4-26B-A4B-it-qat-UD-Q4_K_XL === | ||
| + | |||
| + | < | ||
| + | prompt eval time = | ||
| + | eval time = 1338.88 ms / 86 tokens ( 15.57 ms per token, | ||
| + | total time = 1657.05 ms / 251 tokens | ||
| + | | ||
| + | stop processing: n_tokens = 20931, truncated = 0 | ||
| + | |||
| + | prompt eval time = 3143.73 ms / 4850 tokens ( 0.65 ms per token, | ||
| + | eval time = | ||
| + | total time = | ||
| + | | ||
| + | stop processing: n_tokens = 27604, truncated = 0 | ||
| </ | </ | ||
| === Qwen3-Coder-30B-A3B-Instruct-Q4_K_M === | === Qwen3-Coder-30B-A3B-Instruct-Q4_K_M === | ||
| + | |||
| + | I tried some '' | ||
| < | < | ||
| $ ./ | $ ./ | ||
| - | llama_bench: | + | llama_bench: |
| </ | </ | ||
| + | |||
| + | === Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL === | ||
| + | |||
| + | I tried some '' | ||
| < | < | ||
| - | exec llama-server \ | + | $ ./ |
| - | -m Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf \ | + | |
| - | --host 0.0.0.0 --port 8012 \ | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | -c 96000 | + | |
| - | common_params_print_info: build 9584 (e25a32e98) with GNU 15.2.0 for Linux x86_64 | + | ggml_cuda_init: found 1 CUDA devices |
| - | log_info: verbosity = 4 (adjust with the `-lv N` CLI arg) | + | |
| - | device_info: | + | | model |
| - | | + | | ------------------------------ |
| - | - CPU : Intel(R) Core(TM) Ultra 7 270K Plus (93508 MiB, 93508 MiB free) | + | llama_bench: error: failed |
| - | system_info: | + | </code> |
| - | srv llama_server: | + | |
| - | ... | + | |
| - | common_params_fit_impl: | + | |
| - | common_params_fit_impl: | + | |
| - | common_params_fit_impl: | + | |
| - | common_fit_params: successfully fit params to free device memory | + | |
| - | common_fit_params: fitting params | + | |
| - | llama_model_loader: | + | |
| - | ... | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | ... | + | |
| - | llama_context: | + | |
| - | llama_context: | + | |
| - | llama_kv_cache: | + | |
| - | llama_kv_cache: | + | |
| - | ... | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | ... | + | |
| - | srv load_model: prompt cache is enabled, size limit: 8192 MiB | + | |
| - | ... | + | |
| - | srv init: init: chat template, thinking = 0 | + | |
| - | srv llama_server: | + | |
| - | srv llama_server: | + | |
| - | srv update_slots: | + | |
| - | $ nvidia-smi | + | === Nemotron-Cascade-2-30B-A3B === |
| - | +-----------------------------------------------------------------------------------------+ | + | |
| - | | NVIDIA-SMI 595.71.05 | + | |
| - | +-----------------------------------------+------------------------+----------------------+ | + | |
| - | | GPU Name | + | |
| - | | Fan Temp | + | |
| - | | | + | |
| - | |=========================================+========================+======================| | + | |
| - | | | + | |
| - | | 0% | + | |
| - | | | + | |
| - | +-----------------------------------------+------------------------+----------------------+ | + | |
| - | +-----------------------------------------------------------------------------------------+ | + | I tried some '' |
| - | | Processes: | | + | |
| - | | GPU | + | < |
| - | | | + | $ ./ |
| - | |=========================================================================================| | + | ggml_cuda_init: |
| - | | | + | |
| - | +-----------------------------------------------------------------------------------------+ | + | | model |
| + | | ------------------------------ | ||
| + | llama_bench: | ||
| </ | </ | ||
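
The `prompt eval time` and `eval time` lines in the gemma-4-26B-A4B-it-qat-UD-Q4_K_XL log above are cut off before the tokens-per-second figure. As a minimal sketch, assuming the lines keep the `<label> = <ms> ms / <n> tokens` layout visible in the surviving fragments, the throughput can be recomputed from the two numbers that remain. The helper name `parse_timing` and the regular expression are illustrative and are not part of llama.cpp.

```python
import re

# Matches timing lines of the form seen in the log above, e.g.
# "eval time = 1338.88 ms / 86 tokens ( 15.57 ms per token,"
# Assumption: the label is either "prompt eval time" or "eval time".
TIMING_RE = re.compile(
    r"(?P<label>prompt eval time|eval time)\s*=\s*"
    r"(?P<ms>\d+(?:\.\d+)?)\s*ms\s*/\s*(?P<tokens>\d+)\s*tokens"
)


def parse_timing(line):
    """Return label, ms per token and tokens per second for one log line.

    Returns None when the line is truncated or does not match the layout.
    """
    m = TIMING_RE.search(line)
    if m is None:
        return None
    ms = float(m.group("ms"))
    tokens = int(m.group("tokens"))
    if ms == 0 or tokens == 0:
        return None  # avoid division by zero on degenerate lines
    return {
        "label": m.group("label"),
        "ms_per_token": ms / tokens,
        "tokens_per_second": tokens * 1000.0 / ms,
    }


if __name__ == "__main__":
    lines = [
        "eval time = 1338.88 ms / 86 tokens ( 15.57 ms per token,",
        "prompt eval time = 3143.73 ms / 4850 tokens ( 0.65 ms per token,",
        "prompt eval time =",  # truncated line, skipped
    ]
    for line in lines:
        result = parse_timing(line)
        if result is not None:
            print(f"{result['label']}: "
                  f"{result['ms_per_token']:.2f} ms/token, "
                  f"{result['tokens_per_second']:.1f} tokens/s")
```

With the two intact lines from the log, this gives roughly 64 tokens/s for generation and roughly 1543 tokens/s for prompt processing. Lines whose values were lost in the page extraction are skipped and not guessed.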
informatique/ai_lm/gpu_bench.1781178414.txt.gz · Last modified: by cyrille
