informatique:ai_lm:gpu_bench
Differences
Below are the differences between two revisions of the page.
| Both previous revisions · Previous revision · Next revision | Previous revision | ||
| informatique:ai_lm:gpu_bench [17/06/2026 13:52] – [Qwen3-Coder-30B-A3B-Instruct-Q4_K_M] cyrille | informatique:ai_lm:gpu_bench [25/06/2026 18:18] (Current version) – [Nemotron-Cascade-2-30B-A3B] cyrille | ||
|---|---|---|---|
| Line 357: | Line 357: | ||
| === Qwen3-Coder-30B-A3B-Instruct-Q4_K_M === | === Qwen3-Coder-30B-A3B-Instruct-Q4_K_M === | ||
| + | |||
| + | I tried some '' | ||
| < | < | ||
| $ ./ | $ ./ | ||
| - | llama_bench: | + | llama_bench: |
| </ | </ | ||
| + | |||
| + | === Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL === | ||
| + | |||
| + | I tried some '' | ||
| < | < | ||
| - | exec llama-server \ | + | $ ./ |
| - | -m Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf \ | + | |
| - | --host 0.0.0.0 --port 8012 \ | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | -c 96000 | + | |
| - | common_params_print_info: build 9584 (e25a32e98) with GNU 15.2.0 for Linux x86_64 | + | ggml_cuda_init: found 1 CUDA devices |
| - | log_info: verbosity = 4 (adjust with the `-lv N` CLI arg) | + | |
| - | device_info: | + | | model |
| - | | + | | ------------------------------ |
| - | - CPU : Intel(R) Core(TM) Ultra 7 270K Plus (93508 MiB, 93508 MiB free) | + | llama_bench: error: failed |
| - | system_info: | + | </code> |
| - | srv llama_server: | + | |
| - | ... | + | |
| - | common_params_fit_impl: | + | |
| - | common_params_fit_impl: | + | |
| - | common_params_fit_impl: | + | |
| - | common_fit_params: successfully fit params to free device memory | + | |
| - | common_fit_params: fitting params | + | |
| - | llama_model_loader: | + | |
| - | ... | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | load_tensors: | + | |
| - | ... | + | |
| - | llama_context: | + | |
| - | llama_context: | + | |
| - | llama_kv_cache: | + | |
| - | llama_kv_cache: | + | |
| - | ... | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | sched_reserve: | + | |
| - | ... | + | |
| - | srv load_model: prompt cache is enabled, size limit: 8192 MiB | + | |
| - | ... | + | |
| - | srv init: init: chat template, thinking = 0 | + | |
| - | srv llama_server: | + | |
| - | srv llama_server: | + | |
| - | srv update_slots: | + | |
| - | $ nvidia-smi | + | === Nemotron-Cascade-2-30B-A3B === |
| - | +-----------------------------------------------------------------------------------------+ | + | |
| - | | NVIDIA-SMI 595.71.05 | + | |
| - | +-----------------------------------------+------------------------+----------------------+ | + | |
| - | | GPU Name | + | |
| - | | Fan Temp | + | |
| - | | | + | |
| - | |=========================================+========================+======================| | + | |
| - | | | + | |
| - | | 0% | + | |
| - | | | + | |
| - | +-----------------------------------------+------------------------+----------------------+ | + | |
| - | +-----------------------------------------------------------------------------------------+ | + | I tried some '' |
| - | | Processes: | | + | |
| - | | GPU | + | < |
| - | | | + | $ ./ |
| - | |=========================================================================================| | + | ggml_cuda_init: |
| - | | | + | |
| - | +-----------------------------------------------------------------------------------------+ | + | | model |
| + | | ------------------------------ | ||
| + | llama_bench: | ||
| </ | </ | ||
informatique/ai_lm/gpu_bench.1781697141.txt.gz · Last modified: by cyrille
