informatique:ai_lm
Differences
The differences between two revisions of the page are shown below.
| informatique:ai_lm [30/04/2026 16:10] – [Compiling for GPU] cyrille | informatique:ai_lm [03/05/2026 11:02] (current version) – [NanoLLM] cyrille | ||
|---|---|---|---|
| Line 51: | Line 51: | ||
| * https:// | * https:// | ||
| - | ==== Estimates ==== | + | * [[/ | ||
| - | **Devstral with llama.cpp on an RTX 3060 12 GB.** | ||
| - | |||
| - | by ChatGPT: | ||
| - | |||
| - | | Model
| - | | ----------------- | ------------------ | --------------------- | ---------------------------------------- | | ||
| - | | Devstral Small 7B | 1024 | 4 | Very safe, ample VRAM | | ||
| - | | Devstral Small 7B | 2048 | 2‑3 | Good trade-off speed/ | ||
| - | | Devstral Small 7B | 4096 | 1‑2 | VRAM almost saturated | ||
| - | | Devstral 13B | 1024 | 2 | Limited VRAM | ||
| - | | Devstral 13B | 2048 | 1‑2 | Optimal, watch VRAM | | ||
| - | | Devstral 13B | 4096 | 1 | VRAM saturated, CPU offload recommended | ||
| - | | Devstral 13B | 8192 | 1 | Possible, but long context → OOM risk | | ||
| - | |||
| - | by LeChat: | ||
| - | |||
| - | | context (tokens) | model (parameters) | Estimated VRAM (GB) | Optimal batch size | Estimated throughput (tok/s) | Notes | | ||
| - | | 512 | 7B | ~5.5 | 8 | 15-25 | Ideal for short, fast tasks. | | ||
| - | | 1024 | 7B | ~6.0 | 4 | 10-20 | Good trade-off for medium-length prompts. | | ||
| - | | 2048 | 7B | ~7.0 | 2 | 5-15 | Requires careful VRAM management. | | ||
| - | | 4096 | 7B | ~8.5 | 1 | 3-10 | Close to the VRAM limit, risk of slowdown. | | ||
| - | | 512 | 13B | ~9.0 | 4 | 8-15 | Larger model, higher latency. | | ||
| - | | 1024 | 13B | ~10.0 | 2 | 4-10 | VRAM almost saturated, reduced batch_size. | | ||
| - | | 2048 | 13B | ~11.5 | 1 | 2-8 | High risk of exceeding VRAM, high latency. | | ||
| ==== Online services ==== | ==== Online services ==== | ||
| Line 339: | Line 315: | ||
| Linux OneApi toolkit | Linux OneApi toolkit | ||
| * https:// | * https:// | ||
| + | * 71 packages, 2.3 GB | ||
| + | * Re-read https:// | ||
| + | |||
| + | By default '' | ||
| + | |||
| + | intel-oneapi-ccl-2022.0 intel-oneapi-ccl-devel intel-oneapi-ccl-devel-2022.0 intel-oneapi-common-licensing intel-oneapi-common-licensing-2026.0 | ||
| + | intel-oneapi-common-oneapi-vars intel-oneapi-common-oneapi-vars-2026.0 intel-oneapi-common-vars intel-oneapi-compiler-cpp-eclipse-cfg-2026.0 | ||
| + | intel-oneapi-compiler-dpcpp-cpp intel-oneapi-compiler-dpcpp-cpp-2026.0 intel-oneapi-compiler-dpcpp-cpp-common-2026.0 | ||
| + | intel-oneapi-compiler-dpcpp-cpp-runtime-2026.0 intel-oneapi-compiler-dpcpp-eclipse-cfg-2026.0 intel-oneapi-compiler-fortran-2026.0 | ||
| + | intel-oneapi-compiler-fortran-common-2026.0 intel-oneapi-compiler-fortran-runtime-2026.0 intel-oneapi-compiler-shared-2026.0 | ||
| + | intel-oneapi-compiler-shared-common-2026.0 intel-oneapi-compiler-shared-runtime-2026.0 intel-oneapi-dev-utilities intel-oneapi-dev-utilities-2026.0 | ||
| + | intel-oneapi-dev-utilities-eclipse-cfg-2026.0 intel-oneapi-dnnl-2026.0 intel-oneapi-dnnl-devel intel-oneapi-dnnl-devel-2026.0 | ||
| + | intel-oneapi-dpcpp-cpp-2026.0 intel-oneapi-dpcpp-debugger-2026.0 intel-oneapi-icc-eclipse-plugin-cpp-2026.0 intel-oneapi-ipp-2026.0 | ||
| + | intel-oneapi-ipp-devel intel-oneapi-ipp-devel-2026.0 intel-oneapi-ippcp-2026.0 intel-oneapi-ippcp-devel intel-oneapi-ippcp-devel-2026.0 | ||
| + | intel-oneapi-libdpstd-devel-2022.12 intel-oneapi-mkl-classic-devel-2026.0 intel-oneapi-mkl-classic-include-2026.0 intel-oneapi-mkl-cluster-2026.0 | ||
| + | intel-oneapi-mkl-cluster-devel-2026.0 intel-oneapi-mkl-core-2026.0 intel-oneapi-mkl-core-devel-2026.0 intel-oneapi-mkl-devel | ||
| + | intel-oneapi-mkl-devel-2026.0 intel-oneapi-mkl-sycl-2026.0 intel-oneapi-mkl-sycl-blas-2026.0 intel-oneapi-mkl-sycl-data-fitting-2026.0 | ||
| + | intel-oneapi-mkl-sycl-devel-2026.0 intel-oneapi-mkl-sycl-dft-2026.0 intel-oneapi-mkl-sycl-include-2026.0 intel-oneapi-mkl-sycl-lapack-2026.0 | ||
| + | intel-oneapi-mkl-sycl-rng-2026.0 intel-oneapi-mkl-sycl-sparse-2026.0 intel-oneapi-mkl-sycl-stats-2026.0 intel-oneapi-mkl-sycl-vm-2026.0 | ||
| + | intel-oneapi-mpi-2021.18 intel-oneapi-mpi-devel intel-oneapi-mpi-devel-2021.18 intel-oneapi-openmp-2026.0 intel-oneapi-openmp-common-2026.0 | ||
| + | intel-oneapi-tbb-2023.0 intel-oneapi-tbb-devel intel-oneapi-tbb-devel-2023.0 intel-oneapi-tcm-1.5 intel-oneapi-tlt intel-oneapi-tlt-2026.0 | ||
| + | intel-oneapi-toolkit intel-oneapi-toolkit-env-2026.0 intel-oneapi-toolkit-getting-started-2026.0 intel-oneapi-umf-1.1 intel-oneapi-vtune | ||
| + | |||
| + | |||
| + | < | ||
| + | $ source / | ||
| + | $ sycl-ls | ||
| + | [opencl: | ||
| + | </ | ||
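The check above can be scripted. This is a minimal sketch: `has_sycl_gpu` is a hypothetical helper name, and the sketch assumes that `sycl-ls` tags GPU devices with `:gpu` in each listing line, which may vary between oneAPI versions. It only reports what it finds and does not source any environment script itself.

```shell
#!/bin/sh
# Hypothetical helper: reads a device listing on stdin and succeeds
# if at least one line carries a ":gpu" device tag (assumed format).
has_sycl_gpu() {
    grep -qi ':gpu'
}

# Report the state of the SYCL setup without aborting the shell.
if command -v sycl-ls >/dev/null 2>&1; then
    if sycl-ls | has_sycl_gpu; then
        echo "sycl-ls lists a GPU device"
    else
        echo "sycl-ls runs but lists no GPU device"
    fi
else
    echo "sycl-ls not found: the oneAPI environment is probably not sourced"
fi
```

A listing with no GPU line, or a missing `sycl-ls`, suggests the environment or the driver needs checking before trying the prebuilt binaries.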
| + | |||
| + | In fact this does not work, because | ||
| + | |||
| + | <code bash> | ||
| + | $ ./ | ||
| + | ./ | ||
| + | |||
| + | # Version problem 😩 | ||
| + | $ find / | ||
| + | / | ||
| + | / | ||
| + | / | ||
| + | / | ||
| + | / | ||
| + | / | ||
| + | / | ||
| + | / | ||
| + | </ | ||
| + | |||
| + | OK, moving on to compiling from source as explained at https:// | ||
| + | |||
| + | <code bash> | ||
| + | ./ | ||
| + | </ | ||
| + | |||
| + | It compiles without errors, but ... " | ||
| + | |||
| + | < | ||
| + | $ ./ | ||
| + | # same with | ||
| + | $ ./ | ||
| + | |||
| + | [New LWP 35410] | ||
| + | [New LWP 35409] | ||
| + | [New LWP 35408] | ||
| + | [New LWP 35407] | ||
| + | [New LWP 35406] | ||
| + | [New LWP 35405] | ||
| + | [New LWP 35404] | ||
| + | [New LWP 35403] | ||
| + | [New LWP 35402] | ||
| + | [New LWP 35401] | ||
| + | [New LWP 35400] | ||
| + | [New LWP 35399] | ||
| + | [New LWP 35398] | ||
| + | [New LWP 35397] | ||
| + | [New LWP 35396] | ||
| + | |||
| + | This GDB supports auto-downloading debuginfo from the following URLs: | ||
| + | < | ||
| + | Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal] | ||
| + | Debuginfod has been disabled. | ||
| + | ... | ||
| + | Using host libthread_db library "/ | ||
| + | 0x000079304a910813 in __GI___wait4 (pid=35411, stat_loc=0x0, | ||
| + | warning: 30 ../ | ||
| + | #0 0x000079304a910813 in __GI___wait4 (pid=35411, stat_loc=0x0, | ||
| + | 30 in ../ | ||
| + | #1 0x000079304e48aa1a in ggml_print_backtrace () from / | ||
| + | #2 0x000079304e4a3d76 in ggml_uncaught_exception() () from / | ||
| + | #3 0x000079304acbb0da in ?? () from / | ||
| + | #4 0x000079304aca5a55 in std:: | ||
| + | #5 0x000079304acbb391 in __cxa_throw () from / | ||
| + | #6 0x000079304b19e765 in dpct:: | ||
| + | #7 0x000079304b16e8f3 in ggml_backend_sycl_print_sycl_devices () from / | ||
| + | #8 0x0000000000405527 in main () | ||
| + | [Inferior 1 (process 35394) detached] | ||
| + | terminate called after throwing an instance of ' | ||
| + | what(): | ||
| + | PLEASE submit a bug report to https:// | ||
| + | Abandon (core dumped) | ||
| + | </ | ||
| + | |||
| + | After a reboot it works. Performance: 2.6 times faster than without SYCL (36.34 vs 13.94). | ||
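The speedup is simply the ratio of the two measured figures, 36.34 and 13.94 (the source does not state their unit). A minimal sketch of the arithmetic, where `speedup` is a hypothetical helper name and not part of llama.cpp:

```shell
#!/bin/sh
# Hypothetical helper: ratio of two measurements, rounded to two decimals.
# $1: figure measured with SYCL, $2: figure measured without SYCL.
speedup() {
    # LC_ALL=C keeps the decimal separator a dot regardless of the locale.
    LC_ALL=C awk -v a="$1" -v b="$2" \
        'BEGIN { if (b == 0) exit 1; printf "%.2f\n", a / b }'
}

speedup 36.34 13.94
```

The same helper can be reused to compare later runs, as long as both figures come from the same benchmark and settings.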
| ==== ollama ==== | ==== ollama ==== | ||
| Line 372: | Line 451: | ||
| * https:// | * https:// | ||
| * https:// | * https:// | ||
| - | |||
| Todo | Todo | ||
| * [[https:// | * [[https:// | ||
| + | ==== ZML ==== | ||
| + | |||
| + | https:// | ||
informatique/ai_lm.1777558223.txt.gz · Last modified: by cyrille
