informatique:egpu
Differences
Below are the differences between two revisions of the page.
| informatique:egpu [08/02/2026 10:29] – [eGpu] cyrille | informatique:egpu [24/04/2026 11:52] (Current version) – [Update 2026-04] cyrille | ||
|---|---|---|---|
| Line 44: | Line 44: | ||
| In the end, only small models with a small context can be run ... | In the end, only small models with a small context can be run ... | ||
| + | |||
| + | ===== Update 2026-04 ===== | ||
| + | |||
| + | New attempt with the RTX 5060 Ti | ||
| + | |||
| + | Adding the source '' | ||
| + | |||
| + | < | ||
| + | sudo add-apt-repository ppa: | ||
| + | sudo apt update | ||
| + | > ... nvidia-driver-595-open ... | ||
| + | sudo apt upgrade | ||
| + | > ... Building initial module nvidia/ | ||
| + | # Oops, remember to remove version 590 | ||
| + | sudo apt purge nvidia-utils-590 nvidia-driver-590-open nvidia-dkms-590-open nvidia-compute-utils-590 | ||
| + | </ | ||
| + | |||
| + | After the | ||
| + | |||
| + | |||
| + | Plugging in the RTX over Thunderbolt (THB) | ||
| + | < | ||
| + | kernel: thunderbolt 0-1: new device found, vendor=0x215 device=0x41 | ||
| + | kernel: thunderbolt 0-1: TB4 HOME TB4 eGFX | ||
| + | boltd[1096]: | ||
| + | ... | ||
| + | kernel: nvidia: loading out-of-tree module taints kernel. | ||
| + | kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel | ||
| + | kernel: nvidia-nvlink: | ||
| + | kernel: | ||
| + | kernel: nvidia 0000: | ||
| + | kernel: nvidia 0000: | ||
| + | kernel: NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64 | ||
| + | systemd[2149]: | ||
| + | kernel: nvidia-modeset: | ||
| + | kernel: [drm] [nvidia-drm] [GPU ID 0x00000500] Loading driver | ||
| + | kernel: [drm] Initialized nvidia-drm 0.0.0 for 0000: | ||
| + | kernel: nvidia 0000: | ||
| + | systemd[1]: Starting nvidia-persistenced.service - NVIDIA Persistence Daemon... | ||
| + | nvidia-persistenced[4123]: | ||
| + | nvidia-persistenced[4123]: | ||
| + | nvidia-persistenced[4123]: | ||
| + | nvidia-persistenced[4123]: | ||
| + | nvidia-persistenced[4123]: | ||
| + | systemd[1]: Started nvidia-persistenced.service - NVIDIA Persistence Daemon. | ||
| + | boltd[1120]: | ||
| + | ... | ||
| + | </ | ||
| + | |||
| + | Test with a fresh llama.cpp build, compiled with CUDA_ARCHITECTURES=120 and CUDA 12.9. | ||
| + | * 🚀 A few questions in the llama.cpp chat: OK | ||
| + | * 🚀 Code refactoring with '' | ||
| + | * 😩 Detection on a loop of | ||
| + | * **Xid 79 "GPU has fallen off the bus"** | ||
| + | |||
| + | < | ||
| + | kernel: NVRM: GPU at PCI: | ||
| + | kernel: NVRM: GPU Board Serial Number: 0 | ||
| + | kernel: NVRM: Xid (PCI: | ||
| + | kernel: NVRM: GPU 0000: | ||
| + | kernel: NVRM: GPU 0000: | ||
| + | kernel: NVRM: GPU0 krcRcAndNotifyAllChannels_IMPL: | ||
| + | kernel: NVRM: GPU0 _threadNodeCheckTimeout: | ||
| + | ... | ||
| + | kernel: NVRM: GPU0 _issueRpcAndWait: | ||
| + | kernel: NVRM: GPU0 nvCheckOkFailedNoLog: | ||
| + | ... | ||
| + | </ | ||
| + | |||
| + | So I am moving on to the new attempt suggested at [[https:// | ||
| ===== Nvidia ===== | ===== Nvidia ===== | ||
| Ligne 54: | Ligne 124: | ||
| * nvidia_uvm: Unified Virtual Memory (UVM) support | * nvidia_uvm: Unified Virtual Memory (UVM) support | ||
| * nvidia_drm: Direct Rendering Management (DRM) support | * nvidia_drm: Direct Rendering Management (DRM) support | ||
| + | |||
| + | RTX series: | ||
| + | * 30 (Ampere) | ||
| + | * 40 (Ada) | ||
| + | * 50 (Blackwell) | ||
| Line 70: | Line 145: | ||
| The RTX 3060 works well with version 580 '' | The RTX 3060 works well with version 580 '' | ||
| - | ==== nvidia-headless-575-open ==== | ||
| - | |||
| - | < | ||
| - | $ sudo apt install nvidia-headless-575-open | ||
| - | The following NEW packages will be installed: | ||
| - | libnvidia-cfg1-575 libnvidia-compute-575 libnvidia-decode-575 libnvidia-gpucomp-575 nvidia-compute-utils-575 nvidia-dkms-575-open nvidia-firmware-575 nvidia-headless-575-open nvidia-headless-no-dkms-575-open nvidia-kernel-common-575 nvidia-kernel-source-575-open nvidia-persistenced | ||
| - | </ | ||
| - | |||
| - | < | ||
| - | ggml_cuda_init: | ||
| - | </ | ||
| - | |||
| - | ==== nvidia-uvm ==== | ||
| - | |||
| - | < | ||
| - | $ modinfo nvidia-uvm | ||
| - | |||
| - | filename: | ||
| - | version: | ||
| - | supported: | ||
| - | license: | ||
| - | srcversion: | ||
| - | depends: | ||
| - | name: | ||
| - | retpoline: | ||
| - | vermagic: | ||
| - | sig_id: | ||
| - | signer: | ||
| - | sig_key: | ||
| - | sig_hashalgo: | ||
| - | signature: | ||
| - | 8F: | ||
| - | F5: | ||
| - | 8D: | ||
| - | B0: | ||
| - | F3: | ||
| - | F4: | ||
| - | 38: | ||
| - | 0B: | ||
| - | FE: | ||
| - | 91: | ||
| - | FD: | ||
| - | E4: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | parm: | ||
| - | |||
| - | $ systool -m nvidia_uvm -v | ||
| - | |||
| - | Module = " | ||
| - | Attributes: | ||
| - | coresize | ||
| - | initsize | ||
| - | initstate | ||
| - | refcnt | ||
| - | srcversion | ||
| - | taint = " | ||
| - | uevent | ||
| - | version | ||
| - | Parameters: | ||
| - | uvm_ats_mode | ||
| - | uvm_block_cpu_to_cpu_copy_with_ce= " | ||
| - | uvm_channel_gpfifo_loc= " | ||
| - | uvm_channel_gpput_loc= " | ||
| - | uvm_channel_num_gpfifo_entries= " | ||
| - | uvm_channel_pushbuffer_loc= " | ||
| - | uvm_conf_computing_channel_iv_rotation_limit= " | ||
| - | uvm_cpu_chunk_allocation_sizes= " | ||
| - | uvm_debug_enable_push_acquire_info= " | ||
| - | uvm_debug_enable_push_desc= " | ||
| - | uvm_debug_prints | ||
| - | uvm_disable_hmm | ||
| - | uvm_downgrade_force_membar_sys= " | ||
| - | uvm_enable_builtin_tests= " | ||
| - | uvm_enable_debug_procfs= " | ||
| - | uvm_enable_va_space_mm= " | ||
| - | uvm_exp_gpu_cache_peermem= " | ||
| - | uvm_exp_gpu_cache_sysmem= " | ||
| - | uvm_fault_force_sysmem= " | ||
| - | uvm_force_prefetch_fault_support= " | ||
| - | uvm_global_oversubscription= " | ||
| - | uvm_leak_checker | ||
| - | uvm_page_table_location= " | ||
| - | uvm_peer_copy | ||
| - | uvm_perf_access_counter_batch_count= " | ||
| - | uvm_perf_access_counter_migration_enable= " | ||
| - | uvm_perf_access_counter_threshold= " | ||
| - | uvm_perf_fault_batch_count= " | ||
| - | uvm_perf_fault_coalesce= " | ||
| - | uvm_perf_fault_max_batches_per_service= " | ||
| - | uvm_perf_fault_max_throttle_per_service= " | ||
| - | uvm_perf_fault_replay_policy= " | ||
| - | uvm_perf_fault_replay_update_put_ratio= " | ||
| - | uvm_perf_map_remote_on_eviction= " | ||
| - | uvm_perf_map_remote_on_native_atomics_fault= " | ||
| - | uvm_perf_migrate_cpu_preunmap_block_order= " | ||
| - | uvm_perf_migrate_cpu_preunmap_enable= " | ||
| - | uvm_perf_pma_batch_nonpinned_order= " | ||
| - | uvm_perf_prefetch_enable= " | ||
| - | uvm_perf_prefetch_min_faults= " | ||
| - | uvm_perf_prefetch_threshold= " | ||
| - | uvm_perf_reenable_prefetch_faults_lapse_msec= " | ||
| - | uvm_perf_thrashing_enable= " | ||
| - | uvm_perf_thrashing_epoch= " | ||
| - | uvm_perf_thrashing_lapse_usec= " | ||
| - | uvm_perf_thrashing_max_resets= " | ||
| - | uvm_perf_thrashing_nap= " | ||
| - | uvm_perf_thrashing_pin= " | ||
| - | uvm_perf_thrashing_pin_threshold= " | ||
| - | uvm_perf_thrashing_threshold= " | ||
| - | uvm_release_asserts = " | ||
| - | uvm_release_asserts_dump_stack= " | ||
| - | uvm_release_asserts_set_global_error= " | ||
| - | </ | ||
| - | |||
| - | The RTX 5060 Ti crash occurs later if '' | ||
| - | |||
| - | ==== RTX series ==== | ||
| - | |||
| - | * 30 (Ampere) | ||
| - | * 40 (Ada) | ||
| - | * 50 (Blackwell) | ||
| Line 274: | Line 177: | ||
| I bought a certified Thunderbolt cable (€50) to replace the one supplied with the | I bought a certified Thunderbolt cable (€50) to replace the one supplied with the | ||
| + | |||
| + | Still crashing with NVIDIA driver 590, CUDA 13.1 and the modprobe configuration. | ||
| + | |||
| === nvidia-dkms-565 === | === nvidia-dkms-565 === | ||
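
The Xid 79 "GPU has fallen off the bus" event noted in the Update 2026-04 section can be spotted quickly by filtering the kernel log. The sketch below is only an illustration: the function names `list_xid_codes` and `has_xid_79` are hypothetical, and the line shape `NVRM: Xid (PCI:...): <code>, ...` is an assumption based on the usual driver message format, since the log lines quoted on this page are truncated.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any NVIDIA tool): they only filter text on stdin.
# Assumed line shape: "NVRM: Xid (PCI:<address>): <code>, <details>"

# Print each distinct Xid code found on stdin, one per line, in numeric order.
list_xid_codes() {
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\).*/\1/p' | sort -nu
}

# Exit with status 0 when stdin contains at least one Xid 79 event.
has_xid_79() {
    list_xid_codes | grep -qx 79
}

# Example usage on a live system (reading the kernel log may require privileges):
#   journalctl -k --no-pager | list_xid_codes
#   dmesg | has_xid_79 && echo "Xid 79 seen: GPU has fallen off the bus"
```

Keeping the filters on stdin means the same functions work on `journalctl -k`, on `dmesg`, or on a saved log file from an earlier crash.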
informatique/egpu.1770542967.txt.gz · Last modified: by cyrille
