This hasn’t happened to me yet but I was just thinking about it. Let’s say you have a server with an iGPU, and you use GPU passthrough to let VMs use the iGPU. And then one day the host’s ssh server breaks, maybe you did something stupid or there was a bad update. Are you fucked? How could you possibly recover, with no display and no SSH? The only thing I can think of is setting up serial access for emergencies like this, but I rarely hear about serial access nowadays so I wonder if there’s some other solution here.
Boot to live disk.
Edit vmconfig to not start at boot.
Mount vmdisk to live disk
Fix ssh
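A rough sketch of the steps above, assuming the host uses libvirt (which keeps its autostart entries as symlinks under /etc/libvirt/qemu/autostart) and that the host root partition is /dev/sda2. Both are assumptions, not something stated in this thread, and the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: meant to be run as root from a live environment.
# Assumes a libvirt host; other hypervisors store autostart settings elsewhere.

# Remove libvirt autostart symlinks under a mounted host root.
# $1 = mount point of the host's root filesystem (e.g. /mnt)
disable_vm_autostart() {
    autostart_dir="$1/etc/libvirt/qemu/autostart"
    # Nothing to do if the host has no autostart directory.
    [ -d "$autostart_dir" ] || return 0
    for link in "$autostart_dir"/*.xml; do
        # Only remove symlinks; the VM definitions themselves are left untouched.
        if [ -L "$link" ]; then
            rm -- "$link"
        fi
    done
    return 0
}

# Typical use from the live session (device name is hypothetical):
#   mount /dev/sda2 /mnt
#   disable_vm_autostart /mnt
#   ...edit /mnt/etc/ssh/sshd_config or whatever broke sshd...
#   umount /mnt
```

The VMs stay defined, they just won’t start on the next boot. If the GPU is grabbed by vfio at host boot rather than at VM start, you would also need to undo that setting while the disk is mounted.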
As mentioned in another reply, this doesn’t work if you have an encrypted disk. The price for security, I suppose.
Edit: never mind, I thought that secure boot and disk encryption would prevent you from mounting the disk on another system, but that appears to be wrong.
I passthrough a GPU (no iGPU on this mobo).
It only hijacks the GPU when I start the VM, for which I haven’t configured autostart.
Before the VM is started it’s showing the host prompt. It doesn’t return to the prompt if the VM is shutdown or crashed, but a reboot would, hence not autostarting that VM.
If it got borked too much, putting in a temporary GPU might be easier. Also, don’t break your ssh.
Pretty easy with PKI auth.
It only hijacks the GPU when I start the VM
How did you do this? All the tutorials I read hijack the GPU at startup. Do you have to manually detach the GPU from the host before assigning it to the VM?
Interesting.
I’m not doing anything special that wasn’t in one of the popular tutorials and I thought that’s how it was supposed to work, although it might very well be a “bug” how it behaves right now.
I don’t know enough about this, but the drivers are blacklisted on the host at boot, yet the console is still displayed through the GPU’s HDMI at that time which might depend on the specific GPU (a vega64 in my case).
The host doesn’t have a graphical desktop environment, just the shell.
the drivers are blacklisted on the host at boot
This is the problem I was alluding to, though I’m surprised you are still able to see the console despite the driver being blacklisted. I have heard of people using scripts to manually detach the GPU and attach it to a VM, but sounds like you don’t need that, which is interesting
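For reference, the manual detach/attach scripts people mention can be sketched roughly like this, assuming libvirt’s virsh is what manages the VMs. The function names and the PCI address in the usage lines are hypothetical; look up your own address with lspci -D.

```shell
#!/bin/sh
# Sketch: hand a PCI device to a VM on demand instead of at boot (libvirt assumed).

# Convert a PCI address such as 0000:0b:00.0 into libvirt's node device name.
pci_to_nodedev() {
    printf 'pci_%s\n' "$(printf '%s' "$1" | tr ':.' '__')"
}

# Detach the device from its host driver so a VM can claim it.
gpu_detach() {
    virsh nodedev-detach "$(pci_to_nodedev "$1")"
}

# Give the device back to the host after the VM has stopped.
gpu_reattach() {
    virsh nodedev-reattach "$(pci_to_nodedev "$1")"
}

# Example (address is hypothetical):
#   gpu_detach 0000:0b:00.0
#   virsh start my-vm
#   ...later, once the VM is shut down...
#   gpu_reattach 0000:0b:00.0
```

libvirt can also do the detach and reattach by itself when the VM’s hostdev entry is marked managed='yes', which may be why no script is needed in some setups. Whether the host console actually comes back after a reattach seems to depend on the GPU and driver.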
Live boot, plug in a display?
Maybe I’m missing something here, but won’t booting from live media run a normal environment?
If you don’t have a live boot option you can also pull the disk and fix it on another machine, or put a different boot disk in the system entirely.
You can probably also disable hardware virtualization extensions in the bios to break the VM so it doesn’t steal the graphics card.
A rescue iso doesn’t work if you have an encrypted disk. I thought everybody encrypted their disks nowadays.
If you don’t have a live boot option you can also pull the disk and fix it on another machine, or put a different boot disk in the system entirely.
This is an interesting idea though, as long as the other machine has a different GPU then the system shouldn’t hijack it on startup.
You can probably also disable hardware virtualization extensions in the bios to break the VM so it doesn’t steal the graphics card.
AFAIK GPU passthrough is usually configured to detach the GPU from the host automatically on startup. So even if all VMs were broken, the GPU would still be detached. However as another commenter pointed out, it’s possible to detach it manually which might be safer against accidental lockouts.
😅 Nah, for me encryption is a bigger risk than theft.
That said, you should be able to decrypt your disks with the right key even on a live boot. Even if the secrets are in the tpm you should be able to use whatever your normal system uses to decrypt the disks.
If you don’t enter a password to boot, the keys are available. If you do, the password can decrypt the keys afaik.
Again, I don’t do this but that’s what I’ve picked up here and there so take it with a grain of salt I may be wrong.
Actually that might work. I thought that secure boot and disk encryption would prevent mounting the disk to a different system, but now I can’t think of any reason why it would. Good idea
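A minimal sketch of unlocking the disk from a live session, assuming the volume is LUKS with a passphrase (or recovery key) slot and a plain filesystem directly inside it. The device name, the mapper name rescue_root and the helper functions are made up for illustration. A key sealed only in the TPM may not be released to a different boot environment, so this assumes you can type a passphrase.

```shell
#!/bin/sh
# Sketch: unlock and mount an encrypted root from a live session (LUKS assumed).
# Defaults to a dry run that only prints the commands; set DRY_RUN=0 to execute.

# Print a command, and run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# $1 = encrypted partition, $2 = mount point
unlock_and_mount() {
    # cryptsetup prompts for the passphrase when actually executed.
    run cryptsetup open "$1" rescue_root || return 1
    run mount /dev/mapper/rescue_root "$2"
}

# Example (device name is hypothetical):
#   DRY_RUN=0
#   unlock_and_mount /dev/sda2 /mnt
```

If the root filesystem sits on LVM inside the encrypted volume, there would be an extra step to activate the volume group before mounting.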
Proxmox on the host. It uses a webserver for admin stuff.
No other things that run on the host ––> no other things that break on the host.
If you want to lock down the web server and ssh behind a VPN, that’s where you can fuck up and lock yourself out though.
For very simple tasks you can usually blindly log in and run commands. I’ve done this with very simple tasks, e.g., rebooting or bringing up a network interface. It’s maybe not the smartest, but basically, just type root, the root password, and dhclient eth0 or whatever magic you need. No display required, unless you make a typo…
In your specific case, you could have a shell script that stops VMs and disables passthrough, so you just log in and invoke that script. Bonus points if you create a dedicated user with that script set as their shell (or just put it in the appropriate dot rc file).
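Such a script could look roughly like this, assuming the VMs are managed through libvirt’s virsh. The function names are made up and GPU_ADDR is a placeholder PCI address, so treat it as a sketch rather than something to type blindly as-is.

```shell
#!/bin/sh
# Sketch of a "panic button" script for blind logins (libvirt assumed).
# GPU_ADDR is a hypothetical PCI address; find the real one with lspci -D.
GPU_ADDR="${GPU_ADDR:-0000:00:02.0}"

# Force off every running VM so nothing is holding the GPU.
stop_all_vms() {
    virsh list --name | while read -r vm; do
        # virsh prints a trailing blank line; skip it.
        if [ -n "$vm" ]; then
            virsh destroy "$vm" </dev/null
        fi
    done
    return 0
}

# Ask libvirt to hand the GPU back to the host.
reattach_gpu() {
    virsh nodedev-reattach "pci_$(printf '%s' "$GPU_ADDR" | tr ':.' '__')"
}

# Stop the VMs first, then reclaim the GPU.
rescue() {
    stop_all_vms
    reattach_gpu
}

# When installing this as a real script, call `rescue` on the last line.
```

Note that virsh destroy is a hard power-off, which is probably acceptable in an emergency. If the host’s GPU driver is blacklisted, the script would likely also need to load it with modprobe before the console reappears, and the account running it needs permission to talk to libvirt.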
I’ll admit I’ve done this too 😅 Not ideal but a good idea nonetheless