A week or so ago I was using my computer for basic tasks (office and browsing apps) when the screen went blank. Caps Lock didn't respond, so after waiting for a bit I pressed the reset button.
In hindsight I should have tried harder to regain control of the system but Linux troubleshooting is not a field that I have much experience in (unlike Windows). I'm asking now because I want to learn more.
After rebooting, I took a copy of /var/log/kern.log, and here are the bits that I think are most relevant:
Code:
Jan 22 14:55:56 mikepc kernel: [30967.242712] [drm] VRAM is lost due to GPU reset!
Jan 22 14:55:56 mikepc kernel: [30967.242714] [drm] PSP is resuming...
Jan 22 14:56:02 mikepc kernel: [30972.542393] [drm:psp_v11_0_memory_training [amdgpu]] *ERROR* send training msg failed.
Jan 22 14:56:02 mikepc kernel: [30972.542520] [drm:psp_resume [amdgpu]] *ERROR* Failed to process memory training!
Jan 22 14:56:02 mikepc kernel: [30972.542620] [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
Jan 22 14:56:02 mikepc kernel: [30972.542712] amdgpu 0000:03:00.0: amdgpu: GPU reset(1) failed
Jan 22 14:56:02 mikepc kernel: [30972.542732] [drm] Skip scheduling IBs!
<last message repeated many times>
Jan 22 14:56:02 mikepc kernel: [30972.656377] amdgpu 0000:03:00.0: amdgpu: GPU reset end with ret = -62
Jan 22 14:56:02 mikepc kernel: [30972.656379] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -62
Jan 22 14:56:11 mikepc kernel: [30981.799107] [UFW BLOCK] IN=eno1 OUT= MAC=01:00:5e:00:00:01:80:1f:02:fb:41:54:08:00 SRC=192.168.0.200 DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=53711 PROTO=2
Jan 22 14:56:12 mikepc kernel: [30982.658193] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=3498097, emitted seq=3498099
Jan 22 14:56:12 mikepc kernel: [30982.658414] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process firefox-bin pid 29856 thread firefox:cs0 pid 29918
Jan 22 14:56:12 mikepc kernel: [30982.658583] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Jan 22 14:56:12 mikepc kernel: [30982.658862] amdgpu 0000:03:00.0: amdgpu: Failed to disallow df cstate
The system was evidently still responding on some level, because the log contains firewall activity after the GPU went nuts.
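One way to check that observation is to look for kernel messages that are unrelated to the GPU and were logged after the recovery failure. The following is a minimal Python sketch, assuming the kern.log format shown in the excerpt above; the function name `activity_after_failure` and the choice of "GPU Recovery Failed" as the marker are only for illustration.

```python
import re

# Kernel log lines carry an uptime stamp such as "[30972.656379]".
STAMP = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def activity_after_failure(lines, marker="GPU Recovery Failed"):
    """Return (uptime, line) pairs for non-GPU messages logged after `marker`."""
    failed_at = None
    later = []
    for line in lines:
        match = STAMP.search(line)
        if match is None:
            continue  # e.g. "<last message repeated many times>"
        uptime = float(match.group(1))
        if failed_at is None:
            # Still looking for the first line that reports the failure.
            if marker in line:
                failed_at = uptime
            continue
        # Anything after the failure that does not mention the GPU driver
        # suggests the kernel was still doing other work.
        if "drm" not in line and "amdgpu" not in line:
            later.append((uptime, line.rstrip("\n")))
    return later


# Three lines copied from the excerpt above.
SAMPLE = [
    "Jan 22 14:56:02 mikepc kernel: [30972.656379] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -62",
    "Jan 22 14:56:11 mikepc kernel: [30981.799107] [UFW BLOCK] IN=eno1 OUT= MAC=01:00:5e:00:00:01:80:1f:02:fb:41:54:08:00 SRC=192.168.0.200 DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=53711 PROTO=2",
    "Jan 22 14:56:12 mikepc kernel: [30982.658193] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=3498097, emitted seq=3498099",
]

for uptime, line in activity_after_failure(SAMPLE):
    print(uptime, line)
```

On these sample lines only the UFW entry is reported, which matches the reasoning that the kernel and network stack were still running even though the display was gone.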
In hindsight, I think I should have tried:
Ctrl+Alt+F1 to switch to a command terminal
Ctrl+Alt+F7 to switch back to X
Ctrl+Alt+Backspace - does this still restart the X server? A quick google suggests I'd lose any apps that were running in the X session?
If I had been able to switch to a command terminal, I could also have read kern.log, picked up the clue about Firefox, and terminated any Firefox processes.
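The "picking up the clue" step could be sketched in Python as below. This assumes the amdgpu "Process information" line always has the shape seen in the excerpt above, which may differ between kernel versions; the function name `find_timeout_processes` is made up for this example.

```python
import re

# Matches the amdgpu job-timeout line that names the offending process,
# in the form it takes in the log excerpt above.
PROCESS_INFO = re.compile(
    r"Process information: process (?P<process>\S+) pid (?P<pid>\d+) "
    r"thread (?P<thread>\S+) pid (?P<tid>\d+)"
)


def find_timeout_processes(lines):
    """Return (process_name, pid) pairs named in amdgpu job-timeout messages."""
    found = []
    for line in lines:
        match = PROCESS_INFO.search(line)
        if match:
            found.append((match.group("process"), int(match.group("pid"))))
    return found


# Line copied from the excerpt above.
SAMPLE_LINE = (
    "Jan 22 14:56:12 mikepc kernel: [30982.658414] "
    "[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: "
    "process firefox-bin pid 29856 thread firefox:cs0 pid 29918"
)

print(find_timeout_processes([SAMPLE_LINE]))

# On a live system the input could be the lines of /var/log/kern.log
# (reading it may need extra privileges), and a reported pid could then be
# passed to os.kill(pid, signal.SIGTERM). That step is left out on purpose.
```

Since the log also reports that the GPU reset itself failed, it cannot tell us whether ending the process would have brought the display back. The sketch only covers identifying which process the driver blamed.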
Any more suggestions/advice would be appreciated. I've been using Mint for a few years (since 2021/2022?) and on my Haswell rig this kind of incident only happened once. My latest setup (AMD 7000, RX 6700 XT, now 6.5 kernel updated from 6.2) is only a few months old and this has happened just this one time so far.