I’m glad you appreciate it! It’s always fun digging into kernel internals and learning new things :D
I’m also open to criticism about the writing if you have any.
InfoSec Person | Alt-Account#2
Thank you, I’ll send you an email within a day.
Would you consider sending it to Austria? I’d pay shipping charges (if it’s within reason lol). If you are, you can send me an email at: sneela-hwelemmy92fd [at] port87.com
Are you planning to scrap the CPU? I may be interested in it as I find faulty hardware fun to experiment on.
You haven’t given us much information about the CPU. That is very important when dealing with Machine Check Errors (MCEs).
I’ve done a bit of work with MCEs and AMD CPUs, so I’ll help with understanding what may be going wrong and what you probably can do.
I’ve done a bit of searching based on the microcode and the Dell Wyse thin client that you mentioned. From what I can gather, are you using a Dell Wyse 5060 Thin Client with an AMD Steppe Eagle GX-424 [1]? This is my assumption for the rest of this comment.
Machine Check Errors (MCEs) are hard to decipher without the right documentation. As far as I can tell from AMD’s Data Sheet for the G-Series [2], this CPU belongs to family 16H.
You have two MCEs in your image:
Now, you can attempt to decipher these with a tool I used some time ago, MCE-Ryzen-Decoder [4]. As the name says, this tool only decodes MCEs of Ryzen architectures. MCE designs may not change much between families, but I wouldn’t bank (pun not intended) on it, because the G-Series seems to be an embedded SoC, which the Ryzen CPUs are not. I gave it a shot, and according to the tool you may have an issue in:
$ python3 run.py 04 f600000000070f0f
Bank: Read-As-Zero (RAZ)
Error: ( 0x7)
$ python3 run.py 01 b400000001020103
Bank: Instruction Fetch Unit (IF)
Error: IC Full Tag Parity Error (TagParity 0x2)
Wouldn’t bank (pun intended this time) on it though.
What you can do is to go through the AMD Family 16H’s BIOS and Kernel Developer Guide [3] (Section 2.16.1.5 Error Code). From Section 2.16.1.1 Machine Check Registers, it looks like Bank 01 corresponds to the IC (Instruction Cache) and Bank 04 corresponds to the NB (Northbridge). This means that the CPU found issues in the NB in core 0 and the IC in core 1. You can go even further and check what those exact codes decipher to, but I wouldn’t put in that much effort - there’s not much you can do with that info (maybe the NB, but… too much effort). There are some MSRs that you can read out that correspond to errors of these banks (from Table 86: Registers Commonly Used for Diagnosis), but like I said, there’s not much you can do with this info anyway.
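If you do want to go one step further by hand, here is a minimal Python sketch that splits the two status values from above into their fields. The flag bits (63 down to 57) follow the general x86 machine-check status layout; treating bits 20:16 as the extended error code is my assumption for this family, so check it against the BKDG before relying on it. The names `FLAGS` and `decode_status` are made up for this example.

```python
# Flag bits in the top of an x86 MCi_STATUS value (architectural MCA layout).
FLAGS = {
    "VAL": 63,    # register holds a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting was enabled for this error
    "MISCV": 59,  # MCi_MISC holds extra information
    "ADDRV": 58,  # MCi_ADDR holds a valid address
    "PCC": 57,    # processor context may be corrupt
}


def decode_status(status: int) -> dict:
    """Split a 64-bit MCi_STATUS value into flag bits and error-code fields."""
    decoded = {name: bool((status >> bit) & 1) for name, bit in FLAGS.items()}
    decoded["error_code"] = status & 0xFFFF
    # Assumption: the extended error code sits in bits 20:16 on this family.
    decoded["error_code_ext"] = (status >> 16) & 0x1F
    return decoded


if __name__ == "__main__":
    # The two (bank, status) pairs passed to run.py above.
    for bank, status in ((4, 0xF600000000070F0F), (1, 0xB400000001020103)):
        fields = decode_status(status)
        set_flags = [name for name in FLAGS if fields[name]]
        print(f"bank {bank}: flags={set_flags} "
              f"code={fields['error_code']:#06x} ext={fields['error_code_ext']:#x}")
```

Both values have UC set (the error was not corrected), and the bank 04 one additionally has OVER and PCC set. The extended codes come out as 0x7 and 0x2, the same numbers the Ryzen decoder printed.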
Okay, now that the boring part is over (it was fun for me), what can you do? It looks like the CPU is a quad core CPU. I take it to mean that it’s 4 cores * 2 SMT threads. If you have access to the Linux kernel command line parameters [5], say via GRUB for example, I would try to isolate the two faulty cores we see here: core 0 and core 1. Add isolcpus=0,1 to see whether the kernel boots. There’s a good chance that only these two CPU cores are failing, but others may also be faulty without their errors having been printed. It’s worth a shot, but it may not work.
Alternatively, you can tell the kernel to disable MCE checks entirely and continue executing; this can be done with the mce=off command line parameter [6]. Beware that this means that you’re now willingly running code on a CPU with two cores that have been shown to be faulty (so far). isolcpus will make sure that the kernel doesn’t execute any “user” code on those cores unless asked to (via taskset, for example).
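What taskset does per process is set a CPU affinity mask. As a rough sketch of the same idea from Python (Linux only), this keeps the current process off the suspect cores 0 and 1. The helper name `usable_cpus` is made up for the example, and the core numbers are simply the ones assumed in this comment.

```python
import os


def usable_cpus(available: set, faulty: set) -> set:
    """Return the CPUs the process may run on, minus the suspect ones."""
    return set(available) - set(faulty)


if __name__ == "__main__":
    # Assumption: logical CPUs 0 and 1 are the suspect ones, as discussed above.
    faulty = {0, 1}
    allowed = usable_cpus(os.sched_getaffinity(0), faulty)
    if allowed:
        # Linux-only call; pid 0 means "the calling process".
        os.sched_setaffinity(0, allowed)
    print("running on CPUs:", sorted(os.sched_getaffinity(0)))
```

This only affects the process that calls it (and its children), so it is not a replacement for isolcpus, just a way to steer individual workloads.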
Apart from this, like others have pointed out, the red dots on the screen aren’t a great sign. Maybe you can individually replace defective parts, or maybe you have to buy a new machine entirely. What I’ve described in this comment is a way to check whether your CPU still works with 2 SMT threads faulty.
Good luck and I hope you fix your server 🤞.
Edited to add: I have seen MCEs appear due to extremely low/high/fluctuating voltages. As others pointed out, your PSU or other components related to power could be busted.
[4] https://github.com/DimitriFourny/MCE-Ryzen-Decoder
[5] https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html
[6] https://elixir.bootlin.com/linux/v6.9.2/source/Documentation/arch/x86/x86_64/boot-options.rst
https://www.gimp.org/news/2024/05/05/gimp-2-10-38-released/
This (possibly last) GIMP 2 stable release brings much-requested backports from GTK3, including improved support for tablets on Windows. A number of bug fixes and minor improvements are also included in this release.
If the release says that this is possibly the last GIMP2 stable release, it feels like GIMP3 is actually on its way. I understand your cynicism, but I’d be more optimistic this time around.
In dark mode, the anchor tags are difficult to read. They’re dark blue on a dark background. Perhaps consider something with a much higher contrast?
Apart from that, nice idea - I’m going to deploy the zipbomb today!
Hurd-ng is on its way: https://www.gnu.org/software/hurd/hurd/ng.html
See Wendover Productions’ most recent video, “The Increasing Reality of War in Space” (from around 7:54); they talk about SpaceX launching unknown satellites and not reporting it either.
Ah, maybe the whole context wasn’t added here, but I tried to download an XPI file for a different program that uses Firefox under the hood (called Zotero). I wanted to download the file to install it manually for the other program.
Firefox naturally thought that the XPI file was meant for itself and tried to install it. The XPI file was never intended for Firefox.
Edited to add: probably a pretty obscure thing that I noticed, but it’s still bizarre.
I used to have the same exact one. For some reason, I can only find similar ones here (Amazon de, in):
https://www.amazon.de/ZYTB-Mauspad-Schreibtisch-Computertastatur-Teppich-World-Map/dp/B08HLN4B3B
https://www.amazon.in/Sounce-Extended-Spill-Resistant-Special-Textured-Anti-Fray/dp/B0C14RHW4H
However, searching for “world map large desk mat” yields you similar results.
I’m not sure where I bought that one now hmmm
I recently had an issue with my computer freezing occasionally on Debian 12 (Linux 6.1), where no errors showed up in syslog after a forced reboot.
The way I finally found out about the issue was having dmesg open on a different monitor and waiting for the freeze to happen. Just before the freeze did happen, a number of error logs were spewed to dmesg - enough for me to catch a glimpse of the issue: Intel WiFi.
I’m not saying that Intel WiFi is your issue. I’m suggesting you keep dmesg -w open on another monitor (if you can) and go about your normal activity until a freeze happens.
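If you don’t have a second monitor, a variation on the same idea is to follow the kernel log and force every line to disk as it arrives, so the last messages have a better chance of surviving the freeze. This is only a sketch: the output file name is made up, dmesg may need root on some systems, and a hard enough freeze can still lose the final lines.

```python
import os
import subprocess
from typing import Iterable


def persist_lines(lines: Iterable[str], path: str) -> int:
    """Append each line to path and force it to disk before reading the next."""
    written = 0
    with open(path, "a", encoding="utf-8") as log:
        for line in lines:
            log.write(line if line.endswith("\n") else line + "\n")
            log.flush()
            os.fsync(log.fileno())  # push the line past the page cache
            written += 1
    return written


if __name__ == "__main__":
    # "dmesg -w" follows the kernel log and never exits on its own.
    proc = subprocess.Popen(["dmesg", "-w"], stdout=subprocess.PIPE, text=True)
    # Hypothetical output path, chosen only for this example.
    persist_lines(proc.stdout, "dmesg-follow.log")
```

After the forced reboot, the tail of that file is the first place to look.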
- Got a text-based launcher (Lunar Launcher)
By this, do you mean this launcher for Android? Searching DuckDuckGo predominantly leads me to a launcher with the same name for Minecraft.
You’re right, that’s exactly what happened. If you look at the top of the trace, it says __handle_sysrq. Moreover, it’s in sysrq_handle_crash. That gets called when a sysrq combo is pressed.
Absolutely. Check out side channel attacks. The problem here isn’t about software exploits, but hardware issues. https://en.wikipedia.org/wiki/Side-channel_attack
Some things to get you started: Meltdown and Spectre: https://en.wikipedia.org/wiki/Meltdown_(security_vulnerability), https://en.wikipedia.org/wiki/Spectre_(security_vulnerability)
Rowhammer: https://en.wikipedia.org/wiki/Row_hammer
These are exploited by malicious processes doing something to the hardware which may result in information about your process(es) being leaked. Now, if this is on your computer, then the chances of encountering a malicious process that exploits this hardware bug would be low.
However, when you move this scenario to the cloud, an attack becomes more plausible. Your VM/container is being scheduled on CPUs that may or may not be shared by other containers. All it would take is for a malicious guest VM to be scheduled on the same core/CPU as you and try exploiting the same hardware you’re sharing.
That title is… something
Isn’t Angstrom 10^-10 meters? And nanometers 10^-9 meters? So 20A (assuming A = Angstrom) is just 2nm?
Are they trying to say that by moving to this new era, they’ll go single digit Angstrom i.e., 0.x nm?
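Writing the conversion out, assuming A really does mean Angstrom here:

$$
20\,\text{\AA} = 20 \times 10^{-10}\,\text{m} = 2 \times 10^{-9}\,\text{m} = 2\,\text{nm}
$$

By the same arithmetic, anything in the single-digit Angstrom range would be below 1 nm.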