TrueNAS upgrade from 24.10.2.4 to 25.10.3

After a reboot following an upgrade, my TrueNAS system crashed with the following error:

This was followed by a large number of SCB errors in an endless loop.

The problem is fairly straightforward if you know where to look. First, let’s break down the error message:

– DMAR – DMA remapping
– DRHD – DMA Remapping Hardware Unit Definition [1]

Because of the DMAR reference, the error is related to Intel’s IOMMU technology (VT-d). What does the IOMMU do? When a device accesses memory via DMA, the IOMMU looks the request up in a mapping table and checks which device is allowed to access which memory region. Without an IOMMU, a faulty or compromised device or driver could access memory regions it shouldn’t, potentially corrupting memory or worse.
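To make the idea more concrete, here is a toy model of that permission check in Python. It is only an illustration of the concept, not how the kernel or the hardware implements it; the class name, the device address 03:00.0 and the memory range are all made up for the example.

```python
# Toy model of an IOMMU permission check (illustration only, not kernel code).
# Each device, identified by its bus:device.function address, gets its own
# list of memory ranges it is allowed to reach via DMA.

class ToyIommu:
    def __init__(self):
        # device ID (e.g. "03:00.0") -> list of (start, end) address ranges
        self.mappings = {}

    def map_range(self, device_id, start, length):
        """Allow device_id to access the range [start, start + length)."""
        self.mappings.setdefault(device_id, []).append((start, start + length))

    def check_dma(self, device_id, address):
        """Return True if the access is allowed, False if it would fault."""
        for start, end in self.mappings.get(device_id, []):
            if start <= address < end:
                return True
        return False


iommu = ToyIommu()
iommu.map_range("03:00.0", 0x1000, 0x1000)  # hypothetical device and range

print(iommu.check_dma("03:00.0", 0x1800))  # address inside the mapped range
print(iommu.check_dma("03:00.0", 0x4000))  # address outside the mapped range
print(iommu.check_dma("04:00.0", 0x1800))  # device without any mapping
```

The important detail for the rest of this post is the last call: a request from a device ID that has no mapping is rejected, no matter which address it targets.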

The root cause in my case was a PCIe-to-PCI bridge. If you look closely, you can see that there is a bridge device between the actual device and the motherboard:

So what happens?

A device attempts to access memory. However, because of the bridge, the request carries the bridge’s device ID instead of the actual device’s ID, so the IOMMU cannot attribute it to the device that issued it. You can see this in the original error message: the bridge ID is 05:00.0, while the actual PCI devices are 05:02.0 and 05:02.1.
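The following Python sketch illustrates the mismatch using the three IDs from the error message. It packs each bus:device.function address into the 16-bit form used on the bus (8 bits bus, 5 bits device, 3 bits function) and compares what the devices would send on their own with what arrives through the bridge. The assumption that the IOMMU only knows the two real devices is a simplification of mine; the kernel’s actual handling of such aliases is more involved.

```python
def parse_bdf(bdf):
    """Split 'bus:device.function' (hex, as printed by lspci) into integers."""
    bus, rest = bdf.split(":")
    device, function = rest.split(".")
    return int(bus, 16), int(device, 16), int(function, 16)


def requester_id(bdf):
    """Pack bus (8 bits), device (5 bits) and function (3 bits) into 16 bits."""
    bus, device, function = parse_bdf(bdf)
    return (bus << 8) | (device << 3) | function


def visible_source(device_bdf, bridge_alias=None):
    """ID seen upstream: the bridge's ID if the device sits behind a bridge."""
    return requester_id(bridge_alias if bridge_alias else device_bdf)


# IDs taken from the error message
BRIDGE = "05:00.0"
DEVICES = ["05:02.0", "05:02.1"]

# Simplifying assumption: only the real devices have IOMMU entries.
known_ids = {requester_id(dev) for dev in DEVICES}

for dev in DEVICES:
    seen = visible_source(dev, bridge_alias=BRIDGE)
    print(f"{dev}: own ID {requester_id(dev):#06x}, "
          f"seen as {seen:#06x}, known to IOMMU: {seen in known_ids}")
```

Both devices show up with the same source ID, the one belonging to 05:00.0, which is why the fault is reported against the bridge rather than against the device that actually started the transfer.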

How to solve this?

Disable the entire Intel IOMMU with a kernel parameter:

intel_iommu=off
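After a reboot, you can check whether the parameter actually reached the kernel by looking at /proc/cmdline. Below is a small Python sketch for that; the helper names and the sample command line are made up for illustration, and keeping only the last occurrence of the parameter is a simplification.

```python
def intel_iommu_setting(cmdline):
    """Return the value of intel_iommu= in a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        if token.startswith("intel_iommu="):
            # Keep the last occurrence if the parameter appears more than once.
            value = token.split("=", 1)[1]
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Hypothetical command line, only to show the parsing.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda2 ro intel_iommu=off quiet"
print(intel_iommu_setting(sample))
```

On the NAS itself, intel_iommu_setting(read_cmdline()) should return off once the parameter is in effect.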

How to do this permanently?

In TrueNAS, you can add a kernel parameter with the following command, so that the setting survives subsequent upgrades:
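As a sketch of how this can look: to my knowledge, the TrueNAS middleware client midclt accepts a system.advanced.update call with a kernel_extra_options field. Treat both names as assumptions and verify them against the API documentation of your TrueNAS version before running anything. The Python helper below (function names are mine) only builds the command and prints it; it does not change the system unless apply_kernel_options is called explicitly.

```python
import json
import subprocess


def build_midclt_command(kernel_options):
    """Build the argument list for a midclt call that sets kernel options.

    Assumption: the middleware exposes system.advanced.update with a
    kernel_extra_options string field.
    """
    payload = json.dumps({"kernel_extra_options": kernel_options})
    return ["midclt", "call", "system.advanced.update", payload]


def apply_kernel_options(kernel_options):
    """Run the command; only meaningful in a root shell on the TrueNAS host."""
    return subprocess.run(build_midclt_command(kernel_options), check=True)


command = build_midclt_command("intel_iommu=off")
# Print the command in a form that can be pasted into a shell.
print(" ".join(command[:3]), f"'{command[3]}'")
```

Note that this sets the whole field, so if you already have other extra kernel options configured, include them in the same string.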

Links:

[1]: https://www.kernel.org/doc/Documentation/Intel-IOMMU.txt
