Fatal1ty x399 - Linux pcie errors

openjaf (Newbie)
Posted: 10 Jan 2018 at 11:13am

I recently purchased the Fatal1ty X399 with a 1950X and updated the BIOS to the latest version, 2.0.

I was having issues using an NVIDIA card under Linux with the latest Fedora and Arch releases. The symptoms are described in the following thread:
https://forum.level1techs.com/t/threadripper-pcie-bus-errors/118977

PCIe resets were happening; I applied the fix suggested there and they have been resolved:
"pcie_aspm=off"

But I am now getting periodic video blanking problems, with messages like:

Jan 09 21:32:30 ripper kernel: nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
Jan 09 21:32:46 ripper kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000857d:0:0
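
For anyone wanting to track how often these messages show up, here is a minimal Python sketch that counts nvidia-modeset warnings and errors per GPU. The regular expression is only inferred from the two lines quoted above, so it is an assumption that other driver versions log in the same shape; the function name `summarize_modeset` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the two log lines quoted in this post (an assumption,
# not a documented log format).
LINE_RE = re.compile(
    r"kernel: nvidia-modeset: (?P<level>WARNING|ERROR): GPU:(?P<gpu>\d+): (?P<msg>.*)"
)


def summarize_modeset(lines):
    """Count nvidia-modeset messages per (gpu index, level)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(int(match.group("gpu")), match.group("level"))] += 1
    return counts


# The two lines from the post, used as sample input.
sample = [
    "Jan 09 21:32:30 ripper kernel: nvidia-modeset: WARNING: GPU:0: "
    "Lost display notification (0:0x00000000); continuing.",
    "Jan 09 21:32:46 ripper kernel: nvidia-modeset: ERROR: GPU:0: "
    "Idling display engine timed out: 0x0000857d:0:0",
]

for (gpu, level), count in sorted(summarize_modeset(sample).items()):
    print(f"GPU {gpu} {level}: {count}")
```

On a live system the same function could be fed the lines of saved kernel log output (for example from `journalctl -k`) instead of the sample list.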

In the thread referenced above, one person was not having the 'pcie reset' issues with an Asus board. That may be attributable to ASPM being disabled at the BIOS level as opposed to via the kernel parameter. This has yet to be confirmed.

Are the pcie reset messages and the messages I posted above known issues?
How can I disable ASPM on the Fatal1ty board from the BIOS, or do something similar, to troubleshoot this further?
Given that the 'pcie_aspm=off' parameter gets rid of the reset messages, what could be causing the messages and video blanking that I mention above?

I would prefer not to return this board and try the Asus board mentioned in that thread just to troubleshoot.

Thanks in advance,

Jim

antorsae (Newbie)
Posted: 10 Jan 2018 at 5:56pm
I too have had a lot of issues using the Fatal1ty X399 w/ TR 1950X under Linux.
I use it for deep learning w/ 2 x 1080 Tis.

I boot Linux with the following kernel options: iommu=pt pcie_aspm=off
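
A quick way to confirm the options actually reached the running kernel is to look at /proc/cmdline. The sketch below is a minimal, hedged example: the helper names `parse_cmdline` and `missing_params` are made up for illustration, and the parser splits on whitespace only, so it does not handle quoted values. Note that the kernel spells the ASPM parameter `pcie_aspm`; a misspelled name is silently ignored by that code, which is exactly the kind of slip this check catches.

```python
from pathlib import Path


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def missing_params(cmdline, wanted):
    """Return the wanted 'name=value' entries that are not on the command line."""
    params = parse_cmdline(cmdline)
    missing = []
    for entry in wanted:
        name, _, value = entry.partition("=")
        if params.get(name) != value:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    wanted = ["iommu=pt", "pcie_aspm=off"]
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        result = missing_params(proc_cmdline.read_text(), wanted)
        print("missing:", ", ".join(result) if result else "none")
    else:
        print("/proc/cmdline not found (not a Linux system?)")
```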

Also, check your memory overclock and try running it first WITHOUT overclocking. I had my system stable in both Win10 and Linux, stress tested w/ CPU + memory workloads @ 3333 MHz (4 x 16 GB sticks). However, when I moved to deep learning and left it training overnight, I started seeing NVIDIA Xid errors (http://docs.nvidia.com/deploy/xid-errors/index.html), but I cannot remember which one.

I also have it boot in full UEFI mode (not compatibility mode), so the nvidia driver does not complain in dmesg either.
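
To double-check from inside Linux that the system really came up through UEFI, a common check is whether the kernel exposes /sys/firmware/efi. A minimal sketch follows; the function name `booted_via_uefi` and its `sysfs_root` parameter are made up for illustration (the parameter only exists so the check can be pointed at a test directory).

```python
from pathlib import Path


def booted_via_uefi(sysfs_root="/sys"):
    """Return True if the kernel exposes firmware/efi under the given sysfs root.

    On Linux this directory is normally present only after a UEFI boot.
    """
    return (Path(sysfs_root) / "firmware" / "efi").is_dir()


if __name__ == "__main__":
    if booted_via_uefi():
        print("Booted via UEFI")
    else:
        print("No /sys/firmware/efi: legacy/compatibility boot (or not Linux)")
```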

With the above settings and memory @ 3200 MHz, it has been very stable so far.

Hope this helps.

MisterJ (Senior Member)
Posted: 11 Jan 2018 at 12:35am
There are very few Linux-knowledgeable posters here and, of course, TR is officially supported only under W10 version 1703 and higher.  I hope you two experts can help each other.  I am glad to help when I can, but am very short on knowledge here.  Enjoy, John.
Fat1 X399 Pro Gaming, TR 1950X, RAID0 3xSamsung SSD 960 EVO, G.SKILL FlareX F4-3200C14Q-32GFX, Win 10 x64 Pro, Enermx Platimax 850, Enermx Liqtech TR4 CPU Cooler, Radeon RX580, BIOS 2.00, 2xHDDs WD

openjaf (Newbie)
Posted: 16 Jan 2018 at 4:47am
My apologies for taking a while to reply.  I was waiting for an RMA card to perform some tests.

@antorsae Thank you for the feedback; your suggestion inspired me to do further testing.  I did not know to adjust the 'iommu=pt pcie_aspm=off' flags you mentioned.

The system is running Fedora 27 in UEFI mode with kernel 4.14.13-300.fc27.x86_64 and the NVIDIA driver 384.111, manually installed.

These tests used glmark2 and general desktop application usage.  I will start testing KVM PCI passthrough with the 970 and 1080 Ti this week.

550ti:
NVIDIA binary driver fails within minutes of bootup and throws many PCI errors.  Adjusting the IOMMU setting in the BIOS and the kernel flags does not help.
Nouveau driver works for a while but eventually fails.  I cannot trigger the failure on demand, so... EOL, anybody?

970:
NVIDIA binary driver works, did not test the nouveau driver.  No flags or bios options were required for stability.

1080ti:
NVIDIA binary driver works, did not test the nouveau driver.  No flags or bios options were required for stability.

I am only guessing here, but it seems the 550 Ti has a non-UEFI video BIOS and is having problems because of that.

MisterJ (Senior Member)
Posted: 16 Jan 2018 at 5:58am
openjaf, I use GPU-Z, which reports the UEFI capability of the GPU.  I don't see a Linux version, but perhaps you can make use of it.  Enjoy, John.
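
Lacking a Linux GPU-Z, one rough approach is to inspect a dump of the card's option ROM and see whether it carries an EFI image. The sketch below walks the standard PCI expansion ROM layout (0x55AA header, "PCIR" data structure, code type 0x03 for EFI). It is a minimal sketch under stated assumptions, not a substitute for GPU-Z: the function name `rom_image_types` is made up, and it is an assumption that a given card's ROM dump follows the plain PCI layout closely enough for this walk to reach every image.

```python
import struct
import sys

# Code type values from the PCI expansion ROM data structure.
CODE_TYPES = {
    0x00: "x86 legacy BIOS",
    0x01: "Open Firmware",
    0x02: "HP PA-RISC",
    0x03: "EFI",
}


def rom_image_types(rom):
    """Walk the images in a PCI expansion ROM dump and return their code types."""
    types = []
    offset = 0
    while offset + 0x1A <= len(rom):
        if rom[offset:offset + 2] != b"\x55\xaa":
            break  # no further valid ROM header
        (pcir_ptr,) = struct.unpack_from("<H", rom, offset + 0x18)
        pcir = offset + pcir_ptr
        if pcir + 0x16 > len(rom) or rom[pcir:pcir + 4] != b"PCIR":
            break  # data structure missing or truncated
        (image_blocks,) = struct.unpack_from("<H", rom, pcir + 0x10)
        code_type = rom[pcir + 0x14]
        indicator = rom[pcir + 0x15]
        types.append(CODE_TYPES.get(code_type, f"unknown (0x{code_type:02x})"))
        if indicator & 0x80 or image_blocks == 0:
            break  # last image, or a zero length that would loop forever
        offset += image_blocks * 512  # image length is in 512-byte units
    return types


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as rom_file:
        print(rom_image_types(rom_file.read()))
```

The script expects the path of a ROM dump as its first argument. On Linux a dump can usually be read as root from the device's `rom` file under /sys/bus/pci/devices/ after writing 1 to that file to enable it; whether that yields a complete image for a particular card is not something this sketch can guarantee.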