

[Troubleshooting] Dell T7610, computer won't post after turning on memory mapped IO >


    I have a Dell Precision T7610 with two Xeon 2650 v2 CPUs, 128 GB of RAM, and motherboard NK70N. I bought it off eBay in December, and it shipped with one processor and 64 GB of RAM. I recently filled the other side of the board with processor #2 and 64 GB more RAM. There are two GPUs: an Nvidia K420 for video and an Nvidia P40 for accelerated graphics. I have spent my entire budget on electronics tools and building this 'homelab' computer. I do not have the money to buy another motherboard at this time, otherwise I would!

    The motherboard that shipped with the computer needed replacement because socket #2 wasn't detecting the CPU. I bought another motherboard off eBay, but it came with a broken CMOS socket. The seller shipped another, so I now have three motherboards on hand. I mention this because I've been using the extra boards to help troubleshoot what's going on.

    So, with the latest motherboard, I set everything back up and was able to boot into the OS just fine. Part of my project is getting an Nvidia Tesla P40 GPU to work within a Windows VM running on Proxmox. Since the GPU has a large amount of RAM, I read that it would be good to enable memory mapped IO > 4GB. After finding the setting in the BIOS menu, I enabled it and rebooted.

    Upon powering back on, the computer 'seemed to turn back on' but was rather lifeless. No video signal was being sent, and the four-disk array of SAS drives would not initialize (they would spin up, but I couldn't hear them doing any work after that). The computer was also showing diagnostic lights indicating a hardware failure, but the manual was not more specific than that.

    I dug the motherboard with the broken CMOS socket out of storage and fixed it with solder, wire, and tape. I set this motherboard back up and was able to get to the BIOS splash screen with the Dell logo and load the GRUB bootloader for Proxmox. However, after a FW and ME error was shown and the 'loading initial ramdisk' message appeared in the prompt, the computer would turn off.

    I had a Proxmox live USB flash drive lying around, so I plugged it in and was able to boot into the installer, but the prompt again showed the FW and ME error before doing so. I selected 'advanced options' and saw there was a memory test feature. I chose to run memtest to look for errors.

    The test would begin and run for some time, but then the computer would shut off mid-test. I had no problem with the computer when it first shipped, so the RAM that shipped with it was likely good. I decided to remove the second bank of RAM and its processor and ran memtest again. This time the computer was able to run the test without crashing.

    Seeing that I could boot into Proxmox and access it via the web GUI, I was curious to see how operable the system was at this point. I started a Windows 11 VM, but the VM would freeze during the Windows splash screen: not immediately, but before Windows had a chance to boot up completely.

    Before I turned on the memory mapped IO > 4GB feature in the BIOS, the computer ran just fine. Proxmox reported both CPUs and listed 128 GB of available RAM. I was able to create VMs at will and had no problems. Then, after I turned on the memory mapped feature, it almost seemed like it had somehow damaged the hardware.

    I am curious to get the 3rd motherboard working again and see how the system runs with that. I want to figure out a way to reset the BIOS on that 3rd motherboard, because it was working before I changed the memory mapped setting in the BIOS.

    Hopefully these details help make the picture clear enough for people to chime in.

    Thank you,
    el

    #2
    Re: [Troubleshooting] Dell T7610, computer won't post after turning on memory mapped

    Have you enabled IOMMU?
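
    In case it helps with checking this: below is a minimal Python sketch, assuming a Linux host such as Proxmox on an Intel platform. It looks for `intel_iommu=on` on the kernel command line and counts the entries under `/sys/kernel/iommu_groups`. The function name `iommu_status` is made up for this example, and the result is only a rough indicator, not a full passthrough check.

```python
from pathlib import Path


def iommu_status(cmdline_path="/proc/cmdline",
                 groups_path="/sys/kernel/iommu_groups"):
    """Return (flag_on_cmdline, group_count) for a Linux host.

    iommu_status is a hypothetical helper name; both default paths are
    the standard Linux locations and can be overridden for testing.
    """
    cmdline = Path(cmdline_path)
    # Kernel parameters are whitespace-separated on a single line.
    params = cmdline.read_text().split() if cmdline.is_file() else []
    flag_on_cmdline = "intel_iommu=on" in params

    groups = Path(groups_path)
    # Each IOMMU group appears as a numbered directory once the IOMMU is active.
    if groups.is_dir():
        group_count = sum(1 for entry in groups.iterdir() if entry.is_dir())
    else:
        group_count = 0
    return flag_on_cmdline, group_count


if __name__ == "__main__":
    flag, count = iommu_status()
    print(f"intel_iommu=on on kernel command line: {flag}")
    print(f"IOMMU groups found: {count}")
```

    If the flag is missing or the group count is zero, the IOMMU is probably not active on the host, which would be worth sorting out before blaming the hardware.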



      #3
      Re: [Troubleshooting] Dell T7610, computer won't post after turning on memory mapped

      Tangential note:

      I have a Supermicro with two 771 sockets that also doesn't detect the CPUs. The problem began with only the secondary socket not detecting its CPU, and over the months the primary socket became faulty too. Surprisingly, a workaround is to press both CPUs against their sockets for a few seconds during boot, starting before the power button is pressed, and to release the pressure after the power-good LED on the motherboard lights up (~3 seconds). After that it all works, for me. The pressure can be scary, perhaps over 10 kilos per socket.
      Last edited by davidebaldini; 03-01-2023, 03:13 AM.

