Maxwell Titan X 12G pcb repair


    Hi, I have a bunch of old GPUs I'm using to learn PCB repair and microsoldering. The card shows artifacts while running, and the display cuts out if you try to game on it. After initial testing I found that the memory inductors are showing 12 V on each side. Being a bit of a noob at this stuff, I'm assuming I would have to replace the inductors and find the broken MOSFET that is causing this. I'm going to add the results of my stabbing guide here tomorrow, which should further help the situation. In the meantime, are there any boardviews I can use as a reference for this card? Here's a pic of the PCB, without ohm and volt readings for now.

    [Image: DSC_0010.jpg]

    #2
    If the card works but is artifacting, then power delivery must be OK, but there is bad communication with one of the VRAM chips. Run MATS to check it.


      #3
      I reflowed the GPU and memory chips and went over a few other bits this evening with the hot air gun, and it's all working fine now. Going to run it for a few days and see what happens. All artifacts are gone, and I rechecked the inductors etc.; they now read 1.5 V on either side. Could have been a loose connection the whole time. The card passes the MATS test every time and hasn't thrown an error since. Very strange. I guess if it goes again and the inductors start reading 12 V at the memory inlet, then a MOSFET or another inductor must be to blame. Will post a follow-up.
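The reasoning in this post (1.5 V at the inductor is healthy, 12 V is a fault) can be written down as a small check. This is only a rough sketch: the 1.5 V and 12 V figures come from the readings quoted in the thread, while the function name `classify_rail` and the 10% tolerance are made up for illustration. Note that an inductor has almost no DC resistance, so a multimeter reading the same voltage on both sides is normal; when the full input voltage appears there, the usual suspect is a shorted high-side MOSFET (or its driver) rather than the inductor itself.

```python
def classify_rail(measured_v, nominal_v=1.5, input_v=12.0, tolerance=0.10):
    """Roughly classify a buck-converter output reading taken at the inductor.

    nominal_v and input_v default to the values quoted in the thread;
    tolerance is an arbitrary illustrative threshold, not a spec value.
    """
    if abs(measured_v - nominal_v) <= nominal_v * tolerance:
        return "ok"
    if measured_v >= input_v * (1 - tolerance):
        # Input voltage passing straight through to the output.
        return "input voltage on output: suspect shorted high-side MOSFET"
    if measured_v < nominal_v * (1 - tolerance):
        return "rail low or missing"
    return "rail high: out of regulation"


if __name__ == "__main__":
    for reading in (1.5, 12.0, 0.0):
        print(reading, "->", classify_rail(reading))
```

A reading that lands in the last two categories would still need the usual resistance-to-ground and gate-drive checks before blaming any one part.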


        #4
        The worst thing to do is a reflow. Why do it when the card will be dead again fast? Do a reball and you'll never have a problem with it.


          #5
          I'm curious, why is it the worst thing to do? Damage to the board via heat? Or a potential short? I agree it will die again, as the solder must have degraded. I'm going to get the kit to reball, as most of my failed cards have similar issues, so I will end up reballing them all eventually. I just need a working card for my racing sim rig until I can properly fix them all.


            #6
            Reflowing the GPU and all the memory chips is like shooting a fly with a tank. Use MATS first, as ktmmotocross recommended, and reball only the memory chips that are faulty. Reflowing or reballing the GPU is the last thing to do; it is risky, as it always puts a lot of stress on the chip. In my experience, a reflow or even a reball does not always give permanent results. Sometimes chips "repair" their silicon due to heat, but then degrade again after some time. It is difficult to determine whether the cause was bad contact between balls and pads or an internal issue in the silicon. You are right to test the card for a longer period.


              #7
              Most of the time, artifacting means either RAM or GPU chip failure. Given it's a relatively modern card, it's probably the latter.

              Reballing won't permanently fix this shit, so stop spreading misinformation!
              It's not the solder balls between the PCB and the substrate that fail/crack on these chips. In almost 99% of cases, it's the bumps between the GPU silicon die and the GPU substrate that fail... and there's no fix for that!!! As DynaxSC noted, it's really the heat (be it from a reball or a reflow) that "repairs" the silicon die's bumps. But it's a temp fix at best.

              As for running MATS - I find even that to be fairly useless on these modern GPUs, because it can only tell you that there is an error / connection failure somewhere between RAM and GPU. But it can't point if the connection failure is between a RAM chip and the card's PCB (which is rare anyways) or if it's between the GPU silicon die and GPU substrate... which goes to the card's PCB (which is an unrepairable problem.) MATS is really only useful for a handful of specific cases, where you might have two data lines shorted together, or one being shorted by some external component, or other similar scenario.

              So in short, artifacts = usually a fucked card.

              I only get such GPUs now if they are at scrap prices - i.e. whatever the cooler would be worth in scrap metal. Of course, I don't scrap it, but instead try to use it on other video cards with crappier coolers... that is, only if the card I got is artifacting and doesn't want to work after a reflow or two. If it does work, put the cooler back on, let the fans run at 100% if I'm putting any significant load on it, and use it until it shits the bed again. After that, chances are very low anything can bring it back.

              My two cents: don't waste too much of your $$ on these scrap cards, and only get a few for hot air / rework practice.
              Last edited by momaka; 02-22-2025, 01:23 PM.


                #8
                I should have mentioned that I have run loads of MATS/MODS tests on this card previously, which all passed without issue. I've thoroughly tested the card's PCB components too. So my logic was that there has to be an issue with the connection between the RAM and the GPU itself, hence why I reflowed it. It's been going strong since and hasn't skipped a beat. The card has an AIO cooler on it and is running stable at 60 °C under full load. Let's see how long it lasts. I've used this card for CUDA rendering for years, so I have had my money's worth out of it. If I can squeeze another few months out of it I will be happy, and hopefully I can learn a bit more about microsoldering on the way.


                  #9
                  Originally posted by momaka View Post
                  Most of the time, artifacting means either RAM or GPU chip failure. Given it's a relatively modern card, it's probably the latter.

                  Reballing won't permanently fix this shit, so stop spreading misinformation!
                  It's not the solder balls between the PCB and the substrate that fail/crack on these chips. In almost 99% of cases, it's the bumps between the GPU silicon die and the GPU substrate that fail... and there's no fix for that!!! As DynaxSC noted, it's really the heat (be it from a reball or a reflow) that "repairs" the silicon die's bumps. But it's a temp fix at best.

                  As for running MATS - I find even that to be fairly useless on these modern GPUs, because it can only tell you that there is an error / connection failure somewhere between RAM and GPU. But it can't point if the connection failure is between a RAM chip and the card's PCB (which is rare anyways) or if it's between the GPU silicon die and GPU substrate... which goes to the card's PCB (which is an unrepairable problem.) MATS is really only useful for a handful of specific cases, where you might have two data lines shorted together, or one being shorted by some external component, or other similar scenario.

                  So in short, artifacts = usually a fucked card.

                  I only get such GPUs now if they are at scrap prices - i.e. whatever the cooler would be worth in scrap metal. Of course, I don't scrap it, but instead try to use it on other video cards with crappier coolers... that is, only if the card I got is artifacting and doesn't want to work after a reflow or two. If it does work, put the cooler back on, let the fans run at 100% if I'm putting any significant load on it, and use it until it shits the bed again. After that, chances are very low anything can bring it back.

                  My two cents: don't waste too much of your $$ on these scrap cards, and only get a few for hot air / rework practice.



                  Your skills are laughable. Nearly everything you wrote is BS. Bumpgate was MANY years ago.
                  I reball cores nearly every day and have maybe one or two cards coming back after a year.
                  Last edited by ktmmotocross; 02-24-2025, 09:45 AM.


                    #10
                    Originally posted by ktmmotocross View Post
                    Your skills are laughable. Nearly everything you wrote is BS. Bumpgate was MANY years ago.
                    Yes, the bumpgate issue was indeed many years ago... but the failure mode is still the same: die separates from substrate, and that's that.
                    The silicon die and the PCB substrate have slightly different coefficients of expansion, so thermal cycling will eventually break the bond between these, no matter what. The difference is how fast this is expected to happen, which is not only dependent on the number of cycles, but also the temperature delta.
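To put rough numbers on the point about expansion mismatch and temperature delta, here is a minimal sketch. It assumes the simple first-order strain estimate (CTE difference times temperature swing) and a Coffin-Manson-style power law for cycles to failure. The CTE figures (about 2.6 ppm/K for silicon, about 15 ppm/K for an organic substrate) are typical textbook values, not measurements from this card, and the exponent of 2 is an illustrative assumption; real exponents depend on the joint material and geometry.

```python
def mismatch_strain(cte_die_ppm, cte_substrate_ppm, delta_t):
    """First-order thermal mismatch strain (dimensionless).

    CTE values are in ppm/K, delta_t is the temperature swing in K.
    """
    return abs(cte_substrate_ppm - cte_die_ppm) * 1e-6 * delta_t


def relative_cycle_life(delta_t_ref, delta_t_new, exponent=2.0):
    """Coffin-Manson-style ratio N_new / N_ref, with N proportional to delta_t ** -exponent.

    The exponent is an assumed, illustrative value.
    """
    return (delta_t_ref / delta_t_new) ** exponent


if __name__ == "__main__":
    # Assumed typical values: silicon die vs. organic package substrate.
    strain = mismatch_strain(2.6, 15.0, delta_t=50.0)
    print(f"mismatch strain for a 50 K swing: {strain:.2e}")

    # Halving the swing from 60 K to 30 K under the assumed exponent.
    print("relative cycle life:", relative_cycle_life(60.0, 30.0))
```

Under these assumptions, halving the temperature swing quadruples the expected number of cycles, which is why keeping the card cooler and avoiding deep hot/cold cycles matters more than the raw cycle count alone.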

                    Then there is one more failure mode on modern silicon: material wear. That's right - you can now consider silicon as a consumable part of your hardware (not that it wasn't before.) Why? Because chips made on modern process nodes have considerably less material in each feature than chips from older, larger nodes. So electromigration within modern chips is now certainly a more frequent cause of failure than it was before. And the fact that everything is pushed right up to the limit (clocks, core voltages, temperature, etc.) further accelerates this wear.
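The usual first-order model for electromigration wear is Black's equation, MTTF = A * J^(-n) * exp(Ea / (k*T)), where J is current density and T is absolute temperature. The sketch below only compares two operating points, so the constant A cancels out. The activation energy (0.7 eV) and current exponent (2) are placeholder assumptions for illustration, not values for any particular GPU; the 60 °C reference is the full-load temperature mentioned earlier in the thread, and the 80 °C comparison point is hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def relative_mttf(temp_ref_c, temp_new_c, current_ratio=1.0,
                  activation_energy_ev=0.7, exponent=2.0):
    """Ratio MTTF_new / MTTF_ref from Black's equation.

    current_ratio is J_new / J_ref. activation_energy_ev and exponent
    are assumed illustrative values, not datasheet figures.
    """
    t_ref = temp_ref_c + 273.15
    t_new = temp_new_c + 273.15
    thermal = math.exp(
        activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_new - 1.0 / t_ref)
    )
    return current_ratio ** (-exponent) * thermal


if __name__ == "__main__":
    # Same current density, hypothetical 80 C versus the 60 C reference.
    print(f"80 C vs 60 C: {relative_mttf(60.0, 80.0):.2f}x lifetime")
    # Same temperature, 20% higher current density (hypothetical).
    print(f"+20% current: {relative_mttf(60.0, 60.0, current_ratio=1.2):.2f}x lifetime")
```

Both temperature and current density enter the model, which matches the point about clocks, voltages and temperature all accelerating wear.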

                    When any of these issues cause a CPU or GPU to malfunction, there is NO fix for this.
                    But whatever, I ain't gonna rip people off, telling them that this is a proper fix.

                    Originally posted by ktmmotocross View Post
                    I reball cores nearly every day and have maybe one or two cards coming back after a year.
                    And have you asked yourself why that might be?
                    I mean, it could be because you were right and the issue really was cracked solder. Or it could be that the person who got their hardware repaired just said "F-- it, I'm not spending any more money on this" after it broke again... which would be why you never saw it come back.
                    Anyways, I'm not trying to belittle your skills or anything. But I was in this same position some years ago (actually a little more than a decade) working in a console repair shop. It was the same thing, essentially - reballs and "new" GPU chips on Xbox 360's and PS3's (back in their heyday of failures.) Guess what... after a number of years, many of those Xbox 360's we "fixed" with a reball still came back. In the end, it just came down to MS's shitty cooling design on the Xbox 360. Poor cooling = failed GPU... regardless of whether it was only reflowed or reballed with leaded solder.
                    That said, I do sincerely wish you good luck with your repairs/reballs.


                      #11
                      Just for the sake of the discussion on this thread, I will keep you posted on how long the reflow lasts. When it does go again, I will replace the RAM chips with new ones and properly reball the GPU to see how long that lasts. I will also add photos of the pads under the chips, just to see if there is any damage to the substrate and traces/pads.


                        #12
                        Originally posted by Sully114 View Post
                        Just for the sake of the discussion on this thread, I will keep you posted on how long the reflow lasts. When it does go again, I will replace the RAM chips with new ones and properly reball the GPU to see how long that lasts. I will also add photos of the pads under the chips, just to see if there is any damage to the substrate and traces/pads.
                        Awesome, please do!