10 hours lost because stupid laptop wouldn't boot! GRRR!

    Woke up this morning ready to get some work done, and the laptop was frozen. Tried everything, and ultimately had to hard reset. And then I got the dreadful blank screen with the little blinking dash at the top. I was able to boot into recovery and tried various things, which led to the computer telling me the disk was full. I thought, well, that's just impossible. I just installed Kali a month ago; there's no way I filled up 300 GB in a month. du and df both also said the disk usage was at 100%, but I still didn't believe it.

    Next step was to boot the live CD and run fsck to check the disk for errors. It took about 5 hours, and then another 3 to fix the bad blocks, and still I had disk usage at 100%.

    Ultimately, I ran sudo du -a / | sort -n -r | head -n 20 and found that /var/log was using 250 GB. It was mainly 3 files, syslog, messages, and user.log, at about 80 GB each. I truncated them and I'm back up and running again. I obviously didn't fix the underlying problem, but I'll figure that out over the weekend. If I had just believed what it was telling me, I probably would have had it fixed in minutes.
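
For anyone hitting the same wall, here is a minimal sketch of that last step: one helper to list the biggest entries under a directory and one to empty a runaway log in place. The names largest_under and truncate_log are made up for illustration, and the -k and -x flags are additions to the command quoted above, not something the post used.

```shell
#!/bin/sh
# Sketch only. largest_under and truncate_log are hypothetical helper names.

# List the largest entries under a directory, biggest first.
largest_under() {
    dir=${1:-/var/log}
    count=${2:-20}
    # -a: list files as well as directories
    # -k: report sizes in 1 KiB units so the numeric sort is meaningful
    # -x: stay on the filesystem that holds "$dir"
    du -akx "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Empty a file in place. The file itself stays, so a daemon that still has
# it open keeps writing to the same (now empty) file.
truncate_log() {
    : > "$1"
}
```

Something like `largest_under /var/log 20` shows where the space went, and `sudo sh -c ': > /var/log/syslog'` empties a root-owned log. Truncating is usually safer than deleting here, because a deleted file that a daemon still holds open keeps occupying disk space until the daemon closes it.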
    Last edited by lookimback; 10-25-2018, 02:22 AM.

    #2

    Sigh, log files hogging up all the disk space again! I have that issue too on Windows with those Dr. Watson application error log files. So I just emptied the log file to a blank zero-byte file and set it to read-only, and no more damn log files hogging up all the disk space!

      #3

      Originally posted by ChaosLegionnaire
      Sigh, log files hogging up all the disk space again! I have that issue too on Windows with those Dr. Watson application error log files. So I just emptied the log file to a blank zero-byte file and set it to read-only, and no more damn log files hogging up all the disk space!
      Not a bad idea.

        #4

        Logs are useful.
        Maybe direct them to a separate partition, or make a script to trim or delete them when you shut down.
        I use a script to wipe the browser cache folders like that.

          #5

          The dreaded blinking cursor of doom usually means a corrupted boot sector...

            #6

            Kali must handle running out of disk space oddly, then. I've driven my SUSE install out of disk space multiple times. All that happens is that syslog and X11 refuse to start, dropping me to a TTY.

              #7

              Originally posted by stj
              Logs are useful.
              +42

              Originally posted by stj
              Maybe direct them to a separate partition, or make a script to trim or delete them when you shut down.
              +1

              I have a minimalist /var as part of the root partition. Then I mount a separate partition OVER it when booting to multiuser. So, when /var fills up, I get messages about THAT filesystem being full, but the root filesystem is still "workable".

              I can then either drop to single-user mode (and unmount /var) to figure out where the culprit lies, or sort out the problem with /var still mounted.
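
A quick way to check whether a box is set up like this is to compare what df reports for / and /var. The following is only a sketch; same_filesystem is a made-up helper name, and it assumes device names and mount points without spaces.

```shell
#!/bin/sh
# Sketch only. same_filesystem is a hypothetical helper name.
# Succeeds when both paths are reported on the same device and mount point.
same_filesystem() {
    # df -P prints a header plus one line per path in a fixed column layout:
    # column 1 is the device, column 6 the mount point.
    a=$(df -P "$1" | awk 'NR == 2 { print $1, $6 }')
    b=$(df -P "$2" | awk 'NR == 2 { print $1, $6 }')
    [ "$a" = "$b" ]
}
```

For example, `same_filesystem / /var && echo "/var shares the root filesystem"` prints the warning only when a full /var would also mean a full root.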

              E.g., I've been building lots of "packages" from source. The scripts that do all of the work really beat on the accounting logs. So, /var/account fills up pretty easily (I only have 500 MB set aside for /var). I get a message on the console complaining that /var is full, remind myself to turn off accounting (it is normally turned on in rc.d), trim those log files (sa(8)), and things keep moving along.

              I also have newsyslog(8) set up to aggressively roll over log files and compress them. (I log at a very fine level of detail and keep log files for a long time.)

                #8

                Originally posted by stj
                Logs are useful.
                Maybe direct them to a separate partition, or make a script to trim or delete them when you shut down.
                I use a script to wipe the browser cache folders like that.
                I agree they can be useful. I'll probably make a script to delete entries more than x days old or something.
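
A rough sketch of what such a script might look like. Note the assumptions: it works on whole rotated files rather than individual entries, prune_old_logs is a made-up name, and the *.gz / *.[0-9] patterns are a guess at how rotated logs are named on a given system.

```shell
#!/bin/sh
# Sketch only. prune_old_logs is a hypothetical helper name.
# Usage: prune_old_logs DIR DAYS [list|delete]
prune_old_logs() {
    dir=$1
    days=$2
    mode=${3:-list}   # default is a dry run that only prints the matches
    # -mtime +N matches files whose age in whole days is greater than N.
    if [ "$mode" = delete ]; then
        find "$dir" -type f \( -name '*.gz' -o -name '*.[0-9]' \) \
            -mtime +"$days" -exec rm -f {} +
    else
        find "$dir" -type f \( -name '*.gz' -o -name '*.[0-9]' \) \
            -mtime +"$days" -print
    fi
}
```

Running `prune_old_logs /var/log 14` first shows what would go, and adding `delete` as a third argument actually removes it. The live log files (syslog, messages and so on) are deliberately not matched, since deleting a file a daemon is still writing to does not free the space.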

                Originally posted by goontron
                Kali must handle running out of disk space oddly, then. I've driven my SUSE install out of disk space multiple times.
                I'm wondering if it would have eventually started. I waited about 15 minutes, then decided it wasn't working. But, even after clearing temp files, it still just hung there.

                Originally posted by RJARRRPCGP
                The dreaded blinking cursor of doom usually means a corrupted boot sector...
                Possibly. After finding the huge log files and truncating them, I also removed the old Kali kernel and updated grub, so maybe that would have fixed any boot problems. I also ran dpkg --configure -a, apt-get --fix-broken install, apt-get update, apt-get upgrade, and apt-get dist-upgrade before even trying to boot again, because I wasn't sure what errors the bad blocks would have left me with. It's way faster now than it had been recently. I'm guessing it was using a lot of resources to write to those huge log files.

                  #9

                  logrotate; or if using journald, there are tons of options to limit log growth...
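
As an illustration of the logrotate route, the sketch below just prints a stanza that caps one log file. logrotate_stanza is a made-up helper and the numbers are placeholders; the directives themselves (daily, maxsize, rotate, compress, missingok, notifempty) are standard logrotate options.

```shell
#!/bin/sh
# Sketch only. logrotate_stanza is a hypothetical helper name.
# Usage: logrotate_stanza LOGFILE [KEEP] [MAXSIZE]
logrotate_stanza() {
    # Prints the stanza to stdout; it does not install anything.
    cat <<EOF
$1 {
    daily
    maxsize ${3:-100M}
    rotate ${2:-7}
    compress
    missingok
    notifempty
}
EOF
}
```

On Debian-family systems /var/log/syslog is normally already covered by a file under /etc/logrotate.d, so the existing stanza would be the one to adjust rather than adding a second entry for the same log. Also keep in mind that logrotate only acts when it runs, which is typically once a day, so a flood of messages can still fill a disk between runs. With journald, the rough equivalent is SystemMaxUse= in /etc/systemd/journald.conf, or a one-off `journalctl --vacuum-size=500M`.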

                  Me, I just watch my disk space. If I know I didn't download something big, I'd better not see my disk usage percentage go up, even over time.
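
If watching by eye is not practical, the same check can be scripted. A minimal sketch, assuming df -P output without spaces in the device name; usage_percent and check_usage are made-up names and the 90% default is arbitrary.

```shell
#!/bin/sh
# Sketch only. usage_percent and check_usage are hypothetical helper names.

# Print the use% (without the % sign) of the filesystem holding a path.
usage_percent() {
    # Line 2 of df -P is the data line; column 5 is the capacity, e.g. "95%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print a warning and return non-zero when usage is at or above the limit.
check_usage() {
    target=$1
    limit=${2:-90}
    used=$(usage_percent "$target")
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $target is ${used}% full (limit ${limit}%)"
        return 1
    fi
}
```

Dropped into cron as something like `check_usage / 90`, it stays silent until the limit is reached, and cron mails whatever it prints.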

                    #10

                    Originally posted by eccerr0r
                    Me, I just watch my disk space. If I know I didn't download something big, I'd better not see my disk usage percentage go up, even over time.
                    That depends on the number and types of services that you have running on the machine (and, of course, the logging detail).

                    Most of the "appliances" on my network use syslogd(8) to record "significant events" -- that aren't actually happening ON the machine that is logging them. Just turning a box on and off again can generate several KB of logs (effectively, dmesg(8) output from the appliance).

                    Mail to root and operator autogenerated by daily(5)/weekly(5)/monthly(5) regularly swell those mbx's (presently, root's is ~40M, operator's is ~10M).

                    "Raw" accounting files can quickly grow to hundreds of MB before sa(8) runs (mine had grown to ~300MB last night after just a few hours of heavy make(1) activity).

                    cron(8) keeps a steady trickle of logging activity.

                    Each of my ~dozen UPSs uploads a log of current power/battery/load conditions to ~UPS/logs/<hostname> every 10 minutes.

                    The DHCPd, LPd, NTPd, TFTPd, HTTPd and FTPd services record their actions there, too -- even if those actions don't really "add content" to the box.

                    As a rule, it's wise to put portions of the hierarchy that can be "unattendedly" consumed on separate partitions so the core system is always operable (and bootable). Usually, just /home and /var will handle most of your "exposure".

                    (And quotas if you really need to lock down specific UIDs).

                      #11

                      Well, for a server that you don't monitor, yes, you can use logrotate or configure journald. But on a machine meant as a desktop that you work on frequently, you do monitor its usage, and it's not hard to say "something's not quite right" and go investigate.

                        #12

                        Originally posted by eccerr0r
                        Well, for a server that you don't monitor, yes, you can use logrotate or configure journald. But on a machine meant as a desktop that you work on frequently, you do monitor its usage, and it's not hard to say "something's not quite right" and go investigate.
                        But that assumes you are sitting watching it to SEE when it coughs.

                        E.g., I have been building packages (pkgsrc) on one of my desktop machines. The first step is:
                        Code:
                        # make fetch-list > foo
                        which creates a script that will, eventually, fetch the source code tarballs for the various packages.

                        This takes a fair bit of time (a few thousand packages). But it's worth the effort, as I don't want to bother fetching stuff that I've already got on hand.

                        So, I just let the machine chug away at it -- for a day or three.

                        But, "make fetch-list" executes a gazillion commands to do its work (because it's a recursive make script). And, as I had accounting turned on, that quickly swelled the accounting log file to a few hundred megabytes. And, overfilled the /var partition (/var/account). As the accounting files rarely grow that big, they aren't pruned but once per day!

                        Had I been sitting there watching the BLANK screen (output having been redirected to "foo"), I would have noticed the error logged to console.

                        But, why would I sit around watching something that will take hours/days? Instead, I'll work on something else.

                        Because /var was a separate partition, when it was overfull, it affected logging and other things that rely on /var -- but, not the "make fetch-list". Had /var been on the root partition, the box would probably have panicked/crashed.

                        This same sort of "problem" will manifest when I eventually run that "foo" script to pull down the various, missing tarballs. (but, I will have made note of its impact on accounting and taken steps to ensure I wasn't bitten, again!)

                          #13

                          I think I might have figured it out. I was trying to figure out why an eMMC wasn't being detected and ran sudo tail -f /var/log/syslog. What I saw was a steady stream of error messages, like hundreds per second, from guvcview saying Unable to dequeue buffer: No such device. Well, I use guvcview with my USB microscope and had unplugged it without closing the app first. I had also used it the night before this issue came up, and probably left it open all night.
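
A sketch of how the noisiest program could be picked out without watching tail -f scroll by. It assumes the traditional syslog line layout ("Oct 25 02:22:01 host prog[pid]: message"), where the program tag is the fifth field; top_loggers is a made-up name.

```shell
#!/bin/sh
# Sketch only. top_loggers is a hypothetical helper name.
# Reads syslog lines on stdin and prints "count program", noisiest first.
top_loggers() {
    awk '{
            tag = $5
            sub(/\[.*/, "", tag)   # drop "[pid]:" when a pid is present
            sub(/:$/, "", tag)     # drop the trailing colon otherwise
            n[tag]++
         }
         END { for (t in n) print n[t], t }' | sort -rn | head -n "${1:-5}"
}
```

With a log this size it makes sense to feed it only the tail, e.g. `sudo tail -n 100000 /var/log/syslog | top_loggers 5`. A system that logs with ISO-style timestamps puts the tag in a different field, so the `$5` would need adjusting.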

                            #14

                            Originally posted by lookimback
                            I think I might have figured it out...
                            PEBCaK

                              #15

                              Originally posted by Curious.George
                              PEBCaK
                              Lol, perhaps, but I'd say this is a situation which should be expected to occur.

                                #16

                                Originally posted by lookimback
                                Lol, perhaps, but I'd say this is a situation which should be expected to occur.
                                Agreed! But, that's ALWAYS the reason behind software bugs -- the software developer makes an ASSUMPTION which is arbitrary and, often, incorrect! And, when confronted with HIS failure, replies with something like "But you're not supposed to DO that!" (Then, why did you LET me?!)

                                  #17

                                  Originally posted by Curious.George
                                  But that assumes you are sitting watching it to SEE when it coughs.
                                  You just turned your "workstation" into an unattended "server" so your use model changed...

                                    #18

                                    Originally posted by eccerr0r
                                    You just turned your "workstation" into an unattended "server" so your use model changed...
                                    So, if I am willing to sit and watch it compile hundreds of pieces of code, it's a workstation. But, if I get up and walk away, it magically becomes a server?

                                    And, prior to walking away, I should reinstall and reconfigure the software so that it operates AS a server?

                                    Then, when I need to write some NEW code, change everything BACK??

                                    If I sit down in front of machine A, grab the motion controller in my left hand and digitizing pen in my right and begin to design some 3D mechanical parts, it's a workstation, right? And, once I "assemble" those parts into a 3D model set to motion, it becomes a SERVER if I don't care to sit and watch while the machine does its elaborate ray-tracing and rendering?

                                    If I move over to machine B while that's happening and put the finishing touches on a schematic, I'm back at a workstation (?). But, once I place those components on a virtual piece of FR4 and click "autoroute", it transforms itself into a SERVER? Unless I patiently sit and watch it route foils and rip up existing foils as it encounters problems?

                                    Is there a time limit for how long a workstation can be left unattended before it transforms into a server? Should I buy a large stuffed animal to set in my chair, in my absence, to "trick" the machine into retaining its workstation identity?

                                    Don't be silly.

                                    OTOH, the box sitting under my dresser with neither a keyboard nor a monitor and providing services (gee, that word sounds a lot like servers!) to the rest of my network IS a server. And, ADDING a monitor or a keyboard to it won't change that fact/role.

                                    OToOH, the same boxes, running the same OS, sitting atop my workbench are workstations (even without keyboards).
                                    Last edited by Curious.George; 10-26-2018, 11:12 AM.

                                      #19

                                      The problem is that you're doing something unattended. That's not the usage model of a typical machine, where you do something and get a response back right away. Waiting for hours, unattended, is typical of a "server" type application.

                                      What's really silly here is dumping stuff to a file whose size you don't know and that may overflow the disk. That's what's silly. Be more responsible about what you do to your machine. There's no excuse here.

                                        #20

                                        Originally posted by eccerr0r
                                        The problem is that you're doing something unattended. That's not the usage model of a typical machine, where you do something and get a response back right away. Waiting for hours, unattended, is typical of a "server" type application.
                                        The amount of time you wait for a result has nothing to do with the type of application/machine. Ever download something large? Do you sit and WATCH the download progress? By your definition, if you get up (or look away) your machine now assumes the characteristics of a SERVER -- because the action wasn't quick enough (or you weren't patient enough) to keep you glued to the chair.

                                        Originally posted by eccerr0r
                                        What's really silly here is dumping stuff to a file whose size you don't know and that may overflow the disk. That's what's silly. Be more responsible about what you do to your machine. There's no excuse here.
                                        If you reread my post, you will see that redirecting the output of the make(1) to "foo" was not the cause of the problem. Rather, the built-in accounting facilities were not expecting hundreds of thousands of commands to be executed, in rapid succession, between cleanings of the accounting files.

                                        Also note that my installation AVOIDED any "problem" -- as the OP's didn't -- by ensuring the accounting log file was allowed to grow on a separate partition where it couldn't interfere with the continued operation (nor "bootability") of the machine.

                                        Similarly, the output of the make(1) was redirected to a file that exists on still another partition -- the one that holds all of the sources for the system, X, and all "packages".

                                        That's how I get uptimes of 400+ days without a crash/reboot. Forty years of running/administering UN*X systems teaches one a little bit about how to keep availability at six nines! When there are "other users" who rely on the machine being "up", you learn how to do lots of things without taking the system down -- lest you want to be doing maintenance in the wee hours of Sunday mornings!

                                        But, hey, I can understand you've probably never done anything that taxed the ability of your machine. Playing solitaire and shooting bad guys is all some folks CAN do with theirs!
