guix-devel

Re: Data corruption on reboot


From: Danny Milosavljevic
Subject: Re: Data corruption on reboot
Date: Sat, 26 Apr 2025 22:36:15 +0200
User-agent: mu4e 1.12.9; emacs 29.4

Found something new:

 dannym@nova ~/src/guix-d/guix$ git diff origin/master..HEAD
diff --git a/gnu/services/base.scm b/gnu/services/base.scm
index 8c6563c99d..eca90c7fbb 100644
--- a/gnu/services/base.scm
+++ b/gnu/services/base.scm
@@ -351,11 +351,29 @@ (define %root-file-system-shepherd-service
              ;; Return #f if successfully stopped.
              (sync)
 
+             (call-with-output-file "/dev/console" (lambda (port) (display "hello world after sync!\n" port) (force-output)))
+             
+
+             (format #t "processes: ~s~%" (processes))
              (let ((null (%make-void-port "w")))
                ;; Redirect the default output ports.
                (set-current-output-port null)
                (set-current-error-port null)
 
+               (call-with-output-file "/dev/console" (lambda (port) (display "hzllo world ports gone!\n" port) (force-output)))
+
+               (system* #$(file-append (@ (gnu packages lsof) lsof)
+                                       "/bin/lsof"))
+
+               (call-with-output-file "/dev/console" (lambda (port) (display "hello world after lsof!\n" port) (force-output)))
+               (call-with-output-file "/dev/console" (lambda (port) (display mount port) (force-output)))
+
+               (if (zero? (system* #$(file-append (@ (gnu packages linux) psmisc) "/bin/fuser") "-ikvwm" "/"))
+                 (call-with-output-file "/dev/console" (lambda (port) (display "hxllo world fuser 0!\n" port) (force-output)))
+                 (call-with-output-file "/dev/console" (lambda (port) (display "hxllo world fuser not 0!\n" port) (force-output))))
+
+               (call-with-output-file "/dev/console" (lambda (port) (display "hello world end!\n" port) (force-output)))
+
                ;; Close /dev/console.
                (for-each close-fdes '(0 1 2))
 
@@ -369,6 +387,12 @@ (define %root-file-system-shepherd-service
                                     #:update-mtab? #f)
                              #t)
                            (const #f))
+                   (when (zero? n)
+                     (call-with-output-file "/dev/console" (lambda (port) (display "hello world endless loop!\n" port) (force-output)))
+
+                     (let loop ((q 0))
+                       ((@ (fibers) sleep) 1)
+                       (loop (+ q 1))))
                    (unless (zero? n)
                      ;; Yield to the other fibers.  That gives logging fibers
                      ;; an opportunity to close log files so the 'mount' call

Output:
hello world after sync!
hzllo world ports gone!
#<procedure ... mount>
hello world end!
[177.726300] EXT4-fs error (device dm-0): ext4_mark_recovery_complete:6276: comm shepherd: Orphan file not empty on read-only fs.

Sure enough, after reboot, fsck found a "definitely damaged" fs (and about
2 pages of errors again!).  fsck had reported it "clean" before this
attempt.

Linux nova 6.13.12 #1 SMP PREEMPT_DYNAMIC 1 x86_64 GNU/Linux

I'm thinking either a kernel bug or the fs was already damaged from
my previous attempts (and fsck didn't notice).  Still interesting, though.

See also
<http://patchwork.ozlabs.org/project/linux-ext4/patch/4E66478E.90102@redhat.com/>
for a discussion of a similar problem.

(There's a "noload" option, huh,
<https://unix.stackexchange.com/questions/61306/what-does-the-noload-option-do-in-fstab>,
but it's not that interesting for fixing this here: "does not change the
filesystem data in any way".)

"fsck -f" supposedly helps against it[1].  Well, I guess I can try that next.

[1]
<https://lists.opensuse.org/archives/list/bugs@lists.opensuse.org/message/JRKH5CFGCTLNXOWUKPVV5OJIH45PUYGI/>
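As an aside, the debugging patch above repeats the same
call-with-output-file form for every marker.  A minimal sketch of a
helper that could replace those calls, assuming plain Guile: the name
`console-log' and its optional FILE argument are hypothetical and not
part of the patch or of Guix.  Note that it also appends the newline
itself, so the marker strings would lose their trailing "\n".

```scheme
;; Sketch only: not part of the patch above.  `console-log' and its
;; optional FILE argument are hypothetical names.
(define* (console-log message #:optional (file "/dev/console"))
  "Write MESSAGE followed by a newline to FILE and flush the port."
  (call-with-output-file file
    (lambda (port)
      (display message port)
      (newline port)
      ;; Flush PORT itself; a bare (force-output) flushes the current
      ;; output port, which the patch redirects to a void port.
      (force-output port))))
```

With that, a marker becomes (console-log "hello world after sync!"),
and passing a regular file as the second argument allows trying the
helper without touching the console.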


