Re: [Bug-ddrescue] Hanging ddrescue (infinite read operations)


From: Jay Ashworth
Subject: Re: [Bug-ddrescue] Hanging ddrescue (infinite read operations)
Date: Tue, 16 Jan 2018 12:13:41 -0500
User-agent: K-9 Mail for Android

There is one exception to that. Unix in general, and I believe Linux still,
distinguishes between fast and slow I/O devices. When a process is blocked on
a 'fast' device, like a disk, and the request hangs inside the device driver,
the process sits in uninterruptible sleep and no signal, not even SIGKILL, can
kill it until the driver returns.
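
On Linux, a process stuck like this normally shows up with state "D" (uninterruptible sleep) in ps and in /proc/<pid>/stat. As a rough sketch, assuming a Linux-style /proc, the following lists such processes; the function names are made up for illustration and are not part of ddrescue or any library.

```python
import os


def parse_proc_state(stat_line: str) -> str:
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so take everything after the last ')'.
    rest = stat_line[stat_line.rindex(")") + 1:]
    return rest.split()[0]


def uninterruptible_pids(proc_root: str = "/proc") -> list[int]:
    """Return the PIDs whose state is 'D' (uninterruptible sleep)."""
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                state = parse_proc_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line was unreadable
        if state == "D":
            pids.append(int(name))
    return pids
```

Calling uninterruptible_pids() while ddrescue hangs should include its PID if the read is blocked inside the driver; a process that is merely waiting in an interruptible call shows "S" instead.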

On January 16, 2018 11:03:59 AM EST, Antonio Diaz Diaz <address@hidden> wrote:
>Hi Linus,
>
>Linus Lüssing wrote:
>> First of all, I need to complain vehemently: GNU ddrescue works
>> too well :-). Now for the third time, it saved one of my neighbors'
>> data! How should people learn to make backups if such an
>> awesome tool as ddrescue exists? :P
>
>How true! :-)
>
>
>> Just kidding ;) - you guys are awesome, GNU ddrescue is one of the
>> most valuable (and still too unknown) pieces of free software in my
>> opinion.
>
>Thanks.
>
>
>> During these hangs, the Ctrl-C would do nothing and even a SIGKILL
>> would not kill ddrescue. The SATA-to-USB adapter would continue
>> flashing its blue LED, seemingly still trying to read.
>
>We have already had bad experiences with USB adapters on this list.
>(The advice is to plug the drive directly into the motherboard.) But in
>this case there also seems to be a bug in the kernel driver regarding
>SIGKILL. According to POSIX, SIGKILL cannot be caught or ignored. The
>GNU C library manual even states:
>"In fact, if 'SIGKILL' fails to terminate a process, that by itself
>constitutes an operating system bug which you should report."
>
>
>> Question A): Would it be possible to reset the operation from
>> software somehow? A timeout in ddrescue? Or does this sound like a
>> hangup on an even lower level, the Linux kernel (I was using a
>> 4.14.12 kernel on a 32bit ARM device, an Odroid U3) or maybe even
>> the disk and/or SATA-USB adapter so that power cycling the disk /
>> reconnecting the adapter is the only choice?
>
>The kernel driver for a device should know and implement whatever
>timeout is required for that type of device. The problem, I think, is
>that USB is not a device but a communication bus, and maybe the driver
>just sits and waits forever. In any case, if SIGKILL fails to terminate
>ddrescue, there is nothing that ddrescue can do.
>
>
>> Another observation: During the trimming and scraping phases (so
>> with the chunk size of 1 / 512B instead of 128x 512B chunks?) I
>> did not experience those tedious hangs anymore. Could it be a
>> firmware bug happening when requesting larger chunks?
>
>Maybe. Next time you could check whether --cluster-size=1 prevents the
>hangs during the copying phase.
>
>
>> Also, after pulling the USB cable, ddrescue unfroze and exited
>> with an error, as expected.
>
>This seems consistent with "the driver just sits and waits forever"
>(until the connection is interrupted).
>
>
>> Regarding the unplugging I also noticed: Pulling without a
>> previous Ctrl+C seemed like a bad idea. This led to ddrescue
>> adding many megabytes of false negatives to the mapfile.
>>
>> Question B): Would it be possible to prevent this?
>
>Yes, using --reopen-on-error, --max-error-rate or --max-bad-areas.
>--reopen-on-error should return immediately reporting "Can't reopen
>input file". (Maybe --reopen-on-error should be enabled by default).
>
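For anyone scripting such a rescue, here is a small sketch of how the options mentioned in this thread could be put together on a ddrescue command line. The helper name, the device path and the file names are placeholders chosen for illustration; only the option names come from the messages above.

```python
import shlex


def ddrescue_command(infile, outfile, mapfile, *,
                     reopen_on_error=True,
                     cluster_size=None,
                     max_bad_areas=None):
    """Build an argument list for ddrescue without running anything."""
    cmd = ["ddrescue"]
    if reopen_on_error:
        # Reopen the input after each read error, so a vanished device
        # is reported instead of being recorded as bad areas.
        cmd.append("--reopen-on-error")
    if cluster_size is not None:
        # Sectors copied per read; 1 is the value suggested earlier in
        # the thread for testing the hang.
        cmd.append(f"--cluster-size={cluster_size}")
    if max_bad_areas is not None:
        cmd.append(f"--max-bad-areas={max_bad_areas}")
    cmd += [infile, outfile, mapfile]
    return cmd


# Placeholder paths; replace them with the real device and files.
print(shlex.join(ddrescue_command("/dev/sdX", "disk.img", "disk.map",
                                  cluster_size=1)))
```

The list can be passed to subprocess.run() unchanged; printing it first makes it easy to check the device name before anything touches the disk.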
>
>> For the Ctrl+C and then unplugging I noticed: Sometimes it exits
>> with an "interrupted by user", sometimes with an "input file/device
>> vanished". I couldn't figure out when one or the other might
>> happen, the result was seemingly random.
>
>It depends on how fast the kernel removes the device name from /dev.
>ddrescue stats the device name after each read error and, if it still
>exists, moves to the next block (and then exits with "interrupted by
>user").
>
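The check described above can be pictured roughly as follows. This is only an illustrative sketch in Python, not ddrescue's actual code (ddrescue is written in C++); the function names and return strings are invented for the example.

```python
import os


def input_still_present(path: str) -> bool:
    """Return True if the device name can still be stat'ed."""
    try:
        os.stat(path)
    except FileNotFoundError:
        return False  # the kernel has already removed the name
    return True


def after_read_error(path: str) -> str:
    """Decide what to do once a read on `path` has failed."""
    if input_still_present(path):
        # The name is still there: treat it as an ordinary read error
        # and continue with the next block.
        return "next-block"
    # The name is gone: stop and report that the input vanished.
    return "input-vanished"
```

The race is visible here: if the unplug happens but /dev has not been updated yet, the stat succeeds and the failure is handled as a normal bad block.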
>
>> It also seemed that only in the latter exit case a bad cluster
>> was added to the mapfile. That was the desirable result for me, as
>> this was indeed a cluster hanging forever. In the "interrupted by
>> user" case it seemed that (usually?) no error was added to the
>> mapfile. Does that make sense?
>
>If ddrescue is blocked in the read call when unplugging, it should
>always mark the block as "bad" (non-trimmed, etc) in the mapfile. The
>user interrupt is checked before making the read call. Maybe the USB
>adapter is returning fake data, tricking ddrescue into marking the
>block as finished in the "interrupted by user" case?
>
>
>Best regards,
>Antonio.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

