
Back when HDDs were dominant, it was common to use the dd command to migrate system disks. But that approach creates a problem with SSDs: unused space from the source disk gets written to the destination as well, as either zeros or garbage. SSDs don't like that; they need to know which sectors hold genuinely useful data.

Yes, I know I could run fstrim (or Optimize Drives on Windows) after the migration, but that approach is imperfect: partition gaps, reserved partitions (like the MSR), and space that is allocated but never actually written (NTFS MFT reservations, btrfs metadata/GlobalReserve, ext4 inode tables?) cannot be trimmed. As an enthusiast I'd like to find a way not to mis-allocate a single MiB on my shiny new SSD.
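
To be concrete, this is the after-the-fact trim I mean (a minimal sketch, assuming a Linux system with util-linux installed):

    # Trim free space on all mounted filesystems that support discard.
    # This only reaches space the filesystem knows is free; partition
    # gaps and allocated-but-never-written areas are not touched.
    sudo fstrim --all --verbose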

Then I came across the conv=sparse option of the dd command. It seems that by running dd if=/dev/sda of=/dev/sdb bs=4K conv=sparse, one could skip all unused sectors when moving data from the old SSD to the new one, assuming both have the Deterministic Read Zeros After TRIM (RZAT) feature. But will that create new problems? For example, some database redo-log files are inherently packed with zeros; will it cause errors if a sector of zeros is allocated in the OS but not in the underlying flash storage?
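
Roughly what I have in mind (a sketch only; hdparm -I works for SATA drives and prints the capability list, including RZAT):

    # Check that both drives report "Deterministic read ZEROs after TRIM":
    sudo hdparm -I /dev/sda | grep -i trim
    sudo hdparm -I /dev/sdb | grep -i trim

    # Clone, seeking over all-zero 4 KiB blocks instead of writing them:
    sudo dd if=/dev/sda of=/dev/sdb bs=4K conv=sparse status=progress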

Any insights are welcome.

Fred Qian
  • "will it create errors if a sector of 0s is allocated in the OS but not in the underlying flash storage" Of course NOT. Nothing at any level above the SSD controller can tell whether a logical block is mapped or unmapped. – Tom Yan Oct 23 '22 at 11:54
  • According to these tests, sparse won't save a great deal. – harrymc Oct 23 '22 at 11:59
  • @TomYan indeed "of course not", but not for the reason you are giving. The controller may well behave like this: "The OS wants to read a logical sector that I haven't mapped to any physical sector, so I'm going to tell it that it's completely wrong." But that would be a mess; even dd would be unable to clone the disk at the block level. – PierU Oct 23 '22 at 12:05
  • 1
    @harrymc I'm the author if the linked tests. sparse can save from nothing to a lot, it depends on the input data in the first place. It so happened my input data in these tests did not allow to save much; but it could be different. Still I think the link you gave may be helpful here: if the OP uses large obs then they will probably save less than they would with smaller obs. – Kamil Maciorowski Oct 23 '22 at 12:07
  • 1
    @PierU What are you even talking about? Have you misunderstood my point or what? I have no idea what you mean "may well behave like this". So you have bought some drive that comes with a warning like "don't read the part of the drive that hasn't been written or has been TRIM'd, otherwise the read will get you an I/O error instead of zeroes / pattern / old data"? – Tom Yan Oct 23 '22 at 12:10
  • @TomYan I was just pointing out that the reason you gave is not the right one. – PierU Oct 23 '22 at 12:18
  • @PierU I suppose you thought that I meant conv=sparse will have no effect anyway, and no, I wasn't trying to say that at all. What I mean is that whether the logical block is mapped behind the scenes will not determine whether a normal read succeeds or fails. Or maybe you are trying to say "things are still possible even when they could never happen", idk. (By "things" I mean drives that are insane by design.) – Tom Yan Oct 23 '22 at 12:20
  • 1
    The use of the conv=sparse option here entirely relies on the assumption that any SSD always returns zeros when the OS wants to read a logical sector that has been trimmed. While it seem to be a common behavior, I am not sure this can be garanteed. – PierU Oct 23 '22 at 12:23
  • Yes, and I've said nothing that is even related to that. And the OP obviously knows that working RZAT is a requirement for it to work as desired. – Tom Yan Oct 23 '22 at 12:27
  • (1) Your concern about packs of zeros that are useful data is justified. I think in general you need to blkdiscard the whole SSD prior to dd conv=sparse (a sketch of this follows the comments). (2) Unless you're totally sure the source drive is fully healthy, GNU ddrescue is better than dd. Unfortunately it doesn't let you use --sparse when writing to a block device, while dd conv=sparse simply works. I think with ddrescue you could achieve something similar by using --generate-mode in a clever way, but this requires reading the source file twice. – Kamil Maciorowski Oct 23 '22 at 12:32
  • @TomYan my previous comment was answering the OP, not you. About our argument, I think there is some misunderstanding, but the format of the comments is not convenient for sorting it out. – PierU Oct 23 '22 at 12:45
  • A long discussion but no conclusion - or am I wrong? Wouldn't it be safer and surer to back up using a disk-image utility that knows about sparse files and unallocated sectors (the latter may imply the former)? – harrymc Oct 23 '22 at 13:07
  • @KamilMaciorowski I think you meant fstrim. You don't need to blkdiscard a new drive and you certainly don't want to use it on a drive to clone from... – Tom Yan Oct 25 '22 at 00:08
  • @TomYan I meant blkdiscard on the new SSD, in general, in case "new" means "not the old one", not necessarily "totally not used before". If blkdiscard is not needed for this particular SSD then it will be a harmless no-op. – Kamil Maciorowski Oct 25 '22 at 06:11
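
Putting Kamil Maciorowski's suggestion together, a minimal sketch (assuming /dev/sda is the source, /dev/sdb is the new SSD with nothing worth keeping, and both drives support RZAT):

    # 1. Unmap every block on the destination first, so the blocks that
    #    dd skips are left trimmed rather than holding stale data; with
    #    RZAT they read back as zeros, matching the source's zero blocks.
    sudo blkdiscard /dev/sdb

    # 2. Clone, skipping (seeking over) all-zero 4 KiB blocks:
    sudo dd if=/dev/sda of=/dev/sdb bs=4K conv=sparse status=progress

Provided RZAT behaves as advertised, this also covers the zero-filled redo-log case: a sector of zeros that dd skips still reads back as zeros afterwards, even though it stays unmapped in flash.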

0 Answers