I have a Synology NAS in RAID 6 that had a drive fail. I replaced the drive, and then a power failure knocked two other drives out of the array, crashing the storage pool. I was able to move all the drives over to another system and rebuild the RAID and LVM, but there are errors on the btrfs filesystem that prevent mounting.
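
(Reassembling the array and activating the volume group on the other box was along these lines; this is a rough sketch rather than the exact commands, with the vg1/volume_1 names taken from the device path used throughout below:)

mdadm --assemble --scan          # reassemble the Synology md array from its member disks
vgchange -ay vg1                 # activate the LVM volume group that holds the btrfs volume
ls -l /dev/mapper/vg1-volume_1   # the logical volume should then be visible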

I have run btrfs restore with a dry run and a number of the expected files show up.
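
(The dry run was of this general form; /mnt/restore is only a placeholder target directory, and with -D/--dry-run nothing is actually written:)

btrfs restore -D -v /dev/mapper/vg1-volume_1 /mnt/restore

Attempting to mount currently gives: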

mount error

can't read superblock on /dev/mapper/vg1-volume_1.

This is the last error from dmesg, along with the "can't read superblock" message:

BTRFS info (device dm-1): using crc32c (crc32c-intel) checksum algorithm
BTRFS info (device dm-1): using free space tree
BTRFS critical (device dm-1): corrupt leaf: root=1 block=503087104 slot=23, invalid root flags, have 0x400000000 expect mask 0x1000000000001
BTRFS error (device dm-1): read time tree block corruption detected on logical 503087104 mirror 1
BTRFS critical (device dm-1): corrupt leaf: root=1 block=503087104 slot=23, invalid root flags, have 0x400000000 expect mask 0x1000000000001
BTRFS error (device dm-1): read time tree block corruption detected on logical 503087104 mirror 2
BTRFS error (device dm-1): open_ctree failed
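
(The failing mount is presumably of this plain read-only form; the rescue= variants listed after it are read-only mount options that exist on newer kernels (5.11+) and are noted here only for reference, with /mnt/recovery as a placeholder mount point:)

mount -o ro /dev/mapper/vg1-volume_1 /mnt/recovery
mount -o ro,rescue=usebackuproot /dev/mapper/vg1-volume_1 /mnt/recovery
mount -o ro,rescue=ignorebadroots /dev/mapper/vg1-volume_1 /mnt/recovery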

The previous error I was able to clear by following https://unix.stackexchange.com/questions/369520/btrfs-super-recover-says-all-superblocks-are-good-but-mount-disagrees

Zeroing the log cleared that first error.
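
(For reference, that step is just:)

btrfs rescue zero-log /dev/mapper/vg1-volume_1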

btrfs rescue super-recover -v /dev/mapper/vg1-volume_1
All Devices:
        Device: id = 1, name = /dev/mapper/vg1-volume_1

Before Recovering:
    [All good supers]:
            device name = /dev/mapper/vg1-volume_1
            superblock bytenr = 65536

            device name = /dev/mapper/vg1-volume_1
            superblock bytenr = 67108864

            device name = /dev/mapper/vg1-volume_1
            superblock bytenr = 274877906944

    [All bad supers]:

All supers are valid, no need to recover

btrfs check --check-data-csum
Opening filesystem to check...
Checking filesystem on /dev/mapper/vg1-volume_1
UUID: c9d2c563-30cb-4b27-b29a-d5f2642597d8
[1/7] checking root items
[2/7] checking extents
Invalid key type(BLOCK_GROUP_ITEM) found in root(202)
ignoring invalid key
--------There are a lot of these
Invalid key type(BLOCK_GROUP_ITEM) found in root(202)
ignoring invalid key
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking csums against data
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 37552772378624 bytes used, no error found
total csum bytes: 2551494312
total tree bytes: 7828389888
total fs tree bytes: 4063068160
total extent tree bytes: 754450432
btree space waste bytes: 1181718069
file data blocks allocated: 37548545822720
 referenced 37710498009088
btrfs inspect-internal dump-super /dev/mapper/vg1-volume_1
superblock: bytenr=65536, device=/dev/mapper/vg1-volume_1
---------------------------------------------------------
csum_type               0 (crc32c)
csum_size               4
csum                    0x5be2c7ca [match]
bytenr                  65536
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    c9d2c563-30cb-4b27-b29a-d5f2642597d8
metadata_uuid           c9d2c563-30cb-4b27-b29a-d5f2642597d8
label                   2019.10.19-07:46:08 v24922
generation              2838041
root                    29818880
sys_array_size          129
chunk_root_generation   2793730
root_level              1
chunk_root              21020672
chunk_root_level        1
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             39958383427584
bytes_used              37552772378624
sectorsize              4096
nodesize                16384
leafsize (deprecated)   16384
stripesize              4096
root_dir                6
num_devices             1
compat_flags            0x8000000000000000
compat_ro_flags         0x3
                        ( FREE_SPACE_TREE |
                          FREE_SPACE_TREE_VALID )
incompat_flags          0x16b
                        ( MIXED_BACKREF |
                          DEFAULT_SUBVOL |
                          COMPRESS_LZO |
                          BIG_METADATA |
                          EXTENDED_IREF |
                          SKINNY_METADATA )
cache_generation        18446744073709551615
uuid_tree_generation    2838039
dev_item.uuid           3ce7be34-1a1f-42aa-9330-186edef4841e
dev_item.fsid           c9d2c563-30cb-4b27-b29a-d5f2642597d8 [match]
dev_item.type           0
dev_item.total_bytes    39958383427584
dev_item.bytes_used     38619297349632
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0

I'm not sure how to proceed, since a lot of the options are potentially destructive. When the drives were in the read-only Synology pool I was able to download all my critical files, but I can't figure out why the volume can't be mounted on a regular Linux box.

Addition:

I've done more research and found a mailing list thread about dumping the corrupt leaf (https://lore.kernel.org/linux-btrfs/da987d5b-630f-bf41-8c5c-bb222d09e5a4@gmx.com/T/).

I've dumped the leaf reported in dmesg with

btrfs ins dump-tree --follow -b 503087104 /dev/mapper/vg1-volume_1

and found two offending root items:

item 23 key (2350 ROOT_ITEM 0) itemoff 12022 itemsize 439
                generation 2838037 root_dirid 256 bytenr 502956032 
                byte_limit 0 bytes_used 81920
                last_snapshot 0 flags 0x400000000(none) refs 1
                drop_progress key (0 UNKNOWN.0 0) drop_level 0
                level 1 generation_v2 2838037
                uuid 7d68b7fc-ce56-ea45-8790-420407a74ad2
                parent_uuid 00000000-0000-0000-0000-000000000000
                received_uuid 00000000-0000-0000-0000-000000000000
                ctransid 2838037 otransid 241922 stransid 0 rtransid 0
                ctime 1708421965.207146578 (2024-02-20 04:39:25)
                otime 1643063079.756142504 (2022-01-24 17:24:39)
                stime 0.0 (1969-12-31 19:00:00)
                rtime 0.0 (1969-12-31 19:00:00)
item 24 key (2350 ROOT_BACKREF 257) itemoff 11990 itemsize 32
                root backref key dirid 256 sequence 113 name @synologydrive
item 25 key (2351 ROOT_ITEM 0) itemoff 11551 itemsize 439
                generation 2836999 root_dirid 256 bytenr 34504704 
                byte_limit 0 bytes_used 16384
                last_snapshot 0 flags 0x400000000(none) refs 1
                drop_progress key (0 UNKNOWN.0 0) drop_level 0
                level 0 generation_v2 2836999
                uuid 131923b8-a88e-0c45-b91f-a43baed3487c
                parent_uuid 00000000-0000-0000-0000-000000000000
                received_uuid 00000000-0000-0000-0000-000000000000
                ctransid 2836999 otransid 268246 stransid 0 rtransid 0
                ctime 1708303323.350998786 (2024-02-18 19:42:03)
                otime 1647641350.503037199 (2022-03-18 18:09:10)
                stime 0.0 (1969-12-31 19:00:00)
                rtime 0.0 (1969-12-31 19:00:00)

From the error messages the flags seem to be the issue, and when I dump-tree the trees those two root items point to, both show a large number of valid blocks under them. Most of the filenames seem to revolve around Synology NAS services, which were still running during the failed rebuild. As far as I can tell, the rejected value 0x400000000 (bit 34) is not a defined root flag at all, while the accepted mask 0x1000000000001 covers only the read-only (bit 0) and dead-subvolume (bit 48) flags, so the field looks like it simply picked up garbage.

I think I can brute force all the filenames under these trees just to verify, but is there a way to force the flags on this leaf back to some form of operable state?
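
(For the brute-force verification, a dry-run restore limited to those subvolumes is one non-destructive way to enumerate the filenames, using the objectids 2350 and 2351 from the dump above; /mnt/scratch is only a placeholder and -D keeps it a dry run:)

btrfs restore -D -v -r 2350 /dev/mapper/vg1-volume_1 /mnt/scratch
btrfs restore -D -v -r 2351 /dev/mapper/vg1-volume_1 /mnt/scratch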

  • The RAID was recovered and restored, as well as the underlying LVM group, etc. It won't mount on a clean Linux box. Also, the btrfs RAID 6 feature was not used; btrfs sits on top of standard Linux software RAID with LVM. – L. Pinkston Mar 11 '24 at 03:25
  • I see, so I think I made a mistake. All I read was the first line, "BTRFS RAID 6 no longer working". ^^ Sorry about that. – paladin Mar 11 '24 at 07:06

1 Answer

Sorry for my first comment. But at least your problem is a good example of why you cannot trust block-level RAID. One might think that recovering the RAID array is all you need for a full recovery, but block-level RAID is not 100% safe, and because of its block-level nature you won't even detect an error. BTRFS is a strict filesystem; nearly everything is checksummed. The reason your BTRFS detects these errors is that the RAID failure destroyed some data on the filesystem. Less strict filesystems like ext4 wouldn't be able to detect these kinds of errors, or would simply "delete" the erroneous items (your lost files). I strongly recommend using a backup.

It might be possible to "repair" the BTRFS filesystem, but you should know that this kind of repair will never give you the lost data back. I don't recommend this approach. Use a backup. If you have no backup and the data is very important to you, consult a professional data recovery company. Don't experiment with the BTRFS "recover" tools; they are extremely dangerous in terms of data safety.

When you restore from backup, I recommend the following setup for the new filesystem.

Don't use mdadm, don't use LVM; just use BTRFS RAID 1. BTRFS has built-in volume management and RAID functionality, which is safer in terms of data protection than mdadm+LVM.
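
(A minimal sketch of such a setup, with /dev/sdb and /dev/sdc as placeholder devices:)

mkfs.btrfs -m raid1 -d raid1 /dev/sdb /dev/sdc   # mirror both metadata and data
mount /dev/sdb /mnt/data                         # any member device can be passed to mount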

PS: Don't use BTRFS RAID 5 or RAID 6; those modes are experimental and not safe to use.

paladin
  • I checked that the RAID event counts were all identical, and the filesystem mounts on my Synology NAS box as read-only (I've also pulled several GB of files successfully). It will not recover because it lists the pool as crashed even though the drives are fine. I'm trying to move the drives to a separate Linux box to restore the array. – L. Pinkston Mar 11 '24 at 21:33
  • You have 4 layers. Layer 1, your physical disk drives; Layer 2, your RAID; Layer 3, your LVM; Layer 4, your BTRFS filesystem. If you have not used an uninterruptible power supply (UPS), it's hard to tell which layer has failed. Layer 4, your checksumming BTRFS filesystem, has noticed that something is wrong. I would guess the error is in Layer 2 or Layer 3. Your RAID and LVM tools only see the block level; they can only check the block level and cannot verify against a filesystem. All they can say is that from their point of view things look okay, but BTRFS knows better: they are wrong. – paladin Mar 12 '24 at 11:31
  • Your RAID events are wrong! They are corrupt! Don't trust them! Your filesystem is being mounted read-only to protect your data! If you force it to mount read-write somehow, you will lose data! Don't do this! If your mdadm RAID and LVM have no protection against power-failure crashes (no UPS), they may not be able to self-heal and can become corrupt. Use a backup. Build a new array, use only BTRFS, don't use mdadm RAID, don't use LVM. – paladin Mar 12 '24 at 11:34