I have created a Python tool which can do this. I did this because I tried @Thomas Luzat's approach in both my own and @Johannes Ernst's implementation, and the used space doubled from 20 GB to 40 GB during the cloning procedure. I thought something more efficient was needed.
Consider this common file system history:
```
current ---------------------------------\
  |           |           |              |
snap4       snap3       snap2          snap1
```
With Thomas' algorithm, "current" would be cloned first, and all snapshots (being snapshots of former states of "current") would use "current" as clone source / parent. Obviously, it would be better to base snap3 on snap4, snap2 on snap3, etc.
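To illustrate the chaining idea, here is a minimal sketch using `btrfs send -p`. It is not the actual tool; the mount points /mnt/src and /mnt/dst and the snapshot names are placeholders, the snapshots are assumed to be read-only, and they are assumed to form the simple linear history shown in the diagram (newest first).

```python
#!/usr/bin/env python3
"""Sketch: transfer each snapshot relative to the previously transferred
one, so unchanged extents are cloned instead of copied again."""
import subprocess

SRC = "/mnt/src"   # source btrfs mount point (assumed)
DST = "/mnt/dst"   # target btrfs mount point (assumed)
# Read-only snapshots, ordered so each one is followed by its closest
# relative in this simple linear history (newest first, as in the diagram).
SNAPSHOTS = ["snap4", "snap3", "snap2", "snap1"]

def send_receive(subvol, parent=None):
    """Pipe `btrfs send` into `btrfs receive`, optionally with -p parent."""
    send_cmd = ["btrfs", "send"]
    if parent:
        # With -p, only the delta against the parent is sent, and the
        # receiving side clones unchanged extents from the parent.
        send_cmd += ["-p", f"{SRC}/{parent}"]
    send_cmd.append(f"{SRC}/{subvol}")

    send = subprocess.Popen(send_cmd, stdout=subprocess.PIPE)
    recv = subprocess.run(["btrfs", "receive", DST], stdin=send.stdout)
    send.stdout.close()
    if send.wait() != 0 or recv.returncode != 0:
        raise RuntimeError(f"transfer of {subvol} failed")

parent = None
for snap in SNAPSHOTS:
    send_receive(snap, parent)  # first snapshot is sent in full
    parent = snap               # later ones use their predecessor as parent
```

A real file system rarely has such a neat linear chain of snapshots, which is where choosing a good clone source becomes the hard part.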
And this is just the tip of the iceberg; finding the "best" clone sources (in terms of space savings) in a btrfs file system with a complex history is a non-trivial problem. I've come up with three other strategies to solve this problem, which seem to use space much more efficiently. One has actually resulted in a clone slightly smaller than the source.
You can read the details on the GitHub page if you're interested.
ogen means? – drumfire Jun 03 '16 at 16:09

ogen is the subvolume's "origin generation". I have to admit that I do not fully understand the differences or whether using the (non-origin) generation would be correct, but I assume some test indicated that this worked better (avoided duplication). The generation does seem to get updated when creating snapshots based on a subvolume; ogen doesn't. I would be interested in hearing about some findings. It's probably best to check on IRC or the Btrfs mailing list. – Thomas Luzat Jun 04 '16 at 14:10