HDF5 is, to some extent, a filesystem of its own. With its B-trees and its block management, it duplicates functionality that a filesystem already provides. When you run your code, you are almost certainly running it on an operating system with a proven, scalable filesystem. Hence, I would suggest writing your raw numerical data into a single file using raw file access or MPI-IO, and writing the metadata (endianness, size, attributes, etc.) into a separate JSON or XML file. If you have multiple datasets, you can organize them into a directory or a hierarchy of directories. When you want to distribute a dataset, you just pack the tree into a ZIP file.
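For concreteness, here is a minimal sketch of what this could look like in Python with NumPy. The file names and the JSON schema are made up for illustration; they are not part of any standard:

```python
import json
import sys

import numpy as np

# Hypothetical dataset (names and schema are illustrative assumptions).
data = np.random.rand(1000, 3)

# Write the raw numbers as a flat binary file (C order, native byte order).
data.tofile("temperature.dat")

# Everything needed to read the file back goes into a JSON side-car.
meta = {
    "file": "temperature.dat",
    "dtype": str(data.dtype),       # e.g. "float64"
    "byteorder": sys.byteorder,     # "little" or "big"
    "shape": data.shape,
    "order": "C",
    "attributes": {"unit": "K"},    # arbitrary user attributes
}
with open("temperature.json", "w") as f:
    json.dump(meta, f, indent=2)

# Reading it back is just as direct:
with open("temperature.json") as f:
    meta = json.load(f)
restored = np.fromfile(meta["file"], dtype=meta["dtype"]).reshape(meta["shape"])
```

Any tool that can read JSON and a flat binary file can consume such a dataset, which is the whole point: no special library is required.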
The only downside is that you have to deal with endianness yourself, which is not hard.
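One simple way to handle it is to fix the byte order at write time rather than recording the native order; a sketch, again with NumPy and the hypothetical file name from above:

```python
import numpy as np

data = np.random.rand(1000, 3)

# Fix the byte order at write time: "<f8" is little-endian float64,
# independent of the machine the data was produced on.
data.astype("<f8").tofile("temperature.dat")

# On any machine, read with the same explicit dtype; NumPy swaps bytes
# transparently if the host happens to be big-endian.
restored = np.fromfile("temperature.dat", dtype="<f8").reshape(1000, 3)
```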
For inspiration on how this can be done, see Dragly et al., "Experimental Directory Structure (Exdir): An Alternative to HDF5 Without Introducing a New File Format," Front. Neuroinform., 2018, 12.