Question
How can I tell 7-Zip -- specifically the command line version 7z -- to disable only the ARM64 filter while leaving all other filters enabled?
My goal is to disable the ARM64 filter in newer 7-Zip versions so that the resulting archives stay compatible with older 7-Zip versions.
Background
7-Zip 23.00 introduced the ARM64 filter for binaries targeting the eponymous architecture. Excerpt from the history.txt:
23.00 2023-05-07
- 7-Zip now can use new ARM64 filter for compression to 7z and xz archives. ARM64 filter can increase compression ratio for data containing executable files compiled for ARM64 (AArch64) architecture. Also 7-Zip now parses executable files (that have exe and dll filename extensions) before compressing, and it selects appropriate filter for each parsed file:
- BCJ or BCJ2 filter for x86 executable files,
- ARM64 filter for ARM64 executable files.
Previous versions by default used x86 filter BCJ or BCJ2 for all exe/dll files.
This effectively introduces a backward incompatibility. Whenever an older version of 7-Zip is asked to process an archive that was created with version 23.00 or newer and contains an ARM64 binary, the reported method will look something like:
Method = 0A LZMA2:26 LZMA:20 BCJ2 ARMT
... whereas with the new versions we can see that 0A translates to ARM64:
Method = ARM64 LZMA2:26 LZMA:20 BCJ2 ARMT
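To check a batch of archives for this, the two Method lines above can be tested mechanically. The following is only a minimal sketch: the helper name uses_arm64_filter is made up for illustration, and it assumes (based solely on the two outputs quoted above) that the filter shows up either as the token ARM64 or as the raw ID 0A. How the Method line is obtained is left to the caller.

```python
def uses_arm64_filter(method_line: str) -> bool:
    """Return True if a 'Method = ...' line names the ARM64 filter.

    Assumption taken from the outputs quoted above: 7-Zip 23.00+ prints
    'ARM64', while older versions print the raw ID '0A' for the same filter.
    """
    # Keep only the part after the first '=' and split it into method tokens.
    _, _, value = method_line.partition("=")
    tokens = value.split()
    return any(token in ("ARM64", "0A") for token in tokens)


if __name__ == "__main__":
    for line in (
        "Method = 0A LZMA2:26 LZMA:20 BCJ2 ARMT",
        "Method = ARM64 LZMA2:26 LZMA:20 BCJ2 ARMT",
        "Method = LZMA2:26 LZMA:20 BCJ2",
    ):
        print(line, "->", uses_arm64_filter(line))
```

An exact token match is used rather than a substring test so that the existing ARMT token is not mistaken for the ARM64 filter.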
What I have tried & "benchmarks"
With -mf=BCJ2 I can tell 7z to use exclusively the specified filter, and with -mf=off I can turn all filters off, but neither is what I want. Turning off all filters means a worse compression ratio. Forcing one particular filter for every file means slower compression.
Comparison (file system cache was prepopulated for all of the tests):
- switches as previously, -t7z -mx=9 -myx=7 -mmt=on -mmtf=on -ms=on -mqs=on -mtm=off -mtc=off -mta=off -mtr=off -stl: yields Method = ARM64 LZMA2:26 LZMA:20 BCJ2 ARMT, in 36.620 s and a size of 115454049 bytes (111 MiB)
- with additional -mf=off: yields Method = LZMA2:26, in 37.692 s and a size of 116861788 bytes (112 MiB)
- with additional -mf=BCJ2: yields Method = LZMA2:26 LZMA:20 BCJ2, in 36.031 s and a size of 115483281 bytes (111 MiB)
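To put the three measurements into proportion, here is a small sketch that expresses the two -mf runs relative to the run without -mf. The numbers are copied from the list above; the helper relative_change and the dictionary layout are just illustrative names, not anything 7-Zip provides.

```python
# Measurements copied from the comparison above (run without any -mf switch).
baseline = {"seconds": 36.620, "bytes": 115454049}

# Runs with one additional switch each.
runs = {
    "-mf=off": {"seconds": 37.692, "bytes": 116861788},
    "-mf=BCJ2": {"seconds": 36.031, "bytes": 115483281},
}


def relative_change(value: float, reference: float) -> float:
    """Return the change of value relative to reference, in percent."""
    return (value - reference) / reference * 100.0


for switch, run in runs.items():
    extra_bytes = run["bytes"] - baseline["bytes"]
    size_pct = relative_change(run["bytes"], baseline["bytes"])
    time_pct = relative_change(run["seconds"], baseline["seconds"])
    print(f"{switch}: {extra_bytes:+d} bytes ({size_pct:+.3f} %), time {time_pct:+.2f} %")
```

In absolute terms, -mf=off costs 1407739 additional bytes over the baseline, and -mf=BCJ2 costs 29232 additional bytes.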
Obviously, for my example -- which is a smaller subset of what I am actually working with -- the size and performance differences between using -mf=BCJ2 and omitting it are small. But at the actual size of the data I am working with, these differences add up.