42

In 7-Zip, when adding a folder to an archive, there is an option to change the Word Size.

How does this word size affect compression, in particular the final size of the zip?

I noticed that raising the compression level also raises the word size. However, even on Ultra it only selects a word size of 128, even though the largest option is more than double that. Is there a reason why Ultra doesn't select the largest? Is the smallest archive produced by a word size somewhere between the biggest and smallest options?

Aequitas
  • 673
  • Check out what Shell says on this post about part of your questions here --> The Post. – Vomit IT - Chunky Mess Style Jan 06 '16 at 02:21
  • @LMFAO_A_JOKE that just says for some files higher is better sometimes not – Aequitas Jan 06 '16 at 02:49
  • 3
    This doesn't ANSWER all your questions in great detail, but for the ONE question of --> "How does this word size affect compression, in particular the final size of the zip?" I think the part of the post stating "WordSize: usually the bigger, the better (and slower) for well-compressible data (such as documents). Archive size depends quite non-monotonically of it." gives you an explanation for PART of your set of questions. This is why I only put this here as a comment and did NOT answer -- just trying to give you something!!! – Vomit IT - Chunky Mess Style Jan 06 '16 at 02:52
  • What does the last sentence mean, Archive size... non monotonically of it – Aequitas Jan 06 '16 at 03:23
  • 1
    I think this means that the archive size will be smaller (decreasing in size from the original size more) "typically" with the bigger the WordSize value, but it "depends" on the compressibility of the data types that are being compressed such as text as opposed to image files perhaps as one example. The suggestion was to test the different values to get the most optimal value for your data though to know you pick the best options to suit your need. – Vomit IT - Chunky Mess Style Jan 06 '16 at 03:58
  • Do you think I should add my comments as an answer for you or did you determine something more definitive or conclusive? I'll check back at some point and make a decision on my own if I don't hear back. I like to get all questions answers if possible and no one else has taken a stab so I will likely add something at some point but wondering your thoughts on that since it is your question. – Vomit IT - Chunky Mess Style Apr 28 '17 at 00:50
  • This was my experience with a 41,319,424 byte set of files compressing to 39,034,628 bytes on Ultra compression with the standard Word size of 128. Changing it to a Word size of 256 resulted in a .ZIP of 39,033,758 bytes, saving 870 bytes. YMMV. – rob Mar 13 '20 at 14:15

1 Answer

12

It really depends on the data you're compressing and the algorithm used.

Word size

Enter the length of words, which will be used to find identical sequences of bytes for compression. For LZMA, big word size usually gives a little bit better compression ratio and slower compression process. Big word size parameter can significantly increase compression ratio in case when files contain long identical sequences of bytes. For PPMd word size has a big meaning. It strongly affects both compression ratio and compression/decompression speed.

There are some comparisons here
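
If you want to run a comparison on your own data, here is a minimal sketch in Python. It does not call 7-Zip; it uses the standard `lzma` module and assumes that the filter option `nice_len` plays the same role as 7-Zip's Word size setting for LZMA. The function name `size_by_word_size`, the list of sizes, the preset (6, chosen to keep memory use modest) and the sample data are all just illustrative choices, so treat the numbers as a rough guide rather than what 7-Zip itself would produce.

```python
import lzma


def size_by_word_size(data: bytes, word_sizes=(8, 32, 64, 128, 273)) -> dict:
    """Return {word_size: compressed_length_in_bytes} for LZMA2 at preset 6."""
    results = {}
    for ws in word_sizes:
        # Assumption: "nice_len" corresponds to 7-Zip's Word size for LZMA.
        # Options not given explicitly take their defaults from the preset.
        filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "nice_len": ws}]
        blob = lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)
        results[ws] = len(blob)
    return results


if __name__ == "__main__":
    # Hypothetical sample data; substitute the bytes of your own files.
    sample = b"".join(b"record %d: the quick brown fox\n" % i for i in range(5000))
    for ws, size in size_by_word_size(sample).items():
        print(f"word size {ws:>3}: {size} bytes")
```

Comparing the printed sizes side by side is the easiest way to see whether a larger word size actually helps for your kind of data, or whether the smallest result sits somewhere in the middle.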

Hefewe1zen
  • 1,824