
Until not long ago, I didn't even know that you could compress specific folders, files, or even entire drives using Windows' built-in compression. A simple way to do this is to go to a file's Properties, check "Compress contents to save disk space", and you're all set.

When I first heard of it, I thought it was just like WinZip, compressing files to reduce their size or to combine all files into one zipped file. But it seems to have a different use case than that.

And what's most interesting is that the file is compressed, but the file's hash output remains the same (a short experiment using a third-party hash calculator). How can this be? If the input changes, the hash output must change (except in the case of a collision, which is extremely rare and off topic here). For example, let's say I compress a file named MYDOCUMENT.pdf. Can I just keep it that way, put it on USB drives or a newly installed PC, and use it as if it were a normal file, without ever manually decompressing it?

When I checked the file size in Properties, the size didn't change by even a byte; only the "size on disk" decreased. So it seems that the file's data remains intact as is (the unchanged hash value probably proves this), and the OS just compresses and decompresses it behind the scenes when reading it.
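
For reference, the two numbers the Properties dialog shows can also be read programmatically. Here is a minimal Win32 C sketch (the path is just the placeholder from my example) that prints the logical "Size" via GetFileAttributesExW and the allocated "Size on disk" via GetCompressedFileSizeW:

    // Sketch only: compare a file's logical size ("Size") with its allocated,
    // possibly compressed size ("Size on disk"). The path is a placeholder.
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        const wchar_t *path = L"C:\\MYDOCUMENT.pdf";   /* placeholder path */

        WIN32_FILE_ATTRIBUTE_DATA info;
        if (!GetFileAttributesExW(path, GetFileExInfoStandard, &info))
            return 1;
        unsigned long long logical =
            ((unsigned long long)info.nFileSizeHigh << 32) | info.nFileSizeLow;

        DWORD high = 0;
        DWORD low = GetCompressedFileSizeW(path, &high);   /* size on disk */
        if (low == INVALID_FILE_SIZE && GetLastError() != NO_ERROR)
            return 1;
        unsigned long long ondisk = ((unsigned long long)high << 32) | low;

        printf("Size:         %llu bytes\n", logical);
        printf("Size on disk: %llu bytes\n", ondisk);
        printf("Compressed attribute set: %s\n",
               (info.dwFileAttributes & FILE_ATTRIBUTE_COMPRESSED) ? "yes" : "no");
        return 0;
    }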

Another question: there's another compression method using the command prompt, by typing "compact.exe /compactos:always". What's the difference between the two?

Windows is giving me a headache these days :/

dddaasdf
  • The files are decompressed when you read them. – DavidPostill May 08 '22 at 10:21
  • The keyword here is transparent compression. Much like transparent encryption (Bitlocker, VeraCrypt, LUKS, ...). – Daniel B May 08 '22 at 10:37
  • Files are "stored" on the disk as compressed files, but are presented to you and other programs as normal files. Same hash because it's decompressed before being read by the program. – Saaransh Garg May 08 '22 at 10:38
  • @DavidPostill ♦ // Daniel B // Saaransh Garg thank you for the replies. So the file really is compressed, its hash changes at the moment of compression and reverts to the original when it's read, and if I want to undo this compression I need to manually uncheck the compress option to get those files back to their original state? And as long as I use Windows 10 or later, it's read normally whether compressed or not, just negligibly slower? – dddaasdf May 08 '22 at 11:05
  • @dddaasdf Correct – DavidPostill May 08 '22 at 11:15
  • @dddaasdf the compression and decompression is handled by the filesystem driver at the stage before data is sent to the disk or received from the disk. The compression algorithms were chosen to be fast enough that any CPU should be able to compress and decompress data going to an HDD faster than the disk can accept. As such it is transparent and while there will be some CPU load it should not significantly affect read or write speed. I wouldn't expect it to keep up with a modern 5GB/s SSD, but it will easily outperform an old 100MB/s HDD on any modern system. – Mokubai May 08 '22 at 11:55
  • @Mokubai I see. So is there any negative impact on storage life? Will it add read/write operations to the drive every time data is compressed/decompressed, and thereby wear the SSD's P/E cycles? I'm afraid it might, because after decompressing a file to read it, it would have to rewrite the file in compressed form for future use. – dddaasdf May 08 '22 at 12:04
  • It will only use SSD P/E cycles when you change the "compressed" setting, as that means reading the data, compressing (or decompressing) it, and then writing it back. Once freshly written, the compression and decompression happen "on the fly" in memory, so they shouldn't cause significant "extra damage". If your data is highly compressible then it could actually extend the life, as you may be using far fewer blocks (so less is written to disk), leaving more free for the wear levelling to do its job. What benefits or downsides you see will depend on your data and use case though. – Mokubai May 08 '22 at 12:13
  • @Mokubai thank you a ton! Since I don't have so much data that I need to rely on compression, I won't bother with that option for now, but at least I know what to do when I need it. I usually prefer not to make frequent changes to the system, especially when a change involves messing around with files, since one hiccup in the process could end up corrupting a file. – dddaasdf May 08 '22 at 12:32
  • @dddaasdf I doubt it matters but since you mentioned "Windows 10 or later", the NTFS compression feature is very old. I'm pretty sure it goes back to Windows 2000, but if not, I'm almost certain it predates Windows 7 / goes back to Server 2003. – pbristow May 09 '22 at 02:08
  • Windows, and MS-DOS, have supported transparent FS compression for about thirty years. – Sneftel May 09 '22 at 17:17

2 Answers


What exactly does NTFS compression do to files?

By default, it transparently compresses them using a variant of Lempel-Ziv compression:

The LZNT1 compression algorithm is the only compression algorithm implemented. As a result, the LZNT1 compression algorithm is used as the DEFAULT compression method.

Source: FSCTL_SET_COMPRESSION control code

When you read the file (for example to calculate the file hash) it is transparently decompressed on the fly.
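
As an illustration (not the exact code Explorer runs, and using a placeholder path with minimal error handling), here is a small C sketch that issues that same control code and then reads the file back, just as any other application would:

    // Sketch: mark a file compressed via FSCTL_SET_COMPRESSION, then read it
    // back normally. LZNT1 (COMPRESSION_FORMAT_DEFAULT) is the default format.
    #include <windows.h>
    #include <winioctl.h>
    #include <stdio.h>

    int main(void)
    {
        HANDLE h = CreateFileW(L"C:\\MYDOCUMENT.pdf",        /* placeholder */
                               GENERIC_READ | GENERIC_WRITE,
                               FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
        if (h == INVALID_HANDLE_VALUE)
            return 1;

        /* Ask NTFS to compress the file's data. */
        USHORT format = COMPRESSION_FORMAT_DEFAULT;
        DWORD returned = 0;
        if (!DeviceIoControl(h, FSCTL_SET_COMPRESSION, &format, sizeof(format),
                             NULL, 0, &returned, NULL))
            fprintf(stderr, "FSCTL_SET_COMPRESSION failed: %lu\n", GetLastError());

        /* Any read afterwards returns the original bytes, decompressed on the
           fly, which is why a hash of the file's contents does not change. */
        char buf[4096];
        DWORD bytesRead = 0;
        ReadFile(h, buf, sizeof(buf), &bytesRead, NULL);
        printf("Read %lu plain bytes from the start of the file\n", bytesRead);

        CloseHandle(h);
        return 0;
    }

Only the allocation on disk changes; the byte stream an application (or a hashing tool) sees is identical before and after.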


There's another compression method using the command prompt, compact.exe /compactos:always. What's the difference between the two?

compact displays or alters the compression of files on NTFS partitions.

always Will compress all OS binaries and set the system state to Compact.

It supports different algorithms for exe files only:

/EXE Use compression optimised for executable files which are read frequently and not modified.

Supported algorithms are:

XPRESS4K   (fastest) (default)
XPRESS8K
XPRESS16K
LZX        (most compact)

/CompactOs Set or query the system's compression state.

Supported options are:

query - Query the system's compact state.
always - Compress all OS binaries and set the system state to Compact.
never - Uncompress all OS binaries and set the system state to non-Compact, which remains unless an administrator changes it.

Source: Compact - Compress files - Windows CMD - SS64.com
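
As an illustrative example (not part of the SS64 page quoted above), compressing a single rarely-modified executable with the densest algorithm would look like compact /c /exe:lzx app.exe, and running compact with no arguments lists the compression state of the files in the current directory.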


Further Reading

DavidPostill
  • thank you a lot. So, assuming a worst-case scenario: I copy files from a compressed drive without knowing they are compressed and move them to other storage which doesn't support decompression (an old version of Windows or some little-known Linux distribution); then the files won't open, I can't use a normal extractor to decompress them, and I feel like an idiot haha. – dddaasdf May 08 '22 at 11:28
  • No - the file will be decompressed when you read it and will stay that way if you move it to another drive. – DavidPostill May 08 '22 at 11:31
  • you're right. I just finished testing and confirmed the file stays compressed if I move it to another folder on the same drive, but returns to its original state when moved to a different drive. I might have been too paranoid! lol Answer selected. – dddaasdf May 08 '22 at 11:39
  • ahh, one more question please. Does the compactos command's compression (maybe an LZW algorithm?) also work the same way NTFS compression does, in that it's decompressed when read, and moving the file to another drive makes the compression void? – dddaasdf May 08 '22 at 11:44
  • Yes, indeed that is correct. – DavidPostill May 08 '22 at 12:31
  • Is the compression random-access? Or sequential only? – TLW May 08 '22 at 23:29
  • @TLW the compression is semi-random access. Always 64Kbytes / 16 Clusters (Cluster sizes different than 4K are not supported for compression) are compressed. If they take less space (at least one cluster less), then these are stored in compressed form, otherwise the data for these 64KB is stored as is – at least in the original NTFS compression. – Ro-ee May 09 '22 at 00:38
  • @DavidPostill: So if a user boots from an Ubuntu live disk and copies the files from the Windows NTFS partition, onto an ext4 filesystem on another pen drive, would the files still stay compressed? – Nav May 09 '22 at 07:13
  • @Nav. No, "Linux supports transparent compression of NTFS drives." See Does Ubuntu support working with compressed Windows files? – DavidPostill May 09 '22 at 07:38
  • @Nav: If they would stay compressed, they (and, by extension, the NTFS driver employed) would be effectively unusable, as you could not view any documents from said drive. – DevSolar May 09 '22 at 12:58
  • It should be noted that, in order to support the "semi-random access I/O", NTFS transparent file compression has worse compression ratios than non-transparent compression tools like WinZip or 7-Zip. – user46971 May 09 '22 at 16:24

@DavidPostill's answer is correct and complete, but let me try to explain too:

NTFS compression is transparent to any application that accesses a compressed file. This means that for every:

  • Read of the file, the data is decompressed before being presented to the application for display / hash calculation / copying somewhere / etc.
  • Write to the file, the data is compressed by the filesystem after the application writes it (a small sketch of this is below).
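
To make the write side concrete, here is a small sketch (the path is a placeholder, not something from the answer above): the program marks a new file compressed, writes plain, highly compressible bytes, and the allocated size on disk ends up well below what was written:

    // Sketch of the write path: the application writes plain bytes and NTFS
    // compresses them on the way to disk. The path is a placeholder.
    #include <windows.h>
    #include <winioctl.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const wchar_t *path = L"C:\\Temp\\compress-demo.bin";  /* placeholder */

        HANDLE h = CreateFileW(path, GENERIC_READ | GENERIC_WRITE, 0, NULL,
                               CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE)
            return 1;

        /* Flag the (still empty) file as compressed, as Explorer's checkbox
           or "compact /c" would. */
        USHORT fmt = COMPRESSION_FORMAT_DEFAULT;
        DWORD ret = 0;
        DeviceIoControl(h, FSCTL_SET_COMPRESSION, &fmt, sizeof(fmt),
                        NULL, 0, &ret, NULL);

        /* Write 1 MiB of very compressible data; the application only ever
           deals with these plain bytes. */
        char chunk[4096];
        memset(chunk, 'A', sizeof(chunk));
        DWORD written = 0;
        for (int i = 0; i < 256; i++)
            WriteFile(h, chunk, sizeof(chunk), &written, NULL);
        FlushFileBuffers(h);
        CloseHandle(h);

        /* Logical size is 1 MiB, but far fewer bytes are allocated on disk. */
        DWORD high = 0;
        DWORD low = GetCompressedFileSizeW(path, &high);
        printf("Bytes written: %d\n", 256 * 4096);
        printf("Size on disk:  %lu bytes\n", low);
        return 0;
    }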

Performance: In theory, reading/writing the file becomes somewhat slower, due to the compression/decompression that's happening. However, this is usually negligible nowadays, and may also be offset by the drive needing to store less info.

Management:

  • Windows Explorer shows how much space on disk you're using vs. how much the file would use if not compressed.
  • compact.exe also shows you that, and lets you enable/disable compression of files. compact.exe /CompactOs:always will set Windows to compress all OS binaries.
Jonathan
  • "In theory, reading/writing the file becomes somewhat slower" - writing is almost certainly slower, but reading can be quicker when the data is highly compressible, the drive speed is slow and the CPU speed is fast. – rjmunro May 10 '22 at 11:57
  • @rjmunro This is incorrect. I've seen games that have save files whose size is dropped by an order of magnitude by compression (about an 8x reduction). The save (and load) operation was at least twice as fast when the save folder was flagged as compressed. Presumably this is because my SATA SSD was "slow" compared to the CPU cost of compressing the file. Your statement is likely true for an efficient file format, but probably not for large (10MB+) XML, text, and JSON files. – Patrick M May 10 '22 at 14:54
  • @PatrickM Yeah, "almost certainly" was a bit too strong. Perhaps I should have said "usually". – rjmunro May 11 '22 at 10:43
  • Try copying a big file to a compressed directory and you will see that it's at least 2x slower compared with a regular folder... so it's not a zero-cost feature! – Yura Oct 24 '22 at 13:06