
I am trying to define a robust file naming scheme to be used across the systems I work with. I work in university labs and deal with all three major operating systems: Linux, macOS, and Windows. Therefore I'd like to stick to something that can be easily used on all of them.

Currently my file/folder naming rule is very simple:

  1. only use lower case letters from the English alphabet, and numbers
  2. no spaces, use underscores (_) as separator
  3. anything else is forbidden

Although this is simple and works across systems, it is too restrictive.

For example, with many modern file/folder search tools, having spaces within file names may be convenient. With most of these tools, if I search for "XXX YYY", the tool will look for all files/folders having both of these sequences in the name. This is not the case when the names contain no spaces.

Another example is music albums. Most software uses naming schemes like "Artist - AlbumTitle", "Artist - Year - AlbumTitle", etc.

My questions are:

  1. how dangerous are spaces in file/folder names? I was educated that to ensure good and easy maintainable system, it's better to avoid spaces. Is this really the case?

  2. I have been always told that another good rule is to avoid dashes (-). But I realized that in the linux world for example, dashes are used all over the system. And for instance the ISO standard date is something like "2019-05-17".

What do you think about these issues? I'd really love to learn what your favorite file/folder naming scheme is.

Best, Pietro

  • "how dangerous are spaces in file/folder names?" - these are actually three different questions, as the answer might differ for each OS. It might also depend on which other services use your files (backup, legacy apps ...), as those might be more restrictive. "I'd really love to learn what is your favorite file/folder naming scheme" - this part is explicitly off-topic here, making your question too subjective. – Máté Juhász May 17 '19 at 07:50
  • Try to avoid anything that expands to something dangerous when using wildcards. For example, when you create a file named -rf on Linux, the command rm * will expand to rm -rf fileA fileB folderC, whereas folderC would not be deleted if there were no -rf file. And the file -rf itself is not deleted. Many Windows programs use / for parameters, which is reserved anyway, but some use dashes as well or in addition. Spaces cause problems because they need to be escaped, and sometimes you need two or more layers to escape spaces and escape characters, causing headaches for developers. – allo May 17 '19 at 08:42
  • Personally I dislike underscores, because I need to press Shift to type those, and a normal hyphen - is just faster. But see the answer by @grawity for why they shouldn't be used at the beginning of a filename. And please, never ever use & ;-) While this is allowed in Windows, it makes writing any sort of batch file a disaster. And yes, we have that in our corporate network... – Berend May 17 '19 at 08:56

1 Answer


only use lower case letters from the English alphabet, and numbers

Capital letters are free. Windows is not case-sensitive, but it is case-preserving. (Of course, avoid having multiple filenames which only differ in case as Windows will only allow accessing one of them.)
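As a rough illustration of that caveat, the following Python sketch groups names that differ only in case. The function name case_collisions is made up for this example, and str.casefold() is only an approximation of how Windows compares names, so treat the result as a hint rather than a guarantee.

```python
from collections import defaultdict

def case_collisions(names):
    """Group names that differ only in letter case (hypothetical helper)."""
    groups = defaultdict(list)
    for name in names:
        # casefold() approximates case-insensitive comparison;
        # it is not identical to the rules a real filesystem applies.
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]

print(case_collisions(["Report.txt", "report.txt", "notes.md"]))
# prints [['Report.txt', 'report.txt']]
```

Running something like this over a directory listing before copying it to Windows would show which files are at risk of shadowing each other.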

All listed operating systems work well with Unicode, so there should be no problems with naming your files in e.g. Russian or Korean. (Some programs, e.g. Dropbox, only support BMP up to U+FFFF, i.e. no emojis or other modern non-BMP additions – but that alone covers most of the world's languages already. You can check specific characters at https://codepoints.net or similar websites if you're not sure.)
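If you want to check a name yourself, a minimal Python sketch could look like this. The helper non_bmp_chars is a name invented for this example; it simply reports characters above U+FFFF, which is the limit mentioned above.

```python
def non_bmp_chars(name):
    """Return the characters of name that lie outside the BMP (above U+FFFF)."""
    return [ch for ch in name if ord(ch) > 0xFFFF]

# Cyrillic and Hangul are inside the BMP, so nothing is reported.
print(non_bmp_chars("отчёт_보고서.txt"))        # prints []

# U+1F389 (an emoji) is outside the BMP.
print(len(non_bmp_chars("party_\U0001F389.txt")))  # prints 1
```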

how dangerous are spaces in file/folder names? I was educated that to ensure good and easy maintainable system, it's better to avoid spaces. Is this really the case?

For regular documents/music/pictures, spaces in the middle of a file name are practically never a problem. They're a slight inconvenience in command-line due to having to be quoted, but that's usually it.
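To show what "having to be quoted" means in practice, here is a small Python sketch using shlex.quote from the standard library. It produces quoting for POSIX shells only (not for Windows cmd), and the file names are made-up examples.

```python
import shlex

# A name with spaces has to be wrapped in quotes for a POSIX shell.
print(shlex.quote("My Thesis Draft.pdf"))  # prints 'My Thesis Draft.pdf'

# A name without special characters is returned unchanged.
print(shlex.quote("thesis.pdf"))           # prints thesis.pdf
```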

(Trailing spaces are not allowed in Windows, and leading spaces might cause issues.)

On the other hand, for development or system files, I'd strongly recommend avoiding spaces.

There are some tools which don't like them – if you ever tried to compile a program which uses "autotools" or some other build systems, or just a poorly-written Makefile, it will be difficult to do so when there's a space anywhere in the path.

Some programs (system daemons, most commonly) also use configuration files which expect a list of space-separated paths. Some of them allow spaces within a name to be quoted or escaped, others do not.
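The difference between the two kinds of parsers can be sketched in Python. The configuration line below is invented for illustration; str.split() stands in for a parser that only knows about whitespace, and shlex.split() for one that understands shell-style quoting.

```python
import shlex

# Hypothetical config value: a space-separated list of paths, one of them quoted.
line = '/srv/data "/srv/my files" /srv/backup'

naive = line.split()        # splits on every space, ignoring the quotes
aware = shlex.split(line)   # honours the quotes around the second path

print(naive)  # the quoted path is broken into two pieces
print(aware)  # prints ['/srv/data', '/srv/my files', '/srv/backup']
```

A program of the first kind sees four entries, none of which is the directory you meant, and there is nothing you can put in the file to fix that.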

I have been always told that another good rule is to avoid dashes (-). But I realized that in the linux world for example, dashes are used all over the system. And for instance the ISO standard date is something like "2019-05-17".

Dashes in the middle are never a problem. Leading dashes are often a problem, as command-line programs may confuse them with --options and you have to use additional syntax to use such file names.

However, it's still usually an easily avoidable problem. For example, rm -file- won't work, but rm ./-file- will. Graphical programs do not have this issue at all (unless they run CLI tools behind the scenes).
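When a script has to pass arbitrary file names to a CLI tool, the same workaround can be applied programmatically. The Python sketch below uses two hypothetical helpers: safe_operand adds the ./ prefix (which only makes sense for names relative to the current directory), and rm_command builds an argument list with --, which many tools accept as an "end of options" marker. Nothing is executed here; the list is only constructed.

```python
def safe_operand(name):
    """Prefix a relative name starting with '-' with './' so it is not parsed as an option."""
    return "./" + name if name.startswith("-") else name

def rm_command(name):
    """Build (but do not run) an rm argument list that ends option parsing with '--'."""
    return ["rm", "--", name]

print(safe_operand("-file-"))  # prints ./-file-
print(safe_operand("file-"))   # prints file-
print(rm_command("-file-"))    # prints ['rm', '--', '-file-']
```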


Other possible problems (more likely to occur with folders than files):

  • The usual Windows forbidden character set: : < > ? * " \ / |
  • Names beginning with a period (.) are considered "hidden files" on Linux.
  • Names ending with a period (.) or a space ( ) are not allowed in Windows.
  • Names ending with a tilde (~) may be ignored by some programs as "backup files".
  • Multi-line filenames, while technically allowed in Linux, will cause all sorts of trouble.
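The list above can be turned into a quick checker. This is only a sketch in Python: portability_issues is a made-up name, it looks at a single file or folder name (not a full path), and it covers just the points listed here, not every restriction each OS has.

```python
# The Windows forbidden characters from the list above.
WINDOWS_FORBIDDEN = set(':<>?*"\\/|')

def portability_issues(name):
    """Return a list of human-readable warnings for one file or folder name."""
    issues = []
    bad = sorted(ch for ch in set(name) if ch in WINDOWS_FORBIDDEN)
    if bad:
        issues.append("contains characters forbidden on Windows: " + " ".join(bad))
    if name.startswith("."):
        issues.append("leading period: treated as hidden on Linux")
    if name.endswith((".", " ")):
        issues.append("trailing period or space: not allowed on Windows")
    if name.endswith("~"):
        issues.append("trailing tilde: may be ignored as a backup file")
    if "\n" in name or "\r" in name:
        issues.append("contains a line break")
    return issues

print(portability_issues("2019-05-17 Artist - Album"))  # prints []
print(portability_issues("notes?.txt"))
```

Note that spaces and dashes in the middle of a name pass without warnings, in line with the points made earlier.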

The rest is subjective and is out of scope for this site.

u1686_grawity
  • rm from GNU Coreutils, like every program that uses the getopt function to parse its arguments, has the command line option -- that ensures that all following arguments are not interpreted as options. – Tobias May 20 '20 at 14:32