Say I am editing some file with vim (or gvim). I have no idea about the file's encoding and I want to know whether it is in UTF-8, ISO-8859-1, or something else. Can I somehow tell vim to show me what encoding is used?
3 Answers
The fileencoding setting shows the current buffer's encoding:
:set fileencoding
fileencoding=utf8
There really isn't a general way to determine the encoding of a plain-text file, as that information isn't saved in the file itself. The exception is Unicode files that start with a so-called BOM (byte order mark), which indicates the encoding; in UTF-8 files the BOM is optional. This is why XML and HTML files carry charset declarations.
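To make the BOM point concrete, here is a minimal Python sketch that checks the first bytes of a file for a byte order mark. The function name `sniff_bom` is made up for illustration, and the returned names are simply the Python codecs that consume the matching BOM; a file without a BOM yields `None`, which is the common case described above.

```python
import codecs

# Known byte order marks. UTF-32 comes first because the UTF-32 LE BOM
# starts with the same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data: bytes):
    """Return a codec name implied by a leading BOM, or None if there is none.

    `sniff_bom` is a hypothetical helper, not part of Vim or any library.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

It only needs the first four bytes, so something like `sniff_bom(open(path, "rb").read(4))` is enough.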
You can enforce a particular encoding with the 'encoding' setting. See :help encoding and :help fileencoding in Vim for how the editor handles these settings. You can also list several candidate encodings in the 'fileencodings' option in your vimrc to have Vim try detecting based on the ones listed.
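The idea of trying a list of candidate encodings in order can be sketched in Python as below. This is only a rough analogue of the approach, not Vim's actual implementation; the function name `first_matching_encoding` and the default candidate list are assumptions made for the example.

```python
def first_matching_encoding(data: bytes, candidates=("utf-8", "latin-1")):
    """Return the first candidate that decodes `data` without error, else None.

    Hypothetical helper for illustration; candidates are Python codec names.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            # This candidate cannot represent the bytes; try the next one.
            continue
        return name
    return None
```

Note that Python's latin-1 codec accepts every possible byte, so it only makes sense as the last entry: anything listed after it would never be reached.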
Note that a file's encoding is not explicitly stated anywhere in the file. Thus, VIM and other applications must guess at the encoding. A common way of doing this is with the chardet application, which can be run from within VIM like so:
:!chardet %
The answer provided by jtimberman shows you the encoding of the current buffer, which may not be the same as the encoding of the file on disk. Thus, you will notice that chardet will sometimes show a different encoding than VIM, especially if you have VIM configured to always use a specific encoding (e.g. UTF-8).
The nice thing about chardet is that it gives a confidence score for its guess, whereas VIM can be (and often is) wrong about guessing the encoding if there are not many characters above \x7F (ASCII 127). For instance, adding a single א to a long file of PHP code makes chardet think that the file is ISO-8859-2 with a confidence of 0.72, whereas adding the slightly longer phrase שלום, עולם! gives UTF-8 with a confidence score of 0.99. In both cases, :set fileencoding? showed UTF-8 not because the file on disk was UTF-8, but because VIM is configured to use UTF-8 internally.
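Why a single non-ASCII character is so little to go on can be seen in a small Python sketch. It does not call chardet at all; it only shows that the same bytes decode without error under two encodings, so a detector has nothing but statistics to choose between them. ISO-8859-1 (latin-1) is used here as the single-byte encoding, which is an assumption for the example rather than the ISO-8859-2 guess reported above.

```python
# One Hebrew letter (alef, U+05D0) inside otherwise plain ASCII text.
text = "<?php // \u05d0 ?>"
data = text.encode("utf-8")

# Both decodes succeed, so neither encoding can be ruled out by validity alone.
as_utf8 = data.decode("utf-8")
as_latin1 = data.decode("latin-1")

# Only these bytes (everything above 0x7F) tell the two readings apart.
high_bytes = [b for b in data if b > 0x7F]
```

With a longer non-ASCII phrase there are more such bytes, and they are less likely to form plausible text in a single-byte encoding.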
I suggest that you mention a word about the availability of chardet across OSes. – Soundararajan Aug 31 '18 at 09:28
@Soundararajan: I'm probably not the guy to mention that as I use Debian and CentOS only. You are invited to edit the answer if you have relevant information, though. Thanks! – dotancohen Aug 31 '18 at 12:28
I don't see the need to do that inside VIM, better to do it from outside: chardet <file>. Still, good suggestion. – lepe Aug 03 '19 at 07:10
@dotancohen I believe Soundararajan's point is that the Windows command line does not ship with chardet, and this answer will not work out of the box there. (If it wasn't clear to readers, :! is a shortcut in vim to run a command on the command line, here chardet, which is not [directly] related to vim. This is also why lepe says you can skip the middlehuman and run it on the command line outside of vim.) – ruffin Oct 14 '21 at 21:01
I found this: https://vim.fandom.com/wiki/Reloading_a_file_using_a_different_encoding
You can reload a file using a different encoding if Vim was not able to detect the correct encoding:
:e ++enc=<encoding>
where encoding could be cp850, ISO-8859-1, UTF-8, ...
You can use file yourfilename to find the encoding, or chardetect (provided by python-chardet or uchardet, depending on your Linux distribution), as suggested by dotancohen.
This doesn't answer the question of how to find out the current encoding. Instead, this command will force some other encoding on the buffer. – Ruslan Aug 09 '19 at 09:55
set fileencoding? (with a trailing question mark)? – Seldom 'Where's Monica' Needy Jun 22 '16 at 08:28