How do I find the encoding of the current buffer in vim?

Question

Say I am editing some file with vim (or gvim). I have no idea what the file's encoding is, and I want to know whether it is UTF-8, ISO-8859-1, or something else. Can I somehow tell vim to show me which encoding is used?

Community · Accepted Answer · 2011-11-08 14:53:22Z

106

The fileencoding setting shows the current buffer's encoding:

:set fileencoding
fileencoding=utf-8

There really isn't a common way to determine the encoding of a plain-text file, as that information isn't saved in the file itself. The exception is a file that starts with a byte order mark (BOM), as some UTF-8 files do, where the BOM indicates the encoding. This is why XML and HTML files declare their charset explicitly, for example in a meta tag.

You can set the encoding Vim uses internally with the 'encoding' setting, and the encoding a buffer is written with using 'fileencoding'. See :help encoding and :help fileencoding in Vim for how the editor handles these settings. You can also list several encodings in the 'fileencodings' option in your vimrc to have Vim try each of them in turn when detecting a file's encoding.
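As a minimal sketch of that last suggestion: the option involved is 'fileencodings' (plural), and the particular list below is only an illustrative assumption, not a recommendation from this answer. Adjust it to the files you actually work with.

```vim
" Sketch for a vimrc: encodings Vim tries, in order, when reading a file.
" The list itself is an assumption; an 8-bit encoding such as latin1
" accepts any byte sequence, so it only makes sense as the last entry.
set fileencodings=ucs-bom,utf-8,latin1

" After opening a file, inspect what Vim settled on for this buffer:
"   :set fileencoding?
"   :set bomb?
```

Keep in mind that the result is still a guess: Vim picks the first entry in the list that reads the file without a conversion error.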

edited Nov 8 '11 at 14:53 by Community♦

answered Aug 24 '09 at 13:52 by jtimberman

Unfortunately, not correct. Vim cannot find the encoding of the file you're reading, because it is not written in the file. It can only guess based on the characters found in the file. For example, a file with the text "abcdef" can be in several encodings, since practically all of them support those characters, but a file with "šđčćž" will likely be in CP1250. So you're not reading the encoding from somewhere, but guessing what encoding it could be, and displaying the text based on that guess. – Rook Aug 24 '09 at 14:29

What you are doing here is explicitly setting the encoding, based on your observations of the file's contents. If you wish for vim to try several encodings when opening a file, put several of them in the 'fileencodings' option in your _vimrc. – Rook Aug 24 '09 at 14:32
@ldigas, thanks for the feedback, I've updated the answer to be a bit more clear on that (I hope!) – jtimberman Aug 24 '09 at 15:18
I only wish that the answer were this easy. It's not; see my answer below for the 'right' way and explanation. – dotancohen Dec 26 '13 at 7:00
Probably worth mentioning that BOMs are 1.) Not unique to UTF-8 -- though UTF-8's is distinct from other BOMs, 2.) Not required and often not found in UTF-8. – ruffin Oct 16 '14 at 15:09


dotancohen · Accepted Answer · 2013-12-26 06:59:51Z

Note that a file's encoding is not explicitly stated anywhere in the file. Thus, VIM and other applications must guess at the encoding. A common way of doing this is with the chardet application, which can be run from within VIM like so:

:!chardet %

The answer provided by jtimberman shows you the encoding of the current buffer, which may not be the same as the encoding of the file on disk. Thus, you will notice that chardet will sometimes show a different encoding than VIM, especially if you have VIM configured to always use a specific encoding (e.g. UTF-8).

The nice thing about chardet is that it gives a confidence score for its guess, whereas VIM can be (and often is) wrong about guessing the encoding if there are not many characters above \x7F (ASCII 127). For instance, adding a single א to a long file of PHP code makes chardet think that the file is ISO-8859-2 with a confidence of 0.72, whereas adding the slightly longer phrase שלום, עולם!‏ gives UTF-8 with a confidence score of 0.99. In both cases, set fileencoding? showed UTF-8 not because the file on disk was UTF-8, but because VIM is configured to use UTF-8 internally.
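To see the difference between VIM's internal encoding and the encoding it associated with the buffer, the relevant options can be queried side by side. This is only a sketch using standard VIM options; the values printed depend on your configuration, so none are shown here.

```vim
" Encoding VIM uses internally for all buffers
:set encoding?
" Encoding VIM associated with this buffer when it read the file
:set fileencoding?
" Candidate encodings VIM tries, in order, when reading a file
:set fileencodings?
" Whether this buffer carries a byte order mark
:set bomb?
```

Comparing these values with the output of :!chardet % gives a rough idea of whether VIM's guess and chardet's guess agree.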

I suggest that you mention a word about the availability of chardet across OS'es. — Soundararajan, Aug 31 '18 at 9:28
@Soundararajan: I'm probably not the guy to mention that as I use Debian and CentOS only. You are invited to edit the answer if you have relevant information, though. Thanks! — dotancohen, Aug 31 '18 at 12:28
I don't see the need to do that inside VIM, better to do it from outside: chardet <file>. Still, good suggestion. — lepe, Aug 3 '19 at 7:10

Pierre-Damien · Accepted Answer · 2019-06-20 09:05:03Z

I found this: https://vim.fandom.com/wiki/Reloading_a_file_using_a_different_encoding

You can reload a file using a different encoding if Vim was not able to detect the correct encoding:

:e ++enc=<encoding>

where <encoding> could be cp850, ISO-8859-1, UTF-8, ...

You can use file yourfilename to guess the encoding, or chardetect (provided by python-chardet) or uchardet, depending on your Linux distribution, as suggested by dotancohen.
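Put together, a session might look like the sketch below. It assumes the file really is Latin-1 (that choice is purely illustrative) and that the file and chardetect commands are installed on your system; % expands to the name of the current file.

```vim
" Ask external tools for a guess (assumes both are installed)
:!file %
:!chardetect %
" Reload the buffer, telling Vim to interpret the bytes as Latin-1
:e ++enc=latin1
" Check which encoding the buffer is now using
:set fileencoding?
```

If the text still looks wrong after reloading, the guess was probably wrong, and you can repeat :e ++enc= with another candidate.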

This doesn't answer the question of how to find out the current encoding. Instead this command will force some other encoding on the buffer. — Ruslan, Aug 9 '19 at 9:55
