Find the UTF-8 and UCS value of a character.
November 9, 2009
I write software that processes Chinese characters, so I have to deal with the UCS and UTF-8 encodings of those characters. It used to take me quite some time to find or calculate the UTF-8 and UCS values of a Chinese character, until I came across an article about VIM.
I’ve been using VIM for about 8 years, but I didn’t know that I could get the UCS and UTF-8 values of a character directly from VIM. To get the values:
- In normal (command) mode, place the cursor on the character, then type `ga' for the Unicode (UCS) value or `g8' for the UTF-8 bytes.
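To cross-check what VIM reports, the same two values can be computed outside the editor. The following is a minimal Python sketch, not part of VIM; the helper name `char_values` is made up for illustration, and it assumes Python 3.8 or later (for `bytes.hex` with a separator).

```python
def char_values(ch):
    """Return the code point (as U+XXXX) and the UTF-8 bytes (as hex) of one character."""
    # ord() gives the Unicode code point, which is the value `ga' reports in hex.
    code_point = f"U+{ord(ch):04X}"
    # Encoding to UTF-8 gives the byte sequence that `g8' reports.
    utf8_hex = ch.encode("utf-8").hex(" ")
    return code_point, utf8_hex


print(char_values("你"))  # ('U+4F60', 'e4 bd a0')
```

For ‘你’ this gives the code point U+4F60 and the three UTF-8 bytes e4 bd a0.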
When VIM saves the file in UTF-8 encoding (check with `:set fileencoding?'), you can see the content of the file in hex:
- `hd filename' or `hexdump -C filename'
The bytes shown for the character will be the same as the output of `g8'.
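If `hd' or `hexdump' is not at hand, the same check can be approximated in Python. This is only a sketch under assumptions: the helper name `file_hex` is hypothetical, the file is a temporary one written by the script rather than by VIM, and Python 3.8 or later is assumed.

```python
import os
import tempfile


def file_hex(text):
    """Write text to a temporary file as UTF-8 and return the file's raw bytes as hex."""
    # Binary mode avoids any newline translation, so the bytes on disk are exactly
    # the UTF-8 encoding of the text.
    with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as f:
        f.write(text.encode("utf-8"))
        path = f.name
    try:
        with open(path, "rb") as f:
            return f.read().hex(" ")
    finally:
        os.remove(path)


print(file_hex("你\n"))  # e4 bd a0 0a
```

The trailing 0a is the newline at the end of the line; a hex dump of a saved file shows it too, while `g8' only reports the bytes of the character under the cursor.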
The first image shows the UTF-8 value of the Chinese character ‘你’. The second image shows the UCS value of the same character.