Find the UTF-8 and UCS value of a character.

I write software that processes Chinese characters.  Thus, I have to deal with UCS and UTF-8 encoding for Chinese characters.  It used to take me quite some time to find or calculate the UTF-8 and UCS value of a Chinese character until I came across an article about VIM.

I’ve been using VIM for about 8 years but I didn’t know that I could get the UCS and UTF-8 value of a character directly from VIM.  To get the value:

  • In normal mode, place the cursor on the character, then type `ga’ for the Unicode (UCS) value or `g8’ for the UTF-8 bytes.
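
For checking the same two values outside VIM, here is a minimal Python sketch.  The function name `char_values` is my own choice for illustration; it only uses the built-in `ord` and `str.encode`.

```python
def char_values(ch):
    """Return (code point, UTF-8 bytes) for one character, both as hex strings."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # ord() gives the Unicode (UCS) code point as an integer.
    code_point = "U+{:04X}".format(ord(ch))
    # encode("utf-8") gives the bytes that would be stored in a UTF-8 file.
    utf8_hex = " ".join("{:02x}".format(b) for b in ch.encode("utf-8"))
    return code_point, utf8_hex


print(char_values("你"))  # ('U+4F60', 'e4 bd a0')
```

The first value corresponds to what `ga’ reports and the second to what `g8’ reports.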

VIM saves the file in UTF-8 encoding when `fileencoding’ is set to utf-8 (check with `:set fileencoding?’).  To see the content of the file in HEX:

  • `hd filename’ or `hexdump -C filename’

The bytes shown for the character will be the same as the output of `g8’.
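
The same comparison can be sketched in Python, assuming the file is written as UTF-8.  The helper name `write_and_dump` and the temporary file are assumptions for illustration only; this prints just the byte values, not the full `hexdump -C’ layout.

```python
import os
import tempfile


def write_and_dump(text):
    """Write text to a temporary UTF-8 file and return its bytes as hex."""
    with tempfile.NamedTemporaryFile(
        "w", encoding="utf-8", suffix=".txt", delete=False
    ) as tmp:
        tmp.write(text)
        path = tmp.name
    try:
        # Read the file back in binary mode, as a hex dump tool would see it.
        with open(path, "rb") as f:
            return " ".join("{:02x}".format(b) for b in f.read())
    finally:
        os.remove(path)


print(write_and_dump("你"))  # e4 bd a0
```

Note that a file saved from VIM usually ends with a newline, so a real hex dump will typically show an extra `0a’ byte after the character.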

The first image shows the UTF-8 value of the Chinese character ‘你’.  The second image shows the UCS value of the same character.
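
To show how the two values relate, here is a rough sketch of the manual calculation for ‘你’ (UCS value U+4F60).  It covers only the three-byte UTF-8 form, which is the one used for code points from U+0800 to U+FFFF; the function name `utf8_encode_3byte` is hypothetical.

```python
def utf8_encode_3byte(code_point):
    """Encode a code point in U+0800..U+FFFF as three UTF-8 bytes."""
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("this sketch only handles U+0800..U+FFFF")
    # Layout: 1110xxxx 10xxxxxx 10xxxxxx, filled with the 16 code point bits.
    b1 = 0xE0 | (code_point >> 12)          # top 4 bits
    b2 = 0x80 | ((code_point >> 6) & 0x3F)  # middle 6 bits
    b3 = 0x80 | (code_point & 0x3F)         # low 6 bits
    return bytes([b1, b2, b3])


encoded = utf8_encode_3byte(0x4F60)
print(" ".join("{:02x}".format(b) for b in encoded))  # e4 bd a0
```

The sketch does not reject surrogate code points (U+D800 to U+DFFF), which a complete encoder would have to refuse.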
