[Golang] Conversion of Traditional and Simplified Chinese
OpenCC is a tool (available both online and offline) for converting between Traditional and Simplified Chinese. In this post, we will write a Go program that uses OpenCC to convert Simplified Chinese to Traditional Chinese.
If you need a converter implemented in Go, please visit gojianfan [7].
Install OpenCC
See the OpenCC repository on GitHub [1] for installation instructions. On Ubuntu Linux 15.10, you can search for and install OpenCC with the following commands:
$ apt-cache search opencc
fcitx-modules - Flexible Input Method Framework - core modules
libopencc-dbg - simplified-traditional chinese conversion library - debug
libopencc-dev - simplified-traditional chinese conversion library - development
libopencc1 - simplified-traditional chinese conversion library - runtime
opencc - simplified-traditional chinese conversion tool
python-opencc - simplified-traditional chinese conversion library - Python support
$ sudo apt-get install opencc libopencc-dev
OpenCC wrapper for Golang
After some googling [2], I found an OpenCC wrapper [3] for Go. I tried it, but it did not work, so I forked it and made some modifications to make it work on my system. Assuming Go is already installed on your system, install my modified wrapper [4] with:
$ go get github.com/siongui/go-opencc
Another problem I had was that the README said the configurations are .json files. I ran the locate opencc command:
$ locate opencc
/usr/lib/x86_64-linux-gnu/libopencc.so.1
/usr/lib/x86_64-linux-gnu/libopencc.so.1.0.0
/usr/lib/x86_64-linux-gnu/opencc
/usr/lib/x86_64-linux-gnu/opencc/from_tw_phrases.txt
/usr/lib/x86_64-linux-gnu/opencc/from_tw_variants.txt
/usr/lib/x86_64-linux-gnu/opencc/mix2zhs.ini
/usr/lib/x86_64-linux-gnu/opencc/mix2zht.ini
/usr/lib/x86_64-linux-gnu/opencc/simp_to_trad_characters.ocd
/usr/lib/x86_64-linux-gnu/opencc/simp_to_trad_phrases.ocd
/usr/lib/x86_64-linux-gnu/opencc/to_cn_phrases.txt
/usr/lib/x86_64-linux-gnu/opencc/to_tw_phrases.txt
/usr/lib/x86_64-linux-gnu/opencc/to_tw_variants.txt
/usr/lib/x86_64-linux-gnu/opencc/trad_to_simp_characters.ocd
/usr/lib/x86_64-linux-gnu/opencc/trad_to_simp_phrases.ocd
/usr/lib/x86_64-linux-gnu/opencc/zhs2zht.ini
/usr/lib/x86_64-linux-gnu/opencc/zhs2zhtw_p.ini
/usr/lib/x86_64-linux-gnu/opencc/zhs2zhtw_v.ini
/usr/lib/x86_64-linux-gnu/opencc/zhs2zhtw_vp.ini
/usr/lib/x86_64-linux-gnu/opencc/zht2zhs.ini
/usr/lib/x86_64-linux-gnu/opencc/zht2zhtw_p.ini
/usr/lib/x86_64-linux-gnu/opencc/zht2zhtw_v.ini
/usr/lib/x86_64-linux-gnu/opencc/zht2zhtw_vp.ini
/usr/lib/x86_64-linux-gnu/opencc/zhtw2zhcn_s.ini
/usr/lib/x86_64-linux-gnu/opencc/zhtw2zhcn_t.ini
/usr/lib/x86_64-linux-gnu/opencc/zhtw2zhs.ini
/usr/lib/x86_64-linux-gnu/opencc/zhtw2zht.ini
/usr/share/doc/libopencc1
/usr/share/doc/libopencc1/changelog.Debian.gz
/usr/share/doc/libopencc1/copyright
/var/lib/dpkg/info/libopencc1:amd64.list
/var/lib/dpkg/info/libopencc1:amd64.md5sums
/var/lib/dpkg/info/libopencc1:amd64.postinst
/var/lib/dpkg/info/libopencc1:amd64.postrm
/var/lib/dpkg/info/libopencc1:amd64.shlibs
/var/lib/dpkg/info/libopencc1:amd64.symbols
I saw no .json files, but I did see a lot of .ini files. I used these .ini files as configurations and it worked. My guess is that at some point the author of OpenCC changed the names of the configuration files.
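The same check can be done from Go instead of locate. The following is a minimal sketch, not part of the wrapper: listConfigs is a hypothetical helper, and the directory passed in main is simply the one from the locate output above, so it may differ on other systems.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// listConfigs returns the sorted base names of all .ini files found in dir.
// It is a hypothetical helper for illustration, not part of go-opencc.
func listConfigs(dir string) ([]string, error) {
	// Glob ignores I/O errors, so a missing directory yields an empty result.
	matches, err := filepath.Glob(filepath.Join(dir, "*.ini"))
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(matches))
	for _, m := range matches {
		names = append(names, filepath.Base(m))
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	// Assumption: this path comes from the locate output on Ubuntu 15.10.
	names, err := listConfigs("/usr/lib/x86_64-linux-gnu/opencc")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

If the directory does not exist, the program prints nothing, which is a quick hint that OpenCC keeps its configurations somewhere else on that system.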
Source Code
package mylib

import "github.com/siongui/go-opencc"

// CN2TW converts Simplified Chinese text to Traditional Chinese
// using the zhs2zhtw_vp.ini configuration.
func CN2TW(input string) string {
	c := opencc.NewConverter("zhs2zhtw_vp.ini")
	defer c.Close()
	return c.Convert(input)
}
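CN2TW creates and closes a converter on every call. When many strings need converting, one converter could be shared instead. The sketch below shows that idea under stated assumptions: the converter interface only mirrors the two methods used above (and assumes Close returns nothing, which should be checked against the wrapper), while convertAll and upperStub are hypothetical names. The stub stands in for the real converter so the example runs without OpenCC installed.

```go
package main

import (
	"fmt"
	"strings"
)

// converter mirrors the two methods called on the wrapper's converter above.
// Assumption: Close returns nothing; adjust if the real method differs.
type converter interface {
	Convert(input string) string
	Close()
}

// convertAll converts every input with one shared converter and closes it once.
func convertAll(c converter, inputs []string) []string {
	defer c.Close()
	out := make([]string, len(inputs))
	for i, s := range inputs {
		out[i] = c.Convert(s)
	}
	return out
}

// upperStub is a hypothetical stand-in for the OpenCC converter.
// It upper-cases its input instead of converting Chinese text.
type upperStub struct{ closed bool }

func (u *upperStub) Convert(s string) string { return strings.ToUpper(s) }
func (u *upperStub) Close()                  { u.closed = true }

func main() {
	stub := &upperStub{}
	fmt.Println(convertAll(stub, []string{"abc", "def"}))
	fmt.Println("closed:", stub.closed)
}
```

With the real wrapper, the value returned by opencc.NewConverter would be passed in place of the stub, provided its methods match the interface.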
You can replace zhs2zhtw_vp.ini with another configuration according to your needs. The configurations I found with locate opencc are:
mix2zhs.ini
mix2zht.ini
zhs2zht.ini
zhs2zhtw_p.ini
zhs2zhtw_v.ini
zhs2zhtw_vp.ini
zht2zhs.ini
zht2zhtw_p.ini
zht2zhtw_v.ini
zht2zhtw_vp.ini
zhtw2zhcn_s.ini
zhtw2zhcn_t.ini
zhtw2zhs.ini
zhtw2zht.ini
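The file names in this list appear to follow a source2target[_suffix].ini pattern. That reading is only inferred from the names, not taken from the OpenCC documentation. Under that assumption, a small hypothetical helper (parseConfigName, not part of the wrapper) can split a name into its parts, for example to choose a configuration by source and target:

```go
package main

import (
	"fmt"
	"strings"
)

// configName holds the parts of a name like "zhs2zhtw_vp.ini".
// The field meanings are inferred from the file names, not documented.
type configName struct {
	From    string // e.g. "zhs"
	To      string // e.g. "zhtw"
	Variant string // e.g. "vp"; empty when there is no suffix
}

// parseConfigName splits a name of the assumed form source2target[_suffix].ini.
func parseConfigName(name string) (configName, error) {
	base := strings.TrimSuffix(name, ".ini")
	if base == name {
		return configName{}, fmt.Errorf("%q does not end in .ini", name)
	}
	parts := strings.SplitN(base, "2", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return configName{}, fmt.Errorf("%q is not of the form source2target", name)
	}
	to, variant := parts[1], ""
	if i := strings.Index(to, "_"); i >= 0 {
		to, variant = to[:i], to[i+1:]
	}
	return configName{From: parts[0], To: to, Variant: variant}, nil
}

func main() {
	for _, n := range []string{"zhs2zhtw_vp.ini", "mix2zhs.ini", "zhtw2zhcn_s.ini"} {
		c, err := parseConfigName(n)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: from=%s to=%s variant=%q\n", n, c.From, c.To, c.Variant)
	}
}
```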
Test
package mylib

import "testing"

func TestCN2TW(t *testing.T) {
	t.Log(CN2TW("中国鼠标软件打印机"))
}
Output of Test
=== RUN TestCN2TW
--- PASS: TestCN2TW (0.02s)
zhCN2zhTW_test.go:6: 中國滑鼠軟體列印機
PASS
Tested on: Ubuntu Linux 15.10, Go 1.5.2, opencc 0.4.3-2build1.
References:
[1] 開放中文轉換 Open Chinese Convert (OpenCC) (source code)
[2] Google Search: golang opencc
[3] stevenyao/go-opencc · GitHub (OpenCC wrapper for Golang)
[4] siongui/go-opencc · GitHub (my modified OpenCC wrapper for Golang)
[5] [JavaScript] Conversion of Traditional and Simplified Chinese
[6] [Python] Conversion of Traditional and Simplified Chinese
[7] [Golang] Converter for Traditional and Simplified Chinese