[Python] Convert PO file to JSON Format
Introduction
This post shows a Python program that converts PO files to JSON format. A web server can pass the resulting JSON data to the front end, where it is used to translate text strings into the user's native language. In other words, the JSON data generated from the PO files lets you implement a gettext-like function in the browser.
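To make the idea concrete, here is a minimal sketch of the lookup a gettext-like function performs once the JSON data is available. It is written in Python for consistency with the rest of this post; a browser implementation would do the same lookup in JavaScript. The `payload` string and the function signature are assumptions for illustration only.

```python
import json

# Hypothetical JSON payload, shaped like the converter's output shown later in this post.
payload = '{"zh_TW": {"Home": "\\u9996\\u9801"}, "vi_VN": {"Home": "Trang ch\\u00ednh"}}'
translations = json.loads(payload)


def gettext(translations, locale, msgid):
    """Return the translation of msgid for locale, or msgid itself if none exists."""
    return translations.get(locale, {}).get(msgid, msgid)


home_vi = gettext(translations, "vi_VN", "Home")   # translated string
print(gettext(translations, "en_US", "Home"))      # no en_US table, falls back to "Home"
```

Falling back to the original string mirrors how gettext behaves when a translation is missing, so untranslated text still shows up in the default language.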
Sample PO files
In this example, we support two locales: zh_TW (Traditional Chinese) and vi_VN (Vietnamese). The zh_TW PO file is located at locale/zh_TW/LC_MESSAGES/messages.po, and the vi_VN PO file is located at locale/vi_VN/LC_MESSAGES/messages.po.
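Both paths follow the same pattern: locale directory, locale name, LC_MESSAGES, then the domain name with a .po extension. A small sketch of building such paths is shown here; the helper name `po_path` is hypothetical and only illustrates the layout.

```python
import os


def po_path(locale_dir, locale, domain):
    # Assumed layout: <locale_dir>/<locale>/LC_MESSAGES/<domain>.po
    return os.path.join(locale_dir, locale, "LC_MESSAGES", domain + ".po")


# Paths for the two locales used in this post
paths = [po_path("locale", loc, "messages") for loc in ("zh_TW", "vi_VN")]
print(paths)
```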
zh_TW PO file locale/zh_TW/LC_MESSAGES/messages.po:
    # Chinese translations for PACKAGE package.
    # Copyright (C) 2013 THE PACKAGE'S COPYRIGHT HOLDER
    # This file is distributed under the same license as the PACKAGE package.
    # Automatically generated, 2013.
    #
    msgid ""
    msgstr ""
    "Project-Id-Version: PACKAGE VERSION\n"
    "Report-Msgid-Bugs-To: \n"
    "POT-Creation-Date: 2013-06-04 10:20+0800\n"
    "PO-Revision-Date: 2013-03-10 05:19+0800\n"
    "Last-Translator: Automatically generated\n"
    "Language-Team: none\n"
    "Language: zh_TW\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"

    msgid "Home"
    msgstr "首頁"

    msgid "Canon"
    msgstr "經典"

    msgid "About"
    msgstr "關於"

    msgid "Setting"
    msgstr "設定"

    msgid "Translation"
    msgstr "翻譯"
vi_VN PO file locale/vi_VN/LC_MESSAGES/messages.po:
    # Vietnamese translations for PACKAGE package.
    # Copyright (C) 2013 THE PACKAGE'S COPYRIGHT HOLDER
    # This file is distributed under the same license as the PACKAGE package.
    # Automatically generated, 2013.
    #
    msgid ""
    msgstr ""
    "Project-Id-Version: PACKAGE VERSION\n"
    "Report-Msgid-Bugs-To: \n"
    "POT-Creation-Date: 2013-06-06 23:05+0800\n"
    "PO-Revision-Date: 2013-06-06 22:50+0800\n"
    "Last-Translator: Automatically generated\n"
    "Language-Team: none\n"
    "Language: vi\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "Plural-Forms: nplurals=1; plural=0;\n"

    msgid "Home"
    msgstr "Trang chính"

    msgid "Canon"
    msgstr "Kinh điển"

    msgid "About"
    msgstr "Giới thiệu"

    msgid "Setting"
    msgstr "Thiết lập"

    msgid "Translation"
    msgstr "Dịch"
Source Code
Convert PO files to JSON format:
    #!/usr/bin/env python
    # -*- coding:utf-8 -*-

    import io
    import re
    import json


    def getPOPath(locale, domain, localeDir):
        # Build the PO file path, e.g. locale/zh_TW/LC_MESSAGES/messages.po
        return localeDir + "/" + locale + "/LC_MESSAGES/" + domain + ".po"


    def extractFromPOFile(poPath):
        # Read the file as UTF-8 text and collect (msgid, msgstr) pairs.
        # The header entry (msgid "") is skipped because (.+) needs at least
        # one character. io.open returns text on both Python 2 and Python 3,
        # so no manual decoding is needed afterwards.
        with io.open(poPath, 'r', encoding='utf-8') as f:
            pairs = re.findall(r'msgid "(.+)"\nmsgstr "(.+)"', f.read())
        return pairs


    def PO2JSON(locales, domain, localeDir):
        # create PO-like json data for i18n
        obj = {}
        for locale in locales:
            # English is default language
            if locale == "en_US":
                continue

            obj[locale] = {}
            pairs = extractFromPOFile(getPOPath(locale, domain, localeDir))
            for msgid, msgstr in pairs:
                obj[locale][msgid] = msgstr

        return json.dumps(obj)


    if __name__ == '__main__':
        locales = ["zh_TW", "vi_VN"]
        domain = "messages"
        localeDir = "locale"
        print(PO2JSON(locales, domain, localeDir))
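To see what the regular expression in the program actually captures, the sketch below runs it on a small in-memory PO sample instead of a file. The sample text is a trimmed, hypothetical excerpt of the zh_TW file above, and the example assumes Python 3.

```python
import re

# Hypothetical in-memory sample, trimmed from the zh_TW PO file above.
sample = '''msgid ""
msgstr ""
"Language: zh_TW\\n"

msgid "Home"
msgstr "首頁"

msgid "About"
msgstr "關於"
'''

# Same pattern as in the program: a one-line msgid followed directly by a one-line msgstr.
pairs = re.findall(r'msgid "(.+)"\nmsgstr "(.+)"', sample)

for msgid, msgstr in pairs:
    print(msgid, "has a translation of length", len(msgstr))
```

The header entry is not matched because its msgid is empty and `(.+)` requires at least one character. For the same reason, entries with an empty msgstr are skipped, and strings that span several lines are not handled by this simple pattern.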
Output of Demo
{"zh_TW": {"Home": "\u9996\u9801", "About": "\u95dc\u65bc", "Setting": "\u8a2d\u5b9a", "Canon": "\u7d93\u5178", "Translation": "\u7ffb\u8b6f"}, "vi_VN": {"Home": "Trang ch\u00ednh", "About": "Gi\u1edbi thi\u1ec7u", "Setting": "Thi\u1ebft l\u1eadp", "Canon": "Kinh \u0111i\u1ec3n", "Translation": "D\u1ecbch"}}
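The `\uXXXX` sequences in the output are JSON escapes for non-ASCII characters, which `json.dumps` produces by default (`ensure_ascii=True`). They decode back to the original text, so the front end sees the real characters. The following sketch, which assumes Python 3 and uses a shortened, hypothetical dictionary, compares the default output with `ensure_ascii=False`.

```python
import json

# Shortened, hypothetical version of the converter's data
data = {"zh_TW": {"Home": "首頁"}, "vi_VN": {"Home": "Trang chính"}}

escaped = json.dumps(data, sort_keys=True)                       # default: ASCII-only output
readable = json.dumps(data, sort_keys=True, ensure_ascii=False)  # keeps non-ASCII characters

# Both forms decode to the same Python object.
same = json.loads(escaped) == json.loads(readable) == data
print(same)
```

The escaped form is safe to send regardless of the response encoding; the readable form is shorter but must be served as UTF-8.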
Tested on: Ubuntu Linux 15.10, Python 2.7.10.