[Pelican] Remove Prefix in PATH_METADATA Using Named Regex Group
Question
In a Pelican static website, we have the following pages under the content directory:
- pages/index%zh-hant.rst
- pages/about-us%zh-hant.rst
- pages/talk/thai-forest-tradition%zh-hant.rst
- pages/talk/thanissaro/how-to-fall%zh-hant.rst
We want to extract three metadata fields (urlpath, slug, and lang) from the path of each page, so that the four pages above yield, in order:
- urlpath: (empty string), slug: index, lang: zh-hant
- urlpath: (empty string), slug: about-us, lang: zh-hant
- urlpath: talk/, slug: thai-forest-tradition, lang: zh-hant
- urlpath: talk/thanissaro/, slug: how-to-fall, lang: zh-hant
As you can see, the prefix pages/ is removed from urlpath.
Solution
In pelicanconf.py (the pattern is a raw string so that \. reaches the regex engine unchanged):
PATH_METADATA = r'pages/(?P<urlpath>[-a-zA-Z0-9/]*/|)(?P<slug>[-a-zA-Z0-9]*)%(?P<lang>[-_a-zA-Z]{2,7})\.rst'
PAGE_URL = '{urlpath}{slug}/'
PAGE_SAVE_AS = '{urlpath}{slug}/index.html'
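The pattern can be checked outside Pelican before building the site. The following is a minimal sketch, assuming the regex is applied with re.match to the path relative to the content directory. The helper name extract and the use of str.format as a stand-in for Pelican's URL expansion are illustrative assumptions, not Pelican's API.

```python
import re

# Same pattern and URL template as in pelicanconf.py above.
PATH_METADATA = r'pages/(?P<urlpath>[-a-zA-Z0-9/]*/|)(?P<slug>[-a-zA-Z0-9]*)%(?P<lang>[-_a-zA-Z]{2,7})\.rst'
PAGE_URL = '{urlpath}{slug}/'

paths = [
    'pages/index%zh-hant.rst',
    'pages/about-us%zh-hant.rst',
    'pages/talk/thai-forest-tradition%zh-hant.rst',
    'pages/talk/thanissaro/how-to-fall%zh-hant.rst',
]


def extract(path):
    """Hypothetical helper: return the named groups for a path, or None."""
    match = re.match(PATH_METADATA, path)
    return match.groupdict() if match else None


results = {path: extract(path) for path in paths}
for path, meta in results.items():
    # str.format ignores the unused 'lang' key in the template.
    print(path, '->', meta, '->', PAGE_URL.format(**meta))
```

Each path should produce the urlpath, slug, and lang values listed in the question, with pages/ absent from urlpath.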
Tested on: Ubuntu Linux 22.04, Python 3.10.6.
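A note on the empty alternative (the trailing | inside the urlpath group): it makes the group match an empty string for top-level pages rather than not participating in the match at all. The sketch below compares it with a hypothetical variant that uses an optional group (?) instead. It only shows what Python's re module returns; how Pelican treats an unmatched group is not covered here.

```python
import re

# Variant A: urlpath ends in an empty alternative, as in PATH_METADATA.
with_empty_alt = re.compile(
    r'pages/(?P<urlpath>[-a-zA-Z0-9/]*/|)(?P<slug>[-a-zA-Z0-9]*)%')
# Variant B (hypothetical, for comparison only): the group is optional.
with_optional = re.compile(
    r'pages/(?P<urlpath>[-a-zA-Z0-9/]*/)?(?P<slug>[-a-zA-Z0-9]*)%')

path = 'pages/index%zh-hant.rst'
a = with_empty_alt.match(path).group('urlpath')
b = with_optional.match(path).group('urlpath')

print(repr(a))  # '' : the group matched an empty string
print(repr(b))  # None : the group did not participate in the match

# With plain str.format, None would leak into the URL as the text "None".
url_a = '{urlpath}{slug}/'.format(urlpath=a, slug='index')
url_b = '{urlpath}{slug}/'.format(urlpath=b, slug='index')
print(url_a)  # index/
print(url_b)  # Noneindex/
```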