Guess Metadata from HTML and Convert to reStructuredText
Guess metadata from an HTML webpage and convert it to reStructuredText format. Extraction of the following metadata is currently supported, when it is present in the page:
- title
- keywords (tags)
- description (summary)
- author
- og:image
Usage:
See the "guess metadata from HTML" commit in the html2rst repo for the details of the source code.
Tested on: Ubuntu Linux 16.04, Go 1.6.2.