Guess Metadata from HTML and Convert It to reStructuredText


Guess metadata from an HTML web page and convert it to reStructuredText format. The following metadata fields are currently extracted when they are available:

  • title
  • keywords (tags)
  • description (summary)
  • author
  • og:image
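
The extraction listed above can be sketched roughly as follows. This is a minimal illustration and not the html2rst source: the Metadata struct, the guessMetadata function, and the output field names (:tags:, :summary:, :og_image:) are assumptions made for this example. It also assumes the page is well formed enough for the lenient encoding/xml decoder from the standard library; real-world pages are usually better handled by a proper HTML parser such as golang.org/x/net/html or goquery.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Metadata holds the fields listed above. The type and its field names
// are hypothetical and are not taken from html2rst.
type Metadata struct {
	Title       string
	Keywords    string
	Description string
	Author      string
	OgImage     string
}

// guessMetadata scans an HTML document for <title> and <meta> elements.
// Assumption: the input is tidy enough for a non-strict XML decoder.
func guessMetadata(htmlDoc string) (Metadata, error) {
	var m Metadata
	d := xml.NewDecoder(strings.NewReader(htmlDoc))
	d.Strict = false                 // tolerate HTML-style markup
	d.AutoClose = xml.HTMLAutoClose  // treat <meta>, <link>, ... as self-closing
	d.Entity = xml.HTMLEntity        // resolve entities such as &nbsp;

	inTitle := false
	for {
		tok, err := d.Token()
		if err == io.EOF {
			m.Title = strings.TrimSpace(m.Title)
			return m, nil
		}
		if err != nil {
			return m, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			switch strings.ToLower(t.Name.Local) {
			case "title":
				inTitle = true
			case "meta":
				var key, content string
				for _, a := range t.Attr {
					switch strings.ToLower(a.Name.Local) {
					case "name", "property":
						key = strings.ToLower(a.Value)
					case "content":
						content = a.Value
					}
				}
				switch key {
				case "keywords":
					m.Keywords = content
				case "description":
					m.Description = content
				case "author":
					m.Author = content
				case "og:image":
					m.OgImage = content
				}
			}
		case xml.EndElement:
			if strings.ToLower(t.Name.Local) == "title" {
				inTitle = false
			}
		case xml.CharData:
			if inTitle {
				m.Title += string(t)
			}
		}
	}
}

// RST renders the metadata as a reStructuredText field list, skipping
// empty values. The field names used here are assumptions.
func (m Metadata) RST() string {
	var b strings.Builder
	field := func(name, value string) {
		if value != "" {
			fmt.Fprintf(&b, ":%s: %s\n", name, value)
		}
	}
	field("title", m.Title)
	field("tags", m.Keywords)
	field("summary", m.Description)
	field("author", m.Author)
	field("og_image", m.OgImage)
	return b.String()
}

func main() {
	page := `<html><head><title>Hello World</title>
<meta name="keywords" content="Go, HTML">
<meta name="description" content="A short summary.">
<meta property="og:image" content="https://example.com/cover.png">
</head><body></body></html>`
	m, err := guessMetadata(page)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Print(m.RST())
}
```

Fields that are missing from the page are simply left out of the output, which matches the "if available" behaviour described above.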

Usage:

See the "guess metadata from HTML" commit in the html2rst repo for the details of the source code.


Tested on: Ubuntu Linux 16.04, Go 1.6.2.

