[Golang] Get HTML Element Attribute via Regular Expression


Given a short string containing an HTML element, we want to extract attribute values from it. The net/html package or goquery can do the job, but I do not like to go get extra packages just to handle a short string, so I use named capture groups from the standard regexp package instead.

Problem

Given the following string of YouTube embed code:

<iframe width="560" height="315" src="https://www.youtube.com/embed/YpWFR-ioQlE" frameborder="0" allowfullscreen></iframe>

Extract the values of the width and height attributes from the iframe element.

Solution

Run Code on Go Playground

package main

import (
      "errors"
      "fmt"
      "regexp"
)

const youtubeiframecode = `<iframe width="560" height="315" src="https://www.youtube.com/embed/YpWFR-ioQlE" frameborder="0" allowfullscreen></iframe>`

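// GetAttributes returns the width and height attribute values of an iframe
// embed code. The pattern expects width and height to be the first two
// attributes, in that order, and both values to be digits only.
func GetAttributes(c string) (w, h string, err error) {
      // (?P<name>...) defines a named capture group.
      pattern := `<iframe width="(?P<w>[0-9]+)" height="(?P<h>[0-9]+)" .*></iframe>`
      re := regexp.MustCompile(pattern)
      matches := re.FindStringSubmatch(c)
      // names[i] is the name of the group that produced matches[i].
      names := re.SubexpNames()

      // matches is nil when the pattern does not match, so the loop is skipped.
      for i, match := range matches {
              if names[i] == "w" {
                      w = match
              }
              if names[i] == "h" {
                      h = match
              }
      }

      if w == "" || h == "" {
              err = errors.New("cannot find attribute")
              return
      }

      return
}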

func main() {
      w, h, err := GetAttributes(youtubeiframecode)
      if err != nil {
              fmt.Println(err)
      } else {
              fmt.Println("width: ", w)
              fmt.Println("height: ", h)
      }
}
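
The pattern above only matches when width and height are the first two attributes, in that order. As a rough sketch of a more tolerant variant, each attribute can be looked up on its own with a small per-attribute pattern. The helper name attrValue is hypothetical and not part of the original solution, and the sketch assumes double-quoted attribute values and Go 1.15 or later for SubexpIndex. It is still a regular expression, not an HTML parser.

```go
package main

import (
	"fmt"
	"regexp"
)

// attrValue is a hypothetical helper for illustration only.
// It returns the value of the double-quoted attribute called name in tag,
// and false when no such attribute is found.
func attrValue(tag, name string) (string, bool) {
	// The leading \s keeps "width" from matching inside e.g. "data-width".
	// QuoteMeta escapes any regexp metacharacters in the attribute name.
	re := regexp.MustCompile(`\s` + regexp.QuoteMeta(name) + `="(?P<v>[^"]*)"`)
	m := re.FindStringSubmatch(tag)
	if m == nil {
		return "", false
	}
	// SubexpIndex (Go 1.15+) maps a group name directly to its index.
	return m[re.SubexpIndex("v")], true
}

func main() {
	// Same embed code as in the problem, with the attributes reordered.
	code := `<iframe src="https://www.youtube.com/embed/YpWFR-ioQlE" height="315" width="560" frameborder="0" allowfullscreen></iframe>`

	w, okW := attrValue(code, "width")
	h, okH := attrValue(code, "height")
	if !okW || !okH {
		fmt.Println("cannot find attribute")
		return
	}
	fmt.Printf("width: %s\n", w)
	fmt.Printf("height: %s\n", h)
}
```

Compiling a pattern on every call keeps the sketch short. For repeated use, compiling once per attribute name and reusing the result would likely be the better choice.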

References:

[1] [Golang] Regular Expression Named Group - Extract Metadata from File Path
[2] Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript