[Golang] Get HTML Element Attribute via Regular Expression
Given a short string containing an HTML element, we want to extract an attribute of that element. The net/html package or goquery can do the job, but I do not like to go get packages only to handle a short string, so I use named group matches from the regexp package in the Go standard library.
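Before the actual problem, here is a minimal sketch of how named groups line up with submatches in regexp. The helper namedGroups and the variable widthRe are hypothetical names used only for illustration. The sketch relies on the fact that SubexpNames returns one name per submatch index, with an empty name at index 0 (the whole match) and for unnamed groups.

```go
package main

import (
	"fmt"
	"regexp"
)

// widthRe is an illustrative pattern with a single named group "w".
var widthRe = regexp.MustCompile(`width="(?P<w>[0-9]+)"`)

// namedGroups is a hypothetical helper: it maps each named group of re
// to the text it captured in s, or returns nil if s does not match.
func namedGroups(re *regexp.Regexp, s string) map[string]string {
	matches := re.FindStringSubmatch(s)
	if matches == nil {
		return nil
	}
	groups := make(map[string]string)
	for i, name := range re.SubexpNames() {
		// Index 0 is the whole match and has an empty name,
		// as do unnamed groups; skip them.
		if name != "" {
			groups[name] = matches[i]
		}
	}
	return groups
}

func main() {
	fmt.Println(namedGroups(widthRe, `<iframe width="560">`)["w"])
}
```

Returning a map keeps the caller independent of the group order in the pattern, at the cost of one small allocation per call.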
Problem
Given the following string of YouTube embed code:
<iframe width="560" height="315" src="https://www.youtube.com/embed/YpWFR-ioQlE" frameborder="0" allowfullscreen></iframe>
Extract the values of the width and height attributes from the iframe element.
Solution
package main

import (
	"errors"
	"fmt"
	"regexp"
)

const youtubeiframecode = `<iframe width="560" height="315" src="https://www.youtube.com/embed/YpWFR-ioQlE" frameborder="0" allowfullscreen></iframe>`

// iframeRe captures width and height in the named groups "w" and "h".
// It expects width to come first, immediately followed by height.
var iframeRe = regexp.MustCompile(`<iframe width="(?P<w>[0-9]+)" height="(?P<h>[0-9]+)" .*></iframe>`)

// GetAttributes returns the width and height of the iframe element in c.
func GetAttributes(c string) (w, h string, err error) {
	matches := iframeRe.FindStringSubmatch(c)
	names := iframeRe.SubexpNames()
	// names[i] is the name of the group that produced matches[i].
	// If there is no match, matches is nil and the loop does not run.
	for i, match := range matches {
		switch names[i] {
		case "w":
			w = match
		case "h":
			h = match
		}
	}
	if w == "" || h == "" {
		err = errors.New("cannot find attribute")
	}
	return
}

func main() {
	w, h, err := GetAttributes(youtubeiframecode)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("width:", w)
	fmt.Println("height:", h)
}
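The pattern in the solution only matches when width comes first and height follows it directly. As a hedged variation, the sketch below looks up one attribute at a time so that attribute order no longer matters. GetAttribute is a hypothetical helper, not part of the original solution, and it assumes attribute values are double-quoted. It is still a regular expression and not an HTML parser, so it is only suitable for short, well-formed strings like the embed code above.

```go
package main

import (
	"fmt"
	"regexp"
)

// GetAttribute is a hypothetical helper that returns the value of the
// attribute called name in the given element string. It assumes the value
// is double-quoted and that the attribute is preceded by whitespace, so
// that looking up "width" does not match "data-width".
func GetAttribute(element, name string) (string, bool) {
	// QuoteMeta escapes any regexp metacharacters in the attribute name.
	re := regexp.MustCompile(`\s` + regexp.QuoteMeta(name) + `="([^"]*)"`)
	m := re.FindStringSubmatch(element)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// Same embed code as above, but with height placed before width.
	const code = `<iframe height="315" width="560" src="https://www.youtube.com/embed/YpWFR-ioQlE" frameborder="0" allowfullscreen></iframe>`
	for _, name := range []string{"width", "height", "title"} {
		if v, ok := GetAttribute(code, name); ok {
			fmt.Printf("%s: %s\n", name, v)
		} else {
			fmt.Printf("%s: not found\n", name)
		}
	}
}
```

This version compiles a new pattern on every call, which is fine for occasional use. For repeated lookups of the same attribute, compiling the pattern once and reusing it would be the usual choice.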