[Golang] XML Parsing Example (6) - Parse OPML Concisely


The previous post [5] shows how to parse the OPML format. This post shows how to parse it more concisely by using > in struct tags. First, here is what the official encoding/xml documentation says about >:

Tip

If the XML element contains a sub-element whose name matches the prefix of a tag formatted as "a" or "a>b>c", unmarshal will descend into the XML structure looking for elements with the given names, and will map the innermost elements to that struct field. A tag starting with ">" is equivalent to one starting with the field name followed by ">".

This explanation is easier to follow with a concrete example. Consider the structs defined in the previous post:

type opml struct {
      XMLName         xml.Name        `xml:"opml"`
      Version         string          `xml:"version,attr"`
      Head            head
      Body            body
}

type head struct {
      XMLName         xml.Name        `xml:"head"`
      Title           string          `xml:"title"`
}

type body struct {
      XMLName         xml.Name        `xml:"body"`
      Outlines        []outline       `xml:"outline"`
}

We can remove the head and body structs and keep only the meaningful content:

type opml struct {
      XMLName         xml.Name        `xml:"opml"`
      Version         string          `xml:"version,attr"`
      OpmlTitle       string          `xml:"head>title"`
      Outlines        []outline       `xml:"body>outline"`
}

The Title field of the head struct becomes OpmlTitle in the opml struct, and the Outlines field of the body struct moves into the opml struct under the same name.
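To check that both layouts read the same data, here is a minimal sketch that unmarshals one inline document into a nested struct and into a flat struct. The sample OPML, the reduced outline struct, and the names nestedOPML, flatOPML, parseNested and parseFlat are assumptions made for illustration; they are not taken from [5].

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// sampleOPML is a made-up document used only for this sketch.
const sampleOPML = `<opml version="1.0">
  <head>
    <title>Example feeds</title>
  </head>
  <body>
    <outline text="Go Blog" xmlUrl="https://example.com/go.xml"/>
    <outline text="News" xmlUrl="https://example.com/news.xml"/>
  </body>
</opml>`

// outline keeps only two attributes to stay short (hypothetical subset).
type outline struct {
	Text   string `xml:"text,attr"`
	XMLURL string `xml:"xmlUrl,attr"`
}

// nestedOPML mirrors the XML hierarchy with one struct per element.
type nestedOPML struct {
	XMLName xml.Name `xml:"opml"`
	Version string   `xml:"version,attr"`
	Head    struct {
		Title string `xml:"title"`
	} `xml:"head"`
	Body struct {
		Outlines []outline `xml:"outline"`
	} `xml:"body"`
}

// flatOPML reaches the same elements through "a>b" paths in the tags.
type flatOPML struct {
	XMLName   xml.Name  `xml:"opml"`
	Version   string    `xml:"version,attr"`
	OpmlTitle string    `xml:"head>title"`
	Outlines  []outline `xml:"body>outline"`
}

// parseNested unmarshals data into the nested layout.
func parseNested(data string) (nestedOPML, error) {
	var n nestedOPML
	err := xml.Unmarshal([]byte(data), &n)
	return n, err
}

// parseFlat unmarshals data into the flat layout.
func parseFlat(data string) (flatOPML, error) {
	var f flatOPML
	err := xml.Unmarshal([]byte(data), &f)
	return f, err
}

func main() {
	n, err := parseNested(sampleOPML)
	if err != nil {
		panic(err)
	}
	f, err := parseFlat(sampleOPML)
	if err != nil {
		panic(err)
	}
	// Both comparisons should hold: the two layouts see the same data.
	fmt.Println(n.Head.Title == f.OpmlTitle)
	fmt.Println(len(n.Body.Outlines) == len(f.Outlines))
	for _, o := range f.Outlines {
		fmt.Println(o.Text, o.XMLURL)
	}
}
```

The flat layout trades the intermediate types for longer tags, which is usually worthwhile when the wrapper elements carry no data of their own.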

Complete source code for concisely parsing the OPML in [5]:

parse-5_2.go:

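package main

import (
	"encoding/xml"
	"fmt"
	"io/ioutil"
)

// opml maps the root element. The "a>b" tags reach into <head> and <body>
// without declaring separate structs for them.
type opml struct {
	XMLName   xml.Name  `xml:"opml"`
	Version   string    `xml:"version,attr"`
	OpmlTitle string    `xml:"head>title"`
	Outlines  []outline `xml:"body>outline"`
}

// outline holds the attributes of one <outline> element.
type outline struct {
	Text    string `xml:"text,attr"`
	Title   string `xml:"title,attr"`
	Type    string `xml:"type,attr"`
	XmlUrl  string `xml:"xmlUrl,attr"`
	HtmlUrl string `xml:"htmlUrl,attr"`
	Favicon string `xml:"rssfr-favicon,attr"`
}

func main() {
	o := opml{}
	// ioutil.ReadFile matches the Go 1.4 toolchain used here;
	// on Go 1.16 and later, os.ReadFile is the replacement.
	xmlContent, err := ioutil.ReadFile("example-5.xml")
	if err != nil {
		panic(err)
	}
	if err := xml.Unmarshal(xmlContent, &o); err != nil {
		panic(err)
	}
	// Print every outline found under <body>.
	for _, ol := range o.Outlines {
		fmt.Println(ol)
	}
}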

The output is the same as in [5].
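The quoted documentation also mentions tags that start with ">", which the OPML example does not use. A minimal sketch of that rule follows. The sample XML, the post struct, and parsePost are hypothetical names chosen for illustration. The shorthand only helps when the parent element is spelled exactly like the Go field name, which is why it does not fit OPML's lowercase head and body elements.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// sampleXML is a made-up document. The parent element <Tags> is spelled
// exactly like the Go field name below, which the shorthand relies on.
const sampleXML = `<post>
  <Tags>
    <tag>golang</tag>
    <tag>xml</tag>
  </Tags>
</post>`

// post is a hypothetical type for this sketch.
type post struct {
	XMLName xml.Name `xml:"post"`
	// ">tag" is equivalent to "Tags>tag": the field name fills in
	// the missing first path segment.
	Tags []string `xml:">tag"`
}

// parsePost unmarshals data into a post value.
func parsePost(data string) (post, error) {
	var p post
	err := xml.Unmarshal([]byte(data), &p)
	return p, err
}

func main() {
	p, err := parsePost(sampleXML)
	if err != nil {
		panic(err)
	}
	for _, t := range p.Tags {
		fmt.Println(t)
	}
}
```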

Tested on: Ubuntu Linux 14.10, Go 1.4.


[Golang] XML Parsing Example series:

[1] [Golang] XML Parsing Example (1)
[2] [Golang] XML Parsing Example (2)
[3] [Golang] XML Parsing Example (3)
[4] [Golang] XML Parsing Example (4)
[5] [Golang] XML Parsing Example (5) - Parse OPML
[6] [Golang] XML Parsing Example (6) - Parse OPML Concisely
[7] [Golang] XML Parsing Example (7) - Parse RSS 2.0
[8] [Golang] XML Parsing Example (8) - Parse Atom 1.0
[9] [Golang] Convert Atom to RSS
[10] [Golang] Parse Web Feed - RSS and Atom
