Golang XML External Entities Guide: Examples and Prevention

StackHawk | March 24, 2022

Worried about Golang XML external entities? Is your code safe? This post has sample XXE attacks and sample Go code to show you how to be safe.

XML External Entity (XXE) attacks can lead to denial of service, loss of confidential information, and service outages. XXE attacks help hackers snoop on systems and compromise critical data. They're a form of injection attack that takes advantage of applications that fail to protect themselves from malicious XML documents.

Let's look at Golang XML External Entities and how Go protects you from this kind of attack.

XML External Entity Attacks

XXE attacks inject applications with XML documents that refer to content that compromises the application's safety or expected operation. To understand how these attacks work, we need to take a brief look at how XML documents are structured.

XML documents are made up of entities. Each entity contains or points to content. Entities that contain their content are internal, while entities that point to content outside the document are external. Some external entities use URIs to refer to content or resources on the host system or a network, including the public Internet. Others refer to content in the document description, like dictionaries that map entities to terms.

XML external entity attacks use URIs that point to resources that either compromise the application with malicious content or steal confidential information by coercing the app into retrieving and supplying the attacker with files they shouldn't be able to see. Or, they use entities to generate content that causes code to fail. 

Common XML External Entity Attacks

XXE attacks can take many forms. Let's go over a few of the more common ones, then see how they work (or not) in Go.

File Retrieval Attacks

External entities point at URIs, and one type of URI is a local file. The attack attempts to get the targeted application to return the contents of the file. Here's a document that uses the URI for the passwd file on a Linux or macOS system:

 <?xml version="1.0" encoding="ISO-8859-1"?> 
   <!DOCTYPE foo [
        <!ELEMENT foo ANY >
        <!ELEMENT bar ANY >
        <!ENTITY xxe SYSTEM "file:///etc/passwd" >
   ]>
   <foo>
    <bar>&xxe;</bar>
   </foo>

The DOCTYPE declaration introduces the Document Type Definition (DTD) for this document. This DTD declares two ELEMENTs that can contain any content and an external ENTITY that refers to "file:///etc/passwd". A file:// URI points to a file on the local system, so this entity is looking for /etc/passwd. We can use this entity anywhere inside the document by referring to its name: &xxe;

Then, the body of the document puts the new entity inside the bar element, which is nested in foo. If the parser expands the entity, the final product is the contents of the system's password file inside an XML document.

The contents of the password file on modern Linux and macOS systems won't get you passwords, not even the hashed versions, since both systems have moved the hashed passwords out to a location that unprivileged users can't see. But a list of valid users can be useful to snoopers, as are many other files that an unprivileged application can be tricked into retrieving for an attacker.

Network Snooping Attacks

Now that we have a template for creating external entities, we can reuse it for other attacks. 

External entities can point at websites, too. So let's imagine an attacker wants to learn about their target's internal network:

 <?xml version="1.0" encoding="ISO-8859-1"?> 
   <!DOCTYPE foo [
        <!ELEMENT foo ANY >
        <!ELEMENT bar ANY >
        <!ENTITY xxe SYSTEM "https://192.168.1.1/login" >
   ]>
   <foo>
    <bar>&xxe;</bar>
   </foo>

If this attack works, it will place the contents of https://192.168.1.1/login inside <bar>.

What if there's no web server there? Well, now the attacker knows that. What if there is, but /login is a 404? Now the attacker knows to try a different set of URLs. What if the target's internal network is 192.168.2.x and not 192.168.1.x? That's fine; the attacker can keep trying.

Denial of Service Attacks

You can disable or degrade a system with an XXE attack, too. 

Let's look at a variation on a file retrieval attack:

 <?xml version="1.0" encoding="ISO-8859-1"?> 
   <!DOCTYPE foo [
        <!ELEMENT foo ANY >
        <!ELEMENT bar ANY >
        <!ENTITY xxe SYSTEM "file:///dev/random" >
   ]>
   <foo>
    <bar>&xxe;</bar>
   </foo>

The /dev/random file is a special file that generates random numbers based on system noise. It has no end, and on many systems it blocks, which means it will not return data until an event it can use to generate a random number occurs. So, pointing an XML parser at it will often cause it to freeze. An attacker who sends a document like this over and over can disable a web service by blocking all of its input threads.

XML Bombs

Before we look at Golang, let's look at one more type of attack. It uses XML parsing to cause a denial of service. 

The billion laughs attack takes advantage of XML's ability to nest entities. Here's an example:

<?xml version="1.0"?>
<!DOCTYPE lolz [
        <!ENTITY lol "lol">
        <!ELEMENT lolz (#PCDATA)>
        <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
        <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
        <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
        <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
        <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
        <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
        <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
        <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
        <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
        ]>
<lolz>&lol9;</lolz>

Let's go through the process of loading this document: 

  1. The lolz element contains a single &lol9; entity.

  2. The &lol9; entity contains 10 &lol8; entities.

  3. Each &lol8; entity contains 10 &lol7; entities.

  4. Each &lol7; entity contains 10 &lol6; entities.

  5. Each &lol6; entity contains 10 &lol5; entities.

  6. Continue back to &lol1; and you've expanded the &lol9; entity to 10^9 lols.

As you can see, the billion laughs attack was named accurately. This construct adds a billion copies of "lol" to the XML document. So, a relatively small XML document ends up consuming gigabytes of memory when processed.
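The arithmetic behind that claim can be checked with a few lines of Go. This is only a sketch of the counting, not a parser: the helper name expandedCopies is made up for illustration, and it assumes every level references the previous one exactly ten times, as in the document above.

```go
package main

import "fmt"

// expandedCopies (a hypothetical helper) returns how many copies of the
// base entity result when `levels` nested entities each reference the
// previous level `fanout` times.
func expandedCopies(levels, fanout int64) int64 {
	copies := int64(1)
	for i := int64(0); i < levels; i++ {
		copies *= fanout
	}
	return copies
}

func main() {
	copies := expandedCopies(9, 10) // lol1 through lol9, ten references each
	size := copies * int64(len("lol"))
	fmt.Printf("%d copies of \"lol\", about %.1f GB of text\n",
		copies, float64(size)/1e9)
}
```

Nine levels with a fan-out of ten give 10^9 copies, and at three bytes per "lol" that is roughly 3 GB of text from a document well under a kilobyte.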

Golang XML External Entity Attacks

How do you protect yourself from XXE attacks in Golang? Let's take a look. 

Let's start with the file retrieval attack described above:

<?xml version="1.0" encoding="ISO-8859-1"?>
   <!DOCTYPE foo [
        <!ELEMENT foo ANY >
        <!ELEMENT bar ANY >
        <!ENTITY xxe SYSTEM "file:///etc/passwd" >
   ]>
<foo>
    <bar>&xxe;</bar>
</foo>

Here's a simple Go program that parses the document:

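package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"os"
)

// identReader returns the input unchanged so the decoder accepts the
// document's ISO-8859-1 declaration without converting any bytes.
func identReader(encoding string, input io.Reader) (io.Reader, error) {
	return input, nil
}

func main() {
	xmlFile, err := os.Open("retrieval.xml")
	if err != nil {
		fmt.Println(err)
		return // nothing to parse if the file could not be opened
	}
	defer xmlFile.Close()

	fmt.Println("Successfully Opened retrieval.xml")

	decoder := xml.NewDecoder(xmlFile)
	// Non-strict mode passes unknown entity references through as text
	// instead of returning a syntax error.
	decoder.Strict = false
	decoder.CharsetReader = identReader
	for {
		token, err := decoder.Token()
		if err == io.EOF {
			break // end of document
		}
		if err != nil {
			fmt.Printf("Error! Decoding XML failed: %v\n", err)
			break
		}
		// Echo elements and text; other tokens, such as the DOCTYPE
		// directive, are skipped.
		switch element := token.(type) {
		case xml.StartElement:
			fmt.Printf("<%s>", element.Name.Local)
		case xml.EndElement:
			fmt.Printf("</%s>", element.Name.Local)
		case xml.CharData:
			fmt.Printf("%s", element)
		}
	}
}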

This code prints the name of each element as the parser reaches it. If the element contains text, it prints that next; then it prints the element's name again when it reaches the end tag. The code also uses a simple pass-through CharsetReader so the decoder accepts the document's ISO-8859-1 declaration.

Here's the output from the attempt to retrieve the password file:

Successfully Opened retrieval.xml

<foo>
<bar>&xxe;</bar>
</foo>

Go's XML parser didn't expand the external entity! We get the same result if we run it against the snoop attack above. What's happening here? 

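Golang's XML decoder doesn't process XML external entities, so these attacks don't work. If you look at issues on GitHub, you'll see reports of the decoder failing when it encountered them. That still happens in the default strict mode, which returns a syntax error for an entity it doesn't recognize. With Strict set to false, as in the program above, it simply prints the entity name instead of failing.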

So, let's try the Billion Laughs attack. 

Put this XML in a file named lulz.xml:

<?xml version="1.0"?>
<!DOCTYPE lolz [
        <!ENTITY lol "lol">
        <!ELEMENT lolz (#PCDATA)>
        <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
        <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
        <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
        <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
        <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
        <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
        <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
        <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
        <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
        ]>
<lolz>&lol9;</lolz>

Then modify the Go code to open that file name and run it. Here's the output:

Successfully Opened lulz.xml

<lolz>&lol9;</lolz>

Not even a giggle! 

So, Golang's defense against entity attacks is not to process entities declared in a DTD at all, whether they are external or internal.

Wrapping up Golang and XXE Attacks

This post discussed what XML external entities are and how hackers use them to attack web applications. Then we looked at how these attacks fare against Golang XML processing: not well. Golang's XML decoder doesn't process external entities at all. So, Go applications are resilient against XXE attacks.

But that doesn't mean you can't do more to protect your Go applications from attacks. Sign up for a free account and see how StackHawk can help you secure your code today!

This post was written by Eric Goebelbecker. Eric has worked in the financial markets in New York City for 25 years, developing infrastructure for market data and financial information exchange (FIX) protocol networks. He loves to talk about what makes teams effective (or not so effective!).

