dpkt official documentation, annotated translation: HTTP Request Example (TCP layer: tcp.data carries the HTTP request headers and related information)

Posted by codeman on Fri, 04 Oct 2019 08:34:50 +0200

Link to the original text: https://dpkt.readthedocs.io/en/latest/print_http_requests.html

Print HTTP Requests Example

This example expands on the print_packets example. It checks for HTTP request headers and displays their contents.

NOTE: We are not reconstructing 'flows' so the request (and response if you tried to parse it) will only parse correctly if they fit within a single packet. Requests can often fit in a single packet but Responses almost never will. For proper reconstruction of flows you may want to look at other projects that use DPKT (http://chains.readthedocs.io and others)
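
To make the single-packet limitation concrete, here is a minimal sketch that uses only the standard library, not dpkt. The helper name `is_complete_request`, the list of methods, and the two hand-made segments are assumptions for illustration. The check is deliberately crude: a payload counts as a full request only if it starts with a request line and contains the blank line that ends the headers. The final concatenation assumes the segments are already in order, which real flow reconstruction cannot take for granted.

```python
# Hypothetical illustration: one HTTP request split across two TCP segments.
HEADER_END = b"\r\n\r\n"  # blank line that terminates the HTTP headers
METHODS = (b"GET", b"POST", b"HEAD", b"PUT", b"DELETE", b"OPTIONS")


def is_complete_request(payload):
    """Return True if payload starts with a request line and holds all headers."""
    first_line = payload.partition(b"\r\n")[0]
    parts = first_line.split()
    starts_ok = (len(parts) == 3
                 and parts[0] in METHODS
                 and parts[2].startswith(b"HTTP/"))
    return starts_ok and HEADER_END in payload


# Two made-up TCP payloads that together carry a single request.
segment_1 = b"GET /download.html HTTP/1.1\r\nHost: www.ethe"
segment_2 = b"real.com\r\nConnection: keep-alive\r\n\r\n"

# Looked at packet by packet, neither payload is a parseable request.
print(is_complete_request(segment_1))
print(is_complete_request(segment_2))

# Only the reassembled byte stream contains the whole request.
print(is_complete_request(segment_1 + segment_2))
```

Ordering segments by sequence number and handling retransmissions is the part this sketch leaves out.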

Code Excerpt

import datetime

import dpkt

# `pcap` is the argument of print_http_requests(); mac_addr() and
# inet_to_str() are the helper functions listed further down.

# For each packet in the pcap, process the contents
for timestamp, buf in pcap:

    # Unpack the Ethernet frame (mac src/dst, ethertype)
    eth = dpkt.ethernet.Ethernet(buf)

    # Make sure the Ethernet data contains an IP packet
    if not isinstance(eth.data, dpkt.ip.IP):
        print('Non IP Packet type not supported %s\n' % eth.data.__class__.__name__)
        continue

    # Now grab the data within the Ethernet frame (the IP packet)
    ip = eth.data

    # Check for TCP in the transport layer
    if isinstance(ip.data, dpkt.tcp.TCP):

        # Set the TCP data
        tcp = ip.data

        # Now see if we can parse the contents as an HTTP request
        try:
            request = dpkt.http.Request(tcp.data)
        except (dpkt.dpkt.NeedData, dpkt.dpkt.UnpackError):
            continue

        # Pull out fragment information (flags and offset are all packed into the off field, so use bitmasks)
        do_not_fragment = bool(ip.off & dpkt.ip.IP_DF)
        more_fragments = bool(ip.off & dpkt.ip.IP_MF)
        fragment_offset = ip.off & dpkt.ip.IP_OFFMASK

        # Print out the info
        print('Timestamp: ', str(datetime.datetime.utcfromtimestamp(timestamp)))
        print('Ethernet Frame: ', mac_addr(eth.src), mac_addr(eth.dst), eth.type)
        print('IP: %s -> %s   (len=%d ttl=%d DF=%d MF=%d offset=%d)' %
              (inet_to_str(ip.src), inet_to_str(ip.dst), ip.len, ip.ttl, do_not_fragment, more_fragments, fragment_offset))
        print('HTTP request: %s\n' % repr(request))
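
The bitmask step in the excerpt can be tried in isolation. The sketch below does not use dpkt. It defines the masks itself, with the values given by the IPv4 header layout in RFC 791, and the function name `split_off_field` is made up for this illustration.

```python
# Masks for the 16-bit flags/fragment-offset field of the IPv4 header.
IP_DF = 0x4000       # "don't fragment" flag
IP_MF = 0x2000       # "more fragments" flag
IP_OFFMASK = 0x1FFF  # lower 13 bits: fragment offset, counted in 8-byte units


def split_off_field(off):
    """Split the packed field into (do_not_fragment, more_fragments, offset)."""
    return bool(off & IP_DF), bool(off & IP_MF), off & IP_OFFMASK


# Only DF set, as in the example output (DF=1 MF=0 offset=0).
print(split_off_field(0x4000))

# A middle fragment: MF set, offset 185 units = 185 * 8 = 1480 bytes.
print(split_off_field(0x2000 | 185))
```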

Example Output

Timestamp:  2004-05-13 10:17:08.222534
Ethernet Frame:  00:00:01:00:00:00 fe:ff:20:00:01:00 2048
IP: 145.254.160.237 -> 65.208.228.223   (len=519 ttl=128 DF=1 MF=0 offset=0)
HTTP request: Request(body='', uri='/download.html', headers={'accept-language': 'en-us,en;q=0.5', 'accept-encoding': 'gzip,deflate', 'connection': 'keep-alive', 'keep-alive': '300', 'accept': 'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1', 'user-agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113', 'accept-charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7', 'host': 'www.ethereal.com', 'referer': 'http://www.ethereal.com/development.html'}, version='1.1', data='', method='GET')

Timestamp:  2004-05-13 10:17:10.295515
Ethernet Frame:  00:00:01:00:00:00 fe:ff:20:00:01:00 2048
IP: 145.254.160.237 -> 216.239.59.99   (len=761 ttl=128 DF=1 MF=0 offset=0)
HTTP request: Request(body='', uri='/pagead/ads?client=ca-pub-2309191948673629&random=1084443430285&lmt=1082467020&format=468x60_as&output=html&url=http%3A%2F%2Fwww.ethereal.com%2Fdownload.html&color_bg=FFFFFF&color_text=333333&color_link=000000&color_url=666633&color_border=666633', headers={'accept-language': 'en-us,en;q=0.5', 'accept-encoding': 'gzip,deflate', 'connection': 'keep-alive', 'keep-alive': '300', 'accept': 'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1', 'user-agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113', 'accept-charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7', 'host': 'pagead2.googlesyndication.com', 'referer': 'http://www.ethereal.com/download.html'}, version='1.1', data='', method='GET')

...

dpkt/examples/print_http_requests.py

This example expands on the print_packets example. It checks for HTTP request headers and displays their contents. NOTE: We are not reconstructing 'flows' so the request (and response if you tried to parse it) will only parse correctly if they fit within a single packet. Requests can often fit in a single packet but Responses almost never will. For proper reconstruction of flows you may want to look at other projects that use DPKT (http://chains.readthedocs.io and others)

examples.print_http_requests.mac_addr(address)

# Convert a MAC address to a readable/printable string
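
As a rough idea of what such a helper can look like, here is a sketch using only the standard library. It is not claimed to be the implementation shipped in dpkt's example file. It only shows one way to turn the six raw bytes of eth.src or eth.dst into the colon-separated form seen in the example output.

```python
def mac_addr(address):
    """Format raw MAC address bytes as colon-separated lowercase hex."""
    # bytearray() yields integers for each byte on both Python 2 and 3.
    return ":".join("%02x" % b for b in bytearray(address))


print(mac_addr(b"\x00\x00\x01\x00\x00\x00"))
print(mac_addr(b"\xfe\xff\x20\x00\x01\x00"))
```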

examples.print_http_requests.inet_to_str(inet)

# Convert an inet object (a packed IP address) to a string
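
A comparable conversion can be sketched with the standard socket module. This is an assumption about how such a helper might be written, not a copy of dpkt's example code. Choosing the address family from the length of the packed value is a design choice made for this sketch.

```python
import socket


def inet_to_str(inet):
    """Convert a packed IPv4 (4-byte) or IPv6 (16-byte) address to text."""
    family = socket.AF_INET if len(inet) == 4 else socket.AF_INET6
    return socket.inet_ntop(family, inet)


# The packed form of the source address from the example output.
print(inet_to_str(bytes([145, 254, 160, 237])))
```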

examples.print_http_requests.print_http_requests(pcap)

# Print out information about each packet in a pcap

examples.print_http_requests.test()

# Open up a test pcap file and print out the packets

The functions above are implemented in the example source file; see the linked original document for the full code.

Conclusion:

How IP, TCP, and HTTP are nested

  IP address
      |
| IP Header |    DATA    |
               ________
                  |
             | TCP Header | TCP DATA |
               __________   ________
                   |            |
                 port      (HTTP header)
                             _________
                                 |
                  host, GET/POST, payload, and other request information
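
The nesting in the diagram can be walked through by hand. The following Python 3 sketch uses only struct and socket, not dpkt. It builds a tiny IP packet around a TCP segment around an HTTP request, then peels the layers off again. The function names, the source port, and the zeroed checksums are assumptions made for the illustration. The addresses and request line are borrowed from the example output.

```python
import socket
import struct

# Sample values for illustration; checksums are left at 0 (not computed).
SRC, DST = "145.254.160.237", "65.208.228.223"
HTTP = b"GET /download.html HTTP/1.1\r\nHost: www.ethereal.com\r\n\r\n"


def build_packet():
    """Assemble IP header + TCP header + HTTP payload by hand."""
    tcp_header = struct.pack("!HHIIBBHHH",
                             3372, 80,    # source port (made up), destination port
                             0, 0,        # sequence and acknowledgement numbers
                             5 << 4,      # data offset: 5 words = 20 bytes
                             0x18,        # flags: PSH + ACK
                             8192, 0, 0)  # window, checksum, urgent pointer
    total_len = 20 + len(tcp_header) + len(HTTP)
    ip_header = struct.pack("!BBHHHBBH4s4s",
                            0x45, 0, total_len,  # version 4 / IHL 5, TOS, length
                            0, 0x4000,           # identification, DF flag set
                            128, 6, 0,           # TTL, protocol 6 = TCP, checksum
                            socket.inet_aton(SRC), socket.inet_aton(DST))
    return ip_header + tcp_header + HTTP


def peel_layers(ip_packet):
    """Walk IP header -> TCP header -> HTTP text, as in the diagram."""
    ip_hlen = (ip_packet[0] & 0x0F) * 4      # IHL is counted in 32-bit words
    ip_data = ip_packet[ip_hlen:]            # IP DATA is the TCP segment
    sport, dport = struct.unpack("!HH", ip_data[:4])
    tcp_hlen = (ip_data[12] >> 4) * 4        # TCP data offset, also in words
    tcp_data = ip_data[tcp_hlen:]            # TCP DATA is the HTTP message
    head = tcp_data.partition(b"\r\n\r\n")[0]
    lines = head.decode("ascii").split("\r\n")
    method, uri, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "src": socket.inet_ntoa(ip_packet[12:16]),
        "dst": socket.inet_ntoa(ip_packet[16:20]),
        "sport": sport, "dport": dport,
        "method": method, "uri": uri, "headers": headers,
    }


result = peel_layers(build_packet())
print(result)
```

The addresses come from the IP header, the ports from the TCP header, and the method, URI, and host only appear once the TCP data is read as text.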
