Cycle about HTTP part 4

I’m hungry, so let’s eat an HTTP header.

To do that, let’s first learn how the HTTP header gets to us.

Sniffer

A sniffer is a computer program or device whose task is to capture and analyze data travelling over a network. Our program will use a Linux socket attached to a network interface. First, we need to know the structure of a packet and of the protocol headers, e.g. IP, TCP, ICMP, etc. The ISO/OSI model is also an important element.

OSI model

We will start with the Ethernet frame:

Ethernet Type II Frame.

It includes:

  • Destination MAC Address
  • Source MAC Address
  • EtherType

That is the end of the theory; now for the practical part. The sniffer will be written in Python, because that makes its structure and operating principles quick to understand. We start with the network interface:

network interface.

It is worth noting that most of the definitions used here come from the Linux kernel source files in its repository and from its documentation.

The Linux Kernel documentation

The socket function has the following signature: “socket(domain, type, protocol)”

  • Domain: the address domain (address family) in which communication will be carried out.
  • Type: the communication method.
  • Protocol: the protocol to be received.

The address domain can be one of [1]:

  • AF_INET: IPv4 (TCP/IP protocol family).
  • AF_INET6: IPv6.
  • AF_IPX: IPX protocol.
  • AF_PACKET: low-level packet receiving interface.
  • AF_XXX stands for address family and PF_XXX for protocol family (in practice they are one and the same).

The types of communication are divided into:

  • SOCK_STREAM: TCP (connection-oriented, bi-directional communication).
  • SOCK_DGRAM: UDP (connectionless communication).
  • SOCK_RAW: low-level access to raw packets together with their headers (with AF_PACKET this includes the Ethernet header).

socket.ntohs(x) converts a 16-bit integer from network byte order to host byte order; it is used here to build the protocol argument, which is zero by default.
All protocol numbers can be found in the file if_ether.h [2]. ETH_P_ALL (0x0003) receives all protocols.
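The three arguments described above can be put together as in the following minimal sketch. It assumes Linux (AF_PACKET does not exist elsewhere) and root privileges; the function name open_sniffer_socket is hypothetical and not taken from the original code.

```python
import socket

# ETH_P_ALL as defined in if_ether.h: receive frames of every protocol.
ETH_P_ALL = 0x0003


def open_sniffer_socket():
    """Open a raw packet socket that sees every frame (Linux only, needs root)."""
    # socket(domain, type, protocol):
    #   AF_PACKET -> low-level packet interface
    #   SOCK_RAW  -> raw frames, including the Ethernet header
    #   protocol  -> ETH_P_ALL passed through the 16-bit byte-order conversion
    return socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(ETH_P_ALL))
```

The socket is only created when open_sniffer_socket() is called. Note that socket.htons performs the same 16-bit byte swap as socket.ntohs, so either name yields the same protocol value here.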

Then we make a loop to receive packets:

Loop to receive packets.

s.recvfrom(buffer) receives data from the network interface; buffer is the maximum number of bytes to read, set to the maximum size of a packet.
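A receive loop of this kind could look like the sketch below. The names receive_packets, handle and max_packets are hypothetical, and the buffer size of 65535 bytes is an assumption (the largest possible IP packet), not a value confirmed by the original code.

```python
def receive_packets(sock, handle, max_packets=None, bufsize=65535):
    """Read packets from sock and pass each one to handle().

    Runs forever when max_packets is None, otherwise stops after that
    many packets. Returns the number of packets processed.
    """
    count = 0
    while max_packets is None or count < max_packets:
        # recvfrom() returns (data, address); only the raw bytes are used here.
        packet, _address = sock.recvfrom(bufsize)
        handle(packet)
        count += 1
    return count
```

Passing the socket in as a parameter keeps the loop independent of how the socket was opened, so it can also be tried out with any object that offers a recvfrom() method.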

eth_length is the size of the Ethernet frame header [3], 14 bytes in total, made up of:

  • Destination MAC: 6 bytes.
  • Source MAC: 6 bytes.
  • Frame type (EtherType): 2 bytes.

eth_header is the Ethernet header data taken from the received packet.

All that is left is to unpack the binary header into ints, strings, etc. For this we use the struct module and its function unpack('!6s6sH', eth_header).

The following function is used for MAC address conversion:

function is used for MAC address conversion.

The format string breaks down as follows:

  • !: network (big-endian) byte order.
  • 6s: 6 characters (char), i.e. a 6-byte string.
  • H: an unsigned integer 2 bytes in size.
Format Characters.
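To make the format string concrete, here is a minimal sketch that unpacks the Ethernet header of a hand-built frame. The function name parse_ethernet_header and the sample bytes are made up for illustration.

```python
import struct

ETH_LENGTH = 14  # 6 (destination MAC) + 6 (source MAC) + 2 (EtherType)


def parse_ethernet_header(packet):
    """Return (destination MAC, source MAC, EtherType) from a raw frame."""
    eth_header = packet[:ETH_LENGTH]
    # '!' = network byte order, '6s' = 6 raw bytes, 'H' = unsigned 16-bit int
    dst_mac, src_mac, eth_type = struct.unpack("!6s6sH", eth_header)
    return dst_mac, src_mac, eth_type


# Hand-built sample: broadcast destination, made-up source, EtherType 0x0800 (IPv4).
sample = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"payload"
dst, src, eth_type = parse_ethernet_header(sample)
print(dst.hex(), src.hex(), hex(eth_type))  # ffffffffffff 001122334455 0x800
```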

The eth_addr(a) function converts the characters of a string to hexadecimal notation using ord() and the %x format specifier. ord() converts a character to its integer code, and %x formats that integer as a hexadecimal number.
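A Python 3 variant of such a function might look like the sketch below. It is an assumption about the shape of the function, not the original code: in Python 3, iterating over bytes already yields integers, so the ord() step described above (needed for Python 2 strings) is left out.

```python
def eth_addr(a):
    """Format a 6-byte MAC address as colon-separated hexadecimal."""
    # Each item of a bytes object is already an int in Python 3;
    # '%.2x' prints it as two lowercase hexadecimal digits.
    return ":".join("%.2x" % byte for byte in a)


print(eth_addr(b"\x00\x11\x22\x33\x44\x55"))  # 00:11:22:33:44:55
```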

We will start with the most important header, IPv4:

IPv4 Header.
  • Version: defines the IP version, e.g. IPv4 or IPv6.
  • IHL: header length counted in 32-bit words; the minimum is 5 words = 20 bytes = 160 bits.
  • Total Length: total length of the packet including data; the maximum value is 65535 bytes.
  • Identification: the packet identifier that helps to put all fragments of a packet back together.
  • Protocol: defines which protocol is used in the transport layer.
  • Time to Live: the lifetime of the packet, i.e. the number of hops it can pass through before being discarded.
  • Fragment Offset: the position of the fragment within the original packet.
  • Flags: information on whether the packet may be fragmented and whether more fragments follow.

Python code for the IP header:

IPv4 Header code.
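As an illustration of the fields listed earlier, the following minimal sketch unpacks the fixed 20-byte part of an IPv4 header. It is not necessarily the author's code: the function name parse_ipv4_header, the dictionary layout and the sample bytes are assumptions made for this example.

```python
import socket
import struct


def parse_ipv4_header(data):
    """Unpack the fixed 20-byte part of an IPv4 header into a dict."""
    (version_ihl, _tos, total_length, identification, flags_fragment,
     ttl, protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": version_ihl >> 4,                  # high 4 bits of byte 0
        "ihl": version_ihl & 0xF,                     # low 4 bits, in 32-bit words
        "total_length": total_length,
        "identification": identification,
        "flags": flags_fragment >> 13,                # top 3 bits
        "fragment_offset": flags_fragment & 0x1FFF,   # low 13 bits
        "ttl": ttl,
        "protocol": protocol,                         # e.g. 6 = TCP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Hand-built sample header; the checksum is left at 0 and is not verified.
sample = bytes.fromhex("45000028" "1c464000" "40060000" "c0a80001" "c0a800c7")
header = parse_ipv4_header(sample)
print(header["version"], header["ihl"], header["ttl"], header["protocol"])  # 4 5 64 6
print(header["src"], header["dst"])  # 192.168.0.1 192.168.0.199
```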

The smallest type that struct can unpack is 1 byte (8 bits). This is a problem for the version and header length fields, which are only 4 bits each, so we have to split the byte in half using bitwise operators. We make a bitwise shift to the right by 4 bits to get the IP version, and then mask off the version bits with AND 0xF to get the correct value for the length of the header. It looks like this:

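0100 | 0101: the byte 01000101 split into two 4-bit halves; shifting right by 4 bits keeps 0100 = 4 (the IP version)

01000101 AND 00001111 = 0101 = 5 words of 32 bits = 20 bytes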
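The same arithmetic can be checked directly in Python. This is a small worked example; the variable names are chosen only for illustration.

```python
first_byte = 0b01000101  # 0x45, version and IHL packed into one byte

version = first_byte >> 4     # drop the low 4 bits, leaving 0100
ihl = first_byte & 0x0F       # keep only the low 4 bits, 0101
header_bytes = ihl * 4        # IHL counts 32-bit (4-byte) words

print(format(version, "04b"), version)  # 0100 4
print(format(ihl, "04b"), ihl)          # 0101 5
print(header_bytes)                     # 20
```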

The source code can be found at https://github.com/Magnum34/Sniffer/blob/master/sniffer2.py.

References:

  1. https://github.com/torvalds/linux/blob/master/include/linux/socket.h
  2. https://github.com/spotify/linux/blob/master/include/linux/if_ether.h
  3. https://www.geeksforgeeks.org/computer-network-ethernet-frame-format/


Cycle about HTTP part 4 was originally published in Qunabu Interactive on Medium.