Overview
Image directory and file description
- /var/lib/docker/image/overlay2 directory
  - distribution directory
    - diffid-by-digest stores the mapping digest (layerID) -> diffID
    - v2metadata-by-diffid stores the mapping diffID -> (digest, repository)
    - The digest (layerID) is the hash of the layer as it is pulled, while the layer file is still compressed
    - The diffID is the layer hash shown by docker inspect, computed over the decompressed layer file
    - Both IDs identify the same image layer, but one hashes the compressed file and the other the decompressed file, so the two values differ
  - imagedb directory
    - Stores the image metadata
    - Two subdirectories: content/ and metadata/
    - tree /var/lib/docker/image/overlay2/imagedb shows the structure of this directory
    - The file named after the image ID in content/sha256 holds the image metadata: architecture, operating system, default configuration, creation time, history, rootfs, and so on
    - cat /var/lib/docker/image/overlay2/imagedb/content/sha256/(any image ID in the directory) | python -mjson.tool
    - The layer diffIDs in the image metadata are listed from the lowest layer to the highest
  - layerdb directory (chainID, cache ID)
    - Stores the relationships between image layers
    - Each directory under layerdb/sha256 is named after a layer's chainID, which is computed as follows:
      - If the layer is the lowest layer, with no parent, chainID = diffID
      - Otherwise, chainID(n) = sha256sum(chainID(n-1) + " " + diffID(n))
    - View: tree /var/lib/docker/image/overlay2/layerdb -L 2
    - To view the contents of a layer's chainID directory: ls /var/lib/docker/image/overlay2/layerdb/sha256/(any chainID)
      - parent: the chainID of the parent layer
      - size: the size of the layer
      - cache-id: the ID the storage driver uses to locate the layer's actual files
      - diff: the diffID of the layer (the hash of its decompressed content)
  - repositories.json file
    - A registry is a store of images, while a repository is a group of related images (for example, the different versions of the nginx image)
    - This file holds the repository metadata of all images on the host: image name, tag, and image ID
    - The image ID is computed by Docker as the SHA256 hash of the image's metadata configuration file
    - View: cat /var/lib/docker/image/overlay2/repositories.json | python -mjson.tool
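The digest and diffID notes above come down to hashing the same layer in two different states. The following is a minimal sketch of that idea, not Docker's own code: the `layer_tar` bytes are a made-up stand-in for a layer archive, and `sha256_id` is a hypothetical helper name.

```python
import gzip
import hashlib


def sha256_id(data: bytes) -> str:
    """Return a Docker-style content ID: 'sha256:' plus the hex digest."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Stand-in for an uncompressed layer archive (assumption: any bytes will do here).
layer_tar = b"example layer contents\n" * 100

# The form a registry would serve: the same bytes, gzip-compressed.
# mtime=0 keeps the compressed output reproducible between runs.
layer_tar_gz = gzip.compress(layer_tar, mtime=0)

diff_id = sha256_id(layer_tar)    # hash of the decompressed data (diffID)
digest = sha256_id(layer_tar_gz)  # hash of the compressed data (digest)

print("diffID:", diff_id)
print("digest:", digest)
print("decompressed data hashes back to diffID:",
      sha256_id(gzip.decompress(layer_tar_gz)) == diff_id)
```

The two IDs differ even though they describe the same layer, which is why Docker has to keep the diffid-by-digest and v2metadata-by-diffid mappings.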
Image
A Docker image is a read-only container template that contains the file system required to start a container. The file contents of the image, together with some configuration files of the running container, make up the file system environment in which the container runs: the rootfs.
After pulling an nginx image with docker pull, you can find its related information under Docker's working directory **/var/lib/docker/image/overlay2** (the overlay2 in the path is the storage driver currently used by Docker, and we will explain this technology in detail later):
[root@localhost ~]# docker images
REPOSITORY   TAG      IMAGE ID       CREATED       SIZE
nginx        latest   f6d0b4767a6c   3 weeks ago   133MB
[root@localhost ~]# ls /var/lib/docker/image/overlay2
distribution  imagedb  layerdb  repositories.json
Repository
In Docker's image management system, a registry is a store of images, such as the official Docker Hub. A repository is a group of related images, that is, the collection of different versions of one image. The repositories.json file describes the repository metadata of all images on the host, mainly the image name, tag, and image ID. The image ID is the SHA256 hash that Docker computes over the image's metadata configuration file.
[root@localhost ~]# cat /var/lib/docker/image/overlay2/repositories.json | python -mjson.tool
{
    "Repositories": {
        "nginx": {
            "nginx:latest": "sha256:f6d0b4767a6c466c178bf718f99bea0d3742b26679081e52dbf8e0c7c4c42d74",
            "nginx@sha256:10b8cc432d56da8b61b070f4c7d2543a9ed17c2b23010b43af434fd40e2ca4aa": "sha256:f6d0b4767a6c466c178bf718f99bea0d3742b26679081e52dbf8e0c7c4c42d74"
        }
    }
}
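To make the structure of repositories.json concrete, here is a small sketch that walks the sample shown above and lists every reference with the image ID it resolves to. The function name `list_references` is hypothetical, and the JSON is embedded as a string so the example runs without a Docker host.

```python
import json

# Sample copied from the repositories.json output shown above.
REPOSITORIES_JSON = """
{
    "Repositories": {
        "nginx": {
            "nginx:latest": "sha256:f6d0b4767a6c466c178bf718f99bea0d3742b26679081e52dbf8e0c7c4c42d74",
            "nginx@sha256:10b8cc432d56da8b61b070f4c7d2543a9ed17c2b23010b43af434fd40e2ca4aa": "sha256:f6d0b4767a6c466c178bf718f99bea0d3742b26679081e52dbf8e0c7c4c42d74"
        }
    }
}
"""


def list_references(doc: dict) -> list[tuple[str, str, str]]:
    """Return (repository, reference, image_id) triples from a parsed file."""
    rows = []
    for repo, refs in doc["Repositories"].items():
        for ref, image_id in refs.items():
            rows.append((repo, ref, image_id))
    return rows


rows = list_references(json.loads(REPOSITORIES_JSON))
for repo, ref, image_id in rows:
    kind = "digest" if "@" in ref else "tag"
    # Characters 7..19 skip the "sha256:" prefix and keep the 12-character short ID.
    print(f"{repo}  {kind:6}  {image_id[7:19]}  {ref}")
```

Both the tag reference and the digest reference point at the same image ID, and its first 12 hex characters are the IMAGE ID column printed by docker images.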
Image
The metadata information of the image is saved in the imagedb directory, and its file structure is as follows:
[root@localhost ~]# tree /var/lib/docker/image/overlay2/imagedb
/var/lib/docker/image/overlay2/imagedb
├── content
│   └── sha256
│       └── f6d0b4767a6c466c178bf718f99bea0d3742b26679081e52dbf8e0c7c4c42d74
└── metadata
    └── sha256

4 directories, 1 file
The file named after the image ID in the content/sha256 directory holds the metadata of the image, including the architecture, operating system, default configuration, creation time, history, rootfs, and so on:
[root@localhost ~]# cat /var/lib/docker/image/overlay2/imagedb/content/sha256/f6d0b4* | python -mjson.tool
{
    "architecture": "amd64",
    ...
    "created": "2021-01-12T10:17:41.649267496Z",
    "docker_version": "19.03.12",
    ...
    "os": "linux",
    "rootfs": {
        "diff_ids": [
            "sha256:cb42413394c4059335228c137fe884ff3ab8946a014014309676c25e3ac86864",
            "sha256:1c91bf69a08b515a1f9c36893d01bd3123d896b38b082e7c21b4b7cc7023525a",
            "sha256:56bc37de0858bc2a5c94db9d69b85b4ded4e0d03684bb44da77e0fe93a829292",
            "sha256:3e5288f7a70f526d6bceb54b3568d13c72952936cebfe28ddcb3386fe3a236ba",
            "sha256:85fcec7ef3efbf3b4e76a0f5fb8ea14eca6a6c7cbc0c52a1d401ad5548a29ba5"
        ],
        "type": "layers"
    }
}
The part of this metadata that matters most here is rootfs. As mentioned above, rootfs is the file system environment in which the container runs, and the metadata shows that it is composed of multiple layers. The metadata records the diffIDs of these layers, each of which Docker computes as the SHA256 hash of the layer's content. This has two advantages: the integrity of a layer can be checked against its diffID, and layers with the same diffID can be shared by different images.
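As a rough illustration of content addressing, the sketch below hashes the raw bytes of an image configuration to obtain an image ID and reads the layer diffIDs from rootfs. The configuration used here is an abbreviated stand-in (an assumption), so its hash will not equal the real f6d0b4... ID; `image_id` and `layer_diff_ids` are hypothetical helper names.

```python
import hashlib
import json

# Abbreviated stand-in for an image config file. Real files hold many more
# fields, and the ID depends on the exact bytes stored on disk.
config_bytes = json.dumps({
    "architecture": "amd64",
    "os": "linux",
    "rootfs": {
        "type": "layers",
        "diff_ids": [
            "sha256:cb42413394c4059335228c137fe884ff3ab8946a014014309676c25e3ac86864",
            "sha256:1c91bf69a08b515a1f9c36893d01bd3123d896b38b082e7c21b4b7cc7023525a",
        ],
    },
}).encode("utf-8")


def image_id(raw_config: bytes) -> str:
    """Hash the raw config bytes; the result plays the role of the image ID."""
    return "sha256:" + hashlib.sha256(raw_config).hexdigest()


def layer_diff_ids(raw_config: bytes) -> list[str]:
    """Return the rootfs diffIDs, ordered from the lowest layer to the highest."""
    return json.loads(raw_config)["rootfs"]["diff_ids"]


print("image ID:", image_id(config_bytes))
for index, diff_id in enumerate(layer_diff_ids(config_bytes)):
    print(f"layer {index}: {diff_id}")
```

On a Docker host, applying the same hash to a file under imagedb/content/sha256 should reproduce that file's name, which is what makes the store content-addressable.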
Layer
Each directory under layerdb/sha256 is named after a layer's chainID, which is computed as follows:
- If the layer is the lowest layer, with no parent, chainID = diffID;
- Otherwise, chainID(n) = sha256sum(chainID(n-1) + " " + diffID(n)), that is, the SHA256 hash of the parent's chainID and the layer's own diffID joined by a single space.
[root@localhost ~]# tree /var/lib/docker/image/overlay2/layerdb -L 2
/var/lib/docker/image/overlay2/layerdb
├── sha256
│   ├── 3c90a0917c79b758d74b7040f62d17a7680cd14077f734330b1994a2985283b8
│   ├── 4dfe71c4470c5920135f00af483556b09911b72547113512d36dc29bfc5f7445
│   ├── a1c538085c6f891424160d8db120ea093d4dda393e94cd4713e3fff3c82299b5
│   ├── a3ee2510dcf02c980d7aff635909612006fd1662084d6225e52e769b984abeb5
│   └── cb42413394c4059335228c137fe884ff3ab8946a014014309676c25e3ac86864
└── tmp
We can see that only layer cb42413394c* has the same diffID and chainID, because it is the bottom layer of the image. In fact, the diffIDs in the image metadata are listed from the lowest layer to the highest, so we can compute the chainID of the second-lowest layer with the formula:
[root@localhost ~]# echo -n "sha256:cb42413394c4059335228c137fe884ff3ab8946a014014309676c25e3ac86864 sha256:1c91bf69a08b515a1f9c36893d01bd3123d896b38b082e7c21b4b7cc7023525a" | sha256sum -
a3ee2510dcf02c980d7aff635909612006fd1662084d6225e52e769b984abeb5  -
[root@localhost ~]# ls /var/lib/docker/image/overlay2/layerdb/sha256 | grep a3ee2510dcf0*
a3ee2510dcf02c980d7aff635909612006fd1662084d6225e52e769b984abeb5
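The same calculation can be applied to the whole layer stack. The sketch below is one way to write it, assuming the chainID formula given above; `chain_ids` is a hypothetical helper name, and the diffIDs are the ones listed in the image metadata.

```python
import hashlib


def chain_ids(diff_ids: list[str]) -> list[str]:
    """Compute the chainID of every layer from diffIDs ordered bottom to top."""
    chain: list[str] = []
    for diff_id in diff_ids:
        if not chain:
            # The lowest layer has no parent, so its chainID equals its diffID.
            chain.append(diff_id)
        else:
            # chainID(n) = sha256(chainID(n-1) + " " + diffID(n))
            data = f"{chain[-1]} {diff_id}".encode("ascii")
            chain.append("sha256:" + hashlib.sha256(data).hexdigest())
    return chain


# diffIDs of the nginx image, in the order recorded in rootfs.diff_ids.
diff_ids = [
    "sha256:cb42413394c4059335228c137fe884ff3ab8946a014014309676c25e3ac86864",
    "sha256:1c91bf69a08b515a1f9c36893d01bd3123d896b38b082e7c21b4b7cc7023525a",
    "sha256:56bc37de0858bc2a5c94db9d69b85b4ded4e0d03684bb44da77e0fe93a829292",
    "sha256:3e5288f7a70f526d6bceb54b3568d13c72952936cebfe28ddcb3386fe3a236ba",
    "sha256:85fcec7ef3efbf3b4e76a0f5fb8ea14eca6a6c7cbc0c52a1d401ad5548a29ba5",
]

for index, chain_id in enumerate(chain_ids(diff_ids)):
    print(f"layer {index}: {chain_id}")
```

The second entry hashes exactly the string passed to sha256sum above, so it should match that output, and the remaining entries can be compared with the directory names under layerdb/sha256.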
The metadata of each image layer is stored in the directory named after the layer's chainID:
[root@localhost ~]# ls /var/lib/docker/image/overlay2/layerdb/sha256/a3ee2510dcf02c980d7*
cache-id  diff  parent  size  tar-split.json.gz
- parent: the chainID of the parent layer
- size: the size of the layer, in bytes
- cache-id: the ID the storage driver uses to locate the layer's actual files
[root@localhost ~]# cat /var/lib/docker/image/overlay2/layerdb/sha256/a3ee2510dcf02c980d7*/parent
sha256:cb42413394c4059335228c137fe884ff3ab8946a014014309676c25e3ac86864
[root@localhost ~]# cat /var/lib/docker/image/overlay2/layerdb/sha256/a3ee2510dcf02c980d7*/size
63704232
[root@localhost ~]# cat /var/lib/docker/image/overlay2/layerdb/sha256/a3ee2510dcf02c980d7*/cache-id
0363fcae3b4410c394b8a99e0a24d1ec01eb5198c82d3422f9c411ceaad98286
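These small files are easy to read programmatically. The following sketch reads parent, size, and cache-id from one layer directory and derives where the overlay2 driver keeps the layer's files. `read_layer_metadata` is a hypothetical helper; treating a missing file as optional is an assumption, and the demo writes the values shown above into a temporary directory so it runs without a Docker host.

```python
import tempfile
from pathlib import Path, PurePosixPath

OVERLAY2_ROOT = PurePosixPath("/var/lib/docker/overlay2")


def read_layer_metadata(layer_dir: Path) -> dict:
    """Read the metadata files of one layerdb/sha256/<chainID> directory."""

    def read(name: str):
        path = layer_dir / name
        # Assumption: a file may be absent (for example, no parent file).
        return path.read_text().strip() if path.exists() else None

    size = read("size")
    cache_id = read("cache-id")
    return {
        "parent": read("parent"),
        "size": int(size) if size else None,
        "cache_id": cache_id,
        # The layer's files live in the diff directory named after the cache ID.
        "diff_dir": OVERLAY2_ROOT / cache_id / "diff" if cache_id else None,
    }


# Demo: recreate the three files shown above in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    layer_dir = Path(tmp)
    (layer_dir / "parent").write_text(
        "sha256:cb42413394c4059335228c137fe884ff3ab8946a014014309676c25e3ac86864")
    (layer_dir / "size").write_text("63704232")
    (layer_dir / "cache-id").write_text(
        "0363fcae3b4410c394b8a99e0a24d1ec01eb5198c82d3422f9c411ceaad98286")
    meta = read_layer_metadata(layer_dir)

for key, value in meta.items():
    print(f"{key}: {value}")
```

On a real host, the function would be pointed at a directory under /var/lib/docker/image/overlay2/layerdb/sha256, which normally requires root privileges.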
If we start a container, we can see that Docker creates a new mounts directory under the layerdb directory:
[root@localhost ~]# docker run -d --name=nginx -v /home:/home nginx
45f30cb6a063a7251db4388f17f85c1226d96277cb74693c1f38bef1d17b6193
[root@localhost ~]# tree /var/lib/docker/image/overlay2/layerdb -L 2
/var/lib/docker/image/overlay2/layerdb
├── mounts
│   └── 45f30cb6a063a7251db4388f17f85c1226d96277cb74693c1f38bef1d17b6193
├── sha256
│   ├── 3c90a0917c79b758d74b7040f62d17a7680cd14077f734330b1994a2985283b8
│   ├── 4dfe71c4470c5920135f00af483556b09911b72547113512d36dc29bfc5f7445
│   ├── a1c538085c6f891424160d8db120ea093d4dda393e94cd4713e3fff3c82299b5
│   ├── a3ee2510dcf02c980d7aff635909612006fd1662084d6225e52e769b984abeb5
│   └── cb42413394c4059335228c137fe884ff3ab8946a014014309676c25e3ac86864
└── tmp

9 directories, 0 files
The mounts directory contains a directory named after the container ID, which records the metadata of the container layer:
[root@localhost ~]# ls /var/lib/docker/image/overlay2/layerdb/mounts/45f30cb6a063*
init-id  mount-id  parent
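The contents of these files are not shown here, so the following is only a sketch under stated assumptions: that mount-id holds the name of the container's read-write layer directory under /var/lib/docker/overlay2, and that init-id is the same name with an -init suffix. The ID used is the one that appears in the mount output later in this article, and `container_layer_paths` is a hypothetical helper name.

```python
from pathlib import PurePosixPath

OVERLAY2_ROOT = PurePosixPath("/var/lib/docker/overlay2")


def container_layer_paths(mount_id: str) -> dict[str, PurePosixPath]:
    """Derive a container's overlay2 directories from its mount-id.

    Assumption: the init layer directory is named '<mount-id>-init'.
    """
    layer_dir = OVERLAY2_ROOT / mount_id
    return {
        "init": OVERLAY2_ROOT / f"{mount_id}-init" / "diff",
        "upper": layer_dir / "diff",
        "work": layer_dir / "work",
        "merged": layer_dir / "merged",
    }


# ID taken from the mount output shown later in this article.
mount_id = "dabd31fb6ad636b16b6f01f2332d068888de1e3e41a53751a35206e266b5dad4"
paths = container_layer_paths(mount_id)
for name, path in paths.items():
    print(f"{name:7}{path}")
```

If the assumptions hold, the printed upper and work paths are the upperdir and workdir values that the mount command reports for the running container.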
So what is a container?
Container
From the exploration above, we know that an image consists of multiple layers and, when a container is started, becomes the read-only rootfs of the container's file system environment. A container is in effect the result of Docker using the storage driver to mount a read-write layer on top of that read-only rootfs.
Union mount
Union mounting can mount multiple file systems at one mount point at the same time and merge the original directory of the mount point with the mounted content, so that the file system finally visible contains the files and directories of every layer.
overlay2 is the union mount storage driver currently used by Docker. It works with four kinds of directories:
- lower: the underlying file system. For Docker, these are the read-only image layers;
- upper: the upper file system. For Docker, this is the readable and writable container layer;
- merged: the union mount point, which acts as a unified view. For Docker, this is the file system as the user sees it;
- work: a scratch directory that the overlay file system uses internally.
We start a container and observe how overlay2 is mounted on the system:
[root@localhost ~]# docker run -d --name=nginx -v /home:/home nginx
45f30cb6a063a7251db4388f17f85c1226d96277cb74693c1f38bef1d17b6193
[root@localhost ~]# mount -t overlay
overlay on /var/lib/docker/overlay2/dabd31fb6ad636b16b6f01f2332d068888de1e3e41a53751a35206e266b5dad4/merged type overlay (rw,relatime,seclabel,lowerdir=/var/lib/docker/overlay2/l/U7ZXQ4ZL7TLD6XEBUVLR77LKS4:/var/lib/docker/overlay2/l/BXVB3Q7277EPHJEMMHKEOR6YS5:/var/lib/docker/overlay2/l/5K76AX5UNX35LFXZKLATNWHOIK:/var/lib/docker/overlay2/l/QTGJLTLBMEML5OGHHZUYM3GHTR:/var/lib/docker/overlay2/l/55T5LSGE3C2NFFQ54FHM7YDNKY:/var/lib/docker/overlay2/l/R2AW2LUWRDIV7DLJFYMS67LB3L,upperdir=/var/lib/docker/overlay2/dabd31fb6ad636b16b6f01f2332d068888de1e3e41a53751a35206e266b5dad4/diff,workdir=/var/lib/docker/overlay2/dabd31fb6ad636b16b6f01f2332d068888de1e3e41a53751a35206e266b5dad4/work)
Following the paths in the mount output, we can look at the directories that overlay2 mounts:
[root@localhost ~]# ll /var/lib/docker/overlay2/l
total 0
lrwxrwxrwx. 1 root root 72 Feb  2 07:37 55T5LSGE3C2NFFQ54FHM7YDNKY -> ../0363fcae3b4410c394b8a99e0a24d1ec01eb5198c82d3422f9c411ceaad98286/diff
lrwxrwxrwx. 1 root root 72 Feb  2 07:37 5K76AX5UNX35LFXZKLATNWHOIK -> ../14faccdb685b520f09148aaafec45dd31d2e6aa96516b60deaf59228ae9dbe66/diff
lrwxrwxrwx. 1 root root 72 Feb  2 07:37 BXVB3Q7277EPHJEMMHKEOR6YS5 -> ../82a99d340893301cf33c79588042cd7d5db55cbcf34ca4b7a6ade609b6d28c96/diff
lrwxrwxrwx. 1 root root 72 Feb  2 10:02 M5NBJHQ4ICWANIEYBYMJ5FLCZO -> ../dabd31fb6ad636b16b6f01f2332d068888de1e3e41a53751a35206e266b5dad4/diff
lrwxrwxrwx. 1 root root 72 Feb  2 07:37 QTGJLTLBMEML5OGHHZUYM3GHTR -> ../5b75a3d9c3881dd8041cdc3fa73679a9f1e4f1052d4548e22b5f4e4ccce7c2d0/diff
lrwxrwxrwx. 1 root root 72 Feb  2 07:37 R2AW2LUWRDIV7DLJFYMS67LB3L -> ../bd47b78b55e951d504c6e70c0e7af451b020a5741f71ed0e91cb8f8dc77a8664/diff
lrwxrwxrwx. 1 root root 77 Feb  2 10:02 U7ZXQ4ZL7TLD6XEBUVLR77LKS4 -> ../dabd31fb6ad636b16b6f01f2332d068888de1e3e41a53751a35206e266b5dad4-init/diff
- All files in the /var/lib/docker/overlay2/l directory are symbolic links. Their short names keep the option string passed to the mount command from hitting the page size limit;
- The symbolic links in lowerdir, except U7ZXQ4ZL7TLD6XEBUVLR77LKS4, all point to the diff directories of the read-only image layers, corresponding to the five layers of the nginx image we downloaded;
- As mentioned above, the cache ID in the layer metadata locates the layer's actual files: they are in the diff directory of the directory named after the cache ID under /var/lib/docker/overlay2 (for the layer inspected earlier, /var/lib/docker/overlay2/0363fcae3b4410c394b8a99e0a24d1ec01eb5198c82d3422f9c411ceaad98286/diff);
- U7ZXQ4ZL7TLD6XEBUVLR77LKS4 points to an init layer, whose directory name is the container layer's ID with an -init suffix. It is generated automatically when the container is started, and its files configure the host name and DNS service of the container.
[root@localhost ~]# cd /var/lib/docker/overlay2/l
[root@localhost l]# ls ../bd47b78b55e951d504c6e70c0e7af451b020a5741f71ed0e91cb8f8dc77a8664/diff
bin  boot  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
[root@localhost l]# ls ../0363fcae3b4410c394b8a99e0a24d1ec01eb5198c82d3422f9c411ceaad98286/diff
docker-entrypoint.d  etc  lib  tmp  usr  var
[root@localhost l]# ls ../dabd31fb6ad636b16b6f01f2332d068888de1e3e41a53751a35206e266b5dad4-init/diff
dev  etc
[root@localhost l]# ls ../dabd31fb6ad636b16b6f01f2332d068888de1e3e41a53751a35206e266b5dad4-init/diff/etc
hostname  hosts  mtab  resolv.conf
...
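The option string printed by mount -t overlay earlier can be split into its layer directories with a few lines of code. This is a simplified sketch: `parse_overlay_options` is a hypothetical helper, and it assumes that no path contains a comma or colon, which holds for the paths in this article.

```python
def parse_overlay_options(options: str) -> dict:
    """Split an overlay mount option string into lowerdir, upperdir, workdir."""
    result = {"lowerdir": [], "upperdir": None, "workdir": None}
    for option in options.split(","):
        key, sep, value = option.partition("=")
        if not sep:
            continue  # plain flags such as rw, relatime, seclabel
        if key == "lowerdir":
            # Entries are separated by ':' and listed from the top layer down.
            result["lowerdir"] = value.split(":")
        elif key in ("upperdir", "workdir"):
            result[key] = value
    return result


# Option string rebuilt from the mount output shown earlier.
LINK_DIR = "/var/lib/docker/overlay2/l/"
CONTAINER_DIR = ("/var/lib/docker/overlay2/"
                 "dabd31fb6ad636b16b6f01f2332d068888de1e3e41a53751a35206e266b5dad4")
LINKS = [
    "U7ZXQ4ZL7TLD6XEBUVLR77LKS4",  # init layer
    "BXVB3Q7277EPHJEMMHKEOR6YS5",
    "5K76AX5UNX35LFXZKLATNWHOIK",
    "QTGJLTLBMEML5OGHHZUYM3GHTR",
    "55T5LSGE3C2NFFQ54FHM7YDNKY",
    "R2AW2LUWRDIV7DLJFYMS67LB3L",  # lowest image layer
]
options = (
    "rw,relatime,seclabel,lowerdir="
    + ":".join(LINK_DIR + name for name in LINKS)
    + f",upperdir={CONTAINER_DIR}/diff,workdir={CONTAINER_DIR}/work"
)

parsed = parse_overlay_options(options)
print("upperdir:", parsed["upperdir"])
print("workdir: ", parsed["workdir"])
for depth, path in enumerate(reversed(parsed["lowerdir"])):
    print(f"lower {depth} (bottom up): {path}")
```

Reversing lowerdir gives the bottom-up order: the last entry, R2AW2LUWRDIV7DLJFYMS67LB3L, links to the layer that contains the full base directory tree, while the first entry is the init layer.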
This makes it clearer that an image is the file system obtained by union-mounting multiple layers. The directory that M5NBJHQ4ICWANIEYBYMJ5FLCZO points to (the upperdir in the output of the mount -t overlay command) is the read-write layer, that is, the file system of the container layer:
[root@localhost l]# ls ../dabd31fb6ad636b16b6f01f2332d068888de1e3e41a53751a35206e266b5dad4/diff
etc  run  var
overlay2 union-mounts the upper and lower file systems to produce the file system that the user sees:
[root@localhost l]# ls ../dabd31fb6ad636b16b6f01f2332d068888de1e3e41a53751a35206e266b5dad4/merged
bin   dev                  docker-entrypoint.sh  home  lib64  mnt  proc  run  srv  tmp  var
boot  docker-entrypoint.d  etc
When users modify files in the container, only the read-write layer changes; the file systems of the lower read-only layers are never overwritten. The copy in the read-write layer hides the original version in the read-only layer, so the user sees a changed file system. If we use the docker commit command to generate a new image, only the files changed in the read-write layer are saved.
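The lookup rule described above can be modelled with plain dictionaries. This is a toy model and not how the overlay file system is implemented: it ignores directories and file deletion, and the file names and contents are made up for illustration.

```python
def merged_view(lower_layers: list[dict], upper: dict) -> dict:
    """Build the user-visible view; lower_layers are ordered bottom to top."""
    view: dict = {}
    for layer in lower_layers:
        view.update(layer)  # a higher read-only layer hides lower versions
    view.update(upper)      # the read-write layer hides all lower versions
    return view


# Hypothetical read-only image layers (bottom first) and an empty container layer.
lower = [
    {"/bin/sh": "shell", "/etc/nginx/nginx.conf": "default config"},
    {"/docker-entrypoint.sh": "entrypoint"},
]
upper: dict = {}

# A write inside the container lands only in the read-write layer.
upper["/etc/nginx/nginx.conf"] = "edited config"

view = merged_view(lower, upper)
print("container sees:   ", view["/etc/nginx/nginx.conf"])
print("image layer keeps:", lower[0]["/etc/nginx/nginx.conf"])
print("would be saved by commit:", sorted(upper))
```

The container sees the edited file, the image layer still holds the original, and the read-write layer contains only the one file that changed.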
Container information
You can see the container-related information in the **/var/lib/docker/containers/<container ID>** directory:
[root@localhost ~]# tree /var/lib/docker/containers
/var/lib/docker/containers
└── 45f30cb6a063a7251db4388f17f85c1226d96277cb74693c1f38bef1d17b6193
    ├── 45f30cb6a063a7251db4388f17f85c1226d96277cb74693c1f38bef1d17b6193-json.log
    ├── checkpoints
    ├── config.v2.json
    ├── hostconfig.json
    ├── hostname
    ├── hosts
    ├── mounts
    ├── resolv.conf
    └── resolv.conf.hash

3 directories, 7 files
The config.v2.json file in this directory describes the detailed configuration of the container, which is largely the same as what the **docker inspect** command shows for the container:
cat /var/lib/docker/containers/45f30cb6a063a7*/config.v2.json | python -mjson.tool
You can also see the host name and DNS configuration of the container in the hostname, hosts and resolv.conf files, which are consistent with the contents in the init layer file system.
[root@localhost ~]# cat /var/lib/docker/containers/45f30cb6a063a7*/hostname
45f30cb6a063
[root@localhost ~]# cat /var/lib/docker/containers/45f30cb6a063a7*/hosts
127.0.0.1       localhost
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.17.0.2      45f30cb6a063
[root@localhost ~]# cat /var/lib/docker/containers/45f30cb6a063a7*/resolv.conf
# Generated by NetworkManager
nameserver 192.168.1.1
nameserver 192.168.0.1
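As a small cross-check of the files above, the sketch below parses the hosts content and looks up the container's default host name, which is the first 12 characters of the container ID. `parse_hosts` is a hypothetical helper, and the hosts text is copied from the output above so the example runs anywhere.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each host name in hosts-file text to the first IP that lists it."""
    mapping: dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)
    return mapping


container_id = "45f30cb6a063a7251db4388f17f85c1226d96277cb74693c1f38bef1d17b6193"
hosts_text = """\
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.17.0.2 45f30cb6a063
"""

short_id = container_id[:12]  # default host name of the container
hosts = parse_hosts(hosts_text)
print(f"{short_id} -> {hosts.get(short_id)}")
```

The lookup returns the container's address on the Docker network, 172.17.0.2 in this case.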