Internal structure of a yum repository

Posted by Dawg on Thu, 03 Mar 2022 17:51:49 +0100


This article is a translation. The source of the original article is:


By inspecting the metadata (a set of index files) of a yum repository, this article takes a close look at the repository's internal structure. I will introduce each index file and show how users can inspect the metadata.

What is a yum repository

A yum repository is a collection of rpm packages plus metadata that the yum command-line tool can read. With a yum repository you can install, remove and upgrade a single package or a group of packages. Yum repositories are essential for storing, managing and delivering software.

Create a yum repository with createrepo

Before we look more closely at repository metadata, let's see how to set up a repository with the open source command-line tool createrepo. On CentOS or Red Hat, createrepo can be installed by running the following command:

$ sudo yum install createrepo

The simplest way to use createrepo is to pass it a single argument: the directory in which the yum repository should be created.

Suppose you have several rpm packages in the current directory. You can run this command to generate a yum repository:

$ createrepo .

This creates a directory named 'repodata', which contains the metadata files discussed in detail below.

Translator's note: below is the directory structure before and after I generated a repository locally.

$ tree
└── packages
    ├── kernel-5.16.12-100.fc34.x86_64.rpm
    └── kernel-core-5.16.12-100.fc34.x86_64.rpm
$ createrepo .
Directory walk started
Directory walk done - 2 packages
Temporary output repo path: ./.repodata/
Preparing sqlite DBs
Pool started (with 5 workers)
Pool finished

$ tree .
├── packages
│   ├── kernel-5.16.12-100.fc34.x86_64.rpm
│   └── kernel-core-5.16.12-100.fc34.x86_64.rpm
└── repodata
    ├── 47e2ab97092b91278fbac5f8dce24ea45224cb64a7a25986bc5af70b52775878-primary.xml.gz
    ├── 59fd4217fb82d5b775dcb87a12cddfe886378471eedc8ac65237fe3f24724326-primary.sqlite.bz2
    ├── 71d725cef5607f7e572716bf26703c8f34d2f4b42d269e7c0fa6fa9beb955404-filelists.xml.gz
    ├── 7223cc42ffbdc229f9c3f540d57662c8e88d8b3c0dcd989e394af344ef48ba2c-filelists.sqlite.bz2
    ├── 75f75cca763ed7988c4307a71a9f309318866b77927df5baa77f374315182c8a-other.xml.gz
    ├── d68d8e69fc4f554a71086fc9ab9edde61b689b1704825d942c0b057486f1ca18-other.sqlite.bz2
    └── repomd.xml

You can sign the repository's metadata with GPG, which assures users that the metadata was generated by you. Signing the metadata with GPG is different from signing the packages themselves with rpm or rpmsign.

$ gpg --detach-sign --armor myrepo/repodata/repomd.xml

For more on using GPG with rpm packages and yum repositories, see "HOWTO: GPG sign and verify RPM packages and yum repositories". If you want others to access your yum repository, you need to set up Apache, nginx or another web server and point it at the base directory of the repository. Using an SSL certificate is recommended so that package data is transferred securely.

Translator's note: the base directory of the repository is the baseurl set in /etc/yum.repos.d/xxx.repo, that is, the URL of the yum repository. The repodata directory sits directly under this base directory.
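To make the relationship between baseurl and the base directory concrete, here is a minimal sketch of a .repo file. The repository id myrepo, the host repo.example.com and the key URL are placeholders invented for illustration, not values from the original article. The sketch writes the file to a scratch directory instead of /etc/yum.repos.d so that running it changes nothing on the system.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch location; a real file would live in /etc/yum.repos.d/.
outdir=$(mktemp -d)
trap 'rm -rf "$outdir"' EXIT
repo_file="$outdir/myrepo.repo"   # hypothetical file name

cat > "$repo_file" <<'EOF'
[myrepo]
name=My example repository
# Base directory of the repository; repodata/ must sit directly beneath it.
baseurl=https://repo.example.com/myrepo/el7/x86_64/
enabled=1
# gpgcheck covers package signatures, repo_gpgcheck covers repomd.xml.asc.
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://repo.example.com/myrepo/gpg.key
EOF

cat "$repo_file"
```

With such a file in place, yum would look for the index at the baseurl followed by repodata/repomd.xml.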

Of course, using packagecloud is a faster and simpler solution: SSL, GPG, authentication, collaboration and the other things you need are ready to use.

Metadata of a yum repository

The metadata of a yum repository consists of a series of XML files that contain checksums of the other metadata files and of the packages they reference.

The common metadata files in a yum repository include:

  • repomd.xml: essentially an index that contains the location, checksum and timestamp of the other XML metadata files.
  • repomd.xml.asc: this file is generated only when repomd.xml is signed with GPG. A signing example is shown above.
  • primary.xml.gz: contains the details of each package in the repository, such as name, version, license, dependency information, timestamps, size and more.
  • filelists.xml.gz: contains information about every file and directory in each package in the repository.
  • other.xml.gz: contains the changelog entries from the rpm spec file of each package in the repository.

There are other files, but they are not widely used. The files listed above are enough for our purposes.

Translator's note: in fact, besides the XML files, current yum repositories also ship the metadata in the form of .sqlite.bz2 files.
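Since repomd.xml is the index for everything else, a small sketch of reading it may help. The repomd.xml below is a hand-written, abbreviated sample with placeholder checksums (aaaa, bbbb, cccc), and list_metadata is a helper name made up for this example. It assumes one element per line, which is how the sample is laid out; a proper XML parser would be more robust for real files.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Abbreviated, hand-written sample index (placeholder values).
cat > "$workdir/repomd.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <checksum type="sha256">aaaa</checksum>
    <location href="repodata/aaaa-primary.xml.gz"/>
  </data>
  <data type="filelists">
    <checksum type="sha256">bbbb</checksum>
    <location href="repodata/bbbb-filelists.xml.gz"/>
  </data>
  <data type="other">
    <checksum type="sha256">cccc</checksum>
    <location href="repodata/cccc-other.xml.gz"/>
  </data>
</repomd>
EOF

# list_metadata FILE: print "<type> <href>" for each <data> entry.
# Remembers the type from the <data> line and emits it with the next href.
list_metadata() {
  awk '
    /<data type=/     { split($0, a, "\""); type = a[2] }
    /<location href=/ { split($0, b, "\""); print type, b[2] }
  ' "$1"
}

list_metadata "$workdir/repomd.xml"
```

Each output line pairs a metadata type with its location relative to the repository's base directory.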

Generally, the metadata of a repository lives under the repodata path of the repository URL, that is, in the repodata directory on the repository server.

A common way to organize yum repositories is to store packages of the same architecture together. Splitting a repository by architecture reduces the amount of metadata that is served to clients and that has to be regenerated on each update.

Typical URLs of the repomd.xml files for the x86_64 and i386 architectures on packagecloud:


Most publicly accessible yum repositories follow a similar scheme, for example the official CentOS 7 repomd.xml file:


Translator's note: the CentOS 7 files may no longer be accessible. You can use CentOS 8 Stream instead.

Inspect and verify repository metadata

You can use a handful of command-line tools to inspect the metadata of a yum repository, compute checksums and verify GPG signatures.

We use the CentOS 7 repository, located at:

First, inspect the repomd.xml file

Inspect the repomd.xml file with curl. The locations and checksums of the other index files can be found in this file:

$ curl -Ls

(TIP: if you want to see more details or debug, try curl's -Lv options: follow redirects and verbose output.)

Verify the GPG signature of repomd.xml

yum verifies the GPG signature of the repository automatically when repo_gpgcheck is set to 1 in yum's configuration, but you can also verify the signature manually.

If the repository is signed with GPG and you have imported its GPG public key, you can download repomd.xml and repomd.xml.asc and verify them with gpg --verify:

$ curl -Ls > repomd.xml
$ curl -Ls > repomd.xml.asc

$ gpg --verify repomd.xml.asc repomd.xml
gpg: Signature made Sun Oct 12 11:07:54 2014 PDT using RSA key ID 7AD95B3F
gpg: Good signature from "packagecloud ops (production key)

Inspect the primary.xml.gz metadata

Next, let's inspect the primary.xml.gz metadata. As mentioned above, this file contains the information about each package in the repository.

The location of this file is given in repomd.xml:

<data type="primary">
  <location href="repodata/primary.xml.gz"/>
  <checksum type="sha">6eb7ecc041f69a5ffeabdebcb466c443aa5e8028</checksum>
  <open-checksum type="sha">0b08c81e46081059cbe56d2f0871017ef8073d93</open-checksum>

Note: the location is not always this simple, because some repositories include the SHA or MD5 checksum of the file in the URL of primary.xml.gz.

This location is relative to the base directory of the repository.

Let's check that the SHA checksum of the file matches the one in repomd.xml. The checksum element in repomd.xml is the SHA checksum of the file as downloaded; open-checksum is the SHA checksum of the file after decompression.

$ curl -Ls | shasum
6eb7ecc041f69a5ffeabdebcb466c443aa5e8028  -

Good news: the checksum matches. We can use zless to view the decompressed file and browse it page by page:

$ curl -Ls | zless
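The difference between checksum and open-checksum is easy to see with a local file. The following is a sketch under a few assumptions: it uses a tiny stand-in file rather than a real primary.xml.gz, it uses sha1sum from coreutils (SHA-1 is also what shasum computes by default, and the 40-character values in the excerpt above are consistent with it), and the helper names check_sha1 and check_open_sha1 are invented here. In a real check the two expected values would be copied from repomd.xml; here they are computed from the stand-in so that the example runs offline.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Stand-in for a downloaded primary.xml.gz (hypothetical content).
printf '<metadata packages="0"/>\n' > "$workdir/primary.xml"
gzip -c "$workdir/primary.xml" > "$workdir/primary.xml.gz"

# Normally copied from <checksum> and <open-checksum> in repomd.xml.
expected_checksum=$(sha1sum "$workdir/primary.xml.gz" | awk '{print $1}')
expected_open_checksum=$(sha1sum "$workdir/primary.xml" | awk '{print $1}')

# check_sha1 FILE EXPECTED: succeed if the SHA-1 of FILE as stored matches.
check_sha1() {
  local actual
  actual=$(sha1sum "$1" | awk '{print $1}')
  [ "$actual" = "$2" ]
}

# check_open_sha1 FILE EXPECTED: succeed if the SHA-1 of the
# decompressed content of FILE matches.
check_open_sha1() {
  local actual
  actual=$(gzip -dc "$1" | sha1sum | awk '{print $1}')
  [ "$actual" = "$2" ]
}

check_sha1 "$workdir/primary.xml.gz" "$expected_checksum" && echo "checksum OK"
check_open_sha1 "$workdir/primary.xml.gz" "$expected_open_checksum" && echo "open-checksum OK"
```

The two expected values differ because one is taken over the compressed bytes and the other over the decompressed XML.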

A simple example of a package entry in primary.xml.gz:

<package type="rpm">
    <version epoch="87" rel="3.el6" ver="1.0"/>
    <checksum pkgid="YES" type="sha">ea721867eb0389e28bcd32e2deef7d4472c6ced8</checksum>
    <summary>jake douglas is a very nice young man.</summary>
    <description>as it so happens, jake douglas is a very nice young man.</description>
    <time build="1401650103" file="1413137269"/>
    <size archive="4536" installed="4280" package="3740"/>
    <location href="jake-1.0-3.el6.x86_64.rpm"/>

filelists.xml.gz and other.xml.gz

For these two files, repeat the steps used above for primary.xml.gz:

  1. Get the location and checksum of the metadata file from repomd.xml
  2. Use curl -Ls <URL> | shasum to verify that the checksums match
  3. Use curl -Ls <URL> | zless to inspect the file
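Steps 1 and 2 can be automated for every entry in repomd.xml; step 3 stays interactive. The sketch below is one possible way to do it, not the article's own tooling. To run offline it first builds a tiny stand-in repository in a temporary directory and points BASE_URL at it with a file:// URL; for a real repository you would set BASE_URL to its base URL instead. The helper names list_entries and verify_entries are invented, sha1sum stands in for shasum, and the awk parsing assumes one element per line in repomd.xml.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a stand-in repository (hypothetical content) so the sketch runs offline.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
mkdir -p "$repo/repodata"
for name in filelists other; do
  printf '<%s packages="0"/>\n' "$name" | gzip -c > "$repo/repodata/$name.xml.gz"
done
{
  echo '<repomd>'
  for name in filelists other; do
    sum=$(sha1sum "$repo/repodata/$name.xml.gz" | awk '{print $1}')
    echo "  <data type=\"$name\">"
    echo "    <checksum type=\"sha\">$sum</checksum>"
    echo "    <location href=\"repodata/$name.xml.gz\"/>"
    echo "  </data>"
  done
  echo '</repomd>'
} > "$repo/repodata/repomd.xml"

BASE_URL="file://$repo"   # placeholder; use the real repository URL here

# Step 1: print "<type> <checksum> <href>" for every entry in repomd.xml.
list_entries() {
  curl -Ls "$1/repodata/repomd.xml" | awk -F'[<>"]' '
    /<data type=/     { type = $3 }
    /<checksum /      { sum = $5 }
    /<location href=/ { print type, sum, $3 }
  '
}

# Step 2: download each file and compare its SHA-1 with the listed checksum.
verify_entries() {
  local type sum href actual
  list_entries "$1" | while read -r type sum href; do
    actual=$(curl -Ls "$1/$href" | sha1sum | awk '{print $1}')
    if [ "$actual" = "$sum" ]; then
      echo "$type: OK"
    else
      echo "$type: MISMATCH"
    fi
  done
}

verify_entries "$BASE_URL"
```

A MISMATCH line would point at a corrupted or stale metadata file, which is a reasonable starting point when debugging a repository.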


The metadata of a yum repository consists of a series of XML files, checksums and, in some cases, GPG signatures. The metadata describes which packages are in the repository, along with the many attributes, files and directories of each package and its changelog information.

The metadata of a yum repository can be inspected and verified manually with a few command-line tools: curl, less/zless, gpg and shasum. This is useful if you want to debug problems in your yum repository (missing packages, missing dependencies, incorrect versions and so on) or if you are curious about how some important parts of your operating system work.
