Linux driver development: a detailed explanation of /dev, devfs, sysfs, udev and the proc file system, with an introduction to the kobject structure

Posted by Dennis1986 on Thu, 10 Feb 2022 13:27:18 +0100



What is a device manager?

The device manager is responsible for managing the peripherals attached to the computer. When we plug a keyboard or mouse into one of the computer's USB ports, the device manager communicates with it to determine what kind of device was inserted, and then loads the corresponding device driver.

All of this assumes that the device is a common one that the operating system's device manager supports. Suppose instead that we have an unknown device, or a hardware product we developed ourselves, such as a keyboard that does not use the general keyboard communication protocol. We create our own protocol, and the internal architecture and PCBs are all our own design. It is still a keyboard, but its protocol and hardware architecture differ from the standard, so the operating system's device manager cannot recognize it: the device manager has no driver that can communicate with it, and the device does not implement the standard protocol for reporting its device type, so the system does not know what it is.

This is where a driver comes in. The driver establishes communication with the device and passes its data to the operating system. On Windows, a configuration file is needed to tell the device manager which driver to use for which device. That is why a driver package on Windows generally consists of a configuration file and the driver itself: the configuration file gives the operating system the device description and tells it which driver to call when such a device appears. The driver parses the device's protocol and passes the resulting data to the operating system.

The driver is therefore an intermediate layer. It is invoked by the device manager, which handles the whole process from device identification to driver invocation. For this to work, the device should comply with the communication standard the operating system expects; otherwise the operating system will not recognize it, because modern operating systems follow these international standards.

The function of dev under Linux

In the earliest versions of Linux, before kernel 2.4, adding support for a peripheral was very troublesome: the hardware device and the code for interacting with it had to be added to the kernel itself.

At that time, if you had a new device that needed Linux support, you had to email the maintainers in the Linux kernel community and ask them to add it for you. Your device became available only after a new kernel containing it was released, at which point it appeared in the /dev directory.

Users could not add their own devices, and this led to another problem: an entry stayed in the /dev directory even when the device itself was not present. Over time, /dev filled up with entries for many unknown device types. Because the Linux kernel is open source you could maintain your own kernel version by hand, but the cost of doing so was too high.


To solve this /dev problem, Linux 2.4 introduced devfs, a device manager that manages the /dev directory. Many kernel library functions were added along with it, so that users can call a function to register their device with devfs. Devfs records the device in the kernel's table and then creates its entry in the /dev directory.

This solved a big problem and made the Linux kernel much more flexible, because the kernel now provides a library for driver writers. Users register their device by calling the corresponding functions and supply their own implementations of open, write and read. The approach is borrowed from Unix: the user layer does not care how the underlying code is implemented, and can open, write and read the hardware simply by calling open, write and read in user space. This dispatch goes through the VFS.
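The registration and dispatch idea described above can be sketched in user space as follows. This is a simplified model, not kernel code: demo_file_ops, demo_register_chrdev, demo_vfs_read and the hello driver are hypothetical names invented for the illustration, and the real kernel uses struct file_operations with richer signatures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct file_operations. */
struct demo_file_ops {
    int  (*open)(void);
    long (*read)(char *buf, size_t len);
    long (*write)(const char *buf, size_t len);
};

#define DEMO_MAX_MAJOR 256

/* One slot per major number; NULL means "no driver registered". */
static const struct demo_file_ops *demo_table[DEMO_MAX_MAJOR];

/* Register a driver's operations under a major number. Returns 0 or -1. */
int demo_register_chrdev(unsigned int major, const struct demo_file_ops *ops)
{
    if (major >= DEMO_MAX_MAJOR || ops == NULL || demo_table[major] != NULL)
        return -1;
    demo_table[major] = ops;
    return 0;
}

/* Generic read entry point: look up the driver, then call its read pointer. */
long demo_vfs_read(unsigned int major, char *buf, size_t len)
{
    if (major >= DEMO_MAX_MAJOR || demo_table[major] == NULL ||
        demo_table[major]->read == NULL)
        return -1;
    return demo_table[major]->read(buf, len);
}

/* A toy driver whose read always produces the bytes "hello". */
static long hello_read(char *buf, size_t len)
{
    static const char msg[] = "hello";
    size_t n = sizeof(msg) - 1;

    if (len < n)
        n = len;
    memcpy(buf, msg, n);
    return (long)n;
}

const struct demo_file_ops hello_ops = {
    .open = NULL,
    .read = hello_read,
    .write = NULL,
};
```

A caller would register hello_ops once with demo_register_chrdev(42, &hello_ops) and then read through demo_vfs_read(42, buf, sizeof(buf)) without knowing which driver sits behind major number 42.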

Before devfs, Linux ended up with many unknown device entries, which not only dragged down the kernel but also made Linux very hard to extend.

Devfs solves this problem very well.

Devfs is only responsible for managing devices and registering them into the VFS.

If you want to know how the VFS abstractly calls different open, write and read functions, please refer to my article: Linux embedded development: Detailed explanation of major and minor device numbers


Sysfs is a new file system introduced in Linux 2.6. Its mount point is the /sys directory, although when it was first introduced no mount directory was prescribed and it could be mounted anywhere. Sysfs presents devices very differently from devfs. Devfs, mounted at /dev, creates a file there for each device and gives it the corresponding permissions; in a listing, c means a character device file, b a block device file and s a socket file. You can run the ls command in the /dev directory to have a look.

In the listing, a leading c means that the entry is a character device. When such a file is opened with the open function, the module corresponding to the device is looked up in the VFS table through devfs, and the open, read and write function pointers in that module are called.

What sysfs exposes for a device is a directory. The directory contains many files, each holding one piece of information, and together these files describe the device. Sysfs does not register device nodes in the VFS file table the way /dev entries are registered; it only provides its own set of file operation functions, which are registered with the VFS for the files that belong to sysfs.

To discuss how sysfs is implemented, we have to mention one structure: kobject.

Here is some background on it.

The kernel defines this structure in include/linux/kobject.h:

struct kobject {
	const char		*name;
	struct list_head	entry;
	struct kobject		*parent;
	struct kset		*kset;
	struct kobj_type	*ktype;
	struct kernfs_node	*sd; /* sysfs directory entry */
	struct kref		kref;
#ifdef CONFIG_DEBUG_KOBJECT_RELEASE
	struct delayed_work	release;
#endif
	unsigned int state_initialized:1;
	unsigned int state_in_sysfs:1;
	unsigned int state_add_uevent_sent:1;
	unsigned int state_remove_uevent_sent:1;
	unsigned int uevent_suppress:1;
};

This structure is used to hold the descriptive information behind Linux device files. The Linux kernel describes its device model with four concepts: bus, class, device and driver. Each is a data structure, and together they form the Linux device model. I will write a separate article about them later; for now a rough idea of the concept is enough.

The kobject structure was created to make these four structures easier to manage. A kobject is embedded in each of them, and all levels of the hierarchy are linked through the parent pointer. Through this structure you can find the associated directory under /sys.

It also implements reference counting. Each time the object is referenced the count is incremented by 1, and each time a reference is dropped it is decremented by 1. When the count reaches 0, the Linux kernel releases the device model object. Reference counting lets many users share a single instance; creating a separate instance for every user of a driver's device would be too wasteful.
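The get/put behaviour described above can be sketched in user space like this. It is only an illustration of the idea: demo_ref and its functions are hypothetical names, the counter is a plain int, and the kernel's struct kref uses atomic operations instead.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kref-style counter; not the kernel API. */
struct demo_ref {
    int count;
    void (*release)(struct demo_ref *ref);
};

/* A new object starts with one reference held by its creator. */
void demo_ref_init(struct demo_ref *ref, void (*release)(struct demo_ref *))
{
    ref->count = 1;
    ref->release = release;
}

/* Take an additional reference. */
void demo_ref_get(struct demo_ref *ref)
{
    ref->count++;
}

/*
 * Drop one reference. When the count reaches 0 the release callback
 * runs. Returns 1 if the object was released, 0 otherwise.
 */
int demo_ref_put(struct demo_ref *ref)
{
    if (--ref->count > 0)
        return 0;
    if (ref->release != NULL)
        ref->release(ref);
    return 1;
}

/* Example release callback that only counts how often it was called. */
static int demo_release_calls;

static void demo_count_release(struct demo_ref *ref)
{
    (void)ref;
    demo_release_calls++;
}
```

With two holders, the first demo_ref_put only decrements the count, and the second one triggers the release callback. This sketch is not thread-safe; it is meant to show the counting rule only.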

The structure is also tailored to sysfs.

Here is a brief description of the role of members in the kobject structure:


name: The name of the kobject, which is also the name of its directory in sysfs. Because the kobject is registered in sysfs under this name when it is added to the kernel, the field must not be modified directly afterwards. To rename a kobject, call the kobject_rename interface, which takes care of the sysfs side.

entry: The list_head used to link the kobject into its kset.

parent: Points to the parent kobject, forming a hierarchy (shown in sysfs as a directory structure).

kset: The kset this kobject belongs to. May be NULL. If it is set and no parent is specified, the kset is used as the parent (remember that a kset is a special kobject).

ktype: The kobj_type this kobject belongs to. Every kobject must have a ktype, otherwise the kernel reports an error.

sd: The representation of this kobject in sysfs.

kref: A "struct kref" (defined in include/linux/kref.h), a reference count that supports atomic operations.

state_initialized: Indicates whether this kobject has been initialized, so that the init, put and add operations can check for misuse.

state_in_sysfs: Indicates whether this kobject is already present in sysfs, so that it can be removed from sysfs automatically when it is unregistered.

state_add_uevent_sent/state_remove_uevent_sent: Record whether the ADD uevent and the REMOVE uevent have been sent to user space. If ADD was sent and REMOVE was not, the REMOVE uevent is sent automatically at unregistration so that user space can handle it correctly.

uevent_suppress: If this field is 1, all uevents for this kobject are suppressed. These events are the ones delivered to udev.

You can see that many of these fields exist for sysfs. They were not there at first: when Linux 2.6 introduced sysfs, kobject was modified to support the sysfs file system.

This is because the main job of kobject in the kernel is to describe the device model that is exposed through sysfs.

In kernels from 2.6 onward, these entries are included when the kernel traverses the file lists in the VFS to look up an implementation.

Let's take a look at the definitions of the structures that these members refer to:


/**
 * struct kset - a set of kobjects of a specific type, belonging to a specific subsystem.
 *
 * A kset defines a group of kobjects.  They can be individually
 * different "types" but overall these kobjects all want to be grouped
 * together and operated on in the same manner.  ksets are used to
 * define the attribute callbacks and other common events that happen to
 * a kobject.
 *
 * @list: the list of all kobjects for this kset
 * @list_lock: a lock for iterating over the kobjects
 * @kobj: the embedded kobject for this kset (recursion, isn't it fun...)
 * @uevent_ops: the set of uevent operations for this kset.  These are
 * called whenever a kobject has something happen to it so that the kset
 * can add new environment variables, or filter out the uevents if so
 * desired.
 */
struct kset {
	struct list_head list;
	spinlock_t list_lock;
	struct kobject kobj;
	const struct kset_uevent_ops *uevent_ops;
};

Member descriptions:

list/list_lock: The linked list of all kobjects under this kset, and the lock that protects it.

kobj: The kset's own kobject (a kset is a special kobject and also appears in sysfs as a directory).

uevent_ops: The set of uevent operations of this kset. When any kobject needs to report a uevent, it calls the uevent_ops of the kset it belongs to, which can add environment variables or filter events (the kset decides which events may be reported). As a result, a kobject that does not belong to any kset cannot send uevents.


struct kobj_type {
	void (*release)(struct kobject *kobj);
	const struct sysfs_ops *sysfs_ops;
	struct attribute **default_attrs;
	const struct kobj_ns_type_operations *(*child_ns_type)(struct kobject *kobj);
	const void *(*namespace)(struct kobject *kobj);
};

Member descriptions:

release: This callback frees the memory of the data structure that contains a kobject of this type.

sysfs_ops: The sysfs file system interface for kobjects of this type.

default_attrs: The attribute list for kobjects of this type (an attribute is a file in the sysfs file system). The attributes are registered in sysfs when the kobject is added to the kernel.

child_ns_type/namespace: Related to the namespaces of the file system (sysfs).

As you can see, the Linux kernel abstracts many general types into common structures. When traversing the VFS, it looks up the general type and walks these structures to find the corresponding implementation of the VFS open, write and read.

The release callback in kobj_type frees the memory of the current object. It is called only when the reference count reaches 0; otherwise a put only decrements the count.

For driver development, the kernel provides the following functions for working with this model:

void kobject_init(struct kobject *kobj, struct kobj_type *ktype);
int kobject_add(struct kobject *kobj, struct kobject *parent, const char *fmt, ...);
int kobject_init_and_add(struct kobject *kobj, struct kobj_type *ktype, struct kobject *parent, const char *fmt, ...);
struct kobject *kobject_create(void);
struct kobject *kobject_create_and_add(const char *name, struct kobject *parent);
int kobject_set_name(struct kobject *kobj, const char *fmt, ...); /* set the kobject name */

This is only meant to show the relationship between kobject and sysfs. Sysfs depends on the kobject structure: it exposes the information held in kobjects, which the kernel maintains.

As mentioned above, sysfs is mounted at the /sys directory. Let's open the /sys directory and take a look:

These directories correspond to different types of devices:

devices: This directory holds the global device hierarchy. It contains every discovered physical device registered on the various buses. In general, physical devices are shown according to their topology on the bus, with two exceptions: platform devices and system devices.

dev: This directory holds entries named by major and minor device number (major:minor), divided into character devices and block devices. Each entry is a symbolic link to the real device under /sys/devices.

class: This directory contains all device classes known to the kernel, that is, the device model grouped by device function. Each class represents devices with one function, and each class subdirectory contains symbolic links to the specific devices of that class; the links point to the specific devices under /sys/devices/. Classes and devices do not correspond one to one: a physical device may belong to several classes, while a class represents only one function. For example, all input devices in the system appear under /sys/class/input, regardless of which bus connects them to the system.

block: The subdirectories of this directory represent all block devices currently found in the system. By function it would fit better under /sys/class, but for historical reasons it has always lived at /sys/block. Starting with Linux 2.6.22 it was marked as deprecated, and the directory exists only when the kernel is compiled with CONFIG_SYSFS_DEPRECATED enabled. Since Linux 2.6.26 its contents have officially moved to /sys/class/block; the old /sys/block interface is kept for backward compatibility, but its entries are now symbolic links to the real devices under /sys/devices/.

bus: Each subdirectory of this directory is a bus type that the kernel supports and has registered. This is the view in which kernel devices are arranged hierarchically by bus type. Every device under /sys/devices is attached to some bus, and a symbolic link to each device can be found under its specific bus in the bus directory. In general, each bus type subdirectory contains two subdirectories, devices and drivers. The devices subdirectory holds all devices of this bus type, as symbolic links that point to the real devices under /sys/devices/. The drivers subdirectory holds all drivers registered on this bus; each driver subdirectory contains driver parameters that can be inspected and modified.

fs: By design, this directory describes all file systems in the system, including the file systems themselves and their mount points, grouped by file system type.

kernel: This directory holds all adjustable parameters of the kernel.

firmware: This is the user-space interface of the system's firmware loading mechanism. The kernel has a dedicated API for firmware loading; the book LDD3 gives a more detailed introduction to the firmware loading mechanism supported by the kernel.

module: This directory contains information about all modules in the system. Whether a module is built in (compiled into the kernel image) or external (a .ko file), it can appear under /sys/module. In other words, the module directory contains all loaded kernel modules.

power: This directory holds the system's power options and describes the power subsystem. It contains several attribute files that control the power state of the whole machine; for example, control commands can be written to them to shut down or restart the machine.
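The major:minor names used for the entries under /sys/dev are easy to parse in a program. The following is a minimal sketch; parse_dev_numbers is a hypothetical helper name, and it relies on sscanf, so it is more lenient about leading whitespace than a strict parser would be.

```c
#include <stdio.h>

/*
 * Parse a "major:minor" entry name such as "8:0".
 * Returns 0 on success and -1 if the name does not have that form.
 */
int parse_dev_numbers(const char *name, unsigned int *major, unsigned int *minor)
{
    char trailing;

    /* The final %c only matches if there is unexpected text after minor. */
    if (sscanf(name, "%u:%u%c", major, minor, &trailing) != 2)
        return -1;
    return 0;
}
```

For a name like "8:0", the function stores 8 in major and 0 in minor. Names with a missing part or trailing text are rejected.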

You can register your device under the directory that matches its type. When you do, a directory with the same name as your device is created. A kobject structure is registered in the kernel to correspond to this directory, and its parent member points to the parent kobject, which gives the directory its place in the hierarchy.

You can even control hardware by modifying certain attribute files in these directories.

For example, the backlight of the msp700 can be changed this way.

This only works if the device driver supports it, as described in the manufacturer's manual.

It works because the msp700 driver reads this file to set the backlight:

echo 20 > /sys/class/backlight/pwm-backlight/brightness;
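The same write can be done from a C program by opening the attribute file and printing the value into it. This is a minimal sketch; write_attr_int is a hypothetical helper name, and whether the write has any effect depends on the driver and on having permission to write the file.

```c
#include <stdio.h>

/*
 * Write an integer followed by a newline to an attribute file.
 * Returns 0 on success and -1 if the file cannot be opened or written.
 */
int write_attr_int(const char *path, int value)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;
    ok = fprintf(f, "%d\n", value) > 0;
    /* Errors from the driver may only be reported when the data is flushed. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

With the path from the example above, the call would be write_attr_int("/sys/class/backlight/pwm-backlight/brightness", 20).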


With sysfs introduced, udev is simple and easy to understand. Udev was introduced in the 2.6.x kernels, after sysfs, because it is built on sysfs.

It was introduced to improve on devfs. As mentioned earlier, devfs has some problems. For example, it is not flexible enough to identify devices automatically: when we plug in a generic USB device, devfs cannot give it a USB-specific name. With many devices attached we cannot tell which entry is which device, or even which ones are USB devices, so device types cannot be distinguished. Devfs also cannot deliver device hot-plug events.

Udev solves these problems, and it relies on sysfs to do so.

It parses the sysfs directory tree and creates the device nodes in the /dev directory.

It also monitors hot-plug events and delivers them to user space, so that the user knows when a device has been plugged in or removed.
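As an illustration of what a hot-plug notification looks like to a user-space listener, kernel uevent messages start with a header of the form ACTION@DEVPATH, for example add@/devices/.... The sketch below only splits such a header; parse_uevent_header is a hypothetical helper name, and receiving the message itself (over a netlink socket) is outside the scope of the example.

```c
#include <string.h>

/*
 * Split an "ACTION@DEVPATH" header. The action is copied into the
 * caller's buffer and *devpath is set to point into msg.
 * Returns 0 on success and -1 if the header is malformed or the
 * action does not fit into the buffer.
 */
int parse_uevent_header(const char *msg, char *action, size_t action_size,
                        const char **devpath)
{
    const char *at = strchr(msg, '@');
    size_t len;

    if (at == NULL || at == msg)
        return -1;
    len = (size_t)(at - msg);
    if (len >= action_size)
        return -1;
    memcpy(action, msg, len);
    action[len] = '\0';
    *devpath = at + 1;
    return 0;
}
```

For a header such as "add@/devices/pci0000:00/usb1/1-1", the action is "add" and the device path is the sysfs path relative to /sys.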

It runs in user space as a daemon.

It also identifies devices automatically.

If we insert a hard disk, its partition is automatically recognized as, for example, /dev/hda1.

So we know where our hard disk is. Udev also provides a unique, unified ID for each device and supports custom rules.

If you want to modify udev's rules, you can modify them in this file:



Proc is a pseudo (virtual) file system that holds a set of special files reflecting the current running state of the kernel. Through these files users can view information about the system hardware and the running processes, and can even change the running state of the kernel by writing to some of them.

It is called a pseudo file system because its files do not exist on disk: their contents are generated by the kernel from the live state of the kernel and the processes, so they are always up to date.

Listing the /proc directory shows many directories named with numbers:

These numbers are PIDs. Each PID corresponds to a process, and the running state of that process is stored in its directory.
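A program that walks /proc can tell the per-process directories apart from the other entries by checking whether the name consists only of digits. A minimal sketch follows; is_pid_name is a hypothetical helper name.

```c
#include <ctype.h>

/* Return 1 if the directory entry name is all digits (a PID), else 0. */
int is_pid_name(const char *name)
{
    if (*name == '\0')
        return 0;
    for (; *name != '\0'; name++) {
        if (!isdigit((unsigned char)*name))
            return 0;
    }
    return 1;
}
```

It would typically be applied to each d_name returned by readdir on /proc.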

We can open one at random:

Each file corresponds to a different attribute. For example, cmdline holds the complete command line the process was started with. Let's open it:
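When cmdline is read from a program, the arguments are separated by NUL bytes rather than spaces, so the raw buffer looks like a single short string. The sketch below turns the separators into spaces for display; cmdline_to_string is a hypothetical helper name, and it assumes the caller passes the number of bytes read, with the last byte being the final NUL.

```c
#include <stddef.h>

/*
 * Replace the NUL separators between arguments with spaces, in place.
 * The last byte of the buffer is left untouched so that a trailing NUL
 * keeps terminating the resulting string.
 */
void cmdline_to_string(char *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i++) {
        if (buf[i] == '\0')
            buf[i] = ' ';
    }
}
```

A buffer holding the three arguments ls, -l and /tmp becomes the single string "ls -l /tmp".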


The fd directory contains the file descriptors currently used by the program. Each file descriptor is a symbolic link:


You can see the descriptors for sockets and ports.

So in this directory you can see all the information related to a process.

Here are the common files and directories in a process directory:

cmdline: The full command used to start the current process. In the directory of a zombie process this file contains nothing.

cwd: A symbolic link to the current working directory of the process.

environ: The list of environment variables of the current process, separated from each other by null characters (NUL). By convention, variable names are written in uppercase.

exe: A symbolic link to the executable file (full path) that started the current process. A copy of the current process can be started through /proc/N/exe.

fd: A directory containing one entry for each file opened by the current process, named by its file descriptor. Each entry is a symbolic link to the actual file.

limits: The soft limit, hard limit and unit of each resource limit that applies to the current process. The file can be read only by the user whose UID actually started the process (supported since kernel 2.6.24).

maps: A list of the memory regions mapped for each executable and library file associated with the current process, together with their access permissions.

mem: The memory space occupied by the current process, accessed through system calls such as open, read and lseek. It cannot be read directly by the user.

root: A symbolic link to the root directory of the current process. On Unix and Linux systems, the chroot command is typically used to run a process in a separate root directory.

stat: The status information of the current process as a single line of system-formatted columns. It is hard to read and is usually used by the ps command.

statm: Status information about the memory occupied by the current process, usually expressed in pages.

status: Similar to the information in stat, but much more readable. Each line represents one attribute; see the proc man page for details.

task: A directory containing information about each thread run by the current process. The files for each thread are stored in a directory named by its thread ID (tid), with contents similar to those of a process directory (supported since kernel 2.6).

Common entries directly under /proc:

apm: Information about Advanced Power Management (APM) and battery status, usually used by the apm command.

buddyinfo: Information used for diagnosing memory fragmentation.

cmdline: The parameters passed to the kernel at boot, usually supplied by a boot loader such as lilo or grub.

cpuinfo: Information about the processors.

crypto: The cryptographic algorithms available to the installed kernel, with a detailed list of information for each algorithm.

devices: Information about all block and character devices loaded by the system, including the major device number and the device group (the device type corresponding to the major device number).

diskstats: A list of disk I/O statistics for each disk device (supported since kernel 2.5.69).

dma: A list of the ISA DMA channels that are registered and in use.

execdomains: The list of execution domains currently supported by the kernel (the unique "personality" of each operating system).

fb: The list of frame buffer devices, including the device number of each frame buffer device and information about its driver.

filesystems: The list of file system types currently supported by the kernel. Types marked nodev do not need a block device. When mount is given a device without a file system type, it uses this file to determine which type to try.

interrupts: The list of interrupt numbers associated with each IRQ on an x86 or x86_64 system. On a multiprocessor platform, each CPU has its own interrupt count for each I/O device.

iomem: The mapping of the memory (RAM or ROM) on each physical device into system memory.

ioports: The list of input/output port ranges that are registered and in use for communicating with physical devices. The first column is the registered I/O port range, followed by the associated device.

kallsyms: The symbol definitions exported by the kernel, which module management tools use to dynamically link or bind loadable modules (supported since kernel 2.5.71). The amount of information in this file is usually quite large.

kcore: The physical memory used by the system, stored in ELF core file format. Its size is the physical memory in use (RAM) plus 4 KB. The file is used to examine the current state of kernel data structures, so it is normally used by debugging tools such as gdb; it cannot be opened with file viewing commands.

kmsg: Holds the messages output by the kernel, usually read by /sbin/klogd or /bin/dmesg. Do not try to open it with a file viewing command.

loadavg: Load averages for CPU and disk I/O. The first three columns are the average load over the last 1, 5 and 15 minutes, similar to the output of the uptime command. The fourth column is two values separated by a slash: the first is the number of kernel scheduling entities (processes and threads) currently being scheduled, and the second is the number of kernel scheduling entities that currently exist in the system. The fifth column is the PID of the process most recently created by the kernel before the file was read.

locks: Information about the files currently locked by the kernel, including kernel-internal debugging data. Each lock occupies one line and has a unique number. The second column of each line gives the lock class: POSIX is the newer type of file lock, produced by the lockf system call, and FLOCK is the traditional UNIX file lock, produced by the flock system call. The third column usually has one of two values: ADVISORY means that other users are not allowed to lock the file but may still read it, and MANDATORY means that other users may not access the file in any form while it is locked.

mdstat: The current status of RAID-related multi-disk devices. It is not used on machines without RAID.

meminfo: Information about current memory usage in the system, often used by the free command. It can be read directly with a file viewing command; its contents are shown in two columns, the statistic name and the corresponding value.

mounts: Before kernel 2.4.19, this file listed all file systems currently mounted by the system. Kernel 2.4.19 introduced a separate mount namespace for each process, and this file became a symbolic link to /proc/self/mounts, the list of all mount points in the process's own mount namespace.

modules: The list of names of all modules currently loaded into the kernel. It is used by the lsmod command and can also be viewed directly.

partitions: The major and minor device numbers of each partition of the block devices, along with other information such as the number of blocks in each partition.

pci: The list of all PCI devices found during kernel initialization, with their configuration information. Most of it is IRQ information related to each PCI device and is hard to read; the command "/sbin/lspci -vb" gives the same information in an easier form. In kernels after 2.6 this file has been replaced by the /proc/bus/pci directory and the files in it.

slabinfo: Objects used frequently in the kernel (such as inode and dentry) have their own caches, known as slab pools. The /proc/slabinfo file lists slab information for these objects; see the slabinfo manual page in the kernel documentation for details.

stat: Various statistics tracked in real time since the system was last started.

swaps: The swap partitions on the current system and their space usage. If there are several swap partitions, the information for each one is listed in /proc/swaps. Swap areas with a higher priority value are used first.

uptime: The time elapsed since the system was last started.

version: The kernel version number of the current system.

vmstat: Various statistics about the system's virtual memory. The amount of information may be fairly large and varies from system to system, but it is quite readable.

zoneinfo: A detailed list of information about the memory zones.

sys: Unlike the other files under /proc, which are read-only, many files in the /proc/sys subdirectory can be modified by the administrator to change the running characteristics of the kernel. Use "ls -l" beforehand to check whether a file is writable. Writes typically take the form "echo DATA > /path/to/your/filename". Note that even when a file is writable, it generally cannot be edited with an editor.

/proc/sys/debug Subdirectory
 This directory is usually empty.

/proc/sys/dev Subdirectory
 This directory provides parameter files for special devices on the system. The files for different devices are stored in different subdirectories; on most systems these include /proc/sys/dev/cdrom and /proc/sys/dev/raid (present if RAID support was enabled when the kernel was compiled), which hold the parameter files for the system's cdrom and raid devices.

Most of the virtual files above can be viewed with file viewing commands such as cat, more or less. Some are clear at a glance, but others are hard to read. Those poorly readable files are better viewed through commands such as apm, free, lspci or top.
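Because these files are plain text, they are also easy to parse from a program. As one example, the five columns of /proc/loadavg described earlier can be read with a single sscanf call. This is a minimal sketch; struct loadavg_info and parse_loadavg are hypothetical names chosen for the illustration.

```c
#include <stdio.h>

/* The five columns of /proc/loadavg. */
struct loadavg_info {
    double avg1;     /* load average over the last 1 minute */
    double avg5;     /* load average over the last 5 minutes */
    double avg15;    /* load average over the last 15 minutes */
    int runnable;    /* scheduling entities currently runnable */
    int total;       /* scheduling entities that currently exist */
    int last_pid;    /* PID of the most recently created process */
};

/* Parse one line in /proc/loadavg format. Returns 0 on success, -1 on error. */
int parse_loadavg(const char *line, struct loadavg_info *out)
{
    int n = sscanf(line, "%lf %lf %lf %d/%d %d",
                   &out->avg1, &out->avg5, &out->avg15,
                   &out->runnable, &out->total, &out->last_pid);

    return n == 6 ? 0 : -1;
}
```

In practice the line would come from reading /proc/loadavg with fopen and fgets; the parser itself only needs the text.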

Except for udev, which runs in user space, all of these file systems are part of the kernel.

The VFS abstracts these structures so that, when a user-space open call traps into the kernel, the function pointers of the corresponding module can be found easily.

Udev itself is managed as a systemd service.


Topics: Linux Operation & Maintenance