Foreword
I wrote this series word by word myself, including the examples and experiment screenshots. Because the system kernel is complex, there may be errors or omissions; if you find any, criticism and corrections are welcome. This tutorial will be updated over the long term, and good suggestions are welcome too. Writing takes effort: if this article helps you and you have money to spare, you can tip to support my work. If you want to repost it, please append my repost information to the end of the article, state my personal information and my blog address, and notify me in advance.
If you are starting from the middle of the series, please first read Yu Xia's view of Win system kernel -- a brief introduction carefully; it makes this tutorial easier to follow.
Before reading this tutorial, ask yourself a few questions. Have you prepared the basic knowledge? Have you learned protected mode? Have you learned system calls? Have you finished your exercises? If not, don't continue.
🔒 Gorgeous dividing line 🔒
Analysis of simulated thread switching
Previously, we analyzed the essence of simulated thread switching in detail. There are two kinds of simulated switching: one simulates switching driven by the clock, and the other simulates active switching, that is, switching through an API call.
Let's look at the part that simulates clock-driven switching:
while (TRUE) {
    Sleep(20);
    Scheduling();
}
Sleep(20) above means that Scheduling() is called every 20 ms to trigger a thread switch; in other words, the simulated clock period is 20 ms.
The following function simulates active switching through an API:
void GMSleep(int MilliSeconds)
{
    GMThread_t* GMThreadp;
    GMThreadp = &GMThreadList[CurrentThreadIndex];
    if (GMThreadp->Flags != 0) {
        // Mark the current thread as sleeping and record its wake-up time
        GMThreadp->Flags = GMTHREAD_SLEEP;
        GMThreadp->SleepMillsecondDot = GetTickCount() + MilliSeconds;
    }
    // Give up the CPU voluntarily
    Scheduling();
    return;
}
Every thread calls GMSleep, which simulates a call to a WinAPI function:
void Thread1(void*)
{
    while (1) {
        printf("Thread1\n");
        GMSleep(500);
    }
}
I assume you understand everything above. Now, following the exercise from the previous article: how do we suspend and resume a thread? Have you thought about it? Don't continue until you have.
Analysis and implementation of thread suspend and resume
To implement thread suspend and resume, we should focus on how thread scheduling works. Suspending a thread is nothing more than not giving it CPU time. Let's see how the scheduling function below picks a thread:
void Scheduling(void)
{
    int i;
    int TickCount;
    GMThread_t* SrcGMThreadp;
    GMThread_t* DstGMThreadp;

    TickCount = GetTickCount();
    SrcGMThreadp = &GMThreadList[CurrentThreadIndex];
    DstGMThreadp = &GMThreadList[0];   // fall back to slot 0 if nothing is ready

    for (i = 1; GMThreadList[i].name; i++) {
        // Wake up a sleeping thread whose wake-up time has passed
        if (GMThreadList[i].Flags & GMTHREAD_SLEEP) {
            if (TickCount > GMThreadList[i].SleepMillsecondDot) {
                GMThreadList[i].Flags = GMTHREAD_READY;
            }
        }
        // Pick the first ready thread
        if (GMThreadList[i].Flags & GMTHREAD_READY) {
            DstGMThreadp = &GMThreadList[i];
            break;
        }
    }

    CurrentThreadIndex = DstGMThreadp - GMThreadList;
    SwitchContext(SrcGMThreadp, DstGMThreadp);
    return;
}
You can see that the scheduler switches to a simulated thread as long as it is in the GMTHREAD_READY state. In other words, if we change the flag that marks the thread state, we get the suspend and resume behavior we want. Let's implement it.
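The selection rule in Scheduling can be looked at in isolation from the context switch. The following is only a minimal sketch under stated assumptions: `SimThread`, `sim_pick_next`, and the flag values are hypothetical names made up for illustration, not the tutorial's real definitions, and the tick is passed in as a parameter instead of coming from GetTickCount.

```c
#include <stddef.h>

/* Hypothetical flag values; the real GMTHREAD_* constants may differ. */
enum { SIM_READY = 0x1, SIM_SLEEP = 0x2 };

typedef struct {
    const char *name;      /* NULL marks the end of the registered threads */
    unsigned    flags;
    unsigned    wake_tick; /* tick after which a sleeping thread becomes ready */
} SimThread;

/* Mirrors the loop in Scheduling(): wake expired sleepers, then return the
 * index of the first ready thread, or 0 (the fallback slot) if none is ready. */
size_t sim_pick_next(SimThread *list, unsigned now)
{
    for (size_t i = 1; list[i].name != NULL; i++) {
        if ((list[i].flags & SIM_SLEEP) && now > list[i].wake_tick)
            list[i].flags = SIM_READY;
        if (list[i].flags & SIM_READY)
            return i;
    }
    return 0;
}
```

A thread whose flags are neither ready nor an expired sleep is simply never returned, which is exactly the property that suspension relies on.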
First, add two declarations to ThreadSwitch.h:
bool SyspendThread(char* Name);
bool ResumeThread(char* Name);
Then implement them:
bool SyspendThread(char* Name)
{
    for (int i = 1; i < MAXGMTHREAD; i++) {
        // Skip unregistered slots: their name is NULL and must not reach strcmp
        if (GMThreadList[i].name && !strcmp(Name, GMThreadList[i].name)) {
            // GMTHREAD_EXIT is reused here to mean "do not schedule"
            GMThreadList[i].Flags = GMTHREAD_EXIT;
            return true;
        }
    }
    return false;
}

bool ResumeThread(char* Name)
{
    for (int i = 1; i < MAXGMTHREAD; i++) {
        if (GMThreadList[i].name && !strcmp(Name, GMThreadList[i].name)) {
            // Make the thread schedulable again
            GMThreadList[i].Flags = GMTHREAD_READY;
            return true;
        }
    }
    return false;
}
For ease of observation, we keep only Thread1 and Thread2. The final contents of main.cpp are as follows:
#include "stdafx.h"
#include <windows.h>
#include "ThreadSwitch.h"

extern int CurrentThreadIndex;
extern GMThread_t GMThreadList[MAXGMTHREAD];

void Thread1(void*)
{
    while (1) {
        printf("Thread1\n");
        GMSleep(500);
    }
}

void Thread2(void*)
{
    while (1) {
        printf("Thread2\n");
        GMSleep(500);
    }
}

int main()
{
    RegisterGMThread("Thread1", Thread1, NULL);
    RegisterGMThread("Thread2", Thread2, NULL);
    //SyspendThread("Thread2");
    //ResumeThread("Thread2");
    while (TRUE) {
        Sleep(20);
        Scheduling();
    }
    return 0;
}
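One limitation of overwriting Flags is that the previous state is lost, so a resumed thread always comes back as ready even if it was sleeping. A possible alternative, sketched here purely as an assumption and not as the tutorial's method, keeps a separate suspend counter, in the spirit of the SuspendCount member that appears in the KTHREAD dump later in this article. All names (`DemoThread`, `demo_suspend`, `demo_resume`, `demo_pick_next`) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_READY = 0x1 };     /* hypothetical flag value */

typedef struct {
    const char *name;          /* NULL terminates the table */
    unsigned    flags;         /* scheduling state, left untouched by suspend */
    unsigned    suspend_count; /* greater than 0 means "do not schedule" */
} DemoThread;

/* Looks a thread up by name, skipping slot 0 like the tutorial's loops do. */
static DemoThread *demo_find(DemoThread *list, const char *name)
{
    for (size_t i = 1; list[i].name != NULL; i++)
        if (strcmp(list[i].name, name) == 0)
            return &list[i];
    return NULL;
}

bool demo_suspend(DemoThread *list, const char *name)
{
    DemoThread *t = demo_find(list, name);
    if (t == NULL)
        return false;
    t->suspend_count++;
    return true;
}

bool demo_resume(DemoThread *list, const char *name)
{
    DemoThread *t = demo_find(list, name);
    if (t == NULL || t->suspend_count == 0)
        return false;          /* unknown thread, or it was not suspended */
    t->suspend_count--;
    return true;
}

/* First thread that is ready and not suspended, or 0 if there is none. */
size_t demo_pick_next(const DemoThread *list)
{
    for (size_t i = 1; list[i].name != NULL; i++)
        if ((list[i].flags & DEMO_READY) && list[i].suspend_count == 0)
            return i;
    return 0;
}
```

With this layout, nested suspend calls need the same number of resume calls, and the ready or sleeping state survives a suspend and resume pair.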
Next, the effect is demonstrated with an animation:
SwapContext analysis
This analysis requires some reverse-engineering skill. If you lack it, I recommend doing more CrackMe exercises with IDA and analyzing their flow for familiarity and practice. But it doesn't matter if you have no experience: read this part closely, then go back and redo it yourself.
When you locate this function, you should see the following:
SwapContext proc near ; CODE XREF: KiUnlockDispatcherDatabase(x)+72↑p
                      ; KiSwapContext(x)+29↑p ...
    or cl, cl
    mov byte ptr es:[esi+2Dh], 2
    pushf
loc_46A8E8: ; CODE XREF: KiIdleLoop()+5A↓j
    mov ecx, [ebx] ; ebx = KPCR
    cmp dword ptr [ebx+994h], 0
    push ecx
    jnz loc_46AA2D
    cmp ds:_PPerfGlobalGroupMask, 0
    jnz loc_46AA04
loc_46A905: ; CODE XREF: SwapContext+12C↓j
            ; SwapContext+13D↓j ...
    mov ebp, cr0
    mov edx, ebp
    mov cl, [esi+2Ch]
    mov [ebx+50h], cl
To analyze this function, we first need to know how its parameters are passed and what they are; only then can we solve the questions left at the end of the previous article.
Following the cross-reference from KiSwapContext, we get the following:
; __fastcall KiSwapContext(x) @KiSwapContext@4 proc near ; CODE XREF: KiSwapThread()+41↑p var_200FE4 = dword ptr -200FE4h var_10 = dword ptr -10h var_C = dword ptr -0Ch var_8 = dword ptr -8 var_4 = dword ptr -4 sub esp, 10h mov [esp+10h+var_4], ebx mov [esp+10h+var_8], esi mov [esp+10h+var_C], edi mov [esp+10h+var_10], ebp mov ebx, ds:0FFDFF01Ch ; ebx = &_KPCR mov esi, ecx ; esi = ecx = NextReadyThread mov edi, [ebx+124h] mov [ebx+124h], esi mov cl, [edi+58h] call SwapContext mov ebp, [esp+10h+var_10] mov edi, [esp+10h+var_C] mov esi, [esp+10h+var_8] mov ebx, [esp+10h+var_4] add esp, 10h retn @KiSwapContext@4 endp
But we still can't tell what the parameter is, so let's go one level further up:
@KiSwapThread@0 proc near ; CODE XREF: KiAttachProcess(x,x,x,x)+F2↑p ; KeDelayExecutionThread(x,x,x):loc_42279A↑p ... mov edi, edi push esi push edi db 3Eh mov eax, ds:0FFDFF020h mov esi, eax
We have found the KiSwapThread function, which is clearly a kernel function used to switch threads. We already know that the KPCR structure is located at 0xFFDFF000. According to the structure layout, 0xFFDFF020 (the Prcb member) holds the address of the KPRCB structure. The final analysis results are as follows:
@KiSwapThread@0 proc near ; CODE XREF: KiAttachProcess(x,x,x,x)+F2↑p
                          ; KeDelayExecutionThread(x,x,x):loc_42279A↑p ...
    mov edi, edi
    push esi
    push edi
    db 3Eh
    mov eax, ds:0FFDFF020h
    mov esi, eax
    mov eax, [esi+_KPRCB.NextThread] ; eax = NextThread
    test eax, eax ; Test whether NextThread exists
    mov edi, [esi+_KPRCB.CurrentThread] ; edi = CurrentThread
    jz short loc_429CAC ; No NextThread, so jump
    and [esi+_KPRCB.NextThread], 0 ; Clear NextThread
    jmp short loc_429CCF
; ---------------------------------------------------------------------------
loc_429CAC: ; CODE XREF: KiSwapThread()+14↑j
    push ebx
    movsx ebx, [esi+_KPRCB.Number]
    xor edx, edx
    mov ecx, ebx
    call @KiFindReadyThread@8 ; KiFindReadyThread(x,x)
    test eax, eax
    jnz short loc_429CCE ; If a next thread was found, jump
    mov eax, [esi+_KPRCB.IdleThread]
    xor edx, edx
    inc edx
    mov ecx, ebx
    shl edx, cl
    or _KiIdleSummary, edx
loc_429CCE: ; CODE XREF: KiSwapThread()+2C↑j
    pop ebx
loc_429CCF: ; CODE XREF: KiSwapThread()+1A↑j
    mov ecx, eax
    call @KiSwapContext@4 ; KiSwapContext(x)
After a simple analysis, we can easily see that ecx holds a thread structure just before the call to KiSwapContext, as shown in the following figure:
According to the symbol, KiSwapContext has only one parameter. After a simple analysis, we get the following result:
; __fastcall KiSwapContext(x)
@KiSwapContext@4 proc near ; CODE XREF: KiSwapThread()+41↑p
var_200FE4 = dword ptr -200FE4h
var_10 = dword ptr -10h
var_C = dword ptr -0Ch
var_8 = dword ptr -8
var_4 = dword ptr -4
    sub esp, 10h
    mov [esp+10h+var_4], ebx
    mov [esp+10h+var_8], esi
    mov [esp+10h+var_C], edi
    mov [esp+10h+var_10], ebp
    mov ebx, ds:0FFDFF01Ch ; ebx = &_KPCR
    mov esi, ecx ; esi = ecx = NextReadyThread
    mov edi, [ebx+_KPCR.PrcbData.CurrentThread]
    mov [ebx+_KPCR.PrcbData.CurrentThread], esi
    mov cl, [edi+_KTHREAD.WaitIrql]
    call SwapContext
    mov ebp, [esp+10h+var_10]
    mov edi, [esp+10h+var_C]
    mov esi, [esp+10h+var_8]
    mov ebx, [esp+10h+var_4]
    add esp, 10h
    retn
@KiSwapContext@4 endp
Before SwapContext is called, esi holds the new thread to switch to, edi holds the old thread being switched out, and ebx holds the KPCR structure. In other words, this function takes three register parameters, and we now know the meaning of each. IDA's F5 decompilation offers some confirmation:
char __usercall SwapContext@<al>(int *a1@<ebx>, int a2@<edi>, int a3@<esi>)
Next, let's find where the actual switch happens. Since operating-system thread switching is based on the stack, changing esp is what switches the thread. We can easily trace to the following assembly:
loc_46A94C: ; CODE XREF: SwapContext+67↑j
    mov ecx, [ebx+_KPCR.TSS]
    mov [ecx+_KTSS.Esp0], eax
    mov esp, [esi+_KTHREAD.KernelStack]
    mov eax, [esi+_KTHREAD.Teb]
    mov [ebx+_KPCR.NtTib.Self], eax
After the saved values are popped from the new stack, retn completes the thread switch. Next, let's look at when CR3 is switched:
    mov eax, [edi+_KTHREAD.ApcState.Process]
    cmp eax, [esi+_KTHREAD.ApcState.Process]
    mov [edi+_KTHREAD.IdleSwapBlock], 0
    jz short loc_46A994
    mov edi, [esi+_KTHREAD.ApcState.Process] ; edi now points to the new process, not a thread
    test word ptr [edi+_KTHREAD.Teb], 0FFFFh ; labeled as _KTHREAD.Teb, but edi is a process here, so this is the process field at the same offset (likely LdtDescriptor)
    jnz short loc_46A9CE
    xor eax, eax
loc_46A975: ; CODE XREF: SwapContext+117↓j
    lldt ax
    xor eax, eax
    mov gs, eax
    assume gs:GAP
    mov eax, [edi+_EPROCESS.Pcb.DirectoryTableBase]
    mov ebp, [ebx+_KPCR.TSS]
    mov ecx, dword ptr [edi+_KTHREAD.Iopl] ; likewise a process field at this offset (likely IopmOffset)
    mov [ebp+_KTSS.CR3], eax
    mov cr3, eax
    mov [ebp+_KTSS.IoMapBase], cx
    jmp short loc_46A994
; ---------------------------------------------------------------------------
    align 4
loc_46A994: ; CODE XREF: SwapContext+86↑j
            ; SwapContext+AF↑j
    mov eax, [ebx+_KPCR.NtTib.Self]
Analysis shows that CR3 is switched only conditionally. The code checks whether the Process of the new thread is the same as that of the old thread, and then decides whether to switch CR3:
    mov eax, [edi+_KTHREAD.ApcState.Process]
    cmp eax, [esi+_KTHREAD.ApcState.Process]
    mov [edi+_KTHREAD.IdleSwapBlock], 0
    jz short loc_46A994
If they differ, the following code that switches CR3 is executed:
    mov eax, [edi+_EPROCESS.Pcb.DirectoryTableBase]
    mov ebp, [ebx+_KPCR.TSS]
    mov ecx, dword ptr [edi+_KTHREAD.Iopl]
    mov [ebp+_KTSS.CR3], eax
    mov cr3, eax
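The condition can be restated in C. This is only an illustrative model of the comparison and reload described above, with made-up structures (`SimProcess`, `SimKThread`, `SimCpu`) standing in for the real kernel types; the CR3 register is just a field here.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint32_t directory_table_base; } SimProcess;
typedef struct { SimProcess *apc_process; } SimKThread;  /* models ApcState.Process */
typedef struct { uint32_t cr3; unsigned cr3_loads; } SimCpu;

/* Reloads the simulated CR3 only when the two threads are attached to
 * different processes. Returns true if a reload happened. */
bool sim_switch_address_space(SimCpu *cpu,
                              const SimKThread *old_thread,
                              const SimKThread *new_thread)
{
    if (old_thread->apc_process == new_thread->apc_process)
        return false;                       /* same process: keep CR3 */
    cpu->cr3 = new_thread->apc_process->directory_table_base;
    cpu->cr3_loads++;
    return true;
}
```

Skipping the reload for threads of the same process matters on real hardware because writing CR3 flushes non-global TLB entries.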
At this point, question 2 is solved. A CPU has only one set of registers, which means it can hold only one TSS address at a time. Must the ESP0 and SS0 of the current thread be stored in the TSS for use when an interrupt gate raises the privilege level? Let's analyze further:
    mov eax, [esi+_KTHREAD.InitialStack]
    mov ecx, [esi+_KTHREAD.StackLimit]
    sub eax, 210h
    mov [ebx+_KPCR.NtTib.StackLimit], ecx
    mov [ebx+_KPCR.NtTib.StackBase], eax
    xor ecx, ecx
    mov cl, [esi+_KTHREAD.NpxState]
    and edx, 0FFFFFFF1h
    or ecx, edx
    or ecx, [eax+20Ch]
    cmp ebp, ecx
    jnz loc_46A9FC
    lea ecx, [ecx+0]
loc_46A940: ; CODE XREF: SwapContext+11F↓j
    test dword ptr [eax-1Ch], 20000h ; Check whether it is in virtual 8086 mode
    jnz short loc_46A94C
    sub eax, 10h
loc_46A94C: ; CODE XREF: SwapContext+67↑j
    mov ecx, [ebx+_KPCR.TSS]
    mov [ecx+_KTSS.Esp0], eax ; Put the corrected stack top into the TSS
    mov esp, [esi+_KTHREAD.KernelStack]
    mov eax, [esi+_KTHREAD.Teb]
As you can see, the code above computes and corrects ESP0 and stores it in the TSS before mov esp, [esi+_KTHREAD.KernelStack], so the ESP0 and SS0 of the current thread must be stored in the TSS for use when an interrupt gate raises the privilege level. So far, the third question is solved.
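The ESP0 arithmetic in that listing can be written out as a small function. This is a sketch of my reading of the assembly, not kernel source: the constant names are hypothetical, and describing 0x210 as a floating-point save area and 0x10 as the virtual 8086 segment slots is an interpretation rather than something the listing states.

```c
#include <stdint.h>

#define SIM_TOP_RESERVED 0x210u   /* sub eax, 210h (assumed: FPU save area) */
#define SIM_V86_SLOTS    0x10u    /* sub eax, 10h  (assumed: V86 segment slots) */
#define SIM_EFLAGS_VM    0x20000u /* bit tested by test dword ptr [eax-1Ch], 20000h */

/* Computes the value stored into TSS.Esp0 for the incoming thread, given its
 * InitialStack and the EFLAGS image saved in its trap frame. */
uint32_t sim_compute_esp0(uint32_t initial_stack, uint32_t saved_eflags)
{
    uint32_t esp0 = initial_stack - SIM_TOP_RESERVED;
    if ((saved_eflags & SIM_EFLAGS_VM) == 0)
        esp0 -= SIM_V86_SLOTS;    /* not in virtual 8086 mode: skip those slots */
    return esp0;
}
```

For the InitialStack value 0x8054af00 shown in the KTHREAD dump later in this article, a thread that is not in virtual 8086 mode would get 0x8054ace0.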
In the virtual machine I tested, the fs segment selector is always 0x3B, so why does fs point to a different TEB in different threads? The reason is that the switch code directly modifies the descriptor in the GDT so that its base address is the TEB of the current thread. The code is as follows:
    mov eax, [ebx+_KPCR.NtTib.Self]
    mov ecx, [ebx+_KPCR.GDT]
    mov [ecx+3Ah], ax
    shr eax, 10h
    mov [ecx+3Ch], al
    mov [ecx+3Fh], ah
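The offsets 3Ah, 3Ch and 3Fh follow from the x86 segment descriptor layout: selector 0x3B selects the 8-byte entry at offset 0x38, and the base address is scattered over bytes 2, 3, 4 and 7 of that entry. A minimal sketch, with hypothetical helper names, that performs the same patch on a byte array:

```c
#include <stdint.h>

/* Byte offset of a descriptor inside the GDT: clear the RPL and TI bits. */
unsigned sim_selector_to_offset(uint16_t selector)
{
    return (unsigned)(selector & 0xFFF8u);
}

/* Writes a 32-bit base into an 8-byte descriptor without touching the limit
 * or attribute bytes, as the three mov instructions do. */
void sim_set_descriptor_base(uint8_t desc[8], uint32_t base)
{
    desc[2] = (uint8_t)(base & 0xFFu);          /* mov [ecx+3Ah], ax (low byte)  */
    desc[3] = (uint8_t)((base >> 8) & 0xFFu);   /* mov [ecx+3Ah], ax (high byte) */
    desc[4] = (uint8_t)((base >> 16) & 0xFFu);  /* mov [ecx+3Ch], al */
    desc[7] = (uint8_t)((base >> 24) & 0xFFu);  /* mov [ecx+3Fh], ah */
}

/* Reassembles the base from the descriptor bytes. */
uint32_t sim_get_descriptor_base(const uint8_t desc[8])
{
    return (uint32_t)desc[2]
         | ((uint32_t)desc[3] << 8)
         | ((uint32_t)desc[4] << 16)
         | ((uint32_t)desc[7] << 24);
}
```

Because only the base bytes change, the selector value in fs stays 0x3B for every thread while the memory it addresses moves.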
So far, question 4 is solved. Let's see where the ring 0 ExceptionList is backed up:
    pop ecx
    mov [ebx+_KPCR.NtTib.ExceptionList], ecx
The fifth question is solved: the ExceptionList is pushed onto the old thread's stack at the start of SwapContext, and here the new thread's saved ExceptionList is popped and stored in the KPCR. Next, let's see how to find the function that IdleThread runs:
We know that IdleThread is stored in the KPCR. We look through the structure to find the address of that member.
kd> dt _KPCR 0xffdff000 nt!_KPCR +0x000 NtTib : _NT_TIB +0x01c SelfPcr : 0xffdff000 _KPCR +0x020 Prcb : 0xffdff120 _KPRCB +0x024 Irql : 0x1c '' +0x028 IRR : 4 +0x02c IrrActive : 0 +0x030 IDR : 0xffff20f8 +0x034 KdVersionBlock : 0x80546ab8 Void +0x038 IDT : 0x8003f400 _KIDTENTRY +0x03c GDT : 0x8003f000 _KGDTENTRY +0x040 TSS : 0x80042000 _KTSS +0x044 MajorVersion : 1 +0x046 MinorVersion : 1 +0x048 SetMember : 1 +0x04c StallScaleFactor : 0x64 +0x050 DebugActive : 0 '' +0x051 Number : 0 '' +0x052 Spare0 : 0 '' +0x053 SecondLevelCacheAssociativity : 0 '' +0x054 VdmAlert : 0 +0x058 KernelReserved : [14] 0 +0x090 SecondLevelCacheSize : 0 +0x094 HalReserved : [16] 0 +0x0d4 InterruptMode : 0 +0x0d8 Spare1 : 0 '' +0x0dc KernelReserved2 : [17] 0 +0x120 PrcbData : _KPRCB kd> dx -id 0,0,805539a0 -r1 ((ntkrnlpa!_KPRCB *)0xffdff120) ((ntkrnlpa!_KPRCB *)0xffdff120) : 0xffdff120 [Type: _KPRCB *] [+0x000] MinorVersion : 0x1 [Type: unsigned short] [+0x002] MajorVersion : 0x1 [Type: unsigned short] [+0x004] CurrentThread : 0x80553740 [Type: _KTHREAD *] [+0x008] NextThread : 0x0 [Type: _KTHREAD *] [+0x00c] IdleThread : 0x80553740 [Type: _KTHREAD *] [+0x010] Number : 0 [Type: char] [+0x011] Reserved : 0 [Type: char]
Then let's use dt to look at this structure:
kd> dt _ETHREAD 0x80553740 ntdll!_ETHREAD +0x000 Tcb : _KTHREAD +0x1c0 CreateTime : _LARGE_INTEGER 0x0 +0x1c0 NestedFaultCount : 0y00 +0x1c0 ApcNeeded : 0y0 +0x1c8 ExitTime : _LARGE_INTEGER 0x0 +0x1c8 LpcReplyChain : _LIST_ENTRY [ 0x0 - 0x0 ] +0x1c8 KeyedWaitChain : _LIST_ENTRY [ 0x0 - 0x0 ] +0x1d0 ExitStatus : 0n0 +0x1d0 OfsChain : (null) +0x1d4 PostBlockList : _LIST_ENTRY [ 0x0 - 0x0 ] +0x1dc TerminationPort : (null) +0x1dc ReaperLink : (null) +0x1dc KeyedWaitValue : (null) +0x1e0 ActiveTimerListLock : 0 +0x1e4 ActiveTimerListHead : _LIST_ENTRY [ 0x0 - 0x0 ] +0x1ec Cid : _CLIENT_ID +0x1f4 LpcReplySemaphore : _KSEMAPHORE +0x1f4 KeyedWaitSemaphore : _KSEMAPHORE +0x208 LpcReplyMessage : (null) +0x208 LpcWaitingOnPort : (null) +0x20c ImpersonationInfo : (null) +0x210 IrpList : _LIST_ENTRY [ 0x0 - 0x0 ] +0x218 TopLevelIrp : 0 +0x21c DeviceToVerify : (null) +0x220 ThreadsProcess : (null) +0x224 StartAddress : (null) +0x228 Win32StartAddress : (null) +0x228 LpcReceivedMessageId : 0 +0x22c ThreadListEntry : _LIST_ENTRY [ 0x0 - 0x0 ] +0x234 RundownProtect : _EX_RUNDOWN_REF +0x238 ThreadLock : _EX_PUSH_LOCK +0x23c LpcReplyMessageId : 0 +0x240 ReadClusterSize : 0 +0x244 GrantedAccess : 0x1f03ff +0x248 CrossThreadFlags : 0 +0x248 Terminated : 0y0 +0x248 DeadThread : 0y0 +0x248 HideFromDebugger : 0y0 +0x248 ActiveImpersonationInfo : 0y0 +0x248 SystemThread : 0y0 +0x248 HardErrorsAreDisabled : 0y0 +0x248 BreakOnTermination : 0y0 +0x248 SkipCreationMsg : 0y0 +0x248 SkipTerminationMsg : 0y0 +0x24c SameThreadPassiveFlags : 0 +0x24c ActiveExWorker : 0y0 +0x24c ExWorkerCanWaitUser : 0y0 +0x24c MemoryMaker : 0y0 +0x250 SameThreadApcFlags : 0 +0x250 LpcReceivedMsgIdValid : 0y0 +0x250 LpcExitThreadCalled : 0y0 +0x250 AddressSpaceOwner : 0y0 +0x254 ForwardClusterOnly : 0 '' +0x255 DisablePageFaultClustering : 0 ''
In the structure above, +0x224 StartAddress stores the address where the thread starts executing, that is, the function address passed to CreateThread in ring 3. This thread is special, however: the field is empty. How can we find the function address?
A running program must use the stack, so we can locate its behavior through the stack. Let's focus on the KernelStack member of KTHREAD, which is used during thread switching. Here are the relevant fragments of stack-related assembly:
    pushf
    mov ecx, [ebx] ; ebx = KPCR
    cmp [ebx+_KPCR.PrcbData.DpcRoutineActive], 0
    push ecx ; KPCR.NtTib.ExceptionList
    ......
    mov esp, [esi+_KTHREAD.KernelStack]
    ......
    pop ecx
    xor eax, eax
    retn
Let's first see what the stack address is:
kd> dt _KTHREAD 0x80553740 ntdll!_KTHREAD +0x000 Header : _DISPATCHER_HEADER +0x010 MutantListHead : _LIST_ENTRY [ 0x80553750 - 0x80553750 ] +0x018 InitialStack : 0x8054af00 Void +0x01c StackLimit : 0x80547f00 Void +0x020 Teb : (null) +0x024 TlsArray : (null) +0x028 KernelStack : 0x8054ac4c Void +0x02c DebugActive : 0 '' +0x02d State : 0x2 '' +0x02e Alerted : [2] "" +0x030 Iopl : 0 '' +0x031 NpxState : 0xa '' +0x032 Saturation : 0 '' +0x033 Priority : 16 '' +0x034 ApcState : _KAPC_STATE +0x04c ContextSwitches : 0x1736 +0x050 IdleSwapBlock : 0 '' +0x051 Spare0 : [3] "" +0x054 WaitStatus : 0n0 +0x058 WaitIrql : 0x2 '' +0x059 WaitMode : 0 '' +0x05a WaitNext : 0 '' +0x05b WaitReason : 0 '' +0x05c WaitBlockList : 0x805537b0 _KWAIT_BLOCK +0x060 WaitListEntry : _LIST_ENTRY [ 0x0 - 0x0 ] +0x060 SwapListEntry : _SINGLE_LIST_ENTRY +0x068 WaitTime : 0x555 +0x06c BasePriority : 0 '' +0x06d DecrementCount : 0 '' +0x06e PriorityDecrement : 0 '' +0x06f Quantum : -17 '' +0x070 WaitBlock : [4] _KWAIT_BLOCK +0x0d0 LegoData : (null) +0x0d4 KernelApcDisable : 0 +0x0d8 UserAffinity : 0xffffffff +0x0dc SystemAffinityActive : 0 '' +0x0dd PowerState : 0 '' +0x0de NpxIrql : 0 '' +0x0df InitialNode : 0 '' +0x0e0 ServiceTable : 0x80553fa0 Void +0x0e4 Queue : (null) +0x0e8 ApcQueueLock : 0 +0x0f0 Timer : _KTIMER +0x118 QueueListEntry : _LIST_ENTRY [ 0x0 - 0x0 ] +0x120 SoftAffinity : 1 +0x124 Affinity : 1 +0x128 Preempted : 0 '' +0x129 ProcessReadyQueue : 0 '' +0x12a KernelStackResident : 0x1 '' +0x12b NextProcessor : 0 '' +0x12c CallbackStack : (null) +0x130 Win32Thread : (null) +0x134 TrapFrame : (null) +0x138 ApcStatePointer : [2] 0x80553774 _KAPC_STATE +0x140 PreviousMode : 0 '' +0x141 EnableStackSwap : 0x1 '' +0x142 LargeStack : 0 '' +0x143 ResourceIndex : 0 '' +0x144 KernelTime : 0x2d1a +0x148 UserTime : 0 +0x14c SavedApcState : _KAPC_STATE +0x164 Alertable : 0 '' +0x165 ApcStateIndex : 0 '' +0x166 ApcQueueable : 0x1 '' +0x167 AutoAlignment : 0 '' +0x168 StackBase : 0x8054af00 Void 
+0x16c SuspendApc : _KAPC +0x19c SuspendSemaphore : _KSEMAPHORE +0x1b0 ThreadListEntry : _LIST_ENTRY [ 0x805539f0 - 0x805539f0 ] +0x1b8 FreezeCount : 0 '' +0x1b9 SuspendCount : 0 '' +0x1ba IdealProcessor : 0 '' +0x1bb DisableBoost : 0 ''
Let's see what the stack looks like:
kd> dd 0x8054ac4c
8054ac4c 00000000 ffdff980 80542af0 00000000
8054ac5c 0000000e 00000000 00000000 00000000
8054ac6c 00000000 00000000 00000000 00000000
8054ac7c 00000000 00000000 00000000 00000000
8054ac8c 00000000 00000000 00000000 00000000
8054ac9c 00000000 00000000 00000000 00000000
8054acac 00000000 00000000 00000000 00000000
8054acbc 00000000 00000000 00000000 00000000
That is, the first value is the ExceptionList, the second is EFLAGS, and the third is the address to return to after the thread switch, which is where IdleThread resumes execution. This address must be inside the function that IdleThread runs. Let's disassemble it with u:
kd> u 80542af0
ReadVirtual: 80542af0 not properly sign extended
80542af0 fb sti
ReadVirtual: 80542b00 not properly sign extended
80542af1 90 nop
ReadVirtual: 80542b01 not properly sign extended
80542af2 90 nop
ReadVirtual: 80542b02 not properly sign extended
80542af3 fa cli
ReadVirtual: 80542b03 not properly sign extended
80542af4 3b6d00 cmp ebp,dword ptr [ebp]
80542af7 740d je nt!KiIdleLoop+0x26 (80542b06)
80542af9 b102 mov cl,2
80542afb ff15a8864d80 call dword ptr [nt!_imp_HalClearSoftwareInterrupt (804d86a8)]
So the function is KiIdleLoop. Let's look at it in IDA:
; _DWORD __cdecl KiIdleLoop() @KiIdleLoop@0 proc near ; CODE XREF: KiSystemStartup(x)+2E2↓j lea ebp, [ebx+980h] jmp short loc_46AAF0 ; --------------------------------------------------------------------------- loc_46AAE8: ; CODE XREF: KiIdleLoop()+2D↓j lea ecx, [ebx+0C50h] call dword ptr [ecx] loc_46AAF0: ; CODE XREF: KiIdleLoop()+6↑j ; KiIdleLoop()+65↓j sti nop nop cli cmp ebp, [ebp+0] jz short loc_46AB06 mov cl, 2 call ds:__imp_@HalClearSoftwareInterrupt@4 ; HalClearSoftwareInterrupt(x) call KiRetireDpcList loc_46AB06: ; CODE XREF: KiIdleLoop()+17↑j cmp dword ptr [ebx+128h], 0 jz short loc_46AAE8 sti mov esi, [ebx+128h] mov edi, [ebx+124h] or ecx, 1 mov [ebx+124h], esi mov byte ptr es:[esi+2Dh], 2 mov dword ptr [ebx+128h], 0 push offset loc_46AB3F pushf jmp loc_46A8E8 ; --------------------------------------------------------------------------- loc_46AB3F: ; DATA XREF: KiIdleLoop()+54↑o lea ebp, [ebx+980h] jmp short loc_46AAF0 @KiIdleLoop@0 endp
This function does no useful work of its own: it simply keeps the CPU occupied with a loop when there is nothing else to run, until a next thread becomes available. So far, question 6 is solved.
What are the differences between simulated thread switching and Windows thread switching? A real thread has two stacks, a ring 0 kernel stack and a ring 3 stack, and the switch happens in ring 0. The simulated switch does not deal with FS, the exception list, and so on. You can summarize the other differences yourself. So far, question 7 is solved.
Next is the last question. Let's walk through the clock interrupt flow. All interrupts are registered in the IDT, so let's locate that table first. In IDA, press g and jump to _IDT; the result is as follows:
_IDT dd offset _KiTrap00 ; DATA XREF: KiSystemStartup(x)+1D5↑o db 0, 8Eh word_5B8B02 dw 8 ; DATA XREF: KiSwapIDT()↓o dd offset _KiTrap01 dd 88E00h dd offset _KiTrap02 dd 88E00h dd offset _KiTrap03 dd 8EE00h dd offset _KiTrap04 dd 8EE00h dd offset _KiTrap05 dd 88E00h dd offset _KiTrap06 dd 88E00h dd offset _KiTrap07 dd 88E00h dd offset _KiTrap08 dd 88E00h dd offset _KiTrap09
The interrupt vector of the clock interrupt is 0x30. How do we locate its handler? Look at the following figure:
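The lookup can also be done by arithmetic. Each IDT entry is 8 bytes, so the entry for vector 0x30 sits 0x180 bytes past the start of the table; in the IDA listing above, each entry likewise occupies 8 bytes and starts with a dd offset to the handler. The sketch below models the hardware layout of a 32-bit interrupt gate as documented for x86; the structure and function names are hypothetical, and the gate values used in any test are arbitrary.

```c
#include <stdint.h>

/* 32-bit x86 interrupt gate as the CPU sees it at run time. */
typedef struct {
    uint16_t offset_low;   /* handler address, bits 0..15  */
    uint16_t selector;     /* code segment selector        */
    uint8_t  reserved;
    uint8_t  type_attr;    /* e.g. 0x8E: present, DPL 0, interrupt gate */
    uint16_t offset_high;  /* handler address, bits 16..31 */
} SimIdtGate;

/* Address of the gate for a given vector. */
uint32_t sim_idt_entry_address(uint32_t idt_base, unsigned vector)
{
    return idt_base + vector * 8u;
}

/* Handler address stored in a gate. */
uint32_t sim_gate_handler(const SimIdtGate *gate)
{
    return ((uint32_t)gate->offset_high << 16) | gate->offset_low;
}
```

With the IDT base 0x8003f400 from the KPCR dump earlier, the gate for vector 0x30 would be at 0x8003f580.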
Then jump to this function:
; _DWORD __stdcall KiStartUnexpectedRange() _KiStartUnexpectedRange@0 proc near ; DATA XREF: KiGetVectorInfo(x,x)+68↑o ; INIT:005B8C7C↓o push 30h ; '0' jmp _KiEndUnexpectedRange@0 ; KiEndUnexpectedRange() _KiStartUnexpectedRange@0 endp
Let's see where it jumps:
; _DWORD __stdcall KiEndUnexpectedRange() _KiEndUnexpectedRange@0 proc near ; CODE XREF: KiStartUnexpectedRange()+5↑j ; _KiUnexpectedInterrupt1+5↑j ... jmp cs:off_46632E _KiEndUnexpectedRange@0 endp ; --------------------------------------------------------------------------- off_46632E dd offset _KiUnexpectedInterruptTail ; DATA XREF: KiEndUnexpectedRange()↑r
Let's continue. To save space, only the call path is kept:
_KiUnexpectedInterruptTail proc near ; CODE XREF: KiEndUnexpectedRange()↑j ; DATA XREF: .text:off_46632E↑o ...... loc_466E7E: ; CODE XREF: Dr_kui_a+10↑j ; Dr_kui_a+7C↑j inc dword ptr ds:0FFDFF5C4h mov ebx, [esp+68h+var_68] sub esp, 4 push esp push ebx push 1Fh call ds:__imp__HalBeginSystemInterrupt@12 ; HalBeginSystemInterrupt(x,x,x) or eax, eax jnz short loc_466E9D add esp, 8 jmp short loc_466EEC ; --------------------------------------------------------------------------- loc_466E9D: ; CODE XREF: _KiUnexpectedInterruptTail+BF↑j cli call ds:__imp__HalEndSystemInterrupt@8 ; HalEndSystemInterrupt(x,x) jmp short Kei386EoiHelper@0 ; Kei386EoiHelper() _KiUnexpectedInterruptTail endp
public HalEndSystemInterrupt HalEndSystemInterrupt proc near ; CODE XREF: sub_80010EF0+E8↑p ; sub_80017144+B3↓p ; DATA XREF: ... arg_0 = byte ptr 4 movzx ecx, [esp+arg_0] cmp byte ptr ds:0FFDFF024h, 2 jbe short loc_8001123E mov eax, ds:dword_800176EC[ecx*4] or eax, ds:0FFDFF030h out 21h, al ; Interrupt controller, 8259A. shr eax, 8 out 0A1h, al ; Interrupt Controller #2, 8259A loc_8001123E: ; CODE XREF: HalEndSystemInterrupt+C↑j mov ds:0FFDFF024h, cl mov eax, ds:0FFDFF028h mov al, ds:byte_80017784[eax] cmp al, cl ja short loc_80011256 retn 8 ; --------------------------------------------------------------------------- loc_80011256: ; CODE XREF: HalEndSystemInterrupt+35↑j add esp, 0Ch jmp ds:pKiUnexpectedInterrupt[eax*4] HalEndSystemInterrupt endp ; sp-analysis failed
pKiUnexpectedInterrupt dd offset KiUnexpectedInterrupt ; DATA XREF: HalEndSystemInterrupt+3D↑r ; sub_80011260+3D↑r dd offset sub_80016BDD dd offset sub_80016A45
sub_80016A45 proc near ; CODE XREF: KfLowerIrql:loc_800110AC↑p ; KfReleaseSpinLock:loc_8001111C↑p ... push dword ptr ds:0FFDFF024h mov byte ptr ds:0FFDFF024h, 2 and dword ptr ds:0FFDFF028h, 0FFFFFFFBh sti call ds:KiDispatchInterrupt cli call sub_80011260 jmp ds:Kei386EoiHelper sub_80016A45 endp
; _DWORD __stdcall KiDispatchInterrupt() public _KiDispatchInterrupt@0 _KiDispatchInterrupt@0 proc near ; DATA XREF: .edata:off_58D2A8↓o var_C = dword ptr -0Ch var_8 = dword ptr -8 var_4 = dword ptr -4 mov ebx, ds:0FFDFF01Ch ; a1 lea eax, [ebx+980h] cli cmp eax, [eax] jz short loc_46A85E push ebp push dword ptr [ebx] mov dword ptr [ebx], 0FFFFFFFFh mov edx, esp mov esp, [ebx+988h] push edx mov ebp, eax call KiRetireDpcList pop esp pop dword ptr [ebx] pop ebp loc_46A85E: ; CODE XREF: KiDispatchInterrupt()+F↑j sti cmp dword ptr [ebx+9ACh], 0 jnz short loc_46A8BE cmp dword ptr [ebx+128h], 0 jz short locret_46A8BD mov eax, [ebx+128h] loc_46A877: ; CODE XREF: KiDispatchInterrupt()+9F↓j sub esp, 0Ch mov [esp+0Ch+var_4], esi mov [esp+0Ch+var_8], edi mov [esp+0Ch+var_C], ebp mov esi, eax ; NewThread mov edi, [ebx+124h] ; oldThread mov dword ptr [ebx+128h], 0 mov [ebx+124h], esi mov ecx, edi mov byte ptr [edi+50h], 1 call @KiReadyThread@4 ; KiReadyThread(x) mov cl, 1 call SwapContext mov ebp, [esp+0Ch+var_C] mov edi, [esp+0Ch+var_8] mov esi, [esp+0Ch+var_4] add esp, 0Ch locret_46A8BD: ; CODE XREF: KiDispatchInterrupt()+3F↑j retn ; --------------------------------------------------------------------------- loc_46A8BE: ; CODE XREF: KiDispatchInterrupt()+36↑j mov dword ptr [ebx+9ACh], 0 call _KiQuantumEnd@0 ; KiQuantumEnd() or eax, eax jnz short loc_46A877 retn _KiDispatchInterrupt@0 endp
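The tail of KiDispatchInterrupt makes a small decision that is easier to see in C. The sketch below is my reading of the listing and rests on assumptions: the flag at [ebx+9ACh] is treated as a "quantum ended" flag because KiQuantumEnd is called when it is set, [ebx+128h] and [ebx+124h] are NextThread and CurrentThread as in the KPRCB dump, and every name here is hypothetical. The call to KiReadyThread for the old thread and the actual SwapContext are left out.

```c
#include <stddef.h>

typedef struct { int id; } SimThreadObj;

typedef struct {
    SimThreadObj *current_thread;  /* [ebx+124h] */
    SimThreadObj *next_thread;     /* [ebx+128h] */
    int           quantum_end;     /* [ebx+9ACh], assumed meaning */
} SimPrcb;

typedef SimThreadObj *(*SimQuantumEndFn)(SimPrcb *);

/* Stand-in for KiQuantumEnd in this sketch: it just hands back NextThread. */
SimThreadObj *sim_quantum_end(SimPrcb *prcb)
{
    return prcb->next_thread;
}

/* Returns the thread to switch to, or NULL if no switch should happen. */
SimThreadObj *sim_dispatch(SimPrcb *prcb, SimQuantumEndFn on_quantum_end)
{
    SimThreadObj *target;

    if (prcb->quantum_end) {
        prcb->quantum_end = 0;             /* mov dword ptr [ebx+9ACh], 0 */
        target = on_quantum_end(prcb);     /* call _KiQuantumEnd@0 */
    } else {
        target = prcb->next_thread;        /* mov eax, [ebx+128h] */
    }
    if (target == NULL)
        return NULL;                       /* retn without switching */

    prcb->next_thread = NULL;              /* mov dword ptr [ebx+128h], 0 */
    prcb->current_thread = target;         /* mov [ebx+124h], esi */
    return target;
}
```

In the real function, the old thread is then handed to KiReadyThread and SwapContext performs the switch.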
Process attachment
When learning programming, we have all heard that a process can contain multiple threads and must have at least one. The process provides resources to its threads, namely the value of CR3. CR3 stores the base address of the page directory table, so once CR3 is fixed, the memory a thread can access is fixed as well.
Take this line of code: mov eax,dword ptr ds:[0x12345678]. How does the CPU resolve this address? When the CPU translates the linear address, it finds the corresponding physical page through the page directory table, whose base address is stored in the CR3 register. The current CR3 value comes from the current process structure, _KPROCESS.DirectoryTableBase. So what is process attachment? Let's first look at a structure:
kd> dt _KTHREAD ntdll!_KTHREAD ...... +0x032 Saturation : Char +0x033 Priority : Char +0x034 ApcState : _KAPC_STATE +0x04c ContextSwitches : Uint4B +0x050 IdleSwapBlock : UChar ...... kd> dt _KAPC_STATE ntdll!_KAPC_STATE +0x000 ApcListHead : [2] _LIST_ENTRY +0x010 Process : Ptr32 _KPROCESS +0x014 KernelApcInProgress : UChar +0x015 KernelApcPending : UChar +0x016 UserApcPending : UChar
We met the ApcState member when reversing thread switching, that is, while solving question 2. The Process member of this structure points to the process the thread is currently attached to, and that process supplies the CR3 value. As an analogy, the EPROCESS that owns the thread and its DirectoryTableBase are the birth parents, while ApcState.Process is the adoptive parents: when the thread needs resources, it gets them from the adoptive parents. Normally the CR3 value is the one provided by the adoptive parents, but CR3 can also be changed to the DirectoryTableBase of another process that has nothing to do with the current thread. Changing the current CR3 value to that of another process is called "process attachment".
Cross process memory read and write
According to what we learned earlier, reading or writing memory across processes requires switching CR3 first. After the switch, a memory read comes down to assembly like the following:
    mov eax,dword ptr ds:[0x12345678]
    mov dword ptr ds:[0x00401234],eax
Reading or writing a single int across processes this way is easy enough. But what if it is a buffer of a specified length?
We know that the high 2 GB of kernel space is shared by all applications: no matter which process is current, addresses in the high 2 GB resolve to the same content. So I copy the target process's memory into the high 2 GB, switch CR3 back, and then copy from that high 2 GB cache into the specified buffer, and the read is done. A write works the same way, in the opposite direction.
Cross process read
Cross process write
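The buffer-copy idea can be simulated in user mode. In this sketch, which is purely illustrative, each "process" is an array of private bytes, the shared high memory is one small kernel buffer, and the `current` pointer plays the role of CR3. The sizes and all names (`SimAddressSpace`, `SimMachine`, `sim_cross_read`) are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_USER_SIZE   64   /* private bytes per simulated process */
#define SIM_KERNEL_SIZE 32   /* shared staging buffer ("high 2 GB") */

typedef struct { uint8_t user[SIM_USER_SIZE]; } SimAddressSpace;

typedef struct {
    SimAddressSpace *current;          /* plays the role of CR3 */
    uint8_t kernel[SIM_KERNEL_SIZE];   /* visible whatever `current` is */
} SimMachine;

/* Copies len bytes from src_space at src_off to dst_space at dst_off,
 * staging each piece through the shared kernel buffer.
 * Returns 0 on success and -1 if a range is out of bounds. */
int sim_cross_read(SimMachine *m,
                   SimAddressSpace *src_space, size_t src_off,
                   SimAddressSpace *dst_space, size_t dst_off,
                   size_t len)
{
    if (len > SIM_USER_SIZE ||
        src_off > SIM_USER_SIZE - len ||
        dst_off > SIM_USER_SIZE - len)
        return -1;

    SimAddressSpace *saved = m->current;
    while (len > 0) {
        size_t chunk = len < SIM_KERNEL_SIZE ? len : SIM_KERNEL_SIZE;

        m->current = src_space;        /* switch "CR3" to the source process */
        memcpy(m->kernel, m->current->user + src_off, chunk);

        m->current = dst_space;        /* switch "CR3" to the destination */
        memcpy(m->current->user + dst_off, m->kernel, chunk);

        src_off += chunk;
        dst_off += chunk;
        len -= chunk;
    }
    m->current = saved;                /* restore the original address space */
    return 0;
}
```

A cross-process write is the same routine with the source and destination address spaces exchanged.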
Let's briefly analyze NtReadVirtualMemory, the cross-process memory read function implemented by Windows, and NtWriteVirtualMemory, its cross-process write counterpart. The two implementations are very similar, so I will only analyze NtReadVirtualMemory; the analysis of NtWriteVirtualMemory is left to you. The reason this is only a brief analysis is that a full one requires a lot of other prerequisite knowledge, such as APC and memory management. For convenience, I won't repeat how ring 3 enters the kernel. To save space and improve readability, I will use the pseudocode produced by IDA as much as possible. Your pseudocode will probably look different from mine because I have renamed some things.
Analysis of NtReadVirtualMemory
Let's locate the pseudocode of NtReadVirtualMemory first:
NTSTATUS __stdcall NtReadVirtualMemory(HANDLE ProcessHandle, PVOID BaseAddress, PVOID Buffer, SIZE_T NumberOfBytesToRead, PSIZE_T NumberOfBytesRead)
{
  _KTHREAD *v5; // edi
  PSIZE_T v6; // ebx
  int a7; // [esp+10h] [ebp-28h] BYREF
  PRKPROCESS PROCESS; // [esp+14h] [ebp-24h] BYREF
  KPROCESSOR_MODE AccessMode[4]; // [esp+18h] [ebp-20h]
  NTSTATUS res; // [esp+1Ch] [ebp-1Ch]
  CPPEH_RECORD ms_exc; // [esp+20h] [ebp-18h]

  v5 = KeGetCurrentThread();
  AccessMode[0] = v5->PreviousMode;
  if ( AccessMode[0] )
  {
    if ( BaseAddress + NumberOfBytesToRead < BaseAddress
      || Buffer + NumberOfBytesToRead < Buffer
      || BaseAddress + NumberOfBytesToRead > MmHighestUserAddress
      || Buffer + NumberOfBytesToRead > MmHighestUserAddress )
    {
      return 0xC0000005;
    }
    v6 = NumberOfBytesRead;
    if ( NumberOfBytesRead )
    {
      ms_exc.registration.TryLevel = 0;
      if ( NumberOfBytesRead >= MmUserProbeAddress )
        *MmUserProbeAddress = 0;
      *NumberOfBytesRead = *NumberOfBytesRead;
      ms_exc.registration.TryLevel = -1;
    }
  }
  else
  {
    v6 = NumberOfBytesRead;
  }
  a7 = 0;
  res = 0;
  if ( NumberOfBytesToRead )
  {
    res = ObReferenceObjectByHandle(ProcessHandle, 0x10u, PsProcessType, AccessMode[0], &PROCESS, 0);
    if ( !res )
    {
      res = MmCopyVirtualMemory(
              PROCESS,
              BaseAddress,
              v5->ApcState.Process,
              Buffer,
              NumberOfBytesToRead,
              AccessMode[0],
              &a7);
      ObfDereferenceObject(PROCESS);
    }
  }
  if ( v6 )
  {
    *v6 = a7;
    ms_exc.registration.TryLevel = -1;
  }
  return res;
}
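The four comparisons at the top of the function are a range check applied to both the source address and the destination buffer when the caller came from user mode. A minimal sketch of that check for one range, assuming 32-bit addresses; the function name is hypothetical, and the boundary is passed in as a parameter instead of being read from MmHighestUserAddress.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if [base, base + length) neither wraps around the 32-bit
 * address space nor ends above the highest user-mode address. */
bool sim_user_range_ok(uint32_t base, uint32_t length, uint32_t highest_user)
{
    uint32_t end = base + length;  /* unsigned arithmetic wraps modulo 2^32 */
    if (end < base)
        return false;              /* wrapped: base + length overflowed */
    if (end > highest_user)
        return false;              /* would reach into kernel space */
    return true;
}
```

The wrap-around test comes first for a reason: without it, a huge length could make the end address small again and slip past the upper-bound comparison. When either range fails, the real function returns 0xC0000005.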
We can see that the function performs the memory copy through MmCopyVirtualMemory. Let's step into it:
NTSTATUS __stdcall MmCopyVirtualMemory(PRKPROCESS PROCESS, PVOID BaseAddress, PRKPROCESS KPROCESS, char *buffer, SIZE_T Length, KPROCESSOR_MODE AccessMode, int *a7)
{
  PRKPROCESS process; // ebx
  PRKPROCESS kprocess; // ecx
  NTSTATUS res; // esi
  struct _EX_RUNDOWN_REF *RunRefa; // [esp+8h] [ebp+8h]

  if ( !Length )
    return 0;
  process = PROCESS;
  kprocess = PROCESS;
  if ( PROCESS == KeGetCurrentThread()->ApcState.Process )
    kprocess = KPROCESS;
  RunRefa = &kprocess[1].ProfileListHead.Blink;
  if ( !ExAcquireRundownProtection(&kprocess[1].ProfileListHead.Blink) )
    return STATUS_PROCESS_IS_TERMINATING;
  if ( Length <= 0x1FF )
    goto LABEL_10;
  res = MiDoMappedCopy(process, BaseAddress, KPROCESS, buffer, Length, AccessMode, a7);
  if ( res == STATUS_WORKING_SET_QUOTA )
  {
    *a7 = 0;
LABEL_10:
    res = MiDoPoolCopy(process, BaseAddress, KPROCESS, buffer, Length, AccessMode, a7);
  }
  ExReleaseRundownProtection(RunRefa);
  return res;
}
In MmCopyVirtualMemory you may notice an unfamiliar function, ExAcquireRundownProtection. It acquires a kind of lock that is usually translated as rundown protection: while it is held, the protected object (here the process) cannot be torn down. The name sounds strange, but that is all it does.
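To make the idea of rundown protection concrete, here is a simplified model written with C11 atomics. It is only a sketch under my own assumptions: the low bit marks "rundown started" and the remaining bits count holders. The real kernel routine also lets the party doing the rundown wait for the holders to leave, which this model leaves out. All names are hypothetical:

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RUNDOWN_ACTIVE    1u   /* low bit: teardown has started */
#define RUNDOWN_COUNT_INC 2u   /* each holder adds 2            */

typedef struct { atomic_uint count; } rundown_ref;

/* Take a reference; fails once teardown has started. */
bool rundown_acquire(rundown_ref *r)
{
    unsigned old = atomic_load(&r->count);
    for (;;) {
        if (old & RUNDOWN_ACTIVE)
            return false;
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&r->count, &old, old + RUNDOWN_COUNT_INC))
            return true;
    }
}

/* Drop a reference taken by rundown_acquire. */
void rundown_release(rundown_ref *r)
{
    atomic_fetch_sub(&r->count, RUNDOWN_COUNT_INC);
}

/* Start teardown; returns true if no holder was present at that moment. */
bool rundown_begin(rundown_ref *r)
{
    unsigned old = atomic_fetch_or(&r->count, RUNDOWN_ACTIVE);
    return (old & ~RUNDOWN_ACTIVE) == 0;
}
```

This matches how the pseudo code uses it: if the acquire fails, the function returns STATUS_PROCESS_IS_TERMINATING instead of touching the process.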
Rundown protection is not our focus here. Continuing the analysis, we find that reads longer than 0x1FF bytes are carried out by MiDoMappedCopy (shorter ones go straight to MiDoPoolCopy):
NTSTATUS __stdcall MiDoMappedCopy(PRKPROCESS PROCESS, char *src, PRKPROCESS process, char *buffer, SIZE_T Length, KPROCESSOR_MODE AccessMode, int *a7) { // [COLLAPSED LOCAL DECLARATIONS. PRESS KEYPAD CTRL-"+" TO EXPAND] v13 = 0; src_0 = src; buffer_1 = buffer; v7 = 0xE000; if ( Length <= 0xE000 ) v7 = Length; v16 = &MemoryDescriptorList; Length_1 = Length; v19 = v7; v20 = 0; v14 = 0; v15 = 0; while ( Length_1 ) { if ( Length_1 < v19 ) v19 = Length_1; KeStackAttachProcess(PROCESS, &ApcState); BaseAddress = 0; v12 = 0; v11 = 0; ms_exc.registration.TryLevel = 0; if ( src_0 == src && AccessMode ) { v20 = 1; if ( Length && (&src[Length] < src || &src[Length] > MmUserProbeAddress) ) ExRaiseAccessViolation(); v20 = 0; } MemoryDescriptorList.Next = 0; MemoryDescriptorList.Size = 4 * (((src_0 & 0xFFF) + v19 + 0xFFF) >> 12) + 28; MemoryDescriptorList.MdlFlags = 0; MemoryDescriptorList.StartVa = (src_0 & 0xFFFFF000); MemoryDescriptorList.ByteOffset = src_0 & 0xFFF; MemoryDescriptorList.ByteCount = v19; MmProbeAndLockPages(&MemoryDescriptorList, AccessMode, IoReadAccess); v12 = 1; BaseAddress = MmMapLockedPagesSpecifyCache(&MemoryDescriptorList, 0, MmCached, 0u, 0u, 0x20u); if ( !BaseAddress ) { v13 = 1; ExRaiseStatus(STATUS_INSUFFICIENT_RESOURCES); } KeUnstackDetachProcess(&ApcState); KeStackAttachProcess(process, &ApcState); if ( src_0 == src ) { if ( AccessMode ) { v20 = 1; ProbeForWrite(buffer, Length, 1u); v20 = 0; } } v11 = 1; qmemcpy(buffer_1, BaseAddress, v19); ms_exc.registration.TryLevel = -1; KeUnstackDetachProcess(&ApcState); MmUnmapLockedPages(BaseAddress, &MemoryDescriptorList); MmUnlockPages(&MemoryDescriptorList); Length_1 -= v19; src_0 += v19; buffer_1 += v19; } *a7 = Length; return STATUS_SUCCESS; }
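The outer loop of MiDoMappedCopy moves the data in pieces of at most 0xE000 bytes. The sketch below models only that chunking with plain memcpy and leaves out the process attach, the MDL and the exception handling; chunked_copy and MAX_CHUNK are hypothetical names:

```c
#include <stddef.h>
#include <string.h>

#define MAX_CHUNK 0xE000u   /* per-iteration limit seen in the pseudo code */

/* Copies length bytes in pieces of at most MAX_CHUNK bytes and
   returns how many pieces were needed. */
size_t chunked_copy(char *dst, const char *src, size_t length)
{
    size_t chunk = length <= MAX_CHUNK ? length : MAX_CHUNK;
    size_t remaining = length;
    size_t pieces = 0;

    while (remaining) {
        if (remaining < chunk)      /* last, shorter piece */
            chunk = remaining;
        memcpy(dst, src, chunk);    /* stands in for map + qmemcpy + unmap */
        remaining -= chunk;
        src += chunk;
        dst += chunk;
        pieces++;
    }
    return pieces;
}
```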
KeStackAttachProcess and KeUnstackDetachProcess are related to APCs. For now, you can simply think of them as switching CR3 to attach to and detach from a process. Let's focus on the following pseudo code:
MemoryDescriptorList.Next = 0;
MemoryDescriptorList.Size = 4 * (((src_0 & 0xFFF) + v19 + 0xFFF) >> 12) + 28;
MemoryDescriptorList.MdlFlags = 0;
MemoryDescriptorList.StartVa = (src_0 & 0xFFFFF000);
MemoryDescriptorList.ByteOffset = src_0 & 0xFFF;
MemoryDescriptorList.ByteCount = v19;
MmProbeAndLockPages(&MemoryDescriptorList, AccessMode, IoReadAccess);
v12 = 1;
BaseAddress = MmMapLockedPagesSpecifyCache(&MemoryDescriptorList, 0, MmCached, 0u, 0u, 0x20u);
MmMapLockedPagesSpecifyCache maps the physical pages described by an MDL. The following is Microsoft's description of the function:
The MmMapLockedPagesSpecifyCache routine maps the physical pages that are described by an MDL to a virtual address, and enables the caller to specify the cache attribute that is used to create the mapping.
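In other words, the operation above locks the source's physical pages and maps them a second time into the high 2 GB of the address space. That mapping stays valid after attaching to the destination process, so the data can be written directly into the destination buffer, which saves the extra step of staging the content in the high 2 GB and then copying it into the program's address space.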
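The Size, StartVa and ByteOffset arithmetic in the MemoryDescriptorList fragment is easier to follow with concrete numbers. Here is a small sketch, assuming 4 KB pages, a 28-byte MDL header and 4-byte page frame numbers as on 32-bit x86; mdl_layout and describe_mdl are illustrative names, not kernel APIs:

```c
#include <stdint.h>

typedef struct {
    uint32_t start_va;     /* page-aligned start of the buffer       */
    uint32_t byte_offset;  /* offset of the buffer inside first page */
    uint32_t byte_count;   /* length of the buffer                   */
    uint32_t pages;        /* number of pages the buffer touches     */
    uint32_t size;         /* header + one 4-byte entry per page     */
} mdl_layout;

/* Reproduces the arithmetic from the pseudo code for a given buffer. */
mdl_layout describe_mdl(uint32_t va, uint32_t length)
{
    mdl_layout m;
    m.start_va    = va & 0xFFFFF000u;
    m.byte_offset = va & 0xFFFu;
    m.byte_count  = length;
    m.pages       = (m.byte_offset + length + 0xFFFu) >> 12;
    m.size        = 4u * m.pages + 28u;
    return m;
}
```

For example, a 0x20-byte buffer at 0x00401FF0 straddles a page boundary, so it needs two page entries and Size becomes 36.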
Analysis of process creation
As in the analysis above, I use pseudo code here as well. The purpose of analyzing process creation is to understand how the kernel creates a process; the finer details are left for you to dig into. We start from the kernel function NtCreateProcess:
NTSTATUS __stdcall NtCreateProcess(PHANDLE ProcessHandle, ACCESS_MASK DesiredAccess, POBJECT_ATTRIBUTES ObjectAttributes, HANDLE ParentProcess, BOOLEAN InheritObjectTable, HANDLE SectionHandle, HANDLE DebugPort, HANDLE ExceptionPort) { ULONG v8; // eax v8 = (SectionHandle & 1) != 0; if ( (DebugPort & 1) != 0 ) v8 |= 2u; if ( InheritObjectTable ) v8 |= 4u; return NtCreateProcessEx( ProcessHandle, DesiredAccess, ObjectAttributes, ParentProcess, v8, SectionHandle, DebugPort, ExceptionPort, 0); }
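NtCreateProcess folds three of its arguments into the Flags value it passes on, and PspCreateProcess later rejects any value with bits set outside the low four. A minimal sketch of that packing follows; the function names and the FLAG_* labels are my own descriptive labels, inferred from how the bits are used in the pseudo code, and not official constants:

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative labels for the three bits built by NtCreateProcess. */
enum {
    FLAG_BREAKAWAY        = 1u,  /* low bit of SectionHandle was set */
    FLAG_NO_DEBUG_INHERIT = 2u,  /* low bit of DebugPort was set     */
    FLAG_INHERIT_HANDLES  = 4u   /* InheritObjectTable was TRUE      */
};

uint32_t pack_create_flags(uintptr_t section_handle,
                           uintptr_t debug_port,
                           bool inherit_object_table)
{
    uint32_t flags = (section_handle & 1) ? FLAG_BREAKAWAY : 0;

    if (debug_port & 1)
        flags |= FLAG_NO_DEBUG_INHERIT;
    if (inherit_object_table)
        flags |= FLAG_INHERIT_HANDLES;
    return flags;
}

/* Mirrors the (Flags & 0xFFFFFFF0) != 0 rejection in PspCreateProcess. */
bool create_flags_valid(uint32_t flags)
{
    return (flags & 0xFFFFFFF0u) == 0;
}
```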
NtCreateProcess calls NtCreateProcessEx to do the work. Let's step into it:
NTSTATUS __stdcall NtCreateProcessEx(PHANDLE ProcessHandle, ACCESS_MASK DesiredAccess, POBJECT_ATTRIBUTES ObjectAttributes, HANDLE ParentProcess, ULONG Flags, HANDLE SectionHandle, HANDLE DebugPort, HANDLE ExceptionPort, BOOLEAN InJob) { PHANDLE v9; // ecx NTSTATUS result; // eax if ( KeGetCurrentThread()->PreviousMode ) { v9 = ProcessHandle; if ( ProcessHandle >= MmUserProbeAddress ) *MmUserProbeAddress = 0; *ProcessHandle = *ProcessHandle; } else { v9 = ProcessHandle; } if ( ParentProcess ) result = PspCreateProcess( v9, DesiredAccess, ObjectAttributes, ParentProcess, Flags, SectionHandle, DebugPort, ExceptionPort, InJob); else result = STATUS_INVALID_PARAMETER; return result; }
This function calls PspCreateProcess to do the actual work of creating the process. Let's keep going:
NTSTATUS __stdcall PspCreateProcess(PHANDLE ProcessHandle, ACCESS_MASK DesiredAccess, POBJECT_ATTRIBUTES ObjectAttributes, HANDLE ParentProcess, ULONG Flags, HANDLE SectionHandle, HANDLE DebugPort, HANDLE ExceptionPort, BOOLEAN InJob) { // [COLLAPSED LOCAL DECLARATIONS. PRESS KEYPAD CTRL-"+" TO EXPAND] v41 = KeGetCurrentThread(); AccessMode[0] = v41->PreviousMode; process = v41->ApcState.Process; v64 = 0; v55[0] = 0; v55[1] = 0; if ( (Flags & 0xFFFFFFF0) != 0 ) return STATUS_INVALID_PARAMETER; if ( ParentProcess ) { result = ObReferenceObjectByHandle(ParentProcess, 0x80u, PsProcessType, AccessMode[0], &eprocess, 0); ParentProcess_1 = eprocess; if ( result < 0 ) return result; if ( InJob && !eprocess[2].Affinity ) { ObfDereferenceObject(eprocess); return STATUS_INVALID_PARAMETER; } ActiveProcessors = eprocess->Affinity; } else { ParentProcess_1 = 0; ActiveProcessors = KeActiveProcessors; } v48 = ActiveProcessors; *NewIrql = PsMinimumWorkingSet; v47 = PsMaximumWorkingSet; v11 = ObCreateObject(AccessMode[0], PsProcessType, ObjectAttributes, *AccessMode, 0, 608, 0, 0, &Process); if ( v11 < 0 ) goto LABEL_97; NewProcess = Process; memset(Process, 0, 0x260u); NewProcess[1].ProfileListHead.Blink = 0; *&NewProcess[1].Header.Type = 0; NewProcess[3].ThreadListHead.Flink = &NewProcess[3].VdmTrapcHandler; NewProcess[3].VdmTrapcHandler = &NewProcess[3].VdmTrapcHandler; PspInheritQuota(NewProcess, ParentProcess_1); ObInheritDeviceMap(NewProcess, ParentProcess_1); v13 = ParentProcess_1; if ( ParentProcess_1 ) { *&NewProcess[3].AutoAlignment = *&ParentProcess_1[3].AutoAlignment; NewProcess[3].Header.WaitListHead.Flink = v13[1].DirectoryTableBase[0]; } else { *&NewProcess[3].AutoAlignment = 1; NewProcess[3].Header.WaitListHead.Flink = 0; } if ( SectionHandle ) { v14 = ObReferenceObjectByHandle(SectionHandle, 8u, MmSectionObjectType, AccessMode[0], &v40, 0); v61 = v40; v11 = v14; if ( v14 < 0 ) goto LABEL_96; v13 = ParentProcess_1; } else { v61 = 0; if ( 
v13 != PsInitialSystemProcess ) { if ( ExAcquireRundownProtection(&v13[1].ProfileListHead.Blink) ) { v15 = *&v13[2].StackCount; v61 = v15; if ( v15 ) ObfReferenceObject(v15); ExReleaseRundownProtection(&v13[1].ProfileListHead.Blink); } if ( !v61 ) { v11 = STATUS_PROCESS_IS_TERMINATING; goto LABEL_96; } } } *&NewProcess[2].StackCount = v61; if ( DebugPort ) { v11 = ObReferenceObjectByHandle(DebugPort, 2u, DbgkDebugObjectType, AccessMode[0], &v44, 0); if ( v11 < 0 ) goto LABEL_96; NewProcess[1].ThreadListHead.Flink = v44; if ( (Flags & 2) != 0 ) _InterlockedOr(&NewProcess[5].Int21Descriptor.Access, 2u); } else if ( v13 ) { DbgkCopyProcessDebugPort(NewProcess, v13); } if ( ExceptionPort ) { v11 = ObReferenceObjectByHandle(ExceptionPort, 0, LpcPortObjectType, AccessMode[0], &v45, 0); if ( v11 < 0 ) goto LABEL_96; NewProcess[1].ThreadListHead.Blink = v45; } *&NewProcess[5].IopmOffset = 259; v11 = PspInitializeProcessSecurity(ParentProcess_1, NewProcess); if ( v11 < 0 ) goto LABEL_96; v16 = ParentProcess_1; if ( ParentProcess_1 ) { if ( !MmCreateProcessAddressSpace(NewIrql[0], NewProcess, v55) ) goto LABEL_59; } else { NewProcess[1].ProcessLock = process[49].Count; MmInitializeHandBuiltProcess(NewProcess, v55); } _InterlockedOr(&NewProcess[5].Int21Descriptor.Access, 0x40000u); *&NewProcess[4].AutoAlignment = v47; KeInitializeProcess(NewProcess, 8, v48, v55, NewProcess[3].AutoAlignment & 4); NewProcess->ThreadQuantum = PspForegroundQuantum; LOBYTE(NewProcess[5].KernelTime) = 2; if ( v16 ) { v17 = v16[5].KernelTime; if ( v17 == 1 || v17 == 5 ) LOBYTE(NewProcess[5].KernelTime) = v17; v18 = ObInitProcess((Flags & 4) != 0 ? 
ParentProcess_1 : 0, NewProcess); } else { v18 = MmInitializeHandBuiltProcess2(NewProcess); } v11 = v18; if ( v18 < 0 ) goto LABEL_96; v58 = 0; if ( SectionHandle ) { v19 = MmInitializeProcessAddressSpace(NewProcess, 0, v61, &NewProcess[4].ReadyListHead.Blink); v11 = v19; if ( v19 < 0 ) goto LABEL_96; v58 = v19; v11 = PspMapSystemDll(NewProcess, 0); if ( v11 < 0 ) goto LABEL_96; v64 = 1; goto LABEL_58; } v20 = ParentProcess_1; if ( !ParentProcess_1 ) goto LABEL_58; if ( ParentProcess_1 == PsInitialSystemProcess ) { v11 = MmInitializeProcessAddressSpace(NewProcess, 0, 0, 0); if ( v11 >= 0 ) { v24 = ExAllocatePoolWithTag(PagedPool, 8u, 0x61506553u); NewProcess[4].ReadyListHead.Blink = v24; if ( v24 ) { v24->Flink = 0; v24->Blink = 0; goto LABEL_58; } goto LABEL_59; } LABEL_96: ObfDereferenceObject(NewProcess); goto LABEL_97; } *&NewProcess[2].AutoAlignment = *&ParentProcess_1[2].AutoAlignment; v11 = MmInitializeProcessAddressSpace(NewProcess, v20, 0, 0); v64 = 1; if ( v11 < 0 ) goto LABEL_96; v21 = ParentProcess_1[4].ReadyListHead.Blink; if ( v21 ) { v22 = HIWORD(v21->Flink) + 8; v23 = ExAllocatePoolWithTag(PagedPool, v22, 0x61506553u); NewProcess[4].ReadyListHead.Blink = v23; if ( v23 ) { qmemcpy(v23, ParentProcess_1[4].ReadyListHead.Blink, v22); NewProcess[4].ReadyListHead.Blink->Blink = NewProcess[4].ReadyListHead.Blink + 1; goto LABEL_58; } LABEL_59: v11 = STATUS_INSUFFICIENT_RESOURCES; goto LABEL_96; } LABEL_58: v25 = (NewProcess[1].Affinity & 0xFFFFFFF8); v26 = MmGetSessionId(NewProcess); SeSetSessionIdToken(v25, v26); v46[0] = NewProcess; v46[1] = 0; v27 = ExCreateHandle(PspCidTable, v46); NewProcess[1].DirectoryTableBase[0] = v27; if ( !v27 ) goto LABEL_59; *(NewProcess[1].ProcessLock + 8) = v27; if ( SeDetailedAuditingWithToken(0) ) SeAuditProcessCreation(NewProcess); if ( ParentProcess_1 ) { v28 = ParentProcess_1[2].Affinity; if ( v28 ) { if ( (v28[38] & 0x1000) == 0 ) { if ( (Flags & 1) != 0 ) { v11 = (v28[38] & 0x800) != 0 ? 
0 : STATUS_ACCESS_DENIED; } else { v11 = PspGetJobFromSet(v28, InJob, &NewProcess[2].Affinity); if ( v11 < 0 ) goto LABEL_96; v43 = NewProcess[2].Affinity; v11 = PspAddProcessToJob(v43, NewProcess); v29 = v43[9].Header.SignalState; if ( v29 ) { v11 = SeSubProcessToken(v29, &v50, 0); if ( v11 < 0 ) goto LABEL_96; SeAssignPrimaryToken(NewProcess, v50); ObfDereferenceObject(v50); } } if ( v11 < 0 ) goto LABEL_96; } } } if ( ParentProcess_1 && v64 ) { BaseAddress[0] = 0; BaseAddress[1] = -1; if ( SectionHandle ) { v11 = MmCreatePeb(NewProcess, BaseAddress, &NewProcess[4]); if ( v11 < 0 ) { *&NewProcess[4].Header.Type = 0; goto LABEL_96; } } else { LOBYTE(BaseAddress[0]) = 1; v30 = *&ParentProcess_1[4].Header.Type; *&NewProcess[4].Header.Type = v30; MmCopyVirtualMemory(process, BaseAddress, NewProcess, v30, 8u, 0, &a7a); } } v31 = v41; --v41->KernelApcDisable; ExAcquireFastMutexUnsafe(&PspActiveProcessMutex); v32 = dword_48315C; NewProcess[1].DirectoryTableBase[1] = &PsActiveProcessHead; *&NewProcess[1].LdtDescriptor.LimitLow = v32; *v32 = NewProcess + 136; dword_48315C = &NewProcess[1].DirectoryTableBase[1]; ExReleaseFastMutexUnsafe(&PspActiveProcessMutex); v34 = (*(v31 + 212))++ == -1; if ( v34 && *(v31 + 52) != v31 + 52 ) { *(v31 + 73) = 1; LOBYTE(v33) = 1; HalRequestSoftwareInterrupt(v33); } if ( !ParentProcess_1 || (v35 = PsInitialSystemProcess, ParentProcess_1 != PsInitialSystemProcess) ) v35 = *(v31 + 68); v11 = SeCreateAccessStateEx(0, v35, &PassedAccessState, v37, DesiredAccess, (PsProcessType + 26)); if ( v11 < 0 ) goto LABEL_96; v11 = ObInsertObject(NewProcess, &PassedAccessState, DesiredAccess, 1u, 0, &Handle); v53 = v11; SeDeleteAccessState(&PassedAccessState); if ( v11 >= 0 ) { *&NewProcess[3].StackCount = 1; PsSetProcessPriorityByClass(NewProcess, 0); if ( !ParentProcess_1 || ParentProcess_1 == PsInitialSystemProcess ) { *&NewProcess[3].StackCount = 2035711; } else { v11 = ObGetObjectSecurity(NewProcess, &SecurityDescriptor, MemoryAllocated); v53 = v11; 
if ( v11 < 0 ) { ObCloseHandle(Handle, AccessMode[0]); goto LABEL_96; } SubjectSecurityContext.ProcessAuditId = NewProcess; SubjectSecurityContext.PrimaryToken = PsReferencePrimaryToken(NewProcess); SubjectSecurityContext.ClientToken = 0; v62 = SeAccessCheck( SecurityDescriptor, &SubjectSecurityContext, 0, 0x2000000u, 0, 0, (PsProcessType + 26), AccessMode[0], &NewProcess[3].StackCount, &AccessStatus); ObFastDereferenceObject(&NewProcess[1].Affinity, SubjectSecurityContext.PrimaryToken); ObReleaseObjectSecurity(SecurityDescriptor, MemoryAllocated[0]); if ( !v62 ) *&NewProcess[3].StackCount = 0; *&NewProcess[3].StackCount |= 0x1F07FBu; } KeQuerySystemTime(&NewProcess[1].Header.SignalState); *ProcessHandle = Handle; ms_exc.registration.TryLevel = -1; if ( v58 ) v11 = v58; goto LABEL_96; } LABEL_97: if ( ParentProcess_1 ) ObfDereferenceObject(ParentProcess_1); return v11; }
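One step in PspCreateProcess worth isolating is the insertion of the new process at the tail of the list headed by PsActiveProcessHead, done while holding PspActiveProcessMutex. The sketch below shows the same tail insertion on a circular doubly linked list in user-mode C; the type and function names are illustrative, and locking is left out:

```c
#include <stddef.h>

typedef struct list_entry {
    struct list_entry *flink;   /* next     */
    struct list_entry *blink;   /* previous */
} list_entry;

/* Stand-in for a process object with an embedded list link. */
typedef struct {
    unsigned   pid;
    list_entry active_links;
} model_process;

/* Recover the containing object from a pointer to its embedded link. */
#define RECORD_OF(entry, type, field) \
    ((type *)((char *)(entry) - offsetof(type, field)))

void list_init(list_entry *head)
{
    head->flink = head->blink = head;
}

/* Same four pointer writes as in the pseudo code. */
void list_insert_tail(list_entry *head, list_entry *entry)
{
    list_entry *last = head->blink;

    entry->flink = head;
    entry->blink = last;
    last->flink  = entry;
    head->blink  = entry;
}

/* Walk the list and collect up to max pids; returns how many were stored. */
size_t list_collect_pids(list_entry *head, unsigned *out, size_t max)
{
    size_t n = 0;
    for (list_entry *e = head->flink; e != head && n < max; e = e->flink)
        out[n++] = RECORD_OF(e, model_process, active_links)->pid;
    return n;
}
```

Walking this list from the head is how all processes can be enumerated, which is why the link is embedded in the process object itself.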
We all know that creating a process also creates a thread, called the main thread. Everything above only does things like initializing the process structure and creating the PEB. So where is the thread created? It is created by a separate call made from ring 3. Let's see what the ring 3 side looks like:
BOOL __stdcall CreateProcessW(LPCWSTR lpApplicationName, LPWSTR lpCommandLine, LPSECURITY_ATTRIBUTES lpProcessAttributes, LPSECURITY_ATTRIBUTES lpThreadAttributes, BOOL bInheritHandles, DWORD dwCreationFlags, LPVOID lpEnvironment, LPCWSTR lpCurrentDirectory, LPSTARTUPINFOW lpStartupInfo, LPPROCESS_INFORMATION lpProcessInformation) { return CreateProcessInternalW( 0, lpApplicationName, lpCommandLine, lpProcessAttributes, lpThreadAttributes, bInheritHandles, dwCreationFlags, lpEnvironment, lpCurrentDirectory, lpStartupInfo, lpProcessInformation, 0); }
This function simply forwards to CreateProcessInternalW. Stepping into it, we find that the thread is created there. To save space, only the relevant fragment is shown:
BaseInitializeContext(&ThreadContext, v157, SectionInformation, UserStack.StackBase, 0); v161 = BaseFormatObjectAttributes(&ObjectAttributes, v141, 0); if ( v184 && v160 && v141 ) { DirectoryInfo.CurDirRef = v141->nLength; v200 = v141->lpSecurityDescriptor; v201 = v141->bInheritHandle; v200 = 0; v161 = BaseFormatObjectAttributes(&ObjectAttributes, &DirectoryInfo.CurDirRef, 0); } v20 = NtCreateThread( &ThreadHandle, 0x1F03FFu, v161, ProcessHandle, &ClientId, &ThreadContext, &UserStack, 1u);
Process end analysis
There is not much to analyze in the termination function, so only the IDA pseudo code is given:
NTSTATUS __stdcall NtTerminateProcess(HANDLE ProcessHandle, NTSTATUS ExitStatus) { // [COLLAPSED LOCAL DECLARATIONS. PRESS KEYPAD CTRL-"+" TO EXPAND] v2 = KeGetCurrentThread(); v3 = v2; v8 = v2->ApcState.Process; if ( ProcessHandle ) { v11 = 1; } else { ProcessHandle = -1; v11 = 0; } LOBYTE(AccessMode) = v2->PreviousMode; result = ObReferenceObjectByHandle(ProcessHandle, 1u, PsProcessType, AccessMode, &AccessMode, 0); v5 = AccessMode; v6 = AccessMode; if ( result >= 0 ) { ProcessHandlea = &AccessMode[146]; if ( (AccessMode[146].Count & 0x2000) != 0 ) PspCatchCriticalBreak("Terminating critical process 0x%p (%s)\n", AccessMode, &AccessMode[93]); RunRef = v5 + 32; if ( ExAcquireRundownProtection(v5 + 32) ) { if ( v11 ) _InterlockedOr(ProcessHandlea, 8u); ProcessHandleb = 290; v7 = PsGetNextProcessThread(v6, 0); if ( v7 ) { ProcessHandleb = 0; do { if ( v7 != v3 ) PspTerminateThreadByPointer(v7, ExitStatus); v7 = PsGetNextProcessThread(v6, v7); } while ( v7 ); } ExReleaseRundownProtection(RunRef); if ( v6 == v8 ) { if ( v11 ) { ObfDereferenceObject(v6); PspTerminateThreadByPointer(v3, ExitStatus); } } else if ( ExitStatus == DBG_TERMINATE_PROCESS ) { DbgkClearProcessDebugObject(v6, 0); } if ( ProcessHandleb == 290 || v6[1].ThreadListHead.Flink && v11 ) { ObClearProcessHandleTable(v6); ProcessHandleb = 0; } ObfDereferenceObject(v6); result = ProcessHandleb; } else { ObfDereferenceObject(v5); result = STATUS_PROCESS_IS_TERMINATING; } } return result; }
In short, ending a process means killing all of its threads and then deleting the records that belong to the process (such as its handle table); after that, the process is gone.
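The order in which NtTerminateProcess kills threads matters: every other thread goes first, and the calling thread, if it belongs to the target process, goes last. A toy model of that ordering is sketched below; model_thread and terminate_all are hypothetical names, and "terminating" a thread is reduced to setting a flag:

```c
#include <stddef.h>
#include <stdbool.h>

typedef struct {
    int  id;
    bool terminated;
} model_thread;

/* Terminates all threads except the caller first, then the caller.
   Pass self == count when the caller is not part of this process.
   The ids are written to order in termination order; returns how many. */
size_t terminate_all(model_thread *threads, size_t count, size_t self, int *order)
{
    size_t n = 0;

    for (size_t i = 0; i < count; i++) {
        if (i == self)
            continue;               /* skip the caller for now */
        threads[i].terminated = true;
        order[n++] = threads[i].id;
    }
    if (self < count) {             /* the caller ends itself last */
        threads[self].terminated = true;
        order[n++] = threads[self].id;
    }
    return n;
}
```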
Process thread structure extension
When we introduced the process and thread structures, we mainly covered the key members. Some of the remaining members are still worth knowing when reading other material, and even the less useful ones are recorded here for reference. This supplement is for understanding only:
KPROCESS
Its structure is as follows:
kd> dt _KPROCESS ntdll!_KPROCESS +0x000 Header : _DISPATCHER_HEADER +0x010 ProfileListHead : _LIST_ENTRY +0x018 DirectoryTableBase : [2] Uint4B +0x020 LdtDescriptor : _KGDTENTRY +0x028 Int21Descriptor : _KIDTENTRY +0x030 IopmOffset : Uint2B +0x032 Iopl : UChar +0x033 Unused : UChar +0x034 ActiveProcessors : Uint4B +0x038 KernelTime : Uint4B +0x03c UserTime : Uint4B +0x040 ReadyListHead : _LIST_ENTRY +0x048 SwapListEntry : _SINGLE_LIST_ENTRY +0x04c VdmTrapcHandler : Ptr32 Void +0x050 ThreadListHead : _LIST_ENTRY +0x058 ProcessLock : Uint4B +0x05c Affinity : Uint4B +0x060 StackCount : Uint2B +0x062 BasePriority : Char +0x063 ThreadQuantum : Char +0x064 AutoAlignment : UChar +0x065 State : UChar +0x066 ThreadSeed : UChar +0x067 DisableBoost : UChar +0x068 PowerState : UChar +0x069 DisableQuantum : UChar +0x06a IdealNode : UChar +0x06b Flags : _KEXECUTE_OPTIONS +0x06b ExecuteOptions : UChar
ProfileListHead
Related to performance profiling. The operating system normally maintains it by itself, as shown in the figure below, and it is of little use to us:
ActiveProcessors
Indicates which cores the process is currently active on.
ReadyListHead
Linked list of the threads of this process that are in the ready state.
SwapListEntry
Entry in the linked list of processes swapped out to disk. When the process is swapped out, it is linked in here.
ThreadListHead
Linked list of all threads that belong to the current process.
ProcessLock
The process lock. It is used for synchronization, to prevent simultaneous modification, and is meant for the operating system itself.
ThreadQuantum
The default time slice (quantum) of the threads in the process.
State
Indicates whether the process is in memory or swapped out to disk.
ThreadSeed
Indicates the ideal (preferred) core, used together with Affinity.
EPROCESS
Its structure is as follows:
kd> dt _EPROCESS ntdll!_EPROCESS +0x000 Pcb : _KPROCESS +0x06c ProcessLock : _EX_PUSH_LOCK +0x070 CreateTime : _LARGE_INTEGER +0x078 ExitTime : _LARGE_INTEGER +0x080 RundownProtect : _EX_RUNDOWN_REF +0x084 UniqueProcessId : Ptr32 Void +0x088 ActiveProcessLinks : _LIST_ENTRY +0x090 QuotaUsage : [3] Uint4B +0x09c QuotaPeak : [3] Uint4B +0x0a8 CommitCharge : Uint4B +0x0ac PeakVirtualSize : Uint4B +0x0b0 VirtualSize : Uint4B +0x0b4 SessionProcessLinks : _LIST_ENTRY +0x0bc DebugPort : Ptr32 Void +0x0c0 ExceptionPort : Ptr32 Void +0x0c4 ObjectTable : Ptr32 _HANDLE_TABLE +0x0c8 Token : _EX_FAST_REF +0x0cc WorkingSetLock : _FAST_MUTEX +0x0ec WorkingSetPage : Uint4B +0x0f0 AddressCreationLock : _FAST_MUTEX +0x110 HyperSpaceLock : Uint4B +0x114 ForkInProgress : Ptr32 _ETHREAD +0x118 HardwareTrigger : Uint4B +0x11c VadRoot : Ptr32 Void +0x120 VadHint : Ptr32 Void +0x124 CloneRoot : Ptr32 Void +0x128 NumberOfPrivatePages : Uint4B +0x12c NumberOfLockedPages : Uint4B +0x130 Win32Process : Ptr32 Void +0x134 Job : Ptr32 _EJOB +0x138 SectionObject : Ptr32 Void +0x13c SectionBaseAddress : Ptr32 Void +0x140 QuotaBlock : Ptr32 _EPROCESS_QUOTA_BLOCK +0x144 WorkingSetWatch : Ptr32 _PAGEFAULT_HISTORY +0x148 Win32WindowStation : Ptr32 Void +0x14c InheritedFromUniqueProcessId : Ptr32 Void +0x150 LdtInformation : Ptr32 Void +0x154 VadFreeHint : Ptr32 Void +0x158 VdmObjects : Ptr32 Void +0x15c DeviceMap : Ptr32 Void +0x160 PhysicalVadList : _LIST_ENTRY +0x168 PageDirectoryPte : _HARDWARE_PTE_X86 +0x168 Filler : Uint8B +0x170 Session : Ptr32 Void +0x174 ImageFileName : [16] UChar +0x184 JobLinks : _LIST_ENTRY +0x18c LockedPagesList : Ptr32 Void +0x190 ThreadListHead : _LIST_ENTRY +0x198 SecurityPort : Ptr32 Void +0x19c PaeTop : Ptr32 Void +0x1a0 ActiveThreads : Uint4B +0x1a4 GrantedAccess : Uint4B +0x1a8 DefaultHardErrorProcessing : Uint4B +0x1ac LastThreadExitStatus : Int4B +0x1b0 Peb : Ptr32 _PEB +0x1b4 PrefetchTrace : _EX_FAST_REF +0x1b8 ReadOperationCount : _LARGE_INTEGER +0x1c0 
WriteOperationCount : _LARGE_INTEGER +0x1c8 OtherOperationCount : _LARGE_INTEGER +0x1d0 ReadTransferCount : _LARGE_INTEGER +0x1d8 WriteTransferCount : _LARGE_INTEGER +0x1e0 OtherTransferCount : _LARGE_INTEGER +0x1e8 CommitChargeLimit : Uint4B +0x1ec CommitChargePeak : Uint4B +0x1f0 AweInfo : Ptr32 Void +0x1f4 SeAuditProcessCreationInfo : _SE_AUDIT_PROCESS_CREATION_INFO +0x1f8 Vm : _MMSUPPORT +0x238 LastFaultCount : Uint4B +0x23c ModifiedPageCount : Uint4B +0x240 NumberOfVads : Uint4B +0x244 JobStatus : Uint4B +0x248 Flags : Uint4B +0x248 CreateReported : Pos 0, 1 Bit +0x248 NoDebugInherit : Pos 1, 1 Bit +0x248 ProcessExiting : Pos 2, 1 Bit +0x248 ProcessDelete : Pos 3, 1 Bit +0x248 Wow64SplitPages : Pos 4, 1 Bit +0x248 VmDeleted : Pos 5, 1 Bit +0x248 OutswapEnabled : Pos 6, 1 Bit +0x248 Outswapped : Pos 7, 1 Bit +0x248 ForkFailed : Pos 8, 1 Bit +0x248 HasPhysicalVad : Pos 9, 1 Bit +0x248 AddressSpaceInitialized : Pos 10, 2 Bits +0x248 SetTimerResolution : Pos 12, 1 Bit +0x248 BreakOnTermination : Pos 13, 1 Bit +0x248 SessionCreationUnderway : Pos 14, 1 Bit +0x248 WriteWatch : Pos 15, 1 Bit +0x248 ProcessInSession : Pos 16, 1 Bit +0x248 OverrideAddressSpace : Pos 17, 1 Bit +0x248 HasAddressSpace : Pos 18, 1 Bit +0x248 LaunchPrefetched : Pos 19, 1 Bit +0x248 InjectInpageErrors : Pos 20, 1 Bit +0x248 VmTopDown : Pos 21, 1 Bit +0x248 Unused3 : Pos 22, 1 Bit +0x248 Unused4 : Pos 23, 1 Bit +0x248 VdmAllowed : Pos 24, 1 Bit +0x248 Unused : Pos 25, 5 Bits +0x248 Unused1 : Pos 30, 1 Bit +0x248 Unused2 : Pos 31, 1 Bit +0x24c ExitStatus : Int4B +0x250 NextPageColor : Uint2B +0x252 SubSystemMinorVersion : UChar +0x253 SubSystemMajorVersion : UChar +0x252 SubSystemVersion : Uint2B +0x254 PriorityClass : UChar +0x255 WorkingSetAcquiredUnsafe : UChar +0x258 Cookie : Uint4B
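The pseudo code earlier contains odd expressions such as kprocess[1].ProfileListHead.Blink. This apparently happens because the decompiler typed an _EPROCESS pointer as _KPROCESS, whose size is 0x6C, so index 1 means "add 0x6C". The sketch below converts such an expression into an _EPROCESS offset and looks up the name in a small table copied from the dt output above; the helper names are hypothetical:

```c
#include <stdint.h>
#include <stddef.h>

/* In the dt output, ProcessLock follows Pcb at +0x06c, so _KPROCESS is 0x6C bytes. */
#define SIZEOF_KPROCESS 0x6Cu

typedef struct {
    uint32_t    offset;
    const char *name;
} field_name;

/* A few _EPROCESS members, offsets taken from the dt output. */
static const field_name eprocess_fields[] = {
    { 0x080, "RundownProtect" },
    { 0x084, "UniqueProcessId" },
    { 0x088, "ActiveProcessLinks" },
    { 0x134, "Job" },
    { 0x138, "SectionObject" },
    { 0x1b0, "Peb" },
    { 0x248, "Flags" },
};

/* Offset meant by ptr[index].member, where member is a _KPROCESS field. */
uint32_t eprocess_offset(uint32_t index, uint32_t kprocess_member_offset)
{
    return index * SIZEOF_KPROCESS + kprocess_member_offset;
}

/* Returns the member name at that offset, or NULL if it is not in the table. */
const char *eprocess_field_at(uint32_t offset)
{
    size_t count = sizeof eprocess_fields / sizeof eprocess_fields[0];
    for (size_t i = 0; i < count; i++)
        if (eprocess_fields[i].offset == offset)
            return eprocess_fields[i].name;
    return NULL;
}
```

For instance, ProfileListHead.Blink sits at 0x14 inside _KPROCESS, so kprocess[1].ProfileListHead.Blink is offset 0x80, which is RundownProtect. Likewise eprocess[2].Affinity (0xD8 + 0x5C) is Job.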
RundownProtect
Rundown protection for the process. While someone holds it, the process cannot be torn down, which prevents others from killing the process underneath them.
SessionProcessLinks
Related to the session subsystem the process belongs to.
Token
The token of the process; security related.
InheritedFromUniqueProcessId
The PID of the process that created this one. This is quite useful: it lets you find the parent process.
SeAuditProcessCreationInfo
The full path of the process image can be obtained through this member. Let's take an example:
kd> dt _EPROCESS 89a9b648
ntdll!_EPROCESS
   ......
   +0x1f0 AweInfo          : (null)
   +0x1f4 SeAuditProcessCreationInfo : _SE_AUDIT_PROCESS_CREATION_INFO
   +0x1f8 Vm               : _MMSUPPORT
   ......
kd> dx -id 0,0,805539a0 -r1 (*((ntdll!_SE_AUDIT_PROCESS_CREATION_INFO *)0x89a9b83c))
(*((ntdll!_SE_AUDIT_PROCESS_CREATION_INFO *)0x89a9b83c)) [Type: _SE_AUDIT_PROCESS_CREATION_INFO]
    [+0x000] ImageFileName : 0x89cc8a18 [Type: _OBJECT_NAME_INFORMATION *]
kd> dx -id 0,0,805539a0 -r1 ((ntdll!_OBJECT_NAME_INFORMATION *)0x89cc8a18)
((ntdll!_OBJECT_NAME_INFORMATION *)0x89cc8a18) : 0x89cc8a18 [Type: _OBJECT_NAME_INFORMATION *]
    [+0x000] Name : "\Device\HarddiskVolume1\Program Files\PalmInput\Extensions\Guard\2.6.0.49\PalmInputGuard.exe" [Type: _UNICODE_STRING]
Flags
Indicates the status of the process. This member is quite useful.
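The constants that the earlier pseudo code ORs into or tests against this member line up with the bit positions in the dt _EPROCESS output: 8 is ProcessDelete (Pos 3), 0x2000 is BreakOnTermination (Pos 13) and 0x40000 is HasAddressSpace (Pos 18). A small sketch of decoding the field follows; the PF_* names and helper functions are my own labels:

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit masks derived from the "Pos N" values in the dt _EPROCESS output. */
enum {
    PF_PROCESS_EXITING      = 1u << 2,
    PF_PROCESS_DELETE       = 1u << 3,
    PF_BREAK_ON_TERMINATION = 1u << 13,
    PF_HAS_ADDRESS_SPACE    = 1u << 18
};

/* AddressSpaceInitialized is a 2-bit field starting at bit 10. */
unsigned address_space_initialized(uint32_t flags)
{
    return (flags >> 10) & 0x3u;
}

/* A process with BreakOnTermination set is treated as critical. */
bool is_critical_process(uint32_t flags)
{
    return (flags & PF_BREAK_ON_TERMINATION) != 0;
}
```

This is why NtTerminateProcess checks the value against 0x2000 before printing "Terminating critical process".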
SubSystemMinorVersion / SubSystemMajorVersion
Indicates the subsystem version the image requires, as recorded in the PE file header.
ExitStatus
process exit status.
ReadOperationCount
number of times ReadFile was called.
WriteOperationCount
number of times WriteFile was called.
OtherOperationCount
Number of calls to other I/O-related APIs, apart from reading and writing files.
KTHREAD
Its structure is as follows:
kd> dt _KTHREAD ntdll!_KTHREAD +0x000 Header : _DISPATCHER_HEADER +0x010 MutantListHead : _LIST_ENTRY +0x018 InitialStack : Ptr32 Void +0x01c StackLimit : Ptr32 Void +0x020 Teb : Ptr32 Void +0x024 TlsArray : Ptr32 Void +0x028 KernelStack : Ptr32 Void +0x02c DebugActive : UChar +0x02d State : UChar +0x02e Alerted : [2] UChar +0x030 Iopl : UChar +0x031 NpxState : UChar +0x032 Saturation : Char +0x033 Priority : Char +0x034 ApcState : _KAPC_STATE +0x04c ContextSwitches : Uint4B +0x050 IdleSwapBlock : UChar +0x051 Spare0 : [3] UChar +0x054 WaitStatus : Int4B +0x058 WaitIrql : UChar +0x059 WaitMode : Char +0x05a WaitNext : UChar +0x05b WaitReason : UChar +0x05c WaitBlockList : Ptr32 _KWAIT_BLOCK +0x060 WaitListEntry : _LIST_ENTRY +0x060 SwapListEntry : _SINGLE_LIST_ENTRY +0x068 WaitTime : Uint4B +0x06c BasePriority : Char +0x06d DecrementCount : UChar +0x06e PriorityDecrement : Char +0x06f Quantum : Char +0x070 WaitBlock : [4] _KWAIT_BLOCK +0x0d0 LegoData : Ptr32 Void +0x0d4 KernelApcDisable : Uint4B +0x0d8 UserAffinity : Uint4B +0x0dc SystemAffinityActive : UChar +0x0dd PowerState : UChar +0x0de NpxIrql : UChar +0x0df InitialNode : UChar +0x0e0 ServiceTable : Ptr32 Void +0x0e4 Queue : Ptr32 _KQUEUE +0x0e8 ApcQueueLock : Uint4B +0x0f0 Timer : _KTIMER +0x118 QueueListEntry : _LIST_ENTRY +0x120 SoftAffinity : Uint4B +0x124 Affinity : Uint4B +0x128 Preempted : UChar +0x129 ProcessReadyQueue : UChar +0x12a KernelStackResident : UChar +0x12b NextProcessor : UChar +0x12c CallbackStack : Ptr32 Void +0x130 Win32Thread : Ptr32 Void +0x134 TrapFrame : Ptr32 _KTRAP_FRAME +0x138 ApcStatePointer : [2] Ptr32 _KAPC_STATE +0x140 PreviousMode : Char +0x141 EnableStackSwap : UChar +0x142 LargeStack : UChar +0x143 ResourceIndex : UChar +0x144 KernelTime : Uint4B +0x148 UserTime : Uint4B +0x14c SavedApcState : _KAPC_STATE +0x164 Alertable : UChar +0x165 ApcStateIndex : UChar +0x166 ApcQueueable : UChar +0x167 AutoAlignment : UChar +0x168 StackBase : Ptr32 Void +0x16c SuspendApc : _KAPC 
+0x19c SuspendSemaphore : _KSEMAPHORE +0x1b0 ThreadListEntry : _LIST_ENTRY +0x1b8 FreezeCount : Char +0x1b9 SuspendCount : Char +0x1ba IdealProcessor : UChar +0x1bb DisableBoost : UChar
MutantListHead
Linked list of mutants (mutexes). The details are explained in the synchronization chapter.
Alertable
Indicates whether the thread can be alerted (woken up) while it is waiting. The details are explained in the APC chapter.
Priority
Thread priority.
EnableStackSwap
Indicates whether the thread's stack is allowed to be swapped out to the page file.
ETHREAD
Its structure is as follows:
kd> dt _ETHREAD ntdll!_ETHREAD +0x000 Tcb : _KTHREAD +0x1c0 CreateTime : _LARGE_INTEGER +0x1c0 NestedFaultCount : Pos 0, 2 Bits +0x1c0 ApcNeeded : Pos 2, 1 Bit +0x1c8 ExitTime : _LARGE_INTEGER +0x1c8 LpcReplyChain : _LIST_ENTRY +0x1c8 KeyedWaitChain : _LIST_ENTRY +0x1d0 ExitStatus : Int4B +0x1d0 OfsChain : Ptr32 Void +0x1d4 PostBlockList : _LIST_ENTRY +0x1dc TerminationPort : Ptr32 _TERMINATION_PORT +0x1dc ReaperLink : Ptr32 _ETHREAD +0x1dc KeyedWaitValue : Ptr32 Void +0x1e0 ActiveTimerListLock : Uint4B +0x1e4 ActiveTimerListHead : _LIST_ENTRY +0x1ec Cid : _CLIENT_ID +0x1f4 LpcReplySemaphore : _KSEMAPHORE +0x1f4 KeyedWaitSemaphore : _KSEMAPHORE +0x208 LpcReplyMessage : Ptr32 Void +0x208 LpcWaitingOnPort : Ptr32 Void +0x20c ImpersonationInfo : Ptr32 _PS_IMPERSONATION_INFORMATION +0x210 IrpList : _LIST_ENTRY +0x218 TopLevelIrp : Uint4B +0x21c DeviceToVerify : Ptr32 _DEVICE_OBJECT +0x220 ThreadsProcess : Ptr32 _EPROCESS +0x224 StartAddress : Ptr32 Void +0x228 Win32StartAddress : Ptr32 Void +0x228 LpcReceivedMessageId : Uint4B +0x22c ThreadListEntry : _LIST_ENTRY +0x234 RundownProtect : _EX_RUNDOWN_REF +0x238 ThreadLock : _EX_PUSH_LOCK +0x23c LpcReplyMessageId : Uint4B +0x240 ReadClusterSize : Uint4B +0x244 GrantedAccess : Uint4B +0x248 CrossThreadFlags : Uint4B +0x248 Terminated : Pos 0, 1 Bit +0x248 DeadThread : Pos 1, 1 Bit +0x248 HideFromDebugger : Pos 2, 1 Bit +0x248 ActiveImpersonationInfo : Pos 3, 1 Bit +0x248 SystemThread : Pos 4, 1 Bit +0x248 HardErrorsAreDisabled : Pos 5, 1 Bit +0x248 BreakOnTermination : Pos 6, 1 Bit +0x248 SkipCreationMsg : Pos 7, 1 Bit +0x248 SkipTerminationMsg : Pos 8, 1 Bit +0x24c SameThreadPassiveFlags : Uint4B +0x24c ActiveExWorker : Pos 0, 1 Bit +0x24c ExWorkerCanWaitUser : Pos 1, 1 Bit +0x24c MemoryMaker : Pos 2, 1 Bit +0x250 SameThreadApcFlags : Uint4B +0x250 LpcReceivedMsgIdValid : Pos 0, 1 Bit +0x250 LpcExitThreadCalled : Pos 1, 1 Bit +0x250 AddressSpaceOwner : Pos 2, 1 Bit +0x254 ForwardClusterOnly : UChar +0x255 
DisablePageFaultClustering : UChar
CrossThreadFlags
Indicates the status and identity of the thread. If the SystemThread bit is set, the thread is treated as a system thread and cannot be killed with ordinary privileges.
Analysis of thread creation
When we analyzed process creation, we saw that the main thread is created with the NtCreateThread function. Let's see what it contains:
NTSTATUS __stdcall NtCreateThread(PHANDLE ThreadHandle, ACCESS_MASK DesiredAccess, POBJECT_ATTRIBUTES ObjectAttributes, HANDLE ProcessHandle, PCLIENT_ID ClientId, PCONTEXT ThreadContext, PINITIAL_TEB UserStack, BOOLEAN CreateSuspended) { struct _INITIAL_TEB *v8; // eax int *v9; // ebx int v11; // ecx int v12[6]; // [esp+Ch] [ebp-38h] BYREF _KTHREAD *v13; // [esp+24h] [ebp-20h] CPPEH_RECORD ms_exc; // [esp+2Ch] [ebp-18h] ms_exc.registration.TryLevel = 0; v13 = KeGetCurrentThread(); if ( v13->PreviousMode ) { if ( ThreadHandle >= MmUserProbeAddress ) *MmUserProbeAddress = 0; *ThreadHandle = *ThreadHandle; if ( ClientId ) { v12[5] = ClientId; if ( ClientId >= MmUserProbeAddress ) *MmUserProbeAddress = 0; if ( (ClientId & 3) != 0 ) ExRaiseDatatypeMisalignment(); LOBYTE(ClientId->UniqueProcess) = ClientId->UniqueProcess; LOBYTE(ClientId->UniqueThread) = ClientId->UniqueThread; } if ( !ThreadContext ) { ms_exc.registration.TryLevel = -1; return STATUS_INVALID_PARAMETER; } if ( (ThreadContext & 3) != 0 ) ExRaiseDatatypeMisalignment(); v8 = MmUserProbeAddress; if ( ThreadContext >= MmUserProbeAddress ) { *MmUserProbeAddress = 0; v8 = MmUserProbeAddress; } v9 = UserStack; if ( (UserStack & 3) != 0 ) ExRaiseDatatypeMisalignment(); if ( UserStack >= v8 ) v8->PreviousStackBase = 0; } else { v9 = UserStack; } v12[0] = *v9; v11 = v9[1]; v12[1] = v11; if ( !v12[0] && !v11 ) qmemcpy(v12, v9, 0x14u); ms_exc.registration.TryLevel = -1; return PspCreateThread( ThreadHandle, DesiredAccess, ObjectAttributes, ProcessHandle, 0, ClientId, ThreadContext, v12, CreateSuspended, 0, 0); }
Skimming through it, we find that the thread is ultimately created by PspCreateThread. Let's step into it:
NTSTATUS __stdcall PspCreateThread(PHANDLE ThreadHandle, ACCESS_MASK DesiredAccess, POBJECT_ATTRIBUTES ObjectAttributes, HANDLE ProcessHandle, PVOID a5, PCLIENT_ID ClientId, PCONTEXT ThreadContext, int a8, BOOLEAN CreateSuspended, int a10, int a11) { // [COLLAPSED LOCAL DECLARATIONS. PRESS KEYPAD CTRL-"+" TO EXPAND] CurrentThread = KeGetCurrentThread(); CurrentThread_1 = CurrentThread; if ( a10 ) AccessMode[0] = 0; else AccessMode[0] = CurrentThread->PreviousMode; v70 = 0; v12 = 0; v56 = 0; if ( ProcessHandle ) { result = ObReferenceObjectByHandle(ProcessHandle, 2u, PsProcessType, AccessMode[0], &Object, 0); v12 = Object; v56 = Object; } else if ( a10 ) { v12 = a5; ObfReferenceObject(a5); v56 = a5; result = 0; } else { result = STATUS_INVALID_HANDLE; } if ( result >= 0 ) { if ( AccessMode[0] && v12 == PsInitialSystemProcess ) { v14 = STATUS_INVALID_HANDLE; LABEL_15: ObfDereferenceObject(v12); return v14; } v15 = ObCreateObject(AccessMode[0], PsThreadType, ObjectAttributes, *AccessMode, 0, 600, 0, 0, &v59); if ( v15 < 0 ) { v14 = v15; goto LABEL_15; } thread = v59; memset(v59, 0, 0x258u); thread[1].WaitBlock[0].WaitListEntry.Blink = 0; thread[1].WaitListEntry.Flink = v12; *&thread[1].DebugActive = v12[33]; v60[0] = thread; v60[1] = 0; v17 = ExCreateHandle(PspCidTable, v60); *&thread[1].Iopl = v17; if ( !v17 ) { v18 = STATUS_INSUFFICIENT_RESOURCES; LABEL_38: ObfDereferenceObject(thread); return v18; } thread[1].WaitBlock[0].NextWaitBlock = MmReadClusterSize; KeInitializeSemaphore(&thread[1].ApcState, 0, 1); thread[1].Header.WaitListHead.Blink = &thread[1].Header.WaitListHead; thread[1].Header.WaitListHead.Flink = &thread[1].Header.WaitListHead; thread[1].WaitStatus = &thread[1].IdleSwapBlock; *&thread[1].IdleSwapBlock = thread + 528; thread[1].InitialStack = &thread[1].MutantListHead.Blink; thread[1].MutantListHead.Blink = &thread[1].MutantListHead.Blink; thread[1].WaitBlock[0].Thread = 0; KeInitializeSpinLock(&thread[1].Teb); thread[1].KernelStack = 
&thread[1].TlsArray; thread[1].TlsArray = &thread[1].TlsArray; v45 = v12 + 32; if ( !ExAcquireRundownProtection(v12 + 32) ) { v18 = STATUS_PROCESS_IS_TERMINATING; goto LABEL_38; } if ( ThreadContext ) { v18 = MmCreateTeb(v12, a8, &thread[1].DebugActive, &v70); if ( v18 < 0 ) { v19 = v45; LABEL_37: ExReleaseRundownProtection(v19); goto LABEL_38; } ms_exc.registration.TryLevel = 0; thread[1].WaitListEntry.Blink = ThreadContext->Eip; thread[1].WaitTime = ThreadContext->Eax; ms_exc.registration.TryLevel = -1; v20 = KeInitThread(thread, 0, PspUserThreadStartup, 0, thread[1].WaitListEntry.Blink, ThreadContext, v70, v12); } else { v70 = 0; _InterlockedOr(&thread[1].WaitBlock[1], 0x10u); thread[1].WaitListEntry.Blink = a10; v20 = KeInitThread(thread, 0, PspSystemThreadStartup, a10, a11, 0, 0, v12); } v18 = v20; if ( v20 < 0 ) { if ( v70 ) { MmDeleteTeb(v12, v70); thread->Teb = 0; } LABEL_36: v19 = v12 + 32; goto LABEL_37; } v21 = CurrentThread_1; --CurrentThread_1->KernelApcDisable; v53 = v12 + 27; _ECX = v12 + 27; _EDX = 2; __asm { cmpxchg [ecx], edx } if ( (v12[146].Count & 8) != 0 ) { v50 = v12 + 27; _ECX = v12 + 27; _EDX = 0; __asm { cmpxchg [ecx], edx } v26 = (*(v21 + 212))++ == -1; if ( v26 && *(v21 + 52) != v21 + 52 ) { *(v21 + 73) = 1; LOBYTE(_ECX) = 1; HalRequestSoftwareInterrupt(_ECX); } KeUninitThread(thread); if ( v70 ) MmDeleteTeb(v12, v70); v18 = STATUS_PROCESS_IS_TERMINATING; goto LABEL_36; } v27 = v12[104].Count; v12[104].Count = v27 + 1; v28 = v12[101].Count; *&thread[1].BasePriority = v12 + 100; thread[1].WaitBlock[0].WaitListEntry.Flink = v28; v28->Flink = &thread[1].BasePriority; v12[101].Count = &thread[1].BasePriority; KeStartThread(thread); v48 = v12 + 27; _ECX = v12 + 27; _EDX = 0; __asm { cmpxchg [ecx], edx } v31 = CurrentThread_1; v26 = CurrentThread_1->KernelApcDisable++ == -1; if ( v26 && *(v31 + 52) != v31 + 52 ) { CurrentThread_1->ApcState.KernelApcPending = 1; LOBYTE(_ECX) = 1; HalRequestSoftwareInterrupt(_ECX); } 
ExReleaseRundownProtection(v12 + 32); if ( !v27 ) { WmiTraceProcess(v12); if ( PspCreateProcessNotifyRoutineCount ) { v58 = &PspCreateProcessNotifyRoutine; v51 = 8; do { v32 = ExReferenceCallBackBlock(v58); v33 = v32; if ( v32 ) { v34 = ExGetCallBackBlockRoutine(v32); v34(v12[83].Count, v12[33].Count, 1); ExDereferenceCallBackBlock(v58, v33); } v58 += 4; --v51; } while ( v51 ); } } v35 = v12[77].Count; if ( v35 && *(v35 + 196) && (v12[145].Count & 5) == 0 ) { _InterlockedOr(&v12[145], 4u); --CurrentThread_1->KernelApcDisable; Resource = (v35 + 32); ExAcquireResourceSharedLite((v35 + 32), 1u); v36 = *(v35 + 196); if ( v36 ) IoSetIoCompletion(v36, *(v35 + 200), v12[33].Count, 0, 6, 0); ExReleaseResourceLite(Resource); v37 = CurrentThread_1; v26 = CurrentThread_1->KernelApcDisable++ == -1; if ( v26 && *(v37 + 52) != v37 + 52 ) { *(v37 + 73) = 1; LOBYTE(v37) = 1; HalRequestSoftwareInterrupt(v37); } } WmiTraceThread(thread, a8, 1); if ( PspCreateThreadNotifyRoutineCount ) { v57 = &PspCreateThreadNotifyRoutine; v49 = 8; do { v38 = ExReferenceCallBackBlock(v57); RunRef = v38; if ( v38 ) { v39 = ExGetCallBackBlockRoutine(v38); v39(*&thread[1].DebugActive, *&thread[1].Iopl, 1); ExDereferenceCallBackBlock(v57, RunRef); } v57 += 4; --v49; } while ( v49 ); } ObReferenceObjectEx(thread, 2); if ( ThreadContext ) { P = ExAllocatePoolWithTag(NonPagedPool, 0x30u, 'aCsP'); if ( !P ) { _InterlockedOr(&thread[1].WaitBlock[1], 2u); LABEL_64: v18 = STATUS_INSUFFICIENT_RESOURCES; LABEL_77: KeReadyThread(thread); ObDereferenceObjectEx(thread, 2); return v18; } KeInitializeApc(P, thread, 0, IopDeallocateApc, 0, dword_598B5C, 1, 0); if ( !KeInsertQueueApc(P, BaseAddress, 0, 0) ) { _InterlockedOr(&thread[1].WaitBlock[1], 2u); ExFreePoolWithTag(P, 0); goto LABEL_64; } } if ( CreateSuspended ) { ms_exc.registration.TryLevel = 1; KeSuspendThread(thread); ms_exc.registration.TryLevel = -1; if ( (thread[1].WaitBlock[1].WaitListEntry.Flink & 1) != 0 ) KeForceResumeThread(thread); } if ( 
ThreadContext ) v40 = CurrentThread_1->ApcState.Process; else v40 = v12; v63 = SeCreateAccessStateEx(0, v40, &PassedAccessState, v44, DesiredAccess, (PsThreadType + 26)); if ( v63 < 0 ) { _InterlockedOr(&thread[1].WaitBlock[1], 2u); if ( CreateSuspended ) KeResumeThread(thread); v18 = v63; goto LABEL_77; } v63 = ObInsertObject(thread, &PassedAccessState, DesiredAccess, 0, 0, &Handle); SeDeleteAccessState(&PassedAccessState); if ( v63 >= 0 ) { ms_exc.registration.TryLevel = 2; *ThreadHandle = Handle; if ( ClientId ) *ClientId = *&thread[1].DebugActive; ms_exc.registration.TryLevel = -1; } else { _InterlockedOr(&thread[1].WaitBlock[1], 2u); if ( CreateSuspended ) KeResumeThread(thread); } KeQuerySystemTime(&CurrentTime); v41 = CurrentTime.QuadPart >> 29; *&thread[1].Header.Type = 8 * CurrentTime.LowPart; thread[1].Header.SignalState = v41; if ( (thread[1].WaitBlock[1].WaitListEntry.Flink & 2) != 0 ) { *&thread[1].WaitBlock[0].WaitKey = 2032639; } else { v63 = ObGetObjectSecurity(thread, &SecurityDescriptor, MemoryAllocated); if ( v63 < 0 ) { _InterlockedOr(&thread[1].WaitBlock[1], 2u); if ( CreateSuspended ) KeResumeThread(thread); KeReadyThread(thread); ObfDereferenceObject(thread); ObCloseHandle(Handle, AccessMode[0]); goto LABEL_95; } SubjectSecurityContext.ProcessAuditId = v12; SubjectSecurityContext.PrimaryToken = PsReferencePrimaryToken(v12); SubjectSecurityContext.ClientToken = 0; v42 = &thread[1].WaitBlock[0].WaitKey; v64 = SeAccessCheck( SecurityDescriptor, &SubjectSecurityContext, 0, 0x2000000u, 0, 0, (PsThreadType + 26), AccessMode[0], &thread[1].WaitBlock[0].WaitKey, &AccessStatus); ObFastDereferenceObject(&v12[50], SubjectSecurityContext.PrimaryToken); ObReleaseObjectSecurity(SecurityDescriptor, MemoryAllocated[0]); if ( !v64 ) *v42 = 0; *v42 |= 0x61u; } KeReadyThread(thread); ObfDereferenceObject(thread); LABEL_95: result = v63; } return result; }
Thread creation follows almost the same routine as process creation: initialize the thread structure, create the TEB, and insert the structure into the relevant lists. APC-related knowledge is also involved here; the specific details will not be covered.
DPC
What is a DPC? The name stands for Deferred Procedure Call. It was originally designed as part of interrupt servicing, to solve the problem of interrupt service routines that take too long. Every time an interrupt fires, interrupts are disabled and the interrupt service routine is executed. Because interrupts are disabled, the routine must be short and must not consume much time; otherwise the system will lose a large number of other interrupts. But what about interrupts whose service routine has a lot of work to do? In that case the routine performs only the most urgent part of the work itself and moves the remaining, less urgent work into a DPC function for later execution.
Whenever an interrupt is triggered, the interrupt service routine can insert a DPC into the current CPU's DPC queue. After the ISR has finished and exited, the CPU scans its DPC queue and executes each DPC in turn. Once the DPCs have run, execution returns to the point where the current thread was interrupted.
Since ISR has been mentioned, what is it? The name stands for Interrupt Service Routine. In Windows, work that is considered less urgent is packaged into a KDPC structure, shown below:
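The idea of "do the urgent part now, queue the rest" can be sketched in user mode. This is only a toy model under my own assumptions: the names dpc_item, queue_dpc, fake_isr and drain_dpc_queue are made up for illustration and are not kernel APIs, and a fixed array stands in for the real linked list.

```c
#include <assert.h>

#define MAX_DPC 8

typedef void (*deferred_fn)(void *context);

/* Hypothetical per-CPU queue; a fixed array keeps the sketch short. */
struct dpc_item {
    deferred_fn routine;
    void *context;
};

struct dpc_item dpc_queue[MAX_DPC];
int dpc_count = 0;
int deferred_work_done = 0;

/* The slow, less urgent part of the interrupt handling. */
void slow_work(void *context)
{
    int *counter = context;
    ++*counter;
}

/* Queue a deferred call; returns 0 if the queue is full. */
int queue_dpc(deferred_fn routine, void *context)
{
    assert(routine != 0);
    if (dpc_count == MAX_DPC)
        return 0;
    dpc_queue[dpc_count].routine = routine;
    dpc_queue[dpc_count].context = context;
    ++dpc_count;
    return 1;
}

/* The "ISR": only the urgent part runs here, the rest is deferred. */
void fake_isr(void)
{
    /* urgent work (for example acknowledging the device) would go here */
    queue_dpc(slow_work, &deferred_work_done);
}

/* Runs after the ISR has returned: execute every queued call in order. */
int drain_dpc_queue(void)
{
    int executed = 0;
    for (int i = 0; i < dpc_count; ++i) {
        dpc_queue[i].routine(dpc_queue[i].context);
        ++executed;
    }
    dpc_count = 0;
    return executed;
}
```

Calling fake_isr twice leaves deferred_work_done untouched until drain_dpc_queue runs, which is the point of deferring the work.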
kd> dt _KDPC
ntdll!_KDPC
   +0x000 Type             : Int2B
   +0x002 Number           : UChar
   +0x003 Importance       : UChar
   +0x004 DpcListEntry     : _LIST_ENTRY
   +0x00c DeferredRoutine  : Ptr32 void
   +0x010 DeferredContext  : Ptr32 Void
   +0x014 SystemArgument1  : Ptr32 Void
   +0x018 SystemArgument2  : Ptr32 Void
   +0x01c Lock             : Ptr32 Uint4B
After packaging, the KDPC is inserted into the DpcListHead member of the KPRCB, where it waits to be called. The KPRCB structure is as follows, showing only the DPC-related members:
kd> dt _KPRCB
ntdll!_KPRCB
   ......
   +0x4b0 DpcTime          : Uint4B
   +0x4b4 DebugDpcTime     : Uint4B
   ......
   +0x860 DpcListHead      : _LIST_ENTRY
   +0x868 DpcStack         : Ptr32 Void
   +0x86c DpcCount         : Uint4B
   +0x870 DpcQueueDepth    : Uint4B
   +0x874 DpcRoutineActive : Uint4B
   +0x878 DpcInterruptRequested : Uint4B
   +0x87c DpcLastCount     : Uint4B
   +0x880 DpcRequestRate   : Uint4B
   ......
   +0x8a0 DpcLock          : Uint4B
   ......
   +0x8c0 CallDpc          : _KDPC
   ......
The handlers for non-clock interrupts, the clock interrupt, and certain actively called APIs all poll the DPC linked list and invoke the callbacks. A DPC created by a user or a driver can also be attached to a timer so that it fires at a specified time; in that case it is not linked into the KPCR but is attached to the clock (timer) task instead.
With these concepts in place, let's go through the meaning of the relevant structure members:
KDPC
Type
Indicates the type of the structure; the value is 0x13.
Number
Indicates which KPCR (that is, which processor) the DPC belongs to.
Importance
Priority, with a value from 0 to 2, where 0 is the lowest and 2 the highest. The default set at initialization is 1.
DpcListEntry
The DPC linked-list entry. As with the process and thread lists, the link sits in the middle of the structure rather than at its start.
DeferredRoutine
Address of the DPC callback function.
DeferredContext
Context passed to the callback function; optional.
SystemArgument1 / SystemArgument2
Parameters of the callback function; optional.
Lock
Lock of the DPC structure. As the insertion code later in this article shows, it holds the address of the owning KPRCB's DpcLock while the DPC is queued and 0 otherwise.
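To make the layout and the mid-structure list entry concrete, here is a small user-mode sketch. It mirrors the 32-bit dt _KDPC output above, with pointers modelled as uint32_t so the offsets come out the same on any host. The names kdpc32, list_entry32, CONTAINER_OF and dpc_from_list_entry are my own; CONTAINER_OF follows the same idea as the kernel's CONTAINING_RECORD macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit LIST_ENTRY with both pointers modelled as 32-bit integers. */
typedef struct list_entry32 {
    uint32_t Flink;
    uint32_t Blink;
} list_entry32;

/* Mirror of the 32-bit _KDPC layout shown by dt _KDPC. */
typedef struct kdpc32 {
    int16_t      Type;            /* +0x000 */
    uint8_t      Number;          /* +0x002 */
    uint8_t      Importance;      /* +0x003 */
    list_entry32 DpcListEntry;    /* +0x004 */
    uint32_t     DeferredRoutine; /* +0x00c */
    uint32_t     DeferredContext; /* +0x010 */
    uint32_t     SystemArgument1; /* +0x014 */
    uint32_t     SystemArgument2; /* +0x018 */
    uint32_t     Lock;            /* +0x01c */
} kdpc32;

/* Step back from an embedded member to the start of its structure. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Given a pointer to the embedded list entry, recover the owning DPC. */
kdpc32 *dpc_from_list_entry(list_entry32 *entry)
{
    return CONTAINER_OF(entry, kdpc32, DpcListEntry);
}
```

The subtraction of offsetof(kdpc32, DpcListEntry), which is 4 here, is the same adjustment the decompiled code performs when it walks the DPC list.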
DPC initialization
To study DPCs, let's first look at how one is initialized. We locate the function with IDA:
; void __stdcall KeInitializeDpc(PRKDPC Dpc, PKDEFERRED_ROUTINE DeferredRoutine, PVOID DeferredContext)
                public _KeInitializeDpc@12
_KeInitializeDpc@12 proc near           ; CODE XREF: IopInitializeIrpStackProfiler()+29↑p
                                        ; VdmpDelayInterrupt(x)+26B↓p ...

Dpc             = dword ptr  8
DeferredRoutine = dword ptr  0Ch
DeferredContext = dword ptr  10h

                mov     edi, edi
                push    ebp
                mov     ebp, esp
                mov     eax, [ebp+Dpc]
                mov     ecx, [ebp+DeferredRoutine]
                and     dword ptr [eax+1Ch], 0
                mov     [eax+0Ch], ecx
                mov     ecx, [ebp+DeferredContext]
                mov     word ptr [eax], 13h
                mov     byte ptr [eax+2], 0
                mov     byte ptr [eax+3], 1
                mov     [eax+10h], ecx
                pop     ebp
                retn    0Ch
_KeInitializeDpc@12 endp
this function is very simple. It is shown in pseudo code as follows:
void __stdcall KeInitializeDpc(PRKDPC Dpc, PKDEFERRED_ROUTINE DeferredRoutine, PVOID DeferredContext)
{
  Dpc->Lock = 0;
  Dpc->DeferredRoutine = DeferredRoutine;
  Dpc->Type = 19;
  Dpc->Number = 0;
  Dpc->Importance = 1;
  Dpc->DeferredContext = DeferredContext;
}
Insertion of DPC
Next, let's look at how a DPC is inserted. Since raising and lowering the IRQL is involved, let's take a look at the IRQL enumeration first:
For readability, I renamed the variables in the IDA pseudo code below:
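As a rough reference, the sketch below lists the x86 IRQL values as I recall them from the WDK header wdm.h. The IRQL_ prefix and the irql_name helper are made up for this illustration (the header itself uses plain #define names such as PASSIVE_LEVEL), so treat it as a reading aid rather than the real declaration.

```c
#include <assert.h>

/* x86 IRQL values; the IRQL_ prefix is only used in this sketch. */
enum x86_irql {
    IRQL_PASSIVE_LEVEL  = 0,   /* normal thread execution */
    IRQL_APC_LEVEL      = 1,   /* APC delivery */
    IRQL_DISPATCH_LEVEL = 2,   /* scheduler and DPCs */
    /* 3..26 are device interrupt levels (DIRQL) */
    IRQL_PROFILE_LEVEL  = 27,
    IRQL_CLOCK_LEVEL    = 28,  /* CLOCK1_LEVEL and CLOCK2_LEVEL */
    IRQL_IPI_LEVEL      = 29,  /* inter-processor interrupt */
    IRQL_POWER_LEVEL    = 30,
    IRQL_HIGH_LEVEL     = 31   /* everything masked */
};

/* Map a numeric IRQL to a readable name. */
const char *irql_name(int irql)
{
    switch (irql) {
    case IRQL_PASSIVE_LEVEL:  return "PASSIVE_LEVEL";
    case IRQL_APC_LEVEL:      return "APC_LEVEL";
    case IRQL_DISPATCH_LEVEL: return "DISPATCH_LEVEL";
    case IRQL_PROFILE_LEVEL:  return "PROFILE_LEVEL";
    case IRQL_CLOCK_LEVEL:    return "CLOCK_LEVEL";
    case IRQL_IPI_LEVEL:      return "IPI_LEVEL";
    case IRQL_POWER_LEVEL:    return "POWER_LEVEL";
    case IRQL_HIGH_LEVEL:     return "HIGH_LEVEL";
    default:
        return (irql >= 3 && irql <= 26) ? "device level" : "invalid";
    }
}
```

With this table, KfRaiseIrql(0x1Fu) in the insertion code reads as raising to HIGH_LEVEL, and the value 2 passed to HalRequestSoftwareInterrupt reads as DISPATCH_LEVEL.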
BOOLEAN __stdcall KeInsertQueueDpc(PRKDPC Dpc, PVOID SystemArgument1, PVOID SystemArgument2)
{
  _KPRCB *kprcb; // esi
  bool Importance; // zf
  _LIST_ENTRY *DpcListHead; // ecx
  _LIST_ENTRY *DpcListEntry; // eax
  _LIST_ENTRY *v9; // edx
  _LIST_ENTRY *v10; // edx
  KIRQL NewIrql; // [esp+Fh] [ebp-1h]

  NewIrql = KfRaiseIrql(0x1Fu);
  kprcb = MEMORY[0xFFDFF020];
  _ECX = &Dpc->Lock;
  _EDX = MEMORY[0xFFDFF020] + 0x8A0;            // DpcLock
  __asm { cmpxchg [ecx], edx }
  ++kprcb->DpcCount;
  ++kprcb->DpcQueueDepth;
  Importance = Dpc->Importance == 2;
  Dpc->SystemArgument1 = SystemArgument1;
  Dpc->SystemArgument2 = SystemArgument2;
  DpcListHead = &kprcb->DpcListHead;
  DpcListEntry = &Dpc->DpcListEntry;
  if ( Importance )
  {
    v9 = DpcListHead->Flink;
    DpcListEntry->Flink = DpcListHead->Flink;
    Dpc->DpcListEntry.Blink = DpcListHead;
    v9->Blink = DpcListEntry;
    DpcListHead->Flink = DpcListEntry;
  }
  else
  {
    v10 = kprcb->DpcListHead.Blink;
    DpcListEntry->Flink = DpcListHead;
    Dpc->DpcListEntry.Blink = v10;
    v10->Flink = DpcListEntry;
    kprcb->DpcListHead.Blink = DpcListEntry;
  }
  if ( !kprcb->DpcRoutineActive
    && !kprcb->DpcInterruptRequested
    && (Dpc->Importance
     || kprcb->DpcQueueDepth >= kprcb->MaximumDpcQueueDepth
     || kprcb->DpcRequestRate < kprcb->MinimumDpcRate) )
  {
    LOBYTE(DpcListHead) = 2;
    kprcb->DpcInterruptRequested = 1;
    HalRequestSoftwareInterrupt(DpcListHead);
  }
  KfLowerIrql(NewIrql);
  return 1;
}
As the code shows, a DPC with high importance (Importance == 2) is inserted at the head of the DPC linked list, and any other DPC is inserted at the tail. On a multi-core system the pseudo code is different, as shown below:
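The head-or-tail decision can be tried out in user mode. The following is a minimal sketch, assuming a simplified stand-in structure sim_dpc of my own; the helper names (init_list, insert_head, insert_tail, queue_sim_dpc, snapshot_ids) are illustrative and only mimic the pointer updates seen in the pseudo code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list_entry {
    struct list_entry *Flink;
    struct list_entry *Blink;
} list_entry;

/* Simplified stand-in for KDPC: an id, the importance, and the link. */
typedef struct sim_dpc {
    int id;
    unsigned char Importance;
    list_entry DpcListEntry;
} sim_dpc;

void init_list(list_entry *head)
{
    head->Flink = head->Blink = head;
}

/* Same pointer updates as the Importance == 2 branch. */
void insert_head(list_entry *head, list_entry *entry)
{
    list_entry *first = head->Flink;
    entry->Flink = first;
    entry->Blink = head;
    first->Blink = entry;
    head->Flink = entry;
}

/* Same pointer updates as the else branch. */
void insert_tail(list_entry *head, list_entry *entry)
{
    list_entry *last = head->Blink;
    entry->Flink = head;
    entry->Blink = last;
    last->Flink = entry;
    head->Blink = entry;
}

void queue_sim_dpc(list_entry *head, sim_dpc *dpc)
{
    if (dpc->Importance == 2)
        insert_head(head, &dpc->DpcListEntry);
    else
        insert_tail(head, &dpc->DpcListEntry);
}

/* Copy the ids in queue order into out; returns how many were written. */
int snapshot_ids(const list_entry *head, int *out, int max)
{
    int n = 0;
    for (const list_entry *e = head->Flink; e != head && n < max; e = e->Flink) {
        const sim_dpc *d =
            (const sim_dpc *)((const char *)e - offsetof(sim_dpc, DpcListEntry));
        out[n++] = d->id;
    }
    return n;
}
```

Queuing DPCs with importance 1, 0 and 2 in that order leaves the importance-2 entry first, followed by the other two in their arrival order.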
BOOLEAN __stdcall KeInsertQueueDpc(PRKDPC Dpc, PVOID SystemArgument1, PVOID SystemArgument2)
{
  unsigned __int8 Number; // al
  _KPRCB *kpcrb; // esi
  char v5; // bl
  bool v6; // zf
  _LIST_ENTRY *v7; // ecx
  _LIST_ENTRY *v8; // eax
  _LIST_ENTRY *v9; // edx
  _LIST_ENTRY *v10; // edx
  signed __int32 v12; // [esp+Ch] [ebp-8h]
  KIRQL NewIrql; // [esp+13h] [ebp-1h]

  NewIrql = KfRaiseIrql(0x1Fu);
  Number = Dpc->Number;
  if ( Number < 0x20u )
  {
    v5 = Dpc;
    kpcrb = KeGetPcr()->Prcb;
  }
  else
  {
    kpcrb = KiProcessorBlock[Number];
    v5 = Number - 32;
  }
  KiAcquireSpinLock();
  v12 = _InterlockedCompareExchange(&Dpc->Lock, &kpcrb->DpcLock, 0);
  if ( !v12 )
  {
    ++kpcrb->DpcCount;
    ++kpcrb->DpcQueueDepth;
    v6 = Dpc->Importance == 2;
    Dpc->SystemArgument1 = SystemArgument1;
    Dpc->SystemArgument2 = SystemArgument2;
    v7 = &kpcrb->DpcListHead;
    v8 = &Dpc->DpcListEntry;
    if ( v6 )
    {
      v9 = v7->Flink;
      v8->Flink = v7->Flink;
      Dpc->DpcListEntry.Blink = v7;
      v9->Blink = v8;
      v7->Flink = v8;
    }
    else
    {
      v10 = kpcrb->DpcListHead.Blink;
      v8->Flink = v7;
      Dpc->DpcListEntry.Blink = v10;
      v10->Flink = v8;
      kpcrb->DpcListHead.Blink = v8;
    }
    if ( !kpcrb->DpcRoutineActive && !kpcrb->DpcInterruptRequested )
    {
      if ( kpcrb == KeGetPcr()->Prcb )
      {
        if ( Dpc->Importance
          || kpcrb->DpcQueueDepth >= kpcrb->MaximumDpcQueueDepth
          || kpcrb->DpcRequestRate < kpcrb->MinimumDpcRate )
        {
          LOBYTE(v7) = 2;
          kpcrb->DpcInterruptRequested = 1;
          HalRequestSoftwareInterrupt(v7);      // Process DPC
        }
      }
      else if ( Dpc->Importance == 2 || kpcrb->DpcQueueDepth >= kpcrb->MaximumDpcQueueDepth )
      {
        kpcrb->DpcInterruptRequested = 1;
        KiIpiSend(1 << v5, 2u);
      }
    }
  }
  KiReleaseSpinLock(&kpcrb->DpcLock);
  KfLowerIrql(NewIrql);
  return v12 == 0;
}
A note here: to obtain a multi-core debugging environment and its kernel files, you must configure the virtual machine as multi-core and then reinstall the operating system. However, the symbols I have are for the single-core kernel, and I could not find the multi-core ones. Some structures in the pseudo code above were generated from the single-core symbols, and the functions were named by reference to them. If you have the multi-core symbols, I hope you can provide them. Thank you for your support.
The details of the multi-core case will be explained in the synchronization section. The functions with SpinLock in their names implement a spin lock. Compared with the single-core version, this code has more checks and an extra function, KiIpiSend. This function notifies another KPCR (that is, another processor) to execute the DPC; a general understanding is enough for now.
Removal of DPC
The code for removing a DPC is simple. The pseudo code is as follows:
BOOLEAN __stdcall KeRemoveQueueDpc(PRKDPC Dpc)
{
  unsigned int *Lock; // esi

  _disable();                                   // cli
  Lock = Dpc->Lock;
  if ( Lock )
  {
    --*(Lock - 12);
    RemoveEntryList(&Dpc->DpcListEntry);
    Dpc->Lock = 0;
  }
  _enable();                                    // sti
  return Lock != 0;
}
For multi-core, the code is as follows. It is essentially the same as the single-core version, except that a spin lock is added:
BOOLEAN __stdcall KeRemoveQueueDpc(PRKDPC Dpc)
{
  unsigned int *Lock; // edi

  _disable();
  Lock = Dpc->Lock;
  if ( Lock )
  {
    KiAcquireSpinLock(Dpc->Lock);
    if ( Lock == Dpc->Lock )
    {
      --*(Lock - 12);
      RemoveEntryList(&Dpc->DpcListEntry);
      Dpc->Lock = 0;
    }
    KiReleaseSpinLock(Lock);
  }
  _enable();
  return Lock != 0;
}
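The expression --*(Lock - 12) looks odd at first, so here is a small sketch of the arithmetic. It assumes, as the insertion code suggests, that Dpc->Lock holds the address of the KPRCB's DpcLock member (+0x8a0 in the dt _KPRCB output). Since Lock is a pointer to 32-bit values, subtracting 12 steps back 12 * 4 = 0x30 bytes, which lands on +0x870, DpcQueueDepth. The helper names are my own.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets taken from the dt _KPRCB output earlier in this article. */
enum {
    KPRCB_DPC_QUEUE_DEPTH = 0x870,
    KPRCB_DPC_LOCK        = 0x8a0
};

/* Byte offset reached by "Lock - n" when Lock points to 32-bit values
   and itself sits at lock_offset inside the structure. */
uint32_t offset_of_lock_minus(uint32_t lock_offset, uint32_t n)
{
    return lock_offset - n * (uint32_t)sizeof(uint32_t);
}

/* The same decrement as in the pseudo code, on a plain array. */
void remove_accounting(uint32_t *lock)
{
    --*(lock - 12);
}
```

So the removal code decrements the queue depth of whichever processor the DPC was queued on, without needing a pointer to the KPRCB itself.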
Execution of DPC
DPCs are executed by KiRetireDpcList, which in turn is called from KiDispatchInterrupt and KiIdleLoop. Let's briefly go through it with pseudo code:
int __usercall KiRetireDpcList@<eax>(_KPCR *kpcr@<ebx>, _LIST_ENTRY *DPCListHead@<ebp>)
{
  // [COLLAPSED LOCAL DECLARATIONS. PRESS KEYPAD CTRL-"+" TO EXPAND]

  v13 = 0;
  if ( PPerfGlobalGroupMask && (*(PPerfGlobalGroupMask + 4) & 0x80) != 0 )
    v13 = 1;
  do
  {
    MEMORY[0xFFDFFBC4] = &v11;
    // The line above is decompiled incorrectly. The original disassembly is:
    //     mov ds:0FFDFF994h, esp
    // This may be a bug in the decompiler plug-in.
    // The address points to DpcRoutineActive.
    do
    {
      dpc = DPCListHead->Flink;
      v3 = DPCListHead->Flink->Flink;
      DPCListHead->Flink = v3;
      v3->Blink = DPCListHead;
      dpc = (dpc - 4);                          // The list entry sits mid-structure; move the pointer back to the start of the KDPC
      DeferredRoutine = dpc->DeferredRoutine;
      SystemArgument1 = *&dpc->SystemArgument1;
      DeferredContext = dpc->DeferredContext;
      dpc_1 = dpc;
      dpc->Lock = 0;
      --kpcr->PrcbData.DpcQueueDepth;
      _enable();
      if ( v13 )                                // Performance logging, can be ignored
      {
        v6 = WmiGetCpuClock(DeferredRoutine);
        DeferredRoutine = v7;
        v11 = v6;
        v12 = v7;
      }
      result = (DeferredRoutine)(dpc_1, DeferredContext, SystemArgument1, HIDWORD(SystemArgument1)); // Execute the DPC
      if ( v13 )
        result = PerfInfoLogDpc(v12, v11, SHIDWORD(v11));
      _disable();
    }
    while ( !IsListEmpty(DPCListHead) );
    kpcr->PrcbData.DpcRoutineActive = 0;
    kpcr->PrcbData.DpcInterruptRequested = 0;
  }
  while ( !IsListEmpty(DPCListHead) );
  return result;
}
Use of DPC
Now that we have studied DPCs, we should put them to use. Let's do some experiments to get a feel for them. Create a new driver project (I won't go into how to configure it). The code is as follows:
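#include <ntifs.h>
#include <ntddk.h>

KDPC dpc = { 0 };

// DPC callback; runs once after the DPC has been queued.
VOID DPCRoutine(_In_ struct _KDPC* Dpc, _In_opt_ PVOID DeferredContext,
    _In_opt_ PVOID SystemArgument1, _In_opt_ PVOID SystemArgument2)
{
    UNREFERENCED_PARAMETER(Dpc);
    UNREFERENCED_PARAMETER(DeferredContext);
    UNREFERENCED_PARAMETER(SystemArgument1);
    UNREFERENCED_PARAMETER(SystemArgument2);
    DbgPrint("DPC Running......\n");
}

// The unload routine returns nothing, so it is declared VOID.
VOID UnloadDriver(PDRIVER_OBJECT DriverObject)
{
    UNREFERENCED_PARAMETER(DriverObject);
    DbgPrint("Unloaded Successfully!");
}

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    UNREFERENCED_PARAMETER(RegistryPath);
    DbgPrint("Loaded Successfully!");
    DriverObject->DriverUnload = UnloadDriver;

    KeInitializeDpc(&dpc, DPCRoutine, NULL);    // Fill in the KDPC
    KeInsertQueueDpc(&dpc, NULL, NULL);         // Queue it on the current CPU

    return STATUS_SUCCESS;
}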
after compiling and loading, the following results will appear:
Now for something a little more advanced. As mentioned, a DPC created by a driver or user can be attached to a timer, so let's try that:
#include <ntifs.h>
#include <ntddk.h>

KDPC dpc = { 0 };
KTIMER timer = { 0 };
LARGE_INTEGER duringtime = { 0 };

// DPC callback; re-arms the timer so that it fires again after the same interval.
VOID DPCRoutine(_In_ struct _KDPC* Dpc, _In_opt_ PVOID DeferredContext,
    _In_opt_ PVOID SystemArgument1, _In_opt_ PVOID SystemArgument2)
{
    UNREFERENCED_PARAMETER(Dpc);
    UNREFERENCED_PARAMETER(DeferredContext);
    UNREFERENCED_PARAMETER(SystemArgument1);
    UNREFERENCED_PARAMETER(SystemArgument2);
    DbgPrint("DPC Running......\n");
    KeSetTimer(&timer, duringtime, &dpc);
}

VOID UnloadDriver(PDRIVER_OBJECT DriverObject)
{
    UNREFERENCED_PARAMETER(DriverObject);
    KeCancelTimer(&timer);                      // Stop the timer before the driver image goes away
    DbgPrint("Unloaded Successfully!");
}

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    UNREFERENCED_PARAMETER(RegistryPath);
    DbgPrint("Loaded Successfully!");
    DriverObject->DriverUnload = UnloadDriver;

    KeInitializeTimer(&timer);
    KeInitializeDpc(&dpc, DPCRoutine, NULL);
    // The unit is 100 ns. The minus sign indicates a relative time; the interval here is 3 seconds.
    duringtime.QuadPart = -30 * 1000 * 1000;
    KeSetTimer(&timer, duringtime, &dpc);

    return STATUS_SUCCESS;
}
the effects achieved are as follows:
the DPC timer is implemented above.
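The value -30 * 1000 * 1000 is easy to get wrong, so here is a small sketch of the conversion. KeSetTimer takes the due time in units of 100 nanoseconds, and a negative value means "relative to now". The helper name relative_due_time_ms and the TICKS_PER_SECOND constant are my own, not part of the WDK.

```c
#include <assert.h>
#include <stdint.h>

/* One second expressed in 100 ns units. */
#define TICKS_PER_SECOND 10000000LL

/* Convert a delay in milliseconds to a relative due time:
   negative, in 100 ns units, suitable for LARGE_INTEGER.QuadPart. */
int64_t relative_due_time_ms(int64_t milliseconds)
{
    assert(milliseconds >= 0);
    return -(milliseconds * (TICKS_PER_SECOND / 1000));
}
```

For a delay of 3000 ms this yields -30000000, the same value the driver assigns to duringtime.QuadPart.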
Windows architecture
To study the system kernel, it helps to understand the overall architecture of the system. The schematic diagram is as follows, taken from Pan Aimin's book on the principles and implementation of the Windows kernel:
If you are interested in the details, please look them up yourself. Note that the book is based on WRK. Where structures or code are concerned, please rely on the reverse-engineering results.