User-mode API vs system API
Two layers matter for this topic:
- User-mode API: the friendly libraries you P/Invoke, such as user32.dll and kernel32.dll. These contain compatibility shims, validation, and business logic. They are the usual target for hooks.
- System API: the thin transition layer, such as ntdll.dll and win32u.dll. These hold the actual syscall stubs that load an ID into eax and trap into kernel mode. This is the canonical place to observe which syscall number will be invoked.
Hook taxonomy
Almost every hook you will encounter is one of a handful of patterns. Grouped by mechanism:
- Debug interrupts: int3 or int n patched over the prologue, caught by a vectored exception handler.
- Debug page break: PAGE_GUARD, NX violations, or explicit illegal instructions that provoke an exception the handler interprets.
- Inline jump / call: a jmp or call patched over the first few bytes of the function, landing in the hook’s trampoline.
- Inline warp: return-address rewriting, either on the stack or via RIP, redirecting control flow without patching the function itself.
- Export address / IAT: rewriting the address the loader resolved, so the caller ends up in a forwarder. Lump this and other address-manipulation tricks into a single “forwarding” category.
There are exotic variants (zeroing the IAT entry and catching the fault, forcing Win32 fallback through int 2e, or intercepting the syscall kernel routine itself), but the five above cover nearly everything you’ll see in real code.
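As a rough illustration of the taxonomy, the patching-based patterns can be recognized from the first bytes of a function. The sketch below is an example under stated assumptions, not a complete detector: PrologueKind and PrologueClassifier are hypothetical names, only a handful of common x64 opcodes are checked, and page-guard, return-address, and export/IAT hooks leave the prologue untouched, so they need other checks.

```csharp
using System;

public enum PrologueKind
{
    CleanStub,          // mov r10, rcx ; mov eax, imm32
    Int3,               // cc
    IntN,               // cd nn
    IllegalInstruction, // 0f 0b (ud2)
    InlineJump,         // e9 rel32, eb rel8, or ff 25 (jmp [rip+disp32])
    InlineCall,         // e8 rel32
    Unknown
}

public static class PrologueClassifier
{
    // First four bytes of an unmodified x64 stub: 4c 8b d1 (mov r10, rcx) + b8 (mov eax, imm32).
    private static ReadOnlySpan<byte> CleanPrefix => new byte[] { 0x4c, 0x8b, 0xd1, 0xb8 };

    public static PrologueKind Classify(ReadOnlySpan<byte> prologue)
    {
        if (prologue.Length == 0)
            return PrologueKind.Unknown;
        if (prologue.StartsWith(CleanPrefix))
            return PrologueKind.CleanStub;

        // Two-byte patterns first, so the single-byte switch below stays simple.
        if (prologue.Length >= 2)
        {
            if (prologue[0] == 0x0f && prologue[1] == 0x0b) return PrologueKind.IllegalInstruction;
            if (prologue[0] == 0xff && prologue[1] == 0x25) return PrologueKind.InlineJump;
        }

        switch (prologue[0])
        {
            case 0xcc: return PrologueKind.Int3;
            case 0xcd: return PrologueKind.IntN;
            case 0xe9:
            case 0xeb: return PrologueKind.InlineJump;
            case 0xe8: return PrologueKind.InlineCall;
            default:   return PrologueKind.Unknown;
        }
    }
}
```

Anything that comes back as Unknown deserves a look in a real disassembler; a byte switch like this is only a first pass.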
The shape of a system-API stub
A syscall stub in win32u.dll or ntdll.dll is almost comically simple. For example, NtUserDispatchMessage on a modern x64 Windows looks approximately like:
    mov  r10, rcx                    ; 4c 8b d1
    mov  eax, 0x1036                 ; b8 36 10 00 00   syscall ID
    test byte ptr [0x7FFE0308], 1    ; f6 04 25 08 03 fe 7f 01
    jne  legacy_int2e                ; 75 03
    syscall                          ; 0f 05
    ret                              ; c3
legacy_int2e:
    int  0x2e                        ; cd 2e
    ret                              ; c3
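The identifier (0x1036 here) is the kernel’s index for the target routine. It is specific to a Windows build, so treat any hard-coded value as an example only. The byte test reads the SystemCall flag in the shared user data page (mapped at 0x7FFE0000, flag at offset 0x308), which selects between the syscall fast path and the legacy int 2e path; on anything remotely modern, the syscall path is the one normally taken.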
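To make the layout concrete, here is a minimal sketch that checks a byte sequence against the stub shape shown above and pulls out the ID. StubDecoder and TryReadSyscallId are hypothetical names, and the check assumes the unmodified x64 prologue (4c 8b d1 b8); a patched stub simply fails the match.

```csharp
using System;
using System.Buffers.Binary;

public static class StubDecoder
{
    // Returns true and the syscall ID when the bytes start with
    // mov r10, rcx (4c 8b d1) followed by mov eax, imm32 (b8 + 4 bytes).
    public static bool TryReadSyscallId(ReadOnlySpan<byte> stub, out int id)
    {
        id = 0;

        // 3 bytes for mov r10, rcx plus 5 bytes for mov eax, imm32.
        if (stub.Length < 8)
            return false;

        if (stub[0] != 0x4c || stub[1] != 0x8b || stub[2] != 0xd1 || stub[3] != 0xb8)
            return false;

        // The immediate is stored little-endian at offset 4.
        id = BinaryPrimitives.ReadInt32LittleEndian(stub.Slice(4, 4));
        return true;
    }
}
```

Feeding it the bytes from the listing (4c 8b d1 b8 36 10 00 00 ...) yields 0x1036; a prologue that begins with a jmp or int3 returns false, which is itself a useful signal.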
Extracting the syscall ID
Hooks that live in user-mode libraries generally do not modify the system-API stub itself; they modify the user-mode function that ends up calling it. That means the stub is still your ground truth for the syscall ID.
In C#, Iced.Intel and Capstone are good options for walking the instruction stream; a minimal implementation just looks for the mov eax, imm32 in the prologue:
    private static unsafe int GetIdentifier(string name)
    {
        ExportFunction function = GetFunctions().First(x => x.Name == name);
        if (function.Address == null)
            throw new ArgumentException($"{name} is not an export function");

        // For HookType.FORWARDED the ID is read from byte offset 4. In an intact stub that is
        // the imm32 operand, right after 4c 8b d1 (mov r10, rcx) and the b8 opcode.
        // Otherwise the ID is derived from the export's index in the cached list.
        return function.GetHookType() == HookType.FORWARDED
            ? *(int*)(function.Address + sizeof(int))
            : (function.IsSharedExport ? 0 : 4072) + _functionCache.IndexOf(function);
    }
Functions beginning with Zw are shared exports: in user-mode ntdll.dll each one resolves to the same stub as its Nt counterpart rather than to a separate prologue. The code flags them with IsSharedExport, which is why the cache index is adjusted conditionally.
Building the stub yourself
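Once you have the ID, the call is just 11 bytes of shellcode (the buffer below is allocated with 14, leaving the tail zeroed): copy rcx into r10 (the syscall instruction overwrites rcx with the return address, so Windows expects the first argument in r10), set the ID in eax, syscall, ret.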
    private static byte[] GenerateShellcode(string name)
    {
        // Pinned so the address stays stable; 11 bytes are emitted, the remaining 3 stay zero.
        byte[] shellcode = GC.AllocateArray<byte>(14, pinned: true);
        int id = GetIdentifier(name);

        // mov r10, rcx
        shellcode[0] = 0x4c;
        shellcode[1] = 0x8b;
        shellcode[2] = 0xd1;

        // mov eax, id (imm32, little-endian)
        shellcode[3] = 0xb8;
        shellcode[4] = (byte)(id >> 0);
        shellcode[5] = (byte)(id >> 8);
        shellcode[6] = (byte)(id >> 16);
        shellcode[7] = (byte)(id >> 24);

        // syscall
        shellcode[8] = 0x0f;
        shellcode[9] = 0x05;

        // ret
        shellcode[10] = 0xc3;

        return shellcode;
    }
Make the buffer executable and hand it to the runtime as a delegate:
    byte[] shellcode = GenerateShellcode(name);
    nint addr = Marshal.UnsafeAddrOfPinnedArrayElement(shellcode, 0);
    VirtualAlloc(addr, 14, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
    Delegate @delegate = Marshal.GetDelegateForFunctionPointer(addr, delegateType);
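A variation on the step above, not the article's method: instead of making a pinned managed array executable, the bytes can be copied into a dedicated page that is writable first and then switched to execute-read, so it is never writable and executable at the same time. ExecutableBuffer and Create are hypothetical names; the P/Invoke declarations follow the documented kernel32 signatures, and the sketch only makes sense on Windows.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public static class ExecutableBuffer
{
    private const uint MEM_COMMIT = 0x1000;
    private const uint MEM_RESERVE = 0x2000;
    private const uint PAGE_READWRITE = 0x04;
    private const uint PAGE_EXECUTE_READ = 0x20;

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr VirtualAlloc(
        IntPtr lpAddress, UIntPtr dwSize, uint flAllocationType, uint flProtect);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool VirtualProtect(
        IntPtr lpAddress, UIntPtr dwSize, uint flNewProtect, out uint lpflOldProtect);

    // Copies the code into a fresh page and returns its address.
    // The caller owns the page and should release it with VirtualFree when done.
    public static IntPtr Create(byte[] code)
    {
        UIntPtr size = (UIntPtr)(uint)code.Length;

        // Step 1: a private read-write page, so the copy below is allowed.
        IntPtr page = VirtualAlloc(IntPtr.Zero, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
        if (page == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());

        // Step 2: copy the generated bytes into the page.
        Marshal.Copy(code, 0, page, code.Length);

        // Step 3: drop write access and add execute access.
        if (!VirtualProtect(page, size, PAGE_EXECUTE_READ, out _))
            throw new Win32Exception(Marshal.GetLastWin32Error());

        return page;
    }
}
```

The returned address can be passed to Marshal.GetDelegateForFunctionPointer in place of the pinned array address. The trade-off is one extra allocation per stub in exchange for leaving the managed heap's page protections alone.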
For flexibility, System.Linq.Expressions.Compiler.DelegateHelpers.MakeNewCustomDelegate (an internal helper, so it has to be invoked through reflection) can build a delegate type dynamically with whatever signature you want, so the resulting callable is indistinguishable from a regular P/Invoke target at the call site.
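Because that helper is not public, reaching it takes reflection. The following is a sketch under assumptions: CustomDelegateFactory is a hypothetical wrapper, the type and method names are looked up as strings and may change or be trimmed away between runtime versions, and the Type[] argument is assumed to list the parameter types first with the return type last.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class CustomDelegateFactory
{
    // signature: parameter types in order, return type as the final element (assumed layout).
    public static Type Create(params Type[] signature)
    {
        // The helper lives in the same assembly as the public Expression class.
        Type helpers = typeof(Expression).Assembly
            .GetType("System.Linq.Expressions.Compiler.DelegateHelpers")
            ?? throw new MissingMemberException("DelegateHelpers was not found in this runtime.");

        MethodInfo make = helpers.GetMethod(
                "MakeNewCustomDelegate",
                BindingFlags.NonPublic | BindingFlags.Static)
            ?? throw new MissingMethodException("MakeNewCustomDelegate was not found in this runtime.");

        // Wrap the array so it is passed as one Type[] argument, not expanded into many.
        return (Type)make.Invoke(null, new object[] { signature });
    }
}
```

For example, CustomDelegateFactory.Create(typeof(IntPtr), typeof(int)) is intended to produce a non-generic delegate type taking an IntPtr and returning int. Non-generic matters here because Marshal.GetDelegateForFunctionPointer rejects generic delegate types such as Func<IntPtr, int>.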
What this is actually good for
- Rebuilding a clean path to the kernel when user-mode libraries are instrumented.
- Detecting and classifying hooks by comparing observed prologues to the expected stub shape.
- Writing tooling that needs to reason about how syscalls are dispatched (debuggers, tracers, compatibility shims).
What it’s not good for: evading security software on systems you don’t own. The techniques are documented in OS-internals literature; this is a walk-through of the mechanics, not a bypass guide.