
The Ransomware-as-a-Service (RaaS) group Lockbit is the main pillar of the RaaS business model, largely because of its strong commitment to product development: each release ships Ransomware whose implementations are current at the time of release. The Lockbit4.0 (or Lockbit Green) version is no different. It is a major update, especially in the Evasion and Obfuscation layer. In this research, I analyze the main Obfuscation and Evasion capabilities implemented in Lockbit4.0 and provide some Intelligence insights after the analysis.
Below is the SHA256 hash of the Lockbit4.0 sample that I will analyze in this research.
"sha256":"21E51EE7BA87CD60F692628292E221C17286DF1C39E36410E7A0AE77DF0F6B4B"
Reverse Engineering of the Unpacking Process
The Lockbit4.0 unpacking process is quite complex, and I will try to describe my analysis based on its pseudocode. Below, we can see the beginning of the unpacking algorithm.

The code starts by reading a byte from the compressed stream and storing it in rdx. It then loads the value of arg5 and doubles it (arg5 << 1). The doubling shifts the most significant bit (MSB) out of the register, so a carry is produced whenever the MSB was set. The carry is saved in the c_1 flag and is used later to determine whether the byte can be copied directly or needs to be processed.

The code then checks whether arg5 (in temp0_1) is equal to its negation (-arg5, or temp1_1), which in 32-bit arithmetic is the same as testing whether the doubled value is zero. This check exists because certain values in the compressed byte stream act as special markers that need to be processed differently. If the equality is true:
- A new byte is loaded into rbx.
- The arg2 pointer is adjusted to advance 4 bytes.
- An ADC (Add with Carry) operation is performed on rbx, modifying arg5.
- A new byte is loaded into rdx, continuing the data extraction.
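The doubling, carry and reload steps described above can be modelled as a small bit reader. This is only a sketch under assumptions: the class name, the method names and the sentinel bit planted by the ADC are hypothetical, and the code illustrates the mechanism rather than reproducing the packer's exact routine.

```python
MASK_32BIT = 0xFFFFFFFF


class CarryBitReader:
    """Hypothetical model of a 32-bit bit buffer consumed one bit at a time via the carry flag."""

    def __init__(self, stream: bytes):
        self.stream = stream
        self.pos = 0   # read position in the compressed stream (arg2 in the text)
        self.reg = 0   # 32-bit bit buffer (arg5 in the text)

    @staticmethod
    def _adc(value: int, carry_in: int):
        """Model 'adc reg, reg': double the value, add carry_in, return (new_reg, carry_out)."""
        doubled = (value << 1) | carry_in
        return doubled & MASK_32BIT, (doubled >> 32) & 1

    def get_bit(self) -> int:
        # Doubling the register pushes its MSB into the carry flag
        self.reg, carry = self._adc(self.reg, 0)
        if self.reg == 0:
            # Buffer exhausted: load the next 4 bytes and advance the pointer by 4
            word = int.from_bytes(self.stream[self.pos:self.pos + 4], "little")
            self.pos += 4
            # Assumption: the ADC runs with carry-in 1, planting a sentinel bit
            # that marks where the freshly loaded word ends
            self.reg, carry = self._adc(word, 1)
        return carry


if __name__ == "__main__":
    reader = CarryBitReader(bytes.fromhex("01000080"))  # the dword 0x80000001
    print([reader.get_bit() for _ in range(32)])
```

Each loaded dword yields exactly 32 bits, most significant bit first, before the register drains to zero and the next reload is triggered.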

If the carry flag c_1 was set earlier, the byte that was read (rdx) needs no special transformation and can be copied directly to the output buffer.
- The pointer arg2 (read from the compressed stream) and arg1 (write position in the output buffer) are incremented.
- The stream continues with the next byte.
If c_1 is false, the byte cannot be copied directly and must be processed in a secondary loop. In this secondary loop, the code performs the following steps:
- An internal function (__return_addr_1) is called and returns temporary values.
- The arg5 register is ADCed and shifted to extract more information from the compressed bytes.
- The marker special condition is tested again to see if a new decode is needed.
- If the carry extraction indicates an invalid value, a new byte is loaded and tested again.

After decoding, the code combines the extracted values to determine whether a patch in memory is necessary.
- If the result of the combination is 0xffffffff, the algorithm interprets this as a signal to start the patching routine.
- Otherwise, the extracted values are written directly to the output buffer.


Now we come to the last block of the unpacking code. If the code detects that patching is necessary, it performs a series of operations that directly modify specific regions of memory. This last block does the following:
- A loop walks through a section of memory, identifying and correcting relative addresses.
- Some instructions use _bswap to reverse the order of bytes.
- A set of subtractions adjusts obfuscated values to restore the correct bytes of the original code.
- Calls to VirtualProtect are made to change the memory permissions, ensuring that the modifications are applied.
- The transferred code is cleared and prepared for execution in a region now filled with unpacked code, in the UPX0 section. Specifically, execution moves to address 0x140013a9f (unpacked_code_UPX0), which is the entry point of the unpacked Lockbit4.0.
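The address-correction step in the list above (byte swap followed by a subtraction) resembles the call/jump fix-up that several packers apply after decompression. The sketch below shows one way such a fix-up can work. It assumes that operands of E8/E9 instructions were stored as big-endian absolute offsets, which is not confirmed for this sample, and the function names are hypothetical.

```python
MASK_32BIT = 0xFFFFFFFF
CALL_JMP_OPCODES = (0xE8, 0xE9)  # call rel32 / jmp rel32


def filter_calls(code: bytearray) -> None:
    """Packing side (assumed): relative operand -> big-endian absolute offset."""
    i = 0
    while i + 5 <= len(code):
        if code[i] in CALL_JMP_OPCODES:
            rel = int.from_bytes(code[i + 1:i + 5], "little")
            absolute = (rel + i + 5) & MASK_32BIT
            code[i + 1:i + 5] = absolute.to_bytes(4, "big")
            i += 5
        else:
            i += 1


def unfilter_calls(code: bytearray) -> None:
    """Unpacking side: byte-swap the stored operand, then subtract the position."""
    i = 0
    while i + 5 <= len(code):
        if code[i] in CALL_JMP_OPCODES:
            stored = int.from_bytes(code[i + 1:i + 5], "big")  # the bswap step
            rel = (stored - (i + 5)) & MASK_32BIT              # the subtraction step
            code[i + 1:i + 5] = rel.to_bytes(4, "little")
            i += 5
        else:
            i += 1


if __name__ == "__main__":
    original = bytearray.fromhex("e810000000909090")  # call +0x10; nop; nop; nop
    packed = bytearray(original)
    filter_calls(packed)
    unfilter_calls(packed)
    print(packed == original)
```

Both passes skip the four operand bytes after every opcode they touch, so they visit the same positions and the round trip restores the original bytes.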
This last loop, in which VirtualProtect is called and the program flow is unconditionally transferred to the unpacked code, is clearly visible in the graph view of the Disassembly below.

And below you can see that the region at offset 0x140013a9f is in fact statically empty.

Now let’s analyze this algorithm dynamically, with the aim of extracting the unpacked code from Lockbit4.0. Below we can see the exact space still empty, before the unpacking process.

After the unpacking process is complete, the previously empty space is now filled with the unpacked code.

Once we reach this point, we use the Scylla plugin to dump the unpacked Lockbit4.0.

When comparing the original packed Lockbit4.0 sample with the dynamically extracted unpacked version, we can observe a big difference in size and in the Detect It Easy (DiE) detection results.

Below we can also see the difference in data organization in the packed version.

And below we can see the structural change of the unpacked version.

Although the section names resemble the IOCs left by UPX, this sample is not packed with UPX. When we go to the offset where the unpacked code was written, we can see that it is filled with valid code, in this case the Lockbit4.0 Main function.

Analyzing Dynamic DLL and API Resolution
The main obfuscation feature implemented by Lockbit4.0 is the runtime DLL and API resolution technique, which is divided into three functions. The image below illustrates the flow that is executed whenever Lockbit4.0 needs a certain API.

So, let’s look first at DLL resolution via Hashing. Lockbit4.0’s hashing algorithm is relatively simple, includes no extra layers of obfuscation, and is intended to obfuscate the DLLs that will be resolved at runtime. The algorithm traverses a data structure, applying mathematical transformations and bitwise operations to generate a cumulative value (in the rcx_17 variable).

Below is an objective summary of this algorithm:
- Initialization: Sets rdx_8 = 0 and rcx_17 = 0x14bf.
- Main Loop: Reads values from memory indexed by r14_2 and stops when it finds a null value.
- Conditional Conversion: If the value is an uppercase letter (A-Z), converts it to lowercase (+0x20).
- Hash Calculation: XORs the index with the constant (rdx_8 ^ 0x14bf) to ensure variation in the values.
- Hashing or Checksum: Multiplies and combines values with XOR to create a cumulative identifier.
- Iteration: Increments rdx_8 and r14_2, advancing to the next data block.
Below is the Python implementation that I developed, which is already available in HashDB for automatic resolution through the plugins available for Ghidra, IDA and Binary Ninja.
def lockbit4_hashing(hashing):
    MASK_32BIT = 0xffffffff
    hash_value = 0x14bf
    char_index = 0
    for char in hashing:
        char_code = ord(char)
        # Fold uppercase A-Z to lowercase so the hash is case-insensitive
        if 0x41 <= char_code <= 0x5A:
            normalized_char = (char_code + 0x20) & MASK_32BIT
        else:
            normalized_char = char_code
        # The index-based modifier is zero for the first character
        if char_index == 0:
            index_modifier = 0
        else:
            index_modifier = (char_index ^ 0x14bf) & MASK_32BIT
        hash_value = (index_modifier * (((char_index + 0x14bf) * normalized_char + (hash_value ^ normalized_char)) & MASK_32BIT) + normalized_char) & MASK_32BIT
        char_index += 1
    return hash_value
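To show how a hash like this is typically reversed during analysis, the sketch below precomputes hashes for a list of candidate names and looks observed values up in the resulting table. The candidate list and the helper names (lb4_hash, build_lookup, resolve) are assumptions made for illustration. The hash function is a compact restatement of the routine above so that the block runs on its own.

```python
MASK_32BIT = 0xFFFFFFFF
SEED = 0x14BF


def lb4_hash(name: str) -> int:
    """Compact restatement of the hashing routine shown above."""
    value = SEED
    for index, char in enumerate(name):
        code = ord(char)
        if 0x41 <= code <= 0x5A:          # fold A-Z to lowercase
            code += 0x20
        modifier = 0 if index == 0 else (index ^ SEED)
        inner = ((index + SEED) * code + (value ^ code)) & MASK_32BIT
        value = (modifier * inner + code) & MASK_32BIT
    return value


def build_lookup(candidates):
    """Map each candidate's hash back to its name."""
    return {lb4_hash(name): name for name in candidates}


# Hypothetical candidate list; in practice this would be every export of the DLLs of interest
CANDIDATES = ["ntdll.dll", "kernel32.dll", "EtwEventWrite", "ZwWriteVirtualMemory"]
LOOKUP = build_lookup(CANDIDATES)


def resolve(hash_value: int):
    """Return the candidate name for a hash seen in the sample, or None if unknown."""
    return LOOKUP.get(hash_value)


if __name__ == "__main__":
    for name in CANDIDATES:
        print(f"{name}: {lb4_hash(name):#010x}")
```

Because the routine folds uppercase letters, "NTDLL.DLL" and "ntdll.dll" map to the same value, so one table entry covers both spellings.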
In the following sequence of images, we can observe the use of HashDB for resolving DLL/API Hashing in Lockbit4.0.



A good example of the Hashing resolution flow is the code below from Lockbit4.0. We first see the resolution of the ntdll.dll Hash and the collection of its offset, followed by the resolution of the EtwEventWrite API Hash, with both results stored in a Lockbit4.0 custom structure. This piece of code begins the ETW Patching technique: Lockbit4.0 collects the address of the EtwEventWrite API in order to overwrite the beginning of the API code with the ret opcode (0xc3), thus applying the patch.

As we can see above, after the resolution a fake API address is calculated by the lockbit4_calc_fake_api_addr function, and then the real address is calculated and stored once again in the Lockbit4.0 custom struct. Below, we can see that the calculation that yields the correct API address is a simple XOR operation with values present within the Lockbit4.0 custom struct.
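The XOR step can be modelled in a few lines. The struct layout is not given in the text, so the field names below (masked_addr, xor_key), the helper mask_address and the sample values are purely hypothetical. The sketch only illustrates the idea of keeping a masked pointer and recovering the real one with XOR.

```python
from dataclasses import dataclass

MASK_64BIT = 0xFFFFFFFFFFFFFFFF


@dataclass
class ApiSlot:
    """Hypothetical stand-in for one entry of the custom struct."""
    masked_addr: int  # value kept in the struct instead of the plain API address
    xor_key: int      # value from the same struct used to unmask it

    def real_address(self) -> int:
        # XOR is its own inverse, so one operation recovers the pointer
        return (self.masked_addr ^ self.xor_key) & MASK_64BIT


def mask_address(real_addr: int, xor_key: int) -> ApiSlot:
    """Build a slot the way a masking step could (assumption, for the round trip)."""
    return ApiSlot(masked_addr=(real_addr ^ xor_key) & MASK_64BIT, xor_key=xor_key)


if __name__ == "__main__":
    slot = mask_address(0x00007FFB12345678, 0xDEADBEEFCAFEBABE)  # made-up values
    print(hex(slot.masked_addr), hex(slot.real_address()))
```

For an analyst, the practical consequence is that pointers read straight from the struct will not match any export until the XOR is applied.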

Continuing with the EDR Evasion technique via ETW Patching, and using it as an example of how the dynamic DLL/API resolution technique is repeated, below we can observe the execution of ZwWriteVirtualMemory overwriting the start of the EtwEventWrite API with the ret opcode (0xc3).
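From a defender's point of view, the patch described above leaves a simple trace: the first byte of EtwEventWrite becomes 0xc3. The sketch below is a minimal check on bytes that have already been read from a process or a memory dump. How those bytes are obtained is out of scope, and the function name and sample byte strings are hypothetical.

```python
RET_OPCODE = 0xC3


def etw_prologue_is_patched(prologue: bytes) -> bool:
    """Return True if the function starts with a bare ret, i.e. it returns immediately."""
    return len(prologue) > 0 and prologue[0] == RET_OPCODE


if __name__ == "__main__":
    # Illustrative byte strings, not taken from a real ntdll build
    untouched = bytes.fromhex("4c8bdc48")
    patched = bytes.fromhex("c38bdc48")
    print(etw_prologue_is_patched(untouched), etw_prologue_is_patched(patched))
```

A more robust check would compare the in-memory bytes with the on-disk copy of ntdll.dll, since a first-byte test only covers the single-ret variant described in this analysis.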

As we saw with the implementation of ETW Patching, all other capabilities depend on this same DLL/API resolution technique via Hashing. Capabilities such as:
- Disabling DLL Notification via the LdrUnregisterDllNotification API.
- Deleting Volume Shadow Copies via the DeleteSnapshots method of the IVssBackupComponents interface.
- Disabling the Volume Shadows Management Service via the OpenSCManager, OpenService and ChangeServiceConfig APIs.
- Enumerating Networks via APIs such as GetIpNetTable, inet_ntoa, gethostbyaddr and NetShareEnum.
- Deleting logs via the EvtOpenSession, EvtOpenChannelEnum, EvtNextChannelPath and EvtClearLog APIs.
- And so on.
Analysis of Cryptographic Algorithms for Obfuscation Implemented in Lockbit 4.0
Unlike version 3.0, Lockbit 4.0 implements two algorithms: one to decrypt Strings and one to decrypt the README that will be created throughout the system. The algorithm that decrypts strings is very simple, just an XOR operation, while the algorithm used to decrypt the README is the well-known RC4.
Below, we can see an example where Lockbit4.0 applies the algorithm to decrypt multiple strings needed for later actions.

Below you can see the algorithm itself, which involves logical operations with an XOR that starts with the key 0x3a, and is changed by the counter in each round of the loop, making each byte have a different XOR key.

Below is my implementation of this algorithm in Python, followed by the output of its execution.
def lb4_str_decrypt(data: bytes) -> str:
    # The input is a sequence of 16-bit little-endian words, so its length must be even
    if len(data) % 2 != 0:
        raise ValueError("[-] Error [-]")
    decrypted_chars = []
    key = 0x3a
    for i in range(0, len(data), 2):
        encrypted_word = int.from_bytes(data[i:i+2], byteorder='little')
        decrypted_word = encrypted_word ^ key
        # A decrypted null word terminates the string
        if decrypted_word == 0:
            break
        decrypted_chars.append(chr(decrypted_word))
    return ''.join(decrypted_chars)

if __name__ == "__main__":
    encrypted_data = b'\x6b\x00\x00\x00\x66\x00\x3a\x00'
    result = lb4_str_decrypt(encrypted_data)
    print("Decrypted String:", result)

In addition to the XOR algorithm above used for string decryption, Lockbit4.0 also implements the well-known RC4 algorithm, with the aim of decrypting the README that will be written throughout the system during the execution of the Ransomware. Below we can see the in-line implementation of the RC4 Algorithm present in Lockbit4.0, within the Main function itself.

Without any extra obfuscation layers, we are able to identify the RC4 key and the encrypted README.


Since it is a well-known algorithm, and widely used by Malware, it is easy to implement this algorithm in Python. Below is my implementation, followed by the output of its execution (I removed the values of the RC4 key and the large block of data from the encrypted README, to keep the visual appearance cleaner).
def rc4(key: bytes, data: bytes) -> bytes:
    # KSA (key-scheduling algorithm): initialize and permute the state array
    S = list(range(256))
    j = 0
    key_length = len(key)
    for i in range(256):
        j = (j + S[i] + key[i % key_length]) % 256
        S[i], S[j] = S[j], S[i]
    # PRGA (pseudo-random generation algorithm): XOR each byte with the keystream
    i = 0
    j = 0
    result = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        K = S[(S[i] + S[j]) % 256]
        result.append(byte ^ K)
    return bytes(result)

if __name__ == "__main__":
    # Placeholders: replace both with the hex strings extracted from the sample
    rc4_key = "RC4_KEY"
    encrypted_lb4_readme = (
        "ENCRYPTED_README_DATA"
    )
    rc4_key_bytes = bytes.fromhex(rc4_key)
    encrypted_readme_bytes = bytes.fromhex(encrypted_lb4_readme)
    decrypted_bytes = rc4(rc4_key_bytes, encrypted_readme_bytes)
    try:
        decrypted_lb4_readme = decrypted_bytes.decode("utf-8")
    except UnicodeDecodeError:
        decrypted_lb4_readme = decrypted_bytes.decode("latin1", errors="replace")
    print("\nLockbit4.0 Decrypted Readme:")
    print(decrypted_lb4_readme)
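Before trusting an RC4 re-implementation on malware data, it helps to check it against a publicly known test vector. The sketch below restates the routine compactly so that it runs on its own. The key "Key" and the plaintext "Plaintext" form a widely published RC4 test vector and have nothing to do with the Lockbit4.0 key or README.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4: key scheduling (KSA) followed by keystream generation (PRGA)."""
    state = list(range(256))
    j = 0
    for i in range(256):                       # KSA
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    i = j = 0
    out = bytearray()
    for byte in data:                          # PRGA
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


if __name__ == "__main__":
    ciphertext = rc4(b"Key", b"Plaintext")
    print(ciphertext.hex())                    # bbf316e8d940af0ad3
    print(rc4(b"Key", ciphertext))             # b'Plaintext'
```

Because RC4 is a symmetric stream cipher, the same function both encrypts and decrypts, which is why a single routine is enough to recover the README.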

Detection Engineering - Yara Rules
Below are the YARA rules I produced during the analysis of Lockbit4.0, focused on detecting code patterns in both the packed sample and the unpacked sample.
rule lb4_packer_was_detected
{
    meta:
        author = "0x0d4y"
        description = "Detect the packer used by Lockbit4.0"
        date = "2024-02-16"
        score = 100
        yarahub_reference_md5 = "15796971D60F9D71AD162060F0F76A02"
        yarahub_uuid = "f6f57eca-314b-4657-906e-495ea9b92def"
        yarahub_license = "CC BY 4.0"
        yarahub_rule_matching_tlp = "TLP:WHITE"
        yarahub_rule_sharing_tlp = "TLP:WHITE"
        malpedia_family = "win.lockbit"
    strings:
        $unpacking_loop_64b = { 8b 1e 48 83 ee fc 11 db 8a 16 72 e5 8d 41 01 41 ff d3 11 c0 01 db 75 0a }
        $jump_to_unpacked_code_64b = { 48 8b 2d 16 0f ?? ?? 48 8d be 00 f0 ?? ?? bb 00 ?? ?? ?? 50 49 89 e1 41 b8 04 ?? ?? ?? 53 5a 90 57 59 90 48 83 ec ?? ff d5 48 8d 87 ?? ?? ?? ?? 80 20 ?? 80 60 ?? ?? 4c 8d 4c 24 ?? 4d 8b 01 53 90 5a 90 57 59 ff d5 48 83 c4 ?? 5d 5f 5e 5b 48 8d 44 24 ?? 6a ?? 48 39 c4 75 f9 48 83 ec ?? e9 ?? ?? ?? ?? }
        $unpacking_loop_32b = { 8A 06 46 88 07 47 01 DB 75 ?? 8B 1E 83 EE ?? 11 DB 72 ?? 9C 29 C0 40 9D 01 DB 75 ?? 8B 1E 83 EE ?? 11 DB 11 C0 01 DB 73 ?? 75 ?? 8B 1E 83 EE ?? 11 DB 73 ?? }
        $jump_to_unpacked_code_32b = { 8b ae ?? ?? ?? ?? 8d be 00 f0 ?? ?? bb 00 ?? ?? ?? 50 54 6a 04 53 57 ff d5 8d 87 ?? ?? ?? ?? 80 20 ?? 80 60 ?? ?? 58 50 54 50 53 57 ff d5 58 8d 9e 00 f0 ?? ?? 8d bb ?? ?? ?? ?? 57 31 c0 aa 59 49 50 6a 01 53 ff d1 61 8d 44 24 ?? 6a ?? 39 c4 75 fa 83 ec ?? e9 ?? ?? ?? ??}
    condition:
        uint16(0) == 0x5a4d and
        1 of ($jump_to_unpacked_code_*) and
        1 of ($unpacking_loop_*)
}
rule lb4_rc4_alg
{
    meta:
        author = "0x0d4y"
        description = "Detect the implementation of RC4 Algorithm by Lockbit4.0"
        date = "2024-02-13"
        score = 100
        yarahub_reference_md5 = "062311F136D83F64497FD81297360CD4"
        yarahub_uuid = "4de48ced-b9fa-4286-aac4-c263ad20d67d"
        yarahub_license = "CC BY 4.0"
        yarahub_rule_matching_tlp = "TLP:WHITE"
        yarahub_rule_sharing_tlp = "TLP:WHITE"
        malpedia_family = "win.lockbit"
    strings:
        $rc4_alg = { 48 3d 00 01 00 00 74 0c 88 84 04 ?? ?? ?? ?? 48 ff c0 eb ec 29 c9 41 b8 ?? ?? ?? ?? 4c 8d 0d 15 7b 00 00 45 31 d2 48 81 f9 00 01 00 00 74 34 44 8a 9c 0c ?? ?? ?? ?? 45 00 da 89 c8 99 41 f7 f8 46 02 14 0a 41 0f b6 c2 8a 94 04 ?? ?? ?? ?? 88 94 0c ?? ?? ?? ?? 44 88 9c 04 ?? ?? ?? ?? 48 ff c1 eb c3 29 c0 48 8b 0d 14 9e 00 00 31 d2 45 29 c0 48 3d ?? ?? ?? ?? 74 4b 41 ff c0 45 0f b6 c0 46 8a 8c 04 ?? ?? ?? ?? 44 00 ca 44 0f b6 d2 46 8a 9c 14 ?? ?? ?? ?? 46 88 9c 04 ?? ?? ?? ?? 46 88 8c 14 ?? ?? ?? ?? 46 02 8c 04 ?? ?? ?? ?? 45 0f b6 c9 46 8a 8c 0c ?? ?? ?? ?? 44 30 0c 01 48 ff c0 eb ad }
    condition:
        uint16(0) == 0x5a4d and
        $rc4_alg
}
rule lb4_hashing_alg
{
    meta:
        author = "0x0d4y"
        description = "This rule detects the custom hashing algorithm of Lockbit4.0 unpacked"
        date = "2024-02-16"
        score = 100
        yarahub_reference_md5 = "062311F136D83F64497FD81297360CD4"
        yarahub_uuid = "d1a6d555-626d-4625-9da6-e4478cb7a142"
        yarahub_license = "CC BY 4.0"
        yarahub_rule_matching_tlp = "TLP:WHITE"
        yarahub_rule_sharing_tlp = "TLP:WHITE"
        malpedia_family = "win.lockbit"
    strings:
        $hashing_alg = { 41 89 d0 46 0f be 04 00 45 09 c0 74 ?? 45 8d 48 ?? 45 8d 50 ?? 41 80 f9 ?? 45 0f 43 d0 44 31 d1 44 8d 04 3a 45 0f af c2 41 01 c8 89 d1 31 f9 09 d2 0f 44 ca 41 0f af c8 44 01 d1 ff c2 eb ?? 49 ff c6 }
    condition:
        uint16(0) == 0x5a4d and
        $hashing_alg
}
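For quick triage on a machine without a YARA engine, the hex strings from the rules above can be approximated with Python's standard re module. The sketch below is a simplification and not a replacement for YARA: it only supports full-byte values and ?? wildcards (no nibble wildcards, jumps or alternations), it mirrors only the uint16(0) == 0x5a4d part of the conditions, and the function names are hypothetical.

```python
import re


def yara_hex_to_regex(pattern: str) -> re.Pattern:
    """Translate a YARA hex string (full bytes and ?? wildcards only) into a bytes regex."""
    parts = []
    for token in pattern.strip("{} ").split():
        if token == "??":
            parts.append(b".")                            # any single byte
        elif re.fullmatch(r"[0-9a-fA-F]{2}", token):
            parts.append(re.escape(bytes.fromhex(token)))
        else:
            raise ValueError(f"unsupported token: {token!r}")
    return re.compile(b"".join(parts), re.DOTALL)         # DOTALL so '.' also matches 0x0a


def scan(data: bytes, patterns: dict) -> dict:
    """Return {name: offset of first match} for buffers that start with the MZ magic."""
    if data[:2] != b"MZ":                                 # same test as uint16(0) == 0x5a4d
        return {}
    hits = {}
    for name, pattern in patterns.items():
        match = yara_hex_to_regex(pattern).search(data)
        if match:
            hits[name] = match.start()
    return hits


if __name__ == "__main__":
    patterns = {
        "unpacking_loop_64b": "{ 8b 1e 48 83 ee fc 11 db 8a 16 72 e5 8d 41 01 41 ff d3 11 c0 01 db 75 0a }",
    }
    # Synthetic buffer for demonstration, not a real sample
    blob = b"MZ" + b"\x00" * 10 + bytes.fromhex("8b 1e 48 83 ee fc 11 db 8a 16 72 e5 8d 41 01 41 ff d3 11 c0 01 db 75 0a")
    print(scan(blob, patterns))
```

This approach reports only the first match per pattern and does not combine patterns the way the rule conditions do, so a hit should be confirmed with the real rules.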
Detection Engineering - Yara Hunts
Using the YARA rules above, I carried out a Yara Hunt on UnpacMe. Below is the shared link with the matches produced by the Hunt.
Conclusion
Throughout the analysis, Lockbit4.0 shows itself to be a version far more concerned with Obfuscation, using techniques such as DLL/API Hashing and DLL/API address resolution divided into phases, with the clear purpose of hiding its intentions and slowing down the analysis. It is equally concerned with evading Endpoint Protection Software, through techniques such as ETW Patching and Disabling DLL Loading Notifications. In addition, this version introduces autonomous network enumeration, collecting IP addresses from the ARP Table and the Routing Table, through the IPs mentioned in the research. There is no secret in the implementation of this technique, since it relies entirely on Windows APIs, and the only layer of complexity is the dynamic DLL/API resolution technique.
Therefore, unlike the previous version, this new version of Lockbit ransomware is focused on staying under the radar.