Dissecting LockBit v3 ransomware
Introduction
In our last article, we recommended analyzing ransomware binaries as part of an effective ransomware response strategy:
“Analyzing binaries is hard. Analyzing obfuscated ransomware is even harder.
[...] However, it is worth investing in analyzing and understanding ransomware. Crypto breaking bugs may be rare, but they are not impossible to find. In addition, ransomware authors may not fully understand how to use crypto correctly. The only way to determine if it is possible to recover the data, if any, is the long and detailed ransomware analysis by an expert team.
[...] In addition, a successful analysis can help reassure you that there are no potential bugs in the encryption and decryption process. It also helps the technical team understand and potentially improve the recovery process. This is an investment that should be considered early on in an incident.”
In this article, we show some examples of crucial intelligence you can gain from a meticulous and accurate ransomware analysis. The target of this analysis is a variant of the LockBit v3 ransomware that we encountered in a recent engagement. This variant is also known as LockBit Black due to some code similarity with the BlackMatter family. These samples are built from the leaked LockBit v3 builder available on GitHub.
Calif discovered two issues in this version of the ransomware:
A crypto bug that may allow decryption of a portion of the data without the private key, i.e., without paying the ransom.
A design flaw that may cause data corruption and permanent data loss.
We decided to publish this analysis for the following reasons:
The crypto bug is already known to the malware author. We have observed newer variants where we can no longer take advantage of this bug.
We want to share our analysis and research to help other affected organizations prepare and respond to the same ransomware family, especially regarding the data corruption flaw.
The LockBit v3 family contains interesting anti-analysis techniques and clever use of standard cryptographic algorithms that are not well documented. These technical details would be valuable for malware researchers and threat hunters.
We also publish an open-source decryptor for this variant. You can download the tool from GitHub.
Calif would like to extend a special thank you to Chuong Dong – a malware expert who has previous experience with this ransomware family. During the initial analysis, we requested Chuong’s assistance to swiftly comprehend the file encryption scheme. His help proved highly valuable as we managed to quickly reimplement the decryptor.
Note that the screenshots and code snippets within this article assume that the encryptor is loaded at address 0xFA0000 instead of the default ImageBase of 0x400000. The decryptor is loaded at the ImageBase of 0x400000. In addition, the ransomware has many anti-debugging and obfuscation mechanisms. To bypass these protections and reproduce this analysis, please refer to Appendix A: Reverse engineering detail.
Table of contents
Encryption and decryption logic
Appendix A: Reverse engineering detail
Appendix B: Open-source decryption tool
Appendix C: Binary Information and Indicators of Compromise (IOCs)
Appendix D: IDC script to rename functions
Appendix E: Chunk counts and skip bytes
Encryption and decryption logic
The sample encrypts files using a combination of symmetric and asymmetric cryptography, as follows:
Generate a 64-byte random key for each targeted file. We will refer to it as the file_encryption_key. We identify the encryption algorithm as a variant of Salsa20. Normally, Salsa20 uses a 32-byte key, but this variant uses a 64-byte key. Please refer to the Modified Salsa20 section for more details. Unless specified otherwise, all references to Salsa20 in this document refer to this modified version.
Generate another 64-byte random Salsa20 key to encrypt the file_encryption_key. We will refer to this second key as the key_encryption_key. As an optimization to reduce the number of slow RSA encryption operations, the sample reuses this key for 1,000 files before generating a new one. This key reuse leads to a vulnerability described in the Keystream reuse vulnerability section.
Encrypt the key_encryption_key using RSA with no padding and a 1024-bit public key embedded in the sample. We describe this algorithm in the RSA with no padding section. Note that since 2015 NIST has recommended against using 1024-bit RSA keys.
Encrypted file structure
The sample processes targeted files the same way during encryption and decryption. It divides each file into chunks of 0x20000 bytes. The sample does not pad the file if the file size or the size of the last chunk is less than 0x20000 bytes.
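The chunking rule above can be illustrated with a short Python sketch. It assumes only the 0x20000-byte chunk size stated in the text. The function name chunk_ranges is ours and does not come from the sample.

```python
CHUNK_SIZE = 0x20000  # chunk size stated in the article


def chunk_ranges(file_size):
    """Yield (offset, length) for each chunk of a file.

    The last chunk is simply shorter when the file size is not a
    multiple of CHUNK_SIZE; no padding is added.
    """
    offset = 0
    while offset < file_size:
        length = min(CHUNK_SIZE, file_size - offset)
        yield offset, length
        offset += length


# A 0x50000-byte file splits into two full chunks and one short chunk.
for offset, length in chunk_ranges(0x50000):
    print(hex(offset), hex(length))
```

A file smaller than 0x20000 bytes yields a single short chunk.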
Consecutive chunks form a group. There are three group types: the before group, the skip group, and the after group. There is exactly one before group at the beginning of the file. Skip groups and after groups follow the before group and alternate throughout the rest of the file.
The sample encrypts chunks of the before group and after groups using Salsa20. It leaves chunks in the skip group unencrypted. It determines the number of chunks in each group based on the file size. Please refer to Appendix E for more details.
An encrypted file ends with a footer containing information about the file, such as its original name and the number of chunks in each group, as well as the file_encryption_key needed to decrypt the file data. The sample encrypts this footer and appends it to the file after the encryption finishes. For a detailed description of the footer structure, refer to the next section.
The overall structure of an encrypted file can be visualized as follows:
Footer structure
We reconstruct the overall structure of the footer in the C snippet below:
struct file_encryption_info
{
    char filename[file_encryption_info.filename_size]; // apLib compressed
    uint16_t filename_size;
    LARGE_INTEGER skipped_bytes;
    int before_chunk_count;
    int after_chunk_count;
    uint8_t file_encryption_key[0x40];
};

struct key_encryption_info
{
    uint16_t file_encryption_info_length; // necessary because filename is dynamically sized
    int checksum;
    union
    {
        struct
        {
            uint8_t key_encryption_key[0x40];
            uint8_t checksum[0x40];
        } decrypted;
        uint8_t encrypted_key_encryption_key[0x80]; // RSA encrypted
    } key_blob;
};

struct footer
{
    struct file_encryption_info file_encryption_info; // Salsa20 encrypted
    struct key_encryption_info key_encryption_info;
};
The file_encryption_info contains a randomly generated key to decrypt the file content. The file_encryption_info is encrypted using Salsa20. The key to decrypt the file_encryption_info is stored in the encrypted_key_encryption_key field of the key_encryption_info structure. This field, in turn, is encrypted using the RSA public key embedded in the ransomware.
The decryptor contains an embedded private key, and works as follows:
Read the key_encryption_info structure at offset 0x86 bytes from the end of the file.
Hash the encrypted_key_encryption_key field and verify it against the checksum field as seen here.
Decrypt the encrypted_key_encryption_key using the embedded private RSA key then validate the key_encryption_key with the decrypted.checksum field.
Calculate the start of the file_encryption_info structure using the file_encryption_info_length field.
Use the key_encryption_key to decrypt the file_encryption_info structure using the modified Salsa20 algorithm. This structure contains the Salsa20 file_encryption_key that can be used to decrypt the chunks in the before group and after group.
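The first and fourth steps above can be sketched in Python. The sketch assumes the key_encryption_info structure is packed little-endian with no alignment padding, which is consistent with its 0x86-byte size (2 + 4 + 0x80). It also assumes that file_encryption_info sits immediately before it. The function name and the returned dictionary are ours, not the decryptor's.

```python
import struct

KEY_INFO_SIZE = 0x86  # uint16 length + 32-bit checksum + 0x80-byte RSA blob


def parse_key_encryption_info(data):
    """Split the trailing 0x86 bytes of an encrypted file into fields.

    Assumption: the structure is packed (no padding) and little-endian.
    """
    if len(data) < KEY_INFO_SIZE:
        raise ValueError("file too short to contain a footer")
    trailer = data[-KEY_INFO_SIZE:]
    info_length, checksum = struct.unpack_from("<HI", trailer, 0)
    rsa_blob = trailer[6:]
    # Assumption: file_encryption_info directly precedes key_encryption_info.
    info_start = len(data) - KEY_INFO_SIZE - info_length
    if info_start < 0:
        raise ValueError("file_encryption_info_length exceeds file size")
    return {
        "file_encryption_info_length": info_length,
        "checksum": checksum,
        "encrypted_key_encryption_key": rsa_blob,
        "file_encryption_info_offset": info_start,
    }
```

The hashing, RSA decryption, and Salsa20 steps are not covered here because they depend on the sample's custom routines.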
Modified Salsa20
The sample encrypts the file_encryption_info structure and the chunks using Salsa20 at address 0x00FA20AC.
Salsa20 has a 64-byte state that is used to generate a keystream to encrypt the plaintext one 64-byte block at a time. In the vanilla Salsa20 standard, the initial 64-byte state consists of a 32-byte key, an 8-byte block counter, an 8-byte nonce, and a 16-byte constant that spells “expand 32-byte k” in ASCII.
However, in this variant, the entire initial state is filled with random values. The aforementioned file_encryption_key and key_encryption_key are the initial states of the file encryption and key encryption processes respectively.
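To make the difference concrete, here is a minimal Python sketch of the standard Salsa20 core function applied directly to an arbitrary 64-byte state. With vanilla Salsa20 the caller would build that state from key, counter, nonce, and constants; in this variant the whole state is random. The round count of 20 and the absence of any state update between blocks are assumptions, since the article does not describe how the sample advances the state.

```python
import struct

MASK32 = 0xFFFFFFFF


def rotl32(value, count):
    """Rotate a 32-bit value left by count bits."""
    return ((value << count) | (value >> (32 - count))) & MASK32


def quarterround(y0, y1, y2, y3):
    """Salsa20 quarter-round as defined in the Salsa20 specification."""
    z1 = y1 ^ rotl32((y0 + y3) & MASK32, 7)
    z2 = y2 ^ rotl32((z1 + y0) & MASK32, 9)
    z3 = y3 ^ rotl32((z2 + z1) & MASK32, 13)
    z0 = y0 ^ rotl32((z3 + z2) & MASK32, 18)
    return z0, z1, z2, z3


# Word indices touched by the column round and the row round.
COLUMNS = [(0, 4, 8, 12), (5, 9, 13, 1), (10, 14, 2, 6), (15, 3, 7, 11)]
ROWS = [(0, 1, 2, 3), (5, 6, 7, 4), (10, 11, 8, 9), (15, 12, 13, 14)]


def salsa20_block(state, rounds=20):
    """Turn a 64-byte state into one 64-byte keystream block.

    Assumption: 20 rounds, as in standard Salsa20/20.
    """
    if len(state) != 64:
        raise ValueError("state must be exactly 64 bytes")
    initial = struct.unpack("<16I", state)
    x = list(initial)
    for _ in range(rounds // 2):
        for a, b, c, d in COLUMNS + ROWS:  # one double round
            x[a], x[b], x[c], x[d] = quarterround(x[a], x[b], x[c], x[d])
    out = [(x[i] + initial[i]) & MASK32 for i in range(16)]
    return struct.pack("<16I", *out)
```

Passing os.urandom(64) as the state mirrors how the file_encryption_key and key_encryption_key are described above. The sample's own routine may differ in details that this sketch does not capture.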
This finding supports the view that LockBit v3 is a successor of BlackMatter, which in turn came from the Darkside ransomware family. Chuong’s analysis of Darkside shows that it also fills Salsa20’s initial state, which Chuong called the matrix, with random values.
RSA with no padding
This sample encrypts key_encryption_info.key_encryption_key and key_encryption_info.checksum, using a custom implementation of the RSA algorithm at address 0x00FA17B4.
Recall that an RSA public key consists of two components:
The modulus N.
The public exponent e.
To encrypt a message m using RSA with no padding, you compute m^e (mod N). This encryption mode, which is known as textbook RSA, has many potential footguns. For example, it’s possible to recover small messages. Therefore, m is usually padded with the PKCS #1 v1.5 or OAEP padding schemes.
However, the sample uses no padding. We can’t find any obvious issues, because the sample only encrypts messages that have the same size as the modulus. In particular, it uses a 1024-bit key to encrypt key_encryption_info.key_encryption_key and key_encryption_info.checksum, which in total are also 1024 bits long.
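Textbook RSA is a single modular exponentiation, as the following Python sketch shows. It uses a deliberately tiny toy key (p = 61, q = 53) purely for illustration; the sample uses a 1024-bit modulus and its own big-integer code. The function names are ours.

```python
def rsa_raw_encrypt(message, exponent, modulus):
    """Textbook RSA: c = m^e mod N, with no padding applied."""
    if not 0 <= message < modulus:
        raise ValueError("message must be smaller than the modulus")
    return pow(message, exponent, modulus)


def rsa_raw_decrypt(ciphertext, private_exponent, modulus):
    """Textbook RSA decryption: m = c^d mod N."""
    return pow(ciphertext, private_exponent, modulus)


# Toy parameters, far too small for real use: N = 61 * 53.
N, E, D = 3233, 17, 2753

ciphertext = rsa_raw_encrypt(65, E, N)
print(ciphertext)                         # 2790
print(rsa_raw_decrypt(ciphertext, D, N))  # 65
```

Because no randomness is involved, the same message always produces the same ciphertext. This is one reason padding schemes are normally used.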
File encryption
Before encrypting any files, the sample parses its embedded configuration at address 0x00FC600C. This data is decrypted by the function at 0x00FA6F48 and contains information such as configuration flags, file hashes to avoid, the ransom note, and the RSA public key used to encrypt the randomly generated key_encryption_info.key_encryption_key.
After decrypting its configurations, the sample parses its command line arguments and enumerates target paths to encrypt files. The sample operates slightly differently depending on the command line argument. However, the file encryption logic is similar across different execution flows. The sample creates one thread for traversing and queueing files to be encrypted and multiple threads to actually encrypt the files. The threads communicate asynchronously with each other using an IO completion port.
At a high level, the encryption threads work as follows:
The file traversal and queueing logic starts at 0x00FAF308.
It drops a ransom note in the current directory.
For each file in the current target directory, it verifies the filename against the lists of hashes to avoid. If the current filename doesn’t belong to any of the lists, it renames the current file and adds a unique extension. In our variant, the extension is .IzYqBW5pa.
It increases various counters, including a counter for the number of files using the current key_encryption_key. This key is randomly generated and replaced once every 1,000 files. Once this counter reaches 1,000, the sample resets it back to 0 and generates a new key_encryption_key. This design introduces a bug that may allow decryption of some files without paying the ransom. This bug is described in detail in the Keystream reuse vulnerability section.
It fills out and sets up the file_encryption_info structure for the current file. The logic to set the before_chunk_count, skipped_bytes, and after_chunk_count is at address 0x00FAE8AC. These values are determined based on the current file size. Refer to Appendix E for the exact values of each field based on the current file size.
The file encryption thread logic starts at address 0x00FADE78. This function simply determines if it needs to encrypt the current chunk depending on the file_encryption_info structure. It uses a randomly generated key stored at file_encryption_info.file_encryption_key to encrypt each chunk using Salsa20. Finally, when the entire file is processed, it writes the footer structure to the end of the file.
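The key rotation described in the steps above can be modeled with a few lines of Python. This is a behavioral sketch only: the class name, the use of os.urandom, and the exact boundary at which the counter is checked are assumptions, not details recovered from the sample.

```python
import os

FILES_PER_KEY = 1000  # reuse window described in the article


class KeyEncryptionKeySchedule:
    """Hand out a 64-byte key_encryption_key, rotating it every 1,000 files."""

    def __init__(self):
        self.counter = 0
        self.key = os.urandom(64)

    def key_for_next_file(self):
        # Once the counter reaches the limit, reset it and draw a new key.
        if self.counter == FILES_PER_KEY:
            self.counter = 0
            self.key = os.urandom(64)
        self.counter += 1
        return self.key


schedule = KeyEncryptionKeySchedule()
keys = [schedule.key_for_next_file() for _ in range(2500)]
print(len(set(keys)))  # number of distinct keys used for 2,500 files
```

Every file within the same window has its file_encryption_info encrypted from the same 64-byte state.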
File decryption
The decryptor binary LB3Decryptor.exe is not obfuscated and can be quickly analyzed statically.
Similar to the encryptor, the decryptor parses its command line arguments and enumerates paths to decrypt files. The sample also creates multiple threads for decrypting and one for traversing and queueing files. These threads communicate asynchronously with each other using an IO completion port.
At a high level, the decryption threads work as follows:
The file traversal and queueing logic starts at 0x00403CEC. For each file, it decrypts the key_encryption_info (see Footer structure) at address 0x00403960. Then, it obtains the file_encryption_key and the chunk counts before queueing the file.
The file decryption thread logic starts at 0x004030DC. It decrypts the chunks selected by the grouping algorithm using the file_encryption_key. Finally, when the entire file is processed, it removes the encrypted footer structure at the end of the file.
Flaws
Keystream reuse vulnerability
This version of the LockBit v3 ransomware has a keystream reuse vulnerability.
Instead of directly encrypting the file_encryption_info structure with RSA, the sample aims to reduce the number of slow RSA operations by adding another layer of Salsa20 encryption. This is where it makes a mistake that may allow the recovery of a portion of the data.
The sample generates a random Salsa20 key_encryption_key to encrypt the file_encryption_info structure once every 1,000 files as seen below:
Therefore, the Salsa20 algorithm generates the same keystream for 1,000 files from the same key. Within these 1,000 files, if there is a file with a sufficiently long compressed filename, we can recover enough of the keystream to decrypt the file_encryption_info structure of other files with a much shorter compressed filename. This file_encryption_info structure contains the file_encryption_key to decrypt the file content. In other words, if we happen to have a file with a sufficiently long compressed filename, chances are we can recover the content of other files with shorter compressed filenames without the private key from the threat actor, i.e., without paying the ransom.
For example, we created two short text files for our test case:
a.txt, whose compressed filename is: 61 e0 2e e0 74 e0 78 db 09 02 00 00
aABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{}~123456789.txt, whose compressed filename is shown below:
The content of the encrypted file with the longer file name is shown here:
Because we know the current filename, we can compute the plaintext compressed filename. XOR-ing the encrypted compressed filename with the plaintext compressed filename gives us the following keystream:
The content of the encrypted a.txt is shown below with similar color-coded fields:
XOR-ing the entire file_encryption_info block, starting at offset 0x0e to offset 0x6c with the keystream above would give us the following bytes:
The recovered file_encryption_info fields are:
Compressed filename: 61 e0 2e e0 74 e0 78 db 09 02 00 00 (compressed a.txt)
filename_size: 0c 00 (0x0c)
skipped_bytes: 00 00 52 00 00 00 00 00 (0x520000 in little-endian)
before_chunk_count: 03 00 00 00 (0x03 in little-endian)
after_chunk_count: 03 00 00 00 (0x03 in little-endian)
file_encryption_key:
The recovered file_encryption_info structure allows us to decrypt the entire file following the decryption scheme described above.
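The recovery steps in this walkthrough reduce to two XOR operations, sketched below in Python. The sketch assumes that the compressed filename is the first field of file_encryption_info (as in the footer structure) and that both footers were encrypted from the start of the same keystream. The function names are ours, and the demo data is synthetic.

```python
def xor_bytes(a, b):
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_keystream(encrypted_info, known_compressed_filename):
    """Known-plaintext step: ciphertext XOR plaintext gives a keystream prefix."""
    prefix = encrypted_info[: len(known_compressed_filename)]
    return xor_bytes(prefix, known_compressed_filename)


def decrypt_with_keystream(encrypted_info, keystream):
    """Decrypt another footer that was encrypted with the same keystream."""
    if len(keystream) < len(encrypted_info):
        raise ValueError("recovered keystream is too short for this footer")
    return xor_bytes(encrypted_info, keystream)


# Synthetic demo: one keystream shared by a long-named and a short-named file.
keystream = bytes((i * 7 + 3) % 256 for i in range(200))
long_name = bytes(range(100, 200))             # stands in for a long compressed filename
long_info = xor_bytes(long_name + b"\x00" * 80, keystream)
short_plain = b"short-footer".ljust(0x5E, b"\x01")
short_info = xor_bytes(short_plain, keystream)

recovered = recover_keystream(long_info, long_name)
print(decrypt_with_keystream(short_info, recovered) == short_plain)
```

The attack works only when the known compressed filename is at least as long as the whole file_encryption_info structure of the target file, which is 0x5e bytes for a.txt in the example above.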
Data corruption
This version of the LockBit v3 ransomware has a design flaw that can cause permanent data loss. LockBit v3 has a mutex checking mechanism to ensure only one instance of itself is running on the infected system:
However, this feature can be configured at build time and is disabled in our sample. The flag at byte_FC5129 is part of the sample’s encrypted settings configured by the threat actor (TA) and set by the builder. When this feature is disabled, multiple instances of the ransomware can run on the infected system at one time.
The sample needs exclusive access to each file it encrypts. To get it, the sample attempts to terminate other processes that hold the file open. The sample uses the Restart Manager family of APIs (RmStartSession(), RmRegisterResources(), RmGetList()) to get a list of processes with open handles to the file being encrypted. It then terminates all of those processes.
This design can cause permanent data corruption because of the following reasons:
With multiple instances of the same ransomware running on the same system, one instance can attempt to terminate another instance that is encrypting the same file. In this case, the randomly generated file_encryption_key from the first instance cannot be recovered. The file is permanently corrupted. We can detect the corruption by observing files with multiple extra extensions, signaling that the files were encrypted multiple times. Each instance of the ransomware can have multiple encryption threads running in parallel, each of which encrypts one file at a time. Since the number of concurrent threads is quite low, the number of files affected in this case is likely to be low.
The sample may attempt to terminate another process that is currently writing and modifying the current file. This may cause data corruption depending on how the affected process is designed. We cannot easily detect this case. However, the number of files affected can be very high depending on the services running on the infected system and their utilization. Calif has observed files that are properly decrypted but are corrupted and not recognized by their associated applications.
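The first case can be triaged with a simple filename check, sketched here in Python. It assumes that a file processed more than once carries the build's extension repeated as a suffix, and it uses the .IzYqBW5pa extension from our variant as the default. The function name and the sample filenames are hypothetical.

```python
def count_ransom_extensions(filename, extension=".IzYqBW5pa"):
    """Count how many times the ransomware extension is stacked at the end."""
    count = 0
    while filename.endswith(extension):
        filename = filename[: -len(extension)]
        count += 1
    return count


# Hypothetical filenames for illustration.
for name in ["a1b2c3.IzYqBW5pa", "a1b2c3.IzYqBW5pa.IzYqBW5pa", "notes.txt"]:
    print(name, count_ransom_extensions(name))
```

A count greater than one marks a file for manual review before any decryption attempt. A count of one does not prove the file is intact, because the second corruption case leaves no trace in the filename.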
Conclusion
Analyzing the ransomware can provide critical intelligence when evaluating response strategies to ransomware attacks. In this case, Calif observed flaws in the ransomware design that allowed affected organizations to reconsider the true value of the ransom demand. We hope our analysis helps demystify the inner workings of one ransomware variant. We also hope to encourage more sharing of technical analysis, curated intelligence, and valuable lessons across organizations. Security demands collaboration, as no organization operates in a vacuum. The more secure our peers, the safer we are against cyber criminals.
Appendix A: Reverse engineering detail
Anti-debugging
Typically, malware authors do not want their code to be analyzed. With a debugger, we can easily control the malware’s execution, dump data, or force the malware to execute a specific code path. Therefore, malware usually contains multiple anti-debugging checks. We found several such checks in this sample.
The first check occurs at 0x00FA63C5 (offset 0x57C5 into the file) as seen below:
After manually resolving some Windows Application Programming Interfaces (APIs), the sample calls the RtlCreateHeap() function to create a new heap. The result is a HANDLE to a Windows HEAP structure provided by the operating system for the current process. This HEAP structure is undocumented by Microsoft. To better understand this structure, refer to other online resources regarding the Windows HEAP. Significant to anti-debugging mechanisms, the HEAP structure contains two flags: Flag and ForceFlag. These values change depending on whether the current process is running under a debugger.
In the screenshot above, the sample checks the Flag field, which is at offset 0x40 bytes into the undocumented HEAP structure. The value of this Flag field is 0x40041062 as shown in the screenshot below:
The sample moves the most significant 4 bits of the Flag field into the low nibble by rotating the field 28 (0x1c) bits to the right, and tests the result against 0x04. This effectively tests the Flag field against 0x40000000 (HEAP_VALIDATE_PARAMETERS_ENABLED), which is set if the current process is running under a debugger.
If the sample detects a debugger, it modifies the HANDLE to the current process’s heap using the rol operation. This causes the process to crash if it ever tries to allocate any memory using the modified heap HANDLE in the future.
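The arithmetic of the Flag check can be reproduced in Python with the value from the screenshot. The helper names are ours; the rotation count and the 0x04 test come from the description above.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & MASK32


def debugger_flag_set(heap_flags):
    """Mirror the sample's check: ror by 28, then test bit 0x04.

    After the rotation, bit 30 (0x40000000) of the original value
    ends up at bit 2 (0x04) of the result.
    """
    return (ror32(heap_flags, 28) & 0x04) != 0


print(hex(ror32(0x40041062, 28)))     # 0x410624
print(debugger_flag_set(0x40041062))  # True: value observed under a debugger
print(debugger_flag_set(0x00000002))  # False: a value with bit 30 clear
```

Clearing the most significant byte of the field, as suggested later in this appendix, makes the check return False.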
A similar check of the heap’s ForceFlag field is shown below:
In the screenshot above, the sample finds the heap using the current process’s Process Environment Block (PEB). Then, it tests the ForceFlag field, which is at offset 0x44, against 0x40000000 to detect a debugger.
These checks are scattered around the sample’s logic near any heap operation. The easiest way to bypass these anti-debugging checks is to modify the process heap structures directly and reset both the Flag and ForceFlag fields’ most significant byte to 0x00.
This sample also contains the following additional anti-debugging features:
Checking beyond the bound of the allocated heap memory against magic constants like 0xABABABAB. These magic constants come from a Windows feature that adds additional guardrails to heap memory to quickly detect memory corruption bugs. This feature is only enabled if the current process is running under a debugger. This check can also be bypassed by modifying the Flag and ForceFlag fields of the heap.
Calling NtProtectVirtualMemory() and RtlEncryptMemory() to encrypt the DbgUiRemoteBreakin() function. This causes the current process to crash if there is any attempt to attach a debugger afterwards. This does not have any effect if we start executing the sample using the debugger.
Calling NtSetInformationThread() with ThreadHideFromDebugger (0x11) for the current thread. This call only happens a few times at the beginning of the execution flow. A quick way to bypass this is patching the function to simply return NT_SUCCESS (0x00).
Obfuscation
Manual API resolution
To avoid leaking capabilities and being tracked using the import hash, this sample manually resolves Windows APIs using the PEB.
The PEB contains all the properties of its associated process, including a list of loaded DLLs. The sample can walk this list of DLLs and their export tables to manually find addresses of the necessary Windows APIs. To further avoid leaking strings, the sample manually resolves Windows APIs using a hashing algorithm shown below:
The sample applies the hashing algorithm above on the DLLs and their export names to find a match instead of comparing strings normally. However, in addition to the “Addition-Rotate Right 13” operation, the sample also XORs the result with the 0x10035FFF constant. This results in a set of API hashes that are different from other malware families using a similar technique.
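The exact routine is only shown in the screenshot, so the following Python sketch illustrates the general technique rather than the sample's precise implementation. It shows a generic rotate-right-13 additive hash with the 0x10035FFF constant XORed into the result. The initial value, the point at which the XOR is applied, and the ASCII encoding of the name are all assumptions. The output is not expected to match the hashes stored in the sample.

```python
MASK32 = 0xFFFFFFFF
XOR_CONSTANT = 0x10035FFF  # constant named in the article


def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    return ((value >> count) | (value << (32 - count))) & MASK32


def ror13_add_hash(name):
    """Generic ROR-13 additive hash over the ASCII bytes of a name."""
    result = 0  # assumption: the hash starts from zero
    for byte in name.encode("ascii"):
        result = (ror32(result, 13) + byte) & MASK32
    return result


def api_hash_sketch(name):
    """Assumed variant: XOR the finished hash with the build constant."""
    return ror13_add_hash(name) ^ XOR_CONSTANT


print(hex(api_hash_sketch("NtOpenProcess")))
```

A resolver built on this idea would walk each loaded module's export table, hash every export name, and compare the result against a stored 32-bit value. Changing the constant changes every stored hash, which defeats lookup tables built for the unmodified algorithm.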
Trampoline code
The sample doesn’t use the resolved APIs directly. Instead, for each API, it allocates a small memory chunk and builds a small piece of trampoline code which calculates and jumps to the target API.
For example, instead of executing a standard indirect call to NtOpenProcess() as an import, the sample calls a pointer at address 0x00fc5474, which points to a function at address 0x00330bc8 on the heap. This function is shown below:
After the rol operation, eax becomes 0x7743FC50, which is the address of ntdll!NtOpenProcess(). This makes static analysis significantly more tedious. We would have a hard time tracking all the calls to the trampoline code.
Because we can bypass the anti-debugging checks, we can use a debugger to help us automate the renaming of the trampoline calls. The logic to resolve APIs and set up the trampoline code is at 0x00FA5dA0. Using the IDA Free debugger, we can set a breakpoint at 0x00FA5DDB, which is the instruction right after the call to manually resolve Windows APIs. Then, we can edit the breakpoint to execute the following one-liner:
fprintf(fopen("out.txt", "a+"),
"%s\n",
sprintf("%a -- %s",
GetRegValue("edi"),
get_name(GetRegValue("eax"))
)
)
This small snippet tells the IDA Free debugger to log the following items to the file “out.txt” in the current working directory:
The current value of the edi register, formatted as an address. This is the address of the pointer to the trampoline code. In our example, this would be 0x00FC5474.
The name of the value in the eax register. The eax register holds the address of the resolved API. In our example, eax would be 0x7743FC50. Within the current process context, this is the address of ntdll!NtOpenProcess().
Once the breakpoint is ready, we can let the sample execute through all the API resolution logic. At the end, we should see the out.txt file that looks similar to this:
After the sample finishes resolving all the APIs, we can dump the current process including all of its allocated memory for further analysis. Then, we can write a small IDA script to parse out.txt and rename all the trampoline calls to the appropriate APIs. This will help speed up our static analysis significantly. An example of such a script is available in Appendix D.
Appendix B: Open-source decryption tool
The leaked LockBit v3 builder generated the encryptor and decryptor for Windows. Although the decryptor can run on Linux using Wine, Calif decided to re-implement the decryption logic in C for the following reasons:
We want to run the decryptor natively on VMware ESXi.
We want to confirm our understanding of the encryption scheme.
We want to avoid executing the malware author’s decryptor which may contain other data corruption bugs.
Other affected organizations may also find our decryptor useful.
This section describes how we build a decryption tool for Linux. The tool is open-source and can be downloaded from GitHub.
Extracting the decryption function
Calif identified the two crypto functions to be Salsa20 (0x00FA20AC) and RSA with no padding (0x00FA17B4). Initially, instead of fully reverse-engineering these functions, we take the code directly from the binary and run it as shellcode inside a C wrapper. Calif’s decisions were based on the following reasons:
It would take us too long to fully analyze and confirm the algorithms.
The sample uses a custom implementation of the two algorithms. Therefore, re-implementation or using a standard library may introduce discrepancies and bugs.
We extract the following items directly from the ransomware into shellcode that we can call using our wrapper:
The Salsa20 encryption function and related functions
The Raw RSA function and related functions
The checksum calculation algorithm
The APLib compression function and related functions
When preparing these functions, we also fix any absolute address references so we can call them correctly in our wrapper without causing a crash.
Implementation
The sample is compiled for a 32-bit Windows environment. To get the shellcode to run correctly, we also need to compile our code for this environment. GCC defaults to the cdecl calling convention, but Windows uses stdcall. We fix that by adding the attribute __attribute__((stdcall)).
The data section of an executable is marked as non-executable, so we cannot execute the shellcode directly from there. Instead, we allocate new memory pages with executable permission and copy the shellcode over.
Appendix C: Binary Information and Indicators of Compromise (IOCs)
The following IOCs come from our specific build of this variant of LockBit v3. Here are the components that may be different across different builds:
The unique ID for this build: IzYqBW5pa.
File hashes other than the hash of the icon and desktop background.
Binary information
Filename: LB3.exe
File type: Windows Portable Executable (PE) x86
File size: 156,160
SHA256 hash: f34dd8449b9b03fedde335f8be51bdc7f96cda29a2dde176c3db667ba0713c6f
Filename: LB3Decryptor.exe
File type: Windows Portable Executable (PE) x86
File size: 33,280
SHA256 hash: 8f0a2d5b47441fbcf1882aa41cae22fd0db057ccc38abad87ccc28813df3a83c
Indicators of Compromise
Host-based indicators (HBIs)
Volatile:
When configured, the sample creates the following mutex: Global\a91a66d6abc26041b701bf8da3de4d0f, where a91a66d6abc26041b701bf8da3de4d0f is calculated from the embedded RSA public key.
Files
Filename: C:\ProgramData\IzYqBW5pa.ico where IzYqBW5pa is the unique ID for this specific variant.
File type: ICO
File size: 15,086
SHA256 hash: 95e059ef72686460884b9aea5c292c22917f75d56fe737d43be440f82034f438
Filename: C:\ProgramData\IzYqBW5pa.bmp.
File type: BMP
File size: 86,708
SHA256 hash: ef66e202c7a1f2a9bc27ae2f5abe3fd6e9e6f1bdd9d178ab510d1c02a1db9e4f
Filename: IzYqBW5pa.README.txt.
File type: TXT
File size: 6,197
SHA256 hash: af23f7d2cf9a263802a25246e2d45eaf4a4f8370e1b6115e79b9e1e13bf20bfe
Registry:
Path: HKEY_CLASSES_ROOT\.IzYqBW5pa\DefaultIcon
Value: C:\ProgramData\IzYqBW5pa.ico
Network-based indicators (NBIs):
When configured, the sample communicates with the configured C2 server using the HTTP POST method. This specific variant is not configured with a C2 server.
When communicating with the C2 server, the sample uses the following User-Agent string: Chrome/91.0.4472.77.
Communication with the C2 server is encrypted using the AES algorithm. This specific variant is not configured to communicate with the C2 server. Therefore, it also does not contain the AES key.
Appendix D: IDC script to rename functions
#include <idc.idc>

static process(line) {
    // example line: .data:00FC5410 -- ntdll_RtlCreateHeap
    auto idx = strstr(line, " -- ");
    // saddr: .data:00FC5410
    auto saddr = substr(line, 0, idx);
    // name: ntdll_RtlCreateHeap (skip the 4-character " -- " separator)
    auto name = substr(line, idx + 4, -1);

    // old saddr: .data:00FC5410
    // new saddr: 00FC5410 as a string
    // addr     : 0x00FC5410
    auto _idx = strstr(saddr, ":");
    saddr = substr(saddr, _idx + 1, -1);
    auto addr = xtol(saddr);

    // old name: ntdll_RtlCreateHeap
    // new name: RtlCreateHeap
    _idx = strstr(name, "_");
    name = substr(name, _idx + 1, -1);
    auto len = strlen(name);
    // NUL-terminate at the last byte (drops the trailing newline)
    name[len-1] = '\0';

    Message("Addr: 0x%x, name: %s\n", addr, name);
    set_name(addr, name, SN_NOCHECK|SN_FORCE);
}

static load_file() {
    auto fd = fopen("out.txt", "r");
    // fopen() returns 0 on failure
    if (fd == 0) {
        Message("Cannot open out.txt\n");
        return -1;
    }
    auto line = readstr(fd);
    // readstr() returns a non-string value at end of file
    while (value_is_string(line)) {
        process(line);
        line = readstr(fd);
    }
    fclose(fd);
    return 0;
}

static main() {
    load_file();
}
Appendix E: Chunk counts and skip bytes
Grouping algorithm
Each chunk of the file belongs to one of the three groups (before group, skip group, after group). But for the sake of simplicity, let’s only consider the state of each chunk: encrypted (before group or after group), or unencrypted (skip group).
To determine if a chunk needs encrypting or decrypting, we can use the following algorithm:
chunk_state = ''
# first 'before_chunk_count' chunks belong to before group and are encrypted
crypt_chunk_count = before_chunk_count
skip_chunk_count = (skipped_bytes // 0x20000) - 1
skip_count = skip_chunk_count
for chunk in chunks:
    if (crypt_chunk_count):
        chunk_state = "en(de)crypt"
        crypt_chunk_count = crypt_chunk_count - 1
    else:
        chunk_state = "skip" # belongs to a skip group
        skip_count = skip_count - 1
        if (skip_count == 0):
            # the skip group is finished: the next chunks form an after group
            crypt_chunk_count = after_chunk_count
            # reset the counter so that skip and after groups keep alternating
            skip_count = skip_chunk_count
The decrypted file_encryption_info contains the value for before_chunk_count, after_chunk_count and skipped_bytes. To see how and where they are generated, refer to the File encryption section. The sample determines the chunk count based on the file size as described in the following table: