A while back, as I was digging around inside a VMKernel core dump for an unnecessary amount of time, I started to wonder if I could write a program that would print an arbitrary string as its backtrace. You know, instead of outputting it onto the screen.
All that you would have to do is read a message from argv and convert each character into a call to a function of the same name. In other words, supplying the string hello world would result in calling the functions h(), e(), l(), l(), o(), and so on. Turns out, writing such a program is possible. Its backtrace looks something like this:

The Program
GDB prints the innermost stack frame at the top of the backtrace, continuing until the outermost frame is reached. Our program then has to call the functions making up the letters in reverse order. This was achieved by way of a linked list. Once all characters had been added to the list, the first node’s function pointer is called, which calls the second node’s function pointer, and so on. After much tinkering, I managed to get it all to work. The end result:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#define ASCII_LOWER_LBOUND 97
#define ASCII_LOWER_UBOUND 122
#define ASCII_UPPER_LBOUND 65
#define ASCII_UPPER_UBOUND 90
#define ASCII_INTERVAL (97 - 65)
typedef enum RetVal {
ERR_OK = 0,
ERR_TOO_FEW_ARGS,
ERR_NON_ALPHABET
} RetVal;
#define BAIL(__bail_retval) { retval = __bail_retval; goto finish; }
typedef struct LetterNode {
void (*func)(struct LetterNode *node);
struct LetterNode *next;
} LetterNode;
#define DEF_LETTER_FUNC(__def_letter_func_letter) \
void __def_letter_func_letter (LetterNode *node) \
{ \
if (node->next != NULL) node->next->func(node->next); \
int *p = 0; \
*p = 1; \
}
DEF_LETTER_FUNC(A);
DEF_LETTER_FUNC(B);
DEF_LETTER_FUNC(C);
DEF_LETTER_FUNC(D);
DEF_LETTER_FUNC(E);
DEF_LETTER_FUNC(F);
DEF_LETTER_FUNC(G);
DEF_LETTER_FUNC(H);
DEF_LETTER_FUNC(I);
DEF_LETTER_FUNC(J);
DEF_LETTER_FUNC(K);
DEF_LETTER_FUNC(L);
DEF_LETTER_FUNC(M);
DEF_LETTER_FUNC(N);
DEF_LETTER_FUNC(O);
DEF_LETTER_FUNC(P);
DEF_LETTER_FUNC(Q);
DEF_LETTER_FUNC(R);
DEF_LETTER_FUNC(S);
DEF_LETTER_FUNC(T);
DEF_LETTER_FUNC(U);
DEF_LETTER_FUNC(V);
DEF_LETTER_FUNC(W);
DEF_LETTER_FUNC(X);
DEF_LETTER_FUNC(Y);
DEF_LETTER_FUNC(Z);
DEF_LETTER_FUNC(_);
#define LETTER_FUNC_PTR(__letter_func_ptr_ch) (void (*)(LetterNode*)) \
(&A + ((&B - &A) * (__letter_func_ptr_ch - 'A')))
#define SPACE_FUNC_PTR() (void (*)(LetterNode*))(&_)
void freeLetterNodeList (LetterNode *head)
{
LetterNode *curr = head;
LetterNode *next;
do {
next = curr->next;
free(curr);
curr = next;
} while (next != NULL);
}
int main (int argc, char **argv)
{
LetterNode *head = NULL;
LetterNode *curr = NULL;
int msg_len = 0;
char ch;
RetVal retval = ERR_OK;
if (argc < 2)
BAIL(ERR_TOO_FEW_ARGS);
for (int i = argc - 1; i >= 1; --i) {
msg_len = strlen(argv[i]);
for (int j = msg_len - 1; j >= 0; --j) {
ch = argv[i][j];
if (ch >= ASCII_LOWER_LBOUND && ch <= ASCII_LOWER_UBOUND) {
ch -= ASCII_INTERVAL;
}
else if ((ch >= ASCII_LOWER_LBOUND
&& ch <= ASCII_LOWER_UBOUND) == 0
&& (ch >= ASCII_UPPER_LBOUND
&& ch <= ASCII_UPPER_UBOUND) == 0) {
BAIL(ERR_NON_ALPHABET);
}
if (curr == NULL) {
curr = malloc(sizeof(LetterNode));
curr->next = NULL;
if (head == NULL)
head = curr;
}
else {
curr->next = malloc(sizeof(LetterNode));
curr = curr->next;
}
curr->func = LETTER_FUNC_PTR(ch);
}
if (i > 1) {
curr->next = malloc(sizeof(LetterNode));
curr = curr->next;
curr->func = SPACE_FUNC_PTR();
}
}
head->func(head);
finish:
if (head != NULL)
freeLetterNodeList(head);
switch (retval) {
case ERR_TOO_FEW_ARGS:
printf("Too few arguments supplied."
"\nUsage: ./printer The message to print\n");
break;
case ERR_NON_ALPHABET:
printf("Non-alphabet character encountered."
" Currently only a-z and A-Z are supported.\n");
break;
}
return retval;
}
We will proceed to compile the program using GCC, by excuting the command gcc -O0 ./btprinter.c -o ./btprinter. To run the program, we will excute ./btprinter Hello World. The result is a core dump:
thebel:/home/thebel/programming/blog/misc$ ./btprinter Hello World
Segmentation fault (core dumped)
Next, we will open the core file using GDB (in my case gdb ./btprinter -c core.3796) and take a look at the backtrace:
(gdb) bt
#0 0x0000560b0b99738c in H ()
#1 0x0000560b0b9972ae in E ()
#2 0x0000560b0b997498 in L ()
#3 0x0000560b0b997498 in L ()
#4 0x0000560b0b99756a in O ()
#5 0x0000560b0b9978b2 in _ ()
#6 0x0000560b0b99779a in W ()
#7 0x0000560b0b99756a in O ()
#8 0x0000560b0b99763c in R ()
#9 0x0000560b0b997498 in L ()
#10 0x0000560b0b997268 in D ()
#11 0x0000560b0b997aed in main ()
(gdb)
The backtrace shows us the message we had entered. To keep the program short, it only prints capital letters and the underscore (which it substitutes for spaces).
How It Works
Let’s have a look at how the program works, step by step. First, we have defined a number of constants:
#define ASCII_LOWER_LBOUND 97
#define ASCII_LOWER_UBOUND 122
#define ASCII_UPPER_LBOUND 65
#define ASCII_UPPER_UBOUND 90
#define ASCII_INTERVAL (97 - 65)
These are the boundaries of the ASCII ranges a-z and A-Z (see https://www.asciitable.com/). We will use these boundaries to identify whether we have a function with the right name available. Remember that we need to call functions whose names are the same as the letters in the phrase we want to print.
The next few lines define an enum that holds the possible error states of the main() function. This is followed by a pre-processor macro that jumps to the finish label in main(). The reason for doing it this way is so the program can break out of nested loops and exit gracefully. The values making up enum RetVal will be explained when we look at main().
typedef enum RetVal {
ERR_OK = 0,
ERR_TOO_FEW_ARGS,
ERR_NON_ALPHABET
} RetVal;
#define BAIL(__bail_retval) { retval = __bail_retval; goto finish; }
To store the letters making up the message that will be printed in the backtrace, I decided to use a linked list. It consists of a function pointer and a pointer to the next node in the list. The function pointer for each node will be the function corresponding to the letter that will be printed at that point when the list is traversed.
For more on function pointers, check out https://www.geeksforgeeks.org/function-pointer-in-c/.
typedef struct LetterNode {
void (*func)(struct LetterNode *node);
struct LetterNode *next;
} LetterNode;
The next bit of code is another macro. It is used to make it easier to define all functions corresponding to printable characters. Essentially, the macro defines a template for our character functions.
#define DEF_LETTER_FUNC(__def_letter_func_letter) \
void __def_letter_func_letter (LetterNode *node) \
{ \
if (node->next != NULL) node->next->func(node->next); \
int *p = 0; \
*p = 1; \
}
The pre-processor will replace __def_letter_func_letter with the letter supplied as this macro’s argument. Say, we want to define a function for the character A. To do so, we simply “call” DEF_LETTER_FUNC(A);. The same goes for the remaining characters. The function generated by the macro takes a list node of type LetterNode *node. As we saw in the previous code listing, such a list node represents one character in the string we will print.
Inside the body of the function generated by the macro is an if condition and two statements that crashes the program. The former is used to determine whether the program should (a) call the next function in the linked list or (b) crash the program and generate a core dump (if enabled in the OS).
The next bit of code is simply the list of “calls” to the macro, one for each printable character (edited for brevity):
DEF_LETTER_FUNC(A);
DEF_LETTER_FUNC(B);
DEF_LETTER_FUNC(C);
// ...
DEF_LETTER_FUNC(Z);
DEF_LETTER_FUNC(_);
The above bit of code is followed by perhaps the most confusing lines. Admittedly, I should have figured out a clearer way of achieving what I wanted. But I didn’t. So we’re just going to have to suffer through this monstrosity. The lines below represent two macros, the first for computing the address of the function representing a letter A-Z, while the second “returns” the address of the underscore function _() that stands in for spaces in the printed string:
#define LETTER_FUNC_PTR(__letter_func_ptr_ch) (void (*)(LetterNode*)) \
(&A + ((&B - &A) * (__letter_func_ptr_ch - 'A')))
#define SPACE_FUNC_PTR() (void (*)(LetterNode*))(&_)
How do these macros work? The compiled code of a program is placed in memory when the program is run. This includes all of the functions in the program. Every location in memory possesses an address. From that follows that each function can be called by knowing its address in memory. Because functions A() through Z() are (a) defined one after the other and (b) the length of their code is equal, we can compute the address of a letter easily:
address_of_target_func = address_of_func_A + (size_of_func_A * offset_of_letter_from_A)
The offset_of_letter_from_A is the difference between the ASCII code for your target letter and the ASCII code for the first letter, A. Let’s look at how the code reflects the above formula:
#define LETTER_FUNC_PTR(__letter_func_ptr_ch) (void (*)(LetterNode*)) \
(&A + ((&B - &A) * (__letter_func_ptr_ch - 'A')))
// ^ ^-------^ ^--------------------------^
// | | |
// | | offset_of_letter_from_A
// | |
// | size_of_func_A
// |
// address_of_func_A
This is the calculation that the macro LETTER_FUNC_PTR() performs. By comparison, SPACE_FUNC_PTR() is trivial.
So with all of that in our back pocket, we can turn to our first bona fide function definition. As it happens, this function is completely superfluous given how the program works. How come? The nodes of the linked list containing our string are allocated on the heap using malloc(). This means that the programmer is responsible for cleaning up any uneeded allocations using free(). If we don’t do this, we’ve got a memory leak. To do the cleanup, I wrote the function freeLetterNodeList(), which gets rid of the entire linked list of characters. However, the only time it’s called is right before the program terminates. When a program terminates, the OS de-allocates all of the memory belonging to the program anyway, so the function is unnecessary. Also, I never bothered to call freeLetterNodeList() when the program crashes (to produce the core dump). So not only is the function unnecessary, I’ve also implemented it inconsistently. Shame on me.
void freeLetterNodeList (LetterNode *head)
{
LetterNode *curr = head;
LetterNode *next;
do {
next = curr->next;
free(curr);
curr = next;
} while (next != NULL);
}
Finally, the main() function. This time, instead of dissecting it, I think it’s clearer to cut out everything but the essentials and throw in a few comments to explain how it works:
int main (int argc, char **argv)
{
// ...
// Iterate through words passed to program.
for (int i = argc - 1; i >= 1; --i) {
msg_len = strlen(argv[i]);
// Iterate through letters in current word.
for (int j = msg_len - 1; j >= 0; --j) {
ch = argv[i][j];
// Figure out whether `ch` is a letter or space.
// ...
// Allocate list node.
if (curr == NULL) {
curr = malloc(sizeof(LetterNode));
curr->next = NULL;
if (head == NULL)
head = curr;
}
else {
curr->next = malloc(sizeof(LetterNode));
curr = curr->next;
}
// Set letter function.
curr->func = LETTER_FUNC_PTR(ch);
}
// Set space function.
if (i > 1) {
curr->next = malloc(sizeof(LetterNode));
curr = curr->next;
curr->func = SPACE_FUNC_PTR();
}
}
// Call the first list node's function, producing the backtrace.
head->func(head);
finish:
// Handle errors.
// ...
}
And that is the metamorphosis from a nonsensical idea to a frivolous programming exercise. I hope you enjoyed reading this as much as I enjoyed writing it.