Personal blog of Gunnari Auvinen. I enjoy writing about software engineering topics.

Print emojis to the terminal in C

April 19, 2021

Recently I've started building my own shell in C, which is one of the projects for the CSI class I'm taking. The first thing I had to decide was what my shell would look like. You may be asking yourself, "What would I want as part of my shell's appearance?" The answer should be obvious! Emojis are clearly the only correct answer 🚀

For now, I'm going to call my shell unishell. I bet you've guessed which emoji will be part of the prompt. If you guessed 🦄, you'd be correct! After I made this choice I realized that I wasn't entirely sure how I'd print out the emoji, as emojis clearly weren't covered in K&R.

Looking online I was able to find the Unicode code point for the unicorn, which is U+1F984. I knew that I'd need an escape sequence to print it in C, but I wasn't sure exactly what it was. After a bit of searching I found out that there are two escape sequences for this: \u and \U.

The \u escape takes exactly four hex digits, so it covers code points up to U+FFFF. The \U escape takes exactly eight hex digits and is needed for anything above that, which includes every code point that is five or more hex digits long. Initially I didn't realize there was a \U escape sequence, so I tried using \u1F984, which didn't turn out exactly as planned 😅 The code for the first attempt is below.

// unishell.c
#include <stdio.h>

int main(void) {
    // First attempt: \u followed by five hex digits (this does not print the unicorn)
    printf("\u1f984");
    return 0;
}

When I ran the emoji1 executable file, I got this output 😬

Terminal output with incorrect escape sequence

At first I was scratching my head, but then I realized it made sense. The unicorn's code point is five hex digits long, but \u only consumes the first four, so \u1f98 is translated to a single character and the leftover 4 is printed as a plain 4. It turns out that U+1F98 is an accented variant of the Greek capital letter Eta.

Following that incorrect solution, I changed \u1F984 to \U1F984, but CLion flagged that as an error. Unfortunately it didn't say what the error was 🤦🏼‍♂️ It turns out that \U must be followed by exactly eight hex digits, so you have to pad the front of the code point with 0s, which gives \U0001F984. With this update, I got the output below 🎉

Terminal output with correct escape sequence

Here's the final code for reference.

#include <stdio.h>

int main(void) {
    // \U takes exactly eight hex digits, so U+1F984 is zero-padded to 0001f984
    printf("\U0001f984");
    return 0;
}

With this solved, I can get on with the rest of the unishell implementation!