For a list of BASHing data 2 blog posts see the index page.


How to hide a number in plain sight

There are many invisible characters in Unicode; there's even a website devoted to them. Some of the invisibles don't take up any space when printed, like the "zero width non-joiner", U+200C. Below I've printed this character three times between an "a" and a "b", then looked at the result in my terminal emulator and in the Mousepad text editor:

printf "a\u200c\u200c\u200cb\n" > test1

[Screenshot: test1]
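One way to confirm that the invisible characters really are in the text is to look at the bytes. The lines below are only a sketch and not part of the original commands: they write U+200C as its three UTF-8 bytes in octal (\342\200\214, that is e2 80 8c), so they don't rely on the shell's "\u" support or on the locale, and the variable name "sample" is made up for the illustration.

```shell
# "a", three zero width non-joiners, "b". U+200C is e2 80 8c in UTF-8,
# written here as octal escapes so that any POSIX printf can produce it.
sample=$(printf 'a\342\200\214\342\200\214\342\200\214b')

# Two visible letters plus 3 x 3 hidden bytes.
printf '%s' "$sample" | wc -c          # 11

# Hex dump of the bytes: 61 e2 80 8c e2 80 8c e2 80 8c 62
printf '%s' "$sample" | od -An -tx1
```

A byte count that is larger than the number of visible characters is a quick hint that something invisible is present.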

The invisibility and zero-width-ness of this character (and others like it) lend themselves to a simple kind of secret messaging. Here I'll hide the 4-digit PIN "3054" inside a sentence, with a "zero width non-joiner" repeated 4, 1, 6 and 5 times. Each count is the digit plus one, so that a 0 is still marked by one character:

printf "No problem, André, lo\u200c\u200c\u200c\u200coking forw\u200ca\u200c\u200c\u200c\u200c\u200c\u200crd to hearin\u200c\u200c\u200c\u200c\u200cg from you.\n" > test2

[Screenshot: test2]
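Since the sentence looks unchanged, a quick count of the hidden characters can serve as a sanity check. This is a hedged sketch: the helper name "count_zwnj" and the variables "z" and "msg" are invented here, and the non-joiner is again written as raw UTF-8 bytes rather than with "\u200c".

```shell
z=$(printf '\342\200\214')      # U+200C as its three UTF-8 bytes

# The same sentence, with runs of 4, 1, 6 and 5 non-joiners.
msg="No problem, André, lo$z$z$z${z}oking forw${z}a$z$z$z$z$z${z}rd to hearin$z$z$z$z${z}g from you."

# Hypothetical helper: count zero width non-joiners on standard input.
# gsub() returns the number of replacements it made on each line.
count_zwnj() {
    awk -v z="$z" '{ n += gsub(z, "") } END { print n + 0 }'
}

printf '%s\n' "$msg" | count_zwnj     # 16, i.e. 4 + 1 + 6 + 5
```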

To recover the number I can use AWK:

awk -v FPAT="[\xe2\x80\x8c]+" '{for (i=1;i<=NF;i++) printf ("%s",length($i)-1); print ""}' test2

[Screenshot: decodingtest2]

In this command, AWK processes "test2" with fields defined by a pattern rather than by a separator (-v FPAT, a GNU AWK feature). The pattern here is "one or more zero width non-joiner characters", with the character written as its three UTF-8 bytes in hexadecimal (-v FPAT="[\xe2\x80\x8c]+"). AWK loops through the fields (for (i=1;i<=NF;i++)) and for each one prints one less than the field's length in characters, without a trailing newline (printf ("%s",length($i)-1)). A final command (print "") adds a newline to finish.
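FPAT is specific to GNU AWK. For an AWK without it, the same decoding can be sketched with a match() loop instead. This is a different technique from the command above, not a replacement for it, and the function name "show_num_portable" is hypothetical.

```shell
# Hypothetical portable variant of the recovery command: no FPAT, so it
# should also run under AWKs other than GNU AWK.
show_num_portable() {
    awk -v z="$(printf '\342\200\214')" '{
        line = $0; out = ""
        # Find each run of one or more non-joiners in turn.
        while (match(line, "(" z ")+")) {
            # RLENGTH and length(z) are both counted in characters or
            # both in bytes, so their ratio is the run length either way.
            out = out (RLENGTH / length(z) - 1)
            line = substr(line, RSTART + RLENGTH)
        }
        print out
    }' "$@"
}
```

It takes file names like the original command ("show_num_portable test2") or reads standard input when none are given. Dividing RLENGTH by length(z) is what makes the count independent of whether the AWK in use works in bytes or in characters.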

Un-hiding the number is easy enough, but hiding it programmatically is a little harder. One way to do it is with this function:

hide-num() { paste -d"\0" <(printf '%s' "$2" | fold -w3) <(printf '%s' "$1" | fold -w1 | awk -v ghost="$(printf "\u200c")" '{print ""; $1=($1+1); for (i=1;i<=$1;i++) printf ghost; print ""}') | paste -s -d"\0"; }

The "hide-num" function takes two arguments: the number to be hidden and a string in which to hide it. The function starts with a paste command that joins two command outputs side-by-side with no separator: paste -d"\0". The first output to be pasted is the string split into a list of 3-character strings with fold -w3. If the string is "Here is a sentence long enough to hide the number.", the output looks like this:

[Screenshot: hide-num1]
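The folding step can be tried on its own. In this small sketch the helper name "show_pieces" and the variable "sentence" are made up; the AWK part only numbers the pieces and wraps them in brackets so that leading and trailing spaces are visible.

```shell
sentence='Here is a sentence long enough to hide the number.'

# Hypothetical helper: split a string into 3-character pieces and print
# each piece with its position, bracketed to make spaces visible.
show_pieces() {
    printf '%s' "$1" | fold -w3 | awk '{ printf "%2d [%s]\n", NR, $0 }'
}

show_pieces "$sentence"
```

The 50-character sentence comes out as 17 pieces, starting with [Her], [e i], [s a], [ se] and ending with the 2-character piece [r.].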

The second output to be pasted to the first one starts by printing the number to be hidden. This is piped to fold -w1 to turn the number into a list of one-character digits. This list is sent to an AWK command.

The AWK command begins by defining the variable "ghost" as the output of printf "\u200c", because AWK (GNU AWK) doesn't yet process "\uXXXX" codes as characters very easily. In the action part of the command, AWK first prints an empty line (print ""), then adds 1 to the digit on the current line of the folded list ($1=($1+1)). Next it loops from 1 to that digit-plus-1, and on each pass it printfs "ghost" (\u200c) without adding a newline (for (i=1;i<=$1;i++) printf ghost). Finally, AWK prints a newline (print "") to end the line of ghosts. This command is carried out for each of the listed digits. Here's how that works with "ghost" set to the letter X and with the number 3054:

[Screenshot: hide-num2]
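The digit-to-ghosts step can be run in isolation with a visible marker. This is a sketch with a made-up function name, "ghost_lines", and it uses a separate counter "n" instead of modifying $1, which should give the same output.

```shell
# Hypothetical helper: $1 is the number, $2 is the marker to repeat.
ghost_lines() {
    printf '%s' "$1" | fold -w1 |
        awk -v ghost="$2" '{
            print ""                    # empty line: this piece gets no marker
            n = $1 + 1                  # digit plus one, so 0 still leaves a mark
            for (i = 1; i <= n; i++) printf "%s", ghost
            print ""                    # end the line of markers
        }'
}

ghost_lines 3054 X
```

For 3054 the output is eight lines: an empty line, XXXX, an empty line, X, an empty line, XXXXXX, an empty line and XXXXX.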

Back to the function definition. This second output is pasted line by line to the first. Because every run of zero width non-joiners is preceded by an empty line, the runs are attached to every second 3-character piece of the sentence (the 2nd, 4th, 6th and so on). Here's the result with "X":

[Screenshot: hide-num3]

The final step in the function is to convert the list into a single long string with paste -s -d"\0":

[Screenshot: hide-num4]
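For comparison, the same placement can be sketched in a single AWK program, without paste, fold or process substitution. This is an alternative written for illustration, not the function described above: the name "hide_num_awk" and its optional third argument (the marker, defaulting to U+200C written as UTF-8 bytes) are assumptions, and it assumes a plain ASCII sentence, because an AWK that works in bytes would otherwise split multibyte characters.

```shell
# Hypothetical alternative: $1 = number, $2 = sentence, $3 = marker.
hide_num_awk() {
    ghost=${3:-$(printf '\342\200\214')}
    printf '%s\n' "$2" | awk -v num="$1" -v ghost="$ghost" '{
        out = ""; d = 0
        for (pos = 1; pos <= length($0); pos += 3) {
            out = out substr($0, pos, 3)       # next 3-character piece
            piece = (pos + 2) / 3              # 1-based piece number
            # Every second piece is followed by digit+1 markers.
            if (piece % 2 == 0 && d < length(num)) {
                d++
                n = substr(num, d, 1) + 1
                for (i = 1; i <= n; i++) out = out ghost
            }
        }
        print out
    }'
}

hide_num_awk 3054 'Here is a sentence long enough to hide the number.' X
# Here iXXXXs a seXntenceXXXXXX long XXXXXenough to hide the number.
```

Unlike the paste version, this sketch silently drops digits once the sentence runs out, so the sentence needs roughly six characters per digit.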

Below I feed the "hide-num" function with the number 50679 to be hidden in "Here is a sentence long enough to hide the number.", generating the file "test3". "test3" is then checked with cat for the invisibility of its added characters, and finally I use the recovery command to reveal the secret number. I've saved that recovery command in the function "show-num":

show-num() { awk -v FPAT="[\xe2\x80\x8c]+" '{for (i=1;i<=NF;i++) printf ("%s",length($i)-1); print ""}' "$1"; }

[Screenshot: showhide]

As a steganographic trick it's not very sophisticated, but it works well in plain-text programs. I emailed "test3" to myself both in the message text and as an attachment, and in both cases I could recover 50679 with "show-num".

See this 2021 blog post for a way to watermark a text file with an invisible character.



Last update: 2025-06-13
The blog posts on this website are licensed under a
Creative Commons Attribution-NonCommercial 4.0 International License