


My shell and my browser don't understand each other

A small annoyance in writing this blog is that some characters in commands need to be replaced with HTML entities when the commands are shown on a webpage.

For example, I often have "&&" in a command but "&amp;&amp;" in a webpage showing that command, while "<" in a command becomes "&lt;" in HTML.
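As a rough sketch of that replacement (not the script used for this blog), a small sed filter can handle the three characters that cause the most trouble. The function name html_escape is made up for this illustration.

```shell
# html_escape: hypothetical helper, not part of the original post.
# Reads text on stdin and writes it with &, < and > replaced by named entities.
html_escape() {
    # "&" is handled first; otherwise the "&" in "&lt;" and "&gt;" would
    # itself be escaped. In a sed replacement, "\&" stands for a literal "&".
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

printf '%s\n' 'sort < file && echo done' | html_escape
# prints: sort &lt; file &amp;&amp; echo done
```

The order of the three substitutions matters: escaping "&" last would turn "&lt;" into "&amp;lt;".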

I also don't trust browsers to correctly display non-ASCII characters like "ä", even though I have <meta charset="utf-8"> in the webpage header. For that reason I write non-ASCII characters in webpage code as their HTML named-entity equivalent, in this case "&auml;".

See this post where I talk about ìèñëèâñüêå, and have a look at the page source code.
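For a handful of non-ASCII characters, the same idea can be sketched as a small lookup table in AWK. The function name entity_table and the choice of three characters are assumptions for the example; a real table would list whatever characters turn up regularly.

```shell
# entity_table: hypothetical helper, not part of the original post.
# Replaces a few non-ASCII characters on stdin with their HTML named entities.
entity_table() {
    awk 'BEGIN {
            # Character -> named entity. In a gsub() replacement a bare "&"
            # means "the matched text", so a literal "&" is written "\\&".
            map["ä"] = "\\&auml;"
            map["ö"] = "\\&ouml;"
            map["ü"] = "\\&uuml;"
         }
         {
            # Apply every table entry to the current line, then print it.
            for (ch in map) gsub(ch, map[ch])
            print
         }'
}

printf '%s\n' 'Käse und Brötchen' | entity_table
# prints: K&auml;se und Br&ouml;tchen
```

Characters that are not in the table pass through unchanged, so this only suits cases where the set of awkward characters is small and known.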

The annoyance goes both ways, too. If I take webpage text (from the webpage code) and paste it into my terminal, the shell complains:

[Image: ampersands]
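Going in the HTML-to-shell direction, a minimal sketch of a decoder for the most common entities could look like the following. The name html_unescape is hypothetical, and the filter only knows about four entities.

```shell
# html_unescape: hypothetical helper, not part of the original post.
# Reads webpage code on stdin and writes it with a few entities decoded.
html_unescape() {
    # "&amp;" is decoded last, so that "&amp;lt;" in the page source becomes
    # the literal text "&lt;" rather than "<".
    sed -e 's/&lt;/</g' -e 's/&gt;/>/g' -e 's/&quot;/"/g' -e 's/&amp;/\&/g'
}

printf '%s\n' 'cmd1 &amp;&amp; cmd2 &lt; file' | html_unescape
# prints: cmd1 && cmd2 < file
```

Text filtered this way can be pasted into a terminal without the shell tripping over the entity names.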

So both my shell and my browser understand those exceptional characters, but their spelling is a bit different. That raises an interesting question: is there a shared spelling? In other words, is there a single way to represent those characters that both shell and browser understand?

Not that I know of. To have certain characters display properly on a webpage I have to use either a named or numerical HTML entity. The various direct ways to represent characters in the shell, like

[Image: characters]

just don't work in HTML, although the reformatted &#38; (decimal ampersand) and &#x26; (hexadecimal ampersand) will.
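The shell can at least produce those numeric forms itself. The sketch below relies on the printf convention that an argument starting with a quote is read as a character and converted to its numeric code. The function name numeric_entities is made up for the example, and the result for non-ASCII characters depends on the shell and locale, so only ASCII is shown.

```shell
# numeric_entities: hypothetical helper, not part of the original post.
# Prints the decimal and hexadecimal HTML entity for the first character
# of its argument.
numeric_entities() {
    # A leading quote makes printf return the numeric code of the character.
    code=$(printf '%d' "'$1")
    printf '&#%d; &#x%X;\n' "$code" "$code"
}

numeric_entities '&'
# prints: &#38; &#x26;
```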

I've seen custom scripts for converting to and from HTML entities, and for the few regularly annoying characters I could build a lookup table for conversions with AWK. What I did instead was write my own simple script based on the GNU recode utility:

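#!/bin/bash
# Read the highlighted text (X primary selection), convert it to HTML entities
# and put the result on the clipboard, ready for pasting with Ctrl + v.
xclip -o | recode utf-8..html | xclip -selection clipboard
exit 0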

This is a BASH-to-HTML script. I highlight some text in my terminal, launch the script with a keyboard shortcut, then Ctrl + v paste into an HTML document. recode converts the usual suspects (<, &, >) into HTML named entities. The script also works (obviously) with text highlighted in any application and pasted into the same or any other application. Example in Mousepad text editor:

[Image: recode1]

It's not a perfect solution, because I like to keep quotes (") as quotes and recode returns their HTML version:

[Image: recode2]
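One possible workaround, sketched here as an assumption rather than something the script above does, is to add a sed stage that turns the quote entity back into a plain quote. The name keep_quotes is hypothetical.

```shell
# keep_quotes: hypothetical helper, not part of the original post.
# Turns "&quot;" back into a plain double quote and leaves other entities alone.
keep_quotes() {
    sed 's/&quot;/"/g'
}

# Possible variant of the recode pipeline with the extra filter (untested sketch):
# xclip -o | recode utf-8..html | keep_quotes | xclip -selection clipboard

printf '%s\n' 'echo &quot;a &amp;&amp; b&quot;' | keep_quotes
# prints: echo "a &amp;&amp; b"
```

A literal "&quot;" in the original text would come out of the conversion as "&amp;quot;", which this filter does not touch.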

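Yes, I do know about the HTML <pre> tag, but it only preserves whitespace and displays text in a fixed-width font. Characters like "<" and "&" still have to be written as entities inside it.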




Last update: 2025-09-26
The blog posts on this website are licensed under a
Creative Commons Attribution-NonCommercial 4.0 International License