
Minimal English

It’s a long journey of writing from Egyptian hieroglyphics to Proto-Sinaitic to Phoenician to Greek to the Latin alphabet we use for English. But what if things had gone differently? For instance, what if unique signs had never been developed for vowels? Or what if the Phoenicians had only shared some of their letters? What if some technology had been optimized to use as few characters as possible, instead of the effectively unlimited transmission bandwidth and Unicode we are accustomed to now?

English already has more sounds than letters, so it makes use of digraphs and other letter combinations to represent them. What’s the minimum number of characters we would need to represent the sounds of (American) English?

It turns out, using only ten letters, you can represent all the sounds of (some) American English:
H K L N P R S T W Y

Each of these roughly represents a place of articulation (e.g. T is dental). Two of the characters also act as modifiers: H shows frication, as English already uses it, and N marks voicing, as is done in Modern Greek (e.g. ντ for /d/).

For example, T is /t/ and NT is /d/. For fricatives, TH is /ΞΈ/ and, you guessed it, NTH is /Γ°/. Vowels, being a later development by an uncreative inventor, are represented by doubled letters. I’ve chosen a more-or-less American English vowel inventory.

The problem with the system is the massive ambiguity that results if you smash all the letters together to form a word. Perhaps the “ancient” inscribers of Minimal English didn’t mind, but modern readers and writers decided a visual affordance was needed.

The system I’ve developed strives for regularity and some (imperfect) phonetic or featural connection between the sequence of characters and the sound it represents.

```
       P    T    S    K    N    Y    L    H    R    W
_     /p/  /t/  /s/  /k/  /n/  /j/  /l/  /h/  /r/  /w/
N_    /b/  /d/  /z/  /g/  /Ε‹/  /m/
_H    /f/  /ΞΈ/  /Κƒ/  /Ε‹/
N_H   /v/  /Γ°/  /Κ’/
_ _   /Ιͺ/ /Ι›/ /o/ /ʊ/ /Ι™/   /ʌ/ /i/ /Γ¦/ /Ι‘/   /Ι”/ /ɚ/ /u/
```
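The chart above can be captured as a lookup table. Here is a sketch in Python — the table is partial (it only includes codes whose values are stated in the chart or recoverable from the sample sentence below), and the `decode` helper is my own illustration, not part of the system:

```python
# Partial MinEng code-to-IPA table, transcribed from the chart.
MINENG = {
    # bare letters
    "P": "p", "T": "t", "S": "s", "K": "k", "N": "n",
    "Y": "j", "L": "l", "H": "h", "R": "r", "W": "w",
    # N_ marks voicing
    "NP": "b", "NT": "d", "NS": "z", "NK": "g",
    # _H marks frication
    "PH": "f", "TH": "ΞΈ", "SH": "Κƒ",
    # N_H: voiced fricatives
    "NPH": "v", "NTH": "Γ°", "NSH": "Κ’",
    # doubled letters are vowels (these values inferred from the
    # sample sentence: "hello", "this", "the", "newest")
    "PP": "Ιͺ", "TT": "Ι›", "SS": "o", "NN": "Ι™", "WW": "u",
}

def decode(word: str) -> str:
    """Translate one period-separated MinEng word into an IPA string."""
    return "".join(MINENG[code] for code in word.split("."))

print(decode("NTH.PP.S"))      # "this" β†’ Γ°Ιͺs
print(decode("H.TT.L.SS.W"))   # "hello" β†’ hΙ›low
```

The period separator does the real work here: without it, a string like NTH could be read as /n/+/t/+/h/, /n/+/ΞΈ/, or /Γ°/.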

To encode the sentence “Hello this is the newest version of the minimal alphabet”, we would write:

H.TT.L.SS.W NTH.PP.S PP.NS NTH.NN N.WW.PP.S.T NPH.RR.NSH.PP.N NN.NPH NTH.NN NW.PP.N.PP.NW.NN.L LL.L.PH.NN.NP.TT.T

With a computational data unit of 4 bits, we could encode 2⁴ = 16 characters. That is plenty for the 10 alphabetic characters, one phoneme separator (I’ve used a period), and one word separator (I’ve used a space), with four codes to spare, perhaps for other types of punctuation.
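As a sketch of that 4-bit packing — the particular symbol-to-nibble assignment here is my own assumption, not part of the system:

```python
# Hypothetical 4-bit code assignment: 10 letters plus the two separators
# use 12 of the 16 available codes, leaving four to spare.
SYMBOLS = "PTSKNYLHRW. "
ENCODE = {ch: i for i, ch in enumerate(SYMBOLS)}

def pack(text: str) -> bytes:
    """Pack MinEng text two symbols per byte (high nibble first)."""
    nibbles = [ENCODE[ch] for ch in text]
    if len(nibbles) % 2:          # pad odd-length input with a spare code
        nibbles.append(15)
    return bytes((a << 4) | b for a, b in zip(nibbles[::2], nibbles[1::2]))

msg = "NTH.PP.S"                  # "this"
print(len(msg) * 4, "bits packed into", len(pack(msg)), "bytes")
```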

Of course, the number of characters needed to represent words ends up much larger. Take the word “judgmental”. Already a long word, in MinEng it would be spelled NT.NSH.NN.NT.NSH.NW.TT.N.T.NN.L – a mere 31 characters. /dΝ‘Κ’/ alone requires 7 characters (counting its trailing separator), as opposed to one ⟨j⟩ or two ⟨dg⟩. Have we actually saved any bandwidth?

Basic Latin characters in Unicode (as UTF-8) require only one byte, i.e. 8 bits, so “judgmental” takes up 80 bits. In MinEng it would be 124 bits, if I have my math right. The difference is that MinEng encodes phonemic English with more clarity than our current Latin system, so it’s really comparing LL.P.NN.L.NS and HH.R.NN.N.NT.NSH.PP.NS.
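The arithmetic is easy to check directly, assuming 8 bits per Basic Latin character and 4 bits per MinEng symbol:

```python
latin = "judgmental"                              # 10 characters
mineng = "NT.NSH.NN.NT.NSH.NW.TT.N.T.NN.L"        # 31 characters

latin_bits = len(latin) * 8    # one byte per Basic Latin character
mineng_bits = len(mineng) * 4  # one nibble per MinEng symbol

print(latin_bits, mineng_bits)  # 80 124
```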
