IRM (Improved Readability Minglish) is a system for transcribing
English. It is mostly, but not entirely, phonemic, and has been
designed to look not too alien to readers of standard English.
I regard IRM as an essentially completed project; this document
describes version 1.10, current
as of 7/17/2004.
This document is in three parts. The first is an informal
overview of IRM, the second is an algorithmic description of how to
transform a text into IRM, and the third is several samples of
transcribed text. There are two appendices: the first concerns
the handling of dialectal and foreign sounds not covered by the main
document, and the second discusses IRM's history.
Most of the consonants of English are represented exactly
as you would expect. Here is a table of the IRM consonant
spellings that might need explanation:
Spelling | IRM example | Traditional spelling
ce (1) | dance | dance
ch | chin | chin
ck (2) | buckl | buckle
dh | dhen | then
dj (2) | pidjn | pigeon
ng | sing | sing
ngh | finghr | finger
nk | bank | bank
nqu | tranqul | tranquil
nx | jinx | jinx
nxh | anxhus | anxious
qu | quit | quit [same as kw]
sh | ship | ship
th | thik | thick
x | box | box [same as ks]
xh | axhn | action [same as ksh]
xz | exzact | exact [same as gz]
zh | baezh | beige
Notes:
Here is a table of the vowels, diphthongs and other combinations of
IRM. For reasons to be explained shortly, no spelling is shown
for the schwa sound.
Spelling | IRM example | Traditional spelling
a | ask | ask
aa (1) | baa | baa
ae | saem | same
ah (2) | spah | spa
air | fair | fair
ar | hard | hard
aw | saw | saw
e | end | end
e (3) | be | be
ee | seed | seed
eh (1) | yeh | yeah
er | teraen | terrain
i | if | if
i (1) | hi | high
ie | wied | wide
ir | pir | pier
o | pot | pot
o (1) | go | go
oe | doez | doze
ol | told | told
oo | look | look
ool | wool | wool
oor | poor | poor
or | bore | bore
ow | town | town
oy | boy | boy
u | sun | sun
u (1) | blu | blue
ue | fluet | flute
uh (1) | duh | duh
ul | kullr | color
ur | fur | fur
y (4) | daely | daily
Notes:
In order to make things more readable for readers familiar with English spelling, IRM sometimes modifies the spellings that would be obtained from just substituting the letters and letter combinations above for their sounds. Here is an informal and incomplete description of the modifications:
The following is an algorithmic description of how to spell a word in
IRM, based on its pronunciation (as encoded in strict MCM) and its
English spelling. I realize that the level of formality and
detail here may make this somewhat hard to read, but this is the best
way I know to be precise. Besides, in my work persona I'm a
software guy: I tend to think in algorithms.
IRM is produced from strict MCM by a series of transformations.
Some transformations replace schwa (3)
characters of the MCM with corresponding characters from the
traditional English spelling, or make use of the
traditional spelling of common affixes. (See my MCM Reference if you need information on MCM.)
For regularly inflected words, the transformations are first applied to
the root word, and then the inflections are handled as described in Step 9. This applies to plural-like words
(<jeans>, <scissors>, but not <rabies>), and to -ed
words derived from nouns (e.g., <muscled>, <freckled>,
<spirited>, but not <crooked>). Also, sciences like
<physics> and <economics> are treated as plurals.
Similarly, for words including common affixes, the transformations are
first applied to the root word, and then the affixes are applied as
described in Step 10. This may produce a
different result than if the word were processed as a whole.
In compound words, the parts are generally spelled separately and the
results combined. A hyphen can be used to separate the parts if
this combined spelling is inconsistent with the actual pronunciation.
If it happens that a sequence occurs which appears to contain a
digraph, but where the letters of the digraph should be pronounced
separately, a hyphen should be inserted, as in ex-hael or un-graetful.
Homonyms of very frequent words generally are spelled uniquely, to
avoid confusion. These words are listed at the end of this
section.
MCM encodes capitalization by an initial period. The process described below ignores this period. When the process is complete, if an initial period is present, it is removed and the IRM word is capitalized.
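The capitalization rule can be sketched as a small wrapper. This is only an illustration: the name `irm_with_capital` and its `transform` argument (a stand-in for the whole respelling process described below) are hypothetical, not part of IRM or MCM.

```python
def irm_with_capital(mcm_word, transform):
    """Respell an MCM word, honoring the initial-period capital marker.

    `transform` is a hypothetical callable standing in for Steps 1-10;
    it never sees the leading period.
    """
    capitalized = mcm_word.startswith(".")
    body = mcm_word[1:] if capitalized else mcm_word
    irm = transform(body)
    # Capitalize only the first letter; the rest of the word is untouched.
    return irm[:1].upper() + irm[1:] if capitalized else irm
```

For instance, with an identity transform, `irm_with_capital(".abc", lambda w: w)` yields `"Abc"`.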
Step 1: Make the following
replacements in the MCM transcription:
A => o
Ar => ar
ar => aar (1)
C => ch
D => dh (2)
E => ae
er => air
G => ng
hw => wh (1)
I => ee
J => zh
L => 3l
M => 3m
N => 3n
O => aw
Or => or
o => oe
Q => oy
R => er
T => th
U => ue
V => oo
W => ow
X => sh
Y => ie
3r => er
& => ur
The transformations of L, M and N are simply for the purpose of making
later steps easier to describe.
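Because some outputs of this table are also inputs (A becomes o, while o becomes oe), the replacements have to be made in a single pass rather than one after another. A minimal sketch, assuming the longest matching key wins at each position; the function name `step1` and the sample inputs are illustrative only and are built purely from the symbols in the table:

```python
# Step 1 table, copied from the list above. The entries marked (1) in the
# table ("ar" and "hw") are included unconditionally here for simplicity.
STEP1 = {
    "A": "o", "Ar": "ar", "ar": "aar", "C": "ch", "D": "dh", "E": "ae",
    "er": "air", "G": "ng", "hw": "wh", "I": "ee", "J": "zh",
    "L": "3l", "M": "3m", "N": "3n", "O": "aw", "Or": "or", "o": "oe",
    "Q": "oy", "R": "er", "T": "th", "U": "ue", "V": "oo", "W": "ow",
    "X": "sh", "Y": "ie", "3r": "er", "&": "ur",
}
# Try two-character keys before one-character keys (assumed tie-break).
KEYS = sorted(STEP1, key=len, reverse=True)

def step1(mcm):
    """Scan left to right, replacing each symbol exactly once."""
    out = []
    i = 0
    while i < len(mcm):
        for key in KEYS:
            if mcm.startswith(key, i):
                out.append(STEP1[key])
                i += len(key)
                break
        else:
            # Not in the table: copy the character through unchanged.
            out.append(mcm[i])
            i += 1
    return "".join(out)
```

With these rules `step1("Der")` gives `"dhair"` and `step1("DEr")` gives `"dhaer"`, which agree with the spellings of <there> and <their> in the common-word list.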
Notes:
Step 2: Make the following replacements at the ends of words:
a => aa
e => eh
ee => e (1)
ee => y (1)
ie => i
o => ah
oe => o
u => uh
ue => u
Notes:
In the case of a few homonyms of very common words, these
replacements do not occur.
The changes of o to ah and a to aa also occur preceding the
letters h and w.
When a word with an -e, -i, -o
or -u ending is a component
of a compound other than the last, the ending e is restored (or replaced by an
apostrophe if the following word starts with a vowel).
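The end-of-word table can be sketched as a suffix lookup. Two things here are assumptions rather than rules stated above: the choice between e and y for a final ee is driven by a `final_stressed` flag (stressed gives e, unstressed gives y, which matches be and daely in the vowel table), and the endings ae and oo are left alone because they only happen to end in a listed letter. The homonym exceptions, the h/w case, and the compound case are not handled.

```python
# Two-letter endings are listed before one-letter endings so they win.
FINAL_RULES = [
    ("ie", "i"), ("oe", "o"), ("ue", "u"),
    ("a", "aa"), ("e", "eh"), ("o", "ah"), ("u", "uh"),
]
# Digraph endings that are not in the table and stay as they are (assumption).
PROTECTED = ("ae", "oo")

def step2(word, final_stressed=True):
    """Apply the Step 2 end-of-word replacements to one word."""
    if word.endswith(PROTECTED):
        return word
    if word.endswith("ee"):
        # Assumed reading of the two "ee" rows: e if stressed, y otherwise.
        return word[:-2] + ("e" if final_stressed else "y")
    for old, new in FINAL_RULES:
        if word.endswith(old):
            return word[: -len(old)] + new
    return word
```

For example, `step2("goe")` gives `"go"`, `step2("spo")` gives `"spah"`, and `step2("daelee", final_stressed=False)` gives `"daely"`.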
Step 3: The following sequences are respelled:
gz => xz *
ks => x
ksh => xh *
kw => qu
ngg => ngh *
ngk => nk
nkw => nqu
nks => nx
nksh => nxh *
The starred transformations may be optionally skipped to avoid the
unfamiliar sequences. The -ks/-nks/-gz
transformations do not occur when the last letter is a plural
marker. Also, the -ks
transformation is not performed for an "-ics" word formed by adding an
"-s" to an adjective ending in "-ic", as in sivviks and fizziks. Also note that the x transformations are not performed
when the component sounds are contained in more than one part of a
compound word.
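A rough sketch of the Step 3 respellings as one regular-expression pass. It assumes the input is Step 1 output (so the velar nasal is already written ng), lists longer sequences first so they take precedence, and offers a `skip_starred` switch for the optional starred rows. The plural-marker, "-ics" and compound-word exceptions are not handled, and all names are illustrative.

```python
import re

# (sequence, respelling, starred). Longer sequences come first because the
# regex alternation tries alternatives in the order given.
RULES = [
    ("ngksh", "nxh", True),
    ("ngks", "nx", False),
    ("ngkw", "nqu", False),
    ("ngk", "nk", False),
    ("ngg", "ngh", True),
    ("ksh", "xh", True),
    ("ks", "x", False),
    ("kw", "qu", False),
    ("gz", "xz", True),
]
# What a starred sequence becomes when the starred rows are skipped.
FALLBACK = {"ngksh": "nksh", "ngg": "ngg", "ksh": "ksh", "gz": "gz"}
TABLE = {seq: (new, starred) for seq, new, starred in RULES}
PATTERN = re.compile("|".join(seq for seq, _, _ in RULES))

def step3(word, skip_starred=False):
    """Respell the Step 3 sequences in a single left-to-right pass."""
    def respell(match):
        seq = match.group(0)
        new, starred = TABLE[seq]
        if starred and skip_starred:
            return FALLBACK[seq]
        return new
    return PATTERN.sub(respell, word)
```

For example, `step3("jingks")` gives `"jinx"`, while `step3("aksh3n", skip_starred=True)` leaves the word as `"aksh3n"`.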
Step 4: If a stressed single letter vowel is followed by a single
consonant and a vowel (or the end of a word), the consonant is doubled,
unless it is a q, r, h, j or
k. A j is replaced by dj instead of doubling it;
similarly, k is doubled as ck. This step is not
performed in a single syllable word, unless the consonant is s.
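Step 4 needs stress information, which the intermediate spelling does not carry, so the sketch below takes the index of the stressed vowel letter as an argument. That interface, the syllable count based on runs of vowel letters, the treatment of a word-final y as a vowel, and the skipping of aw, ow and oy are all assumptions made for illustration.

```python
VOWELS = set("aeiou3")
NEVER_DOUBLED = set("qrh")
SPECIAL = {"j": "dj", "k": "ck"}

def is_vowel(word, i):
    # Assumption: y counts as a vowel only at the end of a word.
    return word[i] in VOWELS or (word[i] == "y" and i == len(word) - 1)

def count_syllables(word):
    # Assumption: one syllable per maximal run of vowel letters.
    runs, previous = 0, False
    for i in range(len(word)):
        current = is_vowel(word, i)
        if current and not previous:
            runs += 1
        previous = current
    return runs

def step4(word, stressed):
    """Double the consonant after the stressed vowel at index `stressed`."""
    if stressed > 0 and is_vowel(word, stressed - 1):
        return word  # second letter of a vowel digraph
    if word[stressed:stressed + 2] in ("aw", "ow", "oy"):
        return word  # vowel digraphs ending in a consonant letter
    c = stressed + 1
    if c >= len(word) or is_vowel(word, c):
        return word  # no consonant directly after the vowel
    if c + 1 < len(word) and not is_vowel(word, c + 1):
        return word  # more than a single consonant follows
    consonant = word[c]
    if consonant in NEVER_DOUBLED:
        return word
    if count_syllables(word) == 1 and consonant != "s":
        return word
    return word[:c] + SPECIAL.get(consonant, consonant * 2) + word[c + 1:]
```

For example, `step4("plan3t", 2)` gives `"plann3t"` and `step4("pij3n", 1)` gives `"pidj3n"`, while `step4("sun", 1)` is unchanged.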
Step 5: If a long vowel (ae, ee, ie, oe or ue) is followed by another vowel
(including 3), the second e of the first vowel is replaced by
an apostrophe. However, if the long vowel is followed by a schwa
and a liquid, instead the schwa is dropped if it is not implied by the
English spelling, as in fier
and stael.
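Step 5 lends itself to two small regular expressions. In this sketch the `schwa_spelled` flag is an assumed way of passing in whether the traditional spelling implies the schwa, since that cannot be read off the intermediate form; the function name is illustrative.

```python
import re

def step5(word, schwa_spelled=True):
    """Apostrophe rule for a long vowel followed by another vowel."""
    if not schwa_spelled:
        # Long vowel + schwa + liquid: drop the schwa (fie3r -> fier).
        word = re.sub(r"([aeiou]e)3(?=[lr])", r"\1", word)
    # Otherwise the second e of the long vowel becomes an apostrophe.
    return re.sub(r"([aeiou])e(?=[aeiou3])", r"\1'", word)
```

For example, `step5("groeing")` gives `"gro'ing"`, and `step5("fie3r", schwa_spelled=False)` gives `"fier"`.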
Step 6: The combinations 3l, 3m, or 3n may be replaced by l, m, and n respectively under the following
circumstances. In the same circumstances, the sequence er (resulting from the Step 1 transformations) can be replaced by r.
Step 7: Any remaining 3 or other unstressed short vowel
is replaced by the corresponding vowel from the standard English
spelling. If the corresponding vowel is a digraph or trigraph,
the vowel most closely matching the pronunciation is used. If the
matching vowel is pronounced as oo,
then u is used as the
vowel. Exceptions: 3r/R
is always replaced by er.
(This takes place in Step 1.) A 3 at the end of a word is always
spelled a. If the
substitution of a for 3 would produce the sequence aw, then e is substituted instead. If
the substitution of o for 3 would produce either ow or oy, then u is substituted instead. If
the substitution of a for 3 would produce the sequence ah, then o is substituted instead. In
the words <today>, <tonight>, <tomorrow> and
<together>, the
first vowel is rendered as u,
to correspond to the stressed pronunciation.
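The schwa rules of Step 7 can be sketched for a single schwa at a time. The caller supplies the position of the 3 and the vowel letter taken from the traditional spelling; how that letter is chosen for digraphs and trigraphs is a judgment call this sketch does not attempt, and the function name and inputs are illustrative.

```python
def spell_schwa(word, i, traditional_vowel):
    """Replace the schwa marker at index i using the Step 7 exceptions."""
    if word[i] != "3":
        raise ValueError("index does not point at a schwa marker")
    following = word[i + 1:i + 2]  # empty string at the end of the word
    # A final schwa is always a; otherwise start from the traditional vowel.
    vowel = "a" if following == "" else traditional_vowel
    if vowel == "a" and following == "w":
        vowel = "e"  # avoid producing aw
    elif vowel == "a" and following == "h":
        vowel = "o"  # avoid producing ah
    elif vowel == "o" and following in ("w", "y"):
        vowel = "u"  # avoid producing ow or oy
    return word[:i] + vowel + word[i + 1:]
```

For example, `spell_schwa("plann3t", 5, "e")` gives `"plannet"`.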
Step 8: An ending s
which is not a plural marker is replaced as follows:
Note that an ending s
is not altered in the sequence ss,
or following any non-liquid consonant. For instance, the word
traditionally spelled <collapse> is kolaps in IRM. Also note that
the rules for pluralization (see Step 9)
make it unnecessary to change an -s
after a schwa with a spelling other than a. The word prommis cannot be the plural of a
word spelled prommi (perhaps
spelled <promeye> in traditional spelling), as its plural would
be prommies.
Step 9: When the source word is an
inflection of some other word, the inflection is added without
modifying the root spelling any more than absolutely necessary. A
word ending in a single e, i, o
or u (not part of a digraph)
is inflected as though the implied e
was present. An -ing
inflection is added without modifying any preceding vowel or syllabic
liquid, but possibly applying Step 5.
(However, when -ing is added
to a word ending in ce, the e is dropped.) However, the
final consonant of any one-syllable word with a single vowel letter is
doubled. An -er or -est suffix drops the e if following a vowel, unless an e was dropped from ee, ie, oe or ue at the end of a word, in which
case an apostrophe is used. (Note that the comparative -er suffix is not shortened to a
single r, except when
preceded by an apostrophe.) The plural suffix is always written
as s (or es) at the end of a word, whether
pronounced as "s" or "z". When pluralizing a word ending in y, the y is not changed (e.g., bownderys). Similarly, a past
tense d is always written as d, even if it is pronounced as "t",
and an ending y is not
modified (e.g., studdyd).
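A few of the Step 9 rules can be sketched as follows: restoring the implied e before an inflection, leaving a final y alone, and applying Step 5 after -ing. The choice between s and es, consonant doubling, and the -er/-est cases are left out, and the function names are illustrative only.

```python
import re

def restore_e(root):
    """Put back the implied e after a single final e, i, o or u."""
    if root.endswith("ce"):
        return root  # ce is a consonant spelling, not a vowel ending
    if root[-1:] in set("eiou") and root[-2:-1] not in set("aeiou"):
        return root + "e"
    return root

def plural(root):
    # Always written s here, whether pronounced "s" or "z"; y is kept.
    return restore_e(root) + "s"

def past(root):
    # Always written d, even when pronounced "t"; y is kept.
    return restore_e(root) + "d"

def add_ing(root):
    # A root ending in ce drops its e before -ing.
    stem = root[:-1] if root.endswith("ce") else restore_e(root)
    # Step 5: long vowel before a vowel gets an apostrophe.
    return re.sub(r"([aeiou])e(?=[aeiou3])", r"\1'", stem + "ing")
```

These reproduce the examples in the text: `plural("prommi")` gives `"prommies"`, `plural("bowndery")` gives `"bownderys"`, and `past("studdy")` gives `"studdyd"`.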
Step 10: Certain very common affixes
have standard spellings:
Traditional | IRM
-able/-ible | -abl * (1)
-age | -aj *
-al | -(a)l *
-ally | -(a)ly *
-ance/-ence | -(e)nce
-ancy/-ency | -(e)ncy
-ant/-ent | -(e)nt
anti- | anty- *
-ate | -at *
co- | coe- *
counter- | cownter- (3)
-ction | -xhn
ee- | ee- *
-er/-or | -(e)r * (1)
-ess | -ess *
ex- | ex(z)-
-ful | -ful
-ious | -e'us/-yus/-us * (2)
-ish | -ish *
-ity | -ity *
-ive | -iv *
-ization | -izaeshn *
-ize | -iez *
-less | -les
-like | -liek
-ly | -ly (1)
-man/-men | -man/-men (4)
-ment | -mnt
mis- | mis-
multi- | multi-
-ness | -nes (1)
-ory | -ory/-ery * (2)
-ous | -us *
out- | owt-
over- | oevr- (3)
pre- | pree- *
pro- | proe- *
re- | ree- *
semi- | semmy- *
-some | -sm
sub- | sub-
super- | suepr- (3)
-tion/-ssion | -shn
un- | un-
under- | undr- (3)
-y | -y *
Symbols in parentheses in the table above indicate letters which are optionally present, based on the normal IRM spelling rules. Affixes shown with an asterisk call for the application of Step 5 after combination with the affix. When step 5 is applied to a word in which the "ee" sound is spelled as y, the result is spelled y' rather than e', as in sairemoeny'l.
Other notes:
Also note that when a suffix starting with -e or -i is appended to a word ending in ce, the extraneous e is dropped.
These spellings may also be used in words that closely resemble words
with these affixes, even if the root form does not exist, e.g., detrimnt, vi'lnce, loensm. If
the application of this rule would create a double letter, a hyphen is
inserted, e.g., un-no'ing.
However, the ree- spelling is
not used (unless accurate) for words which do not have a "do again"
meaning, or a relationship to such a word. Similarly, the pree- and proe- spellings should not be used
for words where the prefix does not have the standard meaning,
such as prefer or propoez.
The following common words (and their homonyms) are rendered as shown here, rather than by application of the rules above.
Traditional form | IRM | Homonyms | IRM
a | a
all | awl | awl | awll
am | am
an | an | Ann | Ann
and | and
are | ar
be | be | bee | bee
been | bin | bin | binn
but | but | butt | butt
by | bi | buy | bie
can | kan | can (tin) | kann
do | du | dew, due | due
done | dun | dun | dunn
down | down | down (bird) | doun
for | for | four | foer
have | hav | halve | havv
he | he
hers | hurz
here | hir | hear | heer
her | hur
him | him | hymn | himm
his | hiz
I | I | aye, eye | ie
in | in | inn | inn
into | intu
is | iz
its | its
it's | it'z
just | just | just (fair) | jusst
like | liek | like (enjoy) | liik
may | mae | May | Mai
me | me
might | miet | might (force) | miit
mine | mien | mine (dig) | miin
must | must | must (mold) | musst
my | mi
new | nu | gnu, knew | nue
no | no | know | noe
none | nun | nun | nunn
not | not | knot | nott
of | ov
off | auf
on | on
one | wun | won | wunn
or | or | oar, ore | oer
our | owr | hour | ower
ours | owrz | hours | owers
see | se | sea | see
she | she
so | so | sew, sow | soe
some | sum | sum | summ
that | dhat
the | dhe | thee | dhee
them | dhem
their | dhaer
theirs | dhaerz
there | dhair
there's | dhair'z
they | dhae
they're | dhae'r
this | dhis
though | dho
through | thru | threw | thrue
to | tu | too, two | tue
us | us
very | vairy | vary | vaery
was | wuz
we | we | wee | wee
were | wur
what | whut
when | when
where | whair | ware, wear | wair
whether | whedhr | weather | wedhr
which | which | witch | wich
while | whiel
whither | whidhr | wither | widhr
who | hu
whom | huem
whose | huez
why | whi
will | wil | will (wish) | will
with | widh
would | wood | wood | woodd
you | yu | yew | yue
your | yur
you're | yu'r
yours | yurz
The wh spellings for words
above are used even if the wh
is not used for other words.
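In an implementation, the common-word list would most naturally be consulted before any of the rules run. A minimal sketch, with a handful of entries copied from the table; `respell` and the `rules` callable are hypothetical names, and homonyms that share a traditional spelling (such as the two senses of <can>) would need sense information that this lookup does not have.

```python
# A few entries copied from the table above; the full list is longer.
COMMON_WORDS = {
    "the": "dhe", "of": "ov", "to": "tu", "was": "wuz",
    "you": "yu", "know": "noe", "four": "foer",
}

def respell(traditional, rules):
    """Use the fixed spelling if there is one, else fall back to the rules.

    `rules` is a hypothetical callable standing in for Steps 1-10.
    """
    fixed = COMMON_WORDS.get(traditional)
    return fixed if fixed is not None else rules(traditional)
```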
Here are two samples of IRM: the first is the traditional first paragraph
of H.G. Wells' "The Star",
and the second is a transcription of a pop song by Dire Straits.
(Truth to tell, song lyrics are my usual transcription material.)
It wuz
on dhe furst dae uv dhe nu yir dhat dhe anowncemnt wuz maed, awlmoest
siemultaene'usly frum thre obzurvatorys, dhat dhe moeshn uv dhe plannet
Neptuen, dhe owtrmoest uv awl dhe plannets dhat weel abowt dhe sun, had
bekumm vairy irattik. A retardaeshn in its velossity had bin
suspekted in Desembr. Dhen a faent, remoet spek uv liet wuz
diskuvvrd in dhe reejn uv dhe perturbd plannet. At furst dhis did
not kawz enny vairy graet exietmnt. Si'ntiffik peepl, howevvr,
fownd
dhe intellijnce remarkabl enuff, eevn befor it bekaym noen dhat dhe nu
boddy wuz rappidly gro'ing larjer and brieter, and dhat its moeshn wuz
quiet differnt frum dhe ordrly progres uv dhe plannets.
Because this example is a pop song, I've left out most of
the punctuation, in the style of rock lyrics everywhere.
Worning liets ar
flashing down at Quollity Kntroel
Sumboddy thrue a spannr and
dhae thrue him in dhe hoel
Dhair'z ruemrs in dhe loeding
bae and anghr in dhe town
Sumboddy blu a wissl and dhe
wawls kaem down
Dhair'z a meeting in dhe
bordruem, dhae'r tri'ing tu traece dhe smel
Dhair'z leeking in dhe
woshruem, dhair'z a sneek in Pursonell
Sumwair in dhe koridr, sumwun
wuz hurd tu sneez
"Goodnes me, kood dhis be
Industre'l Dizeez?"
Dhe kairtaekr wuz kruesified
for sleeping at hiz poest
Refyuezing tu bi passified,
it'z him dhae blaem dhe moest
Dhe wochdawg got raebeez, dhe
forman'z got dhe flees
Evrywun'z knsurnd abowt
Industre'l Dizeez
Dhair'z pannik on dhe
swichbord, tungs ar tied in notts
Sum kum owt in simpathy, sum
kum owt in spots
Sum blaem dhe mannajmnt, sum
dhe employees
And evryboddy noes it'z dhe
Industre'l Dizeez
Dhe wurk force iz disgusted,
downs tuels, wawks
Innosnce iz injrd, expire'nce
just tawks
Evrywun seeks dammajes,
evrywun agrys
"Dheez ar klassik simptms uv
a monnetairy squeez"
On ITV and BBC dhae tawk
abowt dhe kurce
Filossofy iz yuesles,
the'ollojy iz wurce
Histery boyls oevr, dhair'z
an ekkonommiks freez
Soese'ollojists invent wurds
dhat meen Industre'l Dizeez
Doktr Parkinsn deklaird "I'm
not serpriezd tu se yu hir
Yu'v got smoekr's kawf frum
smoeking, bru'r's druep frum drinking bir
I doen't noe how yu kaem tu
get dhe Betty Daevis nees
But wurst uv awl, yung man,
yu'v got Industre'l Dizeez"
He roet me a preskripshn, he
sed "Yu ar depressd
I'm glad yu kaem tu se me, tu
get dhis awf yur chest
Kum bak and se me laeter,
next paeshnt pleez
Send in anudhr viktm uv
Industre'l Dizeez"
I go down tu Speekr's Kornr,
I'm thundrstruk
Dhae got fre speech,
toorists, poleece in truks
Tue men sae dhae'r Jeezus,
wun uv dhem must be rawng
Dhae got a proetest singr,
he'z singing a proetest sawng
He sez "Dhae wont tu hav a
wor so dhae kan keep us on owr nees
Dhae wont tu hav a wor so
dhae kan keep dhaer fakterys
Dhae wont tu hav a wor tu
stop us bi'ing Jappaneez
Dhae wont tu hav a wor tu
stop Industre'l Dizeez
Dhae'r poynting owt dhe
ennemy tu keep yu def and bliend
Dhae wawnt tu sap yur
ennerjy, inkarseraet yur miend
Giv yu Ruel Britanya, gassy
bir, paej thre
Tue weeks in Espanya and
Sundae strip teez"
Meenwhiel, dhe furst Jeezus
sez, "I'l kyoor it suen
Abollish Mundae morning and
Friedae aftrnuen"
Dhe udhr wun'z owt on hunghr
striek, he'z di'ing bi degrees
How kum Jeezus gets
Industre'l Dizeez?
My dialect of English does not distinguish <marry> from
<merry>, nor <whale> from <wail>. The sequence aar may be used to represent the
"arr" sound of <marry>, and wh
may be used to represent the initial consonant of <whale>, if
desired.
A few English words are pronounced with non-English sounds. The
most common of these sounds may be represented as follows:
Spelling | IRM example | Traditional spelling | Origin
kh | lokh | loch | German/Scottish
~ | kontreta~h | contretemps | French
" | Go"ta, Klair da Lu"n | Goethe, Clair de Lune | French/German
(The tilde and umlauts are best used as diacritical marks: kontretãh, Göta, Klair da
Lün. However, they may also be used as separate marks
within a word, as shown above, if more convenient.)
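Converting the separate-mark form into the diacritic form is mechanical. The sketch below uses Python's standard `unicodedata.normalize` to compose a letter with a combining tilde (U+0303) or combining diaeresis (U+0308); the function name `compose_marks` is illustrative. It treats any `"` that directly follows a letter as an umlaut mark, so it should be run on single words rather than on running text containing quotation marks.

```python
import re
import unicodedata

# Separate marks mapped to the corresponding combining characters.
COMBINING = {"~": "\u0303", '"': "\u0308"}

def compose_marks(word):
    """Turn letter+mark pairs (a~, o", u") into precomposed letters."""
    marked = re.sub(
        r'([A-Za-z])([~"])',
        lambda m: m.group(1) + COMBINING[m.group(2)],
        word,
    )
    # NFC merges each base letter with its combining mark where possible.
    return unicodedata.normalize("NFC", marked)
```

For example, `compose_marks("kontreta~h")` gives `"kontretãh"`.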
Version 1.10 of IRM introduced the following changes from
version 1.00:
Version 1.00 of IRM introduced the following changes from
version 0.90:
I consider IRM to be a failure. The use of doubled
consonants to show stress obscures word relationships in words like bottany and botannical or anonnimus and annonimmity. It works well
enough for short words, but is clumsy at best for the enlarged
vocabulary of technical or scientific writing. Further
development of IRM has ceased, unless I change my mind on this point,
or discover some technical means of lessening its severity.
Nevertheless, one could contemplate minor improvements to
IRM. The remainder of this section repeats the discussion of such
improvements from the previous version of this document, with one or
two
other items added before abandonment of the project.
IRM is incompatible with British English (RP), primarily
due to its failure to distinguish the broad "a" sound from the short
"o". This issue could probably be remedied by spelling the broad
"a" as ah. Possibly
other adjustments would be required. Probably no significant
change in the use of r would
be needed; the traditional handling of this issue seems to work well
enough.
It occurs to me that using -sc in place of -ce has some
advantages, notably the ability to avoid the mutation of -ce to
-ci. Examples: horsc,
mowsc, spiesc, wunsc, juescy, reduescing, replaescabl, disgraescfl.
-sc is of course not pronounced as /s/ in TS, but the suggestion of
-sce is strong enough that I think this change would work.
The decision to contract m3n
to mn at the end of words is
very new. I may regret it. It does make the handling of the
"-ment" suffix more consistent, however.
I'm not completely happy with the diacritic solution to foreign
vowels. But the diacritics are probably better than introducing
new unnatural digraphs that would be rarely used. I considered kontretonh, Geota and Kleir de Leun, but these spellings
don't appeal.
Using -s to mark all plurals
complicates things enormously. It forces me to use -ce (or some other convention) to
mark words ending in -s which
are not plural, and to add extra e's
when pluralizing words ending in a single vowel. I may yet change
my mind on this, even though using -z
for plurals will surely make most texts significantly stranger
looking. (I must admit that -ce
is itself pretty strange upon occasion.)
I'm considering spelling the "oo" vowel with a w. While this increases the
oddness factor, it allows me to treat this vowel like all the other
short vowels, and lets me spell words with an unstressed "oo" more
accurately, e.g., porkywpien
instead of porkyupien.
On writing it down, this really doesn't seem all that much of an
improvement.
It's unnecessary to double a consonant to show stress in a two syllable
word where the other syllable is spelled as a solitary l, m, n or r. That is, ladr would do as well as laddr. But keeping the
doubling makes texts more consistent, and is worthwhile for that
reason. So I probably won't change my mind about this again.
The fact that r's are not
doubled is an anomaly. Even though it is not necessary to do
this, I may want to add it, just to make the system more
consistent. There is also the issue that an unstressed 3r is always spelled er, rather than using the vowel of
the traditional spelling, as with all other sounds. This gets in
the way of easy recognition of some related words, like admier and admerabl. But so far, I
haven't thought of a way to fix this.
Perhaps I should get rid of the ngh
trigraph, and just use ngg
instead. This system already has plenty of double letters in it,
after all. And when the sequence ngh is used in TS, the sound is
virtually always a soft ng, as in <dinghy> and <gingham>.
Not contracting the comparative suffix to -r (so that I write kolder, not koldr) was a very close
decision. I find that when I write in IRM I naturally leave the e out. I may well change my
mind (again) on this one.
To comment on this page,
e-mail Alan at wyrdplay.org