ABCD Notation

Alan Beale
August 15, 2019

What is ABCD?

This page is about ABCD, which stands for Alan's Basic Codes with Diacritics.  I call ABCD a "notation" - it is easier to explain what it is not than to explain what it is, and why you might be interested in it.  ABCD is not a spelling system: it is too complex and idiosyncratic for that.  Neither is it a dictionary key: it is neither as accurate nor as regular as a dictionary key.  Essentially, ABCD is a notation which elucidates the relationship between a word's usual spelling and its pronunciation.  It is suitable for use both with words that conform to common English spelling patterns, such as nasty, nice, terrible and benevolent, and with horridly exceptional words like women, colonel, boatswain and connoisseur.

ABCD is loosely based on my spelling system DRE.  It makes use of an extensive number of diacritics, organized much like the DRE set of diacritics.  ABCD uses both lower- and upper-case characters, but prefers to use lower-case for the most common and familiar patterns, and upper-case for less familiar ones.  Further, ABCD's lower-case characters always match the corresponding traditional spellings (possibly with the addition of a diacritic), while upper-case characters may occasionally differ from them.  (For instance, the Z in ABCD represents an s which is pronounced as z.)  Each ABCD letter or digraph represents both a sound and an English spelling.  For instance, the digraphs sh, ti and SH all represent the same sound, but spelled as sh (as in shoe), ti (as in nation) and ch (as in machine) respectively. In addition to lower- and upper-case alphabetics, ABCD uses a few punctuation characters, mostly to note flaws in a word's usual spelling, and also to separate word constituents. (The @ character is an anomaly - it is treated as a special form of the letter a rather than as punctuation.)

Like DRE, ABCD is ambiguous about certain aspects of pronunciation, though less so than DRE.  An ABCD spelling does not usually indicate stress, and also does not distinguish between the schwa and the regular short vowel sounds.  However, if you ignore these two areas, ABCD is quite precise.  In fact, one characteristic of ABCD is that the ABCD spelling of a word is sufficient to represent both its traditional spelling (ignoring typographical issues like capitalization and hyphenation) and its pronunciation, subject to the two ambiguities of stress and schwa.  (I have defined a less ambiguous form of ABCD, briefly discussed in an appendix, but the ambiguous version is easier to read, and I think more useful.)  Furthermore, the ABCD representation of pronunciation and spelling is almost entirely context-free, which makes it easy to process mechanically by a computer program.  The context-dependent elements of ABCD are enumerated in this appendix.

Here are a few simple examples of ABCD in action, to give you a better idea of how it works.  The list below is in the format "TS: ABCD".  (Throughout this page, I use the convention of displaying traditionally spelled words (TS) in italics, and ABCD spellings in boldface.  Occasionally, my CAAPR notation is also used - this is also shown in bold, and enclosed in curly braces to identify it as CAAPR.)

abundant: abundant
alienate: álîenáte
charisma: KHariZma
handsome: han(d)som(e)
awareness: aWâre+nêss
accordion: a^ccòrdîon
demoralization: dem~Öral~Ízátion
laugh: l[au:~À][gh:f]

abundant is a word spelled entirely according to English patterns, and requiring no markings for vowel sounds.  alienate also conforms to patterns, but requires some vowels be marked with diacritics to prevent misinterpretation. Note that no special marking is required for a final silent e following a long vowel. The word charisma also conforms to high-frequency patterns, but both the ch and the s need to be altered to avoid misunderstanding.  (The spelling KH is used rather than CH, because CH is equally plausible as a spelling for the ch of machine.)  handsome has two silent letters, and, in contrast to alienate, the e is marked as silent since the previous vowel is not long.  Finally, the word awareness shows some ABCD techniques for resolving some of the subtler ambiguities of regular spelling.  The W in awareness is capitalized to show that aw is not to be interpreted as a single vowel sound (as in law), while the + sign after aWâre shows that the first e is not pronounced as a short e or a schwa, but instead is silent, because it ends the root word aware.

Unlike the words above it, the word accordion does not conform to basic English patterns, because the double c follows a vowel representing a schwa.  The ^ flags this situation.  The word demoralization displays a different difficulty - the British and American pronunciations differ.  The ~ flags a code which is interpreted differently for the two varieties of English.  And finally, the word laugh is completely defiant of standard English patterns, and so the ABCD representation simply shows how the letter combinations map to sounds.
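To make concrete the claim that an ABCD spelling is sufficient to recover the traditional spelling, here is a minimal Python sketch that maps the example spellings above back to TS. It is an illustration under stated assumptions, not part of the ABCD dictionary download: the function name abcd_to_ts and the small RESPELL table are hypothetical, and only the handful of respelled symbols mentioned in this introduction (KH, SH, Z, @ and þ) are covered.

```python
import re
import unicodedata

# Assumed subset of ABCD symbols whose letters differ from the traditional
# spelling. A full converter would need every such symbol from the tables.
RESPELL = [("KH", "ch"), ("SH", "ch"), ("Z", "s"), ("@", "a"), ("þ", "h")]


def abcd_to_ts(abcd: str) -> str:
    """Recover the traditional spelling from an ABCD spelling (sketch only)."""
    # An explicit [spelling:sound] pair contributes only its spelling part.
    text = re.sub(r"\[([^:\]]*):[^\]]*\]", r"\1", abcd)
    # Parentheses, +, ^ and ~ are annotations, not letters of the word.
    text = re.sub(r"[()+^~]", "", text)
    for symbol, letters in RESPELL:
        text = text.replace(symbol, letters)
    # Strip diacritics by decomposing and dropping combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Capitalization is a typographical issue that ABCD ignores.
    return bare.lower()
```

For instance, abcd_to_ts("han(d)som(e)") gives handsome and abcd_to_ts("l[au:~À][gh:f]") gives laugh. Symbols outside the assumed subset (for example Q, J or Ù) would need their own entries.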

An ABCD dictionary is available for download here.  It contains 27,000 English words, spelled in TS and ABCD.  For most words, the spelling is the same for both American and British English; where they differ, the dictionary provides both of them, with the American spelling first.  The whole point of ABCD is really the dictionary.  It can be used as an educational tool, for increasing one's understanding of the patterns of English spelling, and the ways in which they break down.  I also believe that it may be useful as the starting point for developing spelling systems which are very similar to existing spelling, by allowing easy identification of those words that fail to abide by whatever rules the designer feels are most important.  One reason I developed ABCD was to help me develop a version of my system DRE which did not require the use of diacritics.  I have not, at this time, actually succeeded in doing this, but there is no doubt that ABCD has made the process easier, and I consider it possible that the process might someday actually produce something satisfactory.

The ABCD dictionary is ultimately derived from the CAAPR dictionary; the pronunciations it uses are based on consensus of 2 American dictionaries, 2 British dictionaries and the Longman Pronunciation dictionary, which covers both varieties.  See the CAAPR page for more information on this subject.  The dictionary download above includes copies of both this page and the CAAPR definition for easy reference.

This version of ABCD and its dictionary differ from previous versions due to removal of the symbol L (which was completely synonymous with (l)), the use of the symbol É (which was originally part of ABCD, but was then unwisely removed), and removal of the exception for the S symbol in the sequence ôuS.

Note: The dictionary was updated in 2019 by the addition of a significant number of additional words, most of them frequently used capitalized words, as well as the correction of a few errors. The ABCD notation itself was not changed, although some of the tables in this document were clarified and/or corrected.

The diacritics of ABCD

Before attempting to describe the ABCD notation in full detail, it will be useful to describe the way it organizes its diacritical markings, which is based on the conventions of my spelling system DRE.  The organization is strictly applied for letters in lower case; some flexibility is allowed for upper-case letters to avoid running out.

  1. Vowels without diacritics represent either the regular short sound of the vowel (as in shack, check, chick, shock and chuck), or the schwa.  The digraph oo represents the vowel of shook.  An unmarked y is a rather special case, and may have either the vowel sound of misty, or the consonantal sound of yell.  When followed by an r, some vowels may also be pronounced with a stressed er sound, as in fern, bird and burn.

  2. Letters with an acute accent represent the normal long sound of the five English vowels, as in máte, méte, míte, móte and múte. The digraph oó represents the vowel of moót.  An acute-accented ý, as in flý, represents the same sound as í.

  3. Letters with a grave accent represent an alternate sound of the marked letter.  These sounds are all long in length, and almost always spoken distinctly.  These sounds are especially common in words of European origin.  Mnemonic words are dràma, sÈànce, marÌne, bòre (as well as dòg, in American English) and crùde.

  4. Letters with a circumflex represent an alternate sound of the marked letter.  These sounds are shorter than the sounds associated with the grave accent.  They may be reduced to a schwa, and may also have a slightly different meaning preceding an r than in other positions.  The sounds of the circumflexed vowels when no r follows are those of vidÊo, audîó, Ôther and pÛsh.  (DRE also spells âny and prêtty, but these particular forms are not used in ABCD.)  Before the letter r, the circumflexed vowels represent the sounds of câre, hÊre and wÔrd.  They are also used in the standard suffixes -âlly, -lêss, -nêss and -fûlly, to indicate an indistinct sound despite the following double letter.  Note that ABCD, unlike DRE, only uses a lower-case ê for an unstressed sound: ênáble is spelled with one in ABCD, but prÏtty is not.

  5. Letters with a dieresis represent the same sound as the unmarked letter, and are often used where the unmarked letter would have a different interpretation.  Examples are päradox, wickËd, fúËl and sörry.  The ü with a dieresis has a special meaning.  It represents either the unstressed sound of Û or the schwa, preceded by a y, as in regülar or mercüry.  In ABCD, ë and ï are also used to indicate the normal short sound of the vowel before an r, as in chërish and spïrit.  A ÿ with a dieresis may be used in ABCD to indicate a y which is always pronounced as a vowel, as in lobbŸist, where, because the y is followed by a vowel, one might otherwise assume the consonantal sound is intended.

DRE and ABCD both utilize a number of digraphs in which one of the vowels is marked with a diacritic.  In ABCD, with a few exceptions (oó, éu, éw and combinations like íË containing an Ë), the rule for interpretation of such combinations is simple - the sound is that of the marked letter, and the unmarked letter is ignored.  Example words include hEÂd, thÈY, dÍE, dÔUble, nervôuS and cúe.  A certain number of unmarked digraphs are also used, and they generally have the meaning you would expect. These are ai, au, aw, ay, ea, ee, eu, ew, oa, oe, oi, oo, ou, ow and oy.  Note that éu and éw are exceptions to the rule for interpreting digraphs above. eu and ew are pronounced like ùe (sleutþ, brew), and éu and éw like úe (as in éuró and féw). These two combinations break the rules because of the lack of an accented w in many fonts and on most keyboards.

Other ABCD Conventions

One of the all-too-common features of English spelling is the use of silent letters.  ABCD encloses silent letters in parentheses, as in (k)nífe, í(s)land and ballÈ(t).  There are a few letters and combinations, notably e and gh, whose treatment is more complicated when silent.  See their descriptions below for more information.

Another confusing aspect of English spelling is the use of double letters. A useful rule of thumb is that a double consonant implies that the preceding vowel is short and stressed; for example, compare filling and filing, or matter and material. Unfortunately, there are a great many exceptions to these so-called rules. ABCD uses the ^ character preceding a double letter to flag a vowel which is either unstressed or long, as in a^dditional or gró^ss. Note that ck, cq and dj are treated as double letters for this purpose.

One might well ask of ABCD: is it oriented towards American or British English?  The answer is that it is equally oriented towards both.  It may be used to spell words from either regional variety.  In most cases, the spelling is independent of the variety.  This may happen in any of three ways.  Many words are pronounced the same in both varieties, such as cat, cloudy and demonstration.  Other words are pronounced differently, but with pronunciations that are related to each other according to well-defined rules, allowing a single spelling to be used for both.  Examples of such words are pot, stairs and curious.  A third case is that of words which have related pronunciations in American and British English, but where the relationship is not reliable for similar words.  For instance, the American pronunciation of sample would be written sample in ABCD and the British pronunciation as sàmple, but the similar word ample would be spelled ample for both varieties.  ABCD uses the character ~ to indicate a pronunciation which commonly differs between American and British English.  For instance, sample is spelled s~Àmple in ABCD.  Words such as clerk and neither with unusual differences between American and British pronunciation must have two ABCD spellings, one for each variety.
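The way a single ~ spelling stands for two variety-specific spellings can be sketched in a few lines of Python. This is only an illustration: the function name resolve_variety and the VARIANTS table are hypothetical, and the table holds just a few of the ~ symbols, with their American and British readings taken from the Denotes column of the alphabet tables further down this page.

```python
import unicodedata

# (American, British) readings for a few ~ symbols. This subset is assumed
# for illustration; it is not the full list of ~ symbols.
VARIANTS = {
    "~À": ("a", "à"),
    "~Í": ("i", "í"),
    "~Ö": ("ò", "ö"),
    "~Ú": ("ù", "ú"),
    "~ÉU": ("eu", "éu"),
    "~ÉW": ("ew", "éw"),
}


def resolve_variety(abcd: str, variety: str) -> str:
    """Replace each known ~ symbol by its American or British reading."""
    column = {"american": 0, "british": 1}[variety]
    # Normalize so that accented letters compare equal however they were typed.
    text = unicodedata.normalize("NFC", abcd)
    # Try longer symbols first so that ~ÉU is not cut short by a shorter key.
    for symbol in sorted(VARIANTS, key=len, reverse=True):
        key = unicodedata.normalize("NFC", symbol)
        reading = unicodedata.normalize("NFC", VARIANTS[symbol][column])
        text = text.replace(key, reading)
    return text
```

With this table, resolve_variety("s~Àmple", "american") returns sample and resolve_variety("s~Àmple", "british") returns sàmple, matching the two spellings given above.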

You may also be wondering what the distinction is between the upper- and lower-case ABCD symbols.  Before a lower-case symbol could be used, there were two prerequisites.  The first was simply that the base character for a lower-case symbol had to be the character used in regular spelling. ABCD uses the symbol Ù for the letter o when pronounced as long oo, as in move.  A lower-case symbol could not be used unless I were willing to use a form of the letter o for it.  The other requirement was that I would use a lower-case symbol only when it was pretty clear how you would spell the sound in a rational spelling system.  For instance, spelling the vowel of plain as ai is very reasonable, and so lower-case could be used.  But spelling the second vowel of machine with the letter i is at least dubious, and so the word is denoted maSHÌne rather than maSHìne. The capital letter emphasizes that there's "something funny" going on here.

Deriving ABCD Spellings

I think the best approach to describing the details of ABCD is a semi-formal one.  So let me start off with a description of how the ABCD spelling of a word is determined.  The process starts off with a decomposition of a word into pairs.  The first element of each pair is one or more letters from the spelling, and the second is from the CAAPR representation of the pronunciation (see Endnote 1). (CAAPR is described here.  Note that the remainder of this page assumes familiarity with CAAPR - so if it is new to you, you may want to keep the CAAPR writeup open for reference.)  

As an example, the word charisma is originally decomposed as:

   [ch:k][ar:ør][i:i][s:z][m:m][a:ø]

The process of deriving the ABCD spelling then proceeds in three steps:

  1. High frequency pairs are replaced by ABCD symbols or symbol combinations.  (It seems remarkable that there are few enough of these pairs that one can find readable representations of all of them.)

  2. Certain symbols may be modified or added based on special circumstances of individual words.  This is done either to avoid ambiguity (e.g., to distinguish the th of worth from that of porthole) or to note unexpected violations of English patterns (like the double t in attend or the s at the end of the non-plural atlas).

  3. Any remaining pairs have the second element modified to contain an ABCD code rather than a CAAPR code, except that the CAAPR symbols {ø} and {&}, which do not have an unambiguous ABCD representation, are retained.
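The decomposition and the first of the three steps can be sketched as follows. This is a simplified illustration rather than the program actually used to build the dictionary: parse_pairs and apply_step_one are hypothetical names, HIGH_FREQUENCY contains only the few pairs needed for the charisma example (taken from the tables below), and steps 2 and 3 are not attempted, so any pair without a symbol is simply left in brackets with its CAAPR code unchanged.

```python
import re

# Assumed subset of step 1: high-frequency [spelling:sound] pairs and the
# ABCD symbols that replace them.
HIGH_FREQUENCY = {
    ("ch", "k"): "KH",
    ("ar", "ør"): "ar",
    ("i", "i"): "i",
    ("s", "z"): "Z",
    ("m", "m"): "m",
    ("a", "ø"): "a",
    ("l", "L"): "l",
}


def parse_pairs(decomposition: str) -> list:
    """Split '[ch:k][ar:ør]...' into (spelling, sound) tuples."""
    return re.findall(r"\[([^:\]]+):([^\]]*)\]", decomposition)


def apply_step_one(pairs: list) -> str:
    """Replace known pairs by their symbols; keep the rest bracketed."""
    pieces = []
    for spelling, sound in pairs:
        symbol = HIGH_FREQUENCY.get((spelling, sound))
        pieces.append(symbol if symbol is not None else f"[{spelling}:{sound}]")
    return "".join(pieces)
```

Applied to the decomposition of charisma shown above, apply_step_one(parse_pairs(...)) yields KHariZma, and joining the first elements of the pairs gives back the traditional spelling.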

Step 1, and aspects of step 2, can be summarized easily by simply listing the pairs to which they apply, and how they are represented (which I will do below).  But some additional notations are more conveniently described here:

  1. In a number of cases, pairs at the end of a word are handled differently from the same pair within a word.  This is especially true for the silent e, and the letter s when used to indicate a plural or possessive.  Because of English's fondness for compound and derived words, these letters can sometimes occur within a word with the end-of-word interpretation. In ABCD, a plus sign is used to indicate the end of a word within a word.  Examples are scâre+crÓW and státe+ment.  The plus sign is also used to separate double letters when both are sounded, as in un+nótiCed or mis+státe.

  2. A lone ^ is used before a double consonant following a schwa or an unstressed short i, in violation of normal English patterns.  Examples are a^ccommodáte, co^rrect and cÔmpa^ss.  The combinations ck, cqu, dG, dj and tch are treated as double letters here.

  3. Silent letters are enclosed in parentheses, as noted above.  (Other notations are sometimes used for silent e and gh, as described below.)

  4. The symbol ~ always indicates that what follows is pronounced differently in British and American English.  Individual letter combinations beginning with ~ are discussed below, together with the notations with no dependence on English variety.

One property of ABCD is that it is very easily parsed by software - while some letter combinations, such as ch, have meanings distinct from those of their components, there is never (so far as I can determine) any ambiguity in how a word is divided into meaningful units.  I note that this property is preserved even if all the ~'s are removed. Which is to say, the ~'s are there to assist the human reader, but are unnecessary for accurate algorithmic decomposition.
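As a rough illustration of mechanical decomposition, the Python sketch below splits an ABCD spelling into units by always taking the longest symbol that matches at the current position. Note the assumptions: this page does not say which algorithm is appropriate, so greedy longest-match is my choice for the sketch; the name units and the MULTI list are hypothetical; MULTI holds only a few multi-letter symbols; and context-dependent symbols (such as ci, which is a unit only before a vowel) are not handled.

```python
import unicodedata

# A small, assumed subset of multi-letter ABCD symbols. A real parser would
# use every symbol listed in the alphabet tables.
MULTI = ["tch", "ch", "ck", "sh", "KH", "SH", "PH", "ll", "oó"]


def units(abcd: str) -> list:
    """Split an ABCD spelling into symbols by greedy longest match."""
    text = unicodedata.normalize("NFC", abcd)
    symbols = sorted(
        (unicodedata.normalize("NFC", s) for s in MULTI), key=len, reverse=True
    )
    found = []
    i = 0
    while i < len(text):
        for symbol in symbols:
            if text.startswith(symbol, i):
                found.append(symbol)
                i += len(symbol)
                break
        else:
            # No multi-letter symbol starts here: take a single character.
            found.append(text[i])
            i += 1
    return found
```

For example, units("chill") gives ch, i, ll, and units("catch") gives c, a, tch rather than splitting tch into t and ch.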

The ABCD Alphabet

Having said all that, I am now ready to run down the alphabet, and produce a complete list of the ABCD phonograms.  Though the list is quite long and detailed, it is highly structured and organized, notably by the diacritical conventions given above, and for that reason is not hard to grasp and master.  For symbols beginning with a ~, the Denotes column of the tables gives both the American and the British meaning for the symbol: in a/à, the a is the American form, and the à the British form.

a

Symbol    Denotes             Example       ABCD Example
a         [a:a] or [a:ø]      cat, about    cat, about
á         [a:E]               late          láte
à         [a:A]               father        fàther
â         -âlly               locally       lócâlly
@         [a:i] or [a:ê]      message       mess@G(e)
ai        [ai:E]              rain          rain
air       [air:ër]            fair          fair
ar        [ar:ør]             awkward       awkward
âr        [ar:ër]             care          câre
är        [ar:ar]             paradox       päradox
ärr       [arr:ar]            arrow         ärrÓW
au        [au:Ø]              pause         pauZe
aw        [aw:Ø]              claw          claw
ay        [ay:E]              play          play
Å         [a:Ø]               water         wÅter
AÉ        [ae:I]              algae         alGAÉ
~À        a/à                 bath          b~Àtþ
~Âr       âr/[ar:ør]          secretary     secrêt~Âry

See below for [a:o], as in watch (ABCD wOtch).

b -

Symbol  Denotes  Example  ABCD Example
b       [b:b]    big      big
bb      [bb:b]   rubble   rubble

c -


Symbol             Denotes         Example           ABCD Example
c (Note 1)         [c:k] or [c:s]  cat, city         cat, city
cc (Note 1)        [cc:k]          accord            accòrd
ck                 [ck:k]          luck              luck
cqu                [cqu:kw]        acquit            a^cquit
cQ                 [cqu:k]         lacquer           lacQer
ch                 [ch:C]          chill             chill
ci (Note 2)        [ci:X]          vicious           viciôuS
Ce, C(e) (Note 3)  [ce:s]          advance, furnace  advanCe, furn@C(e)

Notes:

  1. c denotes [c:s] if followed, in the traditional spelling, by e, i or y, and otherwise [c:k].  The few words which do not conform to this pattern must be spelled in ABCD with an explicit [c:k] or [c:s], as in [c:k]eltic or fa[c:s]àd(e).  cc denotes [cc:k] unless followed by e, i or y. When it is followed by e, i or y, the pronunciation is ks - this is regarded as 2 c's in succession, rather than a single occurrence of cc.

  2. ci denotes [ci:X] only when followed by a vowel.  Otherwise, the c and the i are distinct symbols.

  3. Ce and C(e) represent [c:s] followed by a silent e, in situations where the silent e is not a magic e, as in advanCe and furn@C(e).  In the case of furnace, the e is misleading about the preceding vowel, and so is parenthesized.  In the case of advance, the previous vowel is too distant in the word to be affected by the e, which serves the useful purpose of defining the pronunciation of the preceding c.
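The default reading of c and cc in Note 1 is simple enough to express directly. The Python sketch below is a hedged illustration of that one rule only: c_sounds and SOFTENERS are hypothetical names, the input is a traditionally spelled word, and the other c combinations in the table (ch, ck, cqu, and ci before a vowel) are deliberately ignored.

```python
SOFTENERS = set("eiy")  # letters after which c is read as s


def c_sounds(ts_word: str) -> list:
    """List (letters, sound) for each c or cc, using only the Note 1 default."""
    word = ts_word.lower()
    found = []
    i = 0
    while i < len(word):
        if word[i] != "c":
            i += 1
            continue
        # cc is a single unit, sounded k, unless e, i or y follows it.
        if word.startswith("cc", i) and word[i + 2:i + 3] not in SOFTENERS:
            found.append(("cc", "k"))
            i += 2
            continue
        # Otherwise each c is read on its own from the letter that follows.
        following = word[i + 1:i + 2]
        found.append(("c", "s" if following in SOFTENERS else "k"))
        i += 1
    return found
```

So c_sounds("accord") reports a single cc sounded k, while c_sounds("accept") reports two separate c's, sounded k and then s. Words like Celtic, where the default is wrong, are exactly the ones that need an explicit [c:k] in ABCD.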

See below for [ch:k], as in chrome (ABCD KHróme), and for [ch:X], as in machine (ABCD maSHÌne).  Also see n below for information on the combinations ñc and ñKH as in uncle and anchor.

d -

Symbol       Denotes         Example      ABCD Example
d            [d:d] or [d:þ]  dog, wanted  d~Ög, wOntêd
dd           [dd:d]          add          add
dG (see G)   [dg:j]          judge        judG(e)
dj           [dj:j]          adjust       a^djust
dJ (see J)   [d:j]           procedure    procédJur(e)
ed (Note 1)  [ed:þ]          missed       missed

Notes:
  1. At the end of a word, ed represents [ed:þ], that is, a past tense in which the e is silent, and in which the d is pronounced either as t or d, depending on the previous letter.  There are some exceptional words ending with -ed in which the e is surprisingly not silent, such as beloved and wicked - these words are spelled with Ëd in ABCD to prevent ambiguity.

    Note that words like hunted and raided are regular, represented by [e:i][d:þ], and unambiguously spelled with -êd in ABCD.  Also note that the Ëd spelling is unnecessary in one-syllable words, and so bed is bed and not bËd in ABCD. Words compounded from a one-syllable word ending with ed will use Ëd, as in sickbËd, unless the one-syllable word is separated from the rest by a +, as in fòrce+fed.

e -

Symbol           Denotes                         Example                           ABCD Example
e                [e:e] or [e:ø]                  ten, rivet                        ten, rivet
e (Note 1)       [e:-]                           late                              láte
é                [e:I]                           medium                            médîum
ê (Note 2)       [e:i] or [e:ê] or -lêss, -nêss  enable, erupt, lifeless, fitness  ênáble, êrupt, lífe+lêss, fitnêss
ea               [ea:I]                          feast                             feast
ear              [ear:ïr]                        fear                              fear
ed (see d)       [ed:þ]                          missed                            missed
ee               [ee:I]                          feet                              feet
eer              [eer:ïr]                        beer                              beer
er               [er:ør] or [er:&r]              river, revert                     river, rêvert
ër               [er:er]                         cherish                           chërish
ërr              [err:er]                        terrible                          tërrible
es (see s)       [es:$]                          miles                             míles
eu               [eu:U]                          sleuth                            sleutþ
eur (Endnote 2)  [eur:Ür]                        pleurisy                          pleurisy
éu               [eu:yU]                         feud                              féud
éur (Endnote 2)  [eur:yÜr]                       Europe                            éurop(e)
ew               [ew:U]                          drew                              drew
éw               [ew:yU]                         few                               féw
É (Note 3)       [e:I]                           me, crises, museum                mÉ, crísÉs, múZÉum
È                [e:E]                           ballet, cafe                      ballÈ(t), cafÈ
Èe               [ee:E]                          matinee                           matinÈe
Ê                [e:ý]                           apostrophe, video                 apostroPHÊ, vidÊó
Ë (Note 4)       [e:e] or [e:ê] or [e:ø]         duet, wicked, duel                d~ÚËt, wickËd, d~ÚËl
EÀr              [ear:àr]                        heart                             hEÀrt
EÂ               [ea:e]                          head, measure                     hEÂd, mEÂZJur(e)
EÂr              [ear:ër]                        bear                              bEÂr
EÄ               [ea:ë]                          yeah                              yEÄ(h)
ËA (Note 5)      [ea:ï]                          idea (Brit)                       ídËA
ÉI               [ei:I]                          seize                             sÉIze
ÉIr              [eir:ïr]                        weird                             wÉIrd
ÈI               [ei:E]                          reign                             rÈI(g)n
ÈIr              [eir:ër]                        their                             thÈIr
ER               [ear:&r]                        earth                             ERtþ
Êr               [er:ïr]                         here                              hÊre
Ër (Note 6)      [er:ør]                         supplier                          su^pplíËr
ÈY               [ey:E]                          survey                            survÈY
ÊY               [ey:ý]                          money                             mÔnÊY
~Er (Note 7)     ër/[er:ør]                      cemetery                          cemet~Ery
~ÉU (Endnote 2)  eu/éu                           neutral                           n~ÉUtral
~ÉUr             eur/éur                         neurotic                          n~ÉUrotic
~ÉW              ew/éw                           news                              n~ÉWZ

Notes:

  1. The handling of silent e in ABCD is complicated.  There are two functions that silent e commonly performs.  It indicates that the previous vowel sound is long, in which case the e is commonly called magic.  Alternately, in many words, such as mice, savage and tense, it changes the sound of the previous consonant.  (Note that without the final e, tens would be a plural, and the s would be pronounced as z.)  When both functions are taken into account, we can classify words ending with a silent e into 4 categories.  We say a final e is magic if the previous vowel (separated from the e by a single consonant sound) is long.  (If the consonant is an r, the sounds of â, Ê and ò are also treated as long.)  We say a final e is misleading if there is a vowel preceding it which ought to be long, but is not.  In vice, the e is magic, but in service, it is misleading.  In words in which a final e is not magic, we call it useful if it is preceded by c, g or s, and otherwise useless.  An e can be both useful and misleading, as in garbage, or both useless and misleading, as in festive.

    When a silent e occurs at the end of a word, it is enclosed in parentheses if it is misleading or if it is useless.  Also, when a useful (but not magic) e follows the letter c or s, ABCD capitalizes the consonant to show what the e is accomplishing.  Some example words are míne, pláce, festiv(e), sav@G(e) and tenSe.  When a magic e occurs within a word and is not parenthesized, it is followed by a +, usually indicating the end of an internal word, as in bâre+ly, lífe+boat or minCe+meat.

  2. ê is used only when [e:i] is unstressed.  Ï is used instead when stressed, as in Ïñglish.

  3. É is used only when [e:I] appears where a silent e might be expected, at the end of a word (mÉ) or before s (parentþesÉs).  Note that É is used even in words with no other vowels, such as be, even though it would be impossible for the e to be silent. É is also used in words like museum, where use of the usual é would seem to be part of the éu digraph.

  4. Ë is used for the regular sound of e when a bare e would be misinterpreted, such as wicked, which looks like a past tense, and duet, where d~Úet would appear to be a one-syllable word whose vowel is ~Úe.

  5. The sound of ËA is an RP diphthong represented in SAMPA as /I@/, which usually occurs before r in words like pier.

  6. Ër is used like Ë, to prevent ambiguity, as in flýËr, where a bare e would be treated as part of the composite vowel symbol ýe.

  7. Note that the distinction between ~Âr and ~Er is only orthographic - both are pronounced the same in either variety of English.
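The marking rules for a final silent e in Note 1 can be summarized as a small decision procedure. The Python sketch below assumes that the caller has already judged whether the e is magic or misleading, since that depends on the pronunciation; mark_final_e is a hypothetical name, and the stem is the ABCD spelling of the word without its final e.

```python
def mark_final_e(stem: str, magic: bool, misleading: bool) -> str:
    """Attach a final silent e to an ABCD stem, following Note 1 (sketch)."""
    if magic:
        # A magic e needs no special marking.
        return stem + "e"
    last = stem[-1]
    # A non-magic e is useful after c, g or s, and useless otherwise.
    useful = last.lower() in "cgs"
    if last in "cs":
        # Capitalize c or s to show what the e is accomplishing.
        stem = stem[:-1] + last.upper()
    # Parenthesize the e if it is misleading or useless.
    ending = "(e)" if misleading or not useful else "e"
    return stem + ending
```

This reproduces the example words above: mark_final_e("tens", magic=False, misleading=False) gives tenSe, and mark_final_e("festiv", magic=False, misleading=True) gives festiv(e).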

See below for [le:øL], as in double (ABCD dÔUble).


f -

Symbol  Denotes  Example  ABCD Example
f       [f:f]    free     free
ff      [ff:f]   stuff    stuff

g -

Symbol          Denotes  Example        ABCD Example
g               [g:g]    good           good
gg              [gg:g]   egg            egg
G (Notes 1, 2)  [g:j]    germ           Germ
GG              [gg:j]   veggie         veGGÎE
GH              [gh:-]   high, taught   híGH, tauGHt
GJ              [g:J]    mirage, genre  miràGJ(e), GJ[e:o]nrË

Notes:

  1. Note that the spelling G is used even if the letter following g is unusual, as in margarine (American ABCD màrGarin(e)).

  2. The combination dG, as in edge (ABCD edG(e)), is treated as a double letter.

h -

Symbol      Denotes  Example  ABCD Example
h           [h:h]    hot      hot
H (Note 1)  [h:h]    mishap   misHap

Notes:

  1. Because the letter h is used in a number of digraphs, it is frequently ambiguous when it follows a consonant, as in the words porthole, mishap and rawhide.  ABCD uses a capital H for [h:h] if confusion might be possible, as in pòrtHóle, misHap and rawHíde.

i -

Symbol           Denotes                        Example                ABCD Example
i                [i:i] or [i:ê] or [i:ø]        pig, acid, devil       pig, acid, devil
í                [i:Y]                          item                   ítem
î (Endnote 3)    [i:ý] or [i:ÿ]                 radio                  rádîó
ir               [ir:ør] or [ir:êr] or [ir:&r]  admiral, direct, bird  admiral, direct, bird
ïr               [ir:ir]                        miracle                mïr@cle
ïrr              [irr:ir]                       mirror                 mïrror
Ì                [i:I]                          marine                 marÌne
Ï (Note 1)       [e:i]                          pretty                 prÏtty
IÉ               [ie:I]                         brief                  brIÉf
IÉr              [ier:ïr]                       pier                   pIÉr
ÍE               [ie:Y]                         pie                    pÍE
ÎE               [ie:ý]                         cookie                 cookÎE
~Í (Note 2)      i/í                            missile, civilization  miss~Íle, civil~Ízátion

Notes:

  1. Ï is used for [e:i] only when stressed; when unstressed, ê is used.

  2. Note that the ending e in miss~Íle is not parenthesized - it is misleading in American English, but magic in British English.

The letter i also occurs in the combinations ci, si, sci, ssi, ti and Zi, where it has no sound of its own, but modifies the sound of the preceding consonant.

See below for [i:y], as in billion (ABCD billYon).

j -

Symbol      Denotes   Example  ABCD Example
j           [j:j]     jam      jam
jj          [jj:j]    hajj     hajj
J (Note 1)  see note  capture  captJur(e)

Notes:

  1. The capital J is inserted as a sign of palatalization in the combinations dJ (in procedure), sJ (in insure), ssJ (in pressure), tJ (in capture and question), and ZJ (in measure).  More precisely, it is used in representing the pairs [d:j] (dJ), [s:X] (sJ), [ss:X] (ssJ), [t:C] and [ti:C] (tJ) and [s:J] (ZJ).  The symbol J also appears in the combination GJ, described under g.

    (Note that there is no ambiguity between the t and ti spellings corresponding to tJ - an i was present in the original spelling exactly if the letter after the J is not a u.)
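The rule in the parenthetical note above is mechanical, so it can be sketched in Python. The function name expand_tJ is hypothetical, and I am assuming that a u carrying a diacritic (such as ù) counts as a u for this purpose, which the note does not state explicitly.

```python
import unicodedata


def expand_tJ(abcd: str) -> str:
    """Rewrite each tJ as the traditional t or ti, per the note on J (sketch)."""
    pieces = []
    i = 0
    while i < len(abcd):
        if abcd.startswith("tJ", i):
            following = abcd[i + 2:i + 3]
            # Assumption: compare the base letter, ignoring any diacritic.
            base = unicodedata.normalize("NFD", following)[:1].lower()
            pieces.append("t" if base == "u" else "ti")
            i += 2
        else:
            pieces.append(abcd[i])
            i += 1
    return "".join(pieces)
```

For example, expand_tJ("captJur(e)") gives captur(e), while expand_tJ("questJon") gives question. Other ABCD marks, such as the parentheses, are left untouched.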

k -

Symbol  Denotes  Example  ABCD Example
k       [k:k]    skin     skin
KH      [ch:k]   school   sKHoól

The combination ck is treated as a double k - see c above.

See n below for information on the combination ñk.

l -

Symbol       Denotes  Example  ABCD Example
l            [l:L]    leg      leg
ll           [ll:L]   pill     pill
le (Note 1)  [le:øL]  purple   purple

Notes:

  1. le represents the normal sound of l followed by the normal sound of e when not at the end of a word, before a past tense or plural marker (d or s) or followed by +, as in sled. I now regret this context dependency, as slËd would be considerably easier for a program to process correctly, especially due to the simultaneous -ed ending.

The previous version of ABCD used the symbol L as a synonym for (l) after some form of the letter a. This didn't have any particular advantages, and sometimes complicated programming using ABCD as input.

m -

Symbol  Denotes  Example  ABCD Example
m       [m:m]    mud      mud
m       [m:øm]   spasm    spaZm
mm      [mm:m]   hammer   hammer

n -

Symbol      Denotes  Example       ABCD Example
n           [n:n]    nice          níce
n           [n:øn]   didn't        didnt
nn          [nn:n]   sunny         sunny
ng          [ng:G]   song          s~Öng
ñ (Note 1)  [n:G]    finger, sink  fiñger, siñk
N (Note 2)  [n:n]    ungrateful    uNgráte+ful

Notes:

  1. ñ can be used before any of the various symbols representing or starting with the k sound, as in uñcle, añKHor, bañquet, coñQer and jiñx.

  2. N represents [n:n] when the regular n sound is followed by g, as in ungrateful.  N is not needed preceding k sounds - unclean is simply spelled unclean in ABCD.

o -

Symbol           Denotes         Example              ABCD Example
o                [o:o] or [o:ø]  pot, lemon           pot, lemon
ó                [o:O]           zero                 zéró
ò                [o:Ø]           coral, sloth (Amer)  còral, slòtþ
oa               [oa:O]          boat                 boat
oar              [oar:Ør]        boar                 boar
oe               [oe:O]          toe                  toe
oer              [oer:Ør]        Boer                 boer
oi               [oi:Q]          boil                 boil
oo               [oo:V]          book                 book
oó               [oo:U]          boot                 boót
oór (Endnote 2)  [oor:Ür]        poor                 poór
or               [or:ør]         motor, decorate      mótor, decoráte
ör               [or:or]         laboratory (Brit)    laböratory
örr              [orr:or]        sorry                sörry
ou               [ou:W]          house                house
ôu               -ôuS            vicious              viciôuS
ow               [ow:W]          allow                a^llow
oy               [oy:Q]          boy                  boy
O                [a:o]           squash               squOsh
Ô                [o:u]           mother               mÔther
OR (Note 1)      [our:ør]        favour               fávOR
Ôr               [or:&r]         word                 wÔrd
OÙ               [ou:U]          soup                 sOÙp
OÙr (Endnote 2)  [our:Ür]        tour                 tOÙr
ÒUr              [our:Ør]        court                cÒUrt
ÔU               [ou:u]          trouble              trÔUble
ÓW               [ow:O]          blow                 blÓW
~Ö               ò/ö             cross, forest        cr~Öss, f~Örêst
~Òr              òr/[or:ør]      category             catêg~Òry

Notes:

  1. I chose to use OR rather than ôur here because almost all -our words have an American equivalent spelled with -or.

p -

Symbol  Denotes  Example  ABCD Example
p       [p:p]    pink     piñk
pp      [pp:p]   happy    happy
PH      [ph:f]   photo    PHótó

q -

Symbol  Denotes  Example  ABCD Example
qu      [qu:kw]  queen    queen
Q       [qu:k]   unique   únÌQe

See n above for the combinations ñqu and ñQ, as in bañquêt and coñQer.

r -

Symbol      Denotes  Example  ABCD Example
r (Note 1)  [r:r]    red      red
r (Note 2)  [r:-]    arrive   a^rríve

Notes:

  1. The letter r indicates [r:r] after a consonant or at the start of a word. When r follows a vowel, it generally forms a digraph or trigraph with that vowel.  The possibilities are described with the individual vowels.

  2. Often, the meaning of ?rr is the same as that of ?r, for ? a vowel. Rather than give extra rules for all the conditions under which this may occur, it is simpler to just regard the second r as silent. Of course, if there is an explicit rule for ?rr, that rule has precedence.

s -

Symbol           Denotes         Example               ABCD Example
s (Note 1)       [s:s] or [s:$]  sad, cries            sad, crÍEs
ss               [ss:s]          guess                 g(u)ess
sc, sC (Note 2)  [sc:s]          scent, acquiesce      scent, acquîesC(e)
sci (Note 3)     [sci:X]         luscious              lusciôuS
sh               [sh:X]          ship                  ship
si (Notes 3, 4)  [si:X]          mansion               mansion
sJ (see J)       [s:X]           insure                insJùre
ssi              [ssi:X]         mission               mission
ssJ (see J)      [ss:X]          pressure              pressJur(e)
S (Note 5)       [s:s]           atlas, cactus, tense  atlaS, cactuS, tenSe
SH               [ch:X]          machine               maSHÌne


Notes:

  1. At the end of a word (or before a +) s is assumed to indicate a plural, in which case, depending on the preceding sound, it may be pronounced as z.  The plural s often follows a silent e - however, in contrast to the past tense, where the d is always preceded by e, a silent e in the plural generally implies its presence in the singular as well.

  2. sc denotes [sc:s] preceding e, i or y.  In any other position, it is simply the juxtaposition of the regular s and c (pronounced as k) symbols.  The C may be capitalized to indicate a following non-magic e.

  3. si, sci, ssi and ti have the sound of {X} only when followed by a vowel.  Otherwise, the i is a separate symbol.

  4. When si or ti follows n, there are two common pronunciations: nch and nsh.  The CAAPR dictionary, from which the ABCD dictionary is derived, uses nsh as the recognized pronunciation, which is more in line with the pronunciation of si and ti in other positions.

  5. S represents [s:s] at the end of a word, where it might be mistaken for a plural.  S is also used before a silent e, where the e prevents the word from being interpreted as a plural.  See e note 1 above for more details.

See z below for [s:z] (except in plurals) as in hose (ABCD hóZe).

t -

Symbol                 Denotes          Example            ABCD Example
t                      [t:t]            top                top
tt                     [tt:t]           kitten             kitten
tch                    [tch:C]          catch              catch
th                     [th:D]           that, leather      that, lEÂther
tþ                     [th:T]           think, truth       tþiñk, trùtþ
ti (see s Notes 3, 4)  [ti:X]           vocation           vócátion
tJ (see J)             [t:C] or [ti:C]  capture, question  captJur(e), questJon

u -

Symbol           Denotes                     Example                          ABCD Example
u                [u:u] or [u:ø]              sun, circus                      sun, circus
ú (Note 1)       [u:yU] or [u:yV]            puny, annual                     púny, annúal
ù (Note 1)       [u:U] or [u:V]              lunar, gradual                   lùnar, gradJùal
û                -fûlly                      awfully                          awfûlly
ü                [u:yV] or [u:yû] or [u:yø]  refugee, regular, volume (Amer)  refügee, regülar, volüm(e)
úe               [ue:yU]                     cue                              cúe
úer (Endnote 2)  [uer:yÜr]                   puerile (Brit)                   púer~Íle
ùe               [ue:U]                      true                             trùe
ur               [ur:ør] or [ur:&r]          Arthur, burn                     àrtþur, burn
urr (Note 2)     [urr:ür]                    hurry                            hurry
úr (Endnote 2)   [ur:yÜr]                    purity                           púrity
ùr (Endnote 2)   [ur:Ür] or [ur:Vr]          plural, brochure (Amer)          plùral, bróSHùr(e)
ür               [ur:yûr] or [ur:yør]        accurate, mercury                accür@t(e), mercüry
Ù                [o:U]                       move                             mÙve
Û                [u:V] or [u:û]              push, prejudice                  pÛsh, prejÛdiC(e)
~Ú               ù/ú                         student                          st~Údent
~Úe              ùe/úe                       Tuesday                          t~ÚeZd[ay:y]
~Úr (Endnote 2)  ùr/úr                       duration, manure                 d~Úrátion, man~Úre
~Ü               Û/yÛ                        insulation                       ins~Ülátion

Notes:

  1. The symbols ú and ù ordinarily represent the long vowel /u:/, but they represent /u/ (which is rendered in CAAPR as {V}) before a vowel.

  2. urr is the only instance of an ABCD notation without a ~ which is interpreted differently for American and British English, but this seems reasonable, since TS exhibits this variance itself.

v -

Symbol  Denotes  Example  ABCD Example
v       [v:v]    very     vëry
vv      [vv:v]   savvy    savvy

w -

Symbol       Denotes  Example  ABCD Example
w            [w:w]    way      way
wh           [wh:µ]   which    which
W (Note 1)   [w:w]    away     aWay
Wh (Note 1)  [wh:µ]   awhile   aWhíle

Notes:

  1. When the consonant w follows an a, e or o, confusion with a vowel digraph is possible, in which case the w is spelled with a capital letter.  This results in spellings like aWay, bêWâre and mícróWáve.  This is also possible with the wh digraph, as in aWhíle and nóWh[er:air]e.

x -

Symbol       Denotes  Example  ABCD Example
x            [x:ks]   fix      fix
xc (Note 1)  [xc:ks]  except   êxcept
X            [x:gz]   exist    êXist

Notes:

  1. xc stands for [xc:ks] only preceding e, i or y.  Otherwise, it is simply an x followed by a c, as in excavate.
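
The xc rule above (and the similar rule for sc in s Note 2) can be sketched in a few lines of Python.  This is only an illustration, not the code used to build the ABCD dictionary.  The function names are hypothetical, and it assumes that "e, i or y" also covers those letters when they carry a diacritic or are capitalized, which the note does not actually say.

```python
import unicodedata

FRONT_VOWELS = {"e", "i", "y"}

def base_letter(ch: str) -> str:
    """Strip diacritics and case, e.g. 'ê' -> 'e' (an assumption, see above)."""
    return unicodedata.normalize("NFD", ch)[0].lower()

def is_soft_digraph(word: str, i: int, digraph: str = "xc") -> bool:
    """Return True if `digraph` starts at index i and is followed by e, i or y."""
    # Work on the composed form so that each accented letter is one character.
    word = unicodedata.normalize("NFC", word)
    end = i + len(digraph)
    if word[i:end] != digraph or end >= len(word):
        return False
    return base_letter(word[end]) in FRONT_VOWELS

# In êxcept the xc is a digraph; in the ordinary spelling excavate it is not.
print(is_soft_digraph("êxcept", 1))
print(is_soft_digraph("excavate", 1))
```

When the function returns False, the x and the c are simply read as two separate symbols, as the note describes.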

See n above for information on the combination ñx, as in jiñx.

y -

Symbol      Denotes         Example            ABCD Example
y (Note 1)  [y:y] or [y:ÿ]  yes, Tokyo         yeS, tókyó
y (Note 1)  [y:ý]           happy, everything  happy, ev(e)rytþing
ý           [y:Y]           fly, qualify       flý, quOlifý
ýe          [ye:Y]          dye                dýe
ÿ           [y:i]           myth               mÿtþ
Y           [i:y]           million            millYon
Ÿ (Note 1)  [y:ý]           lobbyist           lobbŸist

Notes:

  1. The ABCD symbol y may indicate either a consonant or vowel sound. As a consonant, it denotes [y:y].  As a vowel, it denotes [y:ý]. The vowel sound occurs at the end of a word or before a consonant, and the consonantal sound occurs at the beginning of a word. Before a vowel, either sound may occur.  Usually, when y is found after a consonant and before a vowel, the corresponding pair is [y:ÿ], indicating that both the consonant and the vowel pronunciation are possible.  In this position, a consonantal pronunciation is assumed - if only a vowel pronunciation is used, then the spelling should be Ÿ.  See Endnote 3 for further discussion of the ambiguous letter y and its sounds.

A previous version of ABCD used Ý rather than ý for long i at the end of a multi-syllable word like reply.  This distinction has been dropped, as it did not seem particularly valuable.

z -

Symbol       Denotes  Example  ABCD Example
z            [z:z]    zoo      zoó
zz           [zz:z]   buzz     buzz
Z            [s:z]    hose     hóZe
Zi (Note 1)  [si:J]   vision   viZion
ZJ (see J)   [s:J]    measure  mEÂZJur(e)

Notes:

  1. Zi denotes [si:J] only when followed by a vowel.  Otherwise, the Z and the i are distinct symbols.

Unusual sounds -

As noted, the ABCD spelling notation provides unique codes for high-frequency spelling patterns.  Of course, as we all know, English is afflicted with a sizable number of words that break these patterns.  ABCD handles these words by means of bracketed symbol pairs, for instance, [eau:éw] in beautiful.  The eau is the letter sequence in the usual spelling, and the éw defines the sound (but not the spelling).  Obviously, this representation is not unique: [eau:ú] or [eau:yoó] could have been written instead.

Almost all sounds of English have at least one high-frequency spelling, and so there is at least one ABCD spelling that can be used in such pairs for those sounds.  But a few sounds, mostly from words of foreign origin, are so low-frequency that there is no standard ABCD notation for them.  An example is the final sound of the word loch, when pronounced in the authentic Scottish way.  ABCD therefore must assign representations to these sounds, so that these words can be rendered sensibly.  For instance, the /x/ sound of loch is given the ABCD spelling of QH, and so the word is written lo[ch:QH] in ABCD.
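
To make the bracketed-pair idea concrete, here is a small Python sketch that pulls the pairs out of an ABCD word and keeps either the spelling half or the sound half.  It is a sketch under assumptions, not a full ABCD reader: the function names are hypothetical, it assumes the spelling half never contains a colon, and it leaves every symbol outside the brackets (diacritics, parentheses and so on) untouched.

```python
import re

# A pair looks like [spelling:sound], e.g. [ch:QH] or [eau:éw].
PAIR = re.compile(r"\[([^:\[\]]*):([^\[\]]*)\]")

def bracket_pairs(abcd: str) -> list[tuple[str, str]]:
    """List the (spelling, sound) pairs found in an ABCD word."""
    return PAIR.findall(abcd)

def spelling_side(abcd: str) -> str:
    """Replace each pair by its spelling half."""
    return PAIR.sub(lambda m: m.group(1), abcd)

def sound_side(abcd: str) -> str:
    """Replace each pair by its sound half."""
    return PAIR.sub(lambda m: m.group(2), abcd)

print(bracket_pairs("lo[ch:QH]"))   # [('ch', 'QH')]
print(spelling_side("lo[ch:QH]"))   # loch
print(sound_side("lo[ch:QH]"))      # loQH
```

For a word such as lo[ch:QH], where nothing else needs decoding, the spelling half already gives the usual spelling.  Most words would need the rest of the ABCD rules applied as well.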

This table catalogs the representations of unusual sounds (and one uncommon American/British difference):

Symbol         Denotes (SAMPA)  Example          ABCD Example
ã              /A~/             melange          mÈl[an:ã]GJ(e)
õ              /O~/             concierge        c[on:õ]cî[er:air]GJe
QH             /x/              loch             lo[ch:QH]
UH             /V~/             uh-huh           UHhUH
& (Note 1)     /3/              masseuse (Brit)  mass[eu:&]Z(e)
~OOr (Note 2)  oòr/oor          courier          c[our:~OOr]îer

Notes:

  1. The CAAPR {&} symbol is normally used before the letter r, as in SH[au:ó]ff[eur:&r], to indicate the vowel sound of fur.  There are a few borrowed French words such as masseuse which, in British English, are pronounced using this vowel without an r.  The British pronunciation of masseuse is represented as mass[eu:&]Z(e) in ABCD.
  2. The ABCD spelling ~OOr corresponds to the CAAPR spelling {Vr}, used for words such as courier and hooray.  In American English, {Vr} is regarded as synonymous with {Ür}, spelled in ABCD as oór.  In British English, however, {Vr} and {Ür} are different sounds, and {Vr} is symbolized in ABCD as (unaccented) oor.  See Endnote 2 for more detail.

Endnotes

I. CAAPR as used in ABCD

Completely pure CAAPR is not used here.  Certain simplifications have been introduced to remove distinctions not relevant to this project.  In particular,

  1. The indistinct i, CAAPR {ê}, is treated as identical to the short i ({i}).

  2. The CAAPR symbol {°} is treated the same as {ø}, and the symbols {î}, {3}, {¹} and {³} are treated as synonymous with {ê}, and therefore with {i}.

  3. The symbol {ß} is treated as identical to {r}, and {R} as identical to {ør}.

  4. The {*} symbol is removed.
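
The four simplifications above amount to a character-for-character substitution, which can be sketched in Python as follows.  This is an illustration only: the function name is hypothetical, and it assumes that each CAAPR symbol is a single character in the transcription string, which may not hold for every entry in the real dictionary.

```python
# Substitution table for the simplifications listed above.
SIMPLIFY = str.maketrans({
    "\u00ea": "i",   # ê, the indistinct i, becomes short i
    "°": "ø",
    "\u00ee": "i",   # î, treated like ê and therefore like i
    "3": "i",
    "¹": "i",
    "³": "i",
    "ß": "r",
    "R": "ør",
    "*": None,       # the * symbol is removed
})

def simplify_caapr(caapr: str) -> str:
    """Apply the simplifications to a CAAPR string (written without the braces)."""
    return caapr.translate(SIMPLIFY)

# A transcription that uses none of these symbols is returned unchanged.
print(simplify_caapr("iLe·ktro'nik"))
```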


Also, some aspects of ABCD depend on stress.  Sometimes, when stress differs between British and American English, it will happen that the ABCD spelling is based on a compromise between the two.  A good example is the word electronic.  The American CAAPR for this word is {iLe·ktro'nik}, while the British CAAPR is {i·Lektro'nik}.  The conversion to ABCD is done on the composite form {i·Le·ktro'nik}, leading to the ABCD spelling Ïlectronic, which does not accurately reflect the American pronunciation.  I have edited the ABCD dictionary to correct this particular instance, but it is likely that other examples of the same problem still exist.

II. R spellings, especially with u

ABCD utilizes a number of spellings that imply the equivalence of a short sound followed by an r to a related long sound followed by r.  Examples are the spellings air, eer and oar, which logically ought to be pronounced as ár, ér and ór, but are actually pronounced as âr, Êr and òr respectively.  This implied equivalence is also reflected in the common use of the magic e in words like care, sphere and sore.

The most difficult case has to do with the vowels represented in CAAPR as {Vr} and {Ür}.  In American English, both {Vr} and {Ür} symbolize the same sound, represented in SAMPA as /Ur/, while for British English {Ür} represents the diphthong /U@(r)/.  I note that {Ür} is quite common in RP, while {Vr} occurs in only a few words, notably guru and courier. It turns out to be extraordinarily convenient to represent {yÜr}/{Ür} by the long vowel symbols úr and ùr, as in cúre and plùral.  Furthermore, though American and British dictionaries quite consistently show this sound as {Vr}, most of the participants in the Saundspel group feel that {Ur} (SAMPA /u:r/) is more accurate.  For these reasons, {Ür} is consistently shown with a long vowel.  For instance, poór is used rather than poor.  However, when the sound is understood as {Vr} in British English, it is represented as a short sound there.  The word guru is spelled g[ur:~OOr]ù in ABCD, representing gùrù in American English, but gÛrù in British English.

III. The ambiguity of y

CAAPR utilizes the symbol {y} for the consonant sound of the letter y (as in young), and {ý} for the vowel sound (as in happy).  But there is a third possibility, a quite common one, represented by {ÿ}.  {ÿ} represents a sound that can be either {y} or {ý}, varying by speaker.  Most words like champion and warrior, in which i is followed by an unstressed vowel, are of this sort.  Some words in which y is followed by a vowel, such as Tokyo and Libyan, are also of this sort.  The ABCD approach for dealing with words containing this ambiguity is to spell them with the existing letter.  Thus, champion is spelled champîon, implying a vowel sound, even though the consonant sound is no doubt more common, and similarly, the spelling libyan is used, implying a consonant sound for the y, even though the word is probably more commonly pronounced with a vowel there.  The symbols Y and Ÿ can be used for words like spanYard and lobbŸist, where the pronunciation is unequivocally different from what one might expect.

Appendix - Context-Dependent Elements of ABCD

ABCD represents pronunciation and traditional spelling in an almost context-free way, which is to say that the interpretation of its symbols usually does not depend on their context.  For instance, the sequence SH always represents the sound of {X} and the spelling ch, regardless of where it occurs in a word, or what other symbols are adjacent.  For a computer program to understand ABCD, it is mostly necessary simply to divide the text into symbols.  Some letters are used in more than one symbol (for instance, the letter H occurs in the symbols H, GH, KH, PH, QH, SH and UH), but the rule is that each letter is contained in the longest possible symbol, so that SH will always represent SH, and never S followed by H.
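
The longest-possible-symbol rule can be sketched as a greedy tokenizer.  The Python below is a minimal illustration, not the program used to build the ABCD dictionary.  The function name is hypothetical, the symbol set contains only the H-symbols listed above rather than the full ABCD inventory, and any character that does not start a listed symbol is treated as a one-character symbol.

```python
# Illustrative subset only: the H-containing symbols named in the text.
SYMBOLS = {"H", "GH", "KH", "PH", "QH", "SH", "UH"}

def tokenize(abcd: str, symbols: set[str] = SYMBOLS) -> list[str]:
    """Split an ABCD string into symbols, always taking the longest match."""
    longest = max(len(s) for s in symbols)
    tokens = []
    i = 0
    while i < len(abcd):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(longest, len(abcd) - i), 0, -1):
            piece = abcd[i:i + n]
            if piece in symbols:
                tokens.append(piece)
                i += n
                break
        else:
            # Not a listed symbol: fall back to a single character.
            tokens.append(abcd[i])
            i += 1
    return tokens

print(tokenize("UHhUH"))   # ['UH', 'h', 'UH']
```

Because longer candidates are tried first, the SH in a word like maSHÌne comes out as one symbol and never as S followed by H.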

There are, however, a small number of symbols whose interpretation is dependent on context.  These context dependencies are found in regular English spelling, and the familiarity benefits of adopting them in ABCD more than offset the additional complexity of context dependence.  The context-dependent elements of ABCD are of two sorts, positional and general.  The positional elements are as follows:

The other context-dependent elements of ABCD may occur anywhere within a word, as follows:

Appendix - An Unambiguous ABCD

As I mentioned earlier, ABCD is an ambiguous system.  The five unmarked vowel letters, as well as ü and Û, may denote either the schwa or a short vowel.  This ambiguity can be remedied without losing the readability of ABCD.  I'm not sure this is a change for the good, as it requires many more diacritics, while the benefits are small unless one considers this distinction important even in an orthography intended to be very similar to TS.  Nevertheless, here's how it is done.

The short vowel sounds of a, e, i and o are denoted by the vowel with a dieresis, in the way in which the dieresis is already used preceding r.  This gives rise to very precise spellings like ämbidëxtrôuS, hïppopötamuS and sêlëctïvity.  The sounds of u require a more serious reorganization, due to the use of ü for both the {yø} and {yV} sounds.  The table below shows how it could be done.

Sound      Ambiguous ABCD  Unambiguous ABCD  Ambiguous Example  Unambiguous Example
{ø}        u               u                 campus             cämpus
{u}        u               ü                 cut                cüt
{V}        Û               Ü                 pÛsh               pÜsh
{yø}       ü               û                 accür@t(e)         äccûr@t(e)
{yV}       ü               Û                 refüGee            rëfÛGee
{U}/{yU}   ~Ú              |Ù                d~Úty              d|Ùty
{V}/{yV}   ~Ü              |Ü                d~Ürátion          d|Ürátion
{ø}/{V}    Û               ~U                instrÛment         ïnstr~Ument
{yø}/{yV}  ü               µ                 monüment           mönµment

One other ambiguity that must be resolved is between the unstressed {ør} and the stressed {&r}, which can both be spelled by er, ir or ur.  An obvious fix here is to use eR, iR and uR for the stressed sound, leading to spellings such as fiRst, êmeRGency and muRder.  (And also, Ôr should be changed to ÔR, for consistency, as in wÔRtþ.)

In some ways, the unambiguous system is a better arrangement, since ü is compatible with the other uses of dieresis, and the resemblance of the symbol | to the letter I may be mnemonic. Nevertheless, I think the number of diacritics required in the unambiguous system makes it inferior to the slightly simpler ambiguous one.  Certainly, the ambiguity of ABCD is not an issue for my planned uses of it.

The same process that generates the ambiguous ABCD dictionary could equally well generate an unambiguous version.  I am not at this time offering it for download, but if you have some use for it, please contact me (Alan at wyrdplay.org), and I'll be happy to provide a copy.


To comment on this page, e-mail Alan at wyrdplay.org
