ABCD Notation

Alan Beale
July 30, 2011

What is ABCD?

This page is about ABCD, which stands for Alan's Basic Codes with Diacritics.  I call ABCD a "notation" - it is easier to explain what it is not than to explain what it is, and why you might be interested in it.  ABCD is not a spelling system: it is too complex and idiosyncratic for that.  Neither is it a dictionary key: it is neither as accurate nor as regular as one.  Essentially, ABCD is a notation which elucidates the relationship between a word's usual spelling and its pronunciation.  It is suitable for use both with words that conform to common English spelling patterns, such as nasty, nice, terrible and benevolent, and with horridly exceptional words like women, colonel, boatswain and connoisseur.

ABCD is loosely based on my spelling system DRE.  It makes use of an extensive number of diacritics, organized much like the DRE set of diacritics.  ABCD uses both lower- and upper-case characters, but prefers to use lower-case for the most common and familiar patterns, and upper-case for less familiar ones.  Further, ABCD's lower-case characters always match the corresponding traditional spellings (possibly with the addition of a diacritic), while upper-case characters may occasionally differ from them.  (For instance, the Z in ABCD represents an s which is pronounced as z.)  Each ABCD letter or digraph represents both a sound and an English spelling.  For instance, the digraphs sh, ti and SH all represent the same sound, but spelled as sh (as in shoe), ti (as in nation) and ch (as in machine) respectively. In addition to lower- and upper-case alphabetics, ABCD uses a few punctuation characters, mostly to note flaws in a word's usual spelling, and also to separate word constituents. (The @ character is an anomaly - it is treated as a special form of the letter a rather than as punctuation.)

Like DRE, ABCD is ambiguous about certain aspects of pronunciation, though less so than DRE.  An ABCD spelling does not usually indicate stress, and also does not distinguish between the schwa and the regular short vowel sounds.  However, if you ignore these two areas, ABCD is quite precise.  In fact, one characteristic of ABCD is that the ABCD spelling of a word is sufficient to represent both its traditional spelling (ignoring typographical issues like capitalization and hyphenation) and its pronunciation, subject to the two ambiguities of stress and schwa.  (I have defined a less ambiguous form of ABCD, briefly discussed in an appendix, but the ambiguous version is easier to read, and I think more useful.)  Furthermore, the ABCD representation of pronunciation and spelling is almost entirely context-free, which makes it easy to process mechanically by a computer program.  The context dependent elements of ABCD are enumerated in this appendix.

Here are a few simple examples of ABCD in action, to give you a better idea of how it works.  The list below is in the format "TS: ABCD".  (Throughout this page, I use the convention of displaying traditionally spelled words (TS) in italics, and ABCD spellings in boldface.  Occasionally, my CAAPR notation is also used - this is also shown in bold, and enclosed in curly braces to identify it as CAAPR.)

abundant: abundant
alienate: lente
charisma: KHariZma
handsome: han(d)som(e)
awareness: aWre+nss
accordion: a^ccrdon
demoralization: dem~ral~ztion
laugh: l[au:~][gh:f]

abundant is a word spelled entirely according to English patterns, and requiring no markings for vowel sounds.  alienate also conforms to patterns, but requires some vowels be marked with diacritics to prevent misinterpretation. Note that no special marking is required for a final silent e following a long vowel. The word charisma also conforms to high-frequency patterns, but both the ch and the s need to be altered to avoid misunderstanding.  (The spelling KH is used rather than CH, because CH is equally plausible as a spelling for the ch of machine.)  handsome has two silent letters, and, in contrast to alienate, the e is marked as silent since the previous vowel is not long.  Finally, the word awareness shows some ABCD techniques for resolving some of the subtler ambiguities of regular spelling.  The W in awareness is capitalized to show that aw is not to be interpreted as a single vowel sound (as in law), while the + sign after aWre shows that the first e is not pronounced as a short e or a schwa, but instead is silent, because it ends the root word aware.

Unlike the words above it, the word accordion does not conform to basic English patterns, because the double c follows a vowel representing a schwa.  The ^ flags this situation.  The word demoralization displays a different difficulty - the British and American pronunciations differ.  The ~ flags a code which is interpreted differently for the two varieties of English.  And finally, the word laugh is completely defiant of standard English patterns, and so the ABCD representation simply shows how the letter combinations map to sounds.

An ABCD dictionary is available for download here.  It contains 27,000 English words, spelled in TS and ABCD.  For most words, the spelling is the same for both American and British English; where they differ, the dictionary provides both of them, with the American spelling first.  The whole point of ABCD is really the dictionary.  It can be used as an educational tool, for increasing one's understanding of the patterns of English spelling, and the ways in which they break down.  I also believe that it may be useful as the starting point for developing spelling systems which are very similar to existing spelling, by allowing easy identification of those words that fail to abide by whatever rules the designer feels are most important.  One reason I developed ABCD was to help me develop a version of my system DRE which did not require the use of diacritics.  I have not, at this time, actually succeeded in doing this, but there is no doubt that ABCD has made the process easier, and I consider it possible that the process might someday actually produce something satisfactory.

The ABCD dictionary is ultimately derived from the CAAPR dictionary; the pronunciations it uses are based on consensus of 2 American dictionaries, 2 British dictionaries and the Longman Pronunciation dictionary, which covers both varieties.  See the CAAPR page for more information on this subject.  The dictionary download above includes copies of both this page and the CAAPR definition for easy reference.

This version of ABCD and its dictionary differ from previous versions in the use of the symbol , (which was originally part of ABCD, but was then unwisely removed), and by removal of the exception for the S symbol in the sequence uS.

The diacritics of ABCD

Before attempting to describe the ABCD notation in full detail, it will be useful to describe the way it organizes its diacritical markings, which is based on the conventions of my spelling system DRE.  The organization is strictly applied for letters in lower case; some flexibility is allowed for upper-case letters to avoid running out.

  1. Vowels without diacritics represent either the regular short sound of the vowel (as in shack, check, chick, shock and chuck), or the schwa.  The digraph oo represents the vowel of shook.  An unmarked y is a rather special case, and may have either the vowel sound of misty, or the consonantal sound of yell.  When followed by an r, some vowels may also be pronounced with a stressed er sound, as in fern, bird and burn.

  2. Letters with an acute accent represent the normal long sound of the five English vowels, as in mte, mte, mte, mte and mte. The digraph o represents the vowel of mot.  An acute-accented , as in fl, represents the same sound as .

  3. Letters with a grave accent represent an alternate sound of the marked letter.  These sounds are all long in length, and almost always spoken distinctly.  These sounds are especially common in words of European origin.  Mnemonic words are drma, snce, marne, bre (as well as dg, in American English) and crde.

  4. Letters with a circumflex represent an alternate sound of the marked letter.  These sounds are shorter than the sounds associated with the grave accent.  They may be reduced to a schwa, and may also have a slightly different meaning preceding an r than in other positions.  The sounds of the circumflexed vowels when no r follows are those of vido, aud, ther and psh.  (DRE also spells ny and prtty, but these particular forms are not used in ABCD.)  Before the letter r, the circumflexed vowels represent the sounds of cre, hre and wrd.  They are also used in the standard suffixes -lly, -lss, -nss and -flly, to indicate an indistinct sound despite the following double letter.  Note that ABCD, unlike DRE, only uses a lower-case for an unstressed sound: nble is spelled with one in ABCD, but prtty is not.

  5. Letters with a dieresis represent the same sound as the unmarked letter, and are often used where the unmarked letter would have a different interpretation.  Examples are pradox, wickd, fl and srry.  The with a dieresis has a special meaning.  It represents either the unstressed sound of  or the schwa, preceded by a y, as in reglar or mercry.  In ABCD, and are also used to indicate the normal short sound of the vowel before an r, as in chrish and sprit.  A  with a dieresis may be used in ABCD to indicate a y which is always pronounced as a vowel, as in lobbŸist, where, because the y is followed by a vowel, one might otherwise assume the consonantal sound is intended.

DRE and ABCD both utilize a number of digraphs in which one of the vowels is marked with a diacritic.  In ABCD, except for a few exceptions (o, u, w and combinations like containing an ), the rule for interpretation of such combinations is simple - the sound is that of the marked letter, and the unmarked letter is ignored.  Example words include hEd, thY, dE, dUble, nervuS and ce.  A certain number of unmarked digraphs are used also, and they generally have the meaning you would expect. These are ai, au, aw, ay, ea, ee, eu, ew, oa, oe, oi, oo, ou, ow and oy.  Note that u and w are exceptions to the rule for interpreting digraphs above. eu and ew are pronounced like e (sleut, brew), and u and w like e (as in ur and fw). These two combinations break the rules because of the lack of an accented w in many fonts and on most keyboards.

Other ABCD Conventions

One of the all-too-common features of English spelling is the use of silent letters.  ABCD encloses silent letters in parentheses, as in (k)nfe, (s)land and ball(t).  There are a few letters and combinations, notably e, gh and l, whose treatment is more complicated when silent.  See their descriptions below for more information.

Another confusing aspect of English spelling is the use of double letters. A useful rule of thumb is that a double consonant implies that the preceding vowel is short and stressed; for example, compare filling and filing, or matter and material. Unfortunately, there are a great many exceptions to this so-called rule. ABCD uses the ^ character preceding a double letter to flag a vowel which is either unstressed or long, as in a^dditional or gr^ss. Note that ck, cq and dj are treated as double letters for this purpose.
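The placement rule for ^ lends itself to a mechanical check.  The following is only a minimal sketch, not part of ABCD itself: the function name caret_is_well_placed and the tuple DOUBLE_LIKE are hypothetical, and the list of double-like combinations is my reading of this page (ck, cq and dj here, plus dG and tch from the later list of conventions).

```python
import re

# Combinations that this page says are treated as double letters.
# Assumption: this tuple merges the two lists given in the text.
DOUBLE_LIKE = ("ck", "cq", "dG", "dj", "tch")


def caret_is_well_placed(abcd: str) -> bool:
    """Return True if every ^ is directly followed by a double letter
    or by one of the double-like combinations."""
    for match in re.finditer(r"\^", abcd):
        rest = abcd[match.end():]
        # A true double letter: the next two characters are the same letter.
        doubled = len(rest) >= 2 and rest[0] == rest[1] and rest[0].isalpha()
        if not (doubled or rest.startswith(DOUBLE_LIKE)):
            return False
    return True


print(caret_is_well_placed("a^dditional"))  # True: ^ precedes dd
print(caret_is_well_placed("a^cquit"))      # True: ^ precedes cq(u)
print(caret_is_well_placed("ab^out"))       # False: no double letter follows
```

The check only looks at what follows the ^; deciding whether a ^ is required would also need stress and vowel-length information, which an ABCD spelling does not carry.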

One might well ask of ABCD: is it oriented towards American or British English?  The answer is that it is equally oriented towards both.  It may be used to spell words from either regional variety.  In most cases, the spelling is independent of the variety.  This may happen in any of three ways.  Many words are pronounced the same in both varieties, such as cat, cloudy and demonstration.  Other words are pronounced differently, but with pronunciations that are related to each other according to well-defined rules, allowing a single spelling to be used for both.  Examples of such words are pot, stairs and curious.  A third case is that of words which have related pronunciations in American and British English, but where the relationship is not reliable for similar words.  For instance, the American pronunciation of sample would be written sample in ABCD and the British pronunciation as smple, but the similar word ample would be spelled ample for both varieties.  ABCD uses the character ~ to indicate a pronunciation which commonly differs between American and British English.  For instance, sample is spelled s~mple in ABCD.  Words such as clerk and neither with unusual differences between American and British pronunciation must have two ABCD spellings, one for each variety.

You may also be wondering what the distinction is between the upper- and lower-case ABCD symbols.  Before a lower-case symbol could be used, there were two prerequisites.  The first was simply that the base character for a lower-case symbol had to be the character used in regular spelling. ABCD uses the symbol for the letter o when pronounced as long oo, as in move.  A lower-case symbol could not be used unless I were willing to use a form of the letter o for it.  The other requirement was that I would use a lower-case symbol only when it was pretty clear how you would spell the sound in a rational spelling system.  For instance, spelling the vowel of plain as ai is very reasonable, and so lower-case could be used.  But spelling the second vowel of machine with the letter i is at least dubious, and so the word is denoted maSHne rather than maSHne. The capital letter emphasizes that there's "something funny" going on here.

Deriving ABCD Spellings

I think the best approach to describing the details of ABCD is a semi-formal one.  So let me start off with a description of how the ABCD spelling of a word is determined.  The process starts off with a decomposition of a word into pairs.  The first element of each pair is one or more letters from the spelling, and the second is from the CAAPR representation of the pronunciation (see Endnote 1). (CAAPR is described here.  Note that the remainder of this page assumes familiarity with CAAPR - so if it is new to you, you may want to keep the CAAPR writeup open for reference.)  

As an example, the word charisma is originally decomposed as:

   [ch:k][ar:r][i:i][s:z][m:m][a:]
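The pair decomposition shown above is easy to read mechanically.  The sketch below is illustrative only: the names parse_pairs and spelling_of are hypothetical, and the pronunciation codes are copied exactly as they appear on this page rather than checked against the CAAPR definition.

```python
import re

# One [spelling:sound] pair; the sound part may be empty.
PAIR = re.compile(r"\[([^:\]]*):([^\]]*)\]")


def parse_pairs(decomposition: str) -> list[tuple[str, str]]:
    """Split a string of [spelling:sound] pairs into (spelling, sound) tuples."""
    pairs = PAIR.findall(decomposition)
    # Reject input that contains anything other than well-formed pairs.
    if "".join(f"[{s}:{p}]" for s, p in pairs) != decomposition:
        raise ValueError(f"not a pure pair sequence: {decomposition!r}")
    return pairs


def spelling_of(pairs: list[tuple[str, str]]) -> str:
    """Recover the traditional spelling by joining the first elements."""
    return "".join(spelling for spelling, _ in pairs)


pairs = parse_pairs("[ch:k][ar:r][i:i][s:z][m:m][a:]")
print(pairs[0])            # ('ch', 'k')
print(spelling_of(pairs))  # charisma
```

Joining the first elements always gives back the traditional spelling, which is the property the rest of the derivation relies on.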

The process of deriving the ABCD spelling then proceeds in three steps:

  1. High frequency pairs are replaced by ABCD symbols or symbol combinations.  (It seems remarkable that there are few enough of these pairs that one can find readable representations of all of them.)

  2. Certain symbols may be modified or added based on special circumstances of individual words.  This is done either to avoid ambiguity (e.g., to distinguish the th of worth from that of porthole) or to note unexpected violations of English patterns (like the double t in attend or the s at the end of the non-plural atlas).

  3. Any remaining pairs have the second element modified to contain an ABCD code rather than a CAAPR code, except that the CAAPR symbols {} and {&}, which do not have an unambiguous ABCD representation, are retained.

Step 1, and aspects of step 2, can be summarized easily by simply listing the pairs to which they apply, and how they are represented (which I will do below).  But some additional notations are more conveniently described here:

  1. In a number of cases, pairs at the end of a word are handled differently from the same pair within a word.  This is especially true for the silent e, and the letter s when used to indicate a plural or possessive.  Because of English's fondness for compound and derived words, these letters can sometimes occur within a word with the end-of-word interpretation. In ABCD, a plus sign is used to indicate the end of a word within a word.  Examples are scre+crW and stte+ment.  The plus sign is also used to separate double letters when both are sounded, as in un+ntiCed or mis+stte.

  2. A lone ^ is used before a double consonant following a schwa or an unstressed short i, in violation of normal English patterns.  Examples are a^ccommodte, co^rrect and cmpa^ss.  The combinations ck, cqu, dG, dj and tch are treated as double letters here.

  3. Silent letters are enclosed in parentheses, as noted above.  (Other notations are sometimes used for silent e, l and gh, as described below.)

  4. The symbol ~ always indicates that what follows is pronounced differently in British and American English.  Individual letter combinations beginning with ~ are discussed below, together with the notations with no dependence on English variety.

    One property of ABCD is that it is very easily parsed by software - while some letter combinations, such as ch, have meanings distinct from those of their components, there is never (so far as I can determine) any ambiguity in how a word is divided into meaningful units.  I note that this property is preserved even if all the ~'s are removed. Which is to say, the ~'s are there to assist the human reader, but are unnecessary for accurate algorithmic decomposition.
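As a rough illustration of how little machinery the punctuation conventions need, here is a sketch of a structural tokenizer.  It is an assumption-laden outline, not the author's software: the name tokenize and the token labels are hypothetical, and runs of letters are left whole, since splitting them into individual ABCD symbols would require the full symbol tables given below.

```python
import re

# Each alternative is one structural element of an ABCD spelling.
TOKEN = re.compile(
    r"(?P<pair>\[[^\]]*\])"          # explicit [spelling:sound] pair
    r"|(?P<silent>\([^)]*\))"        # silent letters in parentheses
    r"|(?P<boundary>\+)"             # end of a word within a word
    r"|(?P<double_flag>\^)"          # flag before a double letter
    r"|(?P<variety>~)"               # American/British difference
    r"|(?P<letters>[^\[\]()+^~]+)"   # run of ordinary ABCD symbols
)


def tokenize(abcd: str) -> list[tuple[str, str]]:
    """Split an ABCD spelling into (kind, text) tokens."""
    tokens = []
    position = 0
    for match in TOKEN.finditer(abcd):
        if match.start() != position:
            # Something was skipped, e.g. an unbalanced bracket.
            raise ValueError(f"cannot tokenize {abcd!r} at offset {position}")
        tokens.append((match.lastgroup, match.group()))
        position = match.end()
    if position != len(abcd):
        raise ValueError(f"cannot tokenize {abcd!r} at offset {position}")
    return tokens


print(tokenize("han(d)som(e)"))
# [('letters', 'han'), ('silent', '(d)'), ('letters', 'som'), ('silent', '(e)')]
print(tokenize("minCe+meat"))
# [('letters', 'minCe'), ('boundary', '+'), ('letters', 'meat')]
```

Because every marker is a single reserved character or a bracketed group, one left-to-right pass with no backtracking is enough for this level of structure.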

The ABCD Alphabet

Having said all that, I am now ready to run down the alphabet, and produce a complete list of the ABCD notations.  Though the list is quite long and detailed, it is highly structured and organized, notably by the diacritical conventions given above, and for that reason is not hard to grasp and master.  For symbols beginning with a ~, the Denotes column of the tables gives both the American and the British meaning for the symbol: in a/, the a is the American form, and the the British form.

a -

Symbol  Denotes  Example  ABCD Example
a [a:a] or
[a:]
cat
about
cat
about
[a:E] late lte
[a:A] father fther
-lly locally lclly
@ [a:i] message mess@G(e)
ai [ai:E] rain rain
air [air:r] fair fair
ar [ar:r] awkward awkward
r [ar:r] care cre
r [ar:ar] paradox pradox
rr [arr:ar] arrow rrW
au [au:] pause pauZe
aw [aw:] claw claw
ay [ay:E] play play
[a:] water wter
A [ae:I] algae alGA
~ a/ bath b~t
~r r/[ar:r] secretary secrt~ry

See below for [a:o], as in watch (ABCD wOtch).

b -

Symbol  Denotes  Example  ABCD Example
b       [b:b]    big      big
bb      [bb:b]   rubble   rubble

c -


Symbol             Denotes         Example           ABCD Example
c (Note 1)         [c:k] or [c:s]  cat, city         cat, city
cc (Note 1)        [cc:k]          accord            accrd
ck                 [ck:k]          luck              luck
cqu                [cqu:kw]        acquit            a^cquit
cQ                 [cqu:k]         lacquer           lacQer
ch                 [ch:C]          chill             chill
ci (Note 2)        [ci:X]          vicious           viciuS
Ce, C(e) (Note 3)  [ce:s]          advance, furnace  advanCe, furn@C(e)

Notes:

  1. c denotes [c:s] if followed, in the traditional spelling, by e, i or y, and otherwise [c:k].  The few words which do not conform to this pattern must be spelled in ABCD with an explicit [c:k] or [c:s], as in [c:k]eltic or fa[c:s]d(e).  cc denotes [cc:k] unless followed by e, i or y. When it is followed by e, i or y, the pronunciation is ks - this is regarded as 2 c's in succession, rather than a single occurrence of cc.

  2. ci denotes [ci:X] only when followed by a vowel.  Otherwise, the c and the i are distinct symbols.

  3. Ce and C(e) represent [c:s] followed by a silent e, in situations where the silent e is not a magic e, as in advanCe and furnaC(e).  In the case of furnace, the e is misleading about the preceding vowel, and so is parenthesized.  In the case of advance, the previous vowel is too distant in the word to be affected by the e, which serves the useful purpose of defining the pronunciation of the preceding c.
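The default rule in Note 1 can be sketched as a small lookup.  This is a hedged illustration only: the function name default_c_sound is hypothetical, the sounds are returned as the plain letters k and s rather than as CAAPR codes, and the word accept is my own example of the ks case, not one taken from this page.  Words marked with an explicit [c:k] or [c:s] pair, and the ci combination of Note 2, are outside its scope.

```python
SOFTENING_LETTERS = ("e", "i", "y")


def default_c_sound(symbol: str, next_letter: str) -> str:
    """Default sound of the ABCD symbol c or cc, given the letter that
    follows it in the traditional spelling ('' at the end of a word)."""
    if symbol not in ("c", "cc"):
        raise ValueError(f"expected 'c' or 'cc', got {symbol!r}")
    soft = next_letter.lower() in SOFTENING_LETTERS
    if symbol == "c":
        return "s" if soft else "k"
    # cc before e, i or y counts as two c's in succession: k, then s.
    return "ks" if soft else "k"


print(default_c_sound("c", "a"))   # k  (cat)
print(default_c_sound("c", "i"))   # s  (city)
print(default_c_sound("cc", "o"))  # k  (accord)
print(default_c_sound("cc", "e"))  # ks (accept)
```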

See below for [ch:k], as in chrome (ABCD KHrme), and for [ch:X], as in machine (ABCD maSHne).  Also see n below for information on the combinations c and KH as in uncle and anchor.

d -

Symbol       Denotes        Example      ABCD Example
d            [d:d] or [d:]  dog, wanted  d~g, wOntd
dd           [dd:d]         add          add
dG (see G)   [dg:j]         judge        judG(e)
dJ (see J)   [d:j]          procedure    procdJur(e)
ed (Note 1)  [ed:]          missed       missed

Notes:
  1. At the end of a word, ed represents [ed:], that is, a past tense in which the e is silent, and in which the d is pronounced either as t or d, depending on the previous letter.  There are some exceptional words ending with -ed in which the e is surprisingly not silent, such as beloved and wicked - these words are spelled with d in ABCD to prevent ambiguity.

    Note that words like hunted and raided are regular, represented by [e:i][d:], and unambiguously spelled with -d in ABCD.  Also note that the d spelling is unnecessary in one-syllable words, and so bed is bed and not bd in ABCD.

e -

Symbol  Denotes  Example  ABCD Example
e [e:e] or
[e:]
ten
rivet
ten
rivet
e (Note 1) [e:-] late lte
[e:I] medium mdum
(Note 2) [e:i] or

-lss, -nss
enable
erupt
lifeless
fitness
nble
rupt
lfe+lss
fitnss
ea [ea:I] feast feast
ear [ear:r] fear fear
ed (see d) [ed:] missed missed
ee [ee:I] feet feet
eer [eer:r] beer beer
er [er:r] or
[er:&r]
river
revert
river
rvert
r [er:r] cherish chrish
rr [err:r] terrible trrible
es (see s) [es:$] miles mles
eu [eu:U] sleuth sleut
eur
(Endnote 2)
[eur:r] pleurisy pleurisy
u [eu:yU] feud fud
ur
(Endnote 2)
[eur:yr] Europe urop(e)
ew [ew:U] drew drew
w [ew:yU] few fw
(Note 3) [e:I] me
crises
museum
m
crss
mZum
[e:E] ballet
cafe
ball(t)
caf
e [ee:E] matinee
matine
[e:] apostrophe
video
apostroPH
vid
(Note 4) [e:e] or
[e:i] or
[e:]
wicked
duet
diet
wickd
d~t
dt
Er [ear:r] heart hErt
E [ea:e] head
measure
hEd
mEZJur(e)
Er [ear:r] bear bEr
A
(Note 5)
[ea:] idea (Brit) dA
I [ei:I] seize sIze
Ir [eir:r] weird wIrd
I [ei:E] reign rI(g)n
Ir [ei:r] their thIr
ER [ear:&r] earth ERt
r [er:r] here hre
r (Note 6) [er:r] supplier su^pplr
Y [ey:E] survey survY
Y [ey:] money mnY
~Er
(Note 7)
r/[er:r] cemetery cemet~Ery
~U
(Endnote 2)
eu/u neutral n~Utral
~Ur eur/ur neurotic n~Urotic
~W ew/w news n~WZ

Notes:

  1. The handling of silent e in ABCD is complicated.  There are two functions that silent e commonly performs.  It indicates that the previous vowel sound is long, in which case the e is commonly called magic.  Alternatively, in many words, such as mice, savage and tense, it changes the sound of the previous consonant.  (Note that without the final e, tens would be a plural, and the s would be pronounced as z.)  When both functions are taken into account, we can classify words ending with a silent e into 4 categories.  We say a final e is magic if the previous vowel (separated from the e by a single consonant sound) is long.  (If the consonant is an r, the sounds of , and are also treated as long.)  We say a final e is misleading if there is a vowel preceding it which ought to be long, but is not.  In vice, the e is magic, but in service, it is misleading.  In words in which a final e is not magic, we call it useful if it is preceded by c, g or s, and otherwise useless.  An e can be both useful and misleading, as in garbage, and both useless and misleading, as in festive.

    When a silent e occurs at the end of a word, it is enclosed in parentheses if it is misleading or if it is useless.  Also, when a useful (but not magic) e follows the letter c or s, ABCD capitalizes the consonant to show what the e is accomplishing.  Some example words are mne, plce, festiv(e), sav@G(e) and tenSe.  When a magic e occurs within a word and is not parenthesized, it is followed by a +, usually indicating the end of an internal word, as in bre+ly, lfe+boat or minCe+meat.

  2. is used only when [e:i] is unstressed.   is used instead when stressed, as in glish.

  3. is used only when [e:I] appears where a silent e might be expected, at the end of a word (b) or before s (parentess).  Note that is used even in words with no other vowels, such as be, even though it would be impossible for the e to be silent. is also used in words like museum, where use of the usual  would seem to be part of the u digraph.

  4. is used for the regular sound of e when a bare e would be misinterpreted, such as wicked, which looks like a past tense, and duet, where d~et would appear to be a one-syllable word whose vowel is ~e.

  5. The sound of A is an RP diphthong represented in SAMPA as /I@/, which usually occurs before r in words like pier.

  6. r is used like , to prevent ambiguity, as in flr, where a bare e would be treated as part of the composite vowel symbol e.

  7. Note that the distinction between ~r and ~Er is only orthographic - both are pronounced the same in either variety of English.
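The classification of a final silent e in Note 1 can be written out as a few boolean tests.  What follows is a sketch under stated assumptions, not a definitive statement of the rules: the names FinalE and classify_final_e are hypothetical, the caller must supply the phonetic facts (whether a single consonant sound separates the vowel from the e, and whether that vowel is long), and I read "ought to be long" as "is separated from the e by a single consonant sound".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinalE:
    magic: bool
    misleading: bool
    useful: bool
    useless: bool
    parenthesized: bool         # written (e) in ABCD
    capitalize_consonant: bool  # c or s written as C or S


def classify_final_e(prev_consonant: str,
                     single_consonant_before: bool,
                     vowel_is_long: bool) -> FinalE:
    """Classify a silent final e from facts about what precedes it."""
    magic = single_consonant_before and vowel_is_long
    # Assumption: the vowel "ought to be long" when one consonant sound
    # separates it from the e.
    misleading = single_consonant_before and not vowel_is_long
    useful = not magic and prev_consonant in ("c", "g", "s")
    useless = not magic and not useful
    return FinalE(
        magic=magic,
        misleading=misleading,
        useful=useful,
        useless=useless,
        parenthesized=misleading or useless,
        capitalize_consonant=useful and prev_consonant in ("c", "s"),
    )


# tense: ns before the e, so the e is useful but not misleading -> tenSe
print(classify_final_e("s", single_consonant_before=False, vowel_is_long=False))
# festive: useless and misleading -> festiv(e)
print(classify_final_e("v", single_consonant_before=True, vowel_is_long=False))
```

The capital G in sav@G(e) does not come from this rule; it is the ordinary ABCD symbol for [g:j], which is why the sketch capitalizes only c and s.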

See below for [le:L], as in double (ABCD dUble).


f -

Symbol  Denotes  Example  ABCD Example
f       [f:f]    free     free
ff      [ff:f]   stuff    stuff

g -

Symbol          Denotes  Example        ABCD Example
g               [g:g]    good           good
gg              [gg:g]   egg            egg
G (Notes 1, 2)  [g:j]    germ           Germ
GH              [gh:-]   high, taught   hGH, tauGHt
GJ              [g:J]    mirage, genre  mirGJ(e), GJ[e:o]nr

Notes:

  1. Note that the spelling G is used even if the letter following g is unusual, as in margarine (American ABCD mrGarin(e)).

  2. The combination dG, as in edge (ABCD edG(e)), is treated as a double letter.

h -

Symbol      Denotes  Example  ABCD Example
h           [h:h]    hot      hot
H (Note 1)  [h:h]    mishap   misHap

Notes:

  1. Because the letter h is used in a number of digraphs, it is frequently ambiguous when it follows a consonant, as in the words porthole, mishap and rawhide.  ABCD uses a capital H for [h:h] if confusion might be possible, as in prtHle, misHap and rawHde.

i -

Symbol  Denotes  Example  ABCD Example
i [i:i] or
[i:]
pig
devil
pig
devil
[i:Y] item tem
(Endnote 3) [i:] or
[i:]
radio rd
ir [ir:r] or
[ir:&r]
direct
bird
direct
bird
r [ir:ir] miracle mr@cle
rr [irr:ir] mirror mrror
[i:I] marine marne
(Note 1) [e:i] pretty prtty
I [ie:I] brief brIf
Ir [ier:r] pier pIr
E [ie:Y] pie pE
E [ie:] cookie cookE
~ (Note 2) i/ missile
civilization
miss~le
civil~ztion

Notes:

  1. is used for [e:i] only when stressed; when unstressed, is used.

  2. Note that the ending e in miss~le is not parenthesized - it is misleading in American English, but magic in British English.

The letter i also occurs in the combinations ci, si, sci, ssi, ti and Zi, where it has no sound of its own, but modifies the sound of the preceding consonant.

See below for [i:y], as in billion (ABCD billYon).

j -

Symbol      Denotes   Example  ABCD Example
j           [j:j]     jam      jam
jj          [jj:j]    hajj     hajj
J (Note 1)  see note  capture  captJur(e)

Notes:

  1. The capital J is inserted as a sign of palatalization in the combinations dJ (in procedure), sJ (in insure), ssJ (in pressure), tJ (in capture and question), and ZJ (in measure).  More precisely, it is used in representing the pairs [d:j] (dJ), [s:X] (sJ), [ss:X] (ssJ), [t:C] and [ti:C] (tJ) and [s:J] (ZJ).  The symbol J also appears in the combination GJ, described under g.

    (Note that there is no ambiguity between the t and ti spellings corresponding to tJ - an i was present in the original spelling exactly if the letter after the J is not a u.)
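The rule in the parenthetical note above can be applied mechanically to recover the letters behind tJ.  This is a minimal sketch with a hypothetical function name, restore_t_or_ti; it undoes only the tJ step and leaves every other ABCD marking, such as parentheses and capitals, untouched.  Stripping diacritics from the following letter before comparing it with u is my own assumption about how marked vowels should be treated.

```python
import re
import unicodedata


def restore_t_or_ti(abcd: str) -> str:
    """Replace each tJ by 't' if the next letter is a u, otherwise by 'ti'."""

    def expand(match: re.Match) -> str:
        following = match.group(1)
        # Compare the base letter, ignoring any diacritic and letter case.
        base = unicodedata.normalize("NFD", following)[:1].lower()
        return ("t" if base == "u" else "ti") + following

    return re.sub(r"tJ(.?)", expand, abcd, flags=re.DOTALL)


print(restore_t_or_ti("captJur(e)"))  # captur(e)
print(restore_t_or_ti("questJon"))    # question
```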

k -

Symbol  Denotes  Example  ABCD Example
k       [k:k]    skin     skin
KH      [ch:k]   school   sKHol

The combination ck is treated as a double k - see c above.

See n below for information on the combination k.

l -

Symbol      Denotes  Example  ABCD Example
l           [l:L]    leg      leg
ll          [ll:L]   pill     pill
le          [le:L]   purple   purple
L (Note 1)  [l:-]    calm     cLm

Notes:

  1. L represents a silent l following the letter a, as in talk, salmon and calm.  This has a special representation for no reason other than that it is surprisingly frequent.

m -

Symbol  Denotes  Example  ABCD Example
m [m:m] mud mud
m [m:m] spasm spaZm
mm [mm:m] hammer hammer

n -

Symbol  Denotes  Example  ABCD Example
n [n:n] nice nce
n [n:n] didn't didnt
nn [nn:n] sunny sunny
ng [ng:G] song s~ng
(Note 1) [n:G] finger
sink
figer
sik
N (Note 2) [n:n] ungrateful uNgrte+ful

Notes:

  1. can be used before any of the various symbols representing or starting with the k sound, as in ucle, aKHor, baquet, coQer and jix.

  2. N represents [n:n] when the regular n sound is followed by g, as in ungrateful.  N is not needed preceding k sounds - unclean is simply spelled unclean in ABCD.

o -

Symbol  Denotes  Example  ABCD Example
o [o:o] or
[o:]
pot
lemon
pot
lemon
[o:O] zero zr
[o:] coral
sloth (Amer)
cral
slt
oa [oa:O] boat boat
oar [oar:r] boar boar
oe [oe:O] toe toe
oer [oer:r] Boer boer
oi [oi:Q] boil boil
oo [oo:V] book book
o [oo:U] boot bot
or
(Endnote 2)
[oor:r] poor por
or [or:r] motor
decorate
mtor
decorte
r [or:or] laboratory
(Brit)
labratory
rr [orr:or] sorry srry
ou [ou:W] house house
u -uS vicious viciuS
ow [ow:W] allow a^llow
oy [oy:Q] boy boy
O [a:o] squash squOsh
[o:u] mother mther
OR (Note 1) [our:r] favour fvOR
r [or:&r] word wrd
O [ou:U] soup sOp
Or
(Endnote 2)
[our:r] tour tOr
Ur [our:r] court cUrt
U [ou:u] trouble trUble
W [ow:O] blow blW
~ / cross
forest
cr~ss
f~rst
~r r/[or:r] category catg~ry

Notes:

  1. I chose to use OR rather than ur here because almost all -our words have an American equivalent spelled with -or.

p -

Symbol  Denotes  Example  ABCD Example
p       [p:p]    pink     pik
pp      [pp:p]   happy    happy
PH      [ph:f]   photo    PHt

q -

Symbol  Denotes   Example  ABCD Example
qu      [qu:kw]   queen    queen
Q       [qu:k]    unique   nQe

See n above for the combinations qu and Q, as in baqut and coQer.

r -

Symbol      Denotes  Example  ABCD Example
r (Note 1)  [r:r]    red      red

Notes:

  1. The letter r indicates [r:r] after a consonant or at the start of a word. When r follows a vowel, it generally forms a digraph or trigraph with that vowel.  The possibilities are described with the individual vowels.

s -

Symbol  Denotes  Example  ABCD Example
s (Note 1) [s:s] or
[s:$]
sad
cries
sad
crEs
ss [ss:s] guess g(u)ess
sc, sC (Note 2) [sc:s] scent
acquiesce
scent
acquesC(e)
sci (Note 3) [sci:X] luscious lusciuS
sh [sh:X] ship ship
si (Notes 3, 4) [si:X] mansion mansion
sJ (see J) [s:X] insure insJre
ssi [ssi:X] mission mission
ssJ (see J) [ss:X] pressure pressJur(e)
S (Note 5) [s:s] atlas
cactus
tense
atlaS
cactuS
tenSe
SH [ch:X] machine maSHne


Notes:

  1. At the end of a word (or before a +) s is assumed to indicate a plural, in which case, depending on the preceding sound, it may be pronounced as z.  The plural s often follows a silent e - however, in contrast to the past tense, where the d is always preceded by e, a silent e in the plural generally implies its presence in the singular as well.

  2. sc denotes [sc:s] preceding e, i or y.  In any other position, it is simply the juxtaposition of the regular s and c (pronounced as k) symbols.  The C may be capitalized to indicate a following non-magic e.

  3. si, sci, ssi and ti have the sound of {X} only when followed by a vowel.  Otherwise, the i is a separate symbol.

  4. When si or ti follows n, there are two common pronunciations: nch and nsh.  The CAAPR dictionary, from which the ABCD dictionary is derived, uses nsh as the recognized pronunciation, which is more in line with the pronunciation of si and ti in other positions.

  5. S represents [s:s] at the end of a word, where it might be mistaken for a plural.  S is also used before a silent e, where the e prevents the word from being interpreted as a plural.  See e note 1 above for more details.

See z below for [s:z] (except in plurals) as in hose (ABCD hZe).

t -

Symbol  Denotes  Example  ABCD Example
t [t:t] top top
tt [tt:t] kitten kitten
th [th:D] that
leather
that
lEther
t [th:T] think
truth
tik
trt
ti (see s
Notes 3, 4)
[ti:X] vocation vction
tJ (see J) [t:C] or
[ti:C]
capture
question
captJur(e)
questJon

u -

Symbol  Denotes  Example  ABCD Example
u [u:u] or
[u:]
sun
circus
sun
circus
(Note 1) [u:yU] or
[u:yV]
puny
annual
pny
annal
(Note 1) [u:U] or
[u:V]
lunar
gradual
lnar
gradJal
-flly awfully awflly
[u:yV] or
[u:y]
regular reglar
e [ue:yU] cue ce
er
(Endnote 2)
[uer:yr] puerile (Brit) per~le
e [ue:U] true tre
ur [ur:r] or
[ur:&r]
Arthur
creature
burn
rtur
creatJur(e)
burn
urr (Note 2) [urr:&r]/
[urr:ur]
hurry hurry
r
(Endnote 2)
[ur:yr] purity prity
r
(Endnote 2)
[ur:r] plural plral
r [ur:yVr] or
[ur:yr]
accurate accr@t(e)
[o:U] move mve
[u:V] or
[u:]
push
prejudice
psh
prejdiC(e)
~ / student st~dent
~e e/e Tuesday t~eZd[ay:y]
~r
(Endnote 2)
r/r durable d~rable
~ /y insulation ins~ltion

Notes:

  1. The symbols and ordinarily represent the long vowel /u:/, but they represent /u/ (which is rendered in CAAPR as {V}) before a vowel.

  2. urr is the only instance of an ABCD notation without a ~ which is interpreted differently for American and British English, but this seems reasonable, since TS exhibits this variance itself.

v -

Symbol  Denotes  Example  ABCD Example
v       [v:v]    very     vry
vv      [vv:v]   savvy    savvy

w -

Symbol       Denotes  Example  ABCD Example
w            [w:w]    way      way
wh           [wh:]    which    which
W (Note 1)   [w:w]    away     aWay
Wh (Note 1)  [wh:]    awhile   aWhle

Notes:

  1. When the consonant w follows an a, e or o, confusion with a vowel digraph is possible, in which case the w is spelled with a capital letter.  This results in spellings like aWay, bWre and mcrWve.   This is also possible with the wh digraph, as in aWhle and nWh[er:air]e.

x -

Symbol       Denotes  Example  ABCD Example
x            [x:ks]   fix      fix
xc (Note 1)  [xc:ks]  except   xcept
X            [x:gz]   exist    Xist

Notes:

  1. xc stands for [xc:ks] only preceding e, i or y.  Otherwise, it is simply an x followed by a c, as in excavate.
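The xc rule is one of the few places where a symbol's reading depends on what follows it. A minimal Python sketch of such a lookahead check is given here; the function name is hypothetical, and the assumption that a following e, i or y may carry a diacritic (which is stripped before comparison) is mine rather than something this page states.

```python
import unicodedata

def xc_is_digraph(word, i):
    """Return True if word[i:i+2] is "xc" and the next letter is e, i or y.

    Assumption: the following vowel may carry a diacritic, so it is
    decomposed (NFD) and only its base letter is compared.
    """
    if word[i:i + 2] != "xc" or i + 2 >= len(word):
        return False
    base = unicodedata.normalize("NFD", word[i + 2])[0]
    return base in "eiy"

print(xc_is_digraph("except", 1))    # xc before e: one symbol
print(xc_is_digraph("excavate", 1))  # xc before a: x followed by c
```

The same shape of check would serve for the other lookahead rules, such as Zi being a single symbol only before a vowel, given a suitable definition of "vowel".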

See n above for information on the combination x, as in jix.

y -

Symbol      Denotes         Example            ABCD Example
y (Note 1)  [y:y] or [y:]   yes, Tokyo         yeS, tky
y (Note 1)  [y:]            happy, everything  happy, ev(e)ryting
            [y:Y]           fly, qualify       fl, quOlif
e           [ye:Y]          dye                de
            [y:i]           myth               mt
Y           [i:y]           million            millYon
Ÿ (Note 1)  [y:]            lobbyist           lobbŸist

Notes:

  1. The ABCD symbol y may indicate either a consonant or vowel sound. As a consonant, it denotes [y:y].  As a vowel, it denotes [y:]. The vowel sound occurs at the end of a word or before a consonant, and the consonantal sound occurs at the beginning of a word. Before a vowel, either sound may occur.  Usually, when y is found after a consonant and before a vowel, the corresponding pair is [y:], indicating that both the consonant and the vowel pronunciation are possible.  In this position, a consonantal pronunciation is assumed - if only a vowel pronunciation is used, then the spelling should be Ÿ.  See Endnote 3 for further discussion of the ambiguous letter y and its sounds.

A previous version of ABCD used rather than for long i at the end of a multi-syllable word like reply.  This distinction has been dropped, as it did not seem particularly valuable.

z -

Symbol       Denotes  Example  ABCD Example
z            [z:z]    zoo      zo
zz           [zz:z]   buzz     buzz
Z            [s:z]    hose     hZe
Zi (Note 1)  [si:J]   vision   viZion
ZJ (see J)   [s:J]    measure  mEZJur(e)

Notes:

  1. Zi denotes [si:J] only when followed by a vowel.  Otherwise, the Z and the i are distinct symbols.

Unusual sounds -

As noted, the ABCD spelling notation provides unique codes for high-frequency spelling patterns.  Of course, as we all know, English is afflicted with a sizable number of words that break these patterns.  ABCD handles these words by means of bracketed symbol pairs, for instance, [eau:w] in beautiful.  The eau is the letter sequence in the usual spelling, and the w defines the sound (but not the spelling).  Obviously, this representation is not unique: [eau:] or [eau:yo] could have been written instead.

Almost all sounds of English have at least one high-frequency spelling, and so there is at least one ABCD spelling that can be used in such pairs for those sounds.  But a few sounds, mostly from words of foreign origin, are so low-frequency that there is no standard ABCD notation for them.  An example is the final sound of the word loch, when pronounced in the authentic Scottish way.  ABCD therefore must assign representations to these sounds, so that these words can be rendered sensibly in ABCD.  For instance, the /x/ sound of loch is given the ABCD spelling of QH, and so the word is spelled lo[ch:QH] in ABCD.
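Because a bracketed pair always has the form [spelling:sound], it is easy to pick out mechanically. The following Python sketch illustrates this under stated assumptions: the function names are hypothetical, the pattern assumes pairs are never nested, and only the pairs themselves are resolved (other ABCD symbols, such as Z or a parenthesized silent e, are left untouched).

```python
import re

# One bracketed pair: "[", the usual spelling, ":", the sound, "]".
# Assumption: pairs are never nested and the spelling contains no colon.
PAIR = re.compile(r"\[([^\[\]:]*):([^\[\]]*)\]")

def bracket_pairs(abcd):
    """List the (spelling, sound) halves of every bracketed pair."""
    return [(m.group(1), m.group(2)) for m in PAIR.finditer(abcd)]

def resolve_pairs_to_spelling(abcd):
    """Replace each bracketed pair by its spelling half only."""
    return PAIR.sub(lambda m: m.group(1), abcd)

print(bracket_pairs("lo[ch:QH]"))              # [('ch', 'QH')]
print(resolve_pairs_to_spelling("lo[ch:QH]"))  # loch
print(bracket_pairs("c[our:~OOr]er"))          # [('our', '~OOr')]
```

Recovering the full traditional spelling or pronunciation would additionally require a table mapping every other ABCD symbol to its spelling and sound, which this sketch does not attempt.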

This table catalogs the representations of unusual sounds (and one uncommon American/British difference):

Symbol          Denotes (SAMPA)  Example          ABCD Example
                /A~/             melange          ml[an:]GJ(e)
                /O~/             concierge        c[on:]c[er:air]GJe
QH              /x/              loch             lo[ch:QH]
UH              /V~/             uh-huh           UHhUH
& (Note 1)      /3/              masseuse (Brit)  mass[eu:&]Z(e)
~OOr (Note 2)   or/oor           courier          c[our:~OOr]er

Notes:

  1. The CAAPR {&} symbol is normally used before the letter r, as in SH[au:]ff[eur:&r], to indicate the vowel sound of fur.  There are a few borrowed French words such as masseuse which, in British English, are pronounced using this vowel without an r.  The British pronunciation of masseuse is represented as mass[eu:&]Z(e) in ABCD.

  2. The ABCD spelling ~OOr corresponds to the CAAPR spelling {Vr}, used for words such as courier and hooray.  In American English, {Vr} is regarded as synonymous with {r}, spelled in ABCD as or.  In British English, however, {Vr} and {r} are different sounds, and {Vr} is symbolized in ABCD as (unaccented) oor.  See Endnote 2 for more detail.

Endnotes

I. CAAPR as used in ABCD

Completely pure CAAPR is not used here.  Certain simplifications have been introduced to remove distinctions not relevant to this project.  In particular,

  1. The indistinct i, CAAPR {}, is treated as identical to the short i ({i}).

  2. The CAAPR symbol {} is treated the same as {}, and the symbols {}, {3}, {} and {} are treated as synonymous with {}, and therefore with {i}.

  3. The symbol {} is treated as identical to {r}, and {R} as identical to {r}.

  4. The {*} symbol is removed.


Also, some aspects of ABCD depend on stress.  Sometimes, when stress differs between British and American English, it will happen that the ABCD spelling is based on a compromise between the two.  A good example is the word electronic.  The American CAAPR for this word is {iLektro'nik}, while the British CAAPR is {iLektro'nik}.  The conversion to ABCD is done on the composite form {iLektro'nik}, leading to the ABCD spelling lectronic, which does not accurately reflect the American pronunciation.  I have edited the ABCD dictionary to correct this particular instance, but it is likely that other examples of the same problem still exist.

II. R spellings, especially with u

ABCD utilizes a number of spellings that imply the equivalence of a short sound followed by an r to a related long sound followed by r.  Examples are the spellings air, eer and oar, which logically ought to be pronounced as r, r and r, but are actually pronounced as r, r and r respectively.  This implied equivalence is also reflected in the common use of the magic e in words like care, sphere and sore.

The most difficult case has to do with the vowels represented in CAAPR as {Vr} and {r}.  In American English, both {Vr} and {r} symbolize the same sound, represented in SAMPA as /Ur/, while for British English {r} represents the diphthong /U@(r)/.  I note that {r} is quite common in RP, while {Vr} occurs in only a few words, notably guru and courier. It turns out to be extraordinarily convenient to represent {yr}/{r} by the long vowel symbols r and r, as in cre and plral.  Furthermore, though American and British dictionaries quite consistently show this sound as {Vr}, most of the participants in the Saundspel group feel that {Ur} (SAMPA /u:r/) is more accurate.  For these reasons, {r} is consistently shown with a long vowel.  For instance, por is used rather than poor.  However, when the sound is understood as {Vr} in British English, it is represented as a short sound there.  The word guru is spelled g[ur:~OOr] in ABCD, representing gr in American English, but gr in British English.

III. The ambiguity of y

CAAPR utilizes the symbol {y} for the consonant sound of the letter y (as in young), and {} for the vowel sound (as in happy).  But there is a third possibility, a quite common one, represented by {}.  {} represents a sound that can be either {y} or {}, varying by speaker.  Most words like champion and warrior, in which i is followed by an unstressed vowel, are of this sort.  Some words in which y is followed by a vowel, such as Tokyo and Libyan, are also of this sort.  The ABCD approach for dealing with words containing this ambiguity is to spell them with the existing letter.  Thus, champion is spelled champon, implying a vowel sound, even though the consonant sound is no doubt more common, and similarly, the spelling libyan is used, implying a consonant sound for the y, even though the word is probably more commonly pronounced with a vowel there.  The symbols Y and Ÿ can be used for words like spanYard and lobbŸist, where the pronunciation is unequivocally different from what one might expect.

Appendix - Context-Dependent Elements of ABCD

ABCD represents pronunciation and traditional spelling in an almost context-free way, which is to say that the interpretation of its symbols usually does not depend on their context.  For instance, the sequence SH always represents the sound of {X} and the spelling ch, regardless of where it occurs in a word, or what other symbols are adjacent.  For a computer program to understand ABCD, it is mostly necessary simply to divide the text into symbols.  Some letters are used in more than one symbol (for instance, the letter H occurs in the symbols H, GH, KH, PH, QH, SH and UH), but the rule is that each letter is contained in the longest possible symbol, so that SH will always represent SH, and never S followed by H.
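The longest-possible-symbol rule amounts to greedy, longest-match tokenization. Here is a minimal Python sketch of the idea; the symbol set is a deliberately tiny, hypothetical subset (the H-containing symbols listed above plus a few single letters), and bracketed pairs and diacritics are not handled.

```python
# Hypothetical partial inventory: the H-containing symbols named in the text
# plus a few single letters. The real ABCD inventory is much larger.
SYMBOLS = {"H", "GH", "KH", "PH", "QH", "SH", "UH",
           "S", "U", "h", "l", "o", "e"}
MAX_LEN = max(len(s) for s in SYMBOLS)

def tokenize(text):
    """Split text into symbols, always taking the longest symbol that fits."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in SYMBOLS:
                tokens.append(candidate)
                i += size
                break
        else:
            # Character not in the inventory: pass it through unchanged.
            tokens.append(text[i])
            i += 1
    return tokens

print(tokenize("UHhUH"))  # ['UH', 'h', 'UH']
print(tokenize("SHoe"))   # ['SH', 'o', 'e']; SH is never split into S and H
```

Here "UHhUH" is the ABCD spelling of uh-huh given on this page, while "SHoe" is only an illustrative input string.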

There are, however, a small number of symbols whose interpretation is dependent on context.  These context dependencies are found in regular English spelling, and the familiarity benefits of adopting them in ABCD more than offset the additional complexity of context dependence.  The context-dependent elements of ABCD are of two sorts, positional and general.  The positional elements are as follows:

The other context-dependent elements of ABCD may occur anywhere within a word, as follows:

Appendix - An Unambiguous ABCD

As I mentioned earlier, ABCD is an ambiguous system.  The five unmarked vowel letters, as well as and , may denote either the schwa or a short vowel.  This ambiguity can be remedied without losing the readability of ABCD.  I'm not sure this is a change for the good, as it requires many more diacritics, while the benefits are small unless one considers this distinction important even in an orthography intended to be very similar to TS.  Nevertheless, here's how it is done.

The short vowel sounds of a, e, i and o are denoted by the vowel with a dieresis, in the way in which the dieresis is already used preceding r.  This gives rise to very precise spellings like mbidxtruS, hppoptamuS and slctvity.  The sounds of u require a more serious reorganization, due to the use of for both the {y} and {yV} sounds.  The table below shows how it could be done.

Sound  Ambiguous ABCD  Unambiguous ABCD  Ambiguous Example  Unambiguous Example
{} u u campus cmpus
{u} u cut ct
{V} psh psh
{y} accr@t(e) ccr@t(e)
{yV} refGee rfGee
{U}/{yU} ~ | d~ty d|ty
{V}/{yV} ~ | d~rtion d|rtion
{}/{V} ~U instrment nstr~Ument
{y}/{yV} monment mnment

One other ambiguity that must be resolved is between the unstressed {r} and the stressed {&r}, which can both be spelled by er, ir or ur.  An obvious fix here is to use eR, iR and uR for the stressed sound, leading to spellings such as fiRstmeRGency and muRder.  (And also, r should be changed to R, for consistency, as in wRt.)

In some ways, the unambiguous system is a better arrangement, since is compatible with the other uses of dieresis, and the resemblance of the symbol | to the letter I may be mnemonic. Nevertheless, I think the number of diacritics required in the unambiguous system makes it inferior to the slightly simpler ambiguous one.  Certainly, the ambiguity of ABCD is not an issue for my planned uses of it.

The same process that generates the ambiguous ABCD dictionary could equally well generate an unambiguous version.  I am not at this time offering it for download, but if you have some use for it, please contact me (Alan at wyrdplay.org), and I'll be happy to provide a copy.

