I have now used the FEWL word list and methodology to
generate dictionaries and supporting programs for 8 reformed spelling
systems: my own Arbdot, FLOSS, WLM and WMM, as
well as Paul Stought's Reed-Riet (an earlier version of Loosud)
and Bob Boden's Bobdot
and SRS4g.
I have used Python as the
implementation language; other languages could be used instead, of
course, but I prefer Python for this sort of thing. As an
example, the programming support for SRS4g is available as a download
(see the bottom of this page).
Whenever I implement a conversion to a new system, I
develop three programs. In the case of SRS4g, these were fl2s4g.py, xl2s4g.py and babble4g.py. fl2s4g.py is the dictionary
conversion program, reading from the FEWL dictionary, and generating
the SRS4g dictionary as its output. xl2s4g.py is the text conversion
program, which reads a regular text and the SRS4g dictionary, and
produces as output the text translated into SRS4g, to the best of its
abilities. babble4g.py
is a program which randomly samples the SRS4g dictionary, and writes
the results to a window. This is the best method I have found for
verifying that a conversion has been performed successfully, other than
reading the dictionary from beginning to end. The latter is more
effective, of course, but one's eyes glaze over after a while, and one
often ends up barely glancing at the s and t words.
The dictionary conversion program works in three phases. The first phase simplifies the FEWL by discarding unnecessary information. Often the stress markings or prefix/suffix markings of FEWL are not needed for a particular scheme, in which case they should be dispensed with immediately. This phase is also useful for global transformations, such as forcing consistent non-phonemic spellings for suffixes, or correcting systematic disagreements with the FLEWSY pronunciations. It is also a good place to deal with the FLEWSY * and ? notations, which indicate multiple possible pronunciations, of which one is normally chosen. (Note, however, that one always has the option of allowing the ?s to remain in the output dictionary, for manual correction. This option is recommended for systems whose inventors have strong opinions about what pronunciations are correct.)
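As a rough illustration of this kind of first-phase cleanup, here is a minimal sketch. It does not reproduce the real FEWL or FLEWSY notation: the stress marks, the `+` affix boundary, the use of `?` as a separator between alternative pronunciations, and the function name `simplify_entry` are all assumptions made up for the example.

```python
# Hypothetical markers; the real FEWL/FLEWSY notation may well differ.
STRESS_MARKS = "'`"      # assumed stress markings
AFFIX_MARK = "+"         # assumed prefix/suffix boundary marking
ALT_SEPARATOR = "?"      # assumed separator between alternative pronunciations


def simplify_entry(pron, keep_alternatives=False):
    """Phase 1 sketch: discard information the target scheme does not need."""
    for mark in STRESS_MARKS + AFFIX_MARK:
        pron = pron.replace(mark, "")
    if keep_alternatives:
        # Leave the alternatives in the output for manual correction later.
        return pron
    # Otherwise choose one pronunciation; here, simply the first one listed.
    return pron.split(ALT_SEPARATOR)[0]
```

Calling `simplify_entry(entry, keep_alternatives=True)` corresponds to the option of leaving the ambiguity markers in the output dictionary for the system's inventor to resolve by hand.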
The second phase of the program is the mainline conversion. It performs an almost-context-free, character-by-character transformation of each input word. This may be quite complex if the rules of the target system are also complex. I try to put almost all processing into the second phase, except for the removal of irrelevant information. When a particular transformation proves too difficult to manage easily, I then consider moving some or all of the work to the other phases.
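One way such a character-by-character pass might be organised is as a table-driven, longest-match scan, sketched below. The rule table is invented purely for illustration and is not the SRS4g rule set (nor that of any other system mentioned here); the names `RULES` and `convert_word` are likewise hypothetical.

```python
# Illustrative rule table only; these are NOT the real rules of any scheme.
RULES = {
    "ch": "tc",   # a two-character input symbol, tried before the single "c"
    "c": "k",
    "x": "ks",
}
MAX_KEY = max(len(key) for key in RULES)


def convert_word(word):
    """Phase 2 sketch: scan left to right, replacing the longest matching symbol."""
    out = []
    i = 0
    while i < len(word):
        for size in range(MAX_KEY, 0, -1):
            chunk = word[i:i + size]
            if chunk in RULES:
                out.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            # No rule applies: copy the character through unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)
```

A table like this keeps the simple cases declarative. Rules that depend on neighbouring characters would need extra logic inside the loop, or could be deferred to another phase.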
The third phase of the program simplifies the output, for instance by introducing common abbreviations of phoneme sequences. This is also where details like the insertion of apostrophes, hyphens and capital letters are best handled. Most of my conversions have had a very short third phase. The big exception was WLM, where most of the complex rules for the representation of syllabic liquids were delayed until phase 3, phase 2 having generated a more verbose but simpler representation.
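A short third phase could look something like the following sketch. The abbreviation rules and the `proper_noun` flag are assumptions invented for the example; they are not taken from WLM, SRS4g, or any other scheme.

```python
# Invented abbreviation rules for illustration; not taken from any real scheme.
ABBREVIATIONS = [
    ("ks", "x"),    # abbreviate a common phoneme sequence
    ("kw", "q"),
]


def tidy_output(word, proper_noun=False):
    """Phase 3 sketch: shorten common sequences, then restore capitalization."""
    for sequence, short in ABBREVIATIONS:
        word = word.replace(sequence, short)
    if proper_noun:
        # Capitalize only the first letter, leaving the rest untouched.
        word = word[:1].upper() + word[1:]
    return word
```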
The example program, fl2s4g.py,
is fairly simple, because SRS4g is itself a relatively simple
system. Most systems, such as Reed-Riet or WLM, have more
complicated rules, but the more complicated the target is, the more
specific the dictionary conversion program tends to be, making it less
useful as a general example.
One characteristic of these converters should be mentioned
here. There are several ways in which it may prove impossible to
convert a word. Some of these are:
The word (or the root word) may not be present in the dictionary. In this case, the word is enclosed, unchanged, in angle brackets to indicate this.
The word may be upper-case, as AIDS or NATO, or as in a title. The conversion programs I have written do not attempt to convert these words, as they are likely to be acronyms, which may need special handling. NAYTO, for instance, is no longer a very successful acronym, even if the spelling does better indicate the pronunciation.
The word may have multiple spellings. For instance, in WLM, the word <desert> could be either the noun (dezrt) or the verb (dizurt). My converters do not perform grammar analysis or any other technique for removal of such ambiguities. Instead, all the options are shown, enclosed in square brackets, e.g., [dezrt/dizurt]. It is intended that converter output be inspected by a human being, for removal of such ambiguities, before any use or publication. It would be possible, in some cases, to eliminate some choices without a lot of work (if the source word is <deserting>, then since nouns don't have participles, the correct WLM transliteration must be dizurtinq), but I have no plans to implement this myself.
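The three cases above can be sketched as follows. This is a simplified illustration, not the logic of xl2s4g.py itself: the in-memory dictionary layout, the function names, and the spelling "dhu" are assumptions, and the sketch ignores root-word lookup and the restoration of capital letters. Treating only all-capital words of more than one letter as likely acronyms is also an assumption.

```python
import re

# A tiny stand-in dictionary mapping each word to its candidate spellings.
# "dhu" is a made-up entry; dezrt/dizurt are the WLM forms quoted above.
SAMPLE_DICT = {
    "the": ["dhu"],
    "desert": ["dezrt", "dizurt"],
}

WORD_RE = re.compile(r"[A-Za-z]+")


def transliterate_word(word, dictionary):
    """Convert one word, flagging anything that cannot be converted cleanly."""
    # Upper-case words are probably acronyms, so leave them alone.
    if len(word) > 1 and word.isupper():
        return word
    spellings = dictionary.get(word.lower())
    if spellings is None:
        # Not in the dictionary: pass through unchanged, in angle brackets.
        return "<" + word + ">"
    if len(spellings) > 1:
        # Ambiguous: show every option for a human to choose from.
        return "[" + "/".join(spellings) + "]"
    return spellings[0]


def transliterate_text(text, dictionary):
    """Convert every alphabetic word in the text, leaving punctuation alone."""
    return WORD_RE.sub(
        lambda match: transliterate_word(match.group(0), dictionary), text
    )
```

With the sample dictionary, `transliterate_text("The desert of NATO zebras.", SAMPLE_DICT)` yields `dhu [dezrt/dizurt] <of> NATO <zebras>.`, leaving the bracketed items for manual review.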
I note that it would be straightforward to use xl2s4g.py as the basis for an SRS4g
to TS converter (which I would want to call s4g2ts.py). There has been no
demand so far for such a program, but it would really be nothing more
than xl2s4g.py with the
polarity reversed.
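As a sketch of what reversing the polarity might involve, the forward dictionary can be inverted so that each reformed spelling maps back to the TS word or words that produce it. The dictionary layout and the name `invert_dictionary` are assumptions for illustration only; no such program is included in the download.

```python
from collections import defaultdict


def invert_dictionary(forward):
    """Build a reformed-spelling -> TS-word(s) mapping from a forward dictionary.

    `forward` is assumed to map each TS word to a list of reformed spellings.
    """
    reverse = defaultdict(list)
    for traditional, spellings in forward.items():
        for spelling in spellings:
            if traditional not in reverse[spelling]:
                reverse[spelling].append(traditional)
    return dict(reverse)
```

Any reformed spelling shared by more than one TS word ends up with several entries in the reverse mapping, so a reverse converter would presumably need the same square-bracket treatment of ambiguities as the forward one.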
The random word listing program, in this case babble4g.py, is completely
cookie-cutter. It should be possible to take this program, modify
a few constants, and have a program which will work for any target
system. It would be easy to parameterize the program, to allow
specification of the target system on the command line. This
would be useful only for someone who has more than one scheme and is
comfortable enough with command-line programs to prefer an option to a
different command name for each system.
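A parameterized version might look roughly like the sketch below. It is not babble4g.py: it prints to standard output rather than a window, it assumes a dictionary file with one entry per line, and the function names and the `-n`/`--count` option are hypothetical.

```python
import argparse
import random


def sample_entries(entries, count, rng=random):
    """Return up to `count` entries chosen at random, without repeats."""
    return rng.sample(entries, min(count, len(entries)))


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Print a random sample of a spelling-system dictionary."
    )
    # The dictionary path selects the target system (assumed: one entry per line).
    parser.add_argument("dictionary", help="path to the dictionary file to sample")
    parser.add_argument("-n", "--count", type=int, default=20,
                        help="number of entries to show")
    args = parser.parse_args(argv)
    with open(args.dictionary, encoding="utf-8") as handle:
        entries = [line.rstrip("\n") for line in handle if line.strip()]
    for entry in sample_entries(entries, args.count):
        print(entry)
```

Saved as a script that ends with a call to `main()`, this could then be run against any system's dictionary file by giving its path as the argument, optionally with `-n` to change the sample size.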
To download the SRS4g sample programs, click here. They are combined into
a single .zip file. Note that the output of the example fl2s4g.py program is somewhat
different from the official SRS4g dictionary,
as Bob Boden, the inventor of SRS4g, has corrected many of the
automatically generated dictionary entries to better reflect "citation
pronunciation".
If you discover bugs or other problems with the example
programs, I would appreciate being informed. The distributed
versions have been cleaned up and slightly modified for publication,
and it is possible that errors crept in during the process.
To comment on this page,
e-mail Alan at wyrdplay.org