FEWL Programming Tips and Examples



I have now used the FEWL word list and methodology to generate dictionaries and supporting programs for 8 reformed spelling systems: my own Arbdot, FLOSS, WLM and WMM, as well as Paul Stought's Reed-Riet (an earlier version of Loosud) and Bob Boden's Bobdot and SRS4g.  I have used Python as the implementation language; other languages could be used instead, of course, but I prefer Python for this sort of thing.  As an example, the programming support for SRS4g is available as a download (see the bottom of this page).

Whenever I implement a conversion to a new system, I develop three programs.  In the case of SRS4g, these were fl2s4g.py, xl2s4g.py and babble4g.py.  fl2s4g.py is the dictionary conversion program, reading from the FEWL dictionary, and generating the SRS4g dictionary as its output.  xl2s4g.py is the text conversion program, which reads a regular text and the SRS4g dictionary, and produces as output the text translated into SRS4g, to the best of its abilities.  babble4g.py is a program which randomly samples the SRS4g dictionary, and writes the results to a window.  This is the best method I have found for verifying that a conversion has been performed successfully, other than reading the dictionary from beginning to end.  The latter is more effective, of course, but one's eyes glaze over after a while, and one often ends up barely glancing at the s and t words.

Dictionary conversion


The dictionary converter, in this case fl2s4g.py, is always the most difficult of the three programs to write.  Its job is to transform the FEWL phonemic and morphological encoding for each dictionary word into the target system.  I have found it useful, in most cases, to divide the processing into three phases:
  1. The first phase of the program simplifies the FEWL, by discarding unnecessary information.  Often the stress markings or prefix/suffix markings of FEWL are not needed for a particular scheme, in which case they should be dispensed with immediately.  This phase is also useful for global transformations, such as forcing consistent non-phonemic spellings for suffixes, or correcting systematic disagreements with the FLEWSY pronunciations.  It is also a good place to deal with the FLEWSY * and ? notations, indicating multiple possible pronunciations, of which one is normally chosen.  (Note however that one always has the option of allowing the ?s to remain in the output dictionary, for manual correction.  This option is recommended for systems whose inventors have strong opinions about what pronunciations are correct.)

  2. The second phase of the program is the mainline conversion.  It performs an almost-context-free, character-by-character transformation of each input word.  This may be quite complex, if the rules of the target system are also complex.  I try to put almost all processing into the second phase, except for the removal of irrelevant information - when a particular transformation proves too difficult to manage easily, I then consider moving some or all of the work to the other phases.

  3. The third phase of the program performs simplification of the output - for instance, introducing common abbreviations of phoneme sequences.  This is also where details like insertion of apostrophes, hyphens and capital letters are best performed.  Most of my conversions have had a very short third phase.  The big exception was WLM, where most of the complex rules for the representation of syllabic liquids were delayed until phase 3, phase 2 having generated a more verbose but simpler representation.
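
As a rough illustration of the three phases described above, here is a minimal Python sketch.  It is not taken from fl2s4g.py.  The input notation, the stress marks, the "|" separator for alternative pronunciations, the character map and the abbreviation rule are all invented for the example, and are not real FEWL or SRS4g rules.

```python
# Toy three-phase converter.  Every symbol and rule below is hypothetical,
# chosen only to show how the phases fit together.

STRESS_MARKS = "'`"                                       # assumed stress markers
CHAR_MAP = {"S": "sh", "T": "th", "N": "ng", "@": "u"}    # assumed phoneme map
ABBREVIATIONS = [("shun", "tion")]                        # assumed phase-3 rewrite


def phase1_simplify(entry):
    """Phase 1: discard information the target system does not need."""
    # Keep only the first of several alternatives (assumed '|' separator).
    entry = entry.split("|")[0]
    # Drop the stress marks.
    return "".join(ch for ch in entry if ch not in STRESS_MARKS)


def phase2_convert(entry):
    """Phase 2: character-by-character mapping; unmapped characters pass through."""
    return "".join(CHAR_MAP.get(ch, ch) for ch in entry)


def phase3_tidy(word):
    """Phase 3: simplify the output, e.g. abbreviate common sequences."""
    for old, new in ABBREVIATIONS:
        word = word.replace(old, new)
    return word


def convert_entry(entry):
    """Run one dictionary entry through all three phases in order."""
    return phase3_tidy(phase2_convert(phase1_simplify(entry)))
```

Keeping each phase as a separate function makes it easy to move a troublesome rule from one phase to another, which is the kind of adjustment described in item 2 above.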

The example program, fl2s4g.py, is a relatively simple example, as SRS4g is a relatively simple system.  Most systems, such as Reed-Riet or WLM, have more complicated rules, but the more complicated the target is, the more specific the dictionary conversion program tends to be, making it less useful as a general example.

Text Conversion


The text conversion program, in this case xl2s4g.py, is generally quite generic.  These programs are nearly identical from one system to the next, with one important exception: the text conversion program is inflection-aware.  If it finds a word ending in -s, -'s, -ed, -ing, -er or -est, it attempts to look up the root word, and then inflect the corresponding respelled word in the target system.  This means that the program must understand how these inflections are spelled in the target system.  Usually this is simple, but there are gotchas.  For instance, in SRS4g, the present participle of the word ring is not ringing but riñing.
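
The inflection lookup described above might be sketched as follows.  This is only an illustration, not the code in xl2s4g.py: the function name, the suffix table and the sample dictionary entry are assumptions, and the target-side suffix spellings are placeholders that a real converter would replace with the rules of its own system.

```python
# (source suffix, target suffix) pairs, longest source suffix first.
# The target spellings are placeholders, not real SRS4g rules.
SUFFIXES = [
    ("est", "est"),
    ("ing", "ing"),
    ("'s", "'s"),
    ("ed", "ed"),
    ("er", "er"),
    ("s", "s"),
]


def lookup_inflected(word, dictionary):
    """Return the respelled word, or None if neither it nor its root is known.

    dictionary maps a source word to its respelling in the target system.
    """
    if word in dictionary:
        return dictionary[word]
    for source_suffix, target_suffix in SUFFIXES:
        if word.endswith(source_suffix) and len(word) > len(source_suffix):
            root = word[: -len(source_suffix)]
            if root in dictionary:
                # Inflect the respelled root using the target-system suffix.
                return dictionary[root] + target_suffix
    return None


# Hypothetical dictionary entry, for illustration only.
toy_dictionary = {"walk": "wauk"}
print(lookup_inflected("walking", toy_dictionary))
```

This sketch ignores root changes such as doubled consonants (running) or a dropped final e (making), and it joins the root and suffix by simple concatenation.  Gotchas of the riñing kind are exactly where such a simple join needs extra, system-specific rules.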

One characteristic of these converters should be mentioned here.  There are several ways in which it may prove impossible to convert a word.  Some of these are:

  1. The word (or the root word) may not be present in the dictionary.  In this case, the word is enclosed, unchanged, in angle brackets to indicate this.

  2. The word may be upper-case, such as AIDS or NATO, or a word in a title.  The conversion programs I have written do not attempt to convert these words, as they are likely to be acronyms, which may need special handling.  NAYTO, for instance, is no longer a very successful acronym, even if the spelling does better indicate the pronunciation.

  3. The word may have multiple spellings.  For instance, in WLM, the word <desert> could be either the noun (dezrt) or the verb (dizurt).  My converters do not perform grammar analysis or any other technique for removal of such ambiguities.  Instead, all the options are shown, enclosed in square brackets, e.g., [dezrt/dizurt].  It is intended that converter output be inspected by a human being, for removal of such ambiguities, before any use or publication.  It would be possible, in some cases, to eliminate some choices without a lot of work (if the source word is <deserting>, then since nouns don't have participles, the correct WLM transliteration must be dizurtinq), but I have no plans to implement this myself.
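
The three cases above can be sketched in a few lines of Python.  This is a simplified illustration rather than the logic of the distributed programs: the function names and the word-matching pattern are assumptions, and inflection handling and restoring capital letters are left out.

```python
import re

# Assumed definition of a "word": letters, with optional internal apostrophes.
WORD_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def convert_word(word, dictionary):
    """dictionary maps a lower-case source word to a list of respellings."""
    if len(word) > 1 and word.isupper():
        return word                               # probably an acronym: leave as-is
    spellings = dictionary.get(word.lower())
    if not spellings:
        return "<" + word + ">"                   # unknown word: angle brackets
    if len(spellings) > 1:
        return "[" + "/".join(spellings) + "]"    # ambiguous: show every option
    return spellings[0]


def convert_text(text, dictionary):
    """Convert each word in text, leaving punctuation and spacing untouched."""
    return WORD_PATTERN.sub(lambda m: convert_word(m.group(0), dictionary), text)


# The WLM spellings of "desert" are the ones quoted in item 3 above.
wlm_sample = {"desert": ["dezrt", "dizurt"]}
print(convert_text("NATO crossed the desert.", wlm_sample))
```

Because the sample dictionary holds only one word, the unknown words in the sample sentence come out in angle brackets, the acronym is passed through, and desert is shown with both options for a human reader to resolve.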

I note that it would be straightforward to use xl2s4g.py as the basis for an SRS4g to TS (traditional spelling) converter (which I would want to call s4g2ts.py).  There has been no demand so far for such a program, but it would really be nothing more than xl2s4g.py with the polarity reversed.
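
One way to picture "reversing the polarity" is to invert the dictionary, so that each respelling maps back to the traditional word or words that produce it.  The sketch below assumes a dictionary held as a Python dict from source words to lists of respellings.  The function name is hypothetical, and no such program is included in the download.

```python
from collections import defaultdict


def invert_dictionary(dictionary):
    """Map each respelling back to the list of source words that yield it."""
    reverse = defaultdict(list)
    for source_word, spellings in dictionary.items():
        for spelling in spellings:
            if source_word not in reverse[spelling]:
                reverse[spelling].append(source_word)
    return dict(reverse)


# The "desert" entry uses the WLM spellings quoted earlier on this page;
# the "dessert" entry is a made-up example of two words sharing a respelling.
forward = {"desert": ["dezrt", "dizurt"], "dessert": ["dizurt"]}
print(invert_dictionary(forward))
```

Respellings shared by several traditional words come out as multi-entry lists, so a reverse converter would presumably need the same square-bracket treatment for ambiguous words.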

Random Word Generation


The random word listing program, in this case babble4g.py, is completely cookie-cutter.  It should be possible to take this program, modify a few constants, and have a program which will work for any target system.  It would be easy to parameterize the program, to allow specification of the target system on the command line.  This would be useful only for an individual with more than one scheme, and comfortable enough with command-line programs to prefer an option to a different command name for each system.
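
A parameterized version along these lines could look like the sketch below.  It is not babble4g.py itself: it assumes a dictionary file with one tab-separated "source word, respelling" pair per line, it prints to standard output instead of a window, and the function names and command-line options are invented for the example.

```python
import argparse
import random


def load_dictionary(lines):
    """Parse lines of the assumed form 'source<TAB>respelling' into pairs."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line or "\t" not in line:
            continue                      # skip blank or malformed lines
        source, target = line.split("\t", 1)
        entries.append((source, target))
    return entries


def babble(entries, count):
    """Return up to count entries chosen at random, without repeats."""
    return random.sample(entries, min(count, len(entries)))


def main(argv=None):
    """Command-line entry point: print a random sample of a dictionary file."""
    parser = argparse.ArgumentParser(description="Print random dictionary entries.")
    parser.add_argument("dictionary_file", help="path to the target-system dictionary")
    parser.add_argument("-n", "--count", type=int, default=20,
                        help="number of entries to show")
    args = parser.parse_args(argv)
    with open(args.dictionary_file, encoding="utf-8") as handle:
        entries = load_dictionary(handle)
    for source, target in babble(entries, args.count):
        print(source + "\t" + target)
```

Calling main() from a script would let a single program serve any target system, with the dictionary file name given on the command line.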

Download location


To download the SRS4g sample programs, click here.  They are combined into a single .zip file.  Note that the output of the example fl2s4g.py program is somewhat different from the official SRS4g dictionary, as Bob Boden, the inventor of SRS4g, has corrected many of the automatically generated dictionary entries to better reflect "citation pronunciation".

If you discover bugs or other problems with the example programs, I would appreciate being informed.  The distributed versions have been cleaned up and slightly modified for publication, and it is possible that errors might have crept in in the process.



To comment on this page, e-mail Alan at wyrdplay.org
