The "addtones" program adds tones to cihui files that don't have
tones.  The cihui files should be in this format:

    <pinyin> <cihui>,<cihui>, ... ,<cihui><NL>

Where <NL> stands for newline (^J).
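
As a rough illustration of the format above, here is a minimal sketch
in C of splitting one such line into its Pinyin key and its cihui.
This is not taken from the "addtones" source; the function name
split_cihui_line and its interface are hypothetical.  It splits on
raw bytes, which should be safe for GB and Big5 text because neither
encoding uses the space or comma byte inside a two-byte character.

```c
#include <assert.h>
#include <string.h>

/*
 * Split one line of the form "<pinyin> <cihui>,<cihui>,...\n" in place.
 * On return, *pinyin points at the Pinyin key and entries[] holds a
 * pointer to each comma-separated cihui (at most max of them).
 * Returns the number of cihui stored, or -1 if the line has no space
 * separating the Pinyin from the list.
 * (Hypothetical helper, not part of addtones itself.)
 */
int split_cihui_line(char *line, char **pinyin, char *entries[], int max)
{
    char *sep, *tok;
    int n = 0;

    assert(line != NULL && pinyin != NULL && entries != NULL);

    line[strcspn(line, "\n")] = '\0';   /* drop the trailing newline */

    sep = strchr(line, ' ');            /* first space ends the Pinyin */
    if (sep == NULL)
        return -1;
    *sep = '\0';
    *pinyin = line;

    /* strtok skips empty fields, so "a,,b" yields two entries */
    for (tok = strtok(sep + 1, ","); tok != NULL && n < max;
         tok = strtok(NULL, ","))
        entries[n++] = tok;

    return n;
}
```

The line buffer is modified in place, so the caller has to keep it
alive for as long as the returned pointers are used.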

It generates a file containing cihui it couldn't add tones to for some
reason.  The file name generated is the name of the cihui file with a
".fix" extension.  If the cihui file is read from standard input,
the fix file is "stdin.fix".
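
The naming rule just described could be sketched in C as follows.
The function name fix_file_name is hypothetical, and treating a NULL
name as "input came from standard input" is an assumption made for
the sketch, not something taken from the "addtones" source.

```c
#include <assert.h>
#include <string.h>

/*
 * Build the fix file name for an input file: "<name>.fix", or
 * "stdin.fix" when name is NULL (assumed here to mean standard input).
 * Returns 0 on success, -1 if the result does not fit in buf.
 * (Hypothetical helper, not part of addtones itself.)
 */
int fix_file_name(const char *name, char *buf, size_t size)
{
    const char *base = (name != NULL) ? name : "stdin";

    assert(buf != NULL);

    /* sizeof ".fix" is 5: four characters plus the terminating NUL */
    if (strlen(base) + sizeof ".fix" > size)
        return -1;

    strcpy(buf, base);
    strcat(buf, ".fix");
    return 0;
}
```

With this sketch, "file1.gb" becomes "file1.gb.fix", which matches
the file names in the example of use.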

Certain classes of spelling errors will be caught.

The "addtones" utility takes a danci file that has tones on the Pinyin
as a lookup table for adding tones to the cihui file.  "addtones"
assumes that the case (upper or lower) is the same in the danci and
cihui files.

There are still a few problems with this program, but it's a start.

Example of use:

   addtones -d danci.gb file1.gb file2.gb ... > results.gb

   will add tones to file1.gb and file2.gb and generate file1.gb.fix
   and file2.gb.fix containing lists of the cihui that tones couldn't
   be added to for some reason.

   You can also call it like this:

   cat file1.gb file2.gb ... | addtones -d danci.gb > results.gb

"addtones" can take as many input files as you want to give it on the
command line, and you can change the danci file anywhere on the
command line.  The only problem is that all of the results end up in
the same file.

Example:

   addtones -d danci.big5 file1.big5 -d danci.gb file1.gb > results

Wed Apr 24 07:03:28 1991
mleisher@nmsu.edu

================================================================

The "dancire" (danci rearrange) program takes a file in this format:

    <pinyin> <danci>,<danci>, ... ,<danci><NL>

and rearranges it into this order:

    <danci> <pinyin>,<pinyin>, ... ,<pinyin><NL>

Where <NL> stands for newline (^J).
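
The rearrangement amounts to inverting a one-to-many mapping: each
danci collects every Pinyin it appeared under.  A minimal sketch of
that idea in C follows.  It is not the code in "dancire.c" (which, as
the snippet further down shows, keeps a stack per danci); the fixed
size table, the limits, and all of the names here are hypothetical and
chosen only to keep the example short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DANCI 256   /* hypothetical limits, chosen for the sketch */
#define MAX_PRON    8
#define MAX_LEN    16

struct entry {
    char danci[MAX_LEN];
    char pinyin[MAX_PRON][MAX_LEN];
    int  npron;
};

struct table {
    struct entry e[MAX_DANCI];
    int n;                      /* set to 0 before first use */
};

/* Find the entry for danci, creating it if needed; NULL if no room. */
static struct entry *lookup(struct table *t, const char *danci)
{
    int i;

    for (i = 0; i < t->n; i++)
        if (strcmp(t->e[i].danci, danci) == 0)
            return &t->e[i];
    if (t->n == MAX_DANCI || strlen(danci) >= MAX_LEN)
        return NULL;
    strcpy(t->e[t->n].danci, danci);
    t->e[t->n].npron = 0;
    return &t->e[t->n++];
}

/* Record that danci can be read as pinyin.  Returns 0, or -1 if full. */
int add_reading(struct table *t, const char *danci, const char *pinyin)
{
    struct entry *e;

    assert(t != NULL && danci != NULL && pinyin != NULL);
    if (strlen(pinyin) >= MAX_LEN)
        return -1;
    e = lookup(t, danci);
    if (e == NULL || e->npron == MAX_PRON)
        return -1;
    strcpy(e->pinyin[e->npron++], pinyin);
    return 0;
}

/* Write the table as "<danci> <pinyin>,<pinyin>,...\n" lines. */
void write_table(const struct table *t, FILE *out)
{
    int i, j;

    for (i = 0; i < t->n; i++) {
        fprintf(out, "%s ", t->e[i].danci);
        for (j = 0; j < t->e[i].npron; j++)
            fprintf(out, "%s%s", j ? "," : "", t->e[i].pinyin[j]);
        fputc('\n', out);
    }
}
```

A driver would call add_reading once for every <pinyin>/<danci> pair
read from the input and then write_table at the end.  The linear
search in lookup is only acceptable for small inputs; a real danci
file would want a hash table or a sorted structure.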

The "dancire" program can take a danci file name as a command line
parameter, or it can read from standard input.

Examples:

        dancire danci.gb > re-danci.gb

        cat danci.gb | dancire > re-danci.gb

I've included the mult-pron.gb file that I generated with this
program.  It might be useful to someone.

To generate a list of the danci with multiple pronunciations, one only
needs to change line 204 of "dancire.c" from:

      if (st != NULL) {
to:
      if (st != NULL && stack_size(st) > 1) {

You can change the 1 to some other number 'n' and the > operator to ==
to extract danci that have exactly 'n' pronunciations.

Mon Apr 22 21:42:36 1991
mleisher@nmsu.edu (Mark Leisher)

================================================================

The "topinyin" program converts from Chinese back into Pinyin.

An example:

   topinyin -d danci.gb file1.gb file2.gb > results

   will convert file1.gb and file2.gb into Pinyin (as far as possible)
   and generate file1.gb.msg and file2.gb.msg files containing
   messages on the ambiguous and unknown conversions.

"topinyin" can take as many input files as you want to give it on the
command line, and you can change the danci file anywhere on the
command line.  The only problem is that all of the results end up in
the same file.

Example:

   topinyin -d danci.big5 file1.big5 -d danci.gb file1.gb > results

Ambiguous conversions are noted in the output text (the file "results"
in this case) by placing "<>" around the ambiguous character.

Another file called "stats" is generated, describing the percentages
of successful and unsuccessful conversions.

Tue Apr 23 01:56:38 1991
mleisher@nmsu.edu (Mark Leisher)
