
===========================================================================
trans Character Encoding Converter Generator Package     file: readme
===========================================================================

This package is Copyright (c) 1993-1997 by kosta@kostis.net (Kosta Kostis)

You may use it free of charge but you may not sell this package.
Should you be interested in using any component of this package in a
commercial package you must contact me first to find terms.

Under MS-DOS one should convert all files from U*IX format to MS-DOS
format. This can be done by inserting a CR (0x0D) before each LF (0x0A).
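
The conversion described above can be sketched in C as follows. This is
only an illustration and not part of the package; the function name
lf_to_crlf is made up for this example, and the sketch assumes the input
does not already contain CR LF pairs (it would double the CR otherwise).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy len bytes from src to dst, inserting a CR (0x0D) before each
 * LF (0x0A). dst must have room for up to 2 * len bytes.
 * Returns the number of bytes written to dst.
 */
size_t lf_to_crlf(const char *src, size_t len, char *dst)
{
    size_t i, n = 0;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < len; i++) {
        if (src[i] == '\n')
            dst[n++] = '\r';    /* CR goes in front of the LF */
        dst[n++] = src[i];
    }
    return n;
}
```

When such a routine is used on files under MS-DOS, the files should be
opened in binary mode ("rb" and "wb"), since a C library working in text
mode translates line endings on its own.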

Currently 79 different Character Encoding Description Files are supplied
with this package, not counting iso6429 and iso646 (which are included by
other files) and iso10646.

It covers ISO 646, many MS-DOS Codepages (also for OS/2), Microsoft Windows
Codepages, ISO 8859-xx, HP, Adobe, Apple Macintosh, Atari, NeXTSTEP
Character Encodings, some EBCDIC Encodings, koi8-r and a few more...

Should your favourite Character Encoding be missing, please contribute!

===========================================================================
Where to get updates
===========================================================================

The latest version of this package should be available at:

ftp://ftp.informatik.uni-erlangen.de/pub/doc/ISO/charsets/transXXX.tar.gz

where XXX represents the current version number (120 for V1.20).
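
As a small illustration of the naming scheme, the archive name could be
derived from a version number like this. The function archive_name is
hypothetical, and the sketch assumes the minor version is always written
with two digits (as in V1.20) and that a C99 snprintf () is available.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build "transXXX.tar.gz" for version major.minor, e. g. 1 and 20
 * give "trans120.tar.gz". Returns 0 on success and -1 if the buffer
 * is too small.
 */
int archive_name(int major, int minor, char *buf, size_t size)
{
    int n;

    assert(buf != NULL && major >= 0 && minor >= 0 && minor < 100);
    n = snprintf(buf, size, "trans%d%02d.tar.gz", major, minor);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```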

A "backup" copy should be available at

http://www.kostis.net/freeware/transXXX.tar.gz

===========================================================================
How to create a Character Encoding Converter
===========================================================================

To create translators, compile and link the file "transtab.c" in the
directory "./trans120/src/". Move the executable into a directory that is
contained in your command search path.

Example:

	MS-DOS (Borland C++)

	    make -f makefile.bcc
	    rem if you have \bin in your PATH
	    copy transtab.exe \bin

	    Please note that there seems to be some trouble compiling
	    transiso.c and checkiso.c with Borland C++ 3.1.

	MS-DOS (Microsoft C)

	    nmake -f makefile.msc
	    rem if you have \bin in your PATH
	    copy *.exe \bin

	U*IX (e. g. Linux)

	    #
	    # if you use gcc
	    #
	    make all             # compile transtab executables

        The U*IX Makefile offers further targets which you may want to use:

            make install         # this copies executables to /usr/local/bin
            make clean           # this deletes objects and executables

            make check           # check cedf files (error.log)

            make html            # create HTML tables from cedf files
            make list            # create list of cedf files (encoding.lis)

            make date            # for my personal use only ;)
            make pack            # for my personal use only ;) 
            make checkuni        # for my personal use only :) (unierror.lis)

	SunOS using gcc seems to require

	    make COPTS='-DFILENAME_MAX=200 -DNO_STRUPR -O6'

After that, please change your working directory to the "./trans120/bin"
directory. There you will find files called "*" for U*IX and "*.bat"
for MS-DOS. This package is being maintained under Linux but has
originally been developed under MS-DOS.

Please set an environment variable TRANS that points to the directory
where all the program sources reside, *including* the trailing directory
separator character (e. g. TRANS="/usr/local/lib/trans/"). This may be
a link to the actual directory.

Doing so makes life much easier, and you can create your programs and
tables completely independently of the source tree. All Character
Encoding Description Files have to reside in the cedf subdirectory.
If you don't set the variable TRANS, "/usr/local/lib/trans" will be
assumed (see file "tab.h", DIR_TRANS).
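
The lookup described above might be sketched as follows. This is not the
code from gettrans.c; the names cedf_path, cedf_path_from_env and
DEFAULT_TRANS are invented for this example. The sketch assumes a U*IX
style "/" separator, adds the separator itself when the value lacks one,
and relies on a C99 snprintf ().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* assumed fallback, used when TRANS is not set */
#define DEFAULT_TRANS "/usr/local/lib/trans/"

/*
 * Build the path of a Character Encoding Description File below the
 * cedf subdirectory. trans is the value of TRANS (NULL or "" selects
 * the default). Returns 0 on success, -1 if buf is too small.
 */
int cedf_path(const char *trans, const char *name, char *buf, size_t size)
{
    const char *sep;
    int n;

    assert(name != NULL && buf != NULL);
    if (trans == NULL || trans[0] == '\0')
        trans = DEFAULT_TRANS;
    /* tolerate a value without the trailing separator */
    sep = (trans[strlen(trans) - 1] == '/') ? "" : "/";
    n = snprintf(buf, size, "%s%scedf/%s", trans, sep, name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Same as above, but take the base directory from the environment. */
int cedf_path_from_env(const char *name, char *buf, size_t size)
{
    return cedf_path(getenv("TRANS"), name, buf, size);
}
```

With TRANS="/usr/local/lib/trans/", asking for "iso8859.1" yields
"/usr/local/lib/trans/cedf/iso8859.1".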

To test the translator generator, type

        cd "$TRANS"bin
        one

This should generate two translators between ISO 8859-1 and MS-DOS
Codepage 850. Each translator consists of three files (e.g.):

        isox850.c                       the main program
        isox850.h                       the header file
        isox850.tab                     the translation table file

Each translator will #include the files

        trans.c                         the main invariant program
and     trans.h                         the main invariant header file

Please do not delete them and keep the files in the same directory!
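
To give an idea of what an 8-bit translator does with its translation
table, here is a minimal sketch. It is not the code that transtab
generates; the names init_demo_table and translate are made up, and the
table holds only four real ISO 8859-1 to Codepage 850 mappings on top of
an identity mapping, whereas a generated table covers all 256 positions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill the table with the identity mapping, then override a few
 * positions with their Codepage 850 equivalents (illustration only).
 */
void init_demo_table(unsigned char table[256])
{
    int i;

    assert(table != NULL);
    for (i = 0; i < 256; i++)
        table[i] = (unsigned char)i;
    table[0xE4] = 0x84;     /* a with diaeresis */
    table[0xF6] = 0x94;     /* o with diaeresis */
    table[0xFC] = 0x81;     /* u with diaeresis */
    table[0xDF] = 0xE1;     /* sharp s */
}

/* Translate len bytes in place by looking each one up in the table. */
void translate(unsigned char *buf, size_t len, const unsigned char table[256])
{
    size_t i;

    assert(buf != NULL && table != NULL);
    for (i = 0; i < len; i++)
        buf[i] = table[buf[i]];
}
```

Because every input byte is just an index into the table, the same loop
serves any pair of 8-bit encodings; only the table contents differ.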

You should be able to compile and link isox850.c and 850xiso.c easily.
Read transtab.man to learn more about the syntax for transtab.

Have a look at maketabs.bat or maketabs respectively for inspiration
on program names.

This package is written in ANSI-C using the two non-ANSI functions
strdup () and strupr (). Sources for these functions are supplied should
your compiler/library not contain them. Should you encounter any problems
while trying to compile this package, your compiler is very likely not
ANSI-C compliant. Should your compiler be ANSI-C compliant and still report
warnings and/or errors, please let me know. I'll need the following data in
order to help you:

	Version of this package (e. g. V1.20)
	Operating System and Version
	Compiler name and Version
	Compiler options used (if any)
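
For readers who have not met the two non-ANSI functions mentioned above,
this is roughly what they do. The sketch is not the code from strdup.c
and strupr.c in this package; the functions are called my_strdup and
my_strupr here so that they cannot clash with a library that already
provides the originals.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc()ed copy of s, or NULL if memory is exhausted. */
char *my_strdup(const char *s)
{
    char *copy;

    assert(s != NULL);
    copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* Convert s to upper case in place and return s. */
char *my_strupr(char *s)
{
    char *p;

    assert(s != NULL);
    for (p = s; *p != '\0'; p++)
        *p = (char)toupper((unsigned char)*p);
    return s;
}
```

The caller owns the string returned by my_strdup () and has to free ()
it when it is no longer needed.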

===========================================================================
Directory tree for this package
===========================================================================

The directory tree for this utility should look like this:

./trans120/              contains the complete package 

           README               this file
           Makefile             sample makefile for U*IX using gcc
           encoding.lis         list of Character Encoding Description Files
           error.log            output created by checkall
           unierror.log         diffs between cedf and selected Unicode files

./trans120/src/          contains the translation table generator source

               Makefile         makefile for gcc (e. g. Linux)

               makefile.bcc     makefile for Borland C++ (MS-DOS)
               makefile.gcc     makefile for gcc (same as Makefile)
               makefile.msc     makefile for Microsoft C (MS-DOS)

               comptran.c       compute translation table and output
               comptran.h       header file for comptran.c
               datatype.h       handy data types
               gettrans.c       get TRANS directory
               gettrans.h       header file for gettrans.c
               head_c.h         generic translator main program
               head_h.h         generic translator header file
               head_tab.h       generic translator table file header
               head_u.h         generic translator Unicode FormatA file header
               loadtab.c        read xlt binary table and Unicode FormatA
               loadtab.h        header file for loadtab.c
               os-stuff.h       OS/compiler dependent definitions
               readtab.c        read character encoding description file
               readtab.h        header file for readtab.c
               scanflag.c       parse program parameters and flags
               scanflag.h       header file for scanflag.c
               strdup.c         in case your compiler doesn't have it
               strdup.h         header file for strdup.c
               strupr.c         in case your compiler doesn't have it
               strupr.h         header file for strupr.c

               tab.h            table constants
               taberr.h         transtab error codes and messages

               checkiso.c       checks character encoding description names
               checkiso.h       header file for above program
               checkiso.man     man page for above program

               checkuni.c       compares cedf file with Unicode Format A table
               checkuni.h       header file for above program
               checkuni.man     man page for above program - for internal use

               transiso.c       translator generator to ISO 10646 main program
               transiso.h       header file for above program
               transiso.man     man page for above program

               transtab.c       translator generator main program
               transtab.h       header file for above program
               transtab.man     man page for above program

               transce8.c       translator program (8-bit) main program
               transce8.h       header file for above program
               transce8.man     man page for above program

               transhtm.c       program that displays HTML tables
               transhtm.h       header file for above program
               transhtm.man     man page for above program

               checkall.bat     check all tables (MS-DOS)
               checkall         check all tables (U*IX)
               chkuni           for internal use only
               mklist           create list of all tables (U*IX)
               mkhtml           create HTML table (prior mkbintab required)
               mkxlt            create XLT files (binary translation files)

./trans120/bin/         contains the translator main program (independent
                        part) and a few scripts to create translators

               compile.bat      compile one program (MS-DOS)
               compile          compile one program (U*IX)
               makeall.bat      compile all programs (MS-DOS)
               makeall          compile all programs (U*IX)
               maketabs.bat     create many translator sources (MS-DOS) 
               maketabs         create many translator sources (U*IX)
               one.bat          create one translator (MS-DOS)
               one              create one translator (U*IX)
               trans.c          invariant main translator program
               trans.h          invariant main translator header file

               utf.c            convert from/to plain 16-bit Unicode/UTF
               utf.h            header for utf.c
               utimbuf.h        helps to keep file date stamps

./trans120/doc/         contains information about the description files
                        (*.inf) and other more general information
              
               adobe.inf        Adobe Character Encoding Vector information
               apple.inf        Apple Macintosh information
               atari.inf        Atari ST/TT information
               cpdos.inf        MS-DOS/IBM Codepage information
               cpwin.inf        Microsoft Windows Codepage information
               dec.inf          DEC information
               ebcdic.inf       EBCDIC information
               hp.inf           HP (Hewlett-Packard) information
               iso10646.inf     ISO 10646 information
               iso6429.inf      ISO 6429 information
               iso646.inf       ISO 646 information
               iso8859.inf      ISO 8859 information
               nextstep.inf     NeXT information
               other.inf        other character encodings (non-"standard")
               winother.inf     Microsoft Windows (more Encodings)

               credits          credits to contributors
               format           Character Encoding Description File Format
               history          how things have developed
               network          read this for the use in Usenet-software
               sources          sources of information
               todo             things not yet done, known bugs

./trans120/cedf/        contains Character Encoding Description Files

                adobeiso        Adobe ISOLatin1Encoding Encoding Vector
                adobestd        Adobe StandardEncoding Encoding Vector
                adobesym        Adobe Symbol Encoding Vector

                applecro        Apple Macintosh Croatian
                applegk2        Apple ][ Greek extended for Macintosh
                applegrk        Apple Macintosh Greek
                appleice        Apple Macintosh Icelandic
                applerom        Apple Macintosh Roman
                applerum        Apple Macintosh Romanian
                appletur        Apple Macintosh Turkish

                atarist         Atari ST/TT

                cp1250          Microsoft Windows Codepage 1250 (EE)
                cp1251          Microsoft Windows Codepage 1251 (Cyrl)
                cp1252          Microsoft Windows Codepage 1252 (ANSI)
                cp1253          Microsoft Windows Codepage 1253 (Greek)
                cp1254          Microsoft Windows Codepage 1254 (Turk)
                cp1255          Microsoft Windows Codepage 1255 (Hebr)
                cp1256          Microsoft Windows Codepage 1256 (Arab)
                cp1257          Microsoft Windows Codepage 1257 (BaltRim)
                cp1258          Microsoft Windows Codepage 1258 (Viet)

                mslinedr        Microsoft Windows MS LineDraw
                symbol          Microsoft Windows Symbol Encoding Vector
                wingding        Microsoft Windows Wingdings Encoding Vector

                cp437           IBM Codepage 437 (US)
                cp737           IBM Codepage 737 (Greek de facto Standard)
                cp775           IBM Codepage 775 (BaltRim)
                cp850           IBM Codepage 850 (Multilingual Latin 1)
                cp851           IBM Codepage 851 (Greece) - obsolete
                cp852           IBM Codepage 852 (Multilingual Latin 2)
                cp853           IBM Codepage 853 (Multilingual Latin 3)
                cp855           IBM Codepage 855 (Russia) - obsolete
                cp857           IBM Codepage 857 (Multilingual Latin 5)
                cp860           IBM Codepage 860 (Portugal)
                cp861           IBM Codepage 861 (Iceland)
                cp862           IBM Codepage 862 (Israel)
                cp863           IBM Codepage 863 (Canada (French))
                cp864           IBM Codepage 864 (Arabic)
                cp865           IBM Codepage 865 (Norway)
                cp866           IBM Codepage 866 (Russia)
                cp869           IBM Codepage 869 (Greece)
                cp874           IBM Codepage 874 (Thai)
                cp895           IBM Codepage 895 (Czech Kamenicky)

                decmcs          DEC Multinational Character Set (DEC MCS)

                ebc037          EBCDIC Codepage  037
                ebc500          EBCDIC Codepage  500
                ebc875          EBCDIC Codepage  875 (Greek)
                ebc1026         EBCDIC Codepage 1026 (Turkish)
                ebc1047         EBCDIC Codepage 1047

                hp48            HP 48 Character Set
                hproman8        HP Roman-8

                iso10646        ISO 10646 (sorted by name, 16-bit)

                iso6429         ISO 6429 Control Characters (00-1F)
                iso646          ISO 646 (common character base)

                iso646.ca       ISO 646 (French Canadian)
                iso646.ch       ISO 646 (Swiss)
                iso646.de       ISO 646 (German)
                iso646.es       ISO 646 (Spanish)
                iso646.fi       ISO 646 (Finnish)
                iso646.fr       ISO 646 (French)
                iso646.gb       ISO 646 (United Kingdom)
                iso646.irv      ISO 646 (International Reference Version)
                iso646.it       ISO 646 (Italian)
                iso646.nl       ISO 646 (Dutch)
                iso646.no       ISO 646 (Norwegian/Danish)
                iso646.pt       ISO 646 (Portuguese)
                iso646.se       ISO 646 (Swedish)

                iso8859.1       ISO 8859-1  (Latin 1)
                iso8859.2       ISO 8859-2  (Latin 2)
                iso8859.3       ISO 8859-3  (Latin 3)
                iso8859.4       ISO 8859-4  (Latin 4)
                iso8859.5       ISO 8859-5  (Latin/Cyrillic)
                iso8859.6       ISO 8859-6  (Latin/Arabic)
                iso8859.7       ISO 8859-7  (Latin/Greek)
                iso8859.8       ISO 8859-8  (Latin/Hebrew)
                iso8859.9       ISO 8859-9  (Latin 5)
                iso8859.10      ISO 8859-10 (Latin 6)

                cyrilbas        Cyrillic Basic TT Font for Microsoft Windows
                koi8-r          Cyrillic encoding as defined in RFC-1489

                nextstep        NeXTSTEP Encoding Vector

                tex-dcr.in      TeX dcr input (contains non-ISO 10646 names)
                tex-dcr.out     TeX dcr output (contains non-ISO 10646 names)

                wingreek        WinGreek (Non-Std. Encoding TT Font, may go)

./trans120/xlt/          contains conversion tables (default is little endian)

                  all files mentioned in ./trans120/cedf/ should be here,
                  except for iso6429, iso646 and iso10646.

                  Should your CPU not be "little endian" (as Intel i386, i486,
                  Pentium and many other brands are), please do a "make xclean"
                  and then "make bintab" to create the very same tables using
                  your native byte order. This will most likely only work on
                  U*IX (like) systems.

