TACTIS, tactis - A character encoding system (codeset) for Thai.  


The TACTIS (Thai API Consortium/Thai Industrial Standard) codeset consists of the following two character sets:

    ASCII (ISO 646-1983)
    TIS 620-2533

These characters are 8-bit coded, ranging from 00 to FF.

ASCII Characters

In the TACTIS codeset, all ASCII characters are implemented as single-byte, 7-bit characters; that is, the most significant bit (MSB) of an ASCII character is always off (0). For more information, refer to ascii(5).

TIS 620-2533 Characters

The TIS 620-2533 character set includes 89 characters that are categorized as follows:

    Consonants: 44
    Vowels: 18 total (5 leading vowels, 6 following vowels, 2 below vowels, and 5 above vowels)
    Tone marks: 4
    Diacritics: 5 (4 above diacritics and 1 below diacritic)
    NonComposibles: 18 (1 nobreak space, 10 Thai digits, 6 Thai special characters, and 1 word separator)

The isdigit(), iswdigit(), isxdigit(), iswxdigit(), isalnum(), and iswalnum() functions do not recognize Thai digits. Many applications make assumptions about how a digit character can be converted to its numeric equivalent, and changing these functions to recognize Thai digits would break those applications.
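Because the character classification functions do not treat Thai digits as digits, an application that needs their numeric values has to convert them itself. The following is a minimal sketch in Python, assuming only the code points F0 (THAI ZERO) through F9 (THAI NINE) from the code table in this reference page. The names thai_digit_value and parse_thai_number are hypothetical and are not part of any system library.

```python
THAI_ZERO = 0xF0  # THAI ZERO in the TACTIS code table
THAI_NINE = 0xF9  # THAI NINE in the TACTIS code table


def thai_digit_value(byte):
    """Return 0-9 for a TACTIS Thai digit byte, or None for any other byte."""
    if THAI_ZERO <= byte <= THAI_NINE:
        return byte - THAI_ZERO
    return None


def parse_thai_number(data):
    """Convert a bytes object that contains only Thai digits to an int."""
    if not data:
        raise ValueError("empty input")
    value = 0
    for byte in data:
        digit = thai_digit_value(byte)
        if digit is None:
            raise ValueError(f"not a Thai digit: 0x{byte:02X}")
        value = value * 10 + digit
    return value
```

For example, parse_thai_number(bytes([0xF2, 0xF5, 0xF3, 0xF3])) returns 2533. ASCII digits are deliberately rejected here so that the two digit sets are never mixed silently; an application could relax that if it needs to.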


Code Ranges in the TACTIS Codeset

In the TACTIS codeset, the most significant bit (MSB) of a byte is on (1) in the codes for TIS 620-2533 characters. This differentiates TIS 620-2533 character codes from ASCII character codes.
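The MSB rule can be illustrated with a short Python sketch that splits a byte string into ASCII and TIS 620-2533 runs. The function names are hypothetical. The sketch tests only the high bit, so it does not check whether a byte with the MSB on is actually an assigned code point.

```python
from itertools import groupby


def is_tis620_byte(byte):
    """True when the most significant bit (0x80) of the byte is on."""
    return (byte & 0x80) != 0


def split_runs(data):
    """Split TACTIS data into ('ascii' or 'tis620', bytes) runs, in order."""
    return [
        ("tis620" if thai else "ascii", bytes(group))
        for thai, group in groupby(data, key=is_tis620_byte)
    ]
```

Since every character in the codeset occupies exactly one byte, a run boundary is always a character boundary, which is why a simple per-byte test is enough here.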

Following are the code ranges for each of the five categories of Thai characters in the codeset:

Category                   Code Range (hex)

Consonants                 A1 to CE
Leading vowels             E0 to E4
Normal following vowels    D0, D2, D3, E5
Special following vowels   C4, C6
Below vowels               D8, D9
Above vowels               D1, D4 to D7
Tone marks                 E8 to EB
Above diacritics           E7, EC to EE
Below diacritics           DA
Nobreak space              A0
Thai digits                F0 to F9
Thai special characters    CF, DF, E6, EF, FA, FB
Word separator             DC
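The code ranges above can be turned into a small lookup routine. The following Python sketch is one possible reading of the table, not a system interface; the names CATEGORIES and classify are hypothetical. It assumes that C4 and C6, which lie inside the A1 to CE consonant range, should be reported as special following vowels, so those two codes are checked first.

```python
def _span(first, last):
    """Return the set of code points from first to last, inclusive."""
    return set(range(first, last + 1))


# Order matters: C4 and C6 are listed before the broad consonant range
# because they fall inside A1..CE (an assumption about the table's intent).
CATEGORIES = [
    ("special following vowel", {0xC4, 0xC6}),
    ("consonant", _span(0xA1, 0xCE)),
    ("leading vowel", _span(0xE0, 0xE4)),
    ("normal following vowel", {0xD0, 0xD2, 0xD3, 0xE5}),
    ("below vowel", {0xD8, 0xD9}),
    ("above vowel", {0xD1} | _span(0xD4, 0xD7)),
    ("tone mark", _span(0xE8, 0xEB)),
    ("above diacritic", {0xE7} | _span(0xEC, 0xEE)),
    ("below diacritic", {0xDA}),
    ("nobreak space", {0xA0}),
    ("thai digit", _span(0xF0, 0xF9)),
    ("thai special character", {0xCF, 0xDF, 0xE6, 0xEF, 0xFA, 0xFB}),
    ("word separator", {0xDC}),
]


def classify(byte):
    """Return the category name for one TACTIS byte."""
    if byte < 0x80:
        return "ascii"
    for name, codes in CATEGORIES:
        if byte in codes:
            return name
    return "unassigned"
```

Under this reading, 44 bytes classify as consonants and 89 bytes in the range 80 to FF are assigned in total, which agrees with the character counts given for the TIS 620-2533 character set.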

In TACTIS, the hexadecimal code points of TIS 620-2533 characters are as follows:

 A1 KO KAI              C1 MO MA            E1 SARA AE
 A2 KHO KHAI            C2 YO YAK           E2 SARA O
 A3 KHO KHUAT           C3 RO RUA           E3 SARA AI MAIMUAN
 A4 KHO KHWAI           C4 RU               E4 SARA AI MAIMALAI
 A5 KHO KHON            C5 LO LING          E5 LAKKHANGYAO
 A6 KHO RAKHANG         C6 LU               E6 MAIYAMOK
 A7 NGO NGU             C7 WO WAEN          E7 MAITAIKHU
 A8 CHO CHAN            C8 SO SALA          E8 MAI EK
 A9 CHO CHING           C9 SO RUSI          E9 MAI THO
 AA CHO CHANG           CA SO SUA           EA MAI TRIE
 AB SO SO               CB HO HEEP          EB MAI CHATTAWA
 AD YO YING             CD O ANG            ED NIKHANHIT
 B0 THO THO THAN        D0 SARA A           F0 THAI ZERO
 B2 THO PHOO THAO       D2 SARA AA          F2 THAI TWO
 B3 NOR NANE            D3 SARA AM          F3 THAI THREE
 B4 DOR DEK             D4 SARA E           F4 THAI FOUR
 B5 TO TAO              D5 SARA EE          F5 THAI FIVE
 B6 THO THUNG           D6 SARA UR          F6 THAI SIX
 B7 THO THAHAN          D7 SARA UUR         F7 THAI SEVEN
 B8 THO THONG           D8 SARA U           F8 THAI EIGHT
 B9 NO NU               D9 SARA UU          F9 THAI NINE
 BB PO PLA              DB                  FB KHOMUT
 BD FO FA               DD                  FD
 BE PO PAN              DE                  FE
 BF FO FAN              DF BAHT             FF

For more information on Thai characters, refer to Wototo(5).

Fonts for TIS 620-2533

The operating system provides both screen and printer fonts for TIS 620-2533 characters.

The following bitmap fonts reflect various sizes and typefaces for 75dpi and 100dpi display devices:

    -adecw-screen-medium-r-normal--14-140-75-75-p-70-tis620.2533-1
    -adecw-screen-medium-r-normal--18-180-75-75-p-80-tis620.2533-1
    -adecw-screen-medium-r-normal--24-240-75-75-p-120-tis620.2533-1
    -adecw-screen-medium-r-normal--14-140-100-100-p-70-tis620.2533-1
    -adecw-screen-medium-r-normal--18-180-100-100-p-80-tis620.2533-1
    -adecw-screen-medium-r-normal--24-240-100-100-p-120-tis620.2533-1

The operating system provides the following Thai fonts for PostScript printers:

    AngsanaUPC-Bold      AngsanaUPC-BoldItalic      AngsanaUPC-Italic      AngsanaUPC-Light
    CordiaUPC-Bold       CordiaUPC-BoldItalic       CordiaUPC-Italic       CordiaUPC-Light
    EucrosiaUPC-Bold     EucrosiaUPC-BoldItalic     EucrosiaUPC-Italic     EucrosiaUPC-Light
    FreesiaUPC-Bold      FreesiaUPC-BoldItalic      FreesiaUPC-Italic      FreesiaUPC-Light
    IrisUPC-Bold         IrisUPC-BoldItalic         IrisUPC-Italic         IrisUPC-Light
    JasmineUPC-Bold      JasmineUPC-BoldItalic      JasmineUPC-Italic      JasmineUPC-Light
    KodchiangUPC-Bold    KodchiangUPC-BoldItalic    KodchiangUPC-Italic    KodchiangUPC-Light
    LilyUPC-Bold         LilyUPC-BoldItalic         LilyUPC-Italic         LilyUPC-Light
    WaterlilyUPC-Bold    WaterlilyUPC-BoldItalic    WaterlilyUPC-Italic    WaterlilyUPC-Light
    YuccaUPC-Bold        YuccaUPC-BoldItalic        YuccaUPC-Italic        YuccaUPC-Light

For general information on printing Asian language text, refer to i18n_printing(5).

Codeset Conversion

The following converter pairs are available for converting data between TACTIS and other encoding formats. Refer to iconv_intro(5) for an introduction to codeset conversion. For more information about the other codeset for which TACTIS is the input or output, see the reference page specified in the list item.

cp874_TACTIS, TACTIS_cp874
    Converting from and to PC code page 874: code_page(5)

UCS-2_TACTIS, TACTIS_UCS-2
    Converting from and to UCS-2: Unicode(5)

UCS-4_TACTIS, TACTIS_UCS-4
    Converting from and to UCS-4: Unicode(5)

UTF-8_TACTIS, TACTIS_UTF-8
    Converting from and to UTF-8: Unicode(5)
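To give a rough idea of what a TACTIS to Unicode conversion involves, the following Python sketch maps TACTIS bytes to Unicode characters. It is not the system converter and may differ from it. It assumes the usual layout of the Unicode Thai block, in which a TIS 620-2533 byte b in the ranges A1 to DA and DF to FB corresponds to U+0E00 + (b - A0), and it assumes that A0 maps to U+00A0. How the system converters treat the word separator (DC) is not covered by that layout, so the sketch rejects it along with all unassigned bytes. The function name tactis_to_text is hypothetical.

```python
def tactis_to_text(data):
    """Map TACTIS bytes to a Unicode string under the offset assumption."""
    chars = []
    for byte in data:
        if byte < 0x80:
            chars.append(chr(byte))  # ASCII maps to itself
        elif byte == 0xA0:
            chars.append("\u00a0")  # nobreak space (assumed mapping)
        elif 0xA1 <= byte <= 0xDA or 0xDF <= byte <= 0xFB:
            chars.append(chr(0x0E00 + byte - 0xA0))  # Unicode Thai block
        else:
            # Includes DC (word separator): no mapping is assumed for it.
            raise ValueError(f"no mapping assumed for byte 0x{byte:02X}")
    return "".join(chars)
```

The result can then be encoded with the standard str.encode method, for example tactis_to_text(data).encode("utf-8") for UTF-8 output. For real data, the converters listed above remain the authoritative mapping.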


Commands: locale(1)

Others: code_page(5), ascii(5), i18n_intro(5), i18n_printing(5), iconv_intro(5), Thai(5), Unicode(5), Wototo(5)



