Archived - Mac OS X 10.5: Changes in Thai user dictionaries from Mac OS X 10.4 and earlier
There have been changes in the way Thai user dictionaries are handled between Mac OS X 10.5 and Mac OS X 10.4 or earlier.
Mac OS X 10.5 comes with a built-in dictionary that it uses to find breaks between Thai words. For those who use words that are not in the built-in dictionary (for example, specialized vocabulary), the user dictionary facility allows them to supply their own word list to supplement the one built into the system.
If you want to add your own specialized Thai vocabulary in Mac OS X 10.5, your user dictionary files must meet the requirements described here, which differ from those of earlier versions.
In Mac OS X 10.4.x or earlier, Thai user dictionaries are plain text files that can be stored in ~/Library/Dictionaries, /Library/Dictionaries, or /Network/Library/Dictionaries. The file name must have the extension ".thaidict", and the file encoding must be one of the following:
- UTF-8 with UTF-8 BOM
- Native-endian UTF-16 without a BOM
The file format is one word per line, with the word terminating at white space or a line end. Lines that begin with white space are treated as comments.
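The format rules above can be sketched in a few lines of Python. This is only an illustration of the rules as stated here, not the system's own parser; the function name `parse_thai_user_dictionary` is hypothetical, and the input is assumed to be text that has already been decoded.

```python
def parse_thai_user_dictionary(text):
    """Return the words found in the decoded text of a user dictionary.

    Rules as described in this article: one word per line, the word ends
    at the first white space or at the line end, and a line that begins
    with white space is treated as a comment.
    """
    words = []
    for line in text.splitlines():
        if not line or line[0].isspace():
            # Skip empty lines and comment lines (leading white space).
            continue
        # Keep only the part of the line before the first white space.
        words.append(line.split(None, 1)[0])
    return words
```

Anything that follows the first white space on a word line is ignored by this sketch, which matches the rule that the word terminates at white space.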
In Mac OS X 10.5 and later, Thai user dictionaries are handled differently:
- The file name must end in "-Thai.txt". Existing dictionaries that end in ".thaidict" must be renamed or they are ignored. For example, "MyDict.thaidict" should be changed to "MyDict-Thai.txt".
- The set of file encodings accepted is now:
  - Any form of Unicode with a BOM (UTF-8, UTF-16 of either endianness, or UTF-32 of either endianness)
  - Native-endian UTF-16 without a BOM
Note: TextEdit will not save UTF-8 files with a BOM, so if you are using TextEdit, save the file as UTF-16.
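To check whether a file falls within the encodings accepted by Mac OS X 10.5, one approach is to look at its first bytes. The Python sketch below decodes raw file bytes according to the list above; the function name `decode_dictionary_bytes` is hypothetical, and treating a file without a BOM as native-endian UTF-16 is an assumption taken from that list, not a description of how the system itself detects encodings.

```python
import codecs
import sys

# UTF-32 BOMs are checked first because the UTF-32 little-endian BOM
# begins with the same two bytes as the UTF-16 little-endian BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_dictionary_bytes(data):
    """Decode raw dictionary file bytes using the accepted encodings."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            # Drop the BOM, then decode the rest with the matching codec.
            return data[len(bom):].decode(encoding)
    # Assumption: no BOM means native-endian UTF-16.
    native = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
    return data.decode(native)
```

When writing a dictionary from Python, opening the file with `encoding="utf-16"` produces UTF-16 in the machine's native byte order with a BOM, which is one of the accepted forms.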
The locations for dictionaries, and the format of the file itself, are unchanged in Mac OS X 10.5.
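Because both of the older encodings remain in the accepted set and the locations and file format are unchanged, moving an existing dictionary to Mac OS X 10.5 comes down to renaming it. The following Python sketch shows one way to do that for a single folder; the function names and the `dry_run` parameter are hypothetical, and the script is an illustration, not a tool supplied with the system.

```python
from pathlib import Path

OLD_SUFFIX = ".thaidict"
NEW_SUFFIX = "-Thai.txt"


def new_dictionary_name(old_name):
    """Map 'MyDict.thaidict' to 'MyDict-Thai.txt'; return None otherwise."""
    if not old_name.endswith(OLD_SUFFIX):
        return None
    return old_name[: -len(OLD_SUFFIX)] + NEW_SUFFIX


def rename_dictionaries(folder, dry_run=True):
    """Rename old-style dictionaries in one folder.

    Returns the list of (old_path, new_path) pairs. With dry_run=True
    (the default) nothing on disk is changed.
    """
    folder = Path(folder).expanduser()
    planned = []
    if not folder.is_dir():
        return planned
    for path in sorted(folder.iterdir()):
        new_name = new_dictionary_name(path.name)
        if new_name is None:
            continue
        target = path.with_name(new_name)
        if target.exists():
            # Never overwrite a dictionary that already has the new name.
            continue
        planned.append((path, target))
        if not dry_run:
            path.rename(target)
    return planned
```

For example, `rename_dictionaries("~/Library/Dictionaries")` lists what would be renamed in the user's folder, and passing `dry_run=False` performs the renames.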
For more about the BOM (Byte Order Mark), see this website.