Data Files (ASCII or Unicode) and Meadows Software

DesignMerge software can import either ASCII or Unicode (UTF-8) text into InDesign documents. This document describes various features of DesignMerge software, and how they interact with standard ASCII text or Unicode data.

TYPES OF TEXT

Each character in a text file is specified by one or more codes. Below is a brief description of some common types of character encodings:

ASCII: The ASCII Printable Character Set is the set of standard, printable characters whose ASCII decimal codes are between 32 and 126 (letters, digits, punctuation marks, and a few miscellaneous symbols). Technically, standard ASCII supports decimal codes 0 through 127; however, the codes below 32 and code 127 are control characters and are not commonly printed. The standard ASCII characters from 32 through 126 are identical between Macintosh and Windows systems.

Extended ASCII or High-Order Characters: Character codes with decimal values from 128 through 255 consist of less commonly used characters, such as special accented characters, currency marks, etc. Although some of the characters in this range are identical between Macintosh and Windows systems, many are not. If you need to support such characters, you may need to use a TransTable to convert them. Please see the TransTable article for more details.
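
As a small illustration of why high-order characters may need conversion, the Python sketch below decodes the same byte value with two legacy code pages. It assumes that Mac OS Roman (`mac_roman`) stands in for the Macintosh character set and Windows-1252 (`cp1252`) for the Windows character set; the encoding of your own data may differ, and the helper name `decode_both` is hypothetical. It is not a description of how a TransTable works internally.

```python
# Assumption: "mac_roman" and "cp1252" stand in for the Macintosh and
# Windows extended character sets; your files may use other code pages.

def decode_both(byte_value):
    """Return the character that a single byte maps to in each code page."""
    raw = bytes([byte_value])
    return raw.decode("mac_roman"), raw.decode("cp1252")

# A standard ASCII byte decodes identically in both code pages.
print(decode_both(0x41))  # ('A', 'A')

# A high-order byte does not: 0x8E is "é" in Mac OS Roman
# but "Ž" in Windows-1252.
print(decode_both(0x8E))  # ('é', 'Ž')
```

A mismatch like the second one is the kind of difference a conversion table is meant to resolve.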

UTF-8: UTF-8 is a variable-width character encoding that can represent every character in the Unicode character set, with each character represented by one to four bytes. For the first 128 characters, UTF-8 uses the same one-byte encoding as ASCII; in other words, UTF-8 uses ASCII character codes for the first 128 characters.
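
The variable-width behavior described above can be observed with a few lines of Python. This is only an illustrative sketch; the sample characters and the helper name `utf8_length` are chosen for the example and are not part of DesignMerge.

```python
def utf8_length(ch):
    """Return the number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))

# One sample character for each possible encoded length (1 to 4 bytes).
for ch in ["A", "é", "€", "😀"]:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} uses {len(encoded)} byte(s): {encoded.hex()}")

# For the first 128 characters, the UTF-8 bytes equal the ASCII bytes.
print("A".encode("utf-8") == "A".encode("ascii"))  # True
```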

UTF-16 or UTF-32: UTF-16 and UTF-32 are each a type of Unicode character encoding that is different from UTF-8. DesignMerge does not support UTF-16 or UTF-32 data at this time. If you are using a database or spreadsheet application that does not support exporting data to a UTF-8 text file, you can export the data to a UTF-16 text file. Then, you can use a text utility application (such as BBEdit or Notepad++) to save the UTF-16 text file as a UTF-8 text file for use with DesignMerge.
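
If a text utility is not at hand, the same conversion can be scripted. The sketch below is one possible approach in Python, not a Meadows-supplied tool; the function names and the `add_bom` option are hypothetical. It writes the UTF-8 BOM by default on the assumption that you want Setup to detect the file as UTF-8 automatically.

```python
import codecs

def utf16_bytes_to_utf8(data, add_bom=True):
    """Convert UTF-16 encoded bytes to UTF-8 encoded bytes.

    The "utf-16" codec uses the UTF-16 BOM to pick the byte order and
    removes it; without a BOM, Python assumes the platform's native order.
    """
    text = data.decode("utf-16")
    encoded = text.encode("utf-8")
    # Optionally prepend the UTF-8 BOM (bytes EF BB BF).
    return codecs.BOM_UTF8 + encoded if add_bom else encoded

def convert_file(src_path, dst_path, add_bom=True):
    """Read a UTF-16 file and write a UTF-8 copy; line endings are kept."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(utf16_bytes_to_utf8(data, add_bom))
```

For example, `convert_file("export_utf16.txt", "export_utf8.txt")` would produce a UTF-8 copy of a hypothetical export file.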

DESIGNMERGE FEATURES

Below is a list describing how various DesignMerge features interact with the type of text encoding that is used in a data file: 

Data Type: A DesignMerge Data Source Definition (DDF) has a Data Type setting where you can indicate whether the data file contains UTF-8 or ASCII data.

Data Source Setup: Setup will automatically detect when a data file is a UTF-8 text file if the file includes the UTF-8 BOM (Byte Order Mark). If a UTF-8 file does not contain the UTF-8 BOM, Setup will assume the data file is an ASCII text file; however, you can change the Data Type setting from ASCII to UTF-8 in the Setup dialog before proceeding. You can also change the Data Type setting in the Edit Data Source Definition dialog at any time. Please note that selecting an inappropriate Data Type may yield unpredictable results when DesignMerge imports data.
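
To check a data file before running Setup, the detection rule described above can be approximated in Python. This sketch only mirrors the documented behavior (BOM present means UTF-8, otherwise ASCII); it is not DesignMerge's actual implementation, and both function names are hypothetical.

```python
import codecs

def guess_data_type(first_bytes):
    """Mirror the rule above: UTF-8 BOM present -> UTF-8, otherwise ASCII."""
    return "UTF-8" if first_bytes.startswith(codecs.BOM_UTF8) else "ASCII"

def needs_manual_utf8(data):
    """Return True for data that has no BOM but still looks like UTF-8.

    Such a file would be assumed to be ASCII, so the Data Type setting
    would need to be changed to UTF-8 by hand.
    """
    if data.startswith(codecs.BOM_UTF8):
        return False  # BOM present: detected automatically
    if all(b < 128 for b in data):
        return False  # pure ASCII: the ASCII setting is appropriate
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False  # high-order bytes that are not valid UTF-8
    return True
```

To use it on a file, read the contents in binary mode (for example, `open(path, "rb").read()`) and pass the bytes to either function.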

Field Names: Setup uses the data in the first record (the header row) of the selected data file as the default names for the fields of data. If the data in the first record includes non-ASCII character codes, the field names that DesignMerge uses may not display as expected because DesignMerge field names do not display non-ASCII characters at this time. To avoid unexpected field names, we recommend either using only ASCII Printable Characters in the first record of data in a UTF-8 data file, or editing the Field Names that are displayed in the Setup dialog before proceeding.
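
A header row can be checked ahead of time for names that fall outside the ASCII Printable Character Set (decimal 32 through 126). The Python sketch below assumes a simple delimited header line without quoted fields; the function names and the tab default for the delimiter are assumptions made for the example.

```python
def is_printable_ascii(text):
    """True when every character has a decimal code from 32 through 126."""
    return all(32 <= ord(ch) <= 126 for ch in text)

def problem_field_names(header_line, delimiter="\t"):
    """Return the header names that contain non-printable-ASCII characters.

    Assumes a plain delimited line with no quoted fields.
    """
    names = header_line.rstrip("\r\n").split(delimiter)
    return [name for name in names if not is_printable_ascii(name)]

print(problem_field_names("First Name\tCité\tZip\r\n"))  # ['Cité']
```

When reading the header from a UTF-8 file that begins with a BOM, open the file with Python's `utf-8-sig` encoding so that the BOM is not counted as part of the first field name.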

Data Source Definition Settings: Please note that only ASCII Printable Characters are supported for use in any setting in a Data Source Definition (DDF). This includes, for example, Field Names, Variable Link Names, Variable Link Parameters, Price Styles, DesignMerge Rule Names, DesignMerge Rule Criteria Values, and DesignMerge Search Criteria. Use only ASCII Printable Characters in all settings for a Data Source Definition.

Search Key Values: Only ASCII Printable Characters are supported for use as a search key value for variable links (placeholders) in a document. Use ASCII Printable Characters for search key values.

Variable Link Filter Setting: When setting up a variable link for UTF-8 data, do not set up the variable link to use a text import filter. Applying a filter to UTF-8 data will yield unpredictable results. Instead, when using UTF-8 data, always select None for the Filter setting.

TransTables: The TransTable feature has not yet been extended to convert non-ASCII characters. You may continue to use a TransTable to convert the first 128 characters that are in the UTF-8 file (characters whose UTF-8 encoding matches ASCII encoding; characters whose decimal encoding is 0 through 127).

SPECIAL NOTES

About Fonts

A font provides support for a specific set of characters. If your data file contains UTF-8 characters, consider carefully which font you will apply to the variable links that will import these characters. In particular, consider which languages a font supports. If a font does not support the language of the characters in the data file, then some or all of the characters will not display appropriately when merged into a variable link that is using this font.

About Glyphs

Adobe InDesign and other applications allow you to assign any one of the glyphs that a font provides for a particular character, in order to display an alternate representation of that character. Please note that a glyph assignment is a formatting attribute, and formatting attributes are not included in plain ASCII or UTF-8 text files. Therefore, a font's default glyph for each character will be applied when plain ASCII or UTF-8 text is imported into a document. This also applies to the text that DesignMerge imports into its variable links (placeholders).