Text Encoding

An encoding defines how a text file's bytes are interpreted as characters. Most modern programs use UTF-8, a variable-length encoding scheme that uses between one and four bytes to represent a character.
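As a small illustration of the variable-length scheme described above, the following Python sketch encodes four sample characters and counts the bytes each one needs. The sample characters and the name utf8_lengths are chosen here for illustration only and are not part of Text Viewer.

```python
# Illustrative sample characters (chosen for this example):
# a Latin letter, an accented letter, the euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Map each character to the number of bytes its UTF-8 encoding uses.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in utf8_lengths.items():
    print(f"U+{ord(ch):04X} uses {n} byte(s) in UTF-8")
```

Characters in the ASCII range take a single byte, which is why plain English text looks identical in UTF-8 and in most older encodings.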

Historically, to distinguish a UTF-8 file from a file in a legacy encoding such as ANSI, a byte order mark (BOM) was added to the beginning of the file to identify its encoding. Now that UTF-8 has been widely adopted, most programs write UTF-8 without a BOM.

When Text Viewer opens a file that has no BOM, it assumes the file is encoded in UTF-8. If you are opening an older file that was encoded as ANSI, you can change how Text Viewer displays it by selecting Encoding -> ANSI. If the file has a BOM, Text Viewer determines how to display the file from the bytes of the BOM.
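The behavior described above could be sketched roughly as follows. This is not Text Viewer's actual code: the function names detect_encoding and decode_text are hypothetical, and treating "ANSI" as Windows code page 1252 is an assumption, since the real ANSI code page depends on the system's regional settings.

```python
import codecs
from typing import Optional


def detect_encoding(data: bytes) -> str:
    """Pick a Python codec name from the BOM, defaulting to UTF-8."""
    if data.startswith(codecs.BOM_UTF8):      # bytes EF BB BF
        return "utf-8-sig"                    # decodes UTF-8 and drops the BOM
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16"                       # reads the BOM, then drops it
    return "utf-8"                            # no BOM: assume UTF-8


def decode_text(data: bytes, override: Optional[str] = None) -> str:
    """Decode file contents, optionally forcing a codec such as 'cp1252'.

    The override stands in for a manual choice like Encoding -> ANSI;
    'cp1252' is an assumed stand-in for ANSI, not a confirmed detail.
    """
    codec = override or detect_encoding(data)
    # Undecodable bytes become U+FFFD instead of raising an error.
    return data.decode(codec, errors="replace")
```

For example, decode_text(b"caf\xe9", override="cp1252") returns "café", whereas the same bytes read as UTF-8 are not valid and would display a replacement character in place of the last letter.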

Text Viewer supports the following encodings:

ANSI
UTF-8
UTF-16 LE

See also: Menu Summary