Character Set Supported in the DOI Name
DOI Names may incorporate any printable characters from the Universal Character Set (UCS) of ISO/IEC 10646, which is the character set defined by Unicode. The Handle System at its core uses UTF-8, an encoding of Unicode, and so in its pure form imposes no character set constraints at all: any character can be sent to, stored in, and retrieved from a Handle server.
The character set encompasses most characters used in every major language written today (see also UTF-8 Encoding of Non-ASCII Characters).
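As an illustration only, the following Python sketch shows how a DOI name containing non-ASCII characters maps to UTF-8 bytes and back without loss. The prefix `10.1234`, the suffixes, and the helper names `utf8_bytes` and `is_printable` are hypothetical, and Python's `str.isprintable()` is used here as a rough stand-in for "printable"; it is an assumption, not the definition used by the DOI system.

```python
def utf8_bytes(doi_name: str) -> bytes:
    """Encode a DOI name as UTF-8 and confirm the round trip is lossless."""
    encoded = doi_name.encode("utf-8")
    # Decoding the bytes must reproduce the original string exactly.
    if encoded.decode("utf-8") != doi_name:
        raise ValueError("UTF-8 round trip changed the DOI name")
    return encoded


def is_printable(doi_name: str) -> bool:
    """Rough check (assumption): every character is printable per Python."""
    return all(ch.isprintable() for ch in doi_name)


if __name__ == "__main__":
    # Hypothetical DOI name mixing ASCII, Latin-1 and CJK characters.
    name = "10.1234/größe-例"
    print(utf8_bytes(name))
    print(is_printable(name))
```

ASCII characters occupy one byte each in UTF-8, while characters such as `é` or `例` occupy two or more, so the byte length of a DOI name can exceed its character count.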
NOTE Some characters should be avoided: see Constraints on DOI Name Syntax in Specific Contexts.