Unicode Font Support

The Java 1.0 platform was limited to displaying only the characters in the ISO-Latin-1 subset of Unicode. This restriction was removed in the Java 1.1 platform. Java programs running on 1.1 or 1.2 can display any Unicode character that can be rendered with a host font.

The runtime provides a small number of predefined "virtual" font names, and maps them to real fonts available on the host. In 1.0, each Java font name mapped to exactly one host font. In 1.1 and 1.2, a Java font name can map to a series of host fonts, which can be chosen to cover as much of the Unicode character set as is desired. The font mapping is specified in a font properties file.

Because Java font names are virtual names that can represent multiple host fonts, they should be generic. JDK 1.0 included the font names TimesRoman, Courier, and Helvetica, which are very specific and do not apply to many locales. JDK 1.1 introduced three replacements for these names: Serif, SansSerif, and Monospaced. Use of these new names is recommended.

The JDK ships with font property files that cover all supported locales. For a locale to be supported, an adequate font property file must exist.

Font Property File

The example that follows is a font property file with values that might be used on a Windows platform:

#-------------------------------------------
serif.0.plain=Times New Roman
serif.1.plain=MS Mincho
# All styles use WingDings as component font 2
# and Lucida Sans Unicode Regular as component font 3.
serif.2=WingDings
serif.3=Lucida Sans Unicode Regular
...
sansserif.0.italic=Helvetica
sansserif.1.italic=MS Gothic
....
....
#-------------------------------------------

The complete key representation is:

<abstract name>.<component font number>.<style name>

If the style name is omitted, the mapping applies for all styles in that family. Fully specified mappings take precedence over those without the style name. The component font number gives a priority to each host font. If a Unicode character can be displayed with multiple fonts in a mapping, the font with the lowest component number will be used.
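
The lookup rules above can be sketched in a few lines of Java. This is only an illustration: FontKeyLookup and its methods are hypothetical names, not the runtime's own code, and stopping at the first unmapped component number is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper that illustrates the key lookup rules only.
final class FontKeyLookup {

    // Returns the host font for one component slot. The fully specified key
    // (name.number.style) wins over the style-less key (name.number).
    static String hostFont(Properties props, String name, int component, String style) {
        String full = props.getProperty(name + "." + component + "." + style);
        return full != null ? full : props.getProperty(name + "." + component);
    }

    // Lists host fonts in priority order, lowest component number first.
    // Assumption: numbering is contiguous, so the scan stops at the first gap.
    static List<String> searchOrder(Properties props, String name, String style) {
        List<String> fonts = new ArrayList<>();
        for (int i = 0; ; i++) {
            String font = hostFont(props, name, i, style);
            if (font == null) {
                break;
            }
            fonts.add(font);
        }
        return fonts;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("serif.0.plain", "Times New Roman");
        props.setProperty("serif.1.plain", "MS Mincho");
        props.setProperty("serif.2", "WingDings");
        props.setProperty("serif.3", "Lucida Sans Unicode Regular");
        System.out.println(searchOrder(props, "serif", "plain"));
    }
}
```

A character would then be drawn with the first font in this list that can display it.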

The font properties file can also specify a default character to be displayed in place of characters that cannot be rendered with the given mappings. The default character is specified as a hexadecimal Unicode value, as shown below. If the default character cannot be mapped, the ASCII '?' is used.

#-----------------------------------

default.char=274f

#-----------------------------------
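
As a rough sketch of how such a value could be read, the hexadecimal string can be parsed into a char. DefaultCharParser is a hypothetical name, and falling back to '?' for a missing or malformed value is an assumption made for the example; the text above only specifies the fallback for a character that cannot be mapped.

```java
// Hypothetical helper, not part of the JDK.
final class DefaultCharParser {

    // Parses a default.char value given as a hexadecimal Unicode value.
    // Assumption: a missing, malformed, or out-of-range value yields '?'.
    static char parse(String hex) {
        if (hex == null) {
            return '?';
        }
        try {
            int value = Integer.parseInt(hex.trim(), 16);
            return (value >= 0 && value <= 0xFFFF) ? (char) value : '?';
        } catch (NumberFormatException e) {
            return '?';
        }
    }

    public static void main(String[] args) {
        char c = parse("274f");
        System.out.println("U+" + Integer.toHexString(c).toUpperCase());
    }
}
```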

The old font names are aliased to the new names with the following entries:

#--------------------------------

alias.timesroman=serif

alias.helvetica=sansserif

alias.courier=monospaced

#-----------------------------------
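
The effect of the alias entries can be illustrated with a small lookup. FontAliasResolver is a hypothetical name, and lower-casing the requested name before the lookup is an assumption based on the lowercase keys shown above.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical helper that illustrates alias resolution only.
final class FontAliasResolver {

    // Returns the name an old font name is aliased to, or the requested
    // name itself (lower-cased) when no alias entry exists.
    static String resolve(Properties props, String requested) {
        String key = requested.toLowerCase(Locale.ROOT);
        return props.getProperty("alias." + key, key);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("alias.timesroman", "serif");
        props.setProperty("alias.helvetica", "sansserif");
        props.setProperty("alias.courier", "monospaced");
        System.out.println(resolve(props, "TimesRoman"));
        System.out.println(resolve(props, "SansSerif"));
    }
}
```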

The priority ordering of host fonts may not be sufficient to specify the desired mapping when multiple host fonts overlap. Exclusion ranges can be set on a host font to prohibit characters from being displayed with that font. The following example shows how this is done:

#---------------------------------------

exclusion.sansserif.1=xxxx-XXXX

exclusion.monospaced.plain.2=xxxx-XXXX

#---------------------------------------

Exclusion ranges can be abbreviated in the same way the name mappings are. Fully specified names take priority over abbreviated ones.
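
A minimal sketch of how an exclusion value could be interpreted follows. ExclusionRange is a hypothetical class, the value 0100-ffff is only a sample, and the sketch assumes a single inclusive range of hexadecimal Unicode values in the xxxx-XXXX form shown above.

```java
// Hypothetical helper, not part of the JDK.
final class ExclusionRange {
    final int low;
    final int high;

    ExclusionRange(int low, int high) {
        this.low = low;
        this.high = high;
    }

    // Parses one range written as two hexadecimal values joined by '-'.
    static ExclusionRange parse(String value) {
        String[] parts = value.trim().split("-");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected xxxx-XXXX, got: " + value);
        }
        int low = Integer.parseInt(parts[0], 16);
        int high = Integer.parseInt(parts[1], 16);
        if (low > high) {
            throw new IllegalArgumentException("range start exceeds range end: " + value);
        }
        return new ExclusionRange(low, high);
    }

    // True when the character must not be displayed with the host font.
    boolean excludes(char ch) {
        return ch >= low && ch <= high;
    }

    public static void main(String[] args) {
        ExclusionRange range = parse("0100-ffff"); // sample value
        System.out.println(range.excludes('A'));
        System.out.println(range.excludes((char) 0x3042));
    }
}
```

With this sample range, the host font would still be used for 'A' but skipped for U+3042, so the next component font in priority order gets a chance to display it.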

Supporting User-Defined Characters

Especially in the Japanese market, many end-users require specialized fonts for non-standard characters. These characters are called Gaiji in Japan. To support Gaiji fonts, Java must be told how to map between the Gaiji font and Unicode.

For example, assume a user has a font which contains exactly three glyphs. The glyphs are indexed 0, 1, and 2 in the font, and the user wishes to map these into the Unicode characters \uE800, \uE801, \uE802 (three Private Use Area characters in Unicode). This can be accomplished with the following two steps:

1. Subclass CharToByteConverter (or one of its subclasses).

// sun.io is an internal JDK package
import sun.io.CharToByteSingleByte;

class MyFontCharset extends CharToByteSingleByte {
	private String name;

	public MyFontCharset(){
		name = "MyGaiji";
	}

	// reports whether this converter can map the given character
	public boolean canConvert(char ch){
		return ch >= 0xE800 && ch <= 0xE802;
	}

	// This is the conversion method actually called by
	// the font mechanism.
	public int convert(String str, byte[] out){
		for (int i = 0; i < str.length(); i++){
			// the cast is required: char minus int yields an int
			out[i] = (byte)(str.charAt(i) - 0xE800);
		}
		return str.length();
	}

	// Needed because convert is an abstract method in
	// CharToByteConverter.
	public int convert(char ch[],int off,int len, 
			   byte b[], int boff, int blen){
		String str = new String(ch, off, len);
		byte bb[] = new byte[blen-boff];
		int n = convert(str, bb);
		// copy the converted bytes back into the caller's buffer
		System.arraycopy(bb,0,b,boff,n);
		return n;
	}

	public String toString(){
		return name;
	}
}

2. Specify this class in the font properties file as follows:

#------------------------------------------------------------
...
serif.4=<special font name>
...
fontcharset.serif.4=MyFontCharset
...
#----------------------------------------------------------------
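
The mapping performed in step 1 can be tried out without the internal converter classes. The following stand-alone sketch mirrors the same arithmetic; GaijiMappingDemo is a hypothetical class, and rejecting characters outside the three mapped values is an assumption added for the example.

```java
import java.util.Arrays;

// Hypothetical stand-alone demo of the Gaiji mapping; it does not use sun.io.
final class GaijiMappingDemo {
    static final char FIRST = 0xE800; // maps to glyph index 0
    static final char LAST = 0xE802;  // maps to glyph index 2

    // True for the three Private Use Area characters the font covers.
    static boolean canConvert(char ch) {
        return ch >= FIRST && ch <= LAST;
    }

    // Writes one glyph index per character into out and returns the count.
    static int convert(String str, byte[] out) {
        for (int i = 0; i < str.length(); i++) {
            char ch = str.charAt(i);
            if (!canConvert(ch)) {
                throw new IllegalArgumentException(
                        "unmapped character U+" + Integer.toHexString(ch));
            }
            out[i] = (byte) (ch - FIRST);
        }
        return str.length();
    }

    public static void main(String[] args) {
        byte[] out = new byte[3];
        int n = convert("\uE800\uE801\uE802", out);
        System.out.println(n + " glyph indices: " + Arrays.toString(out));
    }
}
```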

Some Japanese companies, including Fujitsu and NEC, define their own characters (called vendor-defined characters). This mechanism allows vendors to extend Java to support these characters.

java-intl@java.sun.com