20.5 The Class java.lang.Character
Objects of type Character represent primitive values of type char.
public final class Character {
    public static final char MIN_VALUE = '\u0000';
    public static final char MAX_VALUE = '\uffff';
    public static final int MIN_RADIX = 2;
    public static final int MAX_RADIX = 36;
    public Character(char value);
    public String toString();
    public boolean equals(Object obj);
    public int hashCode();
    public char charValue();
    public static boolean isDefined(char ch);
    public static boolean isLowerCase(char ch);
    public static boolean isUpperCase(char ch);
    public static boolean isTitleCase(char ch);
    public static boolean isDigit(char ch);
    public static boolean isLetter(char ch);
    public static boolean isLetterOrDigit(char ch);
    public static boolean isJavaLetter(char ch);
    public static boolean isJavaLetterOrDigit(char ch);
    public static boolean isSpace(char ch);
    public static char toLowerCase(char ch);
    public static char toUpperCase(char ch);
    public static char toTitleCase(char ch);
    public static int digit(char ch, int radix);
    public static char forDigit(int digit, int radix);
}
Many of the methods of class Character are defined in terms of a "Unicode attribute table" that specifies a name for every defined Unicode character as well as other possible attributes, such as a decimal value, an uppercase equivalent, a lowercase equivalent, and/or a titlecase equivalent. Prior to Java 1.1, these methods were internal to the Java compiler and based on Unicode 1.1.5, as described here. The most recent versions of these methods should be used in Java compilers that are to run on Java systems that do not yet include these methods.
The Unicode 1.1.5 attribute table is available on the World Wide Web as:
ftp://unicode.org/pub/MappingTables/UnicodeData-1.1.5.txt
However, this file contains a few errors. The term "Unicode attribute table" in the following sections refers to the contents of this file after the following corrections have been applied:
03D0;GREEK BETA SYMBOL;Ll;0;L;;;;;N;GREEK SMALL LETTER CURLED BETA;;0392;;0392
03D1;GREEK THETA SYMBOL;Ll;0;L;;;;;N;GREEK SMALL LETTER SCRIPT THETA;;0398;;0398
03D5;GREEK PHI SYMBOL;Ll;0;L;;;;;N;GREEK SMALL LETTER SCRIPT PHI;;03A6;;03A6
03D6;GREEK PI SYMBOL;Ll;0;L;;;;;N;GREEK SMALL LETTER OMEGA PI;;03A0;;03A0
03F0;GREEK KAPPA SYMBOL;Ll;0;L;;;;;N;GREEK SMALL LETTER SCRIPT KAPPA;;039A;;039A
03F1;GREEK RHO SYMBOL;Ll;0;L;;;;;N;GREEK SMALL LETTER TAILED RHO;;03A1;;03A1
FF10;FULLWIDTH DIGIT ZERO;Nd;0;EN;0030;0;0;0;N;;;;;
FF11;FULLWIDTH DIGIT ONE;Nd;0;EN;0031;1;1;1;N;;;;;
FF12;FULLWIDTH DIGIT TWO;Nd;0;EN;0032;2;2;2;N;;;;;
FF13;FULLWIDTH DIGIT THREE;Nd;0;EN;0033;3;3;3;N;;;;;
FF14;FULLWIDTH DIGIT FOUR;Nd;0;EN;0034;4;4;4;N;;;;;
FF15;FULLWIDTH DIGIT FIVE;Nd;0;EN;0035;5;5;5;N;;;;;
FF16;FULLWIDTH DIGIT SIX;Nd;0;EN;0036;6;6;6;N;;;;;
FF17;FULLWIDTH DIGIT SEVEN;Nd;0;EN;0037;7;7;7;N;;;;;
FF18;FULLWIDTH DIGIT EIGHT;Nd;0;EN;0038;8;8;8;N;;;;;
FF19;FULLWIDTH DIGIT NINE;Nd;0;EN;0039;9;9;9;N;;;;;
03DA;GREEK LETTER STIGMA;Lu;0;L;;;;;N;GREEK CAPITAL LETTER STIGMA;;;;
03DC;GREEK LETTER DIGAMMA;Lu;0;L;;;;;N;GREEK CAPITAL LETTER DIGAMMA;;;;
03DE;GREEK LETTER KOPPA;Lu;0;L;;;;;N;GREEK CAPITAL LETTER KOPPA;;;;
03E0;GREEK LETTER SAMPI;Lu;0;L;;;;;N;GREEK CAPITAL LETTER SAMPI;;;;
03C2;GREEK SMALL LETTER FINAL SIGMA;Ll;0;L;;;;;N;;;03A3;;03A3
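Each corrected entry above is a single semicolon-separated record. The following is a minimal sketch of reading one such record. It assumes the usual field layout of the attribute-table file, which this section does not spell out: fifteen fields, with the code first, the name second, the category third, the decimal digit value seventh, and the uppercase, lowercase, and titlecase equivalents as the last three. The class name AttributeLine and its members are hypothetical and exist only for illustration.

```java
public class AttributeLine {
    public final int code;          // character code, parsed from hexadecimal
    public final String name;       // character name
    public final String category;   // general category, e.g. "Ll" or "Nd"
    public final int decimalDigit;  // decimal digit value, or -1 if the field is empty
    public final int upper;         // uppercase equivalent, or -1 if the field is empty
    public final int lower;         // lowercase equivalent, or -1 if the field is empty
    public final int title;         // titlecase equivalent, or -1 if the field is empty

    private AttributeLine(String[] f) {
        code = Integer.parseInt(f[0], 16);
        name = f[1];
        category = f[2];
        decimalDigit = f[6].isEmpty() ? -1 : Integer.parseInt(f[6]);
        upper = hexOrMinusOne(f[12]);
        lower = hexOrMinusOne(f[13]);
        title = hexOrMinusOne(f[14]);
    }

    /** Parses one record; the 15-field layout is an assumption of this sketch. */
    public static AttributeLine parse(String line) {
        // A limit of -1 keeps trailing empty fields, so short-looking lines still yield 15 fields.
        String[] f = line.split(";", -1);
        if (f.length != 15) {
            throw new IllegalArgumentException("expected 15 fields, got " + f.length);
        }
        return new AttributeLine(f);
    }

    private static int hexOrMinusOne(String s) {
        return s.isEmpty() ? -1 : Integer.parseInt(s, 16);
    }

    public static void main(String[] args) {
        AttributeLine beta = parse(
            "03D0;GREEK BETA SYMBOL;Ll;0;L;;;;;N;GREEK SMALL LETTER CURLED BETA;;0392;;0392");
        System.out.printf("%04X %s upper=%04X%n", beta.code, beta.name, beta.upper);
    }
}
```

Under that assumed layout, the corrected entry for 03D0 carries an uppercase and a titlecase equivalent of 0392 and no lowercase equivalent, while the FF10 entry carries only a decimal digit value.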
Java 1.1 will include the methods defined here, either based on Unicode 1.1.5 or, we hope, updated versions of the methods that use the newer Unicode 2.0. The character attribute table for Unicode 2.0 is currently available on the World Wide Web as the file:
ftp://unicode.org/pub/MappingTables/UnicodeData-2.0.12.txt
If you are implementing a Java compiler or system, please refer to the page:
http://java.sun.com/Series
which will be updated with information about the Unicode-dependent methods.
The biggest change in Unicode 2.0 is a complete rearrangement of the Korean Hangul characters. There are numerous smaller improvements as well.
It is our intention that Java will track Unicode as it evolves over time. Given that full Unicode support is just emerging in the marketplace, and that changes in Unicode are in areas which are not yet widely used, this should cause minimal problems and further Java's goal of worldwide language support.
20.5.1 public static final char MIN_VALUE = '\u0000';
The constant value of this field is the smallest value of type char.
[This field is scheduled for introduction in Java version 1.1.]
20.5.2 public static final char MAX_VALUE = '\uffff';
The constant value of this field is the largest value of type char.
[This field is scheduled for introduction in Java version 1.1.]
20.5.3 public static final int MIN_RADIX = 2;
The constant value of this field is the smallest value permitted for the radix argument in radix-conversion methods such as the digit method (§20.5.23), the forDigit method (§20.5.24), and the toString method of class Integer (§20.7).
20.5.4 public static final int MAX_RADIX = 36;
The constant value of this field is the largest value permitted for the radix argument in radix-conversion methods such as the digit method (§20.5.23), the forDigit method (§20.5.24), and the toString method of class Integer (§20.7).
20.5.5 public Character(char value)
This constructor initializes a newly created Character object so that it represents the primitive value that is the argument.
20.5.6 public String toString()
The result is a String whose length is 1 and whose sole component is the primitive char value represented by this Character object.
Overrides the toString method of Object (§20.1.2).
20.5.7 public boolean equals(Object obj)
The result is true if and only if the argument is not null and is a Character object that represents the same char value as this Character object.
Overrides the equals method of Object (§20.1.3).
20.5.8 public int hashCode()
The result is the primitive char value represented by this Character object, cast to type int.
Overrides the hashCode method of Object (§20.1.4).
20.5.9 public char charValue()
The primitive char value represented by this Character object is returned.
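The instance methods above can be exercised together. The following sketch checks the stated contracts for a single value. It obtains the object through Character.valueOf, a factory method from later library versions that is not part of this specification, used here only as a stand-in for the constructor; the class and method names are hypothetical.

```java
public class CharacterObjectDemo {
    /** Returns true if toString, equals, hashCode, and charValue behave as described for this value. */
    public static boolean contractsHold(char value) {
        // Assumption: Character.valueOf stands in for the constructor Character(char value).
        Character c = Character.valueOf(value);

        // toString: a String of length 1 whose sole component is the represented value.
        String s = c.toString();
        boolean toStringOk = s.length() == 1 && s.charAt(0) == value;

        // equals: true only for a non-null Character representing the same value.
        Object sameValue = Character.valueOf(value);
        Object notACharacter = String.valueOf(value);
        boolean equalsOk = c.equals(sameValue) && !c.equals(null) && !c.equals(notACharacter);

        // hashCode: the represented value cast to int.
        boolean hashOk = c.hashCode() == (int) value;

        // charValue: the represented value itself.
        boolean valueOk = c.charValue() == value;

        return toStringOk && equalsOk && hashOk && valueOk;
    }

    public static void main(String[] args) {
        System.out.println(contractsHold('x'));
        System.out.println(contractsHold(Character.MIN_VALUE));
        System.out.println(contractsHold(Character.MAX_VALUE));
    }
}
```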
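20.5.10 public static boolean isDefined(char ch)
The result is true if and only if the character argument is a defined Unicode character.
A character is a defined Unicode character if and only if at least one of the following is true:
- It has an entry in the Unicode attribute table.
- Its value is not less than \u3040 and not greater than \u9FA5.
- Its value is not less than \uF900 and not greater than \uFA2D.
It follows, then, that for Unicode 1.1.5 as corrected above, the defined Unicode characters are exactly those with codes in the following list, which contains both single codes and inclusive ranges: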
0000-01F5, 01FA-0217, 0250-02A8, 02B0-02DE, 02E0-02E9, 0300-0345, 0360-0361, 0374-0375, 037A, 037E, 0384-038A, 038C, 038E-03A1, 03A3-03CE, 03D0-03D6, 03DA, 03DC, 03DE, 03E0, 03E2-03F3, 0401-040C, 040E-044F, 0451-045C, 045E-0486, 0490-04C4, 04C7-04C8, 04CB-04CC, 04D0-04EB, 04EE-04F5, 04F8-04F9, 0531-0556, 0559-055F, 0561-0587, 0589, 05B0-05B9, 05BB-05C3, 05D0-05EA, 05F0-05F4, 060C, 061B, 061F, 0621-063A, 0640-0652, 0660-066D, 0670-06B7, 06BA-06BE, 06C0-06CE, 06D0-06ED, 06F0-06F9, 0901-0903, 0905-0939, 093C-094D, 0950-0954, 0958-0970, 0981-0983, 0985-098C, 098F-0990, 0993-09A8, 09AA-09B0, 09B2, 09B6-09B9, 09BC, 09BE-09C4, 09C7-09C8, 09CB-09CD, 09D7, 09DC-09DD, 09DF-09E3, 09E6-09FA, 0A02, 0A05-0A0A, 0A0F-0A10, 0A13-0A28, 0A2A-0A30, 0A32-0A33, 0A35-0A36, 0A38-0A39, 0A3C, 0A3E-0A42, 0A47-0A48, 0A4B-0A4D, 0A59-0A5C, 0A5E, 0A66-0A74, 0A81-0A83, 0A85-0A8B, 0A8D, 0A8F-0A91, 0A93-0AA8, 0AAA-0AB0, 0AB2-0AB3, 0AB5-0AB9, 0ABC-0AC5, 0AC7-0AC9, 0ACB-0ACD, 0AD0, 0AE0, 0AE6-0AEF, 0B01-0B03, 0B05-0B0C, 0B0F-0B10, 0B13-0B28, 0B2A-0B30, 0B32-0B33, 0B36-0B39, 0B3C-0B43, 0B47-0B48, 0B4B-0B4D, 0B56-0B57, 0B5C-0B5D, 0B5F-0B61, 0B66-0B70, 0B82-0B83, 0B85-0B8A, 0B8E-0B90, 0B92-0B95, 0B99-0B9A, 0B9C, 0B9E-0B9F, 0BA3-0BA4, 0BA8-0BAA, 0BAE-0BB5, 0BB7-0BB9, 0BBE-0BC2, 0BC6-0BC8, 0BCA-0BCD, 0BD7, 0BE7-0BF2, 0C01-0C03, 0C05-0C0C, 0C0E-0C10, 0C12-0C28, 0C2A-0C33, 0C35-0C39, 0C3E-0C44, 0C46-0C48, 0C4A-0C4D, 0C55-0C56, 0C60-0C61, 0C66-0C6F, 0C82-0C83, 0C85-0C8C, 0C8E-0C90, 0C92-0CA8, 0CAA-0CB3, 0CB5-0CB9, 0CBE-0CC4, 0CC6-0CC8, 0CCA-0CCD, 0CD5-0CD6, 0CDE, 0CE0-0CE1, 0CE6-0CEF, 0D02-0D03, 0D05-0D0C, 0D0E-0D10, 0D12-0D28, 0D2A-0D39, 0D3E-0D43, 0D46-0D48, 0D4A-0D4D, 0D57, 0D60-0D61, 0D66-0D6F, 0E01-0E3A, 0E3F-0E5B, 0E81-0E82, 0E84, 0E87-0E88, 0E8A, 0E8D, 0E94-0E97, 0E99-0E9F, 0EA1-0EA3, 0EA5, 0EA7, 0EAA-0EAB, 0EAD-0EB9, 0EBB-0EBD, 0EC0-0EC4, 0EC6, 0EC8-0ECD, 0ED0-0ED9, 0EDC-0EDD, 10A0-10C5, 10D0-10F6, 10FB, 1100-1159, 115F-11A2, 11A8-11F9, 1E00-1E9A, 1EA0-1EF9, 1F00-1F15, 1F18-1F1D, 1F20-1F45, 
1F48-1F4D, 1F50-1F57, 1F59, 1F5B, 1F5D, 1F5F-1F7D, 1F80-1FB4, 1FB6-1FC4, 1FC6-1FD3, 1FD6-1FDB, 1FDD-1FEF, 1FF2-1FF4, 1FF6-1FFE, 2000-202E, 2030-2046, 206A-2070, 2074-208E, 20A0-20AA, 20D0-20E1, 2100-2138, 2153-2182, 2190-21EA, 2200-22F1, 2300, 2302-237A, 2400-2424, 2440-244A, 2460-24EA, 2500-2595, 25A0-25EF, 2600-2613, 261A-266F, 2701-2704, 2706-2709, 270C-2727, 2729-274B, 274D, 274F-2752, 2756, 2758-275E, 2761-2767, 2776-2794, 2798-27AF, 27B1-27BE, 3000-3037, 303F, 3041-3094, 3099-309E, 30A1-30FE, 3105-312C, 3131-318E, 3190-319F, 3200-321C, 3220-3243, 3260-327B, 327F-32B0, 32C0-32CB, 32D0-32FE, 3300-3376, 337B-33DD, 33E0-33FE, 3400-9FA5, F900-FA2D, FB00-FB06, FB13-FB17, FB1E-FB36, FB38-FB3C, FB3E, FB40-FB41, FB43-FB44, FB46-FBB1, FBD3-FD3F, FD50-FD8F, FD92-FDC7, FDF0-FDFB, FE20-FE23, FE30-FE44, FE49-FE52, FE54-FE66, FE68-FE6B, FE70-FE72, FE74, FE76-FEFC, FEFF, FF01-FF5E, FF61-FFBE, FFC2-FFC7, FFCA-FFCF, FFD2-FFD7, FFDA-FFDC, FFE0-FFE6, FFE8-FFEE, FFFD
[This method is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5.]
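A list of single codes and inclusive ranges such as the one above lends itself to a sorted table searched by bisection. The sketch below is one possible arrangement, not the method any particular implementation uses. It contains only the first ten entries of the list, so it is an excerpt for illustration rather than a usable isDefined; the class name RangeTable is hypothetical.

```java
public class RangeTable {
    // Excerpt only: the first ten entries of the list, as inclusive {low, high} pairs
    // in increasing order. A single code is stored as a range of length one.
    private static final int[][] RANGES = {
        {0x0000, 0x01F5}, {0x01FA, 0x0217}, {0x0250, 0x02A8}, {0x02B0, 0x02DE},
        {0x02E0, 0x02E9}, {0x0300, 0x0345}, {0x0360, 0x0361}, {0x0374, 0x0375},
        {0x037A, 0x037A}, {0x037E, 0x037E},
    };

    /** Binary search over the sorted, non-overlapping ranges. */
    public static boolean contains(char ch) {
        int lo = 0;
        int hi = RANGES.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (ch < RANGES[mid][0]) {
                hi = mid - 1;          // ch lies below this range
            } else if (ch > RANGES[mid][1]) {
                lo = mid + 1;          // ch lies above this range
            } else {
                return true;           // low <= ch <= high
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(contains('A'));
        System.out.println(contains((char) 0x01F6));
    }
}
```

Because the ranges are sorted and do not overlap, each lookup takes a number of comparisons logarithmic in the number of entries.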
20.5.11 public static boolean isLowerCase(char ch)
The result is true if and only if the character argument is a lowercase character.
A character is considered to be lowercase if and only if all of the following are true:
- The character ch is not in the range \u2000 through \u2FFF.
For Unicode 1.1.5 as corrected above, the lowercase Unicode characters are those with codes in the following list, which contains both single codes and inclusive ranges:
0061-007A, 00DF-00F6, 00F8-00FF, 0101-0137 (odds only), 0138-0148 (evens only), 0149-0177 (odds only), 017A-017E (evens only), 017F-0180, 0183, 0185, 0188, 018C-018D, 0192, 0195, 0199-019B, 019E, 01A1-01A5 (odds only), 01A8, 01AB, 01AD, 01B0, 01B4, 01B6, 01B9-01BA, 01BD, 01C6, 01C9, 01CC-01DC (evens only), 01DD-01EF (odds only), 01F0, 01F3, 01F5, 01FB-0217 (odds only), 0250-0261, 0263-0269, 026B-0273, 0275, 0277-027F, 0282-028E, 0290-0293, 029A, 029D-029E, 02A0, 02A3-02A8, 0390, 03AC-03CE, 03D0-03D1, 03D5-03D6, 03E3-03EF (odds only), 03F0-03F1, 0430-044F, 0451-045C, 045E-045F, 0461-0481 (odds only), 0491-04BF (odds only), 04C2, 04C4, 04C8, 04CC, 04D1-04EB (odds only), 04EF-04F5 (odds only), 04F9, 0561-0587, 1E01-1E95 (odds only), 1E96-1E9A, 1EA1-1EF9 (odds only), 1F00-1F07, 1F10-1F15, 1F20-1F27, 1F30-1F37, 1F40-1F45, 1F50-1F57, 1F60-1F67, 1F70-1F7D, 1F80-1F87, 1F90-1F97, 1FA0-1FA7, 1FB0-1FB4, 1FB6-1FB7, 1FC2-1FC4, 1FC6-1FC7, 1FD0-1FD3, 1FD6-1FD7, 1FE0-1FE7, 1FF2-1FF4, 1FF6-1FF7, FB00-FB06, FB13-FB17, FF41-FF5A.
Of the first 128 Unicode characters, exactly 26 are considered to be lowercase:
abcdefghijklmnopqrstuvwxyz
[This specification for the method isLowerCase is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5. In previous versions of Java, this method returns false for all arguments larger than \u00FF.]
20.5.12 public static boolean isUpperCase(char ch)
The result is true if and only if the character argument is an uppercase character.
A character is considered to be uppercase if and only if all of the following are true:
- The character ch is not in the range \u2000 through \u2FFF.
For Unicode 1.1.5 as corrected above, the uppercase Unicode characters are those with codes in the following list, which contains both single codes and inclusive ranges:
0041-005A, 00C0-00D6, 00D8-00DE, 0100-0136 (evens only), 0139-0147 (odds only), 014A-0178 (evens only), 0179-017D (odds only), 0181-0182, 0184, 0186, 0187, 0189-018B, 018E-0191, 0193-0194, 0196-0198, 019C-019D, 019F-01A0, 01A2, 01A4, 01A7, 01A9, 01AC, 01AE, 01AF, 01B1-01B3, 01B5, 01B7, 01B8, 01BC, 01C4, 01C7, 01CA, 01CD-01DB (odds only), 01DE-01EE (evens only), 01F1, 01F4, 01FA-0216 (evens only), 0386, 0388-038A, 038C, 038E, 038F, 0391-03A1, 03A3-03AB, 03E2-03EE (evens only), 0401-040C, 040E-042F, 0460-0480 (evens only), 0490-04BE (evens only), 04C1, 04C3, 04C7, 04CB, 04D0-04EA (evens only), 04EE-04F4 (evens only), 04F8, 0531-0556, 10A0-10C5, 1E00-1E94 (evens only), 1EA0-1EF8 (evens only), 1F08-1F0F, 1F18-1F1D, 1F28-1F2F, 1F38-1F3F, 1F48-1F4D, 1F59-1F5F (odds only), 1F68-1F6F, 1F88-1F8F, 1F98-1F9F, 1FA8-1FAF, 1FB8-1FBC, 1FC8-1FCC, 1FD8-1FDB, 1FE8-1FEC, 1FF8-1FFC, FF21-FF3A.
Of the first 128 Unicode characters, exactly 26 are considered to be uppercase:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
[This specification for the method isUpperCase is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5. In previous versions of Java, this method returns false for all arguments larger than \u00FF.]
20.5.13 public static boolean isTitleCase(char ch)
The result is true if and only if the character argument is a titlecase character.
The notion of "titlecase" was introduced into Unicode to handle a peculiar situation: there are single Unicode characters whose appearance in each case looks exactly like two ordinary Latin letters. For example, there is a single Unicode character `LJ' (\u01C7) that looks just like the characters `L' and `J' put together. There is a corresponding lowercase letter `lj' (\u01C9) as well. These characters are present in Unicode primarily to allow one-to-one translations from the Cyrillic alphabet, as used in Serbia, for example, to the Latin alphabet. Now suppose the word "LJUBINJE" (which has six characters, not eight, because two of them are the single Unicode characters `LJ' and `NJ', perhaps produced by one-to-one translation from the Cyrillic) is to be written as part of a book title, in capitals and lowercase. The strategy of making the first letter uppercase and the rest lowercase results in "LJubinje", which is most unfortunate. The solution is that there must be a third form, called a titlecase form. The titlecase form of `LJ' is `Lj' (\u01C8) and the titlecase form of `NJ' is `Nj'. A word for a book title is then best rendered by converting the first letter to titlecase if possible, otherwise to uppercase; the remaining letters are then converted to lowercase.
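The book-title strategy just described can be sketched as follows. This is a minimal illustration using the toTitleCase (§20.5.22) and toLowerCase (§20.5.20) methods of this class; the class and method names are hypothetical. It relies on toTitleCase returning the uppercase equivalent for characters that have no separate titlecase form.

```java
public class TitleWord {
    /** Converts the first character to titlecase and the remaining characters to lowercase. */
    public static String toTitleWord(String word) {
        if (word.isEmpty()) {
            return word;
        }
        StringBuilder sb = new StringBuilder(word.length());
        // First character: titlecase if there is one, otherwise uppercase.
        sb.append(Character.toTitleCase(word.charAt(0)));
        // Remaining characters: lowercase.
        for (int i = 1; i < word.length(); i++) {
            sb.append(Character.toLowerCase(word.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The six-character word from the text: LJ, U, B, I, NJ, E.
        String word = "\u01C7UBI\u01CAE";
        String title = toTitleWord(word);
        for (int i = 0; i < title.length(); i++) {
            System.out.printf("%04X ", (int) title.charAt(i));
        }
        System.out.println();
    }
}
```

For the six-character word from the text, the first character of the result is \u01C8 rather than \u01C7, and the fifth is the lowercase \u01CC.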
A character is considered to be titlecase if and only if both of the following are true:
- The character ch is not in the range \u2000 through \u2FFF.
For Unicode 1.1.5 as corrected above, isTitleCase returns true for the following characters:
\u01C5 LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
\u01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J
\u01CB LATIN CAPITAL LETTER N WITH SMALL LETTER J
\u01F2 LATIN CAPITAL LETTER D WITH SMALL LETTER Z
[This method is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5.]
20.5.14 public static boolean isDigit(char ch)
The result is true if and only if the character argument is a digit.
A character is considered to be a digit if and only if both of the following are true:
- The character ch is not in the range \u2000 through \u2FFF.
- Its name in the Unicode attribute table contains the word DIGIT.
It follows, then, that for Unicode 1.1.5 as corrected above, the digits are exactly those with codes in the following ranges:
0030-0039  ISO-Latin-1 (and ASCII) digits ('0'-'9')
0660-0669  Arabic-Indic digits
06F0-06F9  Eastern Arabic-Indic digits
0966-096F  Devanagari digits
09E6-09EF  Bengali digits
0A66-0A6F  Gurmukhi digits
0AE6-0AEF  Gujarati digits
0B66-0B6F  Oriya digits
0BE7-0BEF  Tamil digits (there are only nine of these; there is no zero digit)
0C66-0C6F  Telugu digits
0CE6-0CEF  Kannada digits
0D66-0D6F  Malayalam digits
0E50-0E59  Thai digits
0ED0-0ED9  Lao digits
FF10-FF19  Fullwidth digits
Of the first 128 Unicode characters, exactly 10 are considered to be digits:
0123456789
[This specification for the method isDigit is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5. In previous versions of Java, this method returns false for all arguments larger than \u00FF.]
20.5.15 public static boolean isLetter(char ch)
The result is true if and only if the character argument is a letter.
A character is considered to be a letter if and only if it is a letter or digit (§20.5.16) but is not a digit (§20.5.14).
[This method is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5.]
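The definition of a letter in terms of the other two predicates can be written down directly. The sketch below does so and counts the letters among the first 128 characters. The helper names are hypothetical, and the library a reader runs this against will typically use attribute tables newer than Unicode 1.1.5, so agreement with this section outside the ASCII range is not assumed.

```java
public class LetterCheck {
    /** A letter is a letter-or-digit that is not a digit. */
    public static boolean isLetterPerDefinition(char ch) {
        return Character.isLetterOrDigit(ch) && !Character.isDigit(ch);
    }

    /** Counts the characters below 128 that satisfy the definition. */
    public static int countAsciiLetters() {
        int count = 0;
        for (char ch = 0; ch < 128; ch++) {
            if (isLetterPerDefinition(ch)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countAsciiLetters());
    }
}
```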
20.5.16 public static boolean isLetterOrDigit(char ch)
The result is true if and only if the character argument is a letter-or-digit.
A character is considered to be a letter-or-digit if and only if it is a defined Unicode character (§20.5.10) and its code lies in one of the following ranges:
0030-0039  ISO-Latin-1 (and ASCII) digits ('0'-'9')
0041-005A  ISO-Latin-1 (and ASCII) uppercase Latin letters ('A'-'Z')
0061-007A  ISO-Latin-1 (and ASCII) lowercase Latin letters ('a'-'z')
00C0-00D6  ISO-Latin-1 supplementary letters
00D8-00F6  ISO-Latin-1 supplementary letters
00F8-00FF  ISO-Latin-1 supplementary letters
0100-1FFF  Latin extended-A, Latin extended-B, IPA extensions, spacing modifier letters, combining diacritical marks, basic Greek, Greek symbols and Coptic, Cyrillic, Armenian, Hebrew extended-A, Basic Hebrew, Hebrew extended-B, Basic Arabic, Arabic extended, Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Thai, Lao, Basic Georgian, Georgian extended, Hangul Jamo, Latin extended additional, Greek extended
3040-9FFF  Hiragana, Katakana, Bopomofo, Hangul compatibility Jamo, CJK miscellaneous, enclosed CJK characters and months, CJK compatibility, Hangul, Hangul supplementary-A, Hangul supplementary-B, CJK unified ideographs
F900-FDFF  CJK compatibility ideographs, alphabetic presentation forms, Arabic presentation forms-A
FE70-FEFE  Arabic presentation forms-B
FF10-FF19  Fullwidth digits
FF21-FF3A  Fullwidth Latin uppercase
FF41-FF5A  Fullwidth Latin lowercase
FF66-FFDC  Halfwidth Katakana and Hangul
It follows, then, that for Unicode 1.1.5 as corrected above, the Unicode letters and digits are exactly those with codes in the following list, which contains both single codes and inclusive ranges:
0030-0039, 0041-005A, 0061-007A, 00C0-00D6, 00D8-00F6, 00F8-01F5, 01FA-0217, 0250-02A8, 02B0-02DE, 02E0-02E9, 0300-0345, 0360-0361, 0374-0375, 037A, 037E, 0384-038A, 038C, 038E-03A1, 03A3-03CE, 03D0-03D6, 03DA, 03DC, 03DE, 03E0, 03E2-03F3, 0401-040C, 040E-044F, 0451-045C, 045E-0486, 0490-04C4, 04C7-04C8, 04CB-04CC, 04D0-04EB, 04EE-04F5, 04F8-04F9, 0531-0556, 0559-055F, 0561-0587, 0589, 05B0-05B9, 05BB-05C3, 05D0-05EA, 05F0-05F4, 060C, 061B, 061F, 0621-063A, 0640-0652, 0660-066D, 0670-06B7, 06BA-06BE, 06C0-06CE, 06D0-06ED, 06F0-06F9, 0901-0903, 0905-0939, 093C-094D, 0950-0954, 0958-0970, 0981-0983, 0985-098C, 098F-0990, 0993-09A8, 09AA-09B0, 09B2, 09B6-09B9, 09BC, 09BE-09C4, 09C7-09C8, 09CB-09CD, 09D7, 09DC-09DD, 09DF-09E3, 09E6-09FA, 0A02, 0A05-0A0A, 0A0F-0A10, 0A13-0A28, 0A2A-0A30, 0A32-0A33, 0A35-0A36, 0A38-0A39, 0A3C, 0A3E-0A42, 0A47-0A48, 0A4B-0A4D, 0A59-0A5C, 0A5E, 0A66-0A74, 0A81-0A83, 0A85-0A8B, 0A8D, 0A8F-0A91, 0A93-0AA8, 0AAA-0AB0, 0AB2-0AB3, 0AB5-0AB9, 0ABC-0AC5, 0AC7-0AC9, 0ACB-0ACD, 0AD0, 0AE0, 0AE6-0AEF, 0B01-0B03, 0B05-0B0C, 0B0F-0B10, 0B13-0B28, 0B2A-0B30, 0B32-0B33, 0B36-0B39, 0B3C-0B43, 0B47-0B48, 0B4B-0B4D, 0B56-0B57, 0B5C-0B5D, 0B5F-0B61, 0B66-0B70, 0B82-0B83, 0B85-0B8A, 0B8E-0B90, 0B92-0B95, 0B99-0B9A, 0B9C, 0B9E-0B9F, 0BA3-0BA4, 0BA8-0BAA, 0BAE-0BB5, 0BB7-0BB9, 0BBE-0BC2, 0BC6-0BC8, 0BCA-0BCD, 0BD7, 0BE7-0BF2, 0C01-0C03, 0C05-0C0C, 0C0E-0C10, 0C12-0C28, 0C2A-0C33, 0C35-0C39, 0C3E-0C44, 0C46-0C48, 0C4A-0C4D, 0C55-0C56, 0C60-0C61, 0C66-0C6F, 0C82-0C83, 0C85-0C8C, 0C8E-0C90, 0C92-0CA8, 0CAA-0CB3, 0CB5-0CB9, 0CBE-0CC4, 0CC6-0CC8, 0CCA-0CCD, 0CD5-0CD6, 0CDE, 0CE0-0CE1, 0CE6-0CEF, 0D02-0D03, 0D05-0D0C, 0D0E-0D10, 0D12-0D28, 0D2A-0D39, 0D3E-0D43, 0D46-0D48, 0D4A-0D4D, 0D57, 0D60-0D61, 0D66-0D6F, 0E01-0E3A, 0E3F-0E5B, 0E81-0E82, 0E84, 0E87-0E88, 0E8A, 0E8D, 0E94-0E97, 0E99-0E9F, 0EA1-0EA3, 0EA5, 0EA7, 0EAA-0EAB, 0EAD-0EB9, 0EBB-0EBD, 0EC0-0EC4, 0EC6, 0EC8-0ECD, 0ED0-0ED9, 0EDC-0EDD, 10A0-10C5, 10D0-10F6, 10FB, 1100-1159, 115F-11A2, 11A8-11F9, 1E00-1E9A, 1EA0-1EF9, 1F00-1F15, 1F18-1F1D, 1F20-1F45, 1F48-1F4D, 1F50-1F57, 1F59, 1F5B, 1F5D, 1F5F-1F7D, 1F80-1FB4, 1FB6-1FC4, 1FC6-1FD3, 1FD6-1FDB, 1FDD-1FEF, 1FF2-1FF4, 1FF6-1FFE, 3041-3094, 3099-309E, 30A1-30FE, 3105-312C, 3131-318E, 3190-319F, 3200-321C, 3220-3243, 3260-327B, 327F-32B0, 32C0-32CB, 32D0-32FE, 3300-3376, 337B-33DD, 33E0-33FE, 3400-9FA5, F900-FA2D, FB00-FB06, FB13-FB17, FB1E-FB36, FB38-FB3C, FB3E, FB40-FB41, FB43-FB44, FB46-FBB1, FBD3-FD3F, FD50-FD8F, FD92-FDC7, FDF0-FDFB, FE70-FE72, FE74, FE76-FEFC, FF10-FF19, FF21-FF3A, FF41-FF5A, FF66-FFBE, FFC2-FFC7, FFCA-FFCF, FFD2-FFD7, FFDA-FFDC.
[This method is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5.]
20.5.17 public static boolean isJavaLetter(char ch)
The result is true if and only if the character argument is a character that can begin a Java identifier.
A character is considered to be a Java letter if and only if it is a letter (§20.5.15) or is the dollar sign character '$' (\u0024) or the underscore ("low line") character '_' (\u005F).
[This method is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5.]
20.5.18 public static boolean isJavaLetterOrDigit(char ch)
The result is true if and only if the character argument is a character that can occur in a Java identifier after the first character.
A character is considered to be a Java letter-or-digit if and only if it is a letter-or-digit (§20.5.16) or is the dollar sign character '$' (\u0024) or the underscore ("low line") character '_' (\u005F).
[This method is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5.]
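The two definitions above (§20.5.17 and §20.5.18) are enough to sketch a check that a string has the shape of an identifier. The sketch restates both predicates locally, in terms of isLetter and isLetterOrDigit, rather than calling the library methods of the same names. It deliberately ignores keywords and other reserved words, and the class and method names are hypothetical.

```java
public class JavaIdentifierCheck {
    /** A letter, the dollar sign, or the underscore. */
    static boolean isJavaLetter(char ch) {
        return Character.isLetter(ch) || ch == '$' || ch == '_';
    }

    /** A letter-or-digit, the dollar sign, or the underscore. */
    static boolean isJavaLetterOrDigit(char ch) {
        return Character.isLetterOrDigit(ch) || ch == '$' || ch == '_';
    }

    /** True if s is non-empty, begins with a Java letter, and continues with Java letters or digits. */
    public static boolean looksLikeIdentifier(String s) {
        if (s.isEmpty() || !isJavaLetter(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!isJavaLetterOrDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIdentifier("foo_1"));
        System.out.println(looksLikeIdentifier("1foo"));
    }
}
```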
20.5.19 public static boolean isSpace(char ch)
The result is true if the argument ch is one of the following characters:
'\t'  \u0009  HT  HORIZONTAL TABULATION
'\n'  \u000A  LF  LINE FEED (also known as NEW LINE)
'\f'  \u000C  FF  FORM FEED
'\r'  \u000D  CR  CARRIAGE RETURN
' '   \u0020  SP  SPACE
Otherwise, the result is false.
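The table above translates directly into code. The following sketch restates it as a local method instead of calling the library method; the class name is hypothetical.

```java
public class SpaceCheck {
    /** True for exactly the five characters listed: HT, LF, FF, CR, and SP. */
    public static boolean isSpace(char ch) {
        switch (ch) {
            case '\t':
            case '\n':
            case '\f':
            case '\r':
            case ' ':
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSpace('\t'));
        System.out.println(isSpace('a'));
    }
}
```

Characters that other conventions treat as white space, such as the vertical tabulation character with code 000B, are not in the table and so yield false here.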
20.5.20 public static char toLowerCase(char ch)
If the character ch has a lowercase equivalent specified in the Unicode attribute table, then that lowercase equivalent character is returned. Otherwise, the argument ch is returned.
The lowercase equivalents specified in the Unicode attribute table, for Unicode 1.1.5 as corrected above, are as follows, where character codes to the right of arrows are the lowercase equivalents of character codes to the left of arrows:
0041-005A → 0061-007A, 00C0-00D6 → 00E0-00F6, 00D8-00DE → 00F8-00FE, 0100-012E → 0101-012F (evens to odds), 0132-0136 → 0133-0137 (evens to odds), 0139-0147 → 013A-0148 (odds to evens), 014A-0176 → 014B-0177 (evens to odds), 0178 → 00FF, 0179-017D → 017A-017E (odds to evens), 0181 → 0253, 0182 → 0183, 0184 → 0185, 0186 → 0254, 0187 → 0188, 018A → 0257, 018B → 018C, 018E → 0258, 018F → 0259, 0190 → 025B, 0191 → 0192, 0193 → 0260, 0194 → 0263, 0196 → 0269, 0197 → 0268, 0198 → 0199, 019C → 026F, 019D → 0272, 01A0-01A4 → 01A1-01A5 (evens to odds), 01A7 → 01A8, 01A9 → 0283, 01AC → 01AD, 01AE → 0288, 01AF → 01B0, 01B1 → 028A, 01B2 → 028B, 01B3 → 01B4, 01B5 → 01B6, 01B7 → 0292, 01B8 → 01B9, 01BC → 01BD, 01C4 → 01C6, 01C5 → 01C6, 01C7 → 01C9, 01C8 → 01C9, 01CA → 01CC, 01CB-01DB → 01CC-01DC (odds to evens), 01DE-01EE → 01DF-01EF (evens to odds), 01F1 → 01F3, 01F2 → 01F3, 01F4 → 01F5, 01FA-0216 → 01FB-0217 (evens to odds), 0386 → 03AC, 0388-038A → 03AD-03AF, 038C → 03CC, 038E → 03CD, 038F → 03CE, 0391-03A1 → 03B1-03C1, 03A3-03AB → 03C3-03CB, 03E2-03EE → 03E3-03EF (evens to odds), 0401-040C → 0451-045C, 040E → 045E, 040F → 045F, 0410-042F → 0430-044F, 0460-0480 → 0461-0481 (evens to odds), 0490-04BE → 0491-04BF (evens to odds), 04C1 → 04C2, 04C3 → 04C4, 04C7 → 04C8, 04CB → 04CC, 04D0-04EA → 04D1-04EB (evens to odds), 04EE-04F4 → 04EF-04F5 (evens to odds), 04F8 → 04F9, 0531-0556 → 0561-0586, 10A0-10C5 → 10D0-10F5, 1E00-1E94 → 1E01-1E95 (evens to odds), 1EA0-1EF8 → 1EA1-1EF9 (evens to odds), 1F08-1F0F → 1F00-1F07, 1F18-1F1D → 1F10-1F15, 1F28-1F2F → 1F20-1F27, 1F38-1F3F → 1F30-1F37, 1F48-1F4D → 1F40-1F45, 1F59 → 1F51, 1F5B → 1F53, 1F5D → 1F55, 1F5F → 1F57, 1F68-1F6F → 1F60-1F67, 1F88-1F8F → 1F80-1F87, 1F98-1F9F → 1F90-1F97, 1FA8-1FAF → 1FA0-1FA7, 1FB8 → 1FB0, 1FB9 → 1FB1, 1FBA → 1F70, 1FBB → 1F71, 1FBC → 1FB3, 1FC8-1FCB → 1F72-1F75, 1FCC → 1FC3, 1FD8 → 1FD0, 1FD9 → 1FD1, 1FDA → 1F76, 1FDB → 1F77, 1FE8 → 1FE0, 1FE9 → 1FE1, 1FEA → 1F7A, 1FEB → 1F7B, 1FEC → 1FE5, 1FF8 → 1F78, 1FF9 → 1F79, 1FFA → 1F7C, 1FFB → 1F7D, 1FFC → 1FF3, 2160-216F → 2170-217F, 24B6-24CF → 24D0-24E9, FF21-FF3A → FF41-FF5A.
Note that the method isLowerCase (§20.5.11) will not necessarily return true when given the result of the toLowerCase method.
[This specification for the method toLowerCase is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5. In previous versions of Java, this method returns its argument for all arguments larger than \u00FF.]
20.5.21 public static char toUpperCase(char ch)
If the character ch has an uppercase equivalent specified in the Unicode attribute table, then that uppercase equivalent character is returned. Otherwise, the argument ch is returned.
The uppercase equivalents specified in the Unicode attribute table, for Unicode 1.1.5 as corrected above, are as follows, where character codes to the right of arrows are the uppercase equivalents of character codes to the left of arrows:
0061-007A → 0041-005A, 00E0-00F6 → 00C0-00D6, 00F8-00FE → 00D8-00DE, 00FF → 0178, 0101-012F → 0100-012E (odds to evens), 0133-0137 → 0132-0136 (odds to evens), 013A-0148 → 0139-0147 (evens to odds), 014B-0177 → 014A-0176 (odds to evens), 017A-017E → 0179-017D (evens to odds), 017F → 0053, 0183-0185 → 0182-0184 (odds to evens), 0188 → 0187, 018C → 018B, 0192 → 0191, 0199 → 0198, 01A1-01A5 → 01A0-01A4 (odds to evens), 01A8 → 01A7, 01AD → 01AC, 01B0 → 01AF, 01B4 → 01B3, 01B6 → 01B5, 01B9 → 01B8, 01BD → 01BC, 01C5 → 01C4, 01C6 → 01C4, 01C8 → 01C7, 01C9 → 01C7, 01CB → 01CA, 01CC → 01CA, 01CE-01DC → 01CD-01DB (evens to odds), 01DF-01EF → 01DE-01EE (odds to evens), 01F2 → 01F1, 01F3 → 01F1, 01F5 → 01F4, 01FB-0217 → 01FA-0216 (odds to evens), 0253 → 0181, 0254 → 0186, 0257 → 018A, 0258 → 018E, 0259 → 018F, 025B → 0190, 0260 → 0193, 0263 → 0194, 0268 → 0197, 0269 → 0196, 026F → 019C, 0272 → 019D, 0283 → 01A9, 0288 → 01AE, 028A → 01B1, 028B → 01B2, 0292 → 01B7, 03AC → 0386, 03AD-03AF → 0388-038A, 03B1-03C1 → 0391-03A1, 03C2 → 03A3, 03C3-03CB → 03A3-03AB, 03CC → 038C, 03CD → 038E, 03CE → 038F, 03D0 → 0392, 03D1 → 0398, 03D5 → 03A6, 03D6 → 03A0, 03E3-03EF → 03E2-03EE (odds to evens), 03F0 → 039A, 03F1 → 03A1, 0430-044F → 0410-042F, 0451-045C → 0401-040C, 045E → 040E, 045F → 040F, 0461-0481 → 0460-0480 (odds to evens), 0491-04BF → 0490-04BE (odds to evens), 04C2 → 04C1, 04C4 → 04C3, 04C8 → 04C7, 04CC → 04CB, 04D1-04EB → 04D0-04EA (odds to evens), 04EF-04F5 → 04EE-04F4 (odds to evens), 04F9 → 04F8, 0561-0586 → 0531-0556, 1E01-1E95 → 1E00-1E94 (odds to evens), 1EA1-1EF9 → 1EA0-1EF8 (odds to evens), 1F00-1F07 → 1F08-1F0F, 1F10-1F15 → 1F18-1F1D, 1F20-1F27 → 1F28-1F2F, 1F30-1F37 → 1F38-1F3F, 1F40-1F45 → 1F48-1F4D, 1F51 → 1F59, 1F53 → 1F5B, 1F55 → 1F5D, 1F57 → 1F5F, 1F60-1F67 → 1F68-1F6F, 1F70 → 1FBA, 1F71 → 1FBB, 1F72-1F75 → 1FC8-1FCB, 1F76 → 1FDA, 1F77 → 1FDB, 1F78 → 1FF8, 1F79 → 1FF9, 1F7A → 1FEA, 1F7B → 1FEB, 1F7C → 1FFA, 1F7D → 1FFB, 1F80-1F87 → 1F88-1F8F, 1F90-1F97 → 1F98-1F9F, 1FA0-1FA7 → 1FA8-1FAF, 1FB0 → 1FB8, 1FB1 → 1FB9, 1FB3 → 1FBC, 1FC3 → 1FCC, 1FD0 → 1FD8, 1FD1 → 1FD9, 1FE0 → 1FE8, 1FE1 → 1FE9, 1FE5 → 1FEC, 1FF3 → 1FFC, 2170-217F → 2160-216F, 24D0-24E9 → 24B6-24CF, FF41-FF5A → FF21-FF3A.
Note that the method isUpperCase (§20.5.12) will not necessarily return true when given the result of the toUpperCase method.
[This specification for the method toUpperCase is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5. In previous versions of Java, this method returns its argument for all arguments larger than \u00FE. Note that although \u00FF is a lowercase character, its uppercase equivalent is \u0178; toUpperCase in versions of Java prior to version 1.1 simply does not consistently handle or use Unicode character codes above \u00FF.]
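Two points from this section can be observed directly: some mappings cross the \u00FF boundary in either direction, and the result of toUpperCase need not itself be uppercase, because an argument with no uppercase equivalent is returned unchanged. The sketch below is illustrative only; the helper names are hypothetical, and the library it runs against may use a later Unicode version than the tables given here.

```java
public class UpperCaseDemo {
    /** Four-digit hexadecimal code of the uppercase equivalent of ch. */
    public static String upperHex(char ch) {
        return String.format("%04X", (int) Character.toUpperCase(ch));
    }

    /** Whether the result of toUpperCase is itself classified as uppercase. */
    public static boolean resultIsUpperCase(char ch) {
        return Character.isUpperCase(Character.toUpperCase(ch));
    }

    public static void main(String[] args) {
        char[] samples = { (char) 0x00FF, (char) 0x017F, (char) 0x00DF, '1', 'a' };
        for (char ch : samples) {
            System.out.printf("%04X -> %s, result is uppercase: %b%n",
                    (int) ch, upperHex(ch), resultIsUpperCase(ch));
        }
    }
}
```

The character 00FF maps upward to 0178, the character 017F maps downward to 0053, and the character 00DF, which has no entry in the table of uppercase equivalents above, is returned as is.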
20.5.22 public static char toTitleCase(char ch)
If the character ch has a titlecase equivalent specified in the Unicode attribute table, then that titlecase equivalent character is returned; otherwise, the argument ch is returned.
Note that the method isTitleCase (§20.5.13) will not necessarily return true when given the result of the toTitleCase method. The Unicode attribute table always has the titlecase attribute equal to the uppercase attribute for characters that have uppercase equivalents but no separate titlecase form.
Example: Character.toTitleCase('a') returns 'A'
Example: Character.toTitleCase('Q') returns 'Q'
Example: Character.toTitleCase('lj') returns 'Lj'
where 'lj' is the Unicode character \u01C9 and 'Lj' is its titlecase equivalent character \u01C8.
[This method is scheduled for introduction in Java version 1.1.]
20.5.23 public static int digit(char ch, int radix)
Returns the numeric value of the character ch considered as a digit in the specified radix. If the value of radix is not a valid radix, or the character ch is not a valid digit in the specified radix, then -1 is returned.
A radix is valid if and only if its value is not less than Character.MIN_RADIX (§20.5.3) and not greater than Character.MAX_RADIX (§20.5.4).
A character is a valid digit if and only if one of the following is true:
- The method isDigit returns true for the character, and the decimal digit value of the character, as specified in the Unicode attribute table, is less than the specified radix. In this case, the decimal digit value is returned.
- The character is one of the uppercase Latin letters 'A'-'Z' (\u0041-\u005A) and its code is less than radix+'A'-10. In this case ch-'A'+10 is returned.
- The character is one of the lowercase Latin letters 'a'-'z' (\u0061-\u007A) and its code is less than radix+'a'-10. In this case ch-'a'+10 is returned.
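The rules above are what a string-to-integer conversion needs. The following sketch accumulates a nonnegative value digit by digit using the digit method; the class and method names are hypothetical, and overflow is deliberately not checked.

```java
public class RadixParser {
    /** Parses a nonnegative number written in the given radix; no overflow check in this sketch. */
    public static int parse(String s, int radix) {
        if (radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("invalid radix: " + radix);
        }
        if (s.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            // digit returns -1 when the character is not a valid digit in this radix.
            int d = Character.digit(s.charAt(i), radix);
            if (d < 0) {
                throw new NumberFormatException("not a digit in radix " + radix + " at index " + i);
            }
            value = value * radix + d;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parse("ff", 16));
        System.out.println(parse("777", 8));
    }
}
```

Because the first rule applies to any character for which isDigit returns true, the same loop also accepts digits outside the ASCII range, such as the fullwidth digits.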
[This specification for the method digit is scheduled for introduction in Java version 1.1, either as defined here, or updated for Unicode 2.0; see §20.5. In previous versions of Java, this method returns -1 for all character codes larger than \u00FF.]
20.5.24 public static char forDigit(int digit, int radix)
Returns a character that represents the given digit in the specified radix. If the value of radix is not a valid radix, or the value of digit is not a valid digit in the specified radix, the null character '\u0000' is returned.
A radix is valid if and only if its value is not less than Character.MIN_RADIX (§20.5.3) and not greater than Character.MAX_RADIX (§20.5.4).
A digit is valid if and only if it is nonnegative and less than the radix.
If the digit is less than 10, then the character value '0'+digit is returned; otherwise, 'a'+digit-10 is returned. Thus, the digits produced by forDigit, in increasing order of value, are the ASCII characters:
0123456789abcdefghijklmnopqrstuvwxyz
(these are '\u0030' through '\u0039' and '\u0061' through '\u007a'). If uppercase letters are desired, the toUpperCase method may be called on the result:
Character.toUpperCase(Character.forDigit(digit, radix))
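As a complement, the forDigit method supports the reverse conversion, from an integer to its representation in a given radix. The sketch below handles nonnegative values only and optionally applies toUpperCase to each digit as suggested above; the class and method names are hypothetical.

```java
public class RadixFormatter {
    /** Formats a nonnegative value in the given radix, optionally with uppercase letters. */
    public static String format(int value, int radix, boolean upper) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are outside this sketch");
        }
        if (radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("invalid radix: " + radix);
        }
        StringBuilder sb = new StringBuilder();
        do {
            // value % radix is nonnegative and less than radix, so it is always a valid digit.
            char c = Character.forDigit(value % radix, radix);
            sb.append(upper ? Character.toUpperCase(c) : c);
            value /= radix;
        } while (value > 0);
        // Digits were produced least significant first.
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(format(255, 16, false));
        System.out.println(format(255, 16, true));
    }
}
```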
Java Language Specification (HTML generated by Suzette Pelouch on February 24, 1998)
Copyright © 1996 Sun Microsystems, Inc.
All rights reserved
Please send any comments or corrections to doug.kramer@sun.com