This article looks at the CLDR locale data that was bundled with Java 1.8 (but disabled by default) and enabled by default in Java 9. For details, see the CLDR site; in brief, CLDR (Common Locale Data Repository) is a Unicode Consortium project that builds a database of locale-specific data from around the world: date formats, currency names, country names, day-of-week names, number formats, and so on. The data is maintained and published in an XML format called LDML (Locale Data Markup Language). Java incorporates this data, though not all of it.
What prompted this article was the post "Problems and solutions caused by migrating Nulab's account infrastructure to Java 9". After reading it, I wondered what kind of code would be affected by the part quoted below.
**Date and currency formats have changed** Automated tests failed because run-time behavior changed between Java 8 and 9.
Date formatting is internationalized, and Java 9 changed the default source of the internationalization data to CLDR (Common Locale Data Repository), the de facto standard for internationalization defined by the Unicode Consortium (JEP 252).
environment
reference
I briefly investigated how behavior differs between Java 8 (Oracle JDK), Java 9 (OpenJDK), and Java 10 (OpenJDK).
Locale.toLanguageTag
Returns a well-formed IETF BCP 47 language tag that represents this locale.
Locale.getDefault().toLanguageTag(); // → (1)
new Locale("ja", "JP").toLanguageTag(); // → (2)
new Locale("ja", "JP", "JP").toLanguageTag(); // → (3)
**Output result**
pattern | 1.8.0 | 9.0.4 | 10.0.1 |
---|---|---|---|
1 | ja-JP | ja-JP | ja-JP |
2 | ja-JP | ja-JP | ja-JP |
3 | ja-JP-u-ca-japanese-x-lvariant-JP | ja-JP-u-ca-japanese-x-lvariant-JP | ja-JP-u-ca-japanese-x-lvariant-JP |
Locale (Java SE 10 & JDK 10) Two non-compliant locales are treated as a special case for compatibility. These are ja_JP_JP and th_TH_TH.
In Java, ja_JP_JP has been used to represent the Japanese imperial year along with the Japanese language used in Japan. This is now represented using the Unicode locale extension by specifying the Unicode locale key ca (calendar) and type japanese. The extension u-ca-japanese is automatically added when the Locale constructor is called with the arguments "ja", "JP", "JP".
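The special-case handling described in the quote can be observed directly on the Locale object. The following is a minimal sketch; the class name LegacyLocaleDemo and the helper legacy are hypothetical names introduced only for illustration.

```java
import java.util.Locale;

class LegacyLocaleDemo {
    // The three-argument constructor is deprecated in recent JDKs, but it is
    // the form the Javadoc describes for the two compatibility locales.
    @SuppressWarnings("deprecation")
    static Locale legacy(String language, String country, String variant) {
        return new Locale(language, country, variant);
    }

    public static void main(String[] args) {
        Locale ja = legacy("ja", "JP", "JP");
        Locale th = legacy("th", "TH", "TH");
        Locale plain = legacy("ja", "JP", "");

        // Keywords that the constructor added implicitly.
        System.out.println(ja.getUnicodeLocaleType("ca"));    // japanese
        System.out.println(th.getUnicodeLocaleType("nu"));    // thai
        // An ordinary ja_JP locale carries no calendar keyword.
        System.out.println(plain.getUnicodeLocaleType("ca")); // null
        System.out.println(ja.toLanguageTag());               // ja-JP-u-ca-japanese-x-lvariant-JP
    }
}
```

The variant "JP" itself is not a valid BCP 47 variant, which is why it ends up in the private-use part (x-lvariant-JP) of the tag.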
In the "u-ca-japanese" from the Javadoc quoted above, "u" marks a Unicode locale extension, and "ca-japanese" is a keyword (a key/type pair) that overrides one of the locale's default behaviors, in this example the calendar.
+----- U extension
| +--- Keyword (Key & Type)
| |
- -----------
u-ca-japanese
  ^^ ^^^^^^^^
  |  |
  |  +--- Type (japanese = Japanese Imperial calendar)
  +------ Key (ca = Calendar algorithm)
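The same breakdown can be read back from a Locale at run time. This is a small sketch, assuming only the standard Locale accessors; the class name UnicodeExtensionDemo and the helper keywords are hypothetical.

```java
import java.util.Locale;
import java.util.TreeSet;

class UnicodeExtensionDemo {
    // Lists the Unicode locale keywords of a language tag as "key=type" pairs,
    // sorted by key so the result does not depend on iteration order.
    static String keywords(String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        StringBuilder sb = new StringBuilder();
        for (String key : new TreeSet<>(locale.getUnicodeLocaleKeys())) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(key).append('=').append(locale.getUnicodeLocaleType(key));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Locale locale = Locale.forLanguageTag("ja-JP-u-ca-japanese");
        // The whole "u" extension, without the leading "u-".
        System.out.println(locale.getExtension(Locale.UNICODE_LOCALE_EXTENSION)); // ca-japanese
        System.out.println(keywords("ja-JP-u-ca-japanese"));                      // ca=japanese
        // Dropping all extensions leaves the plain language and region.
        System.out.println(locale.stripExtensions().toLanguageTag());             // ja-JP
    }
}
```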
As of Java 9, two keys are supported: ca (calendar) and nu (numbering system).
Java 10 adds four more (JEP 314: Additional Unicode Language-Tag Extensions): cu (currency type), fw (first day of the week), rg (region override), and tz (time zone).
A locale can also be created with the Locale.forLanguageTag method, but Locale.Builder is recommended, so the examples here use the Builder to create customized locales. The following example overrides the currency with US dollars for the Japanese locale.
Locale locale = new Locale.Builder()
.setLocale(Locale.getDefault())
.setUnicodeLocaleKeyword("cu", "USD")
.build();
System.out.println(locale.toLanguageTag());
// → ja-JP-u-cu-usd
Currency currency = Currency.getInstance(locale);
System.out.println(currency.getCurrencyCode());
// → USD
System.out.println(currency.getDisplayName());
//→ US dollar
System.out.println(currency.getSymbol());
// → $
double money = 123456789.12345;
NumberFormat formatter = NumberFormat.getCurrencyInstance(locale);
formatter.setMinimumFractionDigits(3);
System.out.println(formatter.format(money));
// → $123,456,789.123
As quoted below, you can override the first day of the week with any day of the week by specifying the Unicode extension keyword "u-fw-xxx" as described in the JavaDoc of the Calendar class.
Calendar (Java SE 10 & JDK 10)
Calendar defines a locale-specific seven-day week using two parameters: the first day of the week and the minimal number of days in the first week (from 1 to 7). These numbers are taken from the locale resource data or from the locale itself when a Calendar is constructed. If the designated locale contains "fw" and/or "rg" Unicode extensions, the first day of the week is obtained according to those extensions.
Calendar calendar = Calendar.getInstance();
System.out.println(calendar.getCalendarType());
// → gregory
System.out.println(calendar.getFirstDayOfWeek());
// → 1 (Calendar.SUNDAY)
Locale locale = new Locale.Builder()
.setLocale(Locale.getDefault())
.setUnicodeLocaleKeyword("fw", "mon")
.build();
System.out.println(locale.toLanguageTag());
// → ja-JP-u-fw-mon
Calendar calendar = Calendar.getInstance(locale);
System.out.println(calendar.getCalendarType());
// → gregory
System.out.println(calendar.getFirstDayOfWeek());
// → 2 (Calendar.MONDAY)
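The example above depends on the default locale of the machine it runs on. The following sketch pins the locale to a language tag instead and also queries java.time's WeekFields, which (on Java 10 or later) honors the same "fw" extension. The class name FirstDayOfWeekDemo and its helper methods are hypothetical, and the en-US tag is chosen only for illustration.

```java
import java.time.DayOfWeek;
import java.time.temporal.WeekFields;
import java.util.Calendar;
import java.util.Locale;

class FirstDayOfWeekDemo {
    // First day of the week as a java.util.Calendar constant (1 = SUNDAY, 2 = MONDAY).
    static int calendarFirstDay(String languageTag) {
        return Calendar.getInstance(Locale.forLanguageTag(languageTag)).getFirstDayOfWeek();
    }

    // First day of the week as seen by java.time.
    static DayOfWeek weekFieldsFirstDay(String languageTag) {
        return WeekFields.of(Locale.forLanguageTag(languageTag)).getFirstDayOfWeek();
    }

    public static void main(String[] args) {
        // Without the extension, the week starts on Sunday for en-US.
        System.out.println(calendarFirstDay("en-US"));            // 1
        // With u-fw-mon, both APIs report Monday (requires Java 10 or later).
        System.out.println(calendarFirstDay("en-US-u-fw-mon"));   // 2
        System.out.println(weekFieldsFirstDay("en-US-u-fw-mon")); // MONDAY
    }
}
```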
Unicode locale extension (addition of u-ca-japanese)
Locale locale = new Locale("ja", "JP", "JP");
Date now = new Date();
DateFormat.getDateInstance(DateFormat.FULL, locale).format(now); // → (1)
DateFormat.getDateInstance(DateFormat.LONG, locale).format(now); // → (2)
DateFormat.getDateInstance(DateFormat.MEDIUM, locale).format(now); // → (3)
DateFormat.getDateInstance(DateFormat.SHORT, locale).format(now); // → (4)
Default locale
Date now = new Date();
DateFormat.getDateInstance(DateFormat.FULL).format(now); // → (5)
DateFormat.getDateInstance(DateFormat.LONG).format(now); // → (6)
DateFormat.getDateInstance(DateFormat.MEDIUM).format(now); // → (7)
DateFormat.getDateInstance(DateFormat.SHORT).format(now); // → (8)
**Output result**
pattern | 1.8.0 | 9.0.4 | 10.0.1 | Difference between 1.8 and 9 |
---|---|---|---|---|
1 | June 25, 2018 | June 25, 2018 | June 25, 2018 | |
2 | H30.06.25 | 2018.06.25 | 2018.06.25 | Yes |
3 | H30.06.25 | 2018.06.25 | 2018.06.25 | Yes |
4 | H30.06.25 | 2018.06.25 | 2018.06.25 | Yes |
5 | June 25, 2018 | Monday, June 25, 2018 | Monday, June 25, 2018 | Yes |
6 | 2018/06/25 | June 25, 2018 | June 25, 2018 | Yes |
7 | 2018/06/25 | 2018/06/25 | 2018/06/25 | |
8 | 18/06/25 | 2018/06/25 | 2018/06/25 | Yes |
Unicode locale extension (addition of u-ca-japanese)
Locale locale = new Locale("ja", "JP", "JP");
Calendar now = Calendar.getInstance(locale);
DateFormat.getDateInstance(DateFormat.FULL, locale).format(now.getTime()); // → (1)
DateFormat.getDateInstance(DateFormat.LONG, locale).format(now.getTime()); // → (2)
DateFormat.getDateInstance(DateFormat.MEDIUM, locale).format(now.getTime()); // → (3)
DateFormat.getDateInstance(DateFormat.SHORT, locale).format(now.getTime()); // → (4)
Default locale
Calendar now = Calendar.getInstance();
DateFormat.getDateInstance(DateFormat.FULL).format(now.getTime()); // → (5)
DateFormat.getDateInstance(DateFormat.LONG).format(now.getTime()); // → (6)
DateFormat.getDateInstance(DateFormat.MEDIUM).format(now.getTime()); // → (7)
DateFormat.getDateInstance(DateFormat.SHORT).format(now.getTime()); // → (8)
**Output result**
pattern | 1.8.0 | 9.0.4 | 10.0.1 | Difference between 1.8 and 9 |
---|---|---|---|---|
1 | June 25, 2018 | June 25, 2018 | June 25, 2018 | |
2 | H30.06.25 | 2018.06.25 | 2018.06.25 | Yes |
3 | H30.06.25 | 2018.06.25 | 2018.06.25 | Yes |
4 | H30.06.25 | 2018.06.25 | 2018.06.25 | Yes |
5 | June 25, 2018 | Monday, June 25, 2018 | Monday, June 25, 2018 | Yes |
6 | 2018/06/25 | June 25, 2018 | June 25, 2018 | Yes |
7 | 2018/06/25 | 2018/06/25 | 2018/06/25 | |
8 | 18/06/25 | 2018/06/25 | 2018/06/25 | Yes |
Unicode locale extension (addition of u-ca-japanese)
Locale locale = new Locale("ja", "JP", "JP");
LocalDateTime now = LocalDateTime.now();
DateTimeFormatter.ofLocalizedDate(FormatStyle.FULL).withLocale(locale).format(now); // → (1)
DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).withLocale(locale).format(now); // → (2)
DateTimeFormatter.ofLocalizedDate(FormatStyle.MEDIUM).withLocale(locale).format(now); // → (3)
DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT).withLocale(locale).format(now); // → (4)
Default locale
LocalDateTime now = LocalDateTime.now();
DateTimeFormatter.ofLocalizedDate(FormatStyle.FULL).format(now); // → (5)
DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).format(now); // → (6)
DateTimeFormatter.ofLocalizedDate(FormatStyle.MEDIUM).format(now); // → (7)
DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT).format(now); // → (8)
**Output result**
pattern | 1.8.0 | 9.0.4 | 10.0.1 | Difference between 1.8 and 9 |
---|---|---|---|---|
1 | June 25, 2018 | Monday, June 25, 2018 | Monday, June 25, 2018 | Yes |
2 | 2018/06/25 | June 25, 2018 | June 25, 2018 | Yes |
3 | 2018/06/25 | 2018/06/25 | 2018/06/25 | |
4 | 18/06/25 | 2018/06/25 | 2018/06/25 | Yes |
5 | June 25, 2018 | Monday, June 25, 2018 | Monday, June 25, 2018 | Yes |
6 | 2018/06/25 | June 25, 2018 | June 25, 2018 | Yes |
7 | 2018/06/25 | 2018/06/25 | 2018/06/25 | |
8 | 18/06/25 | 2018/06/25 | 2018/06/25 | Yes |
DateTimeFormatter.localizedBy
The localizedBy method was introduced in Java 10. If the given locale contains Unicode extensions, the corresponding formatter settings (for example, the calendar for u-ca-japanese) are overridden. The Locale instance itself is not modified.
Locale locale = new Locale("ja", "JP", "JP");
LocalDateTime now = LocalDateTime.now();
DateTimeFormatter.ofLocalizedDate(FormatStyle.FULL).localizedBy(locale).format(now); // → (1)
DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).localizedBy(locale).format(now); // → (2)
DateTimeFormatter.ofLocalizedDate(FormatStyle.MEDIUM).localizedBy(locale).format(now); // → (3)
DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT).localizedBy(locale).format(now); // → (4)
**Output result**
pattern | 1.8.0 | 9.0.4 | 10.0.1 | Difference between 1.8 and 9 |
---|---|---|---|---|
1 | - | - | Wednesday, June 27, 2018 | - |
2 | - | - | June 27, 2018 | - |
3 | - | - | June 27, 2018 | - |
4 | - | - | H30/6/27 | - |
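The difference between withLocale and localizedBy is easier to see with a fixed date and a numeric pattern, so that the result does not depend on the current time or on localized text. The following is a sketch that requires Java 10 or later; the class name LocalizedByDemo and its helpers are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class LocalizedByDemo {
    static final Locale JAPANESE_CALENDAR = Locale.forLanguageTag("ja-JP-u-ca-japanese");
    // "y" is year-of-era, so its value depends on the chronology in effect.
    static final DateTimeFormatter BASE = DateTimeFormatter.ofPattern("y-MM-dd", Locale.ENGLISH);

    // withLocale only swaps the locale; the chronology stays ISO.
    static String viaWithLocale(LocalDate date) {
        return BASE.withLocale(JAPANESE_CALENDAR).format(date);
    }

    // localizedBy also applies the "ca" keyword, switching to the Japanese chronology.
    static String viaLocalizedBy(LocalDate date) {
        return BASE.localizedBy(JAPANESE_CALENDAR).format(date);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2018, 6, 25);
        System.out.println(viaWithLocale(date));  // 2018-06-25
        System.out.println(viaLocalizedBy(date)); // 30-06-25 (Heisei 30)
    }
}
```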
Unicode locale extension (addition of u-ca-japanese)
Locale locale = new Locale("ja", "JP", "JP");
ZonedDateTime now = ZonedDateTime.now();
DateTimeFormatter.ofLocalizedDateTime(FormatStyle.FULL, FormatStyle.FULL).withLocale(locale).format(now); // → (1)
DateTimeFormatter.ofLocalizedDateTime(FormatStyle.LONG, FormatStyle.LONG).withLocale(locale).format(now); // → (2)
DateTimeFormatter.ofLocalizedDateTime(FormatStyle.MEDIUM, FormatStyle.MEDIUM).withLocale(locale).format(now); // → (3)
DateTimeFormatter.ofLocalizedDateTime(FormatStyle.SHORT, FormatStyle.SHORT).withLocale(locale).format(now); // → (4)
Default locale
ZonedDateTime now = ZonedDateTime.now();
DateTimeFormatter.ofLocalizedDateTime(FormatStyle.FULL, FormatStyle.FULL).format(now); // → (5)
DateTimeFormatter.ofLocalizedDateTime(FormatStyle.LONG, FormatStyle.LONG).format(now); // → (6)
DateTimeFormatter.ofLocalizedDateTime(FormatStyle.MEDIUM, FormatStyle.MEDIUM).format(now); // → (7)
DateTimeFormatter.ofLocalizedDateTime(FormatStyle.SHORT, FormatStyle.SHORT).format(now); // → (8)
**Output result**
pattern | 1.8.0 | 9.0.4 | 10.0.1 | Difference between 1.8 and 9 |
---|---|---|---|---|
1 | June 25, 2018 22:20:24 JST | Monday, June 25, 2018 22:22:28 Japan Standard Time | Monday, June 25, 2018 22:24:57 Japan Standard Time | Yes |
2 | 2018/06/25 22:20:24 JST | June 25, 2018 22:22:28 JST | June 25, 2018 22:24:57 JST | Yes |
3 | 2018/06/25 22:20:24 | 2018/06/25 22:22:28 | 2018/06/25 22:24:57 | |
4 | 18/06/25 22:20 | 2018/06/25 22:22 | 2018/06/25 22:24 | Yes |
5 | June 25, 2018 22:20:24 JST | Monday, June 25, 2018 22:22:28 Japan Standard Time | Monday, June 25, 2018 22:24:57 Japan Standard Time | Yes |
6 | 2018/06/25 22:20:24 JST | June 25, 2018 22:22:28 JST | June 25, 2018 22:24:57 JST | Yes |
7 | 2018/06/25 22:20:24 | 2018/06/25 22:22:28 | 2018/06/25 22:24:57 | |
8 | 18/06/25 22:20 | 2018/06/25 22:22 | 2018/06/25 22:24 | Yes |
When an explicit pattern string is specified, there was no difference between the versions.
Locale locale = new Locale("ja", "JP", "JP");
LocalDateTime now = LocalDateTime.now();
DateTimeFormatter.ofPattern("G yyyy-MM-dd (E) a HH:mm:ss.SSS", locale).format(now); // → (1)
DateTimeFormatter.ofPattern("G yyyy-MM-dd (E) a HH:mm:ss.SSS").format(now); // → (2)
DateTimeFormatter.ofPattern("G yy-MM-dd (E) a HH:mm:ss.SSS").localizedBy(locale).format(now); // → (3)
Locale locale = new Locale("ja", "JP", "JP");
ZonedDateTime now = ZonedDateTime.now();
DateTimeFormatter.ofPattern("G yyyy-MM-dd (E) a HH:mm:ss.SSS zzz", locale).format(now); // → (4)
DateTimeFormatter.ofPattern("G yyyy-MM-dd (E) a HH:mm:ss.SSS zzz").format(now); // → (5)
DateTimeFormatter.ofPattern("G yy-MM-dd (E) a HH:mm:ss.SSS zzz").localizedBy(locale).format(now); // → (6)
**Output result**
pattern | 1.8.0 | 9.0.4 | 10.0.1 | Difference between 1.8 and 9 |
---|---|---|---|---|
1 | AD 2018-06-26 (Tue) AM 00:01:45.267 | AD 2018-06-26 (Tue) AM 00:02:50.751 | AD 2018-06-26 (Tue) AM 00:03:56.584 | |
2 | AD 2018-06-26 (Tue) AM 00:01:45.267 | AD 2018-06-26 (Tue) AM 00:02:50.751 | AD 2018-06-26 (Tue) AM 00:03:56.584 | |
3 | - | - | Heisei 30-06-26 (Tue) AM 00:26:03.888 | - |
4 | AD 2018-06-26 (Tue) AM 00:01:45.269 JST | AD 2018-06-26 (Tue) AM 00:02:50.767 JST | AD 2018-06-26 (Tue) AM 00:03:56.601 JST | |
5 | AD 2018-06-26 (Tue) AM 00:01:45.269 JST | AD 2018-06-26 (Tue) AM 00:02:50.767 JST | AD 2018-06-26 (Tue) AM 00:03:56.601 JST | |
6 | - | - | Heisei 30-06-26 (Tue) AM 00:26:03.898 JST | - |
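Since explicit patterns behaved the same on every version tested here, one way to keep tests stable across the Java 8 to 9 change is to format with a fixed, numeric-only pattern and an explicit locale. This is a sketch under that assumption; the class name FixedPatternDemo is hypothetical, and text fields such as day names or AM/PM markers are deliberately left out because their wording comes from the locale data.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class FixedPatternDemo {
    // Numeric fields only, with an explicit locale instead of the JVM default.
    static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSS", Locale.ROOT);

    static String format(LocalDateTime dateTime) {
        return TIMESTAMP.format(dateTime);
    }

    public static void main(String[] args) {
        LocalDateTime dateTime = LocalDateTime.of(2018, 6, 26, 0, 1, 45, 267_000_000);
        System.out.println(format(dateTime)); // 2018-06-26 00:01:45.267
    }
}
```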
Unicode Technical Standard #35
UNICODE LOCALE DATA MARKUP LANGUAGE (LDML)
Internationalization extensions in Java SE 6
This page does not mention CLDR. Although unrelated to CLDR, Java SE 6 added support for the Japanese imperial calendar.
**Japanese calendar support**
A new Calendar implementation has been added to support Japanese imperial era-based year numbering, such as Heisei 17 for 2005 (Gregorian calendar). An instance of this Japanese calendar can be created by specifying Locale("ja", "JP", "JP") in the Calendar.getInstance factory. The java.text.SimpleDateFormat class supports calendar-specific year and era formats for calendars other than the Gregorian calendar.
Locale locale = new Locale("ja", "JP", "JP");
Calendar calendar = Calendar.getInstance(locale);
System.out.println(calendar.getClass().getCanonicalName());
// → java.util.JapaneseImperialCalendar
System.out.println(new SimpleDateFormat("Gyy year MM month dd day(E)", locale).format(calendar.getTime()));
//→ June 25, 2018(Month)
Calendar calendar = Calendar.getInstance();
System.out.println(calendar.getClass().getCanonicalName());
// → java.util.GregorianCalendar
System.out.println(new SimpleDateFormat("Gyyyy year MM month dd day(E)").format(calendar.getTime()));
//→ June 25, 2018 AD(Month)
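The era-based year numbering can also be checked at the field level, without going through a formatter. The following sketch converts a Gregorian date to the year reported by the Japanese imperial calendar. The class name ImperialYearDemo and the helper imperialYear are hypothetical, and the locale is built from the language tag ja-JP-u-ca-japanese rather than the three-argument constructor.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

class ImperialYearDemo {
    // Returns the era-based year that the Japanese imperial calendar reports
    // for the given Gregorian date (evaluated at the start of day in Tokyo).
    static int imperialYear(int gregorianYear, int month, int day) {
        ZoneId tokyo = ZoneId.of("Asia/Tokyo");
        long epochMilli = LocalDate.of(gregorianYear, month, day)
                .atStartOfDay(tokyo)
                .toInstant()
                .toEpochMilli();
        Calendar calendar = Calendar.getInstance(
                TimeZone.getTimeZone(tokyo),
                Locale.forLanguageTag("ja-JP-u-ca-japanese"));
        calendar.setTimeInMillis(epochMilli);
        return calendar.get(Calendar.YEAR);
    }

    public static void main(String[] args) {
        System.out.println(imperialYear(2005, 6, 25)); // 17 (Heisei 17)
        System.out.println(imperialYear(2018, 6, 25)); // 30 (Heisei 30)
    }
}
```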
Internationalization extensions in Java SE 7
**Locale class supports BCP 47 and UTR 35**
The Locale class has been updated to implement identifiers interchangeable with BCP 47 (IETF BCP 47, "Tags for Identifying Languages"), with support for the BCP 47-compatible extensions for locale data exchange defined by LDML (UTS #35, "Unicode Locale Data Markup Language").
Extension of internationalization in JDK 8
**Adoption of Unicode CLDR data and the java.locale.providers system property**
The Unicode Consortium has released the Common Locale Data Repository (CLDR) project to "support the world's languages with the largest and most extensive standard locale data repository". CLDR is becoming the de facto standard for locale data.
CLDR's XML-based locale data is included in the JDK 8 release, but is disabled by default.
Default
The default behavior is equivalent to the following settings.
java.locale.providers=JRE,SPI
Internationalization extensions in JDK 9
**CLDR locale data enabled by default**
The XML-based locale data for the Unicode Common Locale Data Repository (CLDR) that was first added to JDK 8 is the default locale data for JDK 9. In previous releases, the default was JRE.
Default
If you do not set this property, the default behavior is equivalent to the following setting:
java.locale.providers=CLDR,COMPAT,SPI
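To see which locale data a given JVM is actually using, it can help to print the localized pattern rather than a formatted date. The following is a minimal probe, not part of the original investigation; the class name LocaleDataProbe is hypothetical. Running it once normally and once with -Djava.locale.providers=COMPAT,CLDR on a JDK that still ships the COMPAT data should show the two pattern variants behind the tables above.

```java
import java.time.chrono.IsoChronology;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.FormatStyle;
import java.util.Locale;

class LocaleDataProbe {
    // Returns the SHORT date pattern that the active locale data supplies.
    static String shortDatePattern(Locale locale) {
        return DateTimeFormatterBuilder.getLocalizedDateTimePattern(
                FormatStyle.SHORT, null, IsoChronology.INSTANCE, locale);
    }

    public static void main(String[] args) {
        // Shows whether the property was set explicitly on the command line.
        System.out.println("java.locale.providers = "
                + System.getProperty("java.locale.providers", "(not set)"));
        // The pattern text differs between the legacy JRE data and CLDR.
        System.out.println("SHORT date pattern for ja-JP: " + shortDatePattern(Locale.JAPAN));
    }
}
```

Whether the COMPAT provider is available depends on the JDK release, so this should be checked against the release notes of the version in use.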
Internationalization extensions in JDK 10
**Additional Unicode language tag extensions**
Java SE 9 supports only the -ca (calendar) and -nu (numbering system) extensions. Java SE 10 adds support for the following additional extensions in the associated JDK classes:
- -cu (currency type)
- -fw (first day of the week)
- -rg (Region Override)
- -tz (time zone)
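As a locale-independent illustration of two of these extensions, the sketch below resolves a Currency from a language tag carrying "cu" or "rg". It requires Java 10 or later; the class name CurrencyExtensionDemo and the example tags are hypothetical choices for illustration.

```java
import java.util.Currency;
import java.util.Locale;

class CurrencyExtensionDemo {
    // Resolves the currency code for a BCP 47 language tag.
    static String currencyCode(String languageTag) {
        return Currency.getInstance(Locale.forLanguageTag(languageTag)).getCurrencyCode();
    }

    public static void main(String[] args) {
        // No extension: the currency follows the region.
        System.out.println(currencyCode("ja-JP"));             // JPY
        // "cu" names the currency directly.
        System.out.println(currencyCode("ja-JP-u-cu-usd"));    // USD
        // "rg" overrides the region used for regional defaults.
        System.out.println(currencyCode("en-US-u-rg-gbzzzz")); // GBP
    }
}
```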
java.time: DateTimeFormatter containing "DD" fails on 3-digit day-of-year value
Java 9 fixed a bug in which specifying "DD" in the pattern string caused an exception when the day-of-year of the target date was 100 or greater.
LocalDateTime now = LocalDateTime.now();
DateTimeFormatter.ofPattern("D").format(now);
// → 177
DateTimeFormatter.ofPattern("DD").format(now); // ← throws an exception in Java 1.8
// → 177
DateTimeFormatter.ofPattern("DDD").format(now);
// → 177
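For code that has to run on both Java 1.8 and later versions, one option is to avoid "DD" altogether. The sketch below uses "DDD", which always prints three digits, with fixed dates so the result is reproducible; the class name DayOfYearDemo is hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class DayOfYearDemo {
    // "DDD" is zero-padded to three digits and covers every day of the year.
    static final DateTimeFormatter PADDED = DateTimeFormatter.ofPattern("DDD");
    // "D" prints the day-of-year without padding.
    static final DateTimeFormatter PLAIN = DateTimeFormatter.ofPattern("D");

    static String padded(LocalDate date) {
        return PADDED.format(date);
    }

    static String plain(LocalDate date) {
        return PLAIN.format(date);
    }

    public static void main(String[] args) {
        System.out.println(padded(LocalDate.of(2018, 6, 26))); // 177
        System.out.println(padded(LocalDate.of(2018, 1, 5)));  // 005
        System.out.println(plain(LocalDate.of(2018, 1, 5)));   // 5
    }
}
```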
DateTimeFormatter won't parse dates with custom format "yyyyMMddHHmmssSSS"
Java 9 fixed a Java 1.8 bug in which specifying "yyyyMMddHHmmssSSS" in the pattern string caused an exception when parsing.
LocalDateTime now = LocalDateTime.now();
DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS").format(now);
// → 20180625222024147
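As the bug title indicates, the failure concerns parsing. A commonly used workaround that avoids the problematic pattern is to append the millisecond field as a fixed-width value through DateTimeFormatterBuilder. The following is a sketch of that workaround, not the fix applied in the JDK; the class name CompactTimestampDemo is hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

class CompactTimestampDemo {
    // The milliseconds are appended as a fixed-width (3 digit) value instead of "SSS".
    static final DateTimeFormatter COMPACT = new DateTimeFormatterBuilder()
            .appendPattern("yyyyMMddHHmmss")
            .appendValue(ChronoField.MILLI_OF_SECOND, 3)
            .toFormatter();

    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, COMPACT);
    }

    static String format(LocalDateTime dateTime) {
        return COMPACT.format(dateTime);
    }

    public static void main(String[] args) {
        LocalDateTime parsed = parse("20180625222024147");
        System.out.println(parsed);         // 2018-06-25T22:20:24.147
        System.out.println(format(parsed)); // 20180625222024147
    }
}
```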
java.time.format.FormatStyle.LONG or FULL causes unchecked exception
Formatting a LocalDateTime with DateTimeFormatter.ofLocalizedDateTime(FormatStyle.FULL) raises an exception because the FULL style includes the time zone, which a LocalDateTime does not carry. This is by design and not a bug.
LocalDateTime now = LocalDateTime.now();
DateTimeFormatter.ofLocalizedDateTime(FormatStyle.FULL).format(now);
// → Exception in thread "main" java.time.DateTimeException: Unable to extract ZoneId from temporal 2018-06-26T01:00:54.557262700
A ZonedDateTime can be formatted without any problem.
ZonedDateTime now = ZonedDateTime.now();
DateTimeFormatter.ofLocalizedDateTime(FormatStyle.FULL, FormatStyle.FULL).format(now);
//→ Monday, June 25, 2018 22:24:57 Japan Standard Time
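When the value at hand is a LocalDateTime, a zone can be attached just before formatting. The sketch below shows both the failure and the conversion with a fixed date and an explicit locale; the class name FullStyleDemo and its helpers are hypothetical, and the choice of Locale.US and Asia/Tokyo is only for illustration.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

class FullStyleDemo {
    static final DateTimeFormatter FULL =
            DateTimeFormatter.ofLocalizedDateTime(FormatStyle.FULL).withLocale(Locale.US);

    // Reports whether formatting a zone-less value fails, as described above.
    static boolean failsWithoutZone(LocalDateTime dateTime) {
        try {
            FULL.format(dateTime);
            return false;
        } catch (DateTimeException e) {
            return true;
        }
    }

    // Attaches a zone first, which gives the formatter the information it needs.
    static String formatInZone(LocalDateTime dateTime, String zoneId) {
        return FULL.format(dateTime.atZone(ZoneId.of(zoneId)));
    }

    public static void main(String[] args) {
        LocalDateTime dateTime = LocalDateTime.of(2018, 6, 26, 1, 0, 54);
        System.out.println(failsWithoutZone(dateTime)); // true
        // The exact wording of this line comes from the locale data in use.
        System.out.println(formatInZone(dateTime, "Asia/Tokyo"));
    }
}
```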
Support for other pattern letters (changes made to conform to the CLDR specification rather than new features)
- DateTimeFormatter pattern letters 'A','n','N'
- Add date-time patterns 'v' and 'vvvv'
- DateTimeFormatter pattern letter 'g'
- Incorrect documentation for DateTimeFormatter letter 'k'
I have only picked out the items that interested me, so this is not a complete list.
Japanese new era implementation Release Note: Japanese New Era Implementation
This prepares for the era change scheduled for May 1, 2019. For now, "New Era" is displayed as a provisional era name.
Release Note: Update locale data to Unicode CLDR v33
The CLDR data is upgraded to version 33 (http://cldr.unicode.org/index/downloads/cldr-33).