Japanese addresses posted on the web usually follow a fixed format: the postal code comes first, followed by the prefecture, city, ward, and town. Many use cases, such as passing an address to Google Geocoding, involve extracting the postal code or the prefecture name with a regular expression.
The thing to watch out for is that you cannot use greedy matching to extract the prefecture or city name: write (.*?県), not (.*県). A greedy .* runs on to the last 県 in the string, so if that character appears again later in an address beginning with 愛知県名古屋市, the match extends well past 愛知県 instead of stopping there.
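The difference is easy to see on a small example. The sketch below is only an illustration: the address string is made up so that 県 occurs twice, and the class and method names (`GreedyVsLazy`, `firstGroup`) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Returns group(1) of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Illustrative address (an assumption) in which 県 appears twice.
        String address = "愛知県名古屋市中区三の丸 愛知県庁";
        // Greedy: .* backtracks only as far as the LAST 県.
        System.out.println(firstGroup("(.*県)", address));
        // Lazy: .*? stops at the FIRST 県.
        System.out.println(firstGroup("(.*?県)", address));
    }
}
```

The greedy pattern prints 愛知県名古屋市中区三の丸 愛知県, while the lazy one prints 愛知県.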
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AddressRegex {
    public static void main(String[] args) {
        String matchString = "〒066-0012\n" +
                "北海道千歳市美々 新千歳空港国内線ターミナルビル2F";
        // One optional lazy group per suffix: 都, 道, 府, 県, 市, 区.
        // The character class also skips the full-width space (U+3000).
        Matcher matcher = Pattern.compile("\\s*〒(\\d{3}-\\d{4})[\\s\\u3000]*(.*?都)?(.*?道)?(.*?府)?(.*?県)?(.*?市)?(.*?区)?").matcher(matchString);
        while (matcher.find()) {
            System.out.println("Address: " + matcher.group(0));
            System.out.println();
            System.out.println("Postal code: " + matcher.group(1));
            System.out.println();
            System.out.println("Metropolis (都): " + matcher.group(2));
            System.out.println();
            System.out.println("Circuit (道): " + matcher.group(3));
            System.out.println();
            System.out.println("Urban prefecture (府): " + matcher.group(4));
            System.out.println();
            System.out.println("Prefecture (県): " + matcher.group(5));
            System.out.println();
            System.out.println("City (市): " + matcher.group(6));
            System.out.println();
            System.out.println("Ward (区): " + matcher.group(7));
        }
    }
}
Output:
Address: 〒066-0012
北海道千歳市

Postal code: 066-0012

Metropolis (都): null

Circuit (道): 北海道

Urban prefecture (府): null

Prefecture (県): null

City (市): 千歳市

Ward (区): null
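One caveat with chaining optional lazy groups: each group stops at the first occurrence of its suffix character, wherever that is. For 京都府京都市, for instance, the 都 group would capture 京都 and leave only 府 for the 府 group. A possible alternative, sketched here as an assumption rather than as part of the approach above, is to spell out the prefecture forms in a single alternation. The names `PrefectureExtractor` and `prefectureOf` are hypothetical, and the address is assumed to start with the prefecture.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrefectureExtractor {
    // 東京都 and 北海道 are spelled out, 府 is limited to 京都 and 大阪,
    // and every 県 name is two or three characters long.
    private static final Pattern PREFECTURE =
            Pattern.compile("(東京都|北海道|(?:京都|大阪)府|.{2,3}県)");

    // Returns the prefecture at the start of the address, or null if none is found.
    static String prefectureOf(String address) {
        Matcher m = PREFECTURE.matcher(address);
        // lookingAt() anchors the match at the beginning of the input.
        return m.lookingAt() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(prefectureOf("京都府京都市中京区"));
        System.out.println(prefectureOf("愛知県名古屋市中区"));
        System.out.println(prefectureOf("北海道千歳市美々"));
    }
}
```

This prints 京都府, 愛知県, and 北海道 in turn. The trade-off is that the pattern encodes knowledge of the prefecture names instead of relying on suffix characters alone.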
It's simple, but I posted it because I think it will come up often.