Read items containing commas in a CSV file without splitting (Java)

If a CSV file contains an item with a comma in it, splitting the record with the normal String.split method causes the problem of the item being split in the middle.

For example, suppose there is a CSV record like this.

test.csv


ABC,DEF,GHI,'1,000,000',JKL,MN

When splitting this record on commas, suppose you want to treat the string enclosed in single quotes as a single item rather than splitting it at its commas.

The expected result is then as follows.

Expected results


ABC
DEF
GHI
'1,000,000'
JKL
MN

However, if you use the split method as is, the record is divided like this.

Disappointing result of split(",")


ABC
DEF
GHI
'1
000
000'
JKL
MN
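
The result above can be reproduced with a short sketch. The class name NaiveSplitDemo and the method naiveSplit are hypothetical names used only for illustration; the record is the one from test.csv.

```java
import java.util.Arrays;
import java.util.List;

class NaiveSplitDemo {

	// Hypothetical helper: splits on every comma, whether it is quoted or not
	static List<String> naiveSplit(String line) {
		return Arrays.asList(line.split(","));
	}

	public static void main(String[] args) {
		String line = "ABC,DEF,GHI,'1,000,000',JKL,MN";
		for (String col : naiveSplit(line)) {
			System.out.println(col);
		}
	}
}
```

Because split(",") only looks at the comma itself, the quoted item ends up spread over three elements.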

I looked into various approaches, including regular expressions, but the split method is fundamentally a method that "splits when a condition is met". I could not find a way to tell it "do not split when a condition is met" (in this case the condition is "a comma surrounded by quotation marks", and I could not find a way to say "do not split only when that condition holds"), so I gave up and wrote my own method.

Main.java



import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class Main {

	public static void main(String[] args) {

		String fileName = "C:\\temp\\test.csv"; //CSV file you want to read

		//Read input CSV file (try-with-resources closes the reader automatically)
		try (BufferedReader br = new BufferedReader(
				new InputStreamReader(new FileInputStream(new File(fileName)), "SJIS"))) {

			//Read one record
			String line = br.readLine();
			//Call your own method
			List<String> data = csvSplit(line);

			for (String col : data) {
				System.out.println(col);
			}

		} catch (IOException e) {
			e.printStackTrace();
		}
	}

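	private static List<String> csvSplit(String line) {

		char c;
		StringBuilder s = new StringBuilder(); //Item currently being built
		List<String> data = new ArrayList<String>(); //Completed items
		boolean singleQuoteFlg = false; //true while inside single quotes

		for(int i=0; i < line.length(); i++){
			c = line.charAt(i);
			if (c == ',' && !singleQuoteFlg) {
				//Comma outside quotes: the item ends here
				data.add(s.toString());
				s.delete(0,s.length());
			} else if (c == ',' && singleQuoteFlg) {
				//Comma inside quotes: part of the item
				s.append(c);
			} else if (c == '\'') {
				//Single quote: toggle the flag and keep the quote
				singleQuoteFlg = !singleQuoteFlg;
				s.append(c);
			} else {
				s.append(c);
			}
		}
		//The last item has no trailing comma, so add it after the loop
		data.add(s.toString());
		return data;
	}
}

Below are the execution results.

Execution result


ABC
DEF
GHI
'1,000,000'
JKL
MN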

The commas inside the quotes are not treated as separators, and the quoted item is correctly handled as a single string.

What the code does

- Read one CSV record (String line = br.readLine();)
- Extract characters one by one from the record (c = line.charAt(i);) and store them in a StringBuilder (s.append(c);)
- Processing branches according to the character:
  - When it is a comma not surrounded by single quotes: add the accumulated string to the list as one item and clear the StringBuilder
  - When it is a comma surrounded by single quotes: append it as an ordinary character
  - When it is a single quote: toggle the flag that records whether we are inside quotes
  - Otherwise: append the character

That is the explanation of the logic. By the way, if you don't want the single quotes in the output, simply don't append them to the StringBuilder.

csvSplit after modification


	private static List<String> csvSplit(String line) {

		char c;
		StringBuilder s = new StringBuilder(); //Item currently being built
		List<String> data = new ArrayList<String>(); //Completed items
		boolean singleQuoteFlg = false; //true while inside single quotes

		for(int i=0; i < line.length(); i++){
			c = line.charAt(i);
			if (c == ',' && !singleQuoteFlg) {
				//Comma outside quotes: the item ends here
				data.add(s.toString());
				s.delete(0,s.length());
			} else if (c == ',' && singleQuoteFlg) {
				//Comma inside quotes: part of the item
				s.append(c);
			} else if (c == '\'') {
				//Single quote: toggle the flag only.
				//The quote is not appended, so single quotes are not output!
				singleQuoteFlg = !singleQuoteFlg;
			} else {
				s.append(c);
			}
		}
		//The last item has no trailing comma, so add it after the loop
		data.add(s.toString());

		return data;
	}

Execution result (without single quotes)


ABC
DEF
GHI
1,000,000
JKL
MN

That is how it turned out.
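
As one possible alternative, a regular expression with a lookahead seems able to express "a comma that is not inside single quotes". The following is only a sketch, assuming that quotes are always balanced and never escaped; the class name LookaheadSplitDemo, the constant OUTSIDE_QUOTES, and the method lookaheadSplit are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.List;

class LookaheadSplitDemo {

	// Matches a comma only when an even number of single quotes follows it
	// up to the end of the line, i.e. when the comma is outside a quoted item.
	// Assumption: quotes are balanced and there are no escaped quotes.
	private static final String OUTSIDE_QUOTES = ",(?=(?:[^']*'[^']*')*[^']*$)";

	static List<String> lookaheadSplit(String line) {
		// A negative limit keeps trailing empty items
		return Arrays.asList(line.split(OUTSIDE_QUOTES, -1));
	}

	public static void main(String[] args) {
		for (String col : lookaheadSplit("ABC,DEF,GHI,'1,000,000',JKL,MN")) {
			System.out.println(col);
		}
	}
}
```

The lookahead rescans the rest of the line for every comma, so the hand-written loop is probably easier to read and may be faster for long records.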

If there is a better way, I would appreciate hearing about it!
