CsvUtil

CSV file processing tool - CsvUtil

Introduction

Comma-separated values (CSV), sometimes called character-separated values because the delimiter need not be a comma, is a file format that stores tabular data (numbers and text) as plain text.

Hutool provides its own CSV reading and writing implementation for this format. It is based on the FastCSV project and has no third-party dependencies.

CsvUtil is a CSV utility class that mainly wraps two methods:

  • getReader is used for reading CSV files.
  • getWriter is used for generating CSV files.

These methods return CsvReader and CsvWriter objects respectively, which can then be used on their own to read and write CSV files.

Usage

Reading CSV Files

Reading as CsvRow

CsvReader reader = CsvUtil.getReader();
// Read CSV data from a file
CsvData data = reader.read(FileUtil.file("test.csv"));
List<CsvRow> rows = data.getRows();
// Iterate through the rows
for (CsvRow csvRow : rows) {
	// getRawList returns a List where each item is a cell (separated by commas) in the CSV.
	Console.log(csvRow.getRawList());
}

The CsvRow object also records additional information about each row, such as its original line number.

Reading as Bean List

  1. Test CSV file (test_bean.csv). The first column header, 姓名, means "name" and does not match any field name in the bean:
姓名,gender,focus,age
张三,男,无,33
李四,男,好对象,23
王妹妹,女,特别关注,22
  2. Define the bean:
// Lombok annotation that generates getters, setters and toString
@Data
private static class TestBean {
	// If a CSV header does not match the field name, use the @Alias annotation to map it
	@Alias("姓名")
	private String name;
	private String gender;
	private String focus;
	private Integer age;
}
  3. Read the file:
final CsvReader reader = CsvUtil.getReader();
// Assuming the CSV file is on the classpath
final List<TestBean> result = reader.read(ResourceUtil.getUtf8Reader("test_bean.csv"), TestBean.class);
  4. Output:
CsvReaderTest.TestBean(name=张三, gender=男, focus=无, age=33)
CsvReaderTest.TestBean(name=李四, gender=男, focus=好对象, age=23)
CsvReaderTest.TestBean(name=王妹妹, gender=女, focus=特别关注, age=22)


Generating CSV Files

// Specify the path and encoding
CsvWriter writer = CsvUtil.getWriter("e:/testWrite.csv", CharsetUtil.CHARSET_UTF_8);
// Write three rows, one String[] per row
writer.write(new String[] {"a1", "b1", "c1"}, new String[] {"a2", "b2", "c2"}, new String[] {"a3", "b3", "c3"});
// Close the writer to release the underlying file
writer.close();

The result is as follows:

[Image: CSV file written using CsvUtil]

Encoding Issues

CSV files are plain text and can be encoded with different charsets, so they can be read on a variety of systems. However, if the encoding of the CSV file does not match the system’s default encoding, the text may appear garbled when the file is opened in Excel.
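The mismatch described above can be reproduced without any CSV library. The following is a minimal sketch in plain Java, not part of Hutool: the class name EncodingMismatchDemo and its method are hypothetical, and ISO-8859-1 merely stands in for "some default encoding that is not UTF-8".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Hutool.
final class EncodingMismatchDemo {

    /** Encodes the text as UTF-8, then decodes the bytes with ISO-8859-1 (the "wrong" charset). */
    static String decodeWithWrongCharset(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "\u5f20\u4e09" is the name 张三 from the sample CSV, written with escapes
        // so the source compiles regardless of the compiler's source encoding.
        String original = "\u5f20\u4e09,33";
        String garbled = decodeWithWrongCharset(original);

        System.out.println(original.equals(garbled)); // false
        System.out.println(original.length());        // 5
        System.out.println(garbled.length());         // 9: each 3-byte character became 3 characters
    }
}
```

The ASCII part of the row (",33") survives because UTF-8 and ISO-8859-1 agree on those bytes; only the non-ASCII cells are garbled, which matches what is typically seen in a spreadsheet.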

To address this issue, either save the CSV file in the system’s default encoding, or add a BOM (byte order mark) at the start of the file so that Excel can detect which encoding to use when parsing it.
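As an illustration of the BOM approach, here is a minimal sketch in plain Java that prepends the UTF-8 byte order mark (the bytes EF BB BF) to CSV text before writing it. It does not use Hutool; the class Utf8BomCsv and its methods are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of Hutool.
final class Utf8BomCsv {

    // The UTF-8 byte order mark.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Returns the BOM followed by the CSV text encoded as UTF-8. */
    static byte[] withBom(String csvText) {
        byte[] body = csvText.getBytes(StandardCharsets.UTF_8);
        byte[] result = new byte[UTF8_BOM.length + body.length];
        System.arraycopy(UTF8_BOM, 0, result, 0, UTF8_BOM.length);
        System.arraycopy(body, 0, result, UTF8_BOM.length, body.length);
        return result;
    }

    /** Writes the CSV text to the target file with a leading UTF-8 BOM. */
    static void write(Path target, String csvText) throws IOException {
        Files.write(target, withBom(csvText));
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("bom-demo", ".csv");
        // "\u5f20\u4e09" is the name 张三, escaped to avoid source-encoding issues.
        write(target, "name,age\r\n\u5f20\u4e09,33\r\n");
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Writing the BOM as raw bytes keeps the sketch independent of any particular writer class. A program that later reads such a file should expect the first character to be U+FEFF unless its reader strips the BOM.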