Using common CSV of Apache to export CSV file

Posted by PHPoracle on Mon, 20 Jan 2020 03:58:50 +0100

1. Introduction to CSV

The full name of CSV is: Comma Separated Values or character separated values. Its files store tabular data (numbers and text) in plain text. A CSV file consists of any number of records separated by a line break character. Each record consists of fields. The separators between fields are other characters or strings, the most common being commas or tabs. Each line record is on a separate line, separated by a carriage return CRLF (that is, CRLF).

  • For excel, it is used by default to segment data.

  • The last field of each row cannot be followed by a comma

  • One record per line

  • The column is empty, and you need to specify ''

  • Split each record with carriage return CRLF(\r\n)

  • Plain text, using a character set such as ASCII, Unicode, EBCDIC, or GB2312

2. Export CSV file

The Commons CSV package of Apache open source is adopted. For details, please refer to Official website

2.1 use

2.1.1 import dependency
<dependency>  
    <groupId>org.apache.commons</groupId>  
    <artifactId>commons-csv</artifactId>  
    <version>1.7</version>  
</dependency>
2.1.2 API
/**  
 * Export csv file  
 * [@param](https://my.oschina.net/u/2303379) out Output stream  
 * [@param](https://my.oschina.net/u/2303379) iter Data me general  
 * [@param](https://my.oschina.net/u/2303379) charset Character set encoding  
 * [@param](https://my.oschina.net/u/2303379) header Header  
 */  
public void exportCSVFile(OutputStream out, Iterable<?> iter, String charset, String... header) {  
	  try {  
		  // Write bom to prevent Chinese miscoding  
		  byte[] bytes = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};  
		  out.write(bytes);  

		  OutputStreamWriter osw = new OutputStreamWriter(out, charset);  
		  CSVFormat csvFormat = CSVFormat.EXCEL.withHeader(header);  

		  CSVPrinter csvPrinter = new CSVPrinter(osw, csvFormat);  
		  csvPrinter.printRecords(iter);  
		  csvPrinter.flush();  
		  csvPrinter.close();  
	  } catch (IOException e) {  
		  e.printStackTrace();  
	  }  
}

2.2 example

List<List<Object>> lists = new ArrayList<>();  
​  
List<Object> list1 = new ArrayList<>();  
list1.add("Zhang San");  
list1.add(18);  
list1.add("male");  
lists.add(list1);  
​  
List<Object> list2 = new ArrayList<>();  
list2.add("Li Si");  
list2.add(20);  
list2.add("female");  
lists.add(list2);  

String[] header = {"Full name", "Age", "Gender"};  

// Ellipsis out  

exportCSVFile(out, lists, "UTF-8", header);
  • Excel open

  • Text editor open

    Zhang San, 18, male

    Li Si, 20, female

2.3 problem

  • Use Excel to open the Chinese code

    The above example code is written in bom and encoded as UTF-8. There are other schemes that use GBK.

  • When it is opened in Excel, the date format will be changed into ×××××××××××××××××××××××××××××××××××××××××××××××××××××××××××××××××××××××××××××

    Add tabs when writing data. If one tab is not enough, just two
    Instead of calling the printRecords(); method, call csvPrinter.print("\t" + data + "\ t");

    public void exportCSVFile(OutputStream out, Iterable<?> iter, String charset,String... header) {  
    	  try {  
    		  // Write bom to prevent Chinese miscoding  
    		  byte[] bytes = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};  
    		  out.write(bytes);  
    ​  
    		  OutputStreamWriter osw = new OutputStreamWriter(out, charset);  
    		  CSVFormat csvFormat = CSVFormat.EXCEL.withHeader(header);  
    ​  
    		  CSVPrinter csvPrinter = new CSVPrinter(osw, csvFormat);  
    		  Iterator<?> iterator = iter.iterator();  
    		  while (iterator.hasNext()) {  
    			  Collection list = (Collection)iterator.next();  
    			  // Start writing a row of data  
    			  list.forEach(c->{  
    				  try {  
    					  csvPrinter.print("\t" +c.toString() +"\t" );  
    				  } catch (IOException e) {  
    					  e.printStackTrace();  
    				  }  
    			  });  
    			  // Write a line, need to wrap  
    			  csvPrinter.println();  
    		  }  
    		  csvPrinter.flush();  
    		  csvPrinter.close();  
    	  } catch (IOException e) {  
    		  e.printStackTrace();  
    	  }  
    }
    
  • When the written data contains comma, double quotation mark and other special characters, the exported csv file is opened with a text editor, and double quotation mark will be added by default

Topics: Programming Excel Apache ascii encoding