POI - simple reading and writing Excel

Posted by bentobenji on Tue, 07 Dec 2021 12:31:00 +0100

Official website address: https://poi.apache.org/

Basic functions:

  • HSSF - provides the ability to read and write documents in Microsoft Excel format. (version 2003, can only store 65536 rows of data at most)
  • XSSF - provides the ability to read and write documents in Microsoft Excel OOXML format. (2007 edition, unlimited data can be stored)

1. POI - Excel write

1.1 related dependencies:

<!-- xls(2003) -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.9</version>
</dependency>

<!-- xlsx(2007) -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.9</version>
</dependency>

<!-- Date formatting tool -->
<dependency>
    <groupId>joda-time</groupId>
    <artifactId>joda-time</artifactId>
    <version>2.10.1</version>
</dependency>

<!-- test -->
<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
</dependency>

explain; There is a compatibility problem between the 2003 version and the 2007 version. The 2003 version can only have 65536 lines at most!

2003|2007 Version of the write, but the object is different, the method is the same!

1.2 write - 2003 version

Suffix: xls

Object: HSSFWorkbook

@Test
public void writeExcel2003() throws Exception {
    // 1. Create a workbook version 2003
    Workbook workbook = new HSSFWorkbook();
    // 2. Create a worksheet
    Sheet sheet = workbook.createSheet("Book sales form");
    // 3. Create the first row
    Row row1 = sheet.createRow(0);

    // 4. Create cells
    // (1,1)
    Cell cell11 = row1.createCell(0);
    cell11.setCellValue("Romance of the Three Kingdoms");
    // (1,2)
    Cell cell12 = row1.createCell(1);
    cell12.setCellValue(88);

    // Create second row
    Row row2 = sheet.createRow(1);
    // (2,1)
    Cell cell21 = row2.createCell(0);
    cell21.setCellValue("Sales time");
    // (2,2)
    Cell cell22 = row2.createCell(1);
    String time = new DateTime().toString("yyyy-MM-dd HH:mm:ss");
    cell22.setCellValue(time);

    // Generate an Excel table (IO stream). The 2003 version ends with xls
    FileOutputStream fileOutputStream = new FileOutputStream(PATH + "Book sales form 03.xls");
    // output
    workbook.write(fileOutputStream);
    // Closed flow
    fileOutputStream.close();

    System.out.println("Book sales table generation completed!");
}

1.3 write - 2007 version

Suffix: xlsx

Object: XSSFWorkbook

@Test
public void writeExcel2007() throws Exception {
    // 1. Create a workbook 2007 version
    Workbook workbook = new XSSFWorkbook();
    // 2. Create a worksheet
    Sheet sheet = workbook.createSheet("Book sales form");
    // 3. Create the first row
    Row row1 = sheet.createRow(0);

    // 4. Create cells
    // (1,1)
    Cell cell11 = row1.createCell(0);
    cell11.setCellValue("Water Margin");
    // (1,2)
    Cell cell12 = row1.createCell(1);
    cell12.setCellValue(76);

    // Create second row
    Row row2 = sheet.createRow(1);
    // (2,1)
    Cell cell21 = row2.createCell(0);
    cell21.setCellValue("Sales date");
    // (2,2)
    Cell cell22 = row2.createCell(1);
    String time = new DateTime().toString("yyyy-MM-dd HH:mm:ss");
    cell22.setCellValue(time);

    // Generate an Excel table (IO stream). The 2007 version ends with xlsx
    FileOutputStream fileOutputStream = new FileOutputStream(PATH + "Book sales form 07.xlsx");
    // output
    workbook.write(fileOutputStream);
    // Closed flow
    fileOutputStream.close();

    System.out.println("Book sales table generation completed!");
}

1.4 large file writing HSSF

Disadvantages: only 65536 rows can be processed at most, otherwise an exception will be thrown

java.lang.IllegalArgumentException: Invalid row number (65536) outside allowable range (0..65535)

Advantages: write cache in the process, do not operate the disk, and write to the disk at the last time, which is fast

@Test
public void writeExcel2003BigData() throws Exception {
    // start time
    long begin = System.currentTimeMillis();

    // Create Workbook
    Workbook workbook = new HSSFWorkbook();

    // Create worksheet
    Sheet sheet = workbook.createSheet();

    // Write data
    int totalRowNum = 65536;
    for (int rowNum = 0; rowNum < totalRowNum; rowNum++) {
        Row row = sheet.createRow(rowNum);
        for (int cellNum = 0; cellNum < 10; cellNum++) {
            Cell cell = row.createCell(cellNum);
            cell.setCellValue(cellNum);
        }
    }
    System.out.println("over");

    FileOutputStream fileOutputStream = new FileOutputStream(PATH + "writeExcel2003BigData.xls");
    workbook.write(fileOutputStream);
    fileOutputStream.close();

    // End time
    long end = System.currentTimeMillis();

    // Time consuming for batch writing 65536 pieces of data: 1.35s
    System.out.println("Batch write" + totalRowNum + "Time consuming data:" + (double) (end - begin) / 1000 + "s");
}

1.5 large file write XSSF

Disadvantages: when writing files, the speed is very slow, which consumes a lot of memory, and memory overflow may occur (such as 1 million)

Advantages: it can write a large amount of data (such as 200000 pieces)

// Time consuming!
@Test
public void writeExcel2007BigData() throws Exception {
    // start time
    long begin = System.currentTimeMillis();

    // Create Workbook
    Workbook workbook = new XSSFWorkbook();

    // Create worksheet
    Sheet sheet = workbook.createSheet();

    // Write data
    int totalRowNum = 100000;
    for (int rowNum = 0; rowNum < 100000; rowNum++) {
        Row row = sheet.createRow(rowNum);
        for (int cellNum = 0; cellNum < 10; cellNum++) {
            Cell cell = row.createCell(cellNum);
            cell.setCellValue(cellNum);
        }
    }
    System.out.println("over");

    FileOutputStream fileOutputStream = new FileOutputStream(PATH + "writeExcel2007BigData.xlsX");
    workbook.write(fileOutputStream);
    fileOutputStream.close();

    // End time
    long end = System.currentTimeMillis();

    // Time consuming for batch writing 100000 pieces of data: 11.229s
    System.out.println("Batch write" + totalRowNum + "Time consuming data:" + (double) (end - begin) / 1000 + "s");
}

1.6 large file write SXSSF

Advantages: it can write a very large amount of data, such as 1 million or more, with fast data writing speed and less memory

be careful:

Temporary files will be generated during the process, which need to be cleaned up;

By default, 100 pieces of data are saved to the memory. If more than 100 pieces, the first data is written to the temporary file;

If you customize the amount of data in memory, you can use the new sxssfworkbook.

// Use cache optimization!
@Test
public void writeExcel2007BigDataS() throws Exception {
    // start time
    long begin = System.currentTimeMillis();

    // Create Workbook
    Workbook workbook = new SXSSFWorkbook();

    // Create worksheet
    Sheet sheet = workbook.createSheet();

    // Write data
    int totalRowNum = 500000;
    for (int rowNum = 0; rowNum < 500000; rowNum++) {
        Row row = sheet.createRow(rowNum);
        for (int cellNum = 0; cellNum < 10; cellNum++) {
            Cell cell = row.createCell(cellNum);
            cell.setCellValue(cellNum);
        }
    }
    System.out.println("over");

    FileOutputStream fileOutputStream = new FileOutputStream(PATH + "writeExcel2007BigDataS.xlsX");
    workbook.write(fileOutputStream);
    fileOutputStream.close();

    // Clear temporary files
    ((SXSSFWorkbook)workbook).dispose();

    // End time
    long end = System.currentTimeMillis();

    // Time consuming for batch writing 500000 pieces of data: 5.216s
    System.out.println("Batch write" + totalRowNum + "Time consuming data:" + (double) (end - begin) / 1000 + "s");
}

The official explanation of SXSSFWorkbook is to implement the streaming XSSFWorkbook version of the "BigGridDemo" policy. This allows writing very large files without running out of memory, because only configurable line parts are saved in memory at any time.

Please note that SXSSFWorkbook may still consume a lot of memory, which is based on the functions you are using. For example, merge areas, comments... Are still stored only in memory, so if it is widely used, it may require a lot of memory. (when using POI, memory problem - Jprofile)

2. POI - Excel reading

When reading values, the types need to match

getNumericCellValue(): get numeric type

getStringCellValue(): get string type

...

2.1 read-2003 version

@Test
public void readExcel2003() throws Exception {
    // Get file stream
    FileInputStream inputStream = new FileInputStream(PATH + "Book sales form 03.xls");

    // 1. Create a workbook. Any class that can be operated by Excel can be operated
    Workbook workbook = new HSSFWorkbook(inputStream);
    // 2. Get worksheet
    Sheet sheet = workbook.getSheetAt(0);
    // 3. Get line
    Row row = sheet.getRow(0);
    // 4. Get column
    Cell cell = row.getCell(1);

    // When reading the value, pay attention to the type matching. getStringCellValue(): get the string type
    System.out.println(cell.getNumericCellValue());
    inputStream.close();
}

2.2 read-2007 version

@Test
public void readExcel2007() throws Exception {
    // Get file stream
    FileInputStream inputStream = new FileInputStream(PATH + "Book sales form 07.xlsx");

    // 1. Create a workbook. Any class that can be operated by Excel can be operated
    Workbook workbook = new XSSFWorkbook(inputStream);
    // 2. Get worksheet
    Sheet sheet = workbook.getSheetAt(0);
    // 3. Get line
    Row row = sheet.getRow(0);
    // 4. Get column
    Cell cell = row.getCell(1);

    // When reading the value, pay attention to the type matching. getStringCellValue(): get the string type
    System.out.println(cell.getNumericCellValue());
    inputStream.close();
}

2.3 reading different data types

Pay attention to type conversion

// Read different data types
@Test
public void testCellType() throws Exception {
    FileInputStream inputStream = new FileInputStream(PATH + "Schedule.xlsx");

    // Create a workbook. This class can operate anything that can be operated by Excel
    Workbook workbook = new XSSFWorkbook(inputStream);
    // Get worksheet
    Sheet sheet = workbook.getSheetAt(0);

    // Read title content
    Row rowTitle = sheet.getRow(0);
    if (rowTitle != null) {
        int cellCount = rowTitle.getPhysicalNumberOfCells();
        for (int cellNum = 0; cellNum < cellCount; cellNum++) {
            Cell cell = rowTitle.getCell(cellNum);
            if (cell != null) {
                int cellType = cell.getCellType();
                String cellValue = cell.getStringCellValue();
                System.out.print(cellValue + " | ");
            }
        }
        System.out.println();
    }

    // Get the contents of the table
    int rowCount = sheet.getPhysicalNumberOfRows();
    for (int rowNum = 1; rowNum < rowCount; rowNum++) {
        Row rowData = sheet.getRow(rowNum);
        if (rowData != null) {
            // Read column
            int cellCount = rowTitle.getPhysicalNumberOfCells();
            for (int cellNum = 0; cellNum < cellCount; cellNum++) {
                System.out.print("[" + (rowNum + 1) + "-" + (cellNum + 1) + "]");

                Cell cell = rowData.getCell(cellNum);
                // Data type of matching column
                if (cell != null) {
                    int cellType = cell.getCellType();
                    String cellValue = "";

                    switch (cellType) {
                        case XSSFCell.CELL_TYPE_STRING:  // character string
                            System.out.print("[STRING]");
                            cellValue = cell.getStringCellValue();
                            break;
                        case XSSFCell.CELL_TYPE_BOOLEAN:  // Boolean
                            System.out.print("[BOOLEAN]");
                            cellValue = String.valueOf(cell.getBooleanCellValue());
                            break;
                        case XSSFCell.CELL_TYPE_BLANK:  // empty
                            System.out.print("[BLANK]");
                            break;
                        case XSSFCell.CELL_TYPE_NUMERIC:  // Number (date, normal number)
                            System.out.print("[NUMERIC]");
                            if (HSSFDateUtil.isCellDateFormatted(cell)) {  // date
                                System.out.print("[[date]");
                                Date date = cell.getDateCellValue();
                                cellValue = new DateTime(date).toString("yyyy-MM-dd");
                            } else {
                                // It is not a date format to prevent the number from being too long to display
                                System.out.print("[Convert number to string output]");
                                cell.setCellType(XSSFCell.CELL_TYPE_STRING);
                                cellValue = cell.toString();
                            }
                            break;
                        case XSSFCell.CELL_TYPE_ERROR:
                            System.out.print("[[wrong data type]");
                            break;
                    }
                    System.out.println(cellValue);
                }
            }
        }
    }

    inputStream.close();
}

2.4 calculation formula

// Formula calculation
@Test
public void testFormula() throws Exception {
    FileInputStream inputStream = new FileInputStream(PATH + "formula.xlsx");
    Workbook workbook = new XSSFWorkbook(inputStream);
    Sheet sheet = workbook.getSheetAt(0);

    // Cell position
    Row row = sheet.getRow(6);
    Cell cell = row.getCell(1);

    // Get the calculation formula
    FormulaEvaluator formulaEvaluator = new XSSFFormulaEvaluator((XSSFWorkbook) workbook);

    // Calculation results
    CellValue evaluate = formulaEvaluator.evaluate(cell);
    String cellValue = evaluate.formatAsString();
    System.out.println(cellValue);

    // Output cell contents (i.e. output calculation formula)
    int cellType = cell.getCellType();
    switch (cellType) {
        case Cell.CELL_TYPE_FORMULA:  // formula
            String cellFormula = cell.getCellFormula();
            System.out.println(cellFormula);
            break;
    }
}

Topics: Java Excel IntelliJ IDEA Junit poi