0021 - use JDBC to insert Chinese characters into Kudu table - Secret of cast

Posted by ohjay on Sun, 08 Dec 2019 14:53:32 +0100

1. Problem description

Use Impala JDBC to insert Chinese characters into Kudu table. The inserted Chinese characters are garbled and the Chinese characters are truncated.

After the previous document uses sql splicing method to insert Chinese string scrambling solution, this document describes the problem of using jdbc PreparedStatement method to insert Chinese string scrambling.

2. Problem recurrence

Test environment:

  • CDH5.12.0
  • Kudu1.4.0
  • ImpalaJDBC41_2.5.35

1. Test with impala JDBC code

 staticString JDBC_DRIVER ="com.cloudera.impala.jdbc41.Driver";
static String CONNECTION_URL ="jdbc:impala://ip-172-31-10-118:21050/default";

public static void main(String[] args) {
    Connection con = null;
 ResultSetrs = null;
 PreparedStatementps = null;

    try {
        Class.forName(JDBC_DRIVER);
 con =DriverManager.getConnection(CONNECTION_URL);

 Stringsql2 = "insert into my_first_table values(?, ?)";
 ps =con.prepareStatement(sql2);
 ps.setInt(1,81);
 ps.setString(2,"Test Chinese characters");
 ps.execute();
 ps.close();

 ps =con.prepareStatement("select * from my_first_table order byid asc");
 rs = ps.executeQuery();
        while (rs.next()){
            System.out.println(rs.getLong(1)+ "\t" +rs.getString(2));
 }

    } catch (Exceptione) {
        e.printStackTrace();
 } finally{
 try {// Turn off rs, ps and con
 rs.close();
 ps.close();
 con.close();
 } catch(SQLException e) {
 // TODOAuto-generated catch block
 e.printStackTrace();
 }

    }
}

2. Insert test data into Kudu table, such as "test", "test Chinese", "test Chinese character"

String sql2 = "insert into my_first_table values(?, ?)";
ps = con.prepareStatement(sql2);
ps.setInt(1, 73);
ps.setString(2, "test");
ps.execute();
ps.close();

ps = con.prepareStatement(sql2);
ps.setInt(1, 74);
ps.setString(2, "Test Chinese");
ps.execute();
ps.close();

ps = con.prepareStatement(sql2);
ps.setInt(1, 75);
ps.setString(2, "Test Chinese characters");
ps.execute();
ps.close();

The query results through Hue are as follows:

All Chinese characters are garbled, some are garbled, and the string is truncated.

3. Solutions

Modify the insert statement in the program, and use the cast function to convert the insert String column to String type

String sql2 = "insert into my_first_table values(?, cast(?as string))";
ps = con.prepareStatement(sql2);
ps.setInt(1, 60);
ps.setString(2, "Test Chinese characters");
ps.execute();
ps.close();

ps = con.prepareStatement(sql2);
ps.setInt(1, 61);
ps.setString(2, "Test Chinese");
ps.execute();
ps.close();

ps = con.prepareStatement(sql2);
ps.setInt(1, 62);
ps.setString(2, "test");
ps.execute();
ps.close();

After modification, insert test data into Kudu again: "test Chinese character", "test Chinese", "test"

The use of Hue query is shown as follows:

The insertion of Chinese string into Kudu is normal.

Drunk whip famous horse, young more pompous! Lingnan Huanxi sand, vomit wine shop! Dear friends do not want to put, data play flowers!

Topics: Big Data JDBC SQL