Batch conversion of file encodings in Java

Posted by RedMaster on Mon, 06 May 2019 10:15:03 +0200

1. The scenario

You may have run into this situation: an older project is encoded in GBK, and now every file needs to be converted to UTF-8. I certainly have.

Eclipse can change a project's encoding setting, but if you simply switch the setting, all the Chinese text in the files turns into garbled characters. You have to copy the file contents first, change the file's encoding, and then paste everything back (there may be a better way that I don't know of).
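To see why switching the setting garbles the text, here is a small sketch using only the JDK. The class and method names (MojibakeDemo, transcode) are made up for illustration, and it assumes the GBK charset is available in your JVM. Relabelling a file only changes how its bytes are read; a real conversion has to decode the bytes with the old charset and re-encode them with the new one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    /** Decodes the bytes with the charset they were written in, then re-encodes them in the target charset. */
    public static byte[] transcode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        // "\u4e2d\u6587" is the word for "Chinese", written with escapes so the
        // source file's own encoding does not matter
        String original = "\u4e2d\u6587";
        byte[] gbkBytes = original.getBytes(gbk);

        // Wrong: reading GBK bytes as if they were UTF-8 (what relabelling a file does)
        String relabelled = new String(gbkBytes, StandardCharsets.UTF_8);
        System.out.println(relabelled.equals(original)); // false

        // Right: decode as GBK first, then encode as UTF-8
        byte[] utf8Bytes = transcode(gbkBytes, gbk, StandardCharsets.UTF_8);
        System.out.println(new String(utf8Bytes, StandardCharsets.UTF_8).equals(original)); // true
    }
}
```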

Doing that one file at a time for a whole project is painful to even think about. So, as a programmer, I wrote a simple program to do the work for me.

The approach is simple: traverse the project folder, keep only files with the java extension, and convert each file's encoding from GBK to UTF-8.
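The three steps can be sketched with the JDK alone. This is not the Hutool-based method used later in this post, only an illustration of the idea. WalkAndConvert and convertTree are hypothetical names, the extension match is a plain case-sensitive suffix check, and the sketch overwrites files in place, so run it on a backup.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WalkAndConvert {

    /** Re-encodes every regular file under root whose name ends with "." + extension; returns the converted paths. */
    public static List<Path> convertTree(Path root, String extension, Charset from, Charset to)
            throws IOException {
        List<Path> targets;
        // Steps 1 and 2: traverse the tree and keep only files with the wanted extension
        try (Stream<Path> paths = Files.walk(root)) {
            targets = paths.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith("." + extension))
                    .collect(Collectors.toList());
        }
        // Step 3: decode with the old charset, write back with the new one
        for (Path p : targets) {
            String text = new String(Files.readAllBytes(p), from);
            Files.write(p, text.getBytes(to));
        }
        return targets;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: WalkAndConvert <project-folder>");
            return;
        }
        List<Path> done = convertTree(Paths.get(args[0]), "java",
                Charset.forName("GBK"), StandardCharsets.UTF_8);
        System.out.println("Converted " + done.size() + " file(s)");
    }
}
```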

Make sure the source encoding is correct, and back up before converting, or you will regret it. After testing on two projects, I got lazy and converted my entire workspace; some of those projects were already UTF-8, and their text was garbled after the run.

The remedy was to convert those projects back from UTF-8 to GBK, which undoes the garbling but leaves some after-effects: most of the text was recovered, but I can't be sure whether some characters turned into "?".
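One way to guard against that mishap is to check whether a file already decodes cleanly as UTF-8 and skip it if so. The sketch below uses the JDK's strict decoder; Utf8Check and isValidUtf8 are hypothetical names, not part of Hutool. It is a heuristic: pure ASCII files also pass (skipping them is harmless), and a GBK file whose bytes happen to form valid UTF-8 would be skipped by mistake, though that should be rare for real Chinese text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {

    /** Returns true if the bytes decode as UTF-8 without any malformed sequence. */
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // REPORT makes the decoder throw instead of silently substituting characters
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // Chinese sample text, written with escapes
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.UTF_8))); // true
        System.out.println(isValidUtf8(text.getBytes(Charset.forName("GBK")))); // false
    }
}
```

Calling isValidUtf8 on a file's bytes before a GBK to UTF-8 conversion, and leaving the file alone when it returns true, would have kept the projects that were already UTF-8 out of the run.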

2. Dependency

Here I use Hutool, a very handy Java utility library developed in China (official website: https://hutool.cn/). It is similar to the lang3 package, but Hutool has more features and, above all, comments in Chinese.

        <dependency>
            <groupId>cn.hutool</groupId>
            <artifactId>hutool-all</artifactId>
            <version>4.5.6</version>
        </dependency>

 

3. Method Implementation

import java.io.File;
import java.io.FileFilter;
import java.nio.charset.Charset;

import cn.hutool.core.io.FileUtil;
import cn.hutool.core.lang.Console;
import cn.hutool.core.util.StrUtil;

public class ConcertEncodeing {

    public static void main(String[] args) {
        
        convertCharset("D:\\workspaces\\workspaceOxygen\\ceshi",Charset.forName("GBK"),Charset.forName("UTF-8"),"java");
        
    }
    
    /**
     * Converts the encoding of a file, or of every matching file under a folder.
     * @param path        path of the file or folder to convert
     * @param fromCharset original encoding
     * @param toCharset   target encoding
     * @param expansion   file extension to convert (for example "java"); pass null to convert every file
     */
    private static void convertCharset(String path,Charset fromCharset,Charset toCharset,String expansion ) {
        if (StrUtil.isBlank(path)) {
            return;
        }
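        File file = FileUtil.file(path);
        if (file.isFile()) {
            // A single file was passed in: convert it directly
            FileUtil.convertCharset(file, fromCharset, toCharset);
            Console.log("Converted file: {}", file.getName());
            return;
        }
        File[] listFiles = file.listFiles(new FileFilter() {
            @Override
            public boolean accept(File pathname) {
                // Always descend into directories; with no extension given, accept every file
                if (pathname.isDirectory() || StrUtil.isBlank(expansion)) {
                    return true;
                }
                // Compare against the requested extension instead of a hard-coded "java"
                return expansion.equalsIgnoreCase(FileUtil.extName(pathname));
            }
        });
        if (listFiles == null) {
            // The path does not exist or could not be read
            return;
        }
        for (int i = 0; i < listFiles.length; i++) {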
            if (listFiles[i].isDirectory()) {
                String canonicalPath = FileUtil.getCanonicalPath(listFiles[i]);
                // Start a separate thread for each subfolder to speed things up
                new Thread(new Runnable() {
                    @Override
                    public void run() {
                        convertCharset(canonicalPath,fromCharset,toCharset,expansion);
                    }
                }).start();
            }else {
                FileUtil.convertCharset(listFiles[i], fromCharset,  toCharset);
                Console.log("Converted file: {}", listFiles[i].getName());
            }
        }
    }
}
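
The method above starts one new thread per subfolder, which on a large workspace can mean a lot of threads. As a possible alternative, here is a sketch that runs the conversions on a fixed-size pool from java.util.concurrent. It uses only the JDK rather than Hutool, PooledConverter and convertAll are hypothetical names, and it expects the list of files to have been collected beforehand.

```java
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PooledConverter {

    /** Converts the given files on a fixed-size thread pool and returns how many were converted. */
    public static int convertAll(List<Path> files, Charset from, Charset to, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Path>> tasks = new ArrayList<>();
            for (Path p : files) {
                tasks.add(() -> {
                    // Decode with the old charset, write back with the new one
                    String text = new String(Files.readAllBytes(p), from);
                    Files.write(p, text.getBytes(to));
                    return p;
                });
            }
            int done = 0;
            // invokeAll blocks until every task has finished
            for (Future<Path> result : pool.invokeAll(tasks)) {
                result.get(); // rethrows a task's I/O failure as ExecutionException
                done++;
            }
            return done;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because convertAll only returns once every file is done, the caller knows when the whole conversion has finished, which the thread-per-folder version does not report.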

 

End.

Spot a problem? Feel free to leave a comment.

Topics: Java encoding Eclipse