Baidu cloud official website:
1, OCR (character recognition) function
First, log in to your Baidu Cloud account on the Baidu Intelligent Cloud official website, open the management console and click Text Recognition.
Click Create Application and fill in the form as required. In the interface selection, make sure to select the interfaces you need. When you are done, click Create Now.
After the application is created, you can view its AppID, API Key and Secret Key in the application list.
The project uses these three parameters to connect to this application.
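The project code in this article stores the three values in constants. As an alternative, here is a minimal sketch that reads them from the environment and fails early if one is missing. The class name `AipCredentials` and the variable names `BAIDU_APP_ID`, `BAIDU_API_KEY` and `BAIDU_SECRET_KEY` are assumptions made for illustration; they are not defined by Baidu or by the SDK.

```java
import java.util.Map;

// Hypothetical helper: the class name and the BAIDU_* variable names are
// illustrative assumptions, not part of the Baidu SDK.
final class AipCredentials {
    final String appId;
    final String apiKey;
    final String secretKey;

    private AipCredentials(String appId, String apiKey, String secretKey) {
        this.appId = appId;
        this.apiKey = apiKey;
        this.secretKey = secretKey;
    }

    // Reads the three values from a map such as System.getenv() and
    // throws if any of them is missing or blank.
    static AipCredentials from(Map<String, String> env) {
        return new AipCredentials(
                require(env, "BAIDU_APP_ID"),
                require(env, "BAIDU_API_KEY"),
                require(env, "BAIDU_SECRET_KEY"));
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required setting: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        try {
            AipCredentials c = from(System.getenv());
            System.out.println("Loaded credentials for app " + c.appId);
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

The resulting fields could then be passed to the client constructor in place of the constants, which keeps the keys out of source control.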
Java project code:
import com.baidu.aip.ocr.AipOcr;
import org.json.JSONObject;

import javax.swing.JFileChooser;
import javax.swing.filechooser.FileSystemView;
import java.awt.Desktop;
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.HashMap;

public class GeneralRecognition {
    // Set APPID/AK/SK
    public static final String APP_ID = "";
    public static final String API_KEY = "";
    public static final String SECRET_KEY = "";

    private static AipOcr client = null;

    public static void main(String[] args) throws IOException, URISyntaxException {
        String path = chooseFile();
        if (path == null) {
            // The dialog was cancelled, so there is nothing to recognize
            System.out.println("No file selected.");
            return;
        }
        File file = new File(path);
        // Open the selected image with the default desktop application
        Desktop desktop = Desktop.getDesktop();
        desktop.open(file);
        // URI uri = new URI("E:\\");
        // desktop.browse(uri);
        dis(file.getPath());
    }

    // Select the file to upload; returns null if no file was chosen
    public static String chooseFile() {
        FileSystemView fsv = FileSystemView.getFileSystemView();
        JFileChooser fileChooser = new JFileChooser();
        fileChooser.setCurrentDirectory(fsv.getHomeDirectory());
        fileChooser.setDialogTitle("Please select the file to upload...");
        fileChooser.setApproveButtonText("OK");
        fileChooser.setFileSelectionMode(JFileChooser.FILES_ONLY);
        int result = fileChooser.showOpenDialog(null);
        if (JFileChooser.APPROVE_OPTION == result) {
            return fileChooser.getSelectedFile().getPath();
        }
        return null;
    }

    public static void init() {
        // Initialize an AipOcr client once
        if (client == null) {
            client = new AipOcr(APP_ID, API_KEY, SECRET_KEY);
            // Optional: set network connection parameters
            client.setConnectionTimeoutInMillis(2000);
            client.setSocketTimeoutInMillis(60000);
        }
    }

    // General text recognition
    public static void dis(String path) {
        init();
        // Call the interface with optional parameters
        HashMap<String, String> options = new HashMap<>();
        options.put("language_type", "CHN_ENG");
        options.put("detect_direction", "true");
        options.put("detect_language", "true");
        options.put("probability", "true");
        // The parameter is the local image path
        JSONObject res = client.basicGeneral(path, options);
        System.out.println(res.toString(2));
    }
}
Calling the interface may fail with the following error:
[main] INFO com.baidu.aip.client.BaseClient - get access_token success. current state: STATE_AIP_AUTH_OK
{
  "error_msg": "No permission to access data",
  "error_code": 6
}

Process finished with exit code 0
The reason is that the application does not have permission to call this API.
Error messages like this can be looked up in the error information of the application.
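Because the process still exits with code 0, a failed call is easy to overlook. A minimal sketch of checking the response text for an `error_code` field is shown here. The class `AipErrorCheck` and its messages are hypothetical, and the regular expression stands in for real JSON parsing only to keep the example free of dependencies.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Baidu SDK.
final class AipErrorCheck {
    // Matches a numeric "error_code" field in the raw JSON text.
    private static final Pattern ERROR_CODE =
            Pattern.compile("\"error_code\"\\s*:\\s*(\\d{1,9})");

    // Returns the error code if the response contains one, otherwise empty.
    static OptionalInt errorCode(String responseJson) {
        Matcher m = ERROR_CODE.matcher(responseJson);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    // Turns a response into a short human-readable status line.
    static String describe(String responseJson) {
        OptionalInt code = errorCode(responseJson);
        if (!code.isPresent()) {
            return "OK";
        }
        if (code.getAsInt() == 6) {
            return "Error 6: no permission for this API; "
                    + "check the interfaces enabled for the application";
        }
        return "Error " + code.getAsInt();
    }

    public static void main(String[] args) {
        String denied =
                "{ \"error_msg\": \"No permission to access data\", \"error_code\": 6 }";
        System.out.println(describe(denied));
    }
}
```

When the response is already an `org.json.JSONObject`, as in the project code, checking `res.has("error_code")` and reading it with `res.getInt("error_code")` is the more direct route.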
Solution steps:
1. Open the application list.
2. Click Manage, then Edit. In addition to the interfaces checked by default for this application, check any other interfaces you need. You can also click to obtain free interface permissions.
Note: some interfaces require additional verification. For example, the public security verification interface and the ID card and name comparison interface require you to submit enterprise verification first. Once it is approved, you must also apply under Console - Face Recognition - Offline Collection SDK Management, following the process there, before you can use them. After approval, the interface permission is enabled for you automatically, generally within 2 hours.
3. Click Save, then call the interface again. The error should be resolved.
After you obtain other permissions, whether free or paid, you can use the APIs of the related functions and also view their usage.
2, ASR (speech recognition) function
The steps are similar to those for text recognition above. First find the relevant module (text recognition or speech recognition) in the console, then create an application in that module. During or after creation, configure the interface permissions so that the corresponding APIs can be called later. Each application has three important parameters: AppID, API Key and Secret Key. Configure these three parameters in the project. The following is the code of the ASR speech recognition project:
import com.baidu.aip.speech.AipSpeech;

import javax.swing.JFileChooser;
import javax.swing.filechooser.FileSystemView;
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.HashMap;

public class MandarinRecognition {
    // Set APPID/AK/SK
    public static final String APP_ID = "";
    public static final String API_KEY = "";
    public static final String SECRET_KEY = "";

    private static AipSpeech client = null;

    public static void main(String[] args) throws IOException, URISyntaxException {
        String path = chooseFile();
        if (path == null) {
            // The dialog was cancelled, so there is nothing to recognize
            System.out.println("No file selected.");
            return;
        }
        File file = new File(path);
        // Desktop desktop = Desktop.getDesktop();
        // desktop.open(file);
        // URI uri = new URI("E:\\");
        // desktop.browse(uri);
        System.out.println("Preparing for output..");
        String outPutPath = "template/asrOutput.txt";
        dis(file.getPath(), outPutPath);
    }

    // Select the file to upload; returns null if no file was chosen
    public static String chooseFile() {
        FileSystemView fsv = FileSystemView.getFileSystemView();
        JFileChooser fileChooser = new JFileChooser();
        fileChooser.setCurrentDirectory(fsv.getHomeDirectory());
        fileChooser.setDialogTitle("Please select the file to upload...");
        fileChooser.setApproveButtonText("OK");
        fileChooser.setFileSelectionMode(JFileChooser.FILES_ONLY);
        int result = fileChooser.showOpenDialog(null);
        if (JFileChooser.APPROVE_OPTION == result) {
            return fileChooser.getSelectedFile().getPath();
        }
        return null;
    }

    public static void init() {
        // Initialize an AipSpeech client once
        if (client == null) {
            client = new AipSpeech(APP_ID, API_KEY, SECRET_KEY);
        }
    }

    // Speech recognition; outPutPath is accepted but not used in this example
    public static void dis(String audioPath, String outPutPath) throws IOException {
        init();
        // Call the interface with optional parameters
        HashMap<String, Object> options = new HashMap<>();
        options.put("dev_pid", 1537);
        // The parameter is the local audio file path
        System.out.println(audioPath);
        /*
         * Raw PCM audio must use a 16k sampling rate, 16-bit depth and mono.
         * Supported formats: pcm (uncompressed), wav (uncompressed, PCM encoded),
         * amr (compressed format).
         * Recordings of up to 60 s are supported. The limit applies to the
         * duration only, not to the file size.
         */
        System.out.println(client.asr(audioPath, "pcm", 16000, options));
    }
}
Output result:
Preparing for output..
E:\16k.pcm
[main] INFO com.baidu.aip.client.BaseClient - get access_token success. current state: STATE_AIP_AUTH_OK
{"result":["Beijing Science and Technology Museum."],"err_msg":"success.","sn":"238256483091644572246","corpus_no":"7063384013687529084","err_no":0}

Process finished with exit code 0
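The comment in the recognition code states the input constraints: 16k sampling rate, 16-bit depth, mono, and at most 60 s of audio. For a raw PCM file these constraints make the duration a simple function of the file size, so a rough check can be done before uploading. The sketch below assumes a headerless PCM file; the class `PcmDurationCheck` is hypothetical and not part of the SDK.

```java
import java.io.File;

// Hypothetical pre-upload check for raw (headerless) PCM files.
final class PcmDurationCheck {
    static final int SAMPLE_RATE_HZ = 16_000; // 16k sampling rate
    static final int BYTES_PER_SAMPLE = 2;    // 16-bit depth
    static final int CHANNELS = 1;            // mono
    static final double MAX_SECONDS = 60.0;   // limit stated in the code comment

    // Duration in seconds: bytes / (rate * bytes per sample * channels).
    static double durationSeconds(long byteCount) {
        return (double) byteCount / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS);
    }

    // True if the file is non-empty and no longer than the 60 s limit.
    static boolean withinLimit(long byteCount) {
        return byteCount > 0 && durationSeconds(byteCount) <= MAX_SECONDS;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java PcmDurationCheck <file.pcm>");
            return;
        }
        File pcm = new File(args[0]);
        long bytes = pcm.length();
        System.out.printf("%s: %.2f s, within limit: %b%n",
                pcm.getName(), durationSeconds(bytes), withinLimit(bytes));
    }
}
```

At these settings one second of audio takes 32,000 bytes, so the 60 s limit corresponds to 1,920,000 bytes of raw PCM. The calculation does not apply to wav files (which carry a header) or to amr files (which are compressed).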