1. Overview
Operating System: CentOS 6.5, 64-bit
JDK Version: 7
2. Testing
2.1 Prepare test program
The test program is simple: a single class with a main function. The general flow:
It first reads the interval between entry reads from the arguments, then the zip file path. It then uses the ZipFile class API to get the full path name of each file in the zip. After each entry it sleeps for the interval, which makes testing easier.
The code is as follows:
package com.spiro.test;

import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class App {

    /**
     * Usage: App <interval in ms to get entry> <zip file path>
     * @param args
     */
    public static void main(String[] args) {
        String arg0 = args[0];
        String arg1 = args[1];

        Long interval = Long.valueOf(arg0);
        System.out.println("interval = " + interval);
        String filename = arg1;
        System.out.println("filename = " + filename);

        ZipFile zipFile = null;
        try {
            zipFile = new ZipFile(filename);
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                System.out.println(entry.getName());
                // Pause between entries so the file can be modified mid-read.
                try {
                    Thread.sleep(interval);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (zipFile != null) {
                try {
                    zipFile.close();
                } catch (IOException e) {
                    // Ignore failures while closing.
                }
            }
        }
    }
}
2.2 Start testing
First place a test.zip file in the /tmp directory for testing.
Package the program, deploy it to the server, and run the following command:
java -classpath $CLASSPATH com.spiro.test.App 5000 /tmp/test.zip > $LOG_HOME/app.log
app.log output:
interval = 5000
filename = /tmp/test.zip
frontend/
frontend/gulpfile.js
frontend/node_modules/
frontend/node_modules/.bin/
frontend/node_modules/.bin/browser-sync
frontend/node_modules/.bin/browser-sync.cmd
frontend/node_modules/.bin/gulp
Next, open another terminal and run the following command:
echo "abcd" > /tmp/test.zip
This truncates test.zip and overwrites its contents while the program is still reading it.
app.log output:
interval = 5000
filename = /tmp/test.zip
frontend/
frontend/gulpfile.js
frontend/node_modules/
frontend/node_modules/.bin/
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGBUS (0x7) at pc=0x00007fa6701cb7b2, pid=7262, tid=140352832034560
#
# JRE version: OpenJDK Runtime Environment (7.0_141-b02) (build 1.7.0_141-mockbuild_2017_05_09_14_20-b00)
# Java VM: OpenJDK 64-Bit Server VM (24.141-b02 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 2.6.10
# Distribution: CentOS release 6.9 (Final), package rhel-2.6.10.1.el6_9-x86_64 u141-b02
# Problematic frame:
# C [libzip.so+0x47b2]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/jvm-7262/hs_error.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
# http://icedtea.classpath.org/bugzilla
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
Checking with jps shows that the process has disappeared.
According to the log, the JVM error report was written to /tmp/jvm-7262/hs_error.log.
The stack in that file:
Stack: [0x00007fa670a25000,0x00007fa670b26000], sp=0x00007fa670b24650, free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libzip.so+0x47b2]
C [libzip.so+0x4dc8] ZIP_GetNextEntry+0x48
j java.util.zip.ZipFile.getNextEntry(JI)J+0
j java.util.zip.ZipFile.access$500(JI)J+2
j java.util.zip.ZipFile$1.nextElement()Ljava/util/zip/ZipEntry;+54
j java.util.zip.ZipFile$1.nextElement()Ljava/lang/Object;+1
j com.spiro.test.App.main([Ljava/lang/String;)V+100
v ~StubRoutines::call_stub
V [libjvm.so+0x60ea4e]
V [libjvm.so+0x60d5e8]
V [libjvm.so+0x61e9c7]
V [libjvm.so+0x632fac]
C [libjli.so+0x34c5]
You can see that the last Java frame before the crash is this native method in java.util.zip.ZipFile:
private static native long getNextEntry(long jzfile, int i);
The problem has been reproduced.
2.3 Interpretation of the problem
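After some research, this turns out to be related to the mmap mechanism of the Linux operating system. The general idea is that mmap improves file access efficiency by mapping a file into memory, but if the file is modified while it is being read, access to the mapping can fail and crash the JVM. In the test above, the echo command truncates the file, so pages that were mapped no longer have file data behind them, which is consistent with the SIGBUS in the log.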
https://bugs.openjdk.java.net/browse/JDK-8160933
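To make the idea of mapping a file into memory more concrete, here is a minimal sketch using Java NIO. It is only an illustration: the JDK 7 zip code does its mapping in native code (libzip.so), not through FileChannel, and the class and method names here (MmapSketch, firstByteViaMmap) are made up for this example. The sketch only reads from an unchanged file; it deliberately does not truncate the file while it is mapped.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class, not part of the original test program.
public class MmapSketch {

    /** Maps the whole file read-only and returns its first byte (the file must be non-empty). */
    public static byte firstByteViaMmap(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The buffer is backed by the file's pages rather than a private copy,
            // so reads depend on the file still having data at that offset.
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return mapped.get(0);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mmap-sketch", ".bin");
        try {
            Files.write(file, "abcd".getBytes(StandardCharsets.US_ASCII));
            System.out.println((char) firstByteViaMmap(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Because a mapped read goes to the file's pages, a mapping that outlives a truncation can end up pointing past the new end of the file, which is the situation the test in 2.2 creates.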
Solutions suggested online include disabling mmap by adding the -Dsun.zip.disableMemoryMapping=true option to the JVM parameters. Let's see how this option behaves.
2.4 Testing with mmap disabled
Run:
java -Dsun.zip.disableMemoryMapping=true -classpath $CLASSPATH com.spiro.test.App 5000 /tmp/test.zip > $LOG_HOME/app.log
app.log output:
frontend/
frontend/gulpfile.js
frontend/node_modules/
frontend/node_modules/.bin/
frontend/node_modules/.bin/browser-sync
frontend/node_modules/.bin/browser-sync.cmd
frontend/node_modules/.bin/gulp
Then, in another terminal, run:
echo "abcd" > test.zip
The process keeps running, and after a while it throws a ZipError:
Exception in thread "main" java.util.zip.ZipError: jzentry == 0, jzfile = 140689976146736, total = 8313, name = /tmp/test.zip, i = 62, message = null
    at java.util.zip.ZipFile$1.nextElement(ZipFile.java:505)
    at java.util.zip.ZipFile$1.nextElement(ZipFile.java:483)
    at com.spiro.test.App.main(App.java:38)
2.5 Interpretation again
When mmap is disabled, the process no longer crashes. Instead, it throws an error after a while and exits.
With mmap disabled, the file is not mapped into memory. My guess is that the program loads part of the data into memory and keeps reading from it, so the error does not occur until it reaches data that has changed in the file. This is only a guess, so I'll keep looking into it later.
3. Summary
You can see that the root cause of the JVM crash is that the zip file was modified while it was being read with the mmap mechanism enabled.
There are two ways to solve this problem:
1. Control access to the zip file in code, so that no other logic modifies it while it is being read.
2. Add -Dsun.zip.disableMemoryMapping=true to the jvm startup parameters.
However, I personally feel that option 2 only treats the symptom. The root of the problem lies in controlling shared access to the file resource.
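One way to approach option 1 on the writer side is to never rewrite the zip in place. Below is a minimal sketch, assuming a Linux/POSIX file system where an atomic move onto an existing file replaces it (the Java API leaves this implementation specific). The new content is written to a staging file next to the target and then moved over it. The class name, method name, and ".tmp" suffix are hypothetical, chosen only for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original test program.
public class ZipPublisher {

    /**
     * Replaces target with the content of newContent without writing into
     * the existing target file.
     */
    public static void replaceAtomically(Path newContent, Path target) throws IOException {
        // Stage next to the target so the move stays on the same file system.
        Path staging = target.resolveSibling(target.getFileName() + ".tmp");
        try {
            Files.copy(newContent, staging, StandardCopyOption.REPLACE_EXISTING);
            // Assumption: on Linux this is a rename, which swaps the directory
            // entry and leaves the old file's data untouched for open readers.
            Files.move(staging, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up if the copy or move failed.
            Files.deleteIfExists(staging);
        }
    }
}
```

With this approach, a process that already has the old zip open keeps reading the old data until it closes the file, while new readers see the new file. It is not a replacement for proper coordination: two writers using the same staging name at once would still conflict.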