Java/Kotlin picture download: why the downloaded picture cannot be opened

Posted by downfall on Mon, 24 Feb 2020 14:10:16 +0100

Downloading a picture is a very simple task: open an input stream from the picture's URL and write that stream to a file. Recently, however, I found a website whose pictures download successfully but cannot be opened afterwards. This puzzled me, and I could not find a clear explanation anywhere.

Today, after some research and discussion with friends, I finally found the answer. Read on for the details.

The problem

The test image address is http://www.xbiquge.la/files/article/image/10/10489/10489s.jpg

Java version of the download code:

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

URL url = new URL("http://www.xbiquge.la/files/article/image/10/10489/10489s.jpg");
URLConnection connection = url.openConnection(); // Open the connection
// try-with-resources closes both streams even if the copy fails
try (InputStream in = new BufferedInputStream(connection.getInputStream());
     OutputStream out = new BufferedOutputStream(new FileOutputStream(new File("e:\\test.jpg")))) {
    byte[] temp = new byte[1024 * 2]; // Copy buffer
    int c;
    while ((c = in.read(temp)) != -1) {
        out.write(temp, 0, c); // Write exactly as many bytes as were read
    }
}

Kotlin version of the download code:

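import java.io.File
import java.net.URL

val file = File("e:\\test.jpg")
val openConnection = URL("http://www.xbiquge.la/files/article/image/10/10489/10489s.jpg").openConnection()
// use {} closes the stream once all bytes have been read
val bytes = openConnection.getInputStream().use { it.readBytes() }
file.writeBytes(bytes)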

As the comparison shows, the Kotlin code is much shorter than the Java code.

Both versions run without errors. But when the downloaded picture is opened, it cannot be displayed, as shown in the figure below.

Opening the address in a browser and saving the picture with "Save image as" produces a file that opens normally.

A test download with Thunder (Xunlei) also produced a file that could not be opened, and I still could not see where the problem was.

Unwilling to admit defeat, I searched online and tried adding various request headers, but nothing helped. It seemed I had reached a dead end.

The cause

With no other option, I asked the experts in my study group for advice.

"Ah, this picture can be decompressed, and there is a picture inside!" said a group member named yeshen.

Surprised, I changed the file's extension to .zip and unzipped it. Inside was a picture that opened normally.

So the downloaded file is actually a compressed archive. That explains the symptom, but why does it happen?

I had just discussed this problem with a Python expert. He tried it, and Python fetched the picture correctly. Why couldn't Java? After some discussion we found the cause in the response headers, as shown in the following figure.

It turns out that the website responds with a GZIP-compressed stream (the response carries Content-Encoding: gzip), which reduces the time users wait for pages to load.

Browsers decompress gzip responses automatically, and so do Python HTTP clients such as requests, which is why the browser can display the picture and Python gets a correct file. Java's URLConnection does not, so the code above wrote the still-compressed bytes to disk.
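A quick way to confirm this diagnosis on a saved file is to look at its first two bytes: gzip data always starts with 0x1f 0x8b, while a JPEG starts with 0xff 0xd8. The following is only a minimal sketch of that check. The class and method names (GzipMagic, looksGzipped, gzip) are made up for illustration, and the "picture" is a few placeholder bytes rather than a real download.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

final class GzipMagic {
    /** Returns true if the data starts with the gzip magic number 0x1f 0x8b. */
    static boolean looksGzipped(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    /** Compresses bytes the way a server sending Content-Encoding: gzip would. */
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(plain);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder data: only the leading 0xff 0xd8 matches a real JPEG.
        byte[] fakeJpeg = {(byte) 0xff, (byte) 0xd8, (byte) 0xff, (byte) 0xe0, 1, 2, 3};
        System.out.println(looksGzipped(fakeJpeg));       // prints false
        System.out.println(looksGzipped(gzip(fakeJpeg))); // prints true
    }
}
```

Running the same check on the bytes of the broken e:\test.jpg should report true if the file really is the raw gzip stream described above.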

Solution

For gzip file streams

Here we only need to wrap the InputStream in a GZIPInputStream before reading it. I paste only the Kotlin version; the Java version can be adapted in the same way.

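import java.io.File
import java.net.URL
import java.util.zip.GZIPInputStream
val file = File("e:\\test.jpg")
val openConnection = URL("http://www.xbiquge.la/files/article/image/10/10489/10489s.jpg").openConnection()
// GZIPInputStream decompresses while reading; use {} closes the stream afterwards
val bytes = GZIPInputStream(openConnection.getInputStream()).use { it.readBytes() }
file.writeBytes(bytes)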
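For readers who want the Java counterpart, here is a minimal sketch of the same idea. The class and method names (GzipCopy, gunzipTo) are hypothetical, and the demo compresses a few bytes in memory as a stand-in for the server response so that it runs without network access.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipCopy {
    /** Decompresses a gzip stream into out and returns the number of bytes written. */
    static long gunzipTo(InputStream compressed, OutputStream out) throws IOException {
        long total = 0;
        // Closing the GZIPInputStream also closes the wrapped stream.
        try (InputStream in = new GZIPInputStream(compressed)) {
            byte[] buffer = new byte[2048];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the gzip-compressed response body.
        ByteArrayOutputStream gz = new ByteArrayOutputStream();
        try (GZIPOutputStream g = new GZIPOutputStream(gz)) {
            g.write(new byte[] {1, 2, 3, 4, 5});
        }
        ByteArrayOutputStream restored = new ByteArrayOutputStream();
        long written = gunzipTo(new ByteArrayInputStream(gz.toByteArray()), restored);
        System.out.println(written); // prints 5
    }
}
```

With the connection from the Java download code, the call would presumably be gunzipTo(connection.getInputStream(), fileOutputStream). The method leaves the output stream open, so the caller is responsible for closing it.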

General method of downloading pictures

The server may return other pictures uncompressed. If we keep using the method above on such a response, GZIPInputStream throws an exception because the data is not in GZIP format.

So we need a check that determines whether the input stream is compressed.

Here I simply wrap it up as a function:

import java.io.File
import java.net.URL
import java.util.zip.GZIPInputStream

fun downloadImage(url: String, file: File): File {
    val openConnection = URL(url).openConnection()
    // Send a browser-like User-Agent so that some websites do not redirect to a verification page
    openConnection.addRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36")
    val raw = openConnection.getInputStream()
    // Decompress only if the response says it is gzip-compressed (header values are case-insensitive)
    val input = if (openConnection.contentEncoding.equals("gzip", ignoreCase = true)) {
        GZIPInputStream(raw)
    } else {
        raw
    }
    // use {} closes the stream even if reading fails
    file.writeBytes(input.use { it.readBytes() })
    return file
}
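The function above trusts the Content-Encoding header. An alternative is to inspect the stream itself and check whether it starts with the gzip magic bytes 0x1f 0x8b. Below is a minimal Java sketch of that approach, using mark/reset on a BufferedInputStream to peek at the first two bytes without consuming them. The names GzipSniffer, decompressIfGzipped and readAll are hypothetical, and the sample data is placeholder bytes, not a real picture.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipSniffer {
    /** Wraps the stream in a GZIPInputStream only if it starts with 0x1f 0x8b. */
    static InputStream decompressIfGzipped(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(2);               // remember the start of the stream
        int first = in.read();    // 0..255, or -1 at end of stream
        int second = in.read();
        in.reset();               // rewind so no bytes are lost
        boolean gzipped = first == 0x1f && second == 0x8b;
        return gzipped ? new GZIPInputStream(in) : in;
    }

    /** Reads a stream to the end and returns its bytes. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[2048];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] image = {(byte) 0xff, (byte) 0xd8, 7, 8, 9}; // placeholder bytes
        ByteArrayOutputStream gz = new ByteArrayOutputStream();
        try (GZIPOutputStream g = new GZIPOutputStream(gz)) {
            g.write(image);
        }
        byte[] fromPlain = readAll(decompressIfGzipped(new ByteArrayInputStream(image)));
        byte[] fromGzip = readAll(decompressIfGzipped(new ByteArrayInputStream(gz.toByteArray())));
        System.out.println(Arrays.equals(fromPlain, fromGzip)); // prints true
    }
}
```

This check works for pictures because common image formats do not begin with 0x1f 0x8b. It would, however, also unpack a file that is meant to be stored as a .gz archive, so the header-based check may be the better fit for general downloads.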

References

How to check if InputStream is Gzipped? (Stack Overflow)
