Premise
I previously ran into the problem of encapsulating the multipart/form-data media type while writing a general-purpose HTTP component. This article introduces the definition, usage, and a simple implementation of the multipart/form-data media type in the HTTP protocol.
Definition of multipart/form-data
The multipart/form-data media type follows the multipart MIME data stream definition (see Section 5.1 of RFC 2046). In short, the body of a multipart/form-data message consists of multiple parts separated by a fixed boundary value.
multipart/form-data request body layout
The layout of a multipart/form-data request body is as follows:
    # Request header - required: set Content-Type to multipart/form-data and specify a unique boundary value
    Content-Type: multipart/form-data; boundary=${Boundary}

    # Request body
    --${Boundary}
    Content-Disposition: form-data; name="name of file"
    Content-Type: application/octet-stream

    bytes of file
    --${Boundary}
    Content-Disposition: form-data; name="name of pdf"; filename="pdf-file.pdf"
    Content-Type: application/octet-stream

    bytes of pdf file
    --${Boundary}
    Content-Disposition: form-data; name="key"
    Content-Type: text/plain;charset=UTF-8

    text encoded in UTF-8
    --${Boundary}--
Compared with other media types such as application/x-www-form-urlencoded, the most obvious differences of multipart/form-data are:
- Besides setting the Content-Type request header to multipart/form-data, the boundary parameter must also be defined
- The request body consists of multiple parts, separated from one another by the delimiter --${Boundary} (two hyphens followed by the value of the boundary parameter)
- Each part must carry the header Content-Disposition: form-data; name="${PART_NAME}", where ${PART_NAME} needs to be URL-encoded. The filename parameter can additionally give the name of a file, but it is less binding than the name parameter (the local file name may be unavailable, or there may be objections to exposing it)
- Each part can define its own Content-Type and its own data body
- The request body ends with the closing delimiter --${Boundary}--
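The layout rules above can be sketched in a few lines of Java. This is only a minimal illustration for text fields: the class name MultipartLayoutSketch, the method render, and the boundary value used in main are made up for this example, and field names are written as-is without any encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of any library.
final class MultipartLayoutSketch {

    private static final String CRLF = "\r\n";

    /** Renders text fields as a multipart/form-data body delimited by the given boundary value. */
    static byte[] render(String boundary, Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            // Delimiter line that opens a part: "--" followed by the boundary value.
            body.append("--").append(boundary).append(CRLF);
            // Part headers, then an empty line, then the data of the part.
            // Assumption: the field name needs no escaping or encoding.
            body.append("Content-Disposition: form-data; name=\"")
                    .append(field.getKey()).append('"').append(CRLF);
            body.append("Content-Type: text/plain;charset=UTF-8").append(CRLF);
            body.append(CRLF);
            body.append(field.getValue()).append(CRLF);
        }
        // Closing delimiter: "--" + boundary value + "--".
        body.append("--").append(boundary).append("--").append(CRLF);
        return body.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("key", "value");
        fields.put("lang", "java");
        System.out.print(new String(render("demoBoundary", fields), StandardCharsets.UTF_8));
    }
}
```

The matching request header for this body would be Content-Type: multipart/form-data; boundary=demoBoundary.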
{% note warning flat %}
RFC 7578 mentions two deprecated usages of multipart/form-data. One is the Content-Transfer-Encoding header, which is not covered here. The other is sending multiple binary files for a single form field by nesting them in a multipart/mixed part (the scenario where one "name" corresponds to multiple binaries); this usage is deprecated as well.
{% endnote %}
In particular:
- If the content of a part is text and its content type is text/plain, the character set can be specified, for example Content-Type: text/plain; charset=UTF-8
- A field named _charset_ can be used to specify the default character set, as follows:
    --ABCDE
    Content-Disposition: form-data; name="_charset_"

    UTF-8
    --ABCDE
    Content-Disposition: form-data; name="field"

    ...text encoded in UTF-8...
    --ABCDE--
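On the receiving side, the _charset_ convention could be applied roughly as in the sketch below. It assumes the parts have already been split into a map from field name to raw bytes; the class CharsetFieldSketch, the method decode, and the UTF-8 fallback when no _charset_ field is present are all assumptions of this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class CharsetFieldSketch {

    /** Decodes raw text parts with the charset named by the "_charset_" part, if one is present. */
    static Map<String, String> decode(Map<String, byte[]> rawParts) {
        byte[] declared = rawParts.get("_charset_");
        // Assumption of this sketch: fall back to UTF-8 when no "_charset_" part was sent.
        // Charset.forName throws an unchecked exception for unknown or illegal charset names.
        Charset charset = declared == null
                ? StandardCharsets.UTF_8
                : Charset.forName(new String(declared, StandardCharsets.US_ASCII).trim());
        Map<String, String> decoded = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> part : rawParts.entrySet()) {
            // The "_charset_" part only carries metadata, so it is not returned as a field.
            if (!"_charset_".equals(part.getKey())) {
                decoded.put(part.getKey(), new String(part.getValue(), charset));
            }
        }
        return decoded;
    }
}
```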
Boundary parameter value specification
The rules for the boundary parameter value are as follows:
- Every delimiter line must begin with two hyphens (--), called the leading hyphens, followed by the value of the boundary parameter
- The boundary value cannot exceed 70 characters, not counting the two leading hyphens
- The boundary value may only use a restricted set of characters (letters, digits, and a few punctuation characters). If it contains characters with special meaning in a header parameter, such as the colon :, the value must be enclosed in quotes in the Content-Type header
- Every --${Boundary} delimiter must be preceded by a CRLF, which belongs to the delimiter rather than to the preceding part. If the text body of a part is itself meant to end with a CRLF, two CRLFs must therefore appear before the delimiter; if the body does not end with a CRLF, a single CRLF is enough. This article refers to these two cases as the explicit and implicit forms. This is rather abstract, so see the following example:

    # Request header
    Content-Type: multipart/form-data; boundary="abcdefg"

    --abcdefg
    Content-Disposition: form-data; name="x"
    Content-Type: text/plain; charset=ascii

    It does NOT end with a linebreak    # <=== no CRLF at the end of the body: implicit form
    --abcdefg
    Content-Disposition: form-data; name="y"
    Content-Type: text/plain; charset=ascii

    It DOES end with a linebreak    # <=== the body ends with a CRLF: explicit form

    --abcdefg

    ## Implicit form with the line breaks written out
    It does NOT end with a linebreak CRLF
    --abcdefg

    ## Explicit form with the line breaks written out
    It DOES end with a linebreak CRLF
    CRLF
    --abcdefg
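A candidate boundary value can be checked mechanically. The sketch below follows my reading of the boundary grammar in Section 5.1.1 of RFC 2046 (1 to 70 characters from a restricted set, with a space allowed anywhere except at the end); the class name BoundaryValidator and its methods are hypothetical. The contentType method always quotes the value, which the RFC describes as not always necessary but never harmful.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class BoundaryValidator {

    // Characters allowed by "bcharsnospace" in RFC 2046 Section 5.1.1, written as a regex class body.
    private static final String NO_SPACE = "0-9A-Za-z'()+_,\\-./:=?";

    // Up to 69 characters that may include spaces, followed by one final non-space character.
    private static final Pattern BOUNDARY =
            Pattern.compile("[" + NO_SPACE + " ]{0,69}[" + NO_SPACE + "]");

    /** Returns true if the value has 1 to 70 allowed characters and does not end with a space. */
    static boolean isValid(String boundary) {
        return boundary != null && BOUNDARY.matcher(boundary).matches();
    }

    /** Builds the Content-Type header value with the boundary value enclosed in quotes. */
    static String contentType(String boundary) {
        if (!isValid(boundary)) {
            throw new IllegalArgumentException("invalid boundary: " + boundary);
        }
        return "multipart/form-data; boundary=\"" + boundary + "\"";
    }

    public static void main(String[] args) {
        System.out.println(isValid("abcdefg"));
        System.out.println(contentType("abcdefg"));
    }
}
```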
Implementing a POST request with the multipart/form-data media type
This section only implements HTTP clients that send multipart/form-data POST requests: one based on HttpURLConnection for older JDK versions, and one based on the HttpClient built into newer JDK versions. Other implementations, such as a custom Socket-based client, can follow the same approach. First, add org.springframework.boot:spring-boot-starter-web:2.6.0 and write a simple controller:
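    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.RestController;
    import org.springframework.web.multipart.MultipartHttpServletRequest;

    @RestController
    public class TestController {

        @PostMapping(path = "/test")
        public ResponseEntity<?> test(MultipartHttpServletRequest request) {
            return ResponseEntity.ok("ok");
        }
    }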
The simulated request in Postman is as follows:
The request parameters received by the backend controller are as follows:
The clients written below can call this endpoint directly for debugging.
A module that encapsulates writing the request body into a byte container
The boundaries here are always written explicitly, and the boundary value can simply be generated from a fixed prefix plus a UUID. The simple implementation makes a few simplifications:
- Only consider submitting text form data and binary (file) form data
- Based on the point above, each part explicitly specifies the Content-Type header
- Text encoding is fixed to UTF-8
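Generating a boundary value from a fixed prefix and a UUID could look like the sketch below. The prefix FormBoundary and the class name BoundaryGeneratorSketch are made up for this example; the result is 44 characters long, well within the 70-character limit.

```java
import java.util.UUID;

// Hypothetical helper for illustration only.
final class BoundaryGeneratorSketch {

    // Assumed fixed prefix; any short string of letters and digits would do.
    private static final String PREFIX = "FormBoundary";

    /** Returns the prefix followed by the 32 hex digits of a random UUID (its hyphens removed). */
    static String newBoundary() {
        return PREFIX + UUID.randomUUID().toString().replace("-", "");
    }

    public static void main(String[] args) {
        System.out.println(newBoundary());
    }
}
```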
Write a MultipartWriter:
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Objects;
    import java.util.UUID;
    import java.util.function.BiConsumer;

    import lombok.RequiredArgsConstructor;

    public class MultipartWriter {

        private static final Charset DEFAULT_CHARSET = StandardCharsets.UTF_8;
        private static final byte[] FIELD_SEP = ": ".getBytes(StandardCharsets.ISO_8859_1);
        private static final byte[] CR_LF = "\r\n".getBytes(StandardCharsets.ISO_8859_1);
        private static final String TWO_HYPHENS_TEXT = "--";
        private static final byte[] TWO_HYPHENS = TWO_HYPHENS_TEXT.getBytes(StandardCharsets.ISO_8859_1);
        private static final String CONTENT_DISPOSITION_KEY = "Content-Disposition";
        private static final String CONTENT_TYPE_KEY = "Content-Type";
        private static final String DEFAULT_CONTENT_TYPE = "multipart/form-data; boundary=";
        private static final String DEFAULT_BINARY_CONTENT_TYPE = "application/octet-stream";
        private static final String DEFAULT_TEXT_CONTENT_TYPE = "text/plain;charset=UTF-8";
        private static final String DEFAULT_CONTENT_DISPOSITION_VALUE = "form-data; name=\"%s\"";
        private static final String FILE_CONTENT_DISPOSITION_VALUE = "form-data; name=\"%s\"; filename=\"%s\"";

        private final Map<String, String> headers = new HashMap<>(8);
        private final List<AbstractMultipartPart> parts = new ArrayList<>();
        private final String boundary;

        private MultipartWriter(String boundary) {
            // Default boundary value: "--" followed by a UUID without hyphens
            this.boundary = Objects.isNull(boundary)
                    ? TWO_HYPHENS_TEXT + UUID.randomUUID().toString().replace("-", "") : boundary;
            this.headers.put(CONTENT_TYPE_KEY, DEFAULT_CONTENT_TYPE + this.boundary);
        }

        public static MultipartWriter newMultipartWriter(String boundary) {
            return new MultipartWriter(boundary);
        }

        public static MultipartWriter newMultipartWriter() {
            return new MultipartWriter(null);
        }

        public MultipartWriter addHeader(String key, String value) {
            // Content-Type is managed by the writer itself and cannot be overridden
            if (!CONTENT_TYPE_KEY.equalsIgnoreCase(key)) {
                headers.put(key, value);
            }
            return this;
        }

        public MultipartWriter addTextPart(String name, String text) {
            parts.add(new TextPart(String.format(DEFAULT_CONTENT_DISPOSITION_VALUE, name),
                    DEFAULT_TEXT_CONTENT_TYPE, this.boundary, text));
            return this;
        }

        public MultipartWriter addBinaryPart(String name, byte[] bytes) {
            parts.add(new BinaryPart(String.format(DEFAULT_CONTENT_DISPOSITION_VALUE, name),
                    DEFAULT_BINARY_CONTENT_TYPE, this.boundary, bytes));
            return this;
        }

        public MultipartWriter addFilePart(String name, File file) {
            parts.add(new FilePart(String.format(FILE_CONTENT_DISPOSITION_VALUE, name, file.getName()),
                    DEFAULT_BINARY_CONTENT_TYPE, this.boundary, file));
            return this;
        }

        private static void writeHeader(String key, String value, OutputStream out) throws IOException {
            writeBytes(key, out);
            writeBytes(FIELD_SEP, out);
            writeBytes(value, out);
            writeBytes(CR_LF, out);
        }

        private static void writeBytes(String text, OutputStream out) throws IOException {
            out.write(text.getBytes(DEFAULT_CHARSET));
        }

        private static void writeBytes(byte[] bytes, OutputStream out) throws IOException {
            out.write(bytes);
        }

        interface MultipartPart {

            void writeBody(OutputStream os) throws IOException;
        }

        @RequiredArgsConstructor
        public static abstract class AbstractMultipartPart implements MultipartPart {

            protected final String contentDispositionValue;
            protected final String contentTypeValue;
            protected final String boundary;

            protected String getContentDispositionValue() {
                return contentDispositionValue;
            }

            protected String getContentTypeValue() {
                return contentTypeValue;
            }

            protected String getBoundary() {
                return boundary;
            }

            // Writes one part: delimiter line, part headers, empty line, body, CRLF
            public final void write(OutputStream out) throws IOException {
                writeBytes(TWO_HYPHENS, out);
                writeBytes(getBoundary(), out);
                writeBytes(CR_LF, out);
                writeHeader(CONTENT_DISPOSITION_KEY, getContentDispositionValue(), out);
                writeHeader(CONTENT_TYPE_KEY, getContentTypeValue(), out);
                writeBytes(CR_LF, out);
                writeBody(out);
                writeBytes(CR_LF, out);
            }
        }

        public static class TextPart extends AbstractMultipartPart {

            private final String text;

            public TextPart(String contentDispositionValue, String contentTypeValue, String boundary, String text) {
                super(contentDispositionValue, contentTypeValue, boundary);
                this.text = text;
            }

            @Override
            public void writeBody(OutputStream os) throws IOException {
                os.write(text.getBytes(DEFAULT_CHARSET));
            }

            @Override
            protected String getContentDispositionValue() {
                return contentDispositionValue;
            }

            @Override
            protected String getContentTypeValue() {
                return contentTypeValue;
            }
        }

        public static class BinaryPart extends AbstractMultipartPart {

            private final byte[] content;

            public BinaryPart(String contentDispositionValue, String contentTypeValue, String boundary, byte[] content) {
                super(contentDispositionValue, contentTypeValue, boundary);
                this.content = content;
            }

            @Override
            public void writeBody(OutputStream out) throws IOException {
                out.write(content);
            }
        }

        public static class FilePart extends AbstractMultipartPart {

            private final File file;

            public FilePart(String contentDispositionValue, String contentTypeValue, String boundary, File file) {
                super(contentDispositionValue, contentTypeValue, boundary);
                this.file = file;
            }

            @Override
            public void writeBody(OutputStream out) throws IOException {
                // Stream the file in 4 KB chunks instead of loading it into memory
                try (InputStream in = new FileInputStream(file)) {
                    final byte[] buffer = new byte[4096];
                    int l;
                    while ((l = in.read(buffer)) != -1) {
                        out.write(buffer, 0, l);
                    }
                    out.flush();
                }
            }
        }

        public void forEachHeader(BiConsumer<String, String> consumer) {
            headers.forEach(consumer);
        }

        // Writes all parts followed by the closing delimiter "--" + boundary + "--"
        public void write(OutputStream out) throws IOException {
            if (!parts.isEmpty()) {
                for (AbstractMultipartPart part : parts) {
                    part.write(out);
                }
            }
            writeBytes(TWO_HYPHENS, out);
            writeBytes(this.boundary, out);
            writeBytes(TWO_HYPHENS, out);
            writeBytes(CR_LF, out);
        }
    }
This class encapsulates three kinds of part implementations (text, binary, and file). The forEachHeader() method traverses the request headers, and the final write() method writes the request body to an OutputStream.
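As a rough way to sanity-check bytes such as those produced by write(), the sketch below counts the parts in a body by looking for delimiter lines. It is not a real multipart parser: the class PartCounterSketch and the method countParts are hypothetical, and the code assumes that the delimiter never occurs as a line inside a part body.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not a full multipart parser.
final class PartCounterSketch {

    /**
     * Counts the parts of a multipart body for the given boundary value.
     * Throws IllegalArgumentException if the closing delimiter is missing.
     */
    static int countParts(byte[] body, String boundary) {
        // ISO-8859-1 maps every byte to exactly one char, so binary content is not mangled.
        String text = new String(body, StandardCharsets.ISO_8859_1);
        String delimiter = "--" + boundary;
        String closeDelimiter = delimiter + "--";
        int parts = 0;
        for (String line : text.split("\r\n", -1)) {
            if (line.equals(closeDelimiter)) {
                return parts; // closing delimiter reached
            }
            if (line.equals(delimiter)) {
                parts++; // a delimiter line opens a new part
            }
        }
        throw new IllegalArgumentException("missing closing delimiter " + closeDelimiter);
    }

    public static void main(String[] args) {
        String body = "--B\r\nContent-Disposition: form-data; name=\"a\"\r\n\r\n1\r\n--B--\r\n";
        System.out.println(countParts(body.getBytes(StandardCharsets.ISO_8859_1), "B"));
    }
}
```

The boundary argument is the value from the Content-Type header; the two leading hyphens of each delimiter line are added inside the method.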
HttpURLConnection implementation
The implementation is as follows (this is the simplest possible version; fault tolerance and exception handling are not considered):
    import java.io.BufferedReader;
    import java.io.DataOutputStream;
    import java.io.File;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.nio.charset.StandardCharsets;
    import java.util.Objects;

    public class HttpURLConnectionApp {

        private static final String URL = "http://localhost:9099/test";

        public static void main(String[] args) throws Exception {
            MultipartWriter writer = MultipartWriter.newMultipartWriter();
            writer.addTextPart("name", "throwable")
                    .addTextPart("domain", "vlts.cn")
                    .addFilePart("ico", new File("I:\\doge_favicon.ico"));
            // Print the request body to the console for debugging
            DataOutputStream requestPrinter = new DataOutputStream(System.out);
            writer.write(requestPrinter);
            HttpURLConnection connection = (HttpURLConnection) new java.net.URL(URL).openConnection();
            connection.setRequestMethod("POST");
            connection.addRequestProperty("Connection", "Keep-Alive");
            // Set request header
            writer.forEachHeader(connection::addRequestProperty);
            connection.setDoInput(true);
            connection.setDoOutput(true);
            connection.setConnectTimeout(10000);
            connection.setReadTimeout(10000);
            DataOutputStream out = new DataOutputStream(connection.getOutputStream());
            // Set request body
            writer.write(out);
            // Read the response body
            StringBuilder builder = new StringBuilder();
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8));
            String line;
            while (Objects.nonNull(line = reader.readLine())) {
                builder.append(line);
            }
            int responseCode = connection.getResponseCode();
            reader.close();
            out.close();
            connection.disconnect();
            System.out.printf("Response code:%d,Response content:%s\n", responseCode, builder);
        }
    }
Execution response result:
Response code:200,Response content:ok
To print the request body, add the last two lines of the following snippet:

    MultipartWriter writer = MultipartWriter.newMultipartWriter();
    writer.addTextPart("name", "throwable")
            .addTextPart("domain", "vlts.cn")
            .addFilePart("ico", new File("I:\\doge_favicon.ico"));
    DataOutputStream requestPrinter = new DataOutputStream(System.out);
    writer.write(requestPrinter);

The console output is as follows:
JDK built-in HttpClient implementation
JDK 11+ has a built-in HTTP client; the entry point is java.net.http.HttpClient. The implementation is as follows:
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.time.Duration;
    import java.time.temporal.ChronoUnit;

    public class HttpClientApp {

        private static final String URL = "http://localhost:9099/test";

        public static void main(String[] args) throws Exception {
            HttpClient httpClient = HttpClient.newBuilder()
                    .connectTimeout(Duration.of(10, ChronoUnit.SECONDS))
                    .build();
            MultipartWriter writer = MultipartWriter.newMultipartWriter();
            writer.addTextPart("name", "throwable")
                    .addTextPart("domain", "vlts.cn")
                    .addFilePart("ico", new File("I:\\doge_favicon.ico"));
            // Buffer the whole request body in memory before sending it
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            writer.write(out);
            HttpRequest.Builder requestBuilder = HttpRequest.newBuilder();
            // Set request header
            writer.forEachHeader(requestBuilder::header);
            HttpRequest request = requestBuilder.uri(URI.create(URL))
                    .method("POST", HttpRequest.BodyPublishers.ofByteArray(out.toByteArray()))
                    .build();
            HttpResponse<String> response = httpClient.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.printf("Response code:%d,Response content:%s\n", response.statusCode(), response.body());
        }
    }
The built-in HTTP client is built almost entirely on a reactive programming model, and the APIs it exposes are relatively low-level: they are highly flexible but not very easy to use.
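To make the reactive flavor concrete: an HttpRequest.BodyPublisher is a Flow.Publisher of ByteBuffer, so its content can only be read by subscribing to it. The sketch below drains a publisher into a byte array without sending any request. The class BodyPublisherSketch, the method drain, and the 5-second timeout are assumptions of this example, and requesting Long.MAX_VALUE items simply switches back-pressure off.

```java
import java.io.ByteArrayOutputStream;
import java.net.http.HttpRequest;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Flow;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only.
final class BodyPublisherSketch {

    /** Subscribes to the publisher and collects every emitted buffer into one byte array. */
    static byte[] drain(HttpRequest.BodyPublisher publisher) {
        CompletableFuture<byte[]> result = new CompletableFuture<>();
        publisher.subscribe(new Flow.Subscriber<ByteBuffer>() {
            private final ByteArrayOutputStream collected = new ByteArrayOutputStream();

            @Override
            public void onSubscribe(Flow.Subscription subscription) {
                // Ask for everything at once; this sketch applies no back-pressure.
                subscription.request(Long.MAX_VALUE);
            }

            @Override
            public void onNext(ByteBuffer item) {
                byte[] chunk = new byte[item.remaining()];
                item.get(chunk);
                collected.write(chunk, 0, chunk.length);
            }

            @Override
            public void onError(Throwable throwable) {
                result.completeExceptionally(throwable);
            }

            @Override
            public void onComplete() {
                result.complete(collected.toByteArray());
            }
        });
        try {
            // Assumed timeout so that a publisher that never completes does not block forever.
            return result.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("publisher did not complete normally", e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = "hello multipart".getBytes(StandardCharsets.UTF_8);
        HttpRequest.BodyPublisher publisher = HttpRequest.BodyPublishers.ofByteArray(payload);
        System.out.println(publisher.contentLength());
        System.out.println(new String(drain(publisher), StandardCharsets.UTF_8));
    }
}
```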
Summary
The multipart/form-data media type is mostly used in HTTP requests sent with the POST method; using it in HTTP responses is relatively rare.
(End of article, c-1-d e-a-20211226. Only after finishing did I notice that the generated boundary value carries extra leading hyphens; having looked at the request Postman sends, I left it unchanged.)