Java Protocol Buffer tutorial

Posted by Rdam on Tue, 18 Jan 2022 08:08:57 +0100

1, Why use Protocol Buffer

Suppose we want to create a very simple "address book" application that can read and write people's contact information from files. Everyone in the address book has a name, ID, e-mail address and contact phone number.

How do you serialize and retrieve structured data like this? There are a few ways to approach the problem:

  1. Use Java serialization. It is built into the language, but it has many well-known problems, and it does not work well if you need to share data with applications written in other languages.
  2. Invent an ad-hoc way to encode the data items into a single string, for example encoding four ints as "12:3:-23:67". This is a simple and flexible approach, although it requires writing one-off encoding and parsing code, and the parsing imposes a small run-time cost. It works best for encoding very simple data.
  3. Serialize the data to XML. This approach can be very attractive because XML is (to some extent) human-readable and there are binding libraries for many languages. It can be a good choice if you want to share data with other applications or projects. However, XML is notoriously space-intensive, and encoding and decoding it can impose a large performance penalty on applications. Also, navigating an XML DOM tree is considerably more complicated than navigating simple fields in a class.
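
To make the second option concrete, here is a minimal sketch of such an ad-hoc encoding for the four ints mentioned above. The class and method names (AdHocCodec, encode, decode) are made up for illustration; the point is only that both directions have to be written and maintained by hand.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical hand-rolled codec: joins ints with ':' and splits them back.
class AdHocCodec {
  // Encodes the values as a single colon-separated string.
  static String encode(int[] values) {
    return Arrays.stream(values)
        .mapToObj(Integer::toString)
        .collect(Collectors.joining(":"));
  }

  // Parses a string produced by encode(); throws NumberFormatException on bad input.
  static int[] decode(String text) {
    if (text.isEmpty()) {
      return new int[0];
    }
    return Arrays.stream(text.split(":"))
        .mapToInt(Integer::parseInt)
        .toArray();
  }

  public static void main(String[] args) {
    String encoded = encode(new int[] {12, 3, -23, 67});
    System.out.println(encoded);                          // 12:3:-23:67
    System.out.println(Arrays.toString(decode(encoded))); // [12, 3, -23, 67]
  }
}
```

Every new field or type means touching both encode and decode, which is the maintenance cost this approach carries.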

Protocol buffers are the flexible, efficient, automated solution to exactly this problem. With protocol buffers, you write a .proto description of the data structure you want to store. From that, the protocol buffer compiler creates a class that implements automatic encoding and parsing of the protocol buffer data in an efficient binary format. The generated class provides getters and setters for the fields that make up a protocol buffer and takes care of the details of reading and writing the protocol buffer as a unit. Importantly, the protocol buffer format supports extending the format over time in such a way that the code can still read data encoded with the old format.

2, A simple address book program

2.1 create a proto file

To create the address book application, you need to start with a .proto file. The definitions in a .proto file are simple: you add a message for each data structure you want to serialize, then specify a name and a type for each field in the message. Here is the .proto file that defines your messages, addressbook.proto.

syntax = "proto2";

package tutorial;

option java_multiple_files = true;
option java_package = "com.example.tutorial.protos";
option java_outer_classname = "AddressBookProtos";

message Person {
  optional string name = 1;
  optional int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    optional string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phones = 4;
}

message AddressBook {
  repeated Person people = 1;
}

The .proto file starts with a package declaration, which helps prevent naming conflicts between different projects. In Java, the package name is used as the Java package unless you explicitly specify a java_package, as we have here. Even if you do provide a java_package, you should still define a normal package as well to avoid name collisions in the Protocol Buffers name space and in non-Java languages.

After the package declaration, you can see three Java-specific options: java_multiple_files, java_package, and java_outer_classname.

  • java_multiple_files: when set to true, generates a separate .java file for each generated class, instead of nesting all the classes inside a single wrapper class in one .java file.
  • java_package: specifies the Java package in which the generated classes should live. If you do not specify it explicitly, it simply matches the package name given in the package declaration, but such names are usually not appropriate Java package names (since they usually do not start with a domain name).
  • java_outer_classname: defines the class name of the wrapper class that represents this file. If you do not give a java_outer_classname explicitly, it is generated by converting the file name to upper camel case. For example, my_proto.proto would by default use MyProto as the wrapper class name.
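
The default naming rule for the wrapper class can be approximated in a few lines. This is a simplified sketch, not the compiler's actual implementation: it only handles underscores and the .proto suffix, and the class and method names (OuterClassName, fromFileName) are hypothetical.

```java
// Simplified sketch of the default outer class naming rule; not protoc's actual code.
class OuterClassName {
  // Converts a file name such as "my_proto.proto" to upper camel case.
  static String fromFileName(String fileName) {
    String base = fileName.endsWith(".proto")
        ? fileName.substring(0, fileName.length() - ".proto".length())
        : fileName;
    StringBuilder result = new StringBuilder();
    boolean upperNext = true;
    for (char c : base.toCharArray()) {
      if (c == '_') {
        upperNext = true; // an underscore starts a new word and is dropped
      } else if (upperNext) {
        result.append(Character.toUpperCase(c));
        upperNext = false;
      } else {
        result.append(c);
      }
    }
    return result.toString();
  }

  public static void main(String[] args) {
    System.out.println(fromFileName("my_proto.proto"));    // MyProto
    System.out.println(fromFileName("addressbook.proto")); // Addressbook
  }
}
```

The second result shows why the example sets java_outer_classname explicitly: the default derived from addressbook.proto would not be a very descriptive class name.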

Next come the message definitions. A message is simply an aggregate containing a set of typed fields. Many standard simple data types are available as field types, including bool, int32, float, double, and string.

  optional string name = 1;
  optional int32 id = 2;
  optional string email = 3;

You can also add further structure to the message by using other message types as field types - in the above example, the Person message contains the PhoneNumber message and the AddressBook message contains the Person message. You can even define message types nested in other messages. As you can see, the PhoneNumber type is defined in Person.

message Person {
  ...
  message PhoneNumber {
    optional string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }
  ...
}

You can also define enum types if you want one of your fields to take one of a predefined list of values. Here you want to specify that a phone number can be one of MOBILE, HOME, or WORK.

enum PhoneType {
  MOBILE = 0;
  HOME = 1;
  WORK = 2;
}

The "= 1", "= 2" markers on each element identify the unique "tag" that the field uses in the binary encoding. Tag numbers 1-15 require one less byte to encode than higher numbers, so as an optimization you can use those tags for commonly used or repeated elements, leaving tags 16 and higher for less commonly used optional elements. Each element in a repeated field requires re-encoding the tag number, so repeated fields are particularly good candidates for this optimization. For example, the name, id and email fields here are given the tags 1, 2 and 3:

  optional string name = 1;
  optional int32 id = 2;
  optional string email = 3;
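
The one-byte saving for tags 1-15 follows from how the binary format writes the key in front of each field: the field number is shifted left by three bits, combined with a three-bit wire type, and written as a base-128 varint. The sketch below computes how many bytes that key takes; the class and method names (TagSize, key, varintSize, tagBytes) are made up for this illustration.

```java
// Sketch of how a field's key is sized on the wire, to show why tags 1-15 are cheaper.
class TagSize {
  // Wire type 0 is used for varint-encoded scalars such as int32.
  static final int WIRETYPE_VARINT = 0;

  // The key written before each field value is (fieldNumber << 3) | wireType.
  static int key(int fieldNumber, int wireType) {
    return (fieldNumber << 3) | wireType;
  }

  // Number of bytes needed to write a non-negative value as a base-128 varint.
  static int varintSize(int value) {
    int size = 1;
    while ((value >>>= 7) != 0) {
      size++; // each extra 7 bits of payload costs one more byte
    }
    return size;
  }

  // Size in bytes of the key for the given field number.
  static int tagBytes(int fieldNumber) {
    return varintSize(key(fieldNumber, WIRETYPE_VARINT));
  }

  public static void main(String[] args) {
    for (int fieldNumber : new int[] {1, 15, 16, 2047, 2048}) {
      System.out.println("field " + fieldNumber + " -> " + tagBytes(fieldNumber) + " byte(s)");
    }
  }
}
```

Because three of the seven payload bits in the first byte go to the wire type, only four bits remain for the field number, which is where the limit of 15 comes from.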

Each field must be annotated with one of the following modifiers:

  • optional: the field may or may not be set. If an optional field value is not set, a default value is used. For simple types you can specify your own default, as we did for the phone number type in the example. Otherwise a system default is used: zero for numeric types, the empty string for strings, and false for bools.
  • repeated: the field may be repeated any number of times (including zero). The order of the repeated values is preserved in the protocol buffer. You can think of repeated fields as dynamically sized arrays.
  • required: a value for the field must be provided, otherwise the message is considered "uninitialized". Trying to build an uninitialized message throws a RuntimeException, and parsing an uninitialized message throws an IOException. Other than this, a required field behaves exactly like an optional field.

2.2 compile your Protocol Buffer

Now that you have a .proto file, the next step is to generate the classes you need to read and write AddressBook (and hence Person and PhoneNumber) messages. To do this, run the protocol buffer compiler protoc on your .proto file:

  1. If you have not installed the compiler, download the package and follow the instructions in the README.
  2. Now run the compiler, specifying the source directory (where your application's source code lives; the current directory is used if you do not provide a value), the destination directory (where you want the generated code to go; often the same as $SRC_DIR), and the path to your .proto file. In this case, run:
protoc -I=$SRC_DIR --java_out=$DST_DIR $SRC_DIR/addressbook.proto

This generates a com/example/tutorial/protos/ subdirectory under the specified destination directory, containing a few generated .java files.

2.3 Protocol Buffer API

Let's look at some of the generated code and see what classes and methods the compiler has created for you. If you look in com/example/tutorial/protos/, you will see that it contains .java files defining a class for each message you specified in addressbook.proto. Each class has its own Builder class that you use to create instances of that class.

Both messages and builders have auto-generated accessor methods for each field of the message; messages have only getters, while builders have both getters and setters. Here are some of the accessors for the Person class (implementations omitted for brevity):

// optional string name = 1;
public boolean hasName();
public String getName();

// optional int32 id = 2;
public boolean hasId();
public int getId();

// optional string email = 3;
public boolean hasEmail();
public String getEmail();

// repeated .tutorial.Person.PhoneNumber phones = 4;
public List<PhoneNumber> getPhonesList();
public int getPhonesCount();
public PhoneNumber getPhones(int index);

Meanwhile, Person.Builder has the same getters plus setters:

// optional string name = 1;
public boolean hasName();
public String getName();
public Builder setName(String value);
public Builder clearName();

// optional int32 id = 2;
public boolean hasId();
public int getId();
public Builder setId(int value);
public Builder clearId();

// optional string email = 3;
public boolean hasEmail();
public String getEmail();
public Builder setEmail(String value);
public Builder clearEmail();

// repeated .tutorial.Person.PhoneNumber phones = 4;
public List<PhoneNumber> getPhonesList();
public int getPhonesCount();
public PhoneNumber getPhones(int index);
public Builder setPhones(int index, PhoneNumber value);
public Builder addPhones(PhoneNumber value);
public Builder addAllPhones(Iterable<PhoneNumber> value);
public Builder clearPhones();

As you can see, there are simple JavaBeans-style getters and setters for each field. There is also a has getter for each singular field, which returns true if that field has been set. Finally, each field has a clear method that un-sets the field back to its empty state.

Repeated fields have some extra methods: a Count method (which returns the size of the list), getters and setters that get or set a specific element of the list by index, an add method that appends a new element to the list, and an addAll method that adds an entire container of elements to the list.
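
To show what these accessors mean in practice, here is a hand-written stand-in for a builder with one singular field and one repeated field. It is only a sketch of the has/get/clear behaviour described above, not the code protoc generates; the class name MiniPersonBuilder is hypothetical, and phone numbers are plain strings to keep it short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hand-written stand-in for a generated builder; real generated code is far more involved.
class MiniPersonBuilder {
  private String email;                                   // null means "not set"
  private final List<String> phones = new ArrayList<>();  // repeated field

  boolean hasEmail() { return email != null; }

  // An unset field reads as its default value (the empty string here).
  String getEmail() { return email == null ? "" : email; }

  MiniPersonBuilder setEmail(String value) {
    if (value == null) throw new NullPointerException("value");
    email = value;
    return this;
  }

  MiniPersonBuilder clearEmail() { email = null; return this; }

  // Repeated-field accessors: list view, count, indexed get, add, clear.
  List<String> getPhonesList() { return Collections.unmodifiableList(phones); }
  int getPhonesCount() { return phones.size(); }
  String getPhones(int index) { return phones.get(index); }
  MiniPersonBuilder addPhones(String value) { phones.add(value); return this; }
  MiniPersonBuilder clearPhones() { phones.clear(); return this; }

  public static void main(String[] args) {
    MiniPersonBuilder person = new MiniPersonBuilder();
    System.out.println(person.hasEmail());        // false: nothing set yet
    person.setEmail("jdoe@example.com").addPhones("555-4321");
    System.out.println(person.hasEmail());        // true
    System.out.println(person.getPhonesCount());  // 1
    person.clearEmail();
    System.out.println(person.hasEmail());        // false again
  }
}
```

Note that getEmail() never returns null: reading an unset field yields the default, and hasEmail() is the way to tell the two cases apart.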

2.4 generate a message

The message classes generated by the protocol buffer compiler are all immutable. Once a message object is constructed, it cannot be modified, just like a Java String. To construct a message, you first construct a builder, set any fields you want to set to your chosen values, and then call the builder's build() method. You may have noticed that each setter of the builder returns another builder. The returned object is actually the same builder on which you called the method. It is returned for convenience so that you can chain several setters together in a single statement. For example:

Person john =
  Person.newBuilder()
    .setId(1234)
    .setName("John Doe")
    .setEmail("jdoe@example.com")
    .addPhones(
      Person.PhoneNumber.newBuilder()
        .setNumber("555-4321")
        .setType(Person.PhoneType.HOME))
    .build();
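
The pattern behind this can be sketched without any generated code. The class below is a hand-written illustration, with the hypothetical name Contact, of an immutable message paired with a builder whose setters return the builder itself; it is an assumption about the general shape of the pattern, not a copy of what protoc emits.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hand-written illustration of the immutable-message-plus-builder pattern.
final class Contact {
  private final String name;
  private final List<String> numbers;

  private Contact(Builder b) {
    this.name = b.name;
    // Copy the list so later builder changes cannot leak into this message.
    this.numbers = Collections.unmodifiableList(new ArrayList<>(b.numbers));
  }

  String getName() { return name; }
  List<String> getNumbersList() { return numbers; }

  static Builder newBuilder() { return new Builder(); }

  static final class Builder {
    private String name = "";
    private final List<String> numbers = new ArrayList<>();

    // Each setter returns this builder, which is what makes chaining possible.
    Builder setName(String value) { name = value; return this; }
    Builder addNumbers(String value) { numbers.add(value); return this; }

    Contact build() { return new Contact(this); }
  }

  public static void main(String[] args) {
    Contact.Builder builder = Contact.newBuilder();
    Contact john = builder.setName("John Doe").addNumbers("555-4321").build();
    builder.setName("Someone Else");     // does not affect the built message
    System.out.println(john.getName());  // John Doe
  }
}
```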

2.5 standard message method

Each message and builder class also contains many other methods that allow you to examine or manipulate the entire message, including:

  • isInitialized(): checks whether all required fields have been set.
  • toString(): returns a human-readable representation of the message, which is particularly useful for debugging.
  • mergeFrom(Message other): (builder only) merges the contents of other into this message, overwriting singular scalar fields, merging composite fields, and concatenating repeated fields.
  • clear(): (builder only) clears all the fields back to the empty state.
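
The merge rules are easiest to see on a tiny hand-written example. The sketch below uses a hypothetical MiniBookBuilder with one singular field (owner) and one repeated field (people). For brevity it merges from another builder rather than from a built message, and it does not show the recursive merging of composite fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written sketch of merge semantics; the field names are hypothetical.
class MiniBookBuilder {
  private String owner;                                   // singular scalar, null = unset
  private final List<String> people = new ArrayList<>();  // repeated field

  MiniBookBuilder setOwner(String value) { owner = value; return this; }
  MiniBookBuilder addPeople(String value) { people.add(value); return this; }
  String getOwner() { return owner == null ? "" : owner; }
  List<String> getPeopleList() { return new ArrayList<>(people); }

  MiniBookBuilder mergeFrom(MiniBookBuilder other) {
    if (other.owner != null) {
      owner = other.owner;        // a singular scalar set in other overwrites ours
    }
    people.addAll(other.people);  // repeated fields are concatenated
    return this;
  }

  // Resets every field back to the empty state.
  MiniBookBuilder clear() {
    owner = null;
    people.clear();
    return this;
  }

  public static void main(String[] args) {
    MiniBookBuilder a = new MiniBookBuilder().setOwner("Alice").addPeople("John");
    MiniBookBuilder b = new MiniBookBuilder().addPeople("Jane");
    a.mergeFrom(b);
    System.out.println(a.getOwner());       // Alice (b had no owner set)
    System.out.println(a.getPeopleList());  // [John, Jane]
  }
}
```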

2.6 parsing and serialization

Finally, each protocol buffer class has methods to write and read messages of the selected type using the protocol buffer binary format. These include:

  • byte[] toByteArray(): serializes the message and returns a byte array containing its raw bytes.
  • static Person parseFrom(byte[] data): parses a message from the given byte array.
  • void writeTo(OutputStream output): serializes the message and writes it to an OutputStream.
  • static Person parseFrom(InputStream input): reads and parses a message from an InputStream.
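
To give a feel for what the raw bytes look like, here is a hand-rolled sketch that writes and reads two fields (a string with tag 1 and a non-negative int32 with tag 2) in the protocol buffer binary format. The class WireSketch and its helpers are made up for this illustration. Unlike generated code, it always writes both fields, assumes the id is non-negative, rejects unknown tags instead of skipping them, and reports malformed input with an unchecked exception rather than an IOException.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hand-rolled sketch of the binary format for two fields; generated code does this for you.
class WireSketch {
  final String name; // field 1, string
  final int id;      // field 2, int32 (assumed non-negative in this sketch)

  WireSketch(String name, int id) { this.name = name; this.id = id; }

  // Writes a non-negative int as a base-128 varint, low-order group first.
  private static void writeVarint(ByteArrayOutputStream out, int value) {
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80); // high bit set: more bytes follow
      value >>>= 7;
    }
    out.write(value);
  }

  byte[] toByteArray() {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
    out.write(0x0A);                  // key: field 1, wire type 2 (length-delimited)
    writeVarint(out, utf8.length);
    out.write(utf8, 0, utf8.length);
    out.write(0x10);                  // key: field 2, wire type 0 (varint)
    writeVarint(out, id);
    return out.toByteArray();
  }

  // Minimal cursor over the input bytes.
  private static final class Reader {
    final byte[] data;
    int pos = 0;
    Reader(byte[] data) { this.data = data; }
    boolean done() { return pos >= data.length; }
    int readByte() {
      if (done()) throw new IllegalArgumentException("truncated message");
      return data[pos++] & 0xFF;
    }
    int readVarint() {
      int result = 0;
      for (int shift = 0; shift < 32; shift += 7) {
        int b = readByte();
        result |= (b & 0x7F) << shift;
        if ((b & 0x80) == 0) return result;
      }
      throw new IllegalArgumentException("varint too long");
    }
  }

  static WireSketch parseFrom(byte[] data) {
    Reader in = new Reader(data);
    String name = "";
    int id = 0;
    while (!in.done()) {
      int key = in.readVarint();
      if (key == 0x0A) {
        int length = in.readVarint();
        if (length < 0 || length > data.length - in.pos) {
          throw new IllegalArgumentException("truncated string");
        }
        name = new String(data, in.pos, length, StandardCharsets.UTF_8);
        in.pos += length;
      } else if (key == 0x10) {
        id = in.readVarint();
      } else {
        throw new IllegalArgumentException("unexpected key " + key);
      }
    }
    return new WireSketch(name, id);
  }

  public static void main(String[] args) {
    byte[] bytes = new WireSketch("Al", 150).toByteArray();
    StringBuilder hex = new StringBuilder();
    for (byte b : bytes) hex.append(String.format("%02X ", b));
    System.out.println(hex.toString().trim()); // 0A 02 41 6C 10 96 01
    System.out.println(parseFrom(bytes).name); // Al
  }
}
```

Seven bytes carry both fields, with no field names or delimiters in the output, which is where the size advantage over a text format such as XML comes from.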

2.7 writing a Protocol Buffer message

Now let's try using your protocol buffer class. The first thing you want the address book application to do is write personal details to the address book file. To do this, you need to create and populate instances of the protocol buffer class and then write them to the output stream.

The following program reads an AddressBook from a file, adds one new Person to it based on user input, and writes the new AddressBook back to the same file.

import com.example.tutorial.protos.AddressBook;
import com.example.tutorial.protos.Person;
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.IOException;
import java.io.PrintStream;

class AddPerson {
  // This function fills in a Person message based on user input.
  static Person PromptForAddress(BufferedReader stdin,
                                 PrintStream stdout) throws IOException {
    Person.Builder person = Person.newBuilder();

    stdout.print("Enter person ID: ");
    person.setId(Integer.valueOf(stdin.readLine()));

    stdout.print("Enter name: ");
    person.setName(stdin.readLine());

    stdout.print("Enter email address (blank for none): ");
    String email = stdin.readLine();
    if (email.length() > 0) {
      person.setEmail(email);
    }

    while (true) {
      stdout.print("Enter a phone number (or leave blank to finish): ");
      String number = stdin.readLine();
      if (number.length() == 0) {
        break;
      }

      Person.PhoneNumber.Builder phoneNumber =
        Person.PhoneNumber.newBuilder().setNumber(number);

      stdout.print("Is this a mobile, home, or work phone? ");
      String type = stdin.readLine();
      if (type.equals("mobile")) {
        phoneNumber.setType(Person.PhoneType.MOBILE);
      } else if (type.equals("home")) {
        phoneNumber.setType(Person.PhoneType.HOME);
      } else if (type.equals("work")) {
        phoneNumber.setType(Person.PhoneType.WORK);
      } else {
        stdout.println("Unknown phone type.  Using default.");
      }

      person.addPhones(phoneNumber);
    }

    return person.build();
  }

  // Main function:  Reads the entire address book from a file,
  //   adds one person based on user input, then writes it back out to the same
  //   file.
  public static void main(String[] args) throws Exception {
    if (args.length != 1) {
      System.err.println("Usage:  AddPerson ADDRESS_BOOK_FILE");
      System.exit(-1);
    }

    AddressBook.Builder addressBook = AddressBook.newBuilder();

    // Read the existing address book.
    try (FileInputStream input = new FileInputStream(args[0])) {
      addressBook.mergeFrom(input);
    } catch (FileNotFoundException e) {
      System.out.println(args[0] + ": File not found.  Creating a new file.");
    }

    // Add an address.
    addressBook.addPeople(
      PromptForAddress(new BufferedReader(new InputStreamReader(System.in)),
                       System.out));

    // Write the new address book back to disk.
    // try-with-resources closes the stream even if writeTo() throws.
    try (FileOutputStream output = new FileOutputStream(args[0])) {
      addressBook.build().writeTo(output);
    }
  }
}

2.8 reading a Protocol Buffer message

This example reads the file created by the program above and prints all the information in it.

import com.example.tutorial.protos.AddressBook;
import com.example.tutorial.protos.Person;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.PrintStream;

class ListPeople {
  // Iterates through all people in the AddressBook and prints info about them.
  static void Print(AddressBook addressBook) {
    for (Person person: addressBook.getPeopleList()) {
      System.out.println("Person ID: " + person.getId());
      System.out.println("  Name: " + person.getName());
      if (person.hasEmail()) {
        System.out.println("  E-mail address: " + person.getEmail());
      }

      for (Person.PhoneNumber phoneNumber : person.getPhonesList()) {
        switch (phoneNumber.getType()) {
          case MOBILE:
            System.out.print("  Mobile phone #: ");
            break;
          case HOME:
            System.out.print("  Home phone #: ");
            break;
          case WORK:
            System.out.print("  Work phone #: ");
            break;
        }
        System.out.println(phoneNumber.getNumber());
      }
    }
  }

  // Main function:  Reads the entire address book from a file and prints all
  //   the information inside.
  public static void main(String[] args) throws Exception {
    if (args.length != 1) {
      System.err.println("Usage:  ListPeople ADDRESS_BOOK_FILE");
      System.exit(-1);
    }

    // Read the existing address book.
    AddressBook addressBook =
      AddressBook.parseFrom(new FileInputStream(args[0]));

    Print(addressBook);
  }
}
