ZeroC Labs: Protocol Buffers
Google Protocol Buffers provide a serialization mechanism for simple structured data, but offer only rudimentary RPC support.

ZeroC has added native support for Protocol Buffers to Ice, which allows you to use Ice as a transport for Protocol Buffers.

Overview

A tool kit that has a definition language with an intuitive syntax, an efficient encoding, and an easy-to-use API. Sound familiar? Google's Protocol Buffers and ZeroC's Ice have different purposes, but share a similar design philosophy.

Protocol Buffers provide a data interchange format suitable for persistent storage and network communication, similar to the marshaling code that is generated for user-defined Slice types. (Protocol Buffers are not as fully-featured as Slice—for example, there is no support for inheritance, slicing, or marshaling graphs of objects.)

ZeroC created this Labs project to demonstrate that Ice can accommodate alternate data formats, and to provide users of Protocol Buffers with an opportunity to learn about Ice and discover what it has to offer. In addition, Protocol Buffers provide no functionality for sending messages (other than rudimentary interface definitions), so Ice can be used as a fully-featured transport for Protocol Buffers.

Defining a Message

Protocol Buffers use a simple definition language to specify the fields of a message. For example, consider the following message type:

// proto
package tutorial;

message Person {
    required int32 id = 1;
    required string name = 2;
    optional string email = 3;
}

The package statement does what you would expect: it provides a namespace that encapsulates one or more message definitions. The Person message defines two mandatory fields and one optional field. (The Protocol Buffers implementation uses the required and optional modifiers to verify that messages are properly initialized.) The integer value assigned to each field is an identifier tag that Protocol Buffers embed in the binary encoding.
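
To make the role of the identifier tags concrete, here is a minimal sketch that encodes a Person by hand, following the published Protocol Buffers wire format: each field is preceded by a key computed as (tag << 3) | wire_type, integers are written as base-128 varints, and strings are written as a varint length followed by UTF-8 bytes. The helper names encode_varint, encode_key, and encode_person are hypothetical and exist only for this illustration; a real application would use the code generated by the Protocol Buffers compiler. The sketch assumes non-negative id values.

```python
# Python
# Illustrative only: a hand-written encoder for the Person message.
# All function names here are hypothetical; generated code does this for you.

WIRE_VARINT = 0            # wire type used for int32
WIRE_LENGTH_DELIMITED = 2  # wire type used for string


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_key(tag, wire_type):
    """Combine a field's identifier tag and wire type into its key."""
    return encode_varint((tag << 3) | wire_type)


def encode_string_field(tag, text):
    raw = text.encode("utf-8")
    return encode_key(tag, WIRE_LENGTH_DELIMITED) + encode_varint(len(raw)) + raw


def encode_person(id, name, email=None):
    """Encode the fields of Person using tags 1 (id), 2 (name), 3 (email)."""
    data = encode_key(1, WIRE_VARINT) + encode_varint(id)
    data += encode_string_field(2, name)
    if email is not None:  # an optional field is simply omitted when unset
        data += encode_string_field(3, email)
    return data


print(encode_person(3, "Fred Jones").hex())
# prints 0803120a46726564204a6f6e6573
```

The leading bytes 08 and 12 are the keys for tags 1 and 2. The field names id and name never appear in the encoding, which is why the tags, not the names, must remain stable once a message type is in use.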

An application uses Protocol Buffers by compiling the definitions with a compiler that produces code in a particular target language. (Protocol Buffers currently support C++, Java, and Python.) The generated code is combined with application code to produce the final program.

Defining an Interface

Slice is the Specification Language for Ice; it allows you to specify the signatures of remote operations, including the types of parameters and exceptions that operations can raise. (For a gentle introduction to Ice concepts and terminology, please review this page.) As with Protocol Buffers, Slice definitions must be compiled into a target language for use by an application.

Slice definitions drive Ice's marshaling engine to produce compact messages for transmission over a network connection and for storage in a local database. An Ice developer uses Slice to define all of the data types used as parameters of a remote operation. For example, we can use Slice to define a Person structure similar to the Person message above, along with an interface that allows us to send an instance of Person to another process:

// Slice
module Demo {
    struct Person {
        int id;
        string name;
        string email;
    };

    interface Transporter {
        void sendPerson(Person p);
    };
};

The Slice module keyword serves the same function as the package statement in a Protocol Buffers definition.

Explicit Marshaling

Since the purpose of this ZeroC Labs project is to integrate Protocol Buffers with Ice, let's explore how we can use the Protocol Buffers version of Person. Our first step is to change the Slice definition of Person:

// Slice
module Demo {
    sequence<byte> Person;

    interface Transporter {
        void sendPerson(Person p);
    };
};

Person is now defined as a byte sequence, which essentially means that Ice will treat values of type Person as opaque blobs. Consequently, any time the application wants to transmit a Person value, it must manually convert the object to and from the format that Ice expects. Fortunately, the default representation (or mapping) of a byte sequence in the target languages makes this process relatively straightforward. In C++, a byte sequence maps to std::vector<Ice::Byte>, in Java it maps to a native byte array, and in Python it maps to a string.

The example below shows how we can serialize and send a Protocol Buffers Person in C++:

// C++
tutorial::Person p;
p.set_id(3);
p.set_name("Fred Jones");
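// Size the buffer to the encoded length, then serialize into it.
std::vector<Ice::Byte> data(p.ByteSize());
if (!p.SerializeToArray(&data[0], static_cast<int>(data.size())))
    assert(false); // serialization failed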
transporter->sendPerson(data);

After instantiating a Person and initializing the required fields, the code serializes the value into a byte sequence and passes it as the parameter to sendPerson. A similar process is necessary on the receiving side to convert the value from a byte sequence back into a Person.
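
On the receiving side, a C++ servant would hand the byte sequence to the generated parsing code (for example, ParseFromArray). As a rough illustration of what that conversion involves, the sketch below decodes the bytes of a Person by hand according to the Protocol Buffers wire format. The names decode_varint and decode_person are hypothetical, the result is a plain dictionary rather than a generated message object, and only the two wire types that Person uses are handled.

```python
# Python
# Illustrative only: a hand-written decoder for the Person message.
# Real applications should rely on the generated parsing code instead.

FIELD_NAMES = {1: "id", 2: "name", 3: "email"}
REQUIRED = ("id", "name")


def decode_varint(data, pos):
    """Read a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


def decode_person(data):
    """Convert an opaque byte sequence back into the fields of a Person."""
    person = {}
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        tag, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint (int32 id)
            value, pos = decode_varint(data, pos)
        elif wire_type == 2:  # length-delimited (string name, email)
            length, pos = decode_varint(data, pos)
            if pos + length > len(data):
                raise ValueError("truncated field")
            value = data[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        if tag in FIELD_NAMES:  # unknown tags are skipped
            person[FIELD_NAMES[tag]] = value
    missing = [name for name in REQUIRED if name not in person]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return person


received = bytes.fromhex("0803120a46726564204a6f6e6573")
print(decode_person(received))
# prints {'id': 3, 'name': 'Fred Jones'}
```

The check for missing required fields mirrors, in simplified form, the initialization check that the required modifier enables in the real implementation.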

Metadata to the Rescue

C++ Example

Although Google's serialization API is fairly painless, we can make it more convenient by annotating our Slice definitions with metadata. Consider the new definition of Person below:

// Slice
module Demo {
    ["cpp:type:tutorial::Person"] sequence<byte> Person;

    interface Transporter {
        void sendPerson(Person p);
    };
};

In Slice, metadata annotations influence the code that a Slice compiler generates, but they do not change the client-server contract. In this example, the metadata instructs the Slice compiler for C++ to use the Protocol Buffers type tutorial::Person in place of the default representation, std::vector<Ice::Byte>. The application also needs to supply some boilerplate template code that works for any message type. With these changes, the application can now pass an instance of Person directly to the Ice API:

// C++
tutorial::Person p;
p.set_id(3);
p.set_name("Fred Jones");
transporter->sendPerson(p);
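
The point that metadata changes the generated API but not the client-server contract can be illustrated with a small conceptual sketch. Everything below is hypothetical and stands in for real components: StandInPerson mimics a generated message class (only its SerializeToString method matches the real Python API), RecordingTransport stands in for the Ice run time, and the two send functions represent the stubs generated without and with the metadata. This is not how Ice is implemented; it only shows that both mappings put the same bytes on the wire.

```python
# Python
# Conceptual sketch only: none of these classes are part of Ice or
# Protocol Buffers. They model the two language mappings side by side.


class StandInPerson:
    """Hypothetical stand-in for a generated Protocol Buffers message."""

    def __init__(self, wire_bytes):
        self._wire_bytes = wire_bytes

    def SerializeToString(self):
        return self._wire_bytes


class RecordingTransport:
    """Hypothetical transport that records what would be sent."""

    def __init__(self):
        self.sent = []

    def send(self, operation, payload):
        self.sent.append((operation, payload))


def send_person_default_mapping(transport, data):
    # Without metadata: the caller serializes first and passes raw bytes.
    transport.send("sendPerson", bytes(data))


def send_person_with_metadata(transport, person):
    # With metadata: the stub accepts the message and serializes it itself.
    transport.send("sendPerson", person.SerializeToString())


encoded = bytes.fromhex("0803120a46726564204a6f6e6573")  # id=3, name="Fred Jones"
person = StandInPerson(encoded)

by_bytes = RecordingTransport()
send_person_default_mapping(by_bytes, person.SerializeToString())

by_object = RecordingTransport()
send_person_with_metadata(by_object, person)

print(by_bytes.sent == by_object.sent)
# prints True
```

Because the payload is identical in both cases, a client built with the metadata can talk to a server built without it, and vice versa.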

Java Example

There is even less work for the developer to do in Java:

// Slice
module Demo {
    ["java:protobuf:tutorial.PersonPB.Person"] sequence<byte> Person;

    interface Transporter {
        void sendPerson(Person p);
    };
};

The type name tutorial.PersonPB.Person corresponds to the Java class that the Protocol Buffers compiler generates for the following definition:

// proto
package tutorial;

option java_outer_classname = "PersonPB";

message Person {
    required int32 id = 1;
    required string name = 2;
    optional string email = 3;
}

A Java application can send an instance of Person as shown below:

// Java
tutorial.PersonPB.Person p = tutorial.PersonPB.Person.newBuilder().
    setId(3).setName("Fred Jones").build();
transporter.sendPerson(p);

Python Example

As in Java, the Python metadata specifies the name of the class generated by the Protocol Buffers compiler for our Person message type:

// Slice
module Demo {
    ["python:protobuf:Person_pb2.Person"] sequence<byte> Person;

    interface Transporter {
        void sendPerson(Person p);
    };
};

To send an instance of Person, we simply initialize it and hand it over to Ice for serialization:

# Python
import Person_pb2
p = Person_pb2.Person()
p.id = 1
p.name = "Fred Jones"
transporter.sendPerson(p)

Known Limitations

The C++ code generated by the Protocol Buffers compiler does not include definitions for operator== or operator!=. As a result, a message type cannot be used as a data member of a Slice structure, class, or exception, or as the key type of a Slice dictionary. This limitation only applies to C++ applications.

© 2014 ZeroC, Inc.