Protocol Design for Mobile Internet IM

Introduction: if you want to build a mobile Internet IM app yourself, what do you have to do? The first issue to address is the design of the IM protocol. This article discusses how to design a private TCP protocol from scratch.

Although a variety of push-messaging SDKs such as XinGe (信鸽) already exist on the market, they may not fully meet your needs for various reasons, or you may simply want to implement IM or push yourself. So which issues do you need to address? The first is how to implement the IM protocol.

Transport protocol selection

Transport protocols generally means TCP and UDP. UDP is connectionless and message-oriented, and its main strength is efficiency: it is fast and uses few resources, but delivery is unreliable. It simply sends and does not care whether the other side receives the data, although reliability can be added by other means. TCP is connection-oriented and stream-oriented, and its main strength is reliability. Reliability is exactly what IM needs most, so today's mainstream IM products are mostly built on TCP.

As for PC QQ still using UDP, my personal understanding is that this is for historical reasons. My guess is that the C10K problem had not been solved well back then: TCP is connection-oriented, epoll did not exist yet, and servers could not cope well with the load of very many users online at the same time. UDP is connectionless and has no per-connection load, so UDP was the only option. But because UDP is unreliable, TCP-style mechanisms such as timeouts, retransmission, and acknowledgements had to be implemented on top of it.

Protocol format selection

Common TCP-based protocol formats usually fall into three kinds: text protocols, binary protocols, and XML protocols.

Text protocol

A text protocol usually consists of a string of ASCII characters. Text protocols are easy for humans to read and are widely applicable; HTTP is the typical example. Here is an example of an HTTP GET request:

User-Agent: curl
Accept: */*

Characteristics of text protocols: a. good readability, easy to develop and debug; b. good extensibility, since key-value pairs are easy to extend; c. fairly good parsing efficiency; d. relatively low traffic.

MSN, the once-dominant IM product, used a text protocol.

XML protocol

XMPP, one of the mainstream IM protocols, is an open, XML-based real-time communication protocol. Here is an example of a message sent over XMPP:

<message from="sendinguser@somedomain" to="recipient@somedomain" xml:lang='en'>
    <body>Body of message</body>
</message>

Characteristics of XML protocols: a. they inherit XML's merits, namely good readability and good extensibility; b. parsing is expensive and inefficient and consumes a lot of resources; c. traffic is high.

GTalk, the IM product from Google, uses the XMPP protocol.

Binary protocol

A binary protocol is a stream of bytes, generally made up of a fixed-length header and an extensible, variable-length body; MQTT is the typical example. Here is an example of a binary protocol:
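
As one concrete binary layout, the sketch below builds the fixed header that MQTT places at the start of every packet: one byte holding the packet type and flags, followed by a variable-length "remaining length" field. The function names are hypothetical and chosen only for illustration; the byte layout follows the MQTT 3.1.1 specification.

```python
def encode_remaining_length(n: int) -> bytes:
    """Encode n with MQTT's variable-length scheme (7 data bits per byte)."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        byte = n % 128
        n //= 128
        if n > 0:
            byte |= 0x80  # continuation bit: more length bytes follow
        out.append(byte)
        if n == 0:
            return bytes(out)


def fixed_header(packet_type: int, flags: int, remaining_length: int) -> bytes:
    """Packet type goes in the high nibble, flags in the low nibble."""
    first = ((packet_type & 0x0F) << 4) | (flags & 0x0F)
    return bytes([first]) + encode_remaining_length(remaining_length)


PINGREQ = 12  # MQTT control packet type for a keep-alive ping
ping = fixed_header(PINGREQ, 0, 0)  # a complete two-byte packet with no body
```

A whole keep-alive packet fits in two bytes here, which illustrates why binary formats suit traffic-sensitive mobile clients.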

Characteristics of binary protocols: a. poor readability, hard to debug; b. poor extensibility; c. high parsing efficiency, with almost no parsing cost; d. minimal traffic usage.

QQ and WeChat are typical examples of binary protocols, and most IM products on the market today are binary as well. Although a binary protocol is poorly readable and hard to debug, this also raises the bar for reverse-engineering the protocol. So for mobile Internet IM, which is sensitive to traffic and power consumption, a binary protocol is the best fit.

Comparison of Mainstream Protocols

Having compared the protocol formats, we now compare specific protocol standards. The mainstream IM protocols on the market today are XMPP, used mainly on the PC Internet, and MQTT, used on IoT and embedded devices. The table below compares their advantages and disadvantages, together with those of a private protocol.

| Name | Advantages | Disadvantages |
| :--- | :--- | :--- |
| XMPP | Based on XML, easy to understand, widely used, and easy to extend | Traffic is high, and parsing on mobile devices also consumes power. The interaction flow is complicated. Mostly used by PC-era products; not suitable for mobile Internet IM |
| MQTT | Low bandwidth, suitable for push, adaptable to multiple platforms | The protocol is simple, but you have to build features such as friends and groups yourself |
| Private protocol | Flexible, low bandwidth, fully under your own control | You have to consider extensibility, compatibility, serialization and deserialization, security, and so on |

Private protocol design

Application-layer protocols built on TCP are generally split into a packet header and a packet body (as in HTTP), and IM protocols are no exception. The common practice is therefore a fixed-length binary header plus an extensible, variable-length body; the body can use a format that extends well, such as Protobuf, MessagePack, JSON, or XML. The header is responsible for transmission and parsing efficiency; it is the part common to all packets and is unrelated to the business logic. The body provides extensibility and is business-related. A typical binary protocol looks like this:

| Field | length | message_id | version | type | data |
| :-: | :-: | :-: | :-: | :-: | :-: |
| Type | int | int | byte | int | byte[] |
| Number of bytes | 4 | 4 | 1 | 4 | n |

1. length: the length of the packet, telling the receiver how many bytes of packet data to read.

2. message_id: the message ID. Because networks are unreliable, a message between client and server may not arrive, so it has to be retransmitted. The message's unique identifier lets the receiver discard the duplicates that retransmission can produce.

3. version: the message version number. Because the binary format does not extend well, adding a field makes the layout incompatible with the old protocol, so a version field is usually included to tell versions apart.

4. type: the message type, used to distinguish packets with different functions, such as key exchange messages, heartbeat messages, business messages, error messages, and push messages.

5. data: the packet body, whose length varies from service to service.
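
The field layout above can be sketched with Python's `struct` module. This is a minimal illustration under stated assumptions, not a definitive implementation: the article does not specify byte order or signedness, so big-endian unsigned fields are assumed, `length` is assumed to cover header plus body, and the names `pack_message`, `unpack_message`, and `TYPE_HEARTBEAT` are hypothetical.

```python
import struct

# length (4) + message_id (4) + version (1) + type (4) = 13 header bytes.
# ">" means big-endian with no padding (an assumption, see above).
HEADER = struct.Struct(">IIBI")

TYPE_HEARTBEAT = 1  # hypothetical message type value


def pack_message(message_id: int, version: int, msg_type: int, data: bytes) -> bytes:
    """Serialize one packet: fixed header followed by the variable-length body."""
    length = HEADER.size + len(data)  # assumed: total packet length
    return HEADER.pack(length, message_id, version, msg_type) + data


def unpack_message(packet: bytes):
    """Parse one complete packet into (message_id, version, type, data)."""
    length, message_id, version, msg_type = HEADER.unpack_from(packet)
    if length != len(packet):
        raise ValueError("length field does not match packet size")
    return message_id, version, msg_type, packet[HEADER.size:]


packet = pack_message(42, 1, TYPE_HEARTBEAT, b"hello")
fields = unpack_message(packet)
```

Parsing the header is a single fixed-offset unpack, which is what "almost no parsing cost" means in practice; the body can then be handed to whichever serialization format the service uses.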

The sticky-packet problem

It is worth mentioning that because TCP transmits data as a stream, there is the so-called "sticky packet" problem: a single read does not necessarily return exactly one complete message packet. For example, suppose the server sends three message packets in sequence, as shown in the following figure.

However, the data read by the client is likely to be split into the following pieces.

This is the so-called "sticky packet" problem. There are generally two solutions:

1. The packet header contains a field giving the total length of the packet (or the length of the packet body). The length field in the example above uses this scheme.

2. A special delimiter is appended to the end of the packet, for example a carriage return and line feed after each message (as in the FTP protocol) or some other specific character. The receiver splits the stream into messages at the delimiter. The example above could be changed to the following format:

| Field | length | message_id | version | type | data_length | data | delimiter |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| Type | int | int | byte | int | int | byte[] | byte |
| Number of bytes | 4 | 4 | 1 | 4 | 4 | n | 1 |

Here delimiter can be fixed to a special character such as "@", and it should be as short as possible to reduce traffic. In addition, because the packet body may itself contain the delimiter character, occurrences inside the body have to be escaped to prevent parsing errors. For this reason the first option is generally recommended for solving the "sticky packet" problem.
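
The length-prefix option can be sketched as a small buffering decoder. This is an illustration under assumptions rather than a reference implementation: it assumes the 13-byte big-endian header from the earlier table with `length` covering the whole packet, and the names `FrameDecoder` and `frame` are hypothetical.

```python
import struct

HEADER = struct.Struct(">IIBI")  # length, message_id, version, type (assumed big-endian)


def frame(message_id: int, data: bytes) -> bytes:
    """Helper that builds one packet (version 1, type 0) for the demo."""
    return HEADER.pack(HEADER.size + len(data), message_id, 1, 0) + data


class FrameDecoder:
    """Accumulates bytes from the socket and cuts them into whole packets."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Append received bytes and return every complete packet now available."""
        self._buf += chunk
        packets = []
        while len(self._buf) >= 4:  # need the length field first
            (length,) = struct.unpack_from(">I", self._buf)
            if length < HEADER.size:
                raise ValueError("corrupt length field")
            if len(self._buf) < length:
                break  # packet incomplete: wait for the next read
            packets.append(bytes(self._buf[:length]))
            del self._buf[:length]
        return packets


# Three packets arrive split at arbitrary points, as TCP is free to do.
stream = frame(1, b"a") + frame(2, b"bb") + frame(3, b"ccc")
decoder = FrameDecoder()
first = decoder.feed(stream[:10])
second = decoder.feed(stream[10:30])
third = decoder.feed(stream[30:])
```

No escaping is needed with this approach because the decoder never inspects the body bytes; it only counts them.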


Packet headers and tails that both carry packet separators: I have come across a number of past projects whose protocols used this kind of framing. The analysis of the "sticky packet" problem above shows that this practice only wastes traffic and brings no additional benefit.

Serialization selection

The packet body can use a format that extends well, such as Protobuf, MessagePack, JSON, or XML, but we recommend giving preference to Protobuf, which is the current choice of mainstream IM products. The choice of serialization scheme is discussed extensively online, so we won't go into it here. Advantages of Protobuf:

  1. A standard IDL and IDL compiler, which makes it very engineer-friendly.
  2. Serialized data is very compact: roughly 1/10 the size of JSON, 1/20 the size of XML, and 1/10 the size of binary serialization.
  3. Very fast parsing, roughly 20 to 100 times faster than the corresponding XML.
  4. Very easy-to-use libraries are provided; usage is concise, and deserialization takes only one line of code.

Scenarios where Protobuf is a good fit:

  1. Messages that are exchanged with other systems and are sensitive to message size; Protobuf saves a lot of space compared with, for example, XML and JSON.
  2. Small payloads. It is not a good fit for big-data workloads.
  3. Projects written in C++, Java, or Python, because they can use Google's native libraries, whose serialization and deserialization are very efficient.

So Protobuf parses quickly and produces relatively small serialized data, making it ideal for mobile Internet IM scenarios.
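
Part of the reason Protobuf output is compact is its base-128 "varint" integer encoding, sketched below by hand. This is only an illustration of the wire format for a single integer field; real projects would use the generated Protobuf classes, and the function names here are hypothetical.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, least significant group first."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0):
    """Return (value, next_position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_int_field(field_number: int, value: int) -> bytes:
    """One varint field: key = (field_number << 3) | wire type 0, then the value."""
    return encode_varint(field_number << 3) + encode_varint(value)


encoded = encode_int_field(1, 150)  # field 1 set to 150
```

The whole field takes three bytes, whereas a JSON rendering such as `{"id":150}` takes ten characters before any other fields are added.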

Security considerations

IM transmits sensitive information directly over the network, so a security layer is essential. Generally only the packet body needs to be encrypted, while the packet header can stay in plaintext. The security of a TCP-based protocol can be approached mainly in the following ways.

Using SSL

As with HTTPS, using SSL gives a high level of security. The difference is that HTTPS relies on dedicated authorities to verify the legitimacy of the certificate, whereas IM cannot do so; instead the certificate can be bundled into the client, and certificate updates can ship with client upgrades or be delivered through the protocol. In the encrypted exchange, the client generates a symmetric key, encrypts it with the certificate, and sends it to the server; the server decrypts it to obtain the symmetric key, and all subsequent communication is encrypted and decrypted with that key. For the detailed principles, refer to material on SSL; I won't go into them here. However, certificates cost somewhat more and are somewhat more complex to manage.
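
Bundling the certificate into the client can be sketched with Python's standard `ssl` module. This is a minimal sketch under assumptions: the function name and the certificate path are hypothetical, and a real client would also need a policy for rotating the bundled certificate.

```python
import ssl
from typing import Optional


def make_client_context(bundled_cert_path: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS client context that trusts only the certificate shipped with the app."""
    # PROTOCOL_TLS_CLIENT enables certificate verification and hostname
    # checking by default and loads no system CA certificates.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if bundled_cert_path is not None:
        # Hypothetical path to the PEM file packaged inside the client.
        ctx.load_verify_locations(cafile=bundled_cert_path)
    return ctx


ctx = make_client_context()  # no certificate loaded in this demo
```

A client would then wrap its TCP socket with `ctx.wrap_socket(sock, server_hostname=...)` before sending any IM packets, so the handshake fails unless the server presents a certificate that chains to the bundled one.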

Implementing encryption and decryption yourself

If you implement encryption and decryption yourself, the focus is on key generation and management. There are two main approaches to key management.

1) Fixed Key

The server and client agree in advance on a key and on a symmetric encryption algorithm such as AES. Before sending each message, the client encrypts it with the agreed algorithm and key; after receiving it, the server decrypts it with the same algorithm and key. The advantage of this approach is that it is relatively simple to implement. The disadvantage is just as obvious: the agreed key and algorithm live in the client, where they can be recovered by decompiling it. The scheme is therefore better suited to scenarios with low encryption requirements.

2) Dynamic Key

Because fixed keys are easily exposed, the idea of dynamic keys is to add another layer of protection on top. Similar to the SSL key negotiation process, the central idea is that the client and server negotiate using asymmetric RSA encryption (which makes cracking harder), so that the client ends up with a key for the current session; subsequent data is then encrypted and decrypted symmetrically with AES using this key. The process is more complex, as shown in the following diagram.

Public key request:

1. The client initiates a request carrying the account.

2. The server generates an RSA public/private key pair for that account.

3. The server issues the public key and retains the private key.

4. The server returns the RSA public key to the client, and the client saves it.

Login authentication:

1. The client uses the RSA public key to encrypt the account and the password equivalent (the account and password encoded according to certain rules), and sends a request carrying the result.

2. The server decrypts with the RSA private key to obtain the account and password.

3. The server verifies that the account and password are correct.

4. The server assigns the client a key for the current session, session_key.

5. The server returns session_key encrypted with AES, using the account/password equivalent as the AES key.

Subsequent requests are encrypted using session_key as the key.

Non-login requests:

1. The client encrypts the request with AES, using session_key as the key, and sends it.

2. The server decrypts the request with AES using session_key.

3. The server processes the business logic for the request.

4. The server encrypts the result with AES, using session_key as the key, and returns it to the client.

In closing, this article has covered the main issues in designing a mobile Internet IM protocol, namely the transport protocol, the protocol format, the protocol design, serialization, and security, along with corresponding solutions. These are the author's summaries and reflections from past projects. With WeChat and QQ dominating mobile Internet IM, writing this article feels a bit like showing off modest skills in front of masters; if there are deficiencies or errors, IM experts are welcome to point them out :) It is worth mentioning that these reflections apply equally to other TCP long-connection scenarios, such as the Internet of Things and mobile games.
