Background

I have recently been studying Zhihu's live-stream barrage (bullet comments), because I want to collect some of that data and build a live-stream bot.

Analysis

Capturing the requests is fairly simple and all routine, so I won't go into it.

But the data looks strange.

For demonstration purposes we use the REST interface, which returns essentially the same data as the WebSocket interface.

Let's take live room 11529 as an example.

The barrage interface is: https://www.zhihu.com/api/v4/drama/theaters/11529/recent-messages

[Screenshot: data returned by the live barrage interface]

The barrage data is evidently in the messages field, but it appears to be encrypted.

Searching the JS

First, search for recent-messages globally to find the js file you want.

After downloading the js file and formatting it locally, search for recent-messages


Then search for LOAD_RECENT_MESSAGES.

This reveals the first step in parsing a message: base64 decoding.

See the explanation of the atob function in JS [1].


The decoded result is then passed to the function p.
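The same first step can be sketched in Python. This is only an illustration: the payload "CJYB" is a made-up three-byte example rather than captured Zhihu data, and to_bytes is a hypothetical helper name.

```python
import base64


def to_bytes(b64_text: str) -> bytes:
    """Rough Python counterpart of JS atob(): base64 text in, raw bytes out."""
    return base64.b64decode(b64_text)


# "CJYB" is an illustrative payload (an assumption), not real barrage data.
raw = to_bytes("CJYB")
print(raw.hex(" "))  # 08 96 01
```

Unlike atob, which returns a binary string that still has to be turned into a Uint8Array, b64decode returns a bytes object directly.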

Next, search for the definition of p (remember to enable whole-word and case-sensitive matching, otherwise there are far too many results…).

Luckily, the first hit in the list is the right one.

To verify, you can override Zhihu's JS with a local copy

and just add two console.log lines.

The code is as follows…

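function p(e) {
    console.log("before:", e);  // input to p: the base64-decoded data
    var t = d.EventMessage.decode(e),
        n = t.eventCode,
        r = t.event;
    console.log("after:", t);   // the decoded message object
    // ... the rest of the original function is unchanged
}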


You can see that this is exactly the data we want.

So now we just need to figure out what EventMessage.decode does…

And I found the code.

Debugging step by step, it looked like some kind of encoding specification.

Was it something Zhihu defined itself?

I spent a week on this and still hadn't figured it out.

Here is roughly what confused me.

Take the Uint8Array shown above.

What does the first byte, the one the >>> operation is applied to, mean?

The bytes after it hold the actual value, but how many bytes exactly, and how are they delimited?

In particular, the following three fields are all int64, yet they take different numbers of bytes:

timestampMs is 6 bytes

theaterId is 2 bytes

dramaId is 9 bytes

I took out a little notebook and wrote things down as I debugged… event is a dictionary containing 40 keys.

My head exploded when I saw the code…


So I thought: forget it, I don't need to know how it works. I'll just extract this piece of JS and build a Node.js service around it.

Extracting the JS

The extraction was simple, except for the s in this line:

instanceof s || (e = s.create(e));


I couldn't figure out where s came from,

so I had to Google it.

What a surprise! It turns out to be protobuf [2].

So this so-called encryption is just a general-purpose protocol…

At this point the problem becomes simple.

Protocol Buffers

The official definition is as follows:

Protocol Buffers is a language-independent, platform-independent, extensible mechanism for serializing structured data.

For more information about Protocol Buffers, see the Protocol Buffers website [3].

The following is taken from "Protobuf data stream parsing in Burpsuite" [4].

Varint encoding

Protobuf's binary format uses varint encoding. A varint is a compact way to represent an integer using one or more bytes: the smaller the number, the fewer bytes it takes.

The most significant bit of each byte in a varint has a special meaning: if it is 1, the next byte is also part of the number; if it is 0, this is the last byte. The other seven bits hold the number itself. Any number below 128 therefore fits in a single byte. Numbers of 128 or more need several bytes; 300, for example, takes two: 1010 1100 0000 0010.

The following diagram illustrates how protobuf parses these two bytes. Note that the two 7-bit groups are swapped before the final calculation, because a varint stores its least significant group first (little-endian order).
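The decoding rule just described can be sketched in a few lines of Python. This is a minimal illustration of the rule, not the code protobuf.js actually runs; decode_varint is a hypothetical name.

```python
from typing import Tuple


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        # The low 7 bits carry data; earlier bytes are less significant.
        value |= (byte & 0x7F) << shift
        # A cleared top bit marks the last byte of this number.
        if not byte & 0x80:
            return value, pos
        shift += 7


print(decode_varint(bytes([0b10101100, 0b00000010])))  # (300, 2)
```

Returning the next position alongside the value makes it easy to keep reading the following field from the same buffer.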

That solves the earlier puzzle of how to tell how many bytes a field occupies (in other words, we can now split the data into fields).
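This also explains why three int64 fields had different lengths: a varint's size depends only on the magnitude of the value. A small sketch, using the three large integers from the decoded output at the end of this article; which value belongs to which field name is my assumption based on how the values look, and varint_size is a hypothetical helper.

```python
def varint_size(value: int) -> int:
    """Number of bytes a non-negative integer occupies as a varint."""
    if value < 0:
        raise ValueError("only non-negative integers are handled here")
    # Each byte carries 7 payload bits; zero still needs one byte.
    return max(1, (value.bit_length() + 6) // 7)


# Name-to-value pairing is assumed, not confirmed by Zhihu's JS.
samples = [
    ("timestampMs", 1585413048873),
    ("theaterId", 11529),
    ("dramaId", 1227323967020912640),
]
for name, value in samples:
    print(name, varint_size(value))
```

Under that pairing the sizes come out as 6, 2 and 9 bytes, matching what I saw while debugging.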

Numeric types

A protobuf message is serialized as a binary data stream made up of a series of key-value pairs. The key identifies a specific field: when unpacking, protobuf uses the key to determine which field of the message the value belongs to.

The key is defined as follows:

(field_number << 3) | wire_type

A key consists of two parts. The first is the field_number, for example 1 for the name field of message tutorial.Person. The second is the wire_type, which indicates how the value is encoded on the wire. The possible wire types are:

Type  Meaning           Used For
0     Varint            int32, int64, uint32, uint64, sint32, sint64, bool, enum
1     64-bit            fixed64, sfixed64, double
2     Length-delimited  string, bytes, embedded messages, packed repeated fields
3     Start group       Groups (deprecated)
4     End group         Groups (deprecated)
5     32-bit            fixed32, sfixed32, float
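Splitting a key back into its two parts is just a shift and a mask. A minimal sketch follows; split_key and the WIRE_TYPES dictionary are hypothetical names, and the key values are single bytes taken from the examples in this article.

```python
from typing import Tuple

WIRE_TYPES = {
    0: "Varint",
    1: "64-bit",
    2: "Length-delimited",
    3: "Start group",
    4: "End group",
    5: "32-bit",
}


def split_key(key: int) -> Tuple[int, int]:
    """Invert (field_number << 3) | wire_type."""
    return key >> 3, key & 0x07


for key in (0x08, 0x0A, 0x22):
    field_number, wire_type = split_key(key)
    print(f"{key:#04x}: field {field_number}, {WIRE_TYPES[wire_type]}")
```

Note that the key is itself a varint, so for field numbers of 16 and above it spans more than one byte and has to be varint-decoded before it is split.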

Take the data stream 08 96 01 as an example and work out the key and the value.

08 = 0000 1000b
  => 000 1000b (drop the most significant bit)
  => field_num = 0001b (middle 4 bits), type = 000b (last 3 bits)
  => field_num = 1, type = 0 (Varint)

96 01 = 1001 0110 0000 0001b
  => 001 0110  000 0001b (drop the most significant bit of each byte)
  => 000 0001  001 0110b = 1001 0110b (swap the two 7-bit groups)
  => 128 + 16 + 4 + 2 = 150

The final structured data is:

1: 150

Here 1 is the field_num and 150 is the value.
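Going the other way is a useful check on the analysis. The sketch below encodes field 1 with the value 150 and should reproduce the same three bytes; both function names are hypothetical, and only wire type 0 is handled.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a varint."""
    if value < 0:
        raise ValueError("only non-negative integers are handled here")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)         # last byte, top bit cleared
            return bytes(out)


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Encode one key-value pair whose wire type is 0 (Varint)."""
    key = (field_number << 3) | 0
    return encode_varint(key) + encode_varint(value)


print(encode_varint_field(1, 150).hex(" "))  # 08 96 01
```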

Manual deserialization


The serialized binary data stream from the example above is now deserialized by hand:

0A = 0000 1010b => field_num=1, type=2;
2E = 0010 1110b => value=46;
0A = 0000 1010b => field_num=1, type=2;
07 = 0000 0111b => value=7;

Read 7 characters “Vincent”;

10 = 0001 0000b => field_num=2, type=0;
09 = 0000 1001b => value=9;
1A = 0001 1010b => field_num=3, type=2;
10 = 0001 0000b => value=16;

Read 16 (hex 10) characters “[email protected]”;

22 = 0010 0010b => field_num=4, type=2;
0F = 0000 1111b => value=15;
0A = 0000 1010b => field_num=1, type=2;
0B = 0000 1011b => value=11;

Read 11 characters “15011111111”;

10 = 0001 0000b => field_num=2, type=0;
02 = 0000 0010b => value=2;

The final structured data is:

1 {
  1: "Vincent"
  2: 9
  3: "[email protected]"
  4 {
    1: "15011111111"
    2: 2
  }
}
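The manual walk above can be automated with a small schema-less parser. This is a rough sketch under a few assumptions: parse_fields and read_varint are hypothetical names, group wire types (3 and 4) are not supported, and the sample bytes are hand-built from the "Vincent" and phone-number fragments of the example rather than copied from a real capture.

```python
from typing import List, Tuple, Union


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one varint at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def parse_fields(buf: bytes) -> List[Tuple[int, int, Union[int, bytes]]]:
    """Return a flat list of (field_number, wire_type, value) tuples."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited: length first, then payload
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 1:  # fixed 64-bit, kept as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 5:  # fixed 32-bit, kept as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(buf):
            raise ValueError("truncated field")
        fields.append((field_number, wire_type, value))
    return fields


# Hand-built sample: field 1 = "Vincent", field 2 = 9.
sample = bytes.fromhex("0a07") + b"Vincent" + bytes.fromhex("1009")
print(parse_fields(sample))  # [(1, 2, b'Vincent'), (2, 0, 9)]
```

Without a schema there is no way to tell whether a length-delimited value is a string or an embedded message; to descend into one, call parse_fields again on its payload and see whether it parses cleanly.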

Deserializing with protoc

protoc's --decode_raw option can decode a raw data stream. I wrote a Python script that uses it:

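import subprocess
import sys


def decode(data):
    """Pipe raw protobuf bytes through `protoc --decode_raw` and return its stdout."""
    process = subprocess.Popen(
        ["protoc", "--decode_raw"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )

    output = error = None
    try:
        # Send the bytes on stdin and collect stdout/stderr.
        output, error = process.communicate(data)
    except OSError:
        pass
    finally:
        # Make sure the child process has exited.
        if process.poll() != 0:
            process.wait()
    return output


# The first command-line argument is a file containing raw protobuf bytes.
with open(sys.argv[1], "rb") as f:
    data = f.read()
    print(decode(data))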


Back to Zhihu live

So let's test it by parsing one message first.

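import base64
import subprocess


def decode(data):
    """Pipe raw protobuf bytes through `protoc --decode_raw` and return its stdout."""
    process = subprocess.Popen(
        ["protoc", "--decode_raw"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )

    output = error = None
    try:
        # Send the bytes on stdin and collect stdout/stderr.
        output, error = process.communicate(data)
    except OSError:
        pass
    finally:
        # Make sure the child process has exited.
        if process.poll() != 0:
            process.wait()
    return output


# One base64-encoded message from the recent-messages response.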

a1 = "CAESpgMKowMKhgMKIDQwYjQ3Y2NiZmM0NDc1YjAxOGE1YTQxN2UxY2Y5ODk3EhLlsI/pgI/mmI7niLHlvrfljY4aDGR1LXlhby0xMy04NiJBaHR0cHM6Ly9 waWM0LnpoaW1nLmNvbS92Mi1kMDYxNjFiMWQzOWNkNjRlYmRhNDBmOWMwNjVhNmNhNV94cy5qcGcqswEIiVoSCemFkueqneeqnRgBIAAoJTABOpoBCAEQABp AaHR0cHM6Ly9waWMxLnpoaW1nLmNvbS92Mi0xZDEyNTg1YzdhOTY2MTNkM2JlZjQxMTcyY2Q4ZWYxNV9yLnBuZyJAaHR0cHM6Ly9waWM0LnpoaW1nLmNvbS9 2Mi1iMDM1ZWRkNTA3NjgwNzU3MmJkNGU3YTg5MjRjZTEzYl9yLnBuZyoHIzcyQkJGRioHIzAwODRGRjJHCAkQjGAaQGh0dHBzOi8vcGljNC56aGltZy5jb20 vdjItODI1NTRlYzgzYmViMzJlOWVjNDQxNGY0YzYyMmFjMmNfci5wbmcQARoM5oiR5Lmf6KeJ5b6XIICA8cTaz6SEERiplO2Pki4giVoogKDrtNOUlYQRMhU xLTEyMjczOTE5NjY4NTU4Mzk3NDQ4AQ=="



# b64decode's default (non-validating) mode discards any whitespace in the string.
message = base64.b64decode(a1)
print(decode(message))


The results are as follows

Out[7]: b'1: 1\n2 {\n  1 {\n    1 {\n      1: "40b47ccbfc4475b018a5a417e1cf9897"\n      2: "\\345\\260\\217\\351\\200\\217\\346\\230\\216\\347\\210\\261\\345\\276\\267\\345\\215\\216"\n      3: "du-yao-13-86"\n      4: "https://pic4.zhimg.com/v2-d06161b1d39cd64ebda40f9c065a6ca5_xs.jpg"\n      5 {\n        1: 11529\n        2: "\\351\\205\\222\\347\\252\\235\\347\\252\\235"\n        3: 1\n        4: 0\n        5: 37\n        6: 1\n        7 {\n          1: 1\n          2: 0\n          3: "https://pic1.zhimg.com/v2-1d12585c7a96613d3bef41172cd8ef15_r.png"\n          4: "https://pic4.zhimg.com/v2-b035edd5076807572bd4e7a8924ce13b_r.png"\n          5: "#72BBFF"\n          5: "#0084FF"\n        }\n      }\n      6 {\n        1: 9\n        2: 12300\n        3: "https://pic4.zhimg.com/v2-82554ec83beb32e9ec4414f4c622ac2c_r.png"\n      }\n    }\n    2: 1\n    3: "\\346\\210\\221\\344\\271\\237\\350\\247\\211\\345\\276\\227"\n    4: 1227391966855839744\n  }\n}\n3: 1585413048873\n4: 11529\n5: 1227323967020912640\n6: "1-1227391966855839744"\n7: 1\n'

You can see that the parsing succeeded, so the rest of the work is easy…

All that remains is to match each field position to its field name in Zhihu's JS…
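That mapping step might look something like the sketch below, which attaches names to the top-level integer fields in protoc's --decode_raw text output. The field names come from my debugging notes earlier in this article, but pairing them with field numbers 3, 4 and 5 is a guess based on the values, not something confirmed against Zhihu's JS; name_top_level_fields is a hypothetical helper.

```python
import re

# Assumed number-to-name mapping; verify it against the field definitions in Zhihu's JS.
FIELD_NAMES = {3: "timestampMs", 4: "theaterId", 5: "dramaId"}


def name_top_level_fields(raw_text: str) -> dict:
    """Pick out un-indented `number: integer` lines and give them names."""
    named = {}
    for line in raw_text.splitlines():
        match = re.fullmatch(r"(\d+): (\d+)", line)
        if match is None:
            continue  # nested, string-valued, or brace-only line
        number, value = int(match.group(1)), int(match.group(2))
        named[FIELD_NAMES.get(number, f"field_{number}")] = value
    return named


# A shortened piece of the decoded output shown above.
sample = "1: 1\n2 {\n  1: 5\n}\n3: 1585413048873\n4: 11529\n5: 1227323967020912640\n"
print(name_top_level_fields(sample))
```

Nested messages are skipped here because their lines are indented; handling them would need a real parser for the text format or a proper .proto definition.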

Finally, here is a screenshot of it working.


References

[1] Explanation of the atob function in JS: https://developer.mozilla.org/zh-CN/docs/Web/API/WindowBase64/atob

[2] protobuf: https://github.com/protobufjs/protobuf.js

[3] Protocol Buffers website: https://developers.google.com/protocol-buffers

[4] Protobuf data stream parsing in Burpsuite: https://wooyun.js.org/drops/Burpsuite%E4%B8%ADprotobuf%E6%95%B0%E6%8D%AE%E6%B5%81%E7%9A%84%E8%A7%A3%E6%9E%90.html
