Preface

Bilibili has long shown a live viewer count on its videos. I wanted to know how it is done, so I put together this note.

1. Architecture selection

Common technical architectures

  1. Filter log records such as access_log by URL and split them by date.

    • disadvantages

      • Poor real-time performance
      • High I/O load
    • advantages

      • Easy to implement; it only requires reading text files
  2. WebSocket+Redis

    • disadvantages

      • Demands more server resources
      • Somewhat more complicated to implement
    • advantages

      • Good real-time performance
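
As a rough illustration of the first scheme, the sketch below counts requests per URL per day from access-log lines. It assumes the Common Log Format; the `count_views` helper, the `/video/` prefix, and the sample lines are made up for the example and would need adjusting to a real server's log format.

```python
import re
from collections import Counter

# Assumption: Common Log Format, e.g.
# 10.0.0.1 - - [12/Mar/2020:10:00:01 +0800] "GET /video/42 HTTP/1.1" 200 512
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):[^\]]*\] "(?:GET|POST) (\S+)'
)


def count_views(lines, url_prefix):
    """Count requests per (day, url) for URLs starting with url_prefix."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        _ip, day, url = match.groups()
        if url.startswith(url_prefix):
            counts[(day, url)] += 1
    return counts


# Hypothetical sample lines, not taken from a real log.
sample = [
    '10.0.0.1 - - [12/Mar/2020:10:00:01 +0800] "GET /video/42 HTTP/1.1" 200 512',
    '10.0.0.2 - - [12/Mar/2020:10:00:05 +0800] "GET /video/42 HTTP/1.1" 200 512',
    '10.0.0.1 - - [13/Mar/2020:09:30:00 +0800] "GET /video/42 HTTP/1.1" 200 512',
    '10.0.0.3 - - [13/Mar/2020:09:31:00 +0800] "GET /about HTTP/1.1" 200 128',
]
views = count_views(sample, "/video/")
for key, hits in sorted(views.items()):
    print(key, hits)
```

Every count requires re-reading log text, which is where the high I/O cost and the poor real-time behaviour listed above come from.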

The scheme adopted by Bilibili (guess)

  1. Inspecting the network traffic shows that Bilibili uses the second scheme, WebSocket.

  2. Bilibili connects to the wss://broadcast.chat.bilibili.com:7826/sub service.

  3. The first message sent after connecting, [data1]:

    0x00, 0x00, 0x00, 0x5B, 0x00, 0x12, 0x00, 0x01, 0x00, 0x00, 0x00, 0x07,
    0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x7B, 0x22, 0x72, 0x6F, 0x6F, 0x6D,
    0x5F, 0x69, 0x64, 0x22, 0x3A, 0x22, 0x76, 0x69, 0x64, 0x65, 0x6F, 0x3A,
    0x2F, 0x2F, 0x35, 0x30, 0x33, 0x33, 0x34, 0x35, 0x38, 0x30, 0x2F, 0x38,
    0x38, 0x31, 0x32, 0x30, 0x37, 0x39, 0x32, 0x22, 0x2C, 0x22, 0x70, 0x6C,
    0x61, 0x74, 0x66, 0x6F, 0x72, 0x6D, 0x22, 0x3A, 0x22, 0x77, 0x65, 0x62,
    0x22, 0x2C, 0x22, 0x61, 0x63, 0x63, 0x65, 0x70, 0x74, 0x73, 0x22, 0x3A,
    0x5B, 0x31, 0x30, 0x30, 0x30, 0x5D, 0x7D

    This looks like a hexadecimal byte array, so let's parse it:


    import java.nio.charset.StandardCharsets;
    import org.junit.Test;

    public class BroadcastFrameTest {

        /**
         * The leading 18 bytes
         *   0x00, 0x00, 0x00, 0x5B, 0x00, 0x12, 0x00, 0x01, 0x00, 0x00, 0x00, 0x07,
         *   0x00, 0x00, 0x00, 0x01, 0x00, 0x00
         * are presumably a protocol header; I could not work out what they mean.
         * The rest of the message decodes to
         *   {"room_id":"video://50334580/88120792","platform":"web","accepts":[1000]}
         */
        @Test
        public void test3() {
            byte[] bytes = {
                0x7B, 0x22, 0x72, 0x6F, 0x6F, 0x6D, 0x5F, 0x69, 0x64, 0x22, 0x3A, 0x22,
                0x76, 0x69, 0x64, 0x65, 0x6F, 0x3A, 0x2F, 0x2F, 0x35, 0x30, 0x33, 0x33,
                0x34, 0x35, 0x38, 0x30, 0x2F, 0x38, 0x38, 0x31, 0x32, 0x30, 0x37, 0x39,
                0x32, 0x22, 0x2C, 0x22, 0x70, 0x6C, 0x61, 0x74, 0x66, 0x6F, 0x72, 0x6D,
                0x22, 0x3A, 0x22, 0x77, 0x65, 0x62, 0x22, 0x2C, 0x22, 0x61, 0x63, 0x63,
                0x65, 0x70, 0x74, 0x73, 0x22, 0x3A, 0x5B, 0x31, 0x30, 0x30, 0x30, 0x5D,
                0x7D
            };
            // The payload bytes are ASCII JSON text, so decode them directly.
            System.out.println(new String(bytes, StandardCharsets.UTF_8));
        }
    }

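
The 18 leading bytes can at least be split into fixed-width big-endian integers. In this capture the first four bytes (0x5B = 91) equal the length of the whole message, and the next two (0x12 = 18) equal the length of the prefix itself. The sketch below unpacks the prefix under that reading; the field names (`total_len`, `header_len`, `version`, `operation`, `sequence`) and the `split_frame` helper are my guesses, not anything confirmed by Bilibili.

```python
import json
import struct

# Guessed layout: 4-byte length, 2-byte header length, 2-byte version,
# 4-byte operation, 4-byte sequence, 2 trailing bytes = 18 bytes in total.
HEADER = struct.Struct(">IHHIIH")


def split_frame(frame):
    """Split a captured message into guessed header fields and the body."""
    total_len, header_len, version, operation, sequence, _tail = HEADER.unpack_from(frame)
    fields = {
        "total_len": total_len,
        "header_len": header_len,
        "version": version,
        "operation": operation,
        "sequence": sequence,
    }
    return fields, frame[header_len:total_len]


# The captured [data1] message: 18 prefix bytes followed by the JSON body.
frame = bytes.fromhex("0000005b 0012 0001 00000007 00000001 0000") + (
    b'{"room_id":"video://50334580/88120792","platform":"web","accepts":[1000]}'
)
fields, body = split_frame(frame)
print(fields)
print(json.loads(body))
```

Only the two length fields are checked against the data here; the meaning of the remaining values is still a guess.
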
    1. The key field is room_id, whose video path has two parts. 50334580 should be the video's av number (BV numbers are used now, but each one still maps to an av number), and 88120792 should be the video's cid, the ID of its danmaku (bullet comment) data.

    2. Judging from the code, there should be only one connection address.

      aid/cid serves as the key. Opening a video page creates a WebSocket connection and sends [data1]; the server replies:

      b'\x00\x00\x00+\x00\x12\x00\x01\x00\x00\x00\x08\x00\x00\x00\x01\x00\x00{"code":0,"message":"ok"}'

    3. The client then sends something base64-encoded, perhaps user information; I did not understand this part of the code. The server then returns:

      b'\x00\x00\x00n\x00\x12\x00\x01\x00\x00\x00\x03\x00\x00\x00\t\x00\x00{"code":0,"message":"0","data":{"room":{"online":3, "room_id":"video://85919470/146861497"}}}'
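
The base64 string that the reference code at the end of this note sends can be decoded offline. The small sketch below decodes that string and reads the online count out of the reply shown above. It assumes every message starts with an 18-byte prefix whose first four bytes are the total length and whose next two bytes are the prefix length, which is what the captured messages suggest but is not confirmed.

```python
import base64
import json
import struct

# Second message sent by the reference code, as a base64 string.
payload = base64.b64decode("AAAAIQASAAEAAAACAAAACQAAW29iamVjdCBPYmplY3Rd")
total_len, header_len = struct.unpack_from(">IH", payload)
sent_body = payload[header_len:]
print(total_len, header_len, sent_body)

# Server reply copied from the capture above.
reply = (
    b'\x00\x00\x00n\x00\x12\x00\x01\x00\x00\x00\x03\x00\x00\x00\t\x00\x00'
    b'{"code":0,"message":"0","data":{"room":{"online":3, '
    b'"room_id":"video://85919470/146861497"}}}'
)
(reply_header_len,) = struct.unpack_from(">H", reply, 4)
online = json.loads(reply[reply_header_len:])["data"]["room"]["online"]
print(online)
```

The decoded body is the text `[object Object]`, so the message may be a heartbeat whose body is a JavaScript object stringified by accident rather than user information; that reading is only a guess.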

2. My scheme (Java)

  1. WebSocket connection: the user subscribes to /video/{vid}
  2. Create a Map keyed by vid that stores each vid and its corresponding WebSocket sessions
  3. The client sends a message to the backend carrying the user's userId
  4. Write it to Redis: use the video's vid as the key and a Redis set as the data structure storing userId values, then read the number of online users with scard key
  5. Refresh via heartbeat or on a regular schedule
  6. When the user closes the tab and the WebSocket disconnects, remove that userId from the Redis set keyed by vid
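
Steps 2 to 6 can be sketched without any infrastructure. The plan above uses Java and a Redis set per vid; this Python sketch swaps in an in-memory dict as a stand-in, so `heartbeat`, `leave`, and `online` only mirror what SADD, SREM, and SCARD would do. The class name, the method names, and the `timeout` parameter are hypothetical.

```python
from collections import defaultdict


class OnlineTracker:
    """In-memory stand-in for one Redis set per vid (hypothetical helper)."""

    def __init__(self):
        # vid -> {user_id: time of the last heartbeat}
        self._viewers = defaultdict(dict)

    def heartbeat(self, vid, user_id, now):
        # Like SADD: adding the same user twice keeps a single entry.
        self._viewers[vid][user_id] = now

    def leave(self, vid, user_id):
        # Like SREM: called when the WebSocket for this vid closes.
        self._viewers[vid].pop(user_id, None)

    def expire(self, now, timeout):
        # Drop users whose last heartbeat is older than `timeout` seconds.
        for viewers in self._viewers.values():
            stale = [u for u, seen in viewers.items() if now - seen > timeout]
            for user_id in stale:
                del viewers[user_id]

    def online(self, vid):
        # Like SCARD: number of distinct users currently in the set.
        return len(self._viewers[vid])


tracker = OnlineTracker()
tracker.heartbeat("85919470", "alice", now=0)
tracker.heartbeat("85919470", "bob", now=0)
tracker.heartbeat("85919470", "alice", now=20)  # a refresh, not a second viewer
count_after_join = tracker.online("85919470")
tracker.leave("85919470", "bob")
count_after_leave = tracker.online("85919470")
tracker.expire(now=100, timeout=30)
count_after_expire = tracker.online("85919470")
print(count_after_join, count_after_leave, count_after_expire)
```

Using a set keyed by vid means a user who reconnects or sends repeated heartbeats is still counted once.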

3. Shortcomings

Since users may open several videos at the same time, closing one of them also disconnects the others, so the latest online count can no longer be pushed to them.

Reference code: github.com/penpen456/b…

    import base64
    import datetime
    import json
    import struct
    import time

    import requests
    import websocket  # provided by the websocket-client package

    HEADER_LEN = 18


    def get_cid(av):
        """Look up the cid of a video from its av number."""
        response = requests.get("https://api.bilibili.com/x/web-interface/view?aid=" + str(av))
        response.encoding = 'utf-8'
        data = json.loads(response.text)
        return data['data']['cid']


    def make_send(av):
        """Build the subscribe message: 18 header bytes plus a JSON body."""
        cid = str(get_cid(av))
        body = ('{"room_id":"video://' + str(av) + '/' + cid
                + '","platform":"web","accepts":[1000]}').encode('utf-8')
        # In the captured messages the first 4 bytes equal the total length,
        # so compute it here; the other header values are copied from the capture.
        header = struct.pack('>IHHIIH', HEADER_LEN + len(body), HEADER_LEN, 1, 7, 1, 0)
        return header + body


    def get_online(text):
        """Extract the number after "online": from a reply."""
        start = text.find(b'"online":') + len(b'"online":')
        end = text.find(b',', start)
        get = int(text[start:end])
        print(get)
        return get


    def connect(plz):
        url = "wss://broadcast.chat.bilibili.com:7823/sub"
        normal = base64.b64decode('AAAAIQASAAEAAAACAAAACQAAW29iamVjdCBPYmplY3Rd')
        ws = websocket.create_connection(url, timeout=10)
        try:
            ws.send(bytes(plz))
            get = ws.recv()
            print(get)
            print(normal)
            ws.send(bytes(normal))
            get = ws.recv()
            print(get)
        finally:
            ws.close()
        if get.find(b'online') != -1:
            return get_online(get)
        print("None")
        return None


    def get_online_from_av(av):
        return connect(make_send(av))


    def write_file(file_name, onlines, times):
        """Append one "time,count" line to file_name."""
        with open(file_name, 'a') as file_obj:
            file_obj.write(str(times) + ',' + str(onlines) + '\n')


    if __name__ == '__main__':
        get_online_from_av(85919470)

        # Uncomment to log the count once a minute:
        # file_name = str(input("File_Name(a.txt):"))
        # avid = int(input('AVid(85919470):'))
        # while True:
        #     online = get_online_from_av(avid)
        #     now_time = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        #     print(now_time)
        #     write_file(file_name, online, now_time)
        #     time.sleep(60)
