Preface

Some time ago, a friend sent me a link on WeChat and asked me to open it on my computer.

“I don’t have WeChat installed on my computer.” “But there’s a lot in it.” “Wait, my account doesn’t seem to be able to log in to the web version of WeChat.” “Then open your browser and I’ll read the address out to you.” “…”

I don’t know whether you have ever run into this kind of trouble. I have, so in my spare time I wrote a web application that uses the browser to transfer data across devices without installing any other software. 👉 ox.vanoc.top

How to use it

Start by opening ox.vanoc.top in your PC browser

This is the homepage I designed, and it is the only page. It has two main functions:

  • Click “Computer copy to phone” on the left and a QR code pops up. Scan it with your phone and a connection prompt appears. Once the connection succeeds, a pulsing indicator is shown on the QR code. You can then copy some text or a picture on the computer, go back to the page, and press Control+V to send the clipboard contents to the phone.
  • Click “Phone copy to computer” on the other side and a QR code pops up. Scanning it establishes the connection in the same way as above. Once connected, you can paste or type any text into the input box on the phone, or select a picture and send it to the computer. The content is written to the computer clipboard automatically, so you can paste it manually on the computer. I also added a text recognition feature: send a picture from your phone to the computer, then press and drag on the computer to select the text you want to recognize.

The two functions work exactly the same way, from establishing the connection to transferring data. Why split them into two? Honestly, just to get more clicks.

When you first open the page you will see the beginner’s guide I carefully prepared, or you can read the illustrated guides:

  • Computer copy to phone
  • Phone copy to computer

WebRTC on the front end

WebRTC is the core of the application described in this article. It is an open-source project released by Google in 2011 for establishing P2P real-time audio and video transmission. It covers hardware access, audio and video drivers, various data transmission protocols, STUN/TURN servers, SDP, and so on.

In modern Web browsers it is easy to establish WebRTC connections through RTCPeerConnection, MediaDevices and other APIs, and to send audio and video streams or other data buffers directly from one peer to another. The P2P transmission in this article, for example, relies mainly on the RTCPeerConnection interface. Below, I briefly introduce the ideas and principles behind establishing a WebRTC connection for data transmission.

WebRTC and WebSocket

When it comes to data communication on the Web, WebSocket is the first thing that comes to mind: it is stable and reliable. But WebSocket is built on TCP, which, to guarantee reliable delivery, adds a pile of packet headers, requires an acknowledgement for every message, and resends when the acknowledgement does not arrive. The main problem, of course, is money: on a food budget of 1,500 a month, the server I rent has little bandwidth and is very slow, and sending a few large pictures could put the account into arrears.

In contrast, WebRTC establishes a direct P2P connection. It is fast, and P2P traffic does not go through the server, so it is quite cost-effective. The disadvantage is that in China’s network environment the probability of successful NAT traversal is only about 50%. When traversal fails, WebRTC switches to relay mode and forwards the traffic through the configured server to make sure the data still gets through.

SDP and the signaling server

[Photo from the Internet]

First of all, let’s take a look at how WebRTC establishes a P2P connection, as shown in the figure above:

  1. Device A creates an Offer (SDP) and sends it to device B.
  2. Device B receives the Offer, sets it as its remote description, creates an Answer (SDP), and sends the Answer to device A.
  3. Device A receives the Answer and sets it as its remote description.
  4. Both devices ask the STUN server for their ICE candidates.
  5. NAT traversal establishes a connection.
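
Steps 1 to 3 above can be sketched as a single function. This is a minimal illustration rather than the site’s actual code: `PeerLike` and `Description` are hypothetical types that list only the four `RTCPeerConnection` methods the handshake calls, and both peers are passed in directly, whereas in the real application the offer and the answer travel through the signaling server. ICE candidate exchange (steps 4 and 5) is left out.

```typescript
// Hypothetical shapes: only what the handshake needs.
interface Description {
  type: string;
  sdp?: string;
}

interface PeerLike {
  createOffer(): Promise<Description>;
  createAnswer(): Promise<Description>;
  setLocalDescription(desc: Description): Promise<void>;
  setRemoteDescription(desc: Description): Promise<void>;
}

// A offers, B answers, and each side stores its local and remote description.
async function negotiate(a: PeerLike, b: PeerLike): Promise<void> {
  const offer = await a.createOffer();
  await a.setLocalDescription(offer); // A keeps its own offer
  await b.setRemoteDescription(offer); // B stores A's offer as remote

  const answer = await b.createAnswer();
  await b.setLocalDescription(answer); // B keeps its own answer
  await a.setRemoteDescription(answer); // A stores B's answer as remote
}
```

Keeping the function dependent on an interface instead of the browser class makes the ordering easy to check with fake peers outside a browser.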

So the first step in establishing a WebRTC connection is to exchange SDP. What exactly is SDP?

SDP is plain text that describes the hardware and state of a device, such as network status, audio and video device parameters, ICE candidates, and so on.

v=0
o=- 6005770948690505604 2 IN IP4 127.0.0.1
s=-
...

WebRTC compares the SDP of the two parties and negotiates the most suitable connection mode according to their respective capabilities, so we need a Web server through which the two sides can exchange their SDP. This server is generally called the signaling server.
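
Because SDP is plain text with one `key=value` field per line, it is easy to inspect. The sketch below splits a description into its fields. The function name `parseSdpLines` and the `SdpField` type are made up for illustration and are not part of WebRTC.

```typescript
interface SdpField {
  key: string; // single-letter field type such as "v", "o", "s", "m" or "a"
  value: string; // everything after the first "="
}

function parseSdpLines(sdp: string): SdpField[] {
  return sdp
    .split(/\r?\n/)
    .map((line) => line.trim())
    // Keep only well-formed lines: one letter, then "=".
    .filter((line) => line.length > 0 && line.indexOf("=") === 1)
    .map((line) => ({ key: line.charAt(0), value: line.slice(2) }));
}

const fields = parseSdpLines(
  "v=0\r\no=- 6005770948690505604 2 IN IP4 127.0.0.1\r\ns=-\r\n"
);
```

Here `fields` holds three entries with the keys `v`, `o` and `s`.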

STUN/TURN servers and NAT traversal

After confirming each other’s SDP information, the two parties can attempt NAT traversal. WebRTC solves the traversal problem through the ICE framework, which includes the STUN and TURN techniques.

The main job of a STUN server is to answer the UDP packets it receives from a client with that client’s public IP address and port, and to check connectivity between the two sides. It is relatively simple to implement, so there are many free public STUN addresses available on the web.
[Photo from Internet]

If NAT traversal through STUN fails, a TURN server is needed for indirect communication. You have to set this server up yourself. Compared with STUN, it adds a relay, which means that data sent through the TURN server consumes your own server’s traffic.

Generally, multiple STUN/TURN server addresses can be passed to the interface WebRTC provides. WebRTC tries all of them and returns the corresponding ICE candidates. The signaling server then exchanges the ICE candidates of both parties, and the two sides try to establish a connection.

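const ICE_config = {
  // `urls` is the standard key; the older `url` key is deprecated
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' },
    // A real TURN server normally also needs `username` and `credential`
    { urls: 'turn:192.158.29.39:3478?transport=udp' },
    { urls: 'turn:192.158.29.39:3478?transport=tcp' },
  ],
};
const pc = new RTCPeerConnection(ICE_config);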

Connection flow

The figure above shows the connection establishment process. After the user creates a QR code on the PC, the signaling server generates a unique UID, stores a connection state object (stateInfo) with an expiry time in an LRU cache under that UID, and returns the UID to the PC. The PC uses the UID to poll stateInfo for state synchronization, and updates stateInfo whenever an offer or ICE candidate is generated locally.

When the phone scans the code, it obtains the UID of the PC’s stateInfo and then polls and synchronizes stateInfo with that UID, just as the PC does. If NAT traversal has not succeeded after a certain period of time, the attempt is treated as a connection failure and the application switches to a relay server (WebSocket) for data transmission.

The phone also records the current network environment and the UID after switching to WebSocket. The next time it connects to the same PC from the same network environment, it switches to WebSocket mode directly, which saves unnecessary waiting.
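
The fallback rules in the last two paragraphs can be expressed as a small decision function. This is a sketch under assumptions: the 10-second timeout, the `networkId` string standing in for the “network environment”, and every name below are hypothetical, since the article does not give them.

```typescript
type Transport = "p2p" | "websocket" | "keep-waiting";

// Remembers (network, uid) pairs for which P2P has already failed once.
class RelayMemory {
  private failed = new Set<string>();

  remember(networkId: string, uid: string): void {
    this.failed.add(networkId + "|" + uid);
  }

  has(networkId: string, uid: string): boolean {
    return this.failed.has(networkId + "|" + uid);
  }
}

const P2P_TIMEOUT_MS = 10000; // assumed value, not taken from the article

function chooseTransport(
  memory: RelayMemory,
  networkId: string,
  uid: string,
  p2pConnected: boolean,
  elapsedMs: number
): Transport {
  if (memory.has(networkId, uid)) return "websocket"; // known-bad pair: skip the wait
  if (p2pConnected) return "p2p";
  if (elapsedMs < P2P_TIMEOUT_MS) return "keep-waiting";
  memory.remember(networkId, uid); // record the failure for next time
  return "websocket";
}
```

The first timeout for a given network and PC costs the full waiting period. Every later attempt with the same pair goes straight to the relay.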

Note: The code snippets below are not complete; they may not run as-is and may contain bugs. They are for reference only.

Setting up a Signaling Server

There are many ways to set up a signaling server, such as keeping the signaling state on the server and polling it, or using WebSocket to synchronize signaling directly.

Since the application’s concurrency is not high, I set up a simple single-machine server with Node and Koa, and exchange signaling by HTTP polling.

First we need an interface to create stateInfo:

router.post('/create', async (ctx) => {
  const uid = createUid()
  // Initial connection state shared by the PC and the phone
  cache.set(uid, {
    offer: '',
    answer: '',
    status: 1,
    candidate1: [],
    candidate2: [],
  })

  ctx.body = {
    code: 0,
    data: {
      uid,
    },
  }
})
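
The `cache` object used above is not shown in the article. One way such a cache could look, assuming an LRU policy with a per-entry expiry time as described in the connection flow, is sketched below. The class name `ExpiringLruCache`, its capacity and TTL parameters, and the injectable clock are all illustrative assumptions.

```typescript
class ExpiringLruCache<V> {
  // A Map keeps insertion order, so the first key is the least recently used.
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(
    private capacity: number,
    private ttlMs: number,
    private now: () => number = () => Date.now()
  ) {}

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it
      return undefined;
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.capacity) {
      // Evict the least recently used entry.
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

In this sketch, reading an entry refreshes its position in the LRU order but not its expiry time, so an abandoned QR code still disappears after the TTL.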

As an optimization, each PC generates its uid only once, which saves memory and makes fast reconnection possible. We can add a uid-checking middleware:

module.exports = () => async (ctx, next) => {
  let uid = ctx.cookies.get(names.CLIENT_UID_NAME)
  if (!uid) {
    // First visit: create a uid and keep it in an httpOnly cookie
    uid = createUid(ctx)
    ctx.cookies.set(names.CLIENT_UID_NAME, uid, { httpOnly: true })
  }

  ctx.state.uid = uid
  return next()
}

// The create interface is changed to
router.post('/create', async (ctx) => {
    const uid = ctx.state.uid
    ...

A check interface is also provided for polling the state:

router.get('/check', async (ctx) => {
  const { uid } = ctx.state
  const data = cache.get(uid)
  if (!data) {
    ctx.body = { code: errorCode.NOT_FIND, message: errorCode.NOT_FIND_MESSAGE }
    return
  }

  ctx.body = { code: 0, data, message: '' }
})

When a client state change needs to be synchronized to the server’s stateInfo, an update interface is required:

router.post('/update', async (ctx) => {
  const { body } = ctx.request
  const { uid } = ctx.state

  const stateinfo = cache.get(uid)

  if (body.offer) {
    stateinfo.offer = body.offer
    stateinfo.status = 2
  }

  // Switch to websocket
  if (body.upgrade) {
    stateinfo.status = -1
  }
  ...
})

Setting up a relay server

WebRTC’s NAT traversal success rate in China is below 50%, so a relay service is needed to guarantee delivery when traversal fails. I originally wanted to use a TURN server as the relay but could not find a free one; the usual approach is to deploy your own with coturn. But coturn depends on a database such as MySQL or Redis, which my Alibaba Cloud ECS instance with only 0.5 GB of memory cannot handle. Since the traffic goes through my own server either way, I relay directly over WebSocket instead.

A WebSocket connection starts as an HTTP upgrade. The client sends an upgrade request, which is an ordinary HTTP request; the server answers with status 101, the upgrade succeeds, and the WebSocket connection begins. So if we don’t want to start another Node server, we only need to handle the HTTP server’s upgrade event ourselves, and the HTTP server and the WebSocket server can share the same port.

const webSocket = require('ws')

module.exports = function createSocket(server) {
  // noServer: the HTTP upgrade is handled below, so both servers share one port
  const socketServer = new webSocket.Server({ noServer: true })

  socketServer.on('connection', function connection(ws, uid, type) {
    setSocketRoom(uid, type, ws)
    ws._room_type = type
    ws._room_id = uid

    ws.on('message', function incoming(message) {
      sendMessage(this._room_id, this._room_type, message)
    })

    ws.on('close', function onclose() {
      closeRoom(this._room_id)
    })
  })

  server.on('upgrade', (request, socket, head) => {
    const urlParsed = request.url.match(/^\/ws\/\?uid=(.+)?&type=(.+)$/)
    if (!urlParsed) {
      // Not a relay URL: refuse the upgrade
      socket.destroy()
      return
    }
    const [_, uid, type] = urlParsed

    initSocketRoom(uid)
    if (hasConnectedRoom(uid)) {
      socket.write('HTTP/1.1 401 Room Is Connected\r\n\r\n')
      socket.destroy()
      return
    }

    socketServer.handleUpgrade(request, socket, head, function done(ws) {
      socketServer.emit('connection', ws, uid, type)
    })
  })
}
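
The helpers `initSocketRoom`, `hasConnectedRoom`, `setSocketRoom`, `sendMessage` and `closeRoom` are not shown in the article. A possible shape for them, purely as an illustration, keeps one room per uid with a slot for each side and forwards every message to the opposite slot. The side names `"pc"` and `"phone"` and the `SocketLike` interface are assumptions.

```typescript
type Side = "pc" | "phone";

// The only socket methods the relay needs.
interface SocketLike {
  send(data: string): void;
  close(): void;
}

type Room = { pc?: SocketLike; phone?: SocketLike };
const rooms = new Map<string, Room>();

function initSocketRoom(uid: string): void {
  if (!rooms.has(uid)) rooms.set(uid, {});
}

// A room counts as connected once both sides have joined.
function hasConnectedRoom(uid: string): boolean {
  const room = rooms.get(uid);
  return room !== undefined && room.pc !== undefined && room.phone !== undefined;
}

function setSocketRoom(uid: string, side: Side, ws: SocketLike): void {
  initSocketRoom(uid);
  const room = rooms.get(uid);
  if (room) room[side] = ws;
}

// Forward a message from one side to the other side of the same room.
function sendMessage(uid: string, from: Side, message: string): boolean {
  const room = rooms.get(uid);
  const target = from === "pc" ? room?.phone : room?.pc;
  if (!target) return false;
  target.send(message);
  return true;
}

function closeRoom(uid: string): void {
  const room = rooms.get(uid);
  rooms.delete(uid); // delete first so a re-entrant call becomes a no-op
  room?.pc?.close();
  room?.phone?.close();
}
```

With this layout the server never inspects the payload. It only needs to know which room and which side a message came from.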

Sending images in fragments

There are many articles about how to create a dataChannel with WebRTC for data transmission, so that is not covered here. A dataChannel supports data types such as string, Blob and ArrayBuffer, and there is a limit on the size of a single message. The limit differs between browsers: Chrome and Firefox can reach 256 KB, while iOS Safari allows only 16 KB. Large data such as images therefore has to be sent in fragments.

First we turn the image into an ArrayBuffer, which we divide into 8 KB segments to be safe:

  public sendFileSlice(file: File) {
    if (file.size > 1024 * 1024 * 10) {
      message.error('The picture is too big for me to carry.')
      return
    }
    const fragmentSize = 1024 * 8; // Each send carries at most 8 KB
    const fileReader = new FileReader();
    fileReader.onload = () => {
      const buffer = fileReader.result as ArrayBuffer;
      const size = buffer.byteLength;
      // Round up so the final, shorter fragment is counted
      const partitionLength = Math.ceil(size / fragmentSize);
      const fragments: ArrayBuffer[] = [];
      for (let i = 0; i < partitionLength; i++) {
        fragments.push(buffer.slice(i * fragmentSize, (i + 1) * fragmentSize));
      }
      this.sendFragmentArrayBuffer(fragments, file.type);
    };
    fileReader.readAsArrayBuffer(file);
  }

We can define an enumeration of message types and attach one of these heads to every message, so that the receiver can handle each kind of message differently:

export enum MessageActionHead {
  // System communication
  NATIVE = "[native]",
  // The connection was successful
  CONNECT = "[connect]",
  // Send text data
  TEXT = "[text]",
  // Image buffer; because send has a size limit, it is sent in fragments
  BUFFER = "[buffer]",
  // Acknowledgement that a buffer fragment was received
  INFO = "[info]",
  // Heartbeat packet
  HEART = "[heart]",
  // Close the connection
  CLOSE = "[close]",
  // Suspend: JS is suspended while the phone is picking a picture, so heartbeat monitoring must be paused
  HOLDER = "[holder]",
}
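
One plausible way to attach such a head, shown only as a sketch, is to prefix the payload with it and strip it again on receipt. The helpers `encodeMessage` and `decodeMessage` are hypothetical, and the enum is repeated in shortened form as `Head` so the example is self-contained.

```typescript
enum Head {
  TEXT = "[text]",
  INFO = "[info]",
  HEART = "[heart]",
}

const KNOWN_HEADS: Head[] = [Head.TEXT, Head.INFO, Head.HEART];

function encodeMessage(head: Head, payload: string = ""): string {
  return head + payload;
}

function decodeMessage(raw: string): { head: Head; payload: string } | null {
  for (const head of KNOWN_HEADS) {
    if (raw.startsWith(head)) {
      return { head, payload: raw.slice(head.length) };
    }
  }
  return null; // unknown head: the caller can ignore the message
}
```

A receiver would typically `switch` on the decoded `head` to decide whether to update the UI, reply with an acknowledgement, or reset the heartbeat timer.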

When the receiver gets a message with the BUFFER head, it knows it has to collect fragment data. After each BUFFER it replies with an INFO message, and only then does the sender send the next BUFFER, which keeps the fragments in order.
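
This acknowledge-then-send-next exchange can be simulated without a network. The sketch below is an illustration under assumptions, not the article’s code: `splitBuffer`, `FragmentReceiver` and `sendAll` are made-up names, and the sender calls the receiver directly where the real application would go through the data channel.

```typescript
function splitBuffer(buffer: ArrayBuffer, fragmentSize: number): ArrayBuffer[] {
  const fragments: ArrayBuffer[] = [];
  for (let offset = 0; offset < buffer.byteLength; offset += fragmentSize) {
    // slice clamps the end index, so the last fragment may be shorter
    fragments.push(buffer.slice(offset, offset + fragmentSize));
  }
  return fragments;
}

class FragmentReceiver {
  private parts: Uint8Array[] = [];

  // Store one fragment and return the acknowledgement the sender waits for.
  receive(fragment: ArrayBuffer): string {
    this.parts.push(new Uint8Array(fragment));
    return "[info]";
  }

  // Concatenate all fragments in arrival order.
  assemble(): Uint8Array {
    const total = this.parts.reduce((sum, part) => sum + part.byteLength, 0);
    const result = new Uint8Array(total);
    let offset = 0;
    for (const part of this.parts) {
      result.set(part, offset);
      offset += part.byteLength;
    }
    return result;
  }
}

// Stop-and-wait: the next fragment goes out only after the previous one is acknowledged.
function sendAll(buffer: ArrayBuffer, receiver: FragmentReceiver, fragmentSize: number): number {
  let sent = 0;
  for (const fragment of splitBuffer(buffer, fragmentSize)) {
    const ack = receiver.receive(fragment);
    if (ack !== "[info]") break; // no acknowledgement: stop sending
    sent++;
  }
  return sent;
}
```

Waiting for each acknowledgement costs one round trip per fragment, but it means the receiver never has to reorder anything.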

Operating the computer clipboard

There are two ways to copy to the clipboard in the browser.

The first is input + execCommand: select the content of an input element and call execCommand(‘copy’) to copy it. This method can only copy text to the clipboard, and it must be triggered by a user click.

The second is to modify the clipboard directly through the navigator.clipboard object. This works without the user noticing and handles both text and images.

// `hasClipboard` and `jpgBlobToPng` are defined elsewhere in the project
const tryCopyClipboard = async (
  type: MessageActionHead,
  content: string | Blob
) => {
  if (!hasClipboard) return false;
  if (type === MessageActionHead.TEXT) {
    try {
      // Turn escaped "\n" sequences back into real line breaks
      const copyText = (content as string).replace(/\\n/g, '\n')
      await navigator.clipboard.writeText(copyText);
      return true;
    } catch (e) {
      console.log(e);
      return false;
    }
  }
  if (type === MessageActionHead.BUFFER) {
    try {
      const targetType = (content as Blob).type;
      let copyTargetBlob = content as Blob;
      // JPEG images are converted to PNG before being written
      if (targetType.endsWith("jpg") || targetType.endsWith("jpeg")) {
        copyTargetBlob = await jpgBlobToPng(content as Blob);
      }
      await navigator.clipboard.write([
        new window.ClipboardItem({
          [copyTargetBlob.type]: copyTargetBlob,
        }),
      ]);
      return true;
    } catch (e) {
      console.log(e);
      return false;
    }
  }
  return false;
};

Recognizing the text in a picture

Tesseract.js consists of an OCR engine and language packages. Depending on the languages specified at initialization, it downloads the corresponding language packages from GitHub. The Chinese language package is about 30 MB.

The Tesseract.js API is very simple. The problem is that the language packages are hosted on GitHub, which is often unreachable from China. How can that be solved?

The language package loading code in Tesseract.js shows that it first looks for a cached copy in the browser’s IndexedDB and downloads the package only if it cannot find one. So we just need to download the required language packages from GitHub and upload them to our own CDN. Then, before the client initializes Tesseract, we manually download the language package from the CDN and cache it in IndexedDB.

import { OEM, PSM } from "tesseract.js";

// Cache check. initDBConnect, findDB, loadFile, setBuffer, gzip and worker
// are defined elsewhere in the project.
export async function catchDistrictFile(
  processCallback?: (event: ProgressEvent) => void,
  successCallback?: () => void
) {
  try {
    await initDBConnect();
    const hasChCache = await findDB(CH_DATA_NAME);
    const hasEnCache = await findDB(EN_DATA_NAME);

    // Download from the CDN only when the package is not cached yet
    if (!hasChCache) {
      const buffer = await loadFile(
        `${import.meta.env.VITE_CDN_URL}gz/${CH_DATA_NAME}.gz`,
        processCallback
      );

      setBuffer('/' + CH_DATA_NAME, gzip.gunzipSync(buffer));
    }
    if (!hasEnCache) {
      const buffer = await loadFile(
        `${import.meta.env.VITE_CDN_URL}gz/${EN_DATA_NAME}.gz`,
        processCallback
      );
      setBuffer('/' + EN_DATA_NAME, gzip.gunzipSync(buffer));
    }
    successCallback && successCallback();
  } catch (e) {
    console.log(e);
  }
}

async function readText() {
  await catchDistrictFile()
  await worker.load();
  await worker.loadLanguage("eng+chi_sim");
  await worker.initialize("eng+chi_sim", OEM.TESSERACT_ONLY);
  await worker.setParameters({
    tessedit_pageseg_mode: PSM.SINGLE_BLOCK,
  });
}

Finally

Although WebRTC manages to establish a P2P connection only about half of the time, that alone saves roughly half of the bandwidth and traffic charges compared with forwarding everything through the server.

This article is participating in the “Juejin 2021 Spring Recruitment Campaign”.