This section describes the generation mechanism of mongodb-objectid

This is the 10th day of my participation in Gwen Challenge

Introduction to the

The following is an excerpt from the introduction of the rookie tutorial

MongoDB is a database based on distributed file storage

Documents stored in MongoDB must have an “_id” key. The value of this key can be any type, and the default is an ObjectId object

ObjectId is a 12-byte BSON data:

  • The first four bytes represent the timestamp
  • The next three bytes are the machine id
  • The next two bytes consist of the process ID (PID)
  • The last three bytes are auto-increment random numbers

According to this statement, 1 second can theoretically produce 2^24 (16777216), let’s explore whether this is true

Source code analysis

To do this, find the mongodb/js-bson/objectid location

In the js-bson library

The constructor

Locate to constructor: 47-96

As you can see from the type definition, multiple types of parameters can be passed in

Here, we only consider the case where the parameter passed in is undefined. The simplified code looks like this

const kId = Symbol('id');
class ObjectId{
    static index = ~~(Math.random() * 0xffffff)

    constructor(id? :string | Buffer | number | ObjectIdLike | ObjectId) {
      // The most common use case (blank id, new objectId instance)
      if (id == null || typeof id === 'number') {
        // Generate a new id
        this[kId] = ObjectId.generate(typeof id === 'number' ? id : undefined); }}}Copy the code

First, Symbol globally generates a unique key to hold the generated ObjectId

Class also has a random static variable index, which is used to generate an auto-increment random number, as we’ll see later

Null == undefined results in true, so we call static method generate when nothing is passed in

This simplifies to a JS class code as follows

const kId = Symbol('id');
class ObjectId{
    static index = ~~(Math.random() * 0xffffff)
    constructor(id) {
      if (id == null || typeof id === 'number') {
        this[kId] = ObjectId.generate(typeof id === 'number' ? id : undefined); }}}Copy the code

Objectid. generate the method to generate an ID

The true face of lushan is revealed, and in this method, the generation of the aforementioned “4” part of the structure is involved

The article I read is out of date:

  • 1-4 bytes: timestamp
  • 5-9 bytes: Process PID (actually random 5 bytes of content)
  • 10-12 bytes: random number increased by itself
class ObjectId {
    staticgenerate(time? :number): Buffer {
        if ('number'! = =typeof time) {
            time = ~~(Date.now() / 1000);
        }

        const inc = ObjectId.getInc();
        const buffer = Buffer.alloc(12);

        // 4-byte timestamp
        buffer.writeUInt32BE(time, 0);

        // set PROCESS_UNIQUE if yet not initialized
        if (PROCESS_UNIQUE === null) {
            PROCESS_UNIQUE = randomBytes(5);
        }

        // 5-byte process unique
        buffer[4] = PROCESS_UNIQUE[0];
        buffer[5] = PROCESS_UNIQUE[1];
        buffer[6] = PROCESS_UNIQUE[2];
        buffer[7] = PROCESS_UNIQUE[3];
        buffer[8] = PROCESS_UNIQUE[4];

        // 3-byte counter
        buffer[11] = inc & 0xff;
        buffer[10] = (inc >> 8) & 0xff;
        buffer[9] = (inc >> 16) & 0xff;

        returnbuffer; }}Copy the code

Here’s a look at the Buffer in the code:

  • Buffer.alloc(size[, fill[, encoding]]): Specifies a contiguous space in memory with a specified number of bytes for storing binary data. This space is used by default0To fill
  • Buffer arrays can use direct assignment just like normal arrays
  • Buffer.writeUInt32BE(value[, offset]): Writes the value to buffer as the number of 32-bit unsigned certificates, the number of bytes offset by the second argument bit (since it is a 32-bit integer, offset must be an integer multiple of 4)

A simple demonstration of writeUInt32BE

const buffer = Buffer.alloc(8)
// <Buffer 00 00 00 00 00 00 00 00>

// 0x indicates a hexadecimal number
buffer.writeUInt32BE(0xff.0)
// <Buffer 00 00 00 ff 00 00 00 00>

buffer.writeUInt32BE(255.4)
// <Buffer 00 00 00 ff 00 00 00 ff>
Copy the code

Here is a brief introduction to the base conversion knowledge to help understand the source code

Hexadecimal conversion

// Generate a contiguous space of 12 bytes
const buffer = Buffer.alloc(12)
// <Buffer 00 00 00 00 00 00 00 00 00 00 00 00>
Copy the code

1 Byte is equal to 8 bits (bit-binary bits). The storage range ranges from 0 to 2^8-1, that is, 0-255. There are 256 digits in total

The contents of each buffer are represented by two hexadecimal bits (00 to FF).

Binary bit 0, 0, 0 the corresponding bit value (base 10) 8, 4, 2, 1Copy the code

For example (see the corresponding values above)

Binary 0, 1, 0, 1# equivalentDecimal 5 = 0 + 4 + 0 + 1# equivalentHexadecimal 5Copy the code
Binary 1, 1, 0, 1# equivalentDecimal 13 = 8 + 4 + 0 + 1# equivalentHexadecimal dCopy the code

Examples of numbers stored in Buffer (which will be converted to base automatically) :

const buf = Buffer.alloc(1)
buf[0] = 12
// <Buffer 0c>
buf[0] = 15
// <Buffer 0f>
buf[0] = 255
// <Buffer ff>
Copy the code

Let’s get back to the point

Generation of timestamp

Use date.now () to get the current time, rounded by 1000

  • ~ : indicates that the bit operator is reversed
  • Bitwise operators operate on binary
  • ~~ : This parameter is reached when two consecutive reverse operations are performedintegerThe purpose, the behavior of positive numbers andMath.floorConsistent, subordinate behavior withMath.ceilconsistent
console.log(~~(12.5)) / / 12
console.log(Math.floor(12.5)) / / 12

console.log(~~(-12.5)) / / - 12
console.log(Math.ceil(-12.5)) / / - 12
Copy the code

Get the timestamp

const time = ~~(Date.now() / 1000);
Copy the code

Save the first 4 bytes

// 4-byte timestamp
buffer.writeUInt32BE(time, 0);
Copy the code

Random process PID generation

You can see that this is a random 5-byte number generated with the randomBytes method

import {randomBytes} from './parser/utils'

PROCESS_UNIQUE = randomBytes(5);
Copy the code

RandomButes, condensed, looks like this

// parser/utils
const detectRandomBytes = (): RandomBytesFunction= > {
  if (typeof global! = ='undefined' && global.crypto && global.crypto.getRandomValues) {
    return size= > global.crypto.getRandomValues(Buffer.alloc(size));
  }

  let requiredRandomBytes: RandomBytesFunction | null | undefined;
  try {
    requiredRandomBytes = require('crypto').randomBytes;
  } catch (e) {
  }

  return requiredRandomBytes || insecureRandomBytes;
};

export const randomBytes = detectRandomBytes();
Copy the code

The test actually calls require(‘crypto’).randomBytes

The following code calls a method provided by Node’s built-in crypto library

TODO: Next time I’ll introduce the principle of require(‘crypto’).randomBytes

export const randomBytes = require('crypto').randomBytes;
Copy the code

The resulting code for these five bytes is as follows

const randomBytes = require('crypto').randomBytes;

const PROCESS_UNIQUE = randomBytes(5);

// 5-byte process unique
buffer[4] = PROCESS_UNIQUE[0];
buffer[5] = PROCESS_UNIQUE[1];
buffer[6] = PROCESS_UNIQUE[2];
buffer[7] = PROCESS_UNIQUE[3];
buffer[8] = PROCESS_UNIQUE[4];
Copy the code

Self-increasing random number

class ObjectId {
    static index = ~~(Math.random() * 0xffffff)
    static getInc(): number {
      return (ObjectId.index = (ObjectId.index + 1) % 0xffffff);
    }
    staticgenerate(time? :number): Buffer {
        const inc = ObjectId.getInc();
        // omit intermediate extraneous code
        // 3-byte counter
        // Save the last 2 bytes
        buffer[11] = inc & 0xff;

        // Shift 8 bits to the right and then store the lowest 2 bytes to the 11th bit
        buffer[10] = (inc >> 8) & 0xff;

        // Move 16 bits to the right, saving the lowest 2 bytes to the 10th bit
        buffer[9] = (inc >> 16) & 0xff;

        returnbuffer; }}Copy the code
  1. Generates a random 3 – byte number~~(Math.random() * 0xffffff)
  2. Increment by 1 as the final random number

Simplify the ObjectId implementation

After disassembling the ObjectId, real three parts above, implement a minimum feasible method

const randomBytes = require('crypto').randomBytes
const kId = Symbol('id');

let PROCESS_UNIQUE = null;

class MyObjectId {
    static index = ~~(Math.random() * 0xffffff)
    constructor(id) {
        if (id == null || typeof id === 'number') {
            this[kId] = MyObjectId.generate(typeof id === 'number' ? id : undefined); }}static getInc() {
      return (MyObjectId.index = (MyObjectId.index + 1) % 0xffffff);
    }
    static generate(time) {
        if ('number'! = =typeof time) {
            time = ~~(Date.now() / 1000);
        }

        const inc = MyObjectId.getInc();
        const buffer = Buffer.alloc(12);

        // 4-byte timestamp
        buffer.writeUInt32BE(time, 0);

        // set PROCESS_UNIQUE if yet not initialized
        if (PROCESS_UNIQUE === null) {
            PROCESS_UNIQUE = randomBytes(5);
        }

        // 5-byte process unique
        buffer[4] = PROCESS_UNIQUE[0];
        buffer[5] = PROCESS_UNIQUE[1];
        buffer[6] = PROCESS_UNIQUE[2];
        buffer[7] = PROCESS_UNIQUE[3];
        buffer[8] = PROCESS_UNIQUE[4];

        // 3-byte counter
        buffer[11] = inc & 0xff;
        buffer[10] = (inc >> 8) & 0xff;
        buffer[9] = (inc >> 16) & 0xff;

        return buffer;
    }
    toHexString(){
        return this[kId].toString('hex')}}module.exports = {
    MyObjectId
}
Copy the code

test

const { ObjectId } = require('mongodb')
const { MyObjectId } = require('./myObjectId')
console.log(new ObjectId().toHexString());
console.log(new ObjectId().toHexString());
console.log(new ObjectId().toHexString());
console.log(new MyObjectId().toHexString());
console.log(new MyObjectId().toHexString());
console.log(new MyObjectId().toHexString());
Copy the code

The results were as expected

Complete example source code address

conclusion

  • Some of the translated materials on the Internet are indeed out of date
  • Take the time to look at simple source code, or to help review their knowledge
  • Through reading excellent source code, help to speed up cultivation
  • Bit operators are not used much in the development, but there are many excellent libraries, remind yourself down or more familiar with it, see if it can be used in computing scenarios, improve computing efficiency

TODO: Write an article on bitwise computing and learn about its use in a good open source library