This is the sixth day of my participation in the November Writing Challenge. See details: The Last Writing Challenge of 2021.

Author: Mintimate

Blog: www.mintimate.cn

Mintimate’s Blog, just to share with you

Short link

What is a short link

A short link, also known as a shortened URL, short URL, or URL shortening service, is a technology and service on the Internet.

The service shortens long URLs by providing a short URL in place of what would otherwise be a longer one.

When users access a shortened URL, they are usually redirected to the original URL.

Why short links

Short links are mainly used in the following scenarios:

  • Platforms such as Twitter and Weibo limit message length, so a short link is used in place of the original link.
  • Hiding the GET and PATH parameters.

Example

If the idea is still abstract, here is a short link from Tencent Cloud. For example, a Tencent Cloud promotion link is:

https://cloud.tencent.com/act/cps/redirect?redirect=1077&cps_key=&from=console

The short link Tencent Cloud publishes for it is:

https://curl.qcloud.com/XnaFxKqr

As you can see, the link is effectively shortened, and the PATH and GET parameters are hidden as well. When a user accesses the short link, a 301/302 response automatically redirects them to the original link.

Implementation approach

The idea behind the implementation is simple. To generate a short link, the original link is passed in, the backend processes it to get a unique identifier, stores it in the database, and finally returns the identifier to the user.

After obtaining the short link, the user shares it with others. When someone visits it, the backend looks up the identifier in the database (a complete system would put a Redis cache in front) and redirects to the original link.

So the implementation really is simple. The main points are:

  • Generate a unique identifier for each link, and keep the identifier short.
  • Have the backend issue a 301/302 redirect.
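The two steps can be sketched without any framework. This is only a minimal illustration under stated assumptions: the class name InMemoryShortener is made up, a map stands in for the database, and a simple counter rendered in base 36 stands in for the real identifier scheme.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for the database-backed service
final class InMemoryShortener {
    // Stand-in for the database table: identifier -> original link
    private final Map<String, String> store = new ConcurrentHashMap<>();
    // Stand-in for a real unique-ID generator
    private final AtomicLong counter = new AtomicLong(1000);

    // Step 1: generate an identifier for the original link, store the pair, return the identifier
    String shorten(String originalUrl) {
        String code = Long.toString(counter.incrementAndGet(), 36);
        store.put(code, originalUrl);
        return code;
    }

    // Step 2: look the identifier up; a web layer would answer with a 301/302 to this URL
    Optional<String> resolve(String code) {
        return Optional.ofNullable(store.get(code));
    }

    public static void main(String[] args) {
        InMemoryShortener shortener = new InMemoryShortener();
        String code = shortener.shorten("https://cloud.tencent.com/act/cps/redirect?redirect=1077&cps_key=&from=console");
        System.out.println(code + " -> " + shortener.resolve(code).orElse("not found"));
    }
}
```

An unknown identifier resolves to an empty Optional, which is where a real service would return a 404 page.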

How do you achieve these two points?

The examples use Java (Spring Boot); other programming languages can follow the same approach.

Unique identifier

Each time the backend receives a request from the front end, it generates an identifier and stores it in the database for later lookup and redirection.

Ideally the identifier depends on the timestamp and, if multiple servers are serving at the same time, on a machine identifier as well.

I’m sure many of you already know what I’m going to use…

In short, we can use a snowflake ID. A snowflake ID is a Long (a 64-bit integer) with up to 19 digits in decimal, which is definitely too long for a link, but it will not repeat.

Snowflake ID

Snowflake is an algorithm for generating globally unique IDs in distributed systems; the IDs are called snowflake IDs, or snowflakes.

The algorithm was created by Twitter for tweet IDs. Other companies, such as Discord and Instagram, have adopted modified versions. A snowflake ID is a 64-bit number:

  • The sign bit is unused and the next 41 bits are a timestamp, which means snowflake IDs can be sorted by time.
  • The next 10 bits are a machine ID, so that different machines in a distributed deployment do not produce the same ID.
  • The remaining 12 bits are a sequence number for the IDs generated on one machine within the same millisecond.
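To make the bit layout concrete, here is a minimal sketch that packs and unpacks the three fields, assuming the 41/10/12 split listed above. The class and method names (SnowflakeLayout, pack, timestampOf and so on) are made up for illustration, and the inputs are assumed to fit their fields, so no range checks are done.

```java
// Hypothetical helper that only illustrates the bit layout; it is not an ID generator
final class SnowflakeLayout {
    static final int MACHINE_BITS = 10;
    static final int SEQUENCE_BITS = 12;

    // Timestamp in the high bits, then the machine ID, then the sequence number
    static long pack(long millisSinceEpoch, long machineId, long sequence) {
        return (millisSinceEpoch << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    static long timestampOf(long id) {
        return id >>> (MACHINE_BITS + SEQUENCE_BITS);
    }

    static long machineOf(long id) {
        return (id >>> SEQUENCE_BITS) & ((1L << MACHINE_BITS) - 1);
    }

    static long sequenceOf(long id) {
        return id & ((1L << SEQUENCE_BITS) - 1);
    }

    public static void main(String[] args) {
        long id = pack(123456789L, 77, 5);
        System.out.println(timestampOf(id) + " " + machineOf(id) + " " + sequenceOf(id)); // 123456789 77 5
    }
}
```

Because the timestamp occupies the highest bits, an ID packed with a later timestamp is always numerically larger, which is what makes the IDs sortable by time.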

A reference implementation of the full generator in Java:

/**
 * Twitter SnowFlake algorithm: generate an integer with SnowFlake, then convert it to base 62 to form the short URL.
 * https://github.com/beyondfengyu/SnowFlake
 */
public class SnowFlakeShortUrl {

    /** * the initial timestamp */
    private final static long START_TIMESTAMP = 1480166465631L;

    /** * The number of bits each part occupies */
    private final static long SEQUENCE_BIT = 12;   // The number of digits occupied by the serial number
    private final static long MACHINE_BIT = 5;     // The number of bits occupied by the machine identifier
    private final static long DATA_CENTER_BIT = 5; // The number of bits occupied by the data center

    /** * The maximum value of each part */
    private final static long MAX_SEQUENCE = -1L ^ (-1L << SEQUENCE_BIT);
    private final static long MAX_MACHINE_NUM = -1L ^ (-1L << MACHINE_BIT);
    private final static long MAX_DATA_CENTER_NUM = -1L ^ (-1L << DATA_CENTER_BIT);

    /** * The displacement of each part to the left */
    private final static long MACHINE_LEFT = SEQUENCE_BIT;
    private final static long DATA_CENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    private final static long TIMESTAMP_LEFT = DATA_CENTER_LEFT + DATA_CENTER_BIT;

    private long dataCenterId;  // Data center
    private long machineId;     // Machine id
    private long sequence = 0L; // Serial number
    private long lastTimeStamp = -1L;  // Last time stamp

    private long getNextMill() {
        long mill = getNewTimeStamp();
        while (mill <= lastTimeStamp) {
            mill = getNewTimeStamp();
        }
        return mill;
    }

    private long getNewTimeStamp() {
        return System.currentTimeMillis();
    }

    /**
     * Creates a generator for the given data center ID and machine ID.
     *
     * @param dataCenterId data center ID
     * @param machineId    machine ID
     */
    public SnowFlakeShortUrl(long dataCenterId, long machineId) {
        if (dataCenterId > MAX_DATA_CENTER_NUM || dataCenterId < 0) {
            throw new IllegalArgumentException("dataCenterId can't be greater than MAX_DATA_CENTER_NUM or less than 0!");
        }
        if (machineId > MAX_MACHINE_NUM || machineId < 0) {
            throw new IllegalArgumentException("MachineId can't be greater than MAX_MACHINE_NUM or less than 0!");
        }
        this.dataCenterId = dataCenterId;
        this.machineId = machineId;
    }

    /**
     * Generates the next ID.
     *
     * @return the next snowflake ID
     */
    public synchronized long nextId() {
        long currTimeStamp = getNewTimeStamp();
        if (currTimeStamp < lastTimeStamp) {
            throw new RuntimeException("Clock moved backwards. Refusing to generate id");
        }

        if (currTimeStamp == lastTimeStamp) {
            // The serial number increases within the same millisecond
            sequence = (sequence + 1) & MAX_SEQUENCE;
            // The number of sequences in the same millisecond has reached the maximum
            if (sequence == 0L) {
                currTimeStamp = getNextMill();
            }
        } else {
            // Set the serial number to 0 for different milliseconds
            sequence = 0L;
        }

        lastTimeStamp = currTimeStamp;

        return (currTimeStamp - START_TIMESTAMP) << TIMESTAMP_LEFT // The timestamp part
                | dataCenterId << DATA_CENTER_LEFT       // Data center section
                | machineId << MACHINE_LEFT             // Machine identifier section
                | sequence;                             // Serial number section
    }
    
    public static void main(String[] args) {
        SnowFlakeShortUrl snowFlake = new SnowFlakeShortUrl(2, 3);

        for (int i = 0; i < (1 << 4); i++) {
            // Decimal
            System.out.println(snowFlake.nextId());
        }
    }
}

If you are using MyBatis Plus, you can call its IdWorker.getId method to generate the snowflake ID.

Written out in decimal, the generated Long is a 17 to 19 digit number.

Base 62

Written in decimal, the snowflake ID is a 17 to 19 digit number, which is too long to use as a short link, so we need to shorten it.

The shortened form must stay unique and comparable, so let’s convert it to base 62.

The reason is simple: base 62 uses a-z, A-Z, and 0-9, so converting from decimal to base 62 shortens the string considerably.

According to the Base62 entry on Wikipedia, the characters 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz correspond to the values 0-61. With that alphabet, we encode and decode:

    /** The base-62 alphabet: a character's index is its value (0-61), e.g. A is 10 and z is 61 */
    private static String CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    /** * Base conversion ratio */
    private static int SCALE = 62;

    /** Matches strings that contain only digits and letters */
    private static String REGEX = "^[0-9a-zA-Z]+$";

    /**
     * Converts a decimal number to a base-62 string.
     *
     * @param val the decimal number
     * @return the base-62 string
     */
    public static String encode10To62(long val)
    {
        if (val < 0)
        {
            throw new IllegalArgumentException("this is an Invalid parameter:" + val);
        }
        StringBuilder sb = new StringBuilder();
        int remainder;
        while (Math.abs(val) > SCALE - 1)
        {
            // Conversion begins with the last digit, takes the converted value, and finally reverses the string
            remainder = Long.valueOf(val % SCALE).intValue();
            sb.append(CHARS.charAt(remainder));
            val = val / SCALE;
        }
        // Get the highest bit
        sb.append(CHARS.charAt(Long.valueOf(val).intValue()));
        return sb.reverse().toString();
    }

    /**
     * Converts a base-62 string to a decimal number.
     *
     * @param val the base-62 string
     * @return the decimal number
     */
    public static long decode62To10(String val)
    {
        if (val == null)
        {
            throw new NumberFormatException("null");
        }
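        if (!val.matches(REGEX))
        {
            throw new IllegalArgumentException("this is an Invalid parameter:" + val);
        }
        // Strip leading zeros; replaceAll is needed because replace treats its argument as literal text
        String tmp = val.replaceAll("^0*", "");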

        long result = 0;
        int index = 0;
        int length = tmp.length();
        for (int i = 0; i < length; i++)
        {
            index = CHARS.indexOf(tmp.charAt(i));
            // Accumulate in long arithmetic; Math.pow works in double and can lose precision for large IDs
            result = result * SCALE + index;
        }
        return result;
    }

Let’s test it:

//Test
public static void main(String[] args) {
    Long snow = IdWorker.getId();
    System.out.println(snow);
    String str = encode10To62(snow);
    System.out.println(str);
    Long g = decode62To10(str);
    System.out.println(g);
}

Output:

1425664925648310274
1hJYkVByV0M
1425664925648310274

As you can see, this produces a short string that does not repeat, which suits short link generation:

  • The length of each short link is basically fixed.
  • The short link is not too long.
  • The generated short links are sortable (in chronological order).
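The length claim can be checked with a little arithmetic: 62^10 is below the largest Long and 62^11 is above it, so no non-negative Long needs more than 11 base-62 characters. The helper below is a small sketch for counting digits; its name (base62Length) is made up for illustration and it is not part of the article's encoder.

```java
// Hypothetical helper: counts how many base-62 digits a non-negative value needs
final class Base62Length {
    static int base62Length(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        int digits = 1;
        while (value >= 62) {
            value /= 62;
            digits++;
        }
        return digits;
    }

    public static void main(String[] args) {
        System.out.println(base62Length(Long.MAX_VALUE));       // 11
        System.out.println(base62Length(1425664925648310274L)); // 11, the ID from the test output above
    }
}
```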

Response headers

The response header matters for redirecting links. Nginx, for example, can issue a 301/302 redirect directly from its configuration, such as when forcing HTTPS:

if ($server_port !~ 443) {
    rewrite ^(/.*)$ https://$host$1 permanent;
}

Our short link platform likewise uses 301 or 302 for redirection.

301/302

301 and 302 are both redirects, so what is the difference?

  • 301: permanent redirect. Used when the requested URL has moved permanently. The Location header of the response should contain the current URL of the resource.
  • 302: temporary redirect. As with a permanent redirect, the client uses the URL in Location to locate the resource for now, but future requests should still use the original URL.

In practice, after a 301 redirect the browser remembers it and on later visits requests the new address directly instead of the original one. Therefore, 301 is usually used when migrating a site to a new domain name or forcing HTTPS, while 302 is usually used during site maintenance, when visitors need to be sent temporarily to a non-maintenance page.

So which redirect should a short link platform use? I think both have their place: 301 reduces server load, while 302 makes it easy to count how many times each link is actually visited.
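The trade-off can be illustrated with a toy model. The sketch below assumes a simplified browser that always remembers a 301 target and never remembers a 302 target; real browsers are subtler (cache headers and eviction both play a role), and all class names here are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the 301 vs 302 trade-off; not a real HTTP client or server
final class RedirectCachingDemo {

    // Toy short-link server that counts the requests that actually reach it
    static final class Server {
        final int status; // 301 or 302
        int hits = 0;

        Server(int status) {
            this.status = status;
        }

        String handle(String shortUrl) {
            hits++;
            return "https://example.com/original"; // placeholder target
        }
    }

    // Toy browser: remembers 301 targets, asks the server again after a 302
    static final class Browser {
        private final Map<String, String> remembered = new HashMap<>();

        String visit(Server server, String shortUrl) {
            String target = remembered.get(shortUrl);
            if (target != null) {
                return target; // answered from the browser's memory, the server never sees it
            }
            target = server.handle(shortUrl);
            if (server.status == 301) {
                remembered.put(shortUrl, target);
            }
            return target;
        }
    }

    // Requests reaching the server when one browser opens the same short link several times
    static int serverHits(int status, int visits) {
        Server server = new Server(status);
        Browser browser = new Browser();
        for (int i = 0; i < visits; i++) {
            browser.visit(server, "/1hJYkVByV0M");
        }
        return server.hits;
    }

    public static void main(String[] args) {
        System.out.println(serverHits(301, 5)); // 1
        System.out.println(serverHits(302, 5)); // 5
    }
}
```

With 301 the server sees only the first visit, which lowers load but undercounts clicks; with 302 every visit reaches the server and can be counted.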

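import org.springframework.http.HttpStatus;
import org.springframework.web.servlet.view.RedirectView;

// Create the RedirectView
RedirectView redirectView = new RedirectView(fullURL);
// 301 redirect
redirectView.setStatusCode(HttpStatus.MOVED_PERMANENTLY);
// 302 redirect (HttpStatus.FOUND is the non-deprecated name for the same status)
redirectView.setStatusCode(HttpStatus.MOVED_TEMPORARILY);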

In fact, if you look at HttpStatus, you can see that it enumerates many HTTP response status codes.

Maven deployment (Code implementation)

Finally, let’s look at the actual deployment and code. This is only meant to convey the idea, so the logic may not be rigorous in places. Real-world development should also add a Redis cache to avoid overloading the database.

MariaDB is the database, MyBatis Plus handles database access, and Spring Boot provides the framework and easy packaging.

Again: real-world development should also add a Redis cache to avoid overloading the database.
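As a rough sketch of that caching idea, the lookup below checks a cache first and only falls back to the database on a miss. Everything here is an assumption for illustration: a small in-process LRU map stands in for Redis, a Function stands in for the database query, and the class name CachedLookup is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache-aside lookup; an LRU map stands in for Redis
final class CachedLookup {
    private final Function<String, String> database; // stand-in for the real database query
    private final Map<String, String> cache;         // stand-in for Redis
    int databaseHits = 0;                             // exposed only to make the effect visible

    CachedLookup(Function<String, String> database, int capacity) {
        this.database = database;
        // Access-ordered LinkedHashMap that evicts the least recently used entry when full
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns the original link for a short code, or null if it is unknown
    synchronized String find(String shortCode) {
        String url = cache.get(shortCode);
        if (url == null) {
            databaseHits++;
            url = database.apply(shortCode);
            if (url != null) {
                cache.put(shortCode, url);
            }
        }
        return url;
    }

    public static void main(String[] args) {
        CachedLookup lookup = new CachedLookup(
                code -> "abc".equals(code) ? "https://example.com" : null, 100);
        lookup.find("abc");
        lookup.find("abc");
        System.out.println(lookup.databaseHits); // 1
    }
}
```

Unknown codes are not cached in this sketch, so repeated requests for a missing code still reach the database; a production setup would likely want to handle that case too.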

Dependencies

First, we create a project. Lombok is included to make it easy to generate getters and setters for the entity class:

<dependencies>
    <!-- Spring Boot -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- MariaDB driver -->
    <dependency>
        <groupId>org.mariadb.jdbc</groupId>
        <artifactId>mariadb-java-client</artifactId>
        <scope>runtime</scope>
    </dependency>
    <!-- MyBatis Plus -->
    <dependency>
        <groupId>com.baomidou</groupId>
        <artifactId>mybatis-plus-boot-starter</artifactId>
        <version>3.4.0</version>
    </dependency>
    <!-- Lombok plugin -->
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
    <!-- Test -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

Entity class

Let’s look at the entity class for a short link:

@Data
@NoArgsConstructor
public class ShortUrl {
    @TableId
    private Long id;
    private String baseUrl;
    private String suffixUrl;
    private String fullUrl;
    private String shortCode;
    @TableField(fill = FieldFill.INSERT)
    private Integer totalClickCount;
    @TableField(fill = FieldFill.INSERT)
    private Date expirationDate;
    
    public ShortUrl(String baseUrl, String suffixUrl, String fullUrl) {
        this.baseUrl = baseUrl;
        this.suffixUrl = suffixUrl;
        this.fullUrl = fullUrl;
    }
}

Where:

  • baseUrl: the domain of the original link provided by the user, for example: tool.mintimate.cn.
  • suffixUrl: the path and parameters of the link provided by the user, for example: /user?login=yes.
  • fullUrl: the original link provided by the user, for example: https://tool.mintimate.cn/user?login=yes.
  • shortCode: the generated short link code.
  • totalClickCount: the click count (a fill handler sets the default automatically).
  • expirationDate: the expiration date (a fill handler sets the default automatically).
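How a full link splits into baseUrl and suffixUrl can be sketched with java.net.URI. This is only a guess at the behavior described in the list above: the class UrlParts and its methods are hypothetical and are not the article's DomainUntil helper.

```java
import java.net.URI;

// Hypothetical helper that splits a link the way the field list above describes
final class UrlParts {

    // Host of the link, e.g. tool.mintimate.cn
    static String baseUrl(String fullUrl) {
        return URI.create(fullUrl).getHost();
    }

    // Path plus query string, e.g. /user?login=yes
    static String suffixUrl(String fullUrl) {
        URI uri = URI.create(fullUrl);
        String path = uri.getRawPath() == null ? "" : uri.getRawPath();
        String query = uri.getRawQuery();
        return query == null ? path : path + "?" + query;
    }

    public static void main(String[] args) {
        String full = "https://tool.mintimate.cn/user?login=yes";
        System.out.println(baseUrl(full));   // tool.mintimate.cn
        System.out.println(suffixUrl(full)); // /user?login=yes
    }
}
```

URI.create throws IllegalArgumentException for a malformed link, which is one possible place to reject bad input before anything is stored.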

Short link processing

First, make a controller that receives user requests:

// Receive the original link and return the short link
@ResponseBody
@PostMapping(value = "/add")
public ShortUrl encodeURL(@RequestParam(value = "UserURL") String UserURL){
    String Domain = DomainUntil.getDomain(UserURL);
    if (Domain==null) {
        return null;
    }
    return shortUrlService.saveURL(UserURL);
}

Next comes the business layer: we split the link into its parts, get a snowflake ID, convert it to base 62, and return the result:

@Resource
ShortUrlMapper shortUrlMapper;
@Override
public ShortUrl saveURL(String UserURL) {
    // Create a new object
    ShortUrl shortUrl=new ShortUrl(DomainUntil.getTopDomain(UserURL),DomainUntil.getFile(UserURL),UserURL);
    // Use Mybatis Plus to get the object's snowflake ID in advance
    Long ID= IdWorker.getId(shortUrl);
    // Convert to base 62
    String Short_URL_CODE = DecimalAndSixtyBinary.encode10To62(ID);
    shortUrl.setShortCode(Short_URL_CODE);
    int code=shortUrlMapper.insert(shortUrl);
    return shortUrl;
}

At this point, let’s test with Postman. As you can see, the test succeeds.

Short link redirection

Short link redirection is easy. We just write a request handler:

@ResponseBody
@RequestMapping(value = "/{UrlID}")
public RedirectView reverseURL(@PathVariable(value = "UrlID") String UrlID) {
    // The receive request parameter is String
    String fullURL=shortUrlService.findURL(UrlID);
    if (fullURL==null) {
        // Define a 404 page
        RedirectView redirectView = new RedirectView("/error/404");
        redirectView.setStatusCode(HttpStatus.NOT_FOUND);
        return redirectView;
    }
    else {
        RedirectView redirectView = new RedirectView(fullURL);
        redirectView.setStatusCode(HttpStatus.MOVED_PERMANENTLY);
        return redirectView;
    }
}

The findURL method of shortUrlService is a simple database query, so it is not shown here.

Demo

Based on the above ideas, I briefly built a Demo and deployed it on a personal server:

  • Front end: Vue based, using Element UI and Bootstrap
  • Back end: Spring Boot

We can test it with curl on Linux/macOS, for example directly from the Linux remote terminal of a Tencent Cloud Lightweight Application Server:

curl -I "https://curl.mintimate.ml/1Hjsg8wDe8i"

Ideas for improvement

The implementation in this article is a little rough. Here are some ideas for improving it:

  • Limit how often a single IP can make requests within a period of time: currently I enforce this in the Vue front end, but it is better to enforce it on the back end as well.
  • Database optimization: currently only MariaDB is used; adding a Redis cache would give a better experience and cope better with large volumes of requests.
  • Cron scheduled task: a snowflake ID converted to base 62 is still a bit long as a link, although it should be very safe. If you are willing to trade some safety for a shorter link, you can create a Cron scheduled task that invalidates old short links.
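For the first idea, back-end rate limiting could start from something like a fixed-window counter per IP. The sketch below is one possible approach and not what the demo currently does; the class name, the window size, and the choice to pass the current time in as a parameter (which keeps the example deterministic) are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical fixed-window rate limiter keyed by client IP
final class FixedWindowRateLimiter {
    private final int maxRequests;
    private final long windowMillis;
    // ip -> {start of the current window, requests seen in that window}
    private final Map<String, long[]> windows = new HashMap<>();

    FixedWindowRateLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    // Returns true if the request is allowed, false if the IP exceeded its quota for this window
    synchronized boolean allow(String ip, long nowMillis) {
        long[] window = windows.get(ip);
        if (window == null || nowMillis - window[0] >= windowMillis) {
            // First request from this IP, or the previous window has ended: start a new window
            windows.put(ip, new long[] {nowMillis, 1});
            return true;
        }
        if (window[1] < maxRequests) {
            window[1]++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        FixedWindowRateLimiter limiter = new FixedWindowRateLimiter(2, 1000);
        System.out.println(limiter.allow("203.0.113.7", 0));    // true
        System.out.println(limiter.allow("203.0.113.7", 10));   // true
        System.out.println(limiter.allow("203.0.113.7", 20));   // false
        System.out.println(limiter.allow("203.0.113.7", 1000)); // true, a new window has started
    }
}
```

In a controller, the caller would pass System.currentTimeMillis() and reject the request when allow returns false. The map in this sketch never removes idle IPs, so a long-running service would need some cleanup.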