Android incremental update is a very common function in the application market, and many games and other apps will use incremental update to upgrade version, can be said to be a common and mature technology.

What is incremental update

Incremental update is based on the differential update algorithm BSDiff. Based on the difference between the two APK bytecodes, the patch package is generated on the server side. Then, the client uses the same algorithm to combine the installed APK with the patch package to generate the updated APK for installation. In this way, the download time of app version upgrade is reduced and the update efficiency is improved.

What are the benefits of incremental updates?

For most apps in the current market, the VOLUME of APK is generally around 100M, assuming the network speed of 4M/s, then the full update time is half a minute. As for incremental update, the patch package required is related to the scope of app changes of the new and old versions, and the download volume can be reduced by more than half in most of the time.

Bsdiff algorithm principle

Since Bsdiff is a difference update, the core of this algorithm is to find the difference.

First, bsdiff records where the last string of each string group grouped by a prefix begins in OLD, thus forming a dictionary of all the substrings in the OLD file

static void qsufsort(long *I, long *V, u_char *pold, long oldsize) { long buckets[256]; long i, h, len; for (i = 0; i < 256; i++) buckets[i] = 0; for (i = 0; i < oldsize; i++) buckets[pold[i]]++; for (i = 1; i < 256; i++) buckets[i] += buckets[i - 1]; for (i = 255; i > 0; i--) buckets[i] = buckets[i - 1]; buckets[0] = 0; for (i = 0; i < oldsize; i++) I[++buckets[pold[i]]] = i; I[0] = oldsize; for (i = 0; i < oldsize; i++) V[i] = buckets[pold[i]]; V[oldsize] = 0; for (i = 1; i < 256; i++) if (buckets[i] == buckets[i - 1] + 1) I[buckets[i]] = -1; I[0] = -1; for (h = 1; I[0] ! = -(oldsize + 1); h += h) { len = 0; for (i = 0; i < oldsize + 1;) { if (I[i] < 0) { len -= I[i]; i -= I[i]; } else { if (len) I[i - len] = -len; len = V[I[i]] + 1 - i; split(I, V, i, len, h); i += len; len = 0; }; }; if (len) I[i - len] = -len; }; for (i = 0; i < oldsize + 1; i++) I[V[i]] = i; }Copy the code

Then, compare the old file with the new file to produce DiffString and Extra String

while (scan < newsize) { oldscore = 0; For (SCSC = scan += len; scan < newsize; scan++) { ... }; if ((len ! = oldscore) | | (scan = = newsize)) {/ / find the length of the different code again... For (I = 0; I = 0; I = 0; i < lenf; i++) db[dblen + i] = pnew[lastscan + i] - pold[lastpos + i]; for (i = 0; i < (scan - lenb) - (lastscan + lenf); i++) eb[eblen + i] = pnew[lastscan + lenf + i]; Dblen += lenf; eblen += (scan - lenb) - (lastscan + lenf); // Write the location of the difference to pfBz2}; };Copy the code

Finally, the DiffString, Extra String and corresponding control word are compressed into a patch package by ZIP

If ((pfbz2 = BZ2_bzWriteOpen(&bz2err, pf, 9, 0, 0)) == NULL) errx(1, "BZ2_bzWriteOpen," bz2err = %d", bz2err); BZ2_bzWrite(&bz2err, pfbz2, db, dblen); if (bz2err ! = BZ_OK) errx(1, "BZ2_bzWrite, bz2err = %d", bz2err); BZ2_bzWriteClose(&bz2err, pfbz2, 0, NULL, NULL); if (bz2err ! = BZ_OK) errx(1, "BZ2_bzWriteClose, bz2err = %d", bz2err); . /* if ((pfbz2 = BZ2_bzWriteOpen(&bz2err, pf, 9, 0, 0)) == NULL) errx(1, "BZ2_bzWriteOpen, bz2err = %d", bz2err); BZ2_bzWrite(&bz2err, pfbz2, eb, eblen); if (bz2err ! = BZ_OK) errx(1, "BZ2_bzWrite, bz2err = %d", bz2err); BZ2_bzWriteClose(&bz2err, pfbz2, 0, NULL, NULL); if (bz2err ! = BZ_OK) errx(1, "BZ2_bzWriteClose, bz2err = %d", bz2err); /* if (fseek(pf, 0, SEEK_SET)) err(1, "fseeko"); if (fwrite(header, 32, 1, pf) ! = 1) err(1, "fwrite(%s)", argv[3]); if (fclose(pf)) err(1, "fclose");Copy the code

Finally, the file that PF points to is the newly generated Patch package. It can be seen that in this algorithm, a large number of bzip2 packages are used to compress the volume of patch.

How to update incrementally

Download the bsdiff source file

Because the download link of the official website cannot be opened normally at present, so I found a compiled project on the Internet on Windows. After downloading, it is a compiled c language program. Bsdiff. C bspatch. C and other source code can be seen in source.

Test incremental updates

Since BSDIff is a binary based incremental update algorithm, this algorithm can be applied to any file type, including but not limited to APK, TXT, JPG, etc. So this time, I did a simple test with TXT text.

Write a simple old.txt file, just write a few words in it

After saving successfully, we will create a new. TXT

At this point, we have both the new and old versions of the document, and we can use BSDIff to incrementally update the patch package

If we enter bsdiff in the command window and the desired file name is not given, it prompts for the parameter format. So let’s fill in the names of the new and old TXT files as prompted, and call the update package patch

After the command is executed, we can see that the patch file already exists in the folder. With this patch package, we can use bspatch to generate a new document from old.txt through patch. At this time, we name it result.txt

If we enter only BSPatch, we will also be prompted with parameters

When the command is executed, we can see that result. TXT already exists in the folder

When you open it, you can see the same content as new.txt.

Three, incremental update combat

Now we understand how incremental update works. In the actual business scenario, the simple description is that the server generates the patch file based on the new and old version, the client downloads the corresponding patch file, generates the apK of the latest version together with the local APK, and finally executes the installation command.

The process is that simple and involves at most some NDK development knowledge

In addition to bspatch.c, we need to introduce the source code for bzip2

Using the BsPatchUtils utility class, we can refer to the methods in bspatch.c

After obtaining the local apK currently installed and the patch file downloaded, we can call the BsPatchUtils tool class to generate the latest version of APK, and then directly install according to the APK file as the full update.

One final thought

Incremental update can only be applied to apK of two specified versions. In reality, there are a large number of versions running online, so different patch files are needed to correspond to multiple versions, and the background needs to generate and configure many files according to the version.

In addition, Bsdiff is a patch file generated based on binary comparison. Therefore, if the client installs the cracked APK, it cannot be updated. Therefore, it is better to perform MD5 verification on the local APK file, patch file, and generated APK file.

Author: Li Yi, Research and development Center