For the application scenario of “nearby people” in the area of location service, PG, MySQL, MongoDB and other DB spatial indexes can be commonly used to implement. However, Redis takes a new approach, combining its ordered queue Zset and GeoHash encoding to realize the spatial search function with high operation efficiency.

This paper will analyze the algorithm principle from the source point of view, and calculate the query time complexity.

To provide a complete “nearby people” service, the most basic is to achieve “add”, “delete”, “search” functions. In the following sections, the query function will be analyzed.

Starting with Redis 3.2, Redis provides location-dependent functionality based on geohash and ordered collections. The Redis Geo module contains the following six commands:

GEOADD: Adds the given location object (latitude, longitude, name) to the specified key; GEOPOS: Returns the position (longitude and latitude) of all objects in a given position from the key; GEODIST: Returns the distance between two given locations; GEOHASH: Returns a GEOHASH representation of one or more positional objects; GEORADIUS: Centered on a given latitude and longitude, return all position objects in the target set whose distance from the center does not exceed the given maximum distance; GEORADIUSBYMEMBER: Centered on a given location object, returns all location objects that are within a given maximum distance from it. The combination of GEOADD and GEORADIUS enables the basic functions of “add” and “search” in “Nearby People.”

To implement the “nearby people” function in wechat, you can directly use the GEORADIUSBYMEMBER command. The “given location object” is the user himself, and the search object is other users.

In essence, GEORADIUSBYMEMBER = GEOPOS + GEORADIUS searches the user’s location first and then other nearby user objects that meet the condition of distance between locations.

The following will analyze GEOADD and GEORADIUS commands from the perspective of source code, and analyze their algorithm principles.

The Redis Geo operation only contains “add” and “search” operations, there is no special “delete” command. This is mainly because Redis internally uses ordered sets (Zsets) to store location objects, which can be deleted by ZREM.

In the Redis source geo. C file annotation, only stated that this file is the GEOADD, GEORADIUS and GEORADIUSBYMEMBER implementation file (actually also implemented the other three commands). The other three commands are auxiliary commands.

Recommend the following Spring Boot actual combat projects:

Github.com/YunaiV/ruoy…

How to use GEOADD

GEOADD key longitude latitude member [longitude latitude member ...]
Copy the code

Adds the given location object (latitude, longitude, name) to the specified key.

Where, key is the set name, and member is the object corresponding to the latitude and longitude. In practical application, when the number of objects to be stored is too large, multiple keys can be set (for example, one key is saved) to sharding the object set in a disguised way to avoid excessive number of single sets.

Return value after successful insertion:

(integer) N Where N is the number of successful inserts.

Source code analysis

/* GEOADD key long lat name [long2 lat2 name2... longN latN nameN] */ void geoaddCommand(client *c) {// GEOADD key long lat name [long2 lat2 name2... longN latN nameN] */ void geoaddCommand(client *c) arguments number for sanity. */ if ((c->argc - 2) % 3 ! = 0) { /* Need an odd number of arguments if we got this far... */ addReplyError(c, "syntax error. Try GEOADD key [x1] [y1] [name1] " "[x2] [y2] [name2] ... "); return; Redis int elements = (c-> argc-2) / 3; int argc = 2+elements*2; /* ZADD key score ele ... */ robj **argv = zcalloc(argc*sizeof(robj*)); argv[0] = createRawStringObject("zadd",4); argv[1] = c->argv[1]; /* key */ incrRefCount(argv[1]); /* Create the argument vector to call ZADD in order to add all * the score,value pairs to the requested zset, where score is actually * an encoded version of lat,long. */ int i; for (i = 0; i < elements; i++) { double xy[2]; If (extractLongLatOrReply(c, (c->argv+2)+(I *3),xy) == C_ERR) {for (I = 0; i < argc; i++) if (argv[i]) decrRefCount(argv[i]); zfree(argv); return; /* coordinates Turn the coordinates into the score of the element. */ GeoHashBits hash; geohashEncodeWGS84(xy[0], xy[1], GEO_STEP_MAX, &hash); GeoHashFix52Bits bits = geohashAlign52Bits(hash); robj *score = createObject(OBJ_STRING, sdsfromlonglong(bits)); robj *val = c->argv[2 + i * 3 + 2]; Argv [2+ I *2] = score; argv[3+i*2] = val; incrRefCount(val); } / / call zadd command, stored into good object / * Finally call zadd that will do the work for us. * / replaceClientCommandVector (c, arg c, argv); zaddCommand(c); }Copy the code

From the source code analysis, it can be seen that Redis internally uses zset to store location objects. Each element in the ordered set is an object with location, and the score value of the element is the 52-bit geohash value corresponding to its latitude and longitude.

The precision of double is 52 bits; Geohash is base32 encoded and can store up to 10 bits of geohash value corresponding to a 0.60.6 meter grid. In other words, the position converted by Redis Geo will theoretically have an error of 0.31.414=0.424 meters.

A brief summary of GEOADD command is as follows: 1. Parameter extraction and verification; 2. Convert the latitude and longitude of the input parameter to a 52-bit geohash value (score); 3. Call ZADD command to store member and its corresponding score into the set key.

I would like to recommend some practical projects of Spring Cloud:

Github.com/YunaiV/onem…

GEORADIUS use mode

GEORADIUS key longitude latitude radius m|km|ft|mi [WITHCOORD] [WITHDIST] [WITHHASH] [ASC|DESC] [COUNT count] [STORE key] [STORedisT key]
Copy the code

Returns all position objects in the target set that are within a given maximum distance from the center, centered on a given latitude and longitude.

Range of the unit: m | km | ft | mi – – | | | km feet > m miles

Additional parameters:

WITHDIST: The distance between the location object and the center is also returned. The units of distance are the same as the units of range given by the user. WITHCOORD: Returns the longitude and dimension of the position object as well. WITHHASH: Returns the original Geohashed ordered set score of the position object as a 52-bit signed integer. This option is mainly used for low-level applications or debugging, and is not useful in practice. ASC | DESC: return from near to far place object element | return from far to near location object elements. -count COUNT: selects the first N elements of the matching position object. – STORE Key: Saves the geographical location information of the returned result to the specified key. – STORedisT key: Stores the distance between the returned result and the center point to a specified key. Due to the existence of STORE and STORedisT, GEORADIUS and GEORADIUSBYMEMBER commands are technically marked as write commands, so that only the primary instance is queried (written). High QPS may cause excessive read and write pressure on the primary instance.

To solve this problem, GEORADIUS_RO and GEORADIUSBYMEMBER_RO are added in Redis 3.2.10 and Redis 4.0.0 respectively.

However, the author found that in the actual development on the Java package Redis. Clients. Jedis. Params. GeoRadiusParam parameters of geo class does not contain STORE and STORedisT two parameters option, Is it true that only the primary instance is queried when georadius is called, or is it read-only encapsulation? Interested friends can do their own research.

Return value after successful query:

Returns a member List without the WITH qualification, as in:

[“member1″,”member2″,”member3”] WITH WITH, member list each member is a nested list, e.g.

[
 ["member1", distance1, [longitude1, latitude1]]
 ["member2", distance2, [longitude2, latitude2]]
]
Copy the code

Source code analysis this section of the source code is longer, can not go down to see Chinese notes directly, or directly jump to the summary part

/* GEORADIUS key x y radius unit [WITHDIST] [WITHHASH] [WITHCOORD] [ASC|DESC] * [COUNT count] [STORE key] [STORedisT key] * GEORADIUSBYMEMBER key member radius unit ... options ... */ void georadiusGeneric(client *c, int flags) { robj *key = c->argv[1]; robj *storekey = NULL; int stoRedist = 0; Robj *zobj = NULL; /* 0 for STORE, 1 for STORedisT. if ((zobj = lookupKeyReadOrReply(c, key, shared.null[c->resp])) == NULL || checkType(c, zobj, OBJ_ZSET)) { return; } int base_args; double xy[2] = { 0 }; If (flags & RADIUS_COORDS) {... } double radius_meters = 0, conversion = 1; if ((radius_meters = extractDistanceOrReply(c, c->argv + base_args - 2, &conversion)) < 0) { return; Int withdist = 0, withHash = 0, withCoords = 0; int sort = SORT_NONE; long long count = 0; if (c->argc > base_args) { ... . } / / STORE and STORedisT parameter if (storekey && (withdist | | withhash | | withcoords)) {addReplyError (c, "STORE option in GEORADIUS is not compatible with " "WITHDIST, WITHHASH and WITHCOORDS options"); return; } // set sort if (count! = 0 && sort == SORT_NONE) sort = SORT_ASC; / / the center and radius is used to calculate the target area range GeoHashRadius georadius = geohashGetAreasByRadiusWGS84 (x, y [0], xy [1], radius_meters); GeoArray *ga = geoArrayCreate(); geoArray *ga = geoArrayCreate(); membersOfAllNeighbors(zobj, georadius, xy[0], xy[1], radius_meters, ga); /* If no matching results, the user gets an empty reply. */ if (ga->used == 0 && storekey == NULL) { addReplyNull(c); geoArrayFree(ga); return; } // Some return values set and return...... geoArrayFree(ga); }Copy the code

The core steps in the code above are “compute the center point range” and “search the center point and the eight geoHash grid areas around it.”

The corresponding is geohashGetAreasByRadiusWGS84 and membersOfAllNeighbors two functions.

Let’s look at them in turn:

Range of computing center points:

// geohash_helper.c GeoHashRadius geohashGetAreasByRadiusWGS84(double longitude, double latitude, double radius_meters) { return geohashGetAreasByRadius(longitude, latitude, radius_meters); } // Return 9 geohashboxes GeoHashRadius geohashGetAreasByRadius geohashGetAreasByRadius(Double longitude, double Latitude, Double radius_meters) {// Set some parameters GeoHashRange long_range, lat_range; GeoHashRadius radius; GeoHashBits hash; GeoHashNeighbors neighbors; GeoHashArea area; double min_lon, max_lon, min_lat, max_lat; double bounds[4]; int steps; GeohashBoundingBox (longitude, latitude, radius_meters, bounds) geohashBoundingBox(longitude, latitude, radius_meters, bounds); min_lon = bounds[0]; min_lat = bounds[1]; max_lon = bounds[2]; max_lat = bounds[3]; // Calculate the geohash precision (bits) of the 9 search boxes with the query based on the latitude and radius of the central point of the target region The smaller the digits) steps = geohashEstimateStepsByRadius (radius_meters, latitude); // Set the longitude-longitude Max/min: -180<=longitude<=180, -85<=latitude<=85 geohashGetCoordRange(&long_range,&lat_range); / / the thesis& latitude and longitude (steps) encoded geohash value specified precision geohashEncode (& long_range, & lat_range, longitude, latitude, steps, & hash); GeohashNeighbors (&hash,&neighbors); geohashNeighbors(&hash,&neighbors); geohashNeighbors(&hash,&neighbors); GeohashDecode (long_range,lat_range,hash,&area); geohashDecode(long_range,lat_range,hash,&area); // Some special cases are handled...... // Build and return the result radius.hash = hash; radius.neighbors = neighbors; radius.area = area; return radius; } Find the center point and eight geoHash grid areas around it: Int membersOfAllNeighbors(robj *zobj, GeoHashRadius n, double lon, double lat, double radius, geoArray *ga) { GeoHashBits neighbors[9]; unsigned int i, count = 0, last_processed = 0; int debugmsg = 0; HashBox Neighbors [0] = n.hash; ... neighbors[8] = n.neighbors.south_west; // Search for target point in each hashBox for (I = 0; i < sizeof(neighbors) / sizeof(*neighbors); i++) { if (HASHISZERO(neighbors[i])) { if (debugmsg) D("neighbors[%d] is zero",i); continue; If (last_processed && neighbors[I]. Bits == neighbors[last_processed].bits &&) neighbors[i].step == neighbors[last_processed].step) { continue; Count += membersOfGeoHashBox(Zobj, neighbors[I], GA, lon, lat, radius); last_processed = i; } return count; } int membersOfGeoHashBox(robj *zobj, GeoHashBits hash, geoArray *ga, double lon, double lat, Double radius) {GeoHashFix52Bits (52 bits) GeoHashFix52Bits min, Max; scoresOfGeoHashBox(hash,&min,&max); Return geoGetPointsInRange(zobj, min, Max, LON, Lat, radius, GA); } int geoGetPointsInRange(robj *zobj, double min, double max, double lon, double lat, double radius, GeoArray *ga) {zrangespec Range = {.min = min,.max = Max,.minex = 0, .maxex = 1 }; size_t origincount = ga->used; sds member; If (zobj->encoding == OBJ_ENCODING_ZIPLIST) {if (zobj->encoding == OBJ_ENCODING_ZIPLIST) { } else if (zobj->encoding == OBJ_ENCODING_SKIPLIST) { zset *zs = zobj->ptr; zskiplist *zsl = zs->zsl; zskiplistNode *ln; // Get the first element in the hashBox (skip table data structure, efficient than binary lookup tree), If ((ln = zslFirstInRange(ZSL, &range)) == NULL) {/* Nothing exists starting at our min.no results.  While (ln) {SDS ele = ln->ele; /* Abort when the node is no longer in range. */ if (! zslValueLteMax(ln->score, &range)) break; Ele = sdsdup(ele); if (geoAppendIfWithinRadius(ga,lon,lat,radius,ln->score,ele) == C_ERR) sdsfree(ele); ln = ln->level[0].forward; } } return ga->used - origincount; } int geoAppendIfWithinRadius(geoArray *ga, double lon, double lat, double radius, double score, sds member) { double distance, xy[2]; // Decode error, return error if (! decodeGeohash(score,xy)) return C_ERR; /* Can't decode. */ / Final distance check (calculate spherical distance to see if less than radius) if (! geohashGetDistanceIfInRadiusWGS84(lon,lat, xy[0], xy[1], radius, &distance)) { return C_ERR; GeoPoint *gp = geoArrayAppend(ga); gp->longitude = xy[0]; gp->latitude = xy[1]; gp->dist = distance; gp->member = member; gp->score = score; return C_OK; }Copy the code

Leaving aside the many optional parameters, here’s a quick summary of how GEORADIUS uses geoHash to get the target location object:

1. Parameter extraction and verification;

2. Use the center point and input radius to calculate the scope of the area to be checked. The range parameters include the highest geoHash mesh level (accuracy) that satisfies the condition and the corresponding nine-grid position that can cover the target region; (More on that later)

Iterate over the grid, selecting position objects based on the range box of each GeoHash grid. Further find the object whose distance from the center point is less than the input radius and return. Algorithm analysis why to use this algorithm strategy for query, or the advantage of this strategy, let us analyze in the way of question and answer.

Why find the highest geoHash grid level that satisfies the criteria? Why use the nine-grid?

This is actually a problem, essentially a preliminary filter of all element objects. In a multi-tier GeoHash grid, each lower-level GeoHash grid is made up of four higher-level meshes. In other words, the higher the geoHash grid level, the smaller the geographic range covered. When we calculate the highest grade of the nine grids (grids) that can cover the target area based on the input radius and center point position, we have screened the extra elements of the nine grids.

The main reason why nine grids are used here instead of a single grid is to avoid the boundary case and narrow the query area as much as possible. If you look at a zero latitude and longitude center, you would have to look at the entire region of the earth if you looked at a single grid covering just one meter. Extending a circle in eight directions avoids this problem.

How do I select element objects from the scope box of the GeoHash grid? How efficient is it?

First, the GeoHash value in each geoHash grid is continuous and has a fixed range. So you just have to find the location objects in that range in the ordered set. It has query efficiency similar to binary search tree, and the average operation time complexity is O(log(N)). And all the elements at the bottom are arranged in a linked list.

So when querying, you just need to find the first value in the collection that is in the target GeoHash grid, and then compare it in sequence instead of searching multiple times.

The reason why nine grids cannot be searched together and traversed one by one is that the geohash values corresponding to each grid of nine grids do not have continuity. Only continuous, query efficiency will be high, otherwise to do a lot of distance calculation.

To sum up, we have analyzed the detailed process of “GEOADD” and “GEORADIUS” in Redis Geo module from the perspective of source code. And the function of GEORADIUS searching for nearby people in Redis can be calculated, and the time complexity is O(N+log(M)).

Where N is the number of positional elements within the specified radius, and M is the number of elements encircled by nine grids to calculate the distance. Combined with the memory – based storage characteristics of Redis itself, it has very high operation efficiency in the actual use process.