This is the 29th day of my participation in the More Text Challenge. For more details, see more Text Challenge

The business logic

Publish the activity page links of small programs through various channels, such as multiple SMS attached links (channel is denoted as SMS1, SMS2, SMS3), or post qr codes of wechat small programs on posters (channel is denoted as qrcode1, qrcode2, qrcode3). Offline members can enter the activity page designated by the applet by scanning the QR code, or they can enter the applet by the applet link shared by other members (channel is marked as share). These different entry methods are collectively referred to in this article as the different channels, referred to as the channel field. Entering the active page from different channels generates a page visit record. It’s going to be in the page_view table.

Members enter the designated activity page of the small program, and after triggering a series of operations on the page, they will get corresponding feedback, such as getting points, or getting coupons, etc. This step is called participation. This data is recorded in an activity_record.

Now, the operation sister is asking for a data report. From what time and which channel did each member participate in the activity?

Data table structure

The name of the table member_id participate_time
activity_record Membership no. Activity Participation Time
The name of the table member_id channel view_time
page_view Membership no. channel Page access time

Query logic

Because each member can only participate in one event, which means that each member can only earn one point during the event, or receive one coupon, etc., this means that each member can only generate one Activity_record at most.

The page_view table, however, is recorded differently. Members may have received SMS links, and scan the activity QR code, and friends have shared the activity link, this, for this member, will generate multiple page visit records, namely in page_View generated multiple data.

If you think about it, members must go to the activity page through some channel before they can participate in the activity. That is, there are multiple page_view data in reverse order of view_time, and there is always one whose view_time is less than and closest to the participate_time of activity_record. The view_time of the next page_view is greater than the participate_time of activity_record.

SQL script

select c.member_id,c.view_time,.channel from (
SELECT
	member_id,
	SUBSTRING_INDEX( GROUP_CONCAT( view_time ORDER BY view_time DESC ), ', '.1 ) AS view_time,
	SUBSTRING_INDEX( GROUP_CONCAT( channel ORDER BY channel DESC ), ', '.1 ) AS channel
FROM
	page_view a LEFT JOIN activity_record b
        on a.member_id = b.member_id
        where a.view_time < b.participate_time
GROUP BY
	member_id) c;
Copy the code

The script that

  • GROUP_CONCAT: Using DISTINCT can exclude duplicate values; Group_concat ([distinct] name to join [ORDER by asC /desc] [separator ‘ ‘])
  • SUBSTRING_INDEX: string interception function. Substring_index (STR, delim, count). STR: string to be processed; Delim: separator. Counting the count: