How to update data via INSERT Overwrite

background

For most storage formats in big data, supporting random updates is complex. It scans large files. The latest Transactional table functionality introduced by MaxCompute supports UPDATE and DELETE statements, but the UPDATE and DELETE functionality is not suitable for high-frequency updates, deletes data, or real-time writes to the target table. At the same time, UPDATE and DELETE cannot be performed on non-transactional tables. This article focuses on how to update data via INSERT Overwrite.

1. Create a table to insert data

create table update_table(ID int,
 tranValue string,
 last_update_user string) PARTITIONED by(dt STRING ) LIFECYCLE 1;
INSERT INTO update_table PARTITION (dt="20210510") VALUES
(1, 'value_01', 'creation'),
(2, 'value_02', 'creation'),
(3, 'value_03', 'creation'),
(4, 'value_04', 'creation'),
(5, 'value_05', 'creation'),
(6, 'value_06', 'creation'),
(7, 'value_07', 'creation'),
(8, 'value_08', 'creation'),
(9, 'value_09', 'creation'),
(10, 'value_10','creation');

2. Update a piece of data

Update to value_011 when id is 1

INSERT OVERWRITE TABLE update_table PARTITION(dt) SELECT id,CASE WHEN id=1 THEN "value_011" ELSE TranValue END TranValue ,last_update_user ,dt FROM update_table WHERE dt = "20210510" ;

3. Update multiple pieces of data

According to the delta table update, first create the delta table insert data

create table update_table_inc(ID int, TranValue string, last_update_user string) LIFECYCLE 1; INSERT INTO update_table_inc VALUES (5, 'value_11', 'creation'), (6, NULL, '20170410'), (7, 'value22', '20170413'); ID is 5 and 7 update TranValue, INSERT OVERWRITE TABLE UPDATE_TABLE PARTITION(DT) SELECT A.ID,CASE WHEN A.ID = B.ID AND b.TranValue is not null THEN b.TranValue ELSE a.TranValue END TranValue ,CASE WHEN a.id=b.id and b.TranValue is not null  THEN b.last_update_user ELSE a.last_update_user END last_update_user ,dt FROM update_table a LEFT JOIN update_table_inc  b ON a.id = b.id WHERE a.dt = "20210510" ;

4. Delete data

Mysql > delete data

INSERT OVERWRITE TABLE update_table PARTITION( dt) SELECT * FROM update_table WHERE dt = "20210510" and id ! = 4;

This article is the original content of Aliyun, shall not be reproduced without permission.