This article aims to improve your SQL skills through easy-to-understand business scenarios. I have fully understood all the SQL in the article, and can flexibly deal with all kinds of SQL scenarios encountered in work and all kinds of SQL problems encountered in the interview. MySQL > Hive SQL > Hive SQL > Hive SQL > Hive SQL > Hive SQL In the process of practicing SQL before, I found some good SQL topics on the Internet, but I am frustrated that the SQL provided by many blogs is not readable, normative and efficient, so I have this blog post today. If I find SQL that can be optimized later, I will continue to update this blog post.

Data Table Introduction

Students table

create table study.student (
     student_id   string  -- Student Id
    ,name         string  -- Student name
    ,birthday     string  -- Student's birthday
    ,sex          string  -- Gender of students
)
stored as parquet
tblproperties("orc.compress"="snappy");
Copy the code

Teacher’s table

create table study.teacher (
     teacher_id   string  -- Teacher No.
    ,name         string  -- Name of teacher
)
stored as parquet
tblproperties("orc.compress"="snappy");
Copy the code

The curriculum

create table study.course (
     course_id    string  -- Course No.
    ,name         string  - of course
    ,teacher_id   string  -- Teacher number corresponding to the course
)
stored as parquet
tblproperties("orc.compress"="snappy");
Copy the code

League tables

create table study.score (
     student_id   string  -- Student Id
    ,course_id    string  -- Course No.
    ,score        int     -- Corresponding grades
)
stored as parquet
tblproperties("orc.compress"="snappy");
Copy the code

Please note:

  • It is considered that the courses may be electives and students may not have taken all the courses
  • All the courses in the transcript should exist in the transcript, and each course should have a corresponding instructor, who can be found in the transcript
  • The students in the transcript should also be in the transcript
  • The following import data is also written randomly, or readers can create their own random data import

Import data

Students table

insert overwrite table study.student VALUES
('01' , 'zhao' , '1990-01-01' , 'male'),
('02' , 'money electricity' , '1990-12-21' , 'male'),
('03' , 'Sun Feng' , '1990-12-20' , 'male'),
('04' , 'liu yun' , '1990-12-06' , 'male'),
('05' , 'Zhou Mei' , '1991-12-01' , 'woman'),
('06' , 'nuss' , '1992-01-01' , 'woman'),
('07' , 'Zheng Zhu' , '1989-01-01' , 'woman'),
('09' , 'Joe' , '2017-12-20' , 'woman'),
('10' , 'bill' , '2017-12-25' , 'woman'),
('11' , 'bill' , '2012-06-06' , 'woman'),
('12' , 'Daisy' , '2013-06-13' , 'woman'),
('13' , 'seven sun' , '2014-06-01' , 'woman');
Copy the code

Teacher’s table

insert overwrite table study.teacher VALUES
('01' , 'Joe'),
('02' , 'bill'),
('03' , 'Cathy');
Copy the code

The curriculum

insert overwrite table study.course VALUES
('01' , 'Chinese' , '02'),
('02' , 'mathematics' , '01'),
('03' , 'English' , '03');
Copy the code

League tables

insert overwrite table study.score VALUES
('01' , '01' , 80),
('01' , '02' , 90),
('01' , '03' , 99),
('02' , '01' , 70),
('02' , '02' , 60),
('02' , '03' , 80),
('03' , '01' , 80),
('03' , '02' , 80),
('03' , '03' , 80),
('04' , '01' , 50),
('04' , '02' , 30),
('04' , '03' , 20),
('05' , '01' , 76),
('05' , '02' , 87),
('06' , '01' , 31),
('06' , '03' , 34),
('07' , '02' , 89),
('07' , '03' , 98);
Copy the code

exercises

  1. Query full information of all teachers (teacher number and name)

  2. Print all information about male students

  3. Select * from student where name = ‘boy’

  4. Query all students’ information, sort first by gender, then by birthday descending order

  5. Find the total number of students

  6. Query the number of boys and girls in a student

  7. Find out each student’s student number and how many courses they have taken

  8. Retrieves the student numbers of students enrolled in at least three courses

  9. Query for failed course numbers

  10. Output course numbers, course names, and corresponding teacher names for all courses

  11. Find the number of students in each course and the average score, output the course number, corresponding number of students, the average score

  12. Find the number of students in each course and the average score, output the course name, the corresponding number of students, the average score

  13. Query student numbers and course scores of both “01” and “02”

  14. Query the student numbers and course scores of “01” with higher grades than “02”

  15. Query the names of students with higher grades in “01” and their scores in “01” and “02”

  16. Query name of student who selected course “01” but not course “02”

  17. Select name, date of birth, gender from all students who took ‘John’ class

  18. Select student id from ’01’, ’02’ and student scores from ’01’ and ’02

  19. Query student id and grade of student enrolled in 01 but not enrolled in 02

  20. Select student ID from student who is enrolled in “02” but not enrolled in “01” and student score from “02”

  21. Query student’s name, total number of courses selected, total scores of all courses, course average

  22. Select * from student whose average score is greater than 60; select * from student whose average score is greater than 60; select * from student whose average score is greater than 60

  23. Retrieve the information of students whose “01” course score is less than 60 and their “01” course score in descending order

  24. Query the name and grade average of students who failed two or more courses

  25. Select * from student where not enrolled in all courses

  26. Query the list of students born in 1990

  27. Query student information whose name contains “wind”

  28. Query the number of teachers named Li

  29. Select * from student where at least two courses have the same student ID as student id “01”

  30. Query the name and name of the student ID enrolled in all courses

  31. Select * from student where id = ’01’ and id = ’01’

  32. Statistics on the number of students in each score section of each subject: course number, course name, [100-85], [85-70], [70-60], [60-0] and percentage

  33. Query students’ total scores and rank them

  34. Query the records of top 3 students in each subject

  35. Select student ID and name from only two courses

  36. Query the grade point average of each course, the results are arranged in descending order by grade point average, the same grade point average, by course number ascending order

  37. Query student id, name, and gpa of all students whose gpa is 85 or greater

  38. Select * from student whose course name is math and whose score is less than 60

  39. Select student id and name from student whose course id is 01 and whose course score is 80 or above

  40. Query student id, course id, and grade of student who took the course taught by teacher “Joe”

  41. Count the number of students enrolled in each course (only for courses with more than 5 students).

  42. Query student number, course number, student grade where the student has the same grade in different courses

Answer (please think independently before referring to the answer)

  1. Query full information of all teachers (teacher number and name)

    select *
      from study.teacher
    ;
    Copy the code
  2. Print all information about male students

    select *
      from study.student
    where sex = 'male'
    ;
    Copy the code
  3. Select * from student where name = ‘boy’

    select *
      from study.student
    where sex = 'male'
    order by birthday desc
    ;
    Copy the code
  4. Query all students’ information, sort first by gender, then by birthday descending order

    select *
      from study.student
    order by sex
            ,birthday desc
    ;
    Copy the code
  5. Find the total number of students

    select count(*)
      from study.student
    ;
    Copy the code
  6. Query the number of boys and girls in a student

    select sex 
          ,count(*)
      from study.student
    group by sex
    Copy the code
  7. Find out each student’s student number and how many courses they have taken

    select student_id
        ,count(*) as course_count
      from study.score
    group by student_id
    ;
    Copy the code
  8. Retrieves the student numbers of students enrolled in at least three courses

    select student_id
          ,count(*) as course_num
      from score
    group by student_id
    having course_num >= 3
    ;
    Copy the code
  9. Query for failed course numbers

    - write a
    select distinct course_id
      from score
    where score < 60
    
    - write two
    select course_id
      from score
    where score< 60
    group by course_id
    Copy the code
  10. Output course numbers, course names, and corresponding teacher names for all courses

    select course_id
          ,a.name as course_name
          ,b.name as teacher_name
      from 
        (
            select course_id 
                  ,name 
                  ,teacher_id
              from study.course
        ) a 
    join 
        (
            select teacher_id 
                  ,name
              from study.teacher
        ) b 
      on a.teacher_id = b.teacher_id
    Copy the code
  11. Find the number of students in each course and the average score, output the course number, corresponding number of students, the average score

    select course_id
          ,count(*) as student_count
          ,avg(score) as avg_score
      from score
    group by course_id
    Copy the code
  12. Find the number of students in each course and the average score, output the course name, the corresponding number of students, the average score

    select name
          ,student_count
          ,avg(score) as avg_score
      from 
        (
            select *
              from study.course
        ) a 
    join
        (
            select course_id
                  ,count(*) as student_count
              from score
            group by course_id
    
        ) b
      on a.course_id = b.course_id
    Copy the code
  13. Query student numbers and course scores of both “01” and “02”

    select a.student_id
          ,a.score as score_01
          ,b.score as score_02
      from 
        (
          select * 
            from score 
            where course_id = '01'
        ) a  
    join
        (
          select * 
            from score 
            where course_id = '02'
        ) b 
      on a.student_id = b.student_id;
    Copy the code
  14. Query the student numbers and course scores of “01” with higher grades than “02”

    select a.student_id
          ,a.score as score_01
          ,b.score as score_02
      from 
        (
          select * 
            from score 
            where course_id = '01'
        ) a  
    join
        (
          select * 
            from score 
            where course_id = '02'
        ) b 
      on a.student_id = b.student_id
    where a.score > b.score
    Copy the code
  15. Query the names of students with higher grades in “01” and their scores in “01” and “02”

    • The three consecutive questions are related, and the first two questions pave the way for this question, so when we get a relatively complex requirement, we can split the requirement into 1 and 2, and finally reach the complete requirement
    select name
          ,score_01
          ,score_02
      from
        (
            select *
              from student
        ) a 
    join      
        (
          select a.student_id
                ,a.score as score_01
                ,b.score as score_02
            from 
              (
                select * 
                  from score 
                  where course_id = '01'
              ) a  
          join
              (
                select * 
                  from score 
                  where course_id = '02'
              ) b 
            on a.student_id = b.student_id
          where a.score > b.score
        ) b 
      on a.student_id=b.student_id
    Copy the code
  16. Query name of student who selected course “01” but not course “02”

    • First of all, we can find out the number of the student who chose “01” course but not “02” course, and then find the name of the student with the student number associated with the student table
    Find the number of the student who chose "01" but not "02"
    select a.student_id
      from 
        (
          select * 
            from score 
            where course_id = '01'
        ) a  
    left outer join
        (
          select * 
            from score 
            where course_id = '02'
        ) b 
      on a.student_id = b.student_id
    where b.student_id is null
    Copy the code
    -- Take the student number to find the name
    select name
      from 
        (
            select a.student_id as student_id
              from 
                (
                  select * 
                    from score 
                    where course_id = '01'
                ) a  
            left outer join
                (
                  select * 
                    from score 
                    where course_id = '02'
                ) b 
              on a.student_id = b.student_id
              where b.student_id is null
        ) a 
    join 
        (
            select *
              from student
        ) b 
      on a.student_id = b.student_id
    Copy the code
  17. Select name, date of birth, gender from all students who took ‘John’ class

    • Their thinking
      • Find out the teacher number of Teacher ‘Zhang SAN’
      • Find out the course numbers of all of Mr. Joe’s professors
      • Find the student numbers for these courses
      • Find the corresponding student information according to the student id
    • Follow the above four steps to write the SQL
    select name
          ,birthday
          ,sex
      from 
        (
            select student_id
              from 
                (
                    select course_id
                      from 
                        (
                            select teacher_id
                              from teacher
                            where name='Joe'
                        ) a
                    join
                        (
                            select course_id
                                  ,teacher_id    
                              from course  
                        ) b
                      on a.teacher_id = b.teacher_id
                ) a 
            join 
                (
                    select student_id
                          ,course_id 
                      from score            
                ) b 
              on a.course_id = b.course_id
            group by student_id
        ) a 
    join 
        (
            select * 
              from student
        ) b 
      on a.student_id = b.student_id
    Copy the code
  18. Select student id from ’01’, ’02’ and student scores from ’01’ and ’02

    select a.student_id as student_id
          ,a.score as score_01
          ,b.score as score_02
      from 
        (
            select * 
              from score 
            where course_id = '01'
        ) a  
    join
        (
          select * 
            from score 
            where course_id = '02'
        ) b 
      on a.student_id = b.student_id;
    Copy the code
  19. Query student id and grade of student enrolled in 01 but not enrolled in 02

    select a.student_id as student_id
          ,a.score as score_01
      from 
        (
            select * 
              from score 
            where course_id = '01'
        ) a  
    left outer join
        (
          select * 
            from score 
            where course_id = '02'
        ) b 
      on a.student_id = b.student_id
    where b.student_id is null
    Copy the code
  20. Select student ID from student who is enrolled in “02” but not enrolled in “01” and student score from “02”

    select b.student_id as student_id
          ,b.score as score_02
      from 
        (
            select * 
              from score 
            where course_id = '01'
        ) a  
    right outer join
        (
          select * 
            from score 
            where course_id = '02'
        ) b 
      on a.student_id = b.student_id
    where a.student_id is null
    Copy the code
  21. Query student’s name, total number of courses selected, total scores of all courses, course average

    select a.name as name
          ,course_num
          ,score_sum
          ,score_avg
      from
        (
            select student_id
                  ,name
              from student
        ) a
    join
        (
            select student_id
                  ,count(course_id) as course_num
                  ,sum(score) as score_sum  
                  ,avg(score) as score_avg
               from score
               group by student_id
        ) b 
      on a.student_id=b.student_id
    order by course_num desc 
            ,score_sum desc
    Copy the code
  22. Select * from student whose average score is greater than 60; select * from student whose average score is greater than 60; select * from student whose average score is greater than 60

    select a.name as name
          ,course_num
          ,score_sum
          ,score_avg
      from
        (
            select student_id
                  ,name
              from student
        ) a
    join
        (
            select student_id
                  ,count(course_id) as course_num
                  ,sum(score) as score_sum  
                  ,avg(score) as score_avg
               from score
            group by student_id
        ) b 
      on a.student_id=b.student_id
    where score_avg > 60
    order by course_num desc 
            ,score_sum desc
    Copy the code
  23. Retrieve the information of students whose “01” course score is less than 60 and their “01” course score in descending order

    select a.student_id as student_id
          ,name
          ,birthday
          ,sex
          ,score
      from 
        (
            select *
              from student
        ) a 
    join 
        (
            select student_id
                  ,score
              from score
            where course_id='01'
                and score < 60
        ) b 
      on a.student_id = b.student_id
    order by score desc
    Copy the code
  24. Query the name and grade average of students who failed two or more courses

    select name
          ,score_avg
      from 
        (
            select * 
              from student
        ) a
    join
        (
            select student_id
                  ,score_avg
              from 
                (
                    select student_id
                          ,avg(score) as score_avg
                          ,count(case when score < 60 then 1 end) as fail_count
                      from score
                    group by student_id
                ) a 
            where fail_count > 1 
        ) b
      on a.student_id = b.student_id
    ;
    Copy the code
  25. Select * from student where not enrolled in all courses

    – Find the student ID in the transcript that does not meet the total number of courses – find the student ID in the transcript that does not meet the total number of courses – the sum of the two

    select b.student_id
        (
            select count(*) as course_num
              from score
        ) a
    join
        (
            select student_id
                  ,count(*) as course_num 
              from score 
        ) b
      on a.course_num = b.course_num
    union 
        (
            select a.student_id
              from
                (
                    select *
                      from student
                ) a 
            left join
                (
                    select *
                      from score
                ) b   
              on a.student_id = b.student_id
            where course_id is null
        ) b
    Copy the code
  26. Query the list of students born in 1990

    select *
      from student
    where substr(birthday,1.4) ='1990'
    Copy the code
  27. Query student information whose name contains “wind”

    select *
      from student
    where name like '% % wind'
    Copy the code
  28. Query the number of teachers named Li

    select count(*)
      from teacher
    where name like 'l %'
    Copy the code
  29. Select * from student where at least two courses have the same student ID as student id “01”

    select a.student_id
      from
        (
            select b.student_id as student_id
                  ,b.course_id as course_id
              from
                (
                    select course_id
                      from score
                    where student_id='01'
                ) a
            join
                (
                    select student_id
                          ,course_id 
                      from score
                    where student_id <> '01'
                ) b
              on a.course_id = b.course_id
        ) a
    group by a.student_id
    having count(course_id)>1;
    Copy the code
  30. Query the name and name of the student ID enrolled in all courses

    select b.student_id as student_id
        ,b.name as name
        ,a.course_num as course_num
    from
      (
          select student_id
                ,a.course_num as course_num   
            from
              (
                  select student_id
                        ,count(course_id) as course_num
                    from course
              ) a 
          join
              (
                  select student_id
                        ,count(*) as course_num
                    from score
                  group by student_id
              ) b 
            on a.course_num = b.course_num
      ) a 
    join
      (
          select * 
            from student
      ) b 
    on  a.student_id = b.student_id
    Copy the code
  31. Select * from student where id = ’01’ and id = ’01’

    • Select the course ID of student ’01’ first
    • Then select the student IDS that have taken these course ids and the total number of courses that have taken these course ids
    • Match the total number of courses learned by student “01” (there is a student who has learned more courses than student “01”)
    • Matches the total number of courses taken by students on the transcript
    select  student_id
      from
        (
            select student_id
                  ,count(course_id) as course_num
              from
                (                
                    select student_id
                          ,course_id
                      from score 
                    where student_id <> '01'
                      and course_id in ( 
                                           select course_id 
                                             from score 
                                           where student_id = '01' 
                                       )
                ) a
            group by student_id
        ) a
    join
        (
            select count(course_id) as course_num
              from score
            where student_id = '01'             
        ) b
      on a.course_num = b.course_num
    join
        (
            select student_id
                  ,count(*) as num_course
              from score
            group by student_id
        ) c
      on a.course_num = c.course_num
    Copy the code
  32. Statistics on the number of students in each score section of each subject: course number, course name, [100-85], [85-70], [70-60], [60-0] and percentage

    – Divide the number of score segments in the grade table by the total number of courses to be studied

      select a.course_id as course_id
            ,b.name as name
            ,round(sum(case when score > 85 then 1 else 0 end) /count(*),2) as '100-85'
            ,round(sum(case when score between 70 and 84 then 1 else 0 end) /count(*),2) as '85-70'
            ,round(sum(case when score between 60 and  69 then 1 else 0 end) /count(*),2) as '70-60'
            ,round(sum(case when score < 60 then 1 else 0 end) /count(*),2) as 60-0 ' '
        from
          (
              select course_id
                    ,score
                from score
          ) a
      join
          (
              select course_id
                    ,name
                from course
          ) b
        on a.course_id = b.course_id
      group by a.course_id
              ,b.name
    Copy the code
  33. Query students’ total scores and rank them

    select student_id
          ,sum(score) score_sum
          ,row_number() over(order by sum(score) desc) as rank
      from score 
    group by student_id
    Copy the code
  34. Query the records of top 3 students in each subject

    select *
      from(
              select *
                    ,row_number() over(partition by course_id order by score desc) as rank
                from score 
          ) a
    where rank< =3
    Copy the code
  35. Select student ID and name from only two courses

    select a.student_id as student_id
          ,a.name as name
      from
        (
            select student_id
                  ,name 
              from student 
        ) a
    join 
        (
            select student_id 
              from score 
            group by student_id 
            having count(course_id) = 2
        ) b
        on a.student_id = b.student_id;
    Copy the code
  36. Query the grade point average of each course, the results are arranged in descending order by grade point average, the same grade point average, by course number ascending order

    select course_id
          ,avg(score) as score_avg
      from score
    order by score_avg desc
            ,course_id asc
    Copy the code
  37. Query student id, name, and gpa of all students whose gpa is 85 or greater

    select a.student_id as student_id
          ,a.name as name
          ,b.score_avg
      from
        (     
            select student_id
                  ,name
              from student
        ) a
    join
        (
            select student_id
                  ,avg(score) as score_avg
              from score
            group by student_id
            having avg(score) >= 85
        ) b
      on a.student_id = b.student_id
    Copy the code
  38. Select * from student whose course name is math and whose score is less than 60

    select name 
          ,score
      from
        (
            select student_id
                  ,score
              from 
                (
                    select course_id
                      from course
                    where name = 'mathematics'
                ) a
            join 
                (
                    select student_id
                          ,course_id
                          ,score
                      from score
                    where score < 60
                ) b
              on a.course_id = b.course_id
        ) a
    join
        (
            select student_id
                  ,name
              from student
        ) b          
      on a.student_id = b.student_id
    Copy the code
  39. Select student id and name from student whose course id is 01 and whose course score is 80 or above

    select a.student_id as student_id
        ,name 
      from
        (       
            select student_id
              from score
            where course_id = '01'
              and score >= 80          
        ) a
    join
        (
            select student_id
                  ,name
              from student
        ) b          
      on a.student_id = b.student_id
    Copy the code
  40. Query student id, course id, and grade of student who took the course taught by teacher “Joe”

    select student_id
          ,a.course_id as course_id
          ,score
      from
        (
            select course_id
              from 
                (
                    select course_id
                          ,teacher_id
                      from course
                  ) a
            join
                (
                    select teacher_id
                      from teacher
                    where name = 'Joe'
                ) b
              on a.teacher_id = b.teacher_id
        ) a
    join 
        (
            select student_id
                  ,course_id
                  ,score
              from score
        ) b
      on a.course_id = b.course_id
    order by score desc
    limit 1
    ;
    Copy the code
  41. Count the number of students enrolled in each course (only for courses with more than 5 students).

    select a.course_id as course_id
          ,b.name
          ,a.num as num
      from
        (
            select course_id
                  ,count(*) as num
              from score
            group by course_id
            having count(*) > =5
        ) a
    join
        (
            select course_id
                  ,name
              from course
        ) b
      on a.course_id =b.course_id
    Copy the code
  42. Query student number, course number, student grade where the student has the same grade in different courses

    select a.student_id as student_id
          ,a.course_id as course_id
          ,a.score as score
      from 
        (
            select student_id
                  ,course_id
                  ,score
              from score
        ) a
    join 
        (
            select student_id
                  ,course_id
                  ,score
              from score
        ) b
      on a.student_id = b.student_id
      and a.score = b.score
      and a.course_id <> b.course_id
    group by student_id
            ,course_id
            ,score
    Copy the code

Reference:

50 SQL exercises and answers and detailed analysis