The business scenario

In the Model in Student, there is a relational map that stores a collection of strings, but there is a problem in the actual business logic.

Model:

@Data
@Entity
@Table(name = "student_tab")
public class Student {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id", columnDefinition = "INTEGER(10) UNSIGNED", nullable = false)
    private Integer id;

    @Column(name = "name")
    private String name;

    private Set<String> elementCollectionString1;

    private Set<String> elementCollectionString2;
}
Copy the code

Business:

@Transactional
public void test(a) { Student student = repository.findById(id); asyncFunction(student); }...@Async
public void asyncFunction(Student student) {... student.getElementCollectionString1(); . }Copy the code

Error and Resolution

AsyncFunction invokes the student getElementCollectionString1 () complains:

Failed to lazily initialize a collection of role……

The main reason is that the database session is closed, so the data cannot be read.

There are two solutions

  1. inasyncFunction(student);Before, call it manually once

student.getElementCollectionString1(); JPA will query the contents of the lazyString once, and the model will have a value before being passed into asyncFunction. However, if there are too many associated objects, multiple objects need to be manually called before async, which makes the code very magic and is not conducive to maintenance.

  1. Set ElementCollection to EAGER.

@ ElementCollection (fetch = FetchType. EAGER) so when querying data, JPA will directly read student. GetElementCollectionString1 data, but there are a lot of scenarios, Tables that do not need to read relational relationships can result in wasted performance.

So it’s still not the best solution.

@EntityGraph

When using @manytomany, @manytoone, @onetomany, @oneToOne, @Element associations, FetchType does not care whether LAZY or EAGER is configured. SQL is actually executed by a primary table query and N sub-table query, this query is generally inefficient, for example, N+1 SQL will be executed when there are N sub-objects.

This is also the N+1 problem for JPA.

Sometimes we need to use Left Join or Inner Join to improve efficiency, which can only be implemented using the @query JQPL syntax.

Spring Data JPA introduces the concept of an EntityGraph, which can solve N+1 SQL problems, to improve query rates simply.

Implementation steps

  1. Define @namedEntityGraph in Entity. @NamedEntityGraph and @nameDattributenode can have multiple or one entities.
@NamedEntityGraphs( @NamedEntityGraph(name = "student.all", attributeNodes = { @NamedAttributeNode("elementCollectionString1"), @NamedAttributeNode("elementCollectionString2") }) )
public class Student {

Copy the code

2. Just add an @EntityGraph annotation to the query method, where value is the Name in @NamedEntityGraph.

@EntityGraph(value="student.all",type= EntityGraph.EntityGraphType.FETCH)
List<Student> findAll(a);
Copy the code

JPA statement execution process

Code:

@Transactional
@Test
public void getData() {
  System.out.println("findAll");
  List<Student> studentList = studentRepository.findAll();
  if (studentList.isEmpty()) {
    System.out.println("return");
    return;
  }
  System.out.println("get");
  Student student1 = studentList.get(0);
  System.out.println("getElementCollectionString1");
  student1.getElementCollectionString1().size();
  System.out.println("getElementCollectionString2");
  student1.getElementCollectionString2().size();
}
Copy the code

Lazy

findAll
Hibernate: select student0_.id as id1_2_, student0_.name as name2_2_ from student_tab student0_
get
getElementCollectionString1
Hibernate: select elementcol0_.student_id as student_1_0_0_, elementcol0_.element_collection_string1 as element_2_0_0_ from student_element_collection_string1 elementcol0_ where elementcol0_.student_id=?
getElementCollectionString2
Hibernate: select elementcol0_.student_id as student_1_1_0_, elementcol0_.element_collection_string2 as element_2_1_0_ from student_element_collection_string2 elementcol0_ where elementcol0_.student_id=?
Copy the code

Eager

Generated statement

findAll
Hibernate: select student0_.id as id1_2_, student0_.name as name2_2_ from student_tab student0_
Hibernate: select elementcol0_.student_id as student_1_1_0_, elementcol0_.element_collection_string2 as element_2_1_0_ from student_element_collection_string2 elementcol0_ where elementcol0_.student_id=?
Hibernate: select elementcol0_.student_id as student_1_0_0_, elementcol0_.element_collection_string1 as element_2_0_0_ from student_element_collection_string1 elementcol0_ where elementcol0_.student_id=?
get
getElementCollectionString1
getElementCollectionString2
Copy the code

EntityGraph

findAll
Hibernate: select student0_.id as id1_2_, student0_.name as name2_2_, elementcol1_.student_id as student_1_1_0__, elementcol1_.element_collection_string2 as element_2_1_0__, elementcol2_.student_id as student_1_0_1__, elementcol2_.element_collection_string1 as element_2_0_1__ from student_tab student0_ left outer join student_element_collection_string2 elementcol1_ on student0_.id=elementcol1_.student_id left outer join student_element_collection_string1 elementcol2_ on student0_.id=elementcol2_.student_id
get
getElementCollectionString1
getElementCollectionString2
Copy the code

conclusion

Either lazy or eager can cause N+1 problems when reading data.

EntityGraph, on the other hand, optimizes database performance by directly using Left Join when querying statements.

So I did some tests on how much performance was optimized.

Each data has two ElementCollection external association relationships.

Take one set of data from the database at a time, in groups of ten.

Make 2000 data requests.

Data modified with Eager takes 10.2-10.8s.

Using EntityGraph data, it takes only 4.4-4.6 seconds.

Saved more than half the time.

PS: If you only query the list without accessing associated data, only 3.0-3.2s is required.

More and more

spring-data-jpa-entity-graph

The spring-data-jPA-entity-graph library is more flexible than it should be.

Instead of adding an annotation, you can now pass in an EntityGraph as a parameter.

Such as:

productRepository.findByName("MyProduct", EntityGraphs.named("Product.brand"));
Copy the code

EntityGraphType

@EntityGraph(value = "student.all", type = EntityGraph.EntityGraphType.FETCH)
Copy the code

Here, type can be divided into fetch and load.

Fetch: Values specified by EntityGraph are loaded as fetchType. EAGER and others as fetchtype. LAZY.

Load: Values specified by EntityGraph are loaded as FetchType.EAGER, and others are loaded as default values or as set to FetchType.

Fetch is usually used.

Other problems

Most of the problems are just known because of the contact with the Java backend, familiar ones can be skipped.

named

If you use @EntityGraph, it’s a lot faster, but sometimes you don’t want to query for extra data.

List<Student> findAllByIdIn(List<Integer> ids);

@EntityGraph(value = "Student.all", type = EntityGraph.EntityGraphType.FETCH)
List<Student> findAllGraphByIdIn(List<Integer> ids);
Copy the code

You can customize the EntityGraph (without rules) for new queries, calling findAllGraphByIdIn only when you need to correlate data.

Compiler optimizations cause problems

for (Student student: studentList) {
	System.out.print(1);
	student.getLazyStrings();
}
Copy the code

In the System. The out. Print (1); This line has a breakpoint. If you go to this line in debug mode, you will find that lazyStrings already has a value. Why is the lazy attribute not lazy?

This is because in debug mode, the compiler behaves differently and the lazy attribute is accessed during breakpoints, which normally would not happen.

Implicit call

public List<Student> findAll(a) {
  return studentRepository.findAll();
}
Copy the code

Without adding @EntityGraph, and without an explicit call, the data is still returned with an N+1 query because the associated attributes were accessed at serialization time.

HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!

An error is reported after @EntityGraph is added to paging queries. JPA can return a Page object, the number of data is far less than the number of memory warnings will occur, why the error?

The reason is that @EntityGraph uses a left Join.

Without left Join, the SQL statement knows exactly how many pieces of data are required and where the offset is.

However, it is not clear if limit and offset are used, and only the data is returned to the back end for JPA to process, and then a Page object is returned when the assembly is complete.

The current processing mode is to find the required data ID through the ordinary query, and then query the details through the data ID and @entity.