🏆【Alibaba middleware technology series 】 "EasyExcel combat case" combat study EasyExcel how to read data from the specified file location

Background of EasyExcel

In my work, I always encounter the function of reading and writing Excel. I have been exposed to EasyExcel before, and we basically replace the traditional POI and JXL with it and even have an EasyPOI technology.

EasyExcel when pain points

When using EasyExcel, under the general scenario header is more traditional, also not complex, but this time the header is a little bit complex, read data from the specified position, to read EasyExcel from the specified position, so after constant groping, found the right solution.

EasyExcel compared to other frameworks

Poi is usually used to read excel data in a small amount, and EasyExcel is a bit complicated to read Excel, so EasyExcel has not been used in the project until once the amount of data to read is too large. Workbook -> workbookFactory.create (inputStream) is not working properly, so we can use EasyExcel instead of POI.

Java parsing, Excel generation frameworks are well known Apache POI, JXL. Poi has a SET of SAX mode API, which can solve some problems of memory overflow to a certain extent. However, POI still has some defects. For example, the decompression and storage after decompression of 07 Version Excel are completed in memory, which still consumes a lot of memory. Easyexcel rewrites poI on 07 version of Excel parsing, can originally a 3M Excel with POI SAX still need about 100M memory reduced to a few M, and no matter how large Excel memory overflow, 03 version depends on POI SAX mode.

In the upper layer, model transformation is made to make the user more simple and convenient –EasyExcel uses EasyExcel to read Excel has been thinking about how to simplify the reading method, without creating a XXDataListene listener class for each Excel read. At the beginning, I thought, Add a DataListener to a generic, sharing a DataListener, but there are also concerns about how the Dao is passed and how each Dao holds the data, and you may need to process the data differently before saving it.

EasyExcel programming mode

EasyExcel has been open source for a long time, but it’s a bit daunting to use. At first, it’s easy to read Excel from the official documentation. It only takes one line of code, but if you look more closely, you need to create a callback listener, which is a bit complicated (each Excel requires a separate callback listener class).

EasyExcel reads the specified position

To start reading the data, line 8 is the actual data, so go to the code, headRowNumber(), no write, default is 1, so start reading the data from line 2.

    /** * Read file information *@param filePath
     * @param headNum
     */
    public ContactInfoExcelDataListener read(String filePath , int headNum){
        EasyExcel.read(filePath, this).head(ContactInfoExcelEntity.class).autoCloseStream(true
                ).autoTrim(true).ignoreEmptyRow(true).sheet()
                // This can be set to 1 because the header is a line. If you have multiple rows, you can set other values. It is ok to pass no, because by default DemoData is parsed, and it does not specify a header, which defaults to 1 row
                .headRowNumber(Math.max(headNum,NumberUtils.BYTE_ZERO)).doRead();
        return this;
    }

    /** * Read file information *@param filePath
     */
    public ContactInfoExcelDataListener read(String filePath){
        EasyExcel.read(filePath, this).head(ContactInfoExcelEntity.class).autoCloseStream(true).autoTrim(true).ignoreEmptyRow(true).sheet()
                // This can be set to 1 because the header is a line. If you have multiple rows, you can set other values. It is ok to pass no, because by default DemoData is parsed, and it does not specify a header, which defaults to 1 row
               .doRead();
        return this;
    }

    /** * Read file information *@param inputStream
     * @param headNum
     */
    public ContactInfoExcelDataListener read(InputStream inputStream, int headNum){
        EasyExcel.read(inputStream, this).head(ContactInfoExcelEntity.class).autoCloseStream(true).autoTrim(true).ignoreEmptyRow(true).sheet()
                // This can be set to 1 because the header is a line. If you have multiple rows, you can set other values. It is ok to pass no, because by default DemoData is parsed, and it does not specify a header, which defaults to 1 row
                .headRowNumber(Math.max(headNum,NumberUtils.BYTE_ZERO)).doRead();
        return this;
    }
Copy the code

The process of importing data

Headers to check

InvokeHeadMap () method


    /** * call header *@param map
     * @param analysisContext
     */
    @Override
    public void invokeHead(Map<Integer, CellData> map, AnalysisContext analysisContext) {
        log.info("[start read the excel head data] :{}",map);
        // Check whether the tag header exists
        // Use this as a control assignment for our lock initialization operation. Remember that if headNum = 0 this method will probably not trigger.
        // Disposable chopsticks! If the value is set to 1, only the relevant single node synchronization lock is realized at present. If related distributed nodes are extended in the future, the distributed lock mechanism shall be used for control! The lock range needs to be controlled
        try {
            int titleRows = map.size();
            // Header interrupt handling mechanism!failureDataCount = preValidate? orginalHead.size() ! = titleRows? NumberUtils.INTEGER_ONE: NumberUtils.BYTE_ZERO:NumberUtils.BYTE_ZERO;// set
            if(preValidate && (failureDataCount.intValue() == NumberUtils.INTEGER_ONE)){
                causeByHeadFormatAbort = Boolean.TRUE;
            }
            if(! isMockFlag) {// TODO doesn't go that far: we can use this method to initialize resource usage if we need to!
                //Preconditions.checkNotNull(clueLogic,"not support clueLogic is inject this class subject!" );
                if (Objects.isNull(clueLogic)) {
                    clueLogic = SpringUtils.getBean(ClueLogic.class);
                }
                customerImportVO = new CustomerImportVO();
                // This part is mainly to reduce unnecessary memory space request
                tempDataList = Lists.newArrayListWithExpectedSize(batchSizeUnit);
            }
// syncLockController.lock();
        } catch (Exception e) {
            log.error("invoke the analysis the title head info data is failure!",e);
            throw new UnsupportedOperationException("invoke the analysis the title head info data is failure!",e);
        }
        log.info("[finished reading the excel head data]");
    }
Copy the code

The data processing

Invoke () method

The invoke() method is parsed data by data, and inside the method is my business logic, data validation. Invoke is the specific data value for each row

    /** * Calls the operation handling control mechanism *@param excelEntity
     * @param context
     */
    @Override
    public void invoke(ContactInfoExcelEntity excelEntity, AnalysisContext context) {
        log.info("---- [start read the excel main data:{}] ----",excelEntity);
        if(batchSizeUnit <= tempDataList.size()){
            CustomerImportVO customerImportVO = clueLogic.startCallTaskProxy(contactInfoImportParam,tempDataList);
            // Merge calculation result -> update to the latest result
            this.customerImportVO.merge(customerImportVO);
            tempDataList.clear();
            tempDataList = Lists.newArrayListWithExpectedSize(batchSizeUnit);
        }else{
            tempDataList.add(excelEntity);
        }
        log.info("[Finished reading the excel main data]");
    }
Copy the code

The interrupt

HasNextdoAfterAllAnalysed () method

/ * * * have next execution * [@ param] (https://my.oschina.net/u/2303379), the context * [@ return] (https://my.oschina.net/u/556800) * / [@Override](https://my.oschina.net/u/1162528) public boolean hasNext(AnalysisContext context) { return causeByHeadFormatAbort?Boolean.FALSE:isSupportAbort? failureDataCount <= 0 :Boolean.TRUE; }Copy the code

Data is complete

DoAfterAllAnalysed () method

All data parsed, doAfterAllAnalysed() method, which is written to save the data method.

    /** * Execute end callback mechanism *@param analysisContext
     */
    @Override
    public void doAfterAllAnalysed(AnalysisContext analysisContext) {
        log.info([doAfterAllAnalysed the process]);
        try {
            CustomerImportVO customerImportVO = clueLogic.startCallTaskProxy(contactInfoImportParam,tempDataList);
            this.customerImportVO.merge(customerImportVO);
            finisheDataResult = Boolean.TRUE;
        }catch (Exception e){
            log.error("execute finially the flush data is failure！");
            // How to achieve consistency and complete compensation for TODO data information!
            finisheDataResult =  Boolean.FALSE;
        } finally {
            tempDataList.clear();
// syncLockController.unlock();}}Copy the code

Data reference

Blog.csdn.net/weixin_3992…

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

🏆【Alibaba middleware technology series 】 “EasyExcel combat case” combat study EasyExcel how to read data from the specified file location

Background of EasyExcel

EasyExcel when pain points

EasyExcel compared to other frameworks

EasyExcel programming mode

EasyExcel reads the specified position

The process of importing data

Headers to check

InvokeHeadMap () method

The data processing

Invoke () method

The interrupt

HasNextdoAfterAllAnalysed () method

Data is complete

DoAfterAllAnalysed () method

Data reference

🏆【Alibaba middleware technology series 】 “EasyExcel combat case” combat study EasyExcel how to read data from the specified file location

Background of EasyExcel

EasyExcel when pain points

EasyExcel compared to other frameworks

EasyExcel programming mode

EasyExcel reads the specified position

The process of importing data

Headers to check

InvokeHeadMap () method

The data processing

Invoke () method

The interrupt

HasNextdoAfterAllAnalysed () method

Data is complete

DoAfterAllAnalysed () method

Data reference

Related Posts

SpringBoot integrates the memory database Sqlite

Understand IM message “reliability” and “consistency” issues and discuss solutions

Why is Redis single threaded? Why such high performance?