Hui-miner is the first utility-list based algorithm, as well as the first “one-phase” algorithm, which mines on the original data set to ensure that all generated itemsets have previously appeared in the original data set. In addition, the utility-list structure can compress the key information so that the algorithm can prune ahead of time to reduce the candidates. At present, many scholars continue to study hui-Miner algorithm and put forward various methods to further improve its mining efficiency

Efficient high utility itemset mining using buffered utility-lists

motivation

As hui-Miner algorithm itself is the utility-list of new high-order itemsets by taking two different utility-lists intersection, so all lists of the same level should be kept until new high-order lists cannot be generated before the memory is released. In terms of memory overhead, it is undoubtedly huge. The utility List-based algorithm is improved from the perspective of reducing memory waste

define

Most of the definitions can be found in hui-Miner notes, but the following are only new additions

  • Utility-list buffer: A sort of pipe that connects all the utility-lists into one (first subgraph, three lists)
  • The index segment (index segment) :
    S U L ( X ) SUL(X)
    Indicates the item set
    X X
    A position segment in the utility list buffer for quick positioning. In order to (X, StartPos, EndPos, SumIutil, SumRutilIt can also join the index segments of all item sets together to form a pipe (the second subgraph below consists of a list)

Pseudo code

Construct Utility-List Buffer Structure

Different from the process of utility-List generating list after list, ULB-Miner directly continues to splicing the newly obtained item set information into the last part of the original ULB storage structure, and its time complexity is directly reduced to OOO(NNN + MMM). While Hui-Miner’s time complexity is OOO(nlog(m)nlog(m)nlog(m)), where n, MN, MN and M are the number of transactions contained by two different utility-lists on the table respectively

Search procedure

A recursive method that starts with a low-order itemset and each round expands a single item to form a new high-order itemset

Main procedure

reference

  1. Hui-miner: Mining high utility Itemsets without candidate generation
  2. FHM: Faster High-utility Itemset mining using estimated Utility co-occurrence pruning
  3. Ulb-miner: Efficient High Utility Itemset Mining using Buffered Utility – Lists