Starting from the Encoder-Decoder Model: Exploring Solutions for Contextual Biasing

Abstract: In this paper, we present CLAS, an all-neural, end-to-end contextual ASR model that fuses contextual information by embedding all contextual phrases. In experimental evaluations, we found that the proposed CLAS model outperformed the standard shallow fusion biasing approach.
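To make the idea of fusing context through phrase embeddings more concrete, here is a minimal sketch of how a decoder state might attend over a set of embedded contextual phrases. It is an illustration under stated assumptions, not the model described in the abstract: the function names `softmax` and `bias_context`, the dot-product scoring, the all-zero "no-bias" entry, and the toy vectors are all hypothetical choices made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bias_context(decoder_state, phrase_embeddings):
    """Attend over phrase embeddings and return (weights, context vector).

    Assumption: dot-product scoring is used here for brevity; a real model
    may score the decoder state against each phrase differently.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, emb))
        for emb in phrase_embeddings
    ]
    weights = softmax(scores)
    dim = len(decoder_state)
    # Context vector: attention-weighted sum of the phrase embeddings.
    context = [
        sum(w * emb[i] for w, emb in zip(weights, phrase_embeddings))
        for i in range(dim)
    ]
    return weights, context


# Toy data (hypothetical): one embedding per contextual phrase.
phrase_embeddings = [
    [0.0, 0.0, 0.0],  # assumed "no-bias" entry, used when no phrase applies
    [1.0, 0.0, 0.0],  # embedding of a first contextual phrase
    [0.0, 1.0, 0.0],  # embedding of a second contextual phrase
]
decoder_state = [0.0, 2.0, 0.0]

weights, context = bias_context(decoder_state, phrase_embeddings)
```

In this toy setup the decoder state is most similar to the second phrase, so that phrase receives the largest attention weight and dominates the context vector that would be passed on to the decoder. Shallow fusion, by contrast, would leave the model untouched and adjust hypothesis scores during decoding.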

