Based on the speech recognition model of RNN and CTC, the solution of context offset is explored
In this paper we introduce a combined CTC and WFST (Weighted Finite-State Sensers) approach: EESEN: End-to-end SPEECH RECOGNITION USING DEEP RNN MODELS AND WFST-based DECODING.
Starting from Encoder-Decoder model, explore the solution of context offset
Abstract: In this paper, we demonstrate that CLAS, an end-to-end contextual ASR model consisting of a full neural network, fuses contextual information by mapping all contextual phrases. In experimental evaluation, we found that the proposed CLAS model exceeded the standard Shallow Fusion bias method.
mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.