Fast speaker adaption via maximum penalized
likelihood kernel regression
Ivor W. Tsang, James T. Kwok, Brian Mak, Kai Zhang and
Jeffrey J. Pan
Abstract:
Maximum likelihood linear regression (MLLR) has been a popular speaker
adaptation method for many years.
In this paper, we investigate a generalization of MLLR using nonlinear
regression. Specifically, kernel regression is applied with appropriate
regularization to determine the transformation matrix in MLLR for fast
speaker adaptation.
The proposed method, called maximum penalized likelihood kernel
regression adaptation (MPLKR), is computationally simple, and the mean
vectors of the speaker-adapted acoustic model can be obtained
analytically by simply solving a linear system. Since no nonlinear
optimization is involved, the obtained solution is guaranteed to be
globally
optimal. The new adaptation method was evaluated on the Resource
Management task with 5s and 10s of adaptation speech. Results show
that MPLKR outperforms the standard MLLR method.
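
The point that a regularized kernel regression reduces to a single linear
solve can be illustrated with a minimal sketch. This is generic kernel ridge
regression on toy data, not the MPLKR algorithm from the paper: the RBF
kernel, the parameters `lam` and `gamma`, the function names, and the
sine-curve data are all assumptions chosen for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared Euclidean distances, then the Gaussian (RBF) kernel.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_ridge(X, Y, lam=1e-3, gamma=1.0):
    # Closed-form solution: solve (K + lam * I) alpha = Y.
    # The objective is convex, so this single linear solve gives the global optimum.
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), Y)

def predict(X_train, alpha, X_new, gamma=1.0):
    # Prediction is a kernel-weighted combination of the training targets.
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Toy data (hypothetical): learn y = sin(x) from 40 noise-free samples.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(40, 1))
Y = np.sin(X)

alpha = fit_kernel_ridge(X, Y, lam=1e-3, gamma=0.5)
X_test = np.linspace(-2.5, 2.5, 20)[:, None]
max_err = float(np.max(np.abs(predict(X, alpha, X_test, gamma=0.5) - np.sin(X_test))))
```

The penalty term `lam` keeps the linear system well conditioned, which
matters most when only a few samples are available, as is the case with a
few seconds of adaptation speech.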
Proceedings of the International Conference on Acoustics, Speech, and Signal
Processing (ICASSP 2006), vol. 1, pp. 997-1000, Toulouse, France, May 2006.
PDF:
http://www.cs.ust.hk/~jamesk/papers/icassp06.pdf