In this talk I will give an overview of text-dependent speaker recognition, including Bayesian and deep-learning-based approaches. In the Bayesian part I will describe methods tuned to handling large amounts of training data (joint factor analysis, i-vectors) and methods tuned to robust modeling with very small amounts of training data. I will also talk about spoofing and countermeasures.
The talk is based on:
H. Aronowitz, "Inter Dataset Variability Modeling for Speaker Recognition", in Proc. ICASSP, 2017.
H. Aronowitz, "Speaker Recognition using Common Passphrases in RedDots", in Proc. ICASSP, 2017.
A. Aides, H. Aronowitz, "Text-Dependent Audiovisual Synchrony Detection for Spoofing Detection in Mobile Person Recognition", in Proc. Interspeech, 2016.
H. Aronowitz, "Speaker Recognition using Matched Filters", in Proc. ICASSP, 2016.
O. Plchot, L. Burget, H. Aronowitz, P. Matějka, "Audio Enhancing with DNN Autoencoders for Speaker Recognition", in Proc. ICASSP, 2016.
H. Aronowitz, "Exploiting Supervector Structure for Speaker Recognition Trained on a Small Development Set", in Proc. Interspeech, 2015.
H. Aronowitz, "Score Stabilization for Speaker Recognition Trained on a Small Development Set", in Proc. Interspeech, 2015.
Dr. Hagai Aronowitz received the B.Sc. degree in Computer Science, Mathematics and Physics from the Hebrew University, Jerusalem, Israel, in 1994, and the M.Sc. degree (Summa Cum Laude) and Ph.D. degree, both in Computer Science, from Bar-Ilan University, Ramat-Gan, Israel, in 2000 and 2006, respectively. From 2006 to 2007 he was a postdoctoral fellow in the advanced LVCSR group at the IBM T. J. Watson Research Center, Yorktown Heights, NY. He currently works at the IBM Haifa Research Lab, where he leads the multi-modal biometrics research team. His research interests include speaker identification, speaker diarization, face identification, audiovisual processing, and spoken language identification. Dr. Aronowitz is an author of 50 peer-reviewed publications in major conferences and journals.