With the success of voice assistants such as Alexa, Siri, and Google Assistant, deep learning has brought sound and speech processing into the mainstream.
Topics include speech recognition, denoising, classification, audio tagging, and audio separation for both speech and music.

Technical resources

Course material projected during training and sent to all trainees at the end of the course; case studies and practical examples chosen according to trainees' areas of interest

Performance monitoring

All trainees are asked to sign in every half-day.
Evaluation: a questionnaire at the end of the course to assess the skills acquired.

Assessment of results

Post-training satisfaction questionnaire

Pedagogical objectives

Theory sessions are mixed with examples and case studies. This training course aims to present the main problems encountered in sound and speech processing.

Technologies covered

LSTM, U-Net, CNN, Fourier transform, Wiener filter, n-gram, language model, acoustic model, state-space model, Kaldi, PyTorch, deep clustering, TasNet, Tacotron, WaveNet
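
As a flavour of how these building blocks fit together, the short sketch below computes a magnitude spectrogram with PyTorch's built-in STFT; the audio file name and the analysis parameters are illustrative placeholders, not course material.

    # Minimal, illustrative sketch: magnitude spectrogram via PyTorch's STFT.
    # The file name and the analysis parameters are placeholder assumptions.
    import torch
    import torchaudio

    waveform, sample_rate = torchaudio.load("speech_sample.wav")  # (channels, samples)
    mono = waveform.mean(dim=0)                                   # mix down to mono

    spec = torch.stft(
        mono,
        n_fft=1024,
        hop_length=256,
        window=torch.hann_window(1024),
        return_complex=True,
    )
    magnitude = spec.abs()  # (freq_bins, frames): typical input to CNN/LSTM models
    print(magnitude.shape)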

Target skills

  • Basics of audio signal processing
  • Speech recognition: classic concepts, state of the art
  • Denoising, separation, filtering (see the Wiener-filter sketch after this list)
  • Classification, tagging
  • Voice and music synthesis
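
To make the denoising and filtering skill above a little more concrete, the sketch below applies a classic Wiener-style gain in the STFT domain. It assumes the noise power spectrum has already been estimated; the function and variable names are placeholders rather than course code.

    # Illustrative Wiener-style spectral gain, G = S / (S + N), applied per
    # time-frequency bin. Assumes noise_power is a known or estimated noise PSD.
    import torch

    def wiener_denoise(noisy_stft, noise_power, eps=1e-8):
        noisy_power = noisy_stft.abs() ** 2
        # Rough clean-speech power estimate via spectral subtraction.
        speech_power = torch.clamp(noisy_power - noise_power, min=0.0)
        gain = speech_power / (speech_power + noise_power + eps)
        return gain * noisy_stft  # filtered complex STFT; invert with torch.istft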