Raw Audio to Piano Transcription


Sam Witteveen and I started the TensorFlow and Deep Learning Singapore group on MeetUp in February 2017, and the nineteenth MeetUp, "TensorFlow and Deep Learning: TensorFlow 2.0 - the New Stuff", was again hosted by Google Singapore.

We were honoured to have our favourite Singaporean Google Brain team member who works on TPUs speak at the event. Frank Chen's talk was "TensorFlow 2.0 is coming" - and comprised 4 parts :

  • TensorFlow 2.0
  • TensorFlow.js
  • TensorFlow Lite
  • TensorFlow Distribution Strategies

And, at the last minute, we discovered that Wolf Dobson from the TensorFlow team could also give a talk before he passed out from jet lag : "Eager Mode and @autograph".

Following that, Sam Witteveen (now returned from New York) gave a talk about the (very) new feature that appeared in Google Research's Colab : Free access to TPUs! In a talk that apparently took the Google team a little off-guard (because they thought that no one would be able to figure out how to do it, given the lack of documentation), Sam showed how to "Use Keras and TPUs in the Cloud for Free".
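
To give a flavour of what that involved, here is a rough sketch of the Colab-TPU workflow from the TensorFlow 1.x / tf.contrib era that the talk was based on. The toy model is a placeholder of my own; the relevant parts are resolving the TPU address that Colab exposes and converting the Keras model to run on it :

```python
import os
import tensorflow as tf

# Sketch only : a stand-in Keras model, compiled with a tf.train optimizer
# (required by the TPU conversion path at the time).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Colab advertises the TPU worker's address in this environment variable
tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR']

# Convert the Keras model so that training/inference runs on the TPU
tpu_model = tf.contrib.tpu.keras_to_tpu_model(
    model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(tpu=tpu_address)))

# tpu_model.fit(...) / .predict(...) then work like a regular Keras model
```

(In current TensorFlow 2.x releases this contrib route has been superseded by tf.distribute.TPUStrategy, but at the time of the talk the above was how Keras models got onto the free TPU.)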

For my part, I gave a talk titled "Piano Transcriptions", which discussed the Google Magenta team's model for converting raw audio files to MIDI piano rolls. Even though the talk was super-brief (because the additional talk had caused everything to run late), I described how the Deep Learning transcription network is built, the special 'losses' required to make it perform so well, and demonstrated it in action on music sourced 'in the wild' (the code for the enhanced Colab file is also available - see the slides for details).
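
To give a flavour of the architecture, here is a simplified Keras sketch of my own (not the Magenta code - layer sizes and shapes are illustrative only) of the 'Onsets and Frames' idea the talk covered : one stack predicts note onsets, a second predicts sustained note frames conditioned on the onset predictions, and each head is trained with its own loss :

```python
import tensorflow as tf
from tensorflow import keras

N_MELS, N_KEYS = 229, 88     # mel-spectrogram bins in, piano keys out

def acoustic_stack(x):
    # small conv stack over (time, frequency), then a BiLSTM over time
    x = keras.layers.Reshape((-1, N_MELS, 1))(x)
    x = keras.layers.Conv2D(32, 3, padding='same', activation='relu')(x)
    x = keras.layers.MaxPool2D((1, 2))(x)
    x = keras.layers.Reshape((-1, (N_MELS // 2) * 32))(x)
    return keras.layers.Bidirectional(
        keras.layers.LSTM(128, return_sequences=True))(x)

mel = keras.Input(shape=(None, N_MELS))        # (time, mel bins)

# onset head : per-key probability that a note starts in each frame
onset_probs = keras.layers.Dense(
    N_KEYS, activation='sigmoid', name='onsets')(acoustic_stack(mel))

# frame head : sees its own features plus the (gradient-stopped) onset
# predictions, so the onset detector stays specialised on note attacks
onsets_fixed = keras.layers.Lambda(tf.stop_gradient)(onset_probs)
frame_in = keras.layers.Concatenate()([acoustic_stack(mel), onsets_fixed])
frame_probs = keras.layers.Dense(
    N_KEYS, activation='sigmoid', name='frames')(frame_in)

# two outputs, each with its own binary cross-entropy loss
model = keras.Model(mel, [onset_probs, frame_probs])
model.compile(optimizer='adam',
              loss={'onsets': 'binary_crossentropy',
                    'frames': 'binary_crossentropy'})
```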

The slides for my talk are here :

Presentation Screenshot

If there are any questions about the presentation please ask below, or contact me using the details given on the slides themselves.

Presentation Content Example