Real-time transcription and sentiment analysis of audio streams: on the phone and in the browser


Aaron Bassett

Discover how you can use Artificial Intelligence to perform sentiment analysis of an audio stream in real time! In this workshop, we'll show you how to use AI and NLP to figure out what a person is calling about and how they are feeling, all from a telephone call's audio feed delivered via WebSockets.

This is an intermediate workshop which makes heavy use of Python, WebSockets, and Tornado, plus a bit of JavaScript. We'll look at how to capture audio from a telephone call, convert the speech to text, and perform sentiment analysis on the transcript. We'll also look at how to download an MP3 of the conversation, from which we can then use NLP to lift out key information.
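To give a feel for what "sentiment analysis on the transcript" means before we wire up any services, here is a deliberately tiny, lexicon-based sketch. In the workshop itself we'll use the IBM Watson APIs for this; the word lists and `sentiment` function below are purely illustrative stand-ins, not anything from the Watson SDK.

```python
# Toy lexicon-based sentiment scoring: a stand-in for a real NLP service.
# The word lists are illustrative assumptions, not a real sentiment lexicon.
POSITIVE = {"great", "thanks", "happy", "love", "perfect"}
NEGATIVE = {"angry", "broken", "refund", "terrible", "cancel"}

def sentiment(transcript: str) -> float:
    """Return a score in [-1.0, 1.0]; 0.0 means neutral/no signal."""
    words = transcript.lower().split()
    hits = [w for w in words if w in POSITIVE or w in NEGATIVE]
    if not hits:
        return 0.0
    score = sum(1 if w in POSITIVE else -1 for w in hits)
    return score / len(hits)

print(sentiment("thanks that was great"))              # 1.0
print(sentiment("my phone is broken and i am angry"))  # -1.0
```

A real service does far more (negation, context, entities), but the shape is the same: text in, a score and labels out, which is exactly what we'll stream back over the WebSocket.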

We'll build up our application piece by piece, so attendees can code along as we go. Alternatively, you can download the example code and follow along as I talk through what it's doing.


If you do wish to follow along, I suggest working through the following in preparation:

We're going to be working a lot with `async` and WebSockets, so please ensure you have Python 3.6.x installed. The installation section of the [DjangoGirls tutorial]( may help. Alternatively, if you find yourself often needing to run different Python versions, [pyenv]( is incredibly useful.
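If you want to verify your install supports the `async`/`await` syntax we'll rely on (it's 3.5+, and f-strings require 3.6+), a quick smoke test looks something like this. The coroutine is only a simulation of reading audio "chunks" asynchronously, not the workshop's actual WebSocket code:

```python
import asyncio

async def receive_chunks(n):
    """Simulate asynchronously receiving n audio chunks."""
    chunks = []
    for i in range(n):
        await asyncio.sleep(0)  # yield control, as a real socket read would
        chunks.append(f"chunk-{i}")
    return chunks

# Drive the coroutine to completion on a fresh event loop.
loop = asyncio.new_event_loop()
print(loop.run_until_complete(receive_chunks(3)))
loop.close()
# ['chunk-0', 'chunk-1', 'chunk-2']
```

If that runs without a `SyntaxError`, your Python is new enough for the workshop.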

We're going to be using [pipenv]( to install our dependencies and manage our virtual environment, so please ensure you have pipenv installed. You may also want to `pipenv install` the following packages so you have them cached on your local machine:

`pipenv install nexmo hug tornado cryptography watson-developer-cloud logzero`

There will be a Git repository available with the example code. If you'd like to clone it, you'll need Git installed as well.

We'll be making use of the Nexmo and IBM Watson APIs. You can save some time on the day by registering for both of these services in advance.
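Once you've registered, it's worth keeping your API credentials out of your source code from the start. One common pattern is to read them from environment variables; the variable names below are just my convention for this sketch, not anything the Nexmo or Watson SDKs mandate:

```python
import os

# Read API credentials from the environment rather than hard-coding them.
# These names (NEXMO_API_KEY etc.) are illustrative, not SDK-required.
config = {
    "nexmo_key": os.environ.get("NEXMO_API_KEY", "missing"),
    "nexmo_secret": os.environ.get("NEXMO_API_SECRET", "missing"),
    "watson_apikey": os.environ.get("WATSON_APIKEY", "missing"),
}

missing = [name for name, value in config.items() if value == "missing"]
if missing:
    print("Set these before the workshop:", ", ".join(missing))
```

This way the same code runs on your laptop and anywhere else, with only the environment differing.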

I'll provide participants with a voucher code on the day for some additional Nexmo credit.