Using Node.js on a Raspberry Pi to listen to MIDI messages from an Avid S6L console to trigger HTTP requests or run scripts

Back in the summer, I posted about a project I had recently finished, which involved sending HTTP requests to a server that would then relay a MIDI output message based on the request.

We’ve been using that software (dubbed midi-relay) ever since to control our Chroma-Q Vista lighting desks remotely across VLANs, using Stream Decks running Companion. It works pretty well, especially since the midi-relay software is configured to run directly on the lighting consoles at startup. We have even set up a few crontab entries that send curl commands to the light desks to turn them on at certain times, so we don’t have to be on-site just to press a button.

In anticipation of completing my most recent project, “LiveCaption”, which takes audio and transcribes it to text in real time, I started working on midi-relay 2.0: listening to MIDI input and using that to trigger a response or action.

I figured it was time this thing had a logo.

In both auditoriums at my church, we have Avid S6L audio consoles. These consoles can do a lot, and like most consoles, they have GPIO pinouts that allow you to trigger things remotely, whether as an action originating from the sound console or from an external source that triggers something on the console, like recalling a snapshot, muting an input, etc.

Stock photo of the console I found on the Internet.

These are (some of) the I/O pins on the S6L console. It has GPIO and MIDI ports. We use the footswitch input for setting tap tempo.

I started looking at the possibility of using the GPO pins on the console to trigger an external action like sending an HTTP request to Ross Dashboard, Companion, etc. However, there are only 8 GPO pins on this audio board, so I knew that could be a limiting factor down the road in terms of the number of possible triggers I could have.

The S6L also has MIDI In and Out, and through the Events section of the console, MIDI can be used as either a trigger (MIDI In) or an action (MIDI Out) for just about anything.

The Events page on an Avid S6L console. All kinds of things can be used as triggers and actions here! In this particular event, I’ve created a trigger so that when the Snapshot “Band” is loaded, the console sends MIDI Out on Channel 1 with Note 22 (A#0) at Velocity 100. midi-relay then listens for that MIDI message and sends an HTTP POST request to the LiveCaption server to stop listening for caption audio.

We already have a snapshot that we load when we go to the sermon/message; it mutes things, sets up aux sends, etc. I wanted to use that snapshot event to automatically start the captioning service via the REST API I had already built into LiveCaption.

In the previous version, midi-relay could only send Note On/Off messages and the custom MSC (MIDI Show Control) message type I had written just for controlling our Vista lighting consoles. With version 2.0, midi-relay can now send all of the channel voice MIDI message types (a sample request follows the lists below):

  • Note On / Note Off
  • Polyphonic Aftertouch
  • Control Change
  • Program Change
  • Pitch Bend
  • Channel Pressure / Aftertouch

It can also send out:

  • MSC (MIDI Show Control), which is actually a type of SysEx message
  • Raw SysEx messages, formatted in either decimal or hexadecimal
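
To give you an idea of what sending a relay message looks like, here’s a rough sketch of an HTTP request to a midi-relay host from Node.js. The port, path, and JSON field names are my assumptions for illustration only; the real API is spelled out in the project documentation on GitHub.

```javascript
// Hypothetical request asking a midi-relay host to emit a Note On message.
// Port 4000, the /sendmidi path, and these field names are assumptions for
// illustration only; check the midi-relay docs for the actual API.
const http = require('http');

const payload = JSON.stringify({
  midiport: 'USB MIDI Interface', // name of the MIDI Out port on the relay host
  midicommand: 'noteon',          // any of the channel voice types listed above
  channel: 1,
  note: 22,                       // A#0
  velocity: 100
});

const req = http.request(
  {
    host: 'raspberrypi.local',
    port: 4000,
    path: '/sendmidi',
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(payload)
    }
  },
  (res) => res.resume() // we only care that the relay accepted the request
);
req.on('error', (err) => console.error('midi-relay request failed:', err.message));
req.end(payload);
```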

And midi-relay can now listen for all of those channel voice and SysEx messages and use them to trigger one of the following (a sample trigger definition follows this list):

  • HTTP GET/POST (with JSON data if needed)
  • AppleScript (if running midi-relay on macOS)
  • Shell script (for all operating systems)
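
For the snapshot example described earlier (Channel 1, Note 22/A#0, Velocity 100), a trigger definition conceptually looks something like the sketch below. Again, the field names and the LiveCaption URL are placeholders for illustration; the actual schema is in the midi-relay documentation, and you can also build triggers from the Settings page instead of writing any JSON.

```javascript
// Illustrative trigger only; not the exact midi-relay schema.
// When a matching Note On arrives on the MIDI In port, fire an HTTP POST.
const trigger = {
  midiport: 'USB MIDI Interface',  // MIDI In port to watch
  midicommand: 'noteon',           // message type to match
  channel: 1,                      // "Channel 1" as configured on the console
  note: 22,                        // A#0, sent when the "Band" snapshot loads
  actiontype: 'http',              // could instead point at an AppleScript or shell script
  url: 'http://livecaption.local/api/stopListening', // hypothetical LiveCaption route
  method: 'POST',
  jsondata: { room: 'auditorium-1' } // optional JSON body, if the action needs one
};
```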

There are a few software and hardware products out there that can do similar things, like the BomeBox, but I wanted to build something less expensive that could run on a Raspberry Pi, which is exactly how we’ve deployed midi-relay in this case.

Here is the Raspberry Pi running midi-relay, connected to the MIDI ports on the S6L via a USB to MIDI interface. It tucks away nicely at the back of the desk.

Now we can easily and automatically trigger the caption service to start and stop listening just by recalling the snapshots we were already using on the audio console during that transition in the service. This makes it easier for our volunteers, and they don’t really have to learn anything new.

Here’s a video of it in action:

[wpvideo W77anq42]

If you’d like to check out version 2.0 of midi-relay, you can download both the source code and binaries from GitHub: https://github.com/josephdadams/midi-relay

The documentation is pretty thorough if you want to use the API to send relay messages or set up new triggers, but you can also use the new Settings page running on the server to do all that and more.

From the Settings page, you can view available MIDI ports, add/delete Triggers, view detected midi-relay hosts running on the network, and send Relay messages to other hosts.

And if you’re a Companion user for your Stream Deck, I updated the midi-relay module for Companion to support the new channel voice MIDI relay messages as well! You’ll need to download an early alpha release of Companion 2.0 to be able to try that out. Search for “Tech Ministry MIDI Relay” in Companion.

Here’s a list of the Raspberry Pi parts I used, off Amazon:


I hope this is helpful to you and your projects! If you need any help implementing it along the way, or have ideas for improvement, don’t hesitate to reach out!

Free Real-Time Captioning Service using Google Chrome’s Web Speech API, Node.js, and Amazon’s Elastic Compute Cloud (EC2)

For a while now, I’ve wanted to be able to offer live captions for people attending services at my church who may be deaf or hard of hearing, so they can follow along with the sermon as it is spoken aloud. I didn’t want them to have to install a particular app, since people have a wide variety of phone models and operating systems, and that just sounded like a pain to support long-term. I also wanted to develop something low-cost, so that more churches and ministries could benefit from it.

I decided to take concepts learned from my PresentationBridge project at last year’s downtown worship night and use them for this project. The idea was essentially the same: relay text data in real time from a local computer to all connected clients using the Node.js socket.io library. Instead of the text data coming from something like ProPresenter, it would be the results of the Web Speech API’s processing of my audio source.
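
The relay pattern itself is pretty small. Here’s a minimal sketch of the idea (not LiveCaption’s actual source; the event and room names are just placeholders):

```javascript
// Server side: forward caption text from the transcribing computer to every
// client that has joined the same Bridge room.
const http = require('http').createServer();
const io = require('socket.io')(http);

io.on('connection', (socket) => {
  // End-user devices join the room ("venue") they want to receive text from.
  socket.on('join_bridge', (room) => socket.join(room));

  // The Bridge computer sends transcribed text; relay it only to that room.
  socket.on('caption_text', ({ room, text }) => {
    io.to(room).emit('caption_text', text);
  });
});

http.listen(3000); // the project's default port is 3000
```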

If you’re a Google Chrome user, Chrome has implemented W3C’s Web Speech API, which allows you to access the microphone, capture the incoming audio, and receive a speech-to-text result, all within the browser using JavaScript. It’s fast and, important to me, it’s free!
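
On the Bridge computer, the speech recognition piece boils down to just a few lines of JavaScript running in Chrome. Here’s a simplified sketch; the socket.emit call and its event name are placeholders for however the text gets handed off to the server:

```javascript
// Runs in Google Chrome on the transcribing (Bridge) computer.
// webkitSpeechRecognition is Chrome's implementation of the Web Speech API.
const socket = io(); // socket.io client already loaded by the page (placeholder)

const recognition = new webkitSpeechRecognition();
recognition.continuous = true;     // keep listening instead of stopping after one phrase
recognition.interimResults = true; // deliver partial results while the speaker is talking
recognition.lang = 'en-US';

recognition.onresult = (event) => {
  let transcript = '';
  for (let i = event.resultIndex; i < event.results.length; i++) {
    transcript += event.results[i][0].transcript;
  }
  // Hand the text off to the server (placeholder event name).
  socket.emit('caption_text', { room: 'auditorium-1', text: transcript });
};

// Chrome stops recognition periodically; restart it to keep listening.
recognition.onend = () => recognition.start();

recognition.start();
```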

Here is how it works: the computer that does the actual transcribing of the audio source to text must use Google Chrome and connect to a Bridge room, similar to how my PresentationBridge project works. Multiple Bridge rooms (think “venues” or “locations”) can be configured on the server, and if more than one is available, end users are given the option to choose the room they want to join and receive text from when they connect. The browser requirement only applies to the computer doing the transcribing; everyone else can use any browser on any computer or device they choose.

This is the primary Bridge interface that does the transcribing work.

From the Bridge interface, you can choose which “Bridge” (venue) you want to control. If the Bridge is configured with a control password, you will have to enter it. Once connected, you can choose whether to send text data to the connected clients or turn it off (helpful when you want to test transcription without sending it out to everyone), send users to Logo Mode (helpful when you’re not broadcasting), redirect all users to a new webpage at any time, send a text/announcement, or reload their page entirely. To start transcribing, just click “Start Listening”; you’ll have to allow Chrome access to the microphone/audio source the first time. There is also a simple word dictionary that can be used to replace commonly misidentified words with their proper transcription.
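
The dictionary is essentially a find-and-replace pass over the transcript before it goes out to everyone. Here’s a rough sketch of the idea (the entries and function name are made up for illustration):

```javascript
// Replace commonly misheard words or phrases before relaying the text.
const replacements = {
  'fellowship all': 'Fellowship Hall',
  'pastor gene': 'Pastor Jean'
};

function applyDictionary(text) {
  let fixed = text;
  for (const [heard, correct] of Object.entries(replacements)) {
    fixed = fixed.replace(new RegExp(heard, 'gi'), correct);
  }
  return fixed;
}
```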

A note about secure-origin and accessing the microphone: If you’re running this server and try to access the page via localhost, Google Chrome will allow you to access the microphone without a security warning. However, if you are trying to access the page from another computer/location, the microphone will be blocked due to Chrome’s secure-origin policy.

If you’re not using a secure connection, you can also modify the Chrome security flag to bypass this (not recommended for long-term use because you’ll have to do this every time Chrome restarts, but it’s helpful in testing):

  • Navigate to chrome://flags/#unsafely-treat-insecure-origin-as-secure in the address bar.
  • Find and enable the Insecure origins treated as secure section.
  • Add any addresses you want to ignore the secure origin policy for. Remember to include the port number (the default port for this project is 3000).
  • Save and restart Chrome.

Here is a walkthrough video of the captioning service in action:

[wpvideo r6P0iWGj]

I chose to host this project on an Amazon EC2 instance, because my usage fits within the free tier. We set up a subdomain DNS entry pointing to the Elastic IP so it’s easy for people in the church to find and use the service. The EC2 instance runs the Node.js code on Ubuntu Linux. I also used nginx as a reverse proxy. This allowed me to run the service on my custom port but forward the necessary HTTPS (port 443) traffic to it, which helps with load balancing and keeps my Node.js app from having to handle all of that secure traffic itself. I configured nginx to use our domain’s SSL certificate.

I also created a simple API for the service so that certain commands like “start listening”, “send data”, “go to logo” etc. can be done remotely without user interaction. This will make it easier to automate down the road, which I plan to do soon, so that the captioning service is only listening to the live audio source when we are at certain points in the service like the sermon. Because it’s just a simple REST API, you can use just about anything to control it, including a Stream Deck!
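
As a sketch of what that kind of remote control can look like, here’s a tiny Node.js script that hits the service over HTTPS. The hostname and route are placeholders rather than the documented LiveCaption API; the real routes are in the project README.

```javascript
// Fire a LiveCaption API command remotely (e.g. from cron, a script, or
// anything a Stream Deck button can launch). Hostname and path are placeholders.
const https = require('https');

function sendCommand(path) {
  https
    .request({ host: 'captions.example.com', path, method: 'POST' }, (res) => {
      console.log(`${path} -> ${res.statusCode}`);
    })
    .on('error', (err) => console.error(err.message))
    .end();
}

// Start listening right before the sermon begins:
sendCommand('/api/startListening');
```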

We deployed the Bridge computers in our two auditoriums using Chromebooks: an inexpensive solution that runs the Chrome browser!

In order to give the devices a direct feed from our audio consoles, I needed an audio interface. I bought an inexpensive one off Amazon that’s just a simple XLR to USB cable. It works great on Mac, PC, and even Chromebooks.

XLR to USB audio interface so we can send a direct feed from the audio console instead of using an internal microphone on the computer running the Bridge.

If you’d like to download LiveCaption and set it up for yourself, you can get it from my GitHub here: https://github.com/josephdadams/LiveCaption

I designed it to support global and individual logos/branding, so it can be customized for your church or organization to use.