By Alex Walker, March 10, 2026
Australian Recording Industry Association
Introduction to Song Identification
Many of us have experienced moments when a catchy tune lingers in our minds but its title remains frustratingly elusive. In such instances, song identification applications like Shazam come to the rescue. By simply playing an audio snippet, these apps can quickly identify songs and provide users with relevant information, such as the song title and artist. Initially, Shazam operated as a phone service in the UK, where users could dial a short code to connect with the service, hold their phones to the audio, and receive an SMS with the song details.
Inspired by this innovative service, I embarked on a project to replicate and enhance the functionality of Shazam using Twilio’s robust API for Programmable Voice and SMS. This guide will walk you through creating a phone service that identifies songs using Node.js and the Shazam API, making the process both engaging and educational.
Prerequisites for the Service
Before diving into the implementation, ensure you have the following components in place:
- A free Twilio account (sign-up is available on the Twilio website).
- A RapidAPI account for accessing the Shazam API.
- A Twilio phone number capable of handling calls.
- Node.js installed on your machine.
- Ngrok, for exposing your local server.
Service Overview
This phone service uses Twilio’s Programmable Voice. When a call comes in to the Twilio number, Twilio sends an HTTP request to a Node.js application, which responds with Twilio Markup Language (TwiML) instructions that control how the call is handled.
During an incoming call, the service records a short stretch of audio (five seconds in this guide). The recording is then sent to the Shazam API for identification. If the song is identified, the caller receives an SMS with its details.
Setting Up Your Application
Project Structure
Start by creating a new directory for your project:
mkdir song-identifier
cd song-identifier
Installing Dependencies
Next, initialize a new Node.js project and install the required dependencies:
npm init -y
npm install twilio dotenv express wavefile axios
These dependencies include:
- twilio: For sending and receiving SMS and managing call services.
- dotenv: To handle environment variables such as API keys and Twilio credentials.
- express: A web framework for managing server routes and HTTP requests.
- wavefile: To manipulate audio files into the required format for the Shazam API.
- axios: For making HTTP requests to the Shazam API.
Environment Variables
Create a new file named .env in your project root directory, and include the following lines:
TWILIO_NUMBER=XXXXXXXXXX
TWILIO_ACCOUNT_SID=XXXXXXXXXX
TWILIO_AUTH_TOKEN=XXXXXXXXXX
RAPID_API_KEY=XXXXXXXXXX
Be sure to replace XXXXXXXXXX with the actual values from your Twilio and RapidAPI accounts, and write the Twilio phone number in E.164 format.
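A missing or malformed value tends to surface only later, as a failed call or SMS, so it can help to check the configuration at startup. The sketch below is one possible way to do that. The helper names isE164, missingEnvVars, and checkConfig are hypothetical (they are not part of Twilio's library), and the regular expression is a simplified E.164 check: a plus sign followed by 2 to 15 digits with no leading zero.

```javascript
// Hypothetical startup checks; these helper names are illustrative and
// do not come from the Twilio SDK.
const REQUIRED_VARS = [
  'TWILIO_NUMBER',
  'TWILIO_ACCOUNT_SID',
  'TWILIO_AUTH_TOKEN',
  'RAPID_API_KEY',
];

// Simplified E.164 check: "+" followed by 2 to 15 digits, no leading zero.
function isE164(number) {
  return /^\+[1-9]\d{1,14}$/.test(number);
}

// Returns the names of required variables that are unset or empty in `env`.
function missingEnvVars(env) {
  return REQUIRED_VARS.filter((name) => !env[name]);
}

// Returns a list of human-readable problems; an empty list means the
// configuration looks usable.
function checkConfig(env) {
  const problems = missingEnvVars(env).map((name) => `${name} is not set`);
  if (env.TWILIO_NUMBER && !isE164(env.TWILIO_NUMBER)) {
    problems.push('TWILIO_NUMBER is not in E.164 format');
  }
  return problems;
}
```

One way to use it would be to call checkConfig(process.env) right after require('dotenv').config() and stop the process if the returned list is not empty.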
Developing the Phone Service
In this section, you’ll implement the core functionality of the phone service in your index.js file, creating two main routes: /record and /identify.
Recording Incoming Calls
The /record route handles incoming calls. When a call is answered, it records audio for five seconds:
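require('dotenv').config();
const express = require('express');
const axios = require('axios');
const { WaveFile } = require('wavefile');
const { VoiceResponse } = require('twilio').twiml;

const app = express();
// Twilio sends webhook parameters as URL-encoded form data.
app.use(express.urlencoded({ extended: true }));

// Answer the call by recording up to five seconds, then hand off to /identify.
app.post('/record', async (req, res) => {
  const twiml = new VoiceResponse();
  twiml.record({
    action: '/identify',
    maxLength: '5',
  });
  res.type('text/xml');
  res.send(twiml.toString());
});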
Identifying Songs
After recording, Twilio directs the call to the /identify route to process the audio:
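app.post('/identify', async (req, res) => {
  const twiml = new VoiceResponse();
  const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

  // The recording may not be available immediately, so poll for it once per
  // second. The cap of 10 attempts is an added safeguard against looping forever.
  let response;
  for (let attempt = 0; attempt < 10 && !response; attempt++) {
    await delay(1000);
    response = await axios
      .get(req.body.RecordingUrl, { responseType: 'arraybuffer' })
      .catch(() => undefined);
  }

  let track;
  if (response) {
    // Resample the recording to 44.1 kHz and Base64-encode it for the Shazam API.
    const wav = new WaveFile();
    wav.fromBuffer(response.data);
    wav.toSampleRate(44100);
    const base64String = Buffer.from(wav.toBuffer()).toString('base64');
    // fetchTrack() and sendSMS() are helper functions not shown in this excerpt.
    track = await fetchTrack(base64String);
  }

  if (track) {
    // Text the result to the caller, then end the call.
    await sendSMS(track, req.body.Caller);
    twiml.hangup();
  } else {
    // Nothing identified: record another sample.
    twiml.redirect('/record');
  }

  res.type('text/xml');
  res.send(twiml.toString());
});

// Listen on the port that Ngrok will forward to.
app.listen(3000);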
In this code fragment:
- The recording is fetched from Twilio, with a one-second delay between attempts to allow for processing time on Twilio’s side.
- The audio is resampled to 44,100 Hz to meet Shazam’s requirements and then sent for identification as a Base64 string.
- If a song is identified, an SMS with the song details is sent to the caller’s number and the call ends. Otherwise, the service redirects to /record to capture the audio again.
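The /identify route calls fetchTrack and sendSMS, neither of which appears in the code shown. As a rough sketch of the SMS half only, the helper below formats a track and sends it through a Twilio REST client that is passed in as an argument. The field names title and subtitle are assumptions about the shape of the track object returned by the Shazam API (song title and artist), and formatTrackMessage and makeSendSMS are hypothetical names. The call client.messages.create is the Twilio Node helper library's method for sending a message.

```javascript
// Sketch only: `title` and `subtitle` are assumed field names on the track
// object returned by the Shazam API (song title and artist respectively).
function formatTrackMessage(track) {
  const title = track.title || 'Unknown title';
  const artist = track.subtitle || 'Unknown artist';
  return `Your song is "${title}" by ${artist}.`;
}

// Builds a sendSMS(track, to) function bound to a Twilio REST client and a
// sender number. Passing the client in keeps the helper easy to test.
function makeSendSMS(client, fromNumber) {
  return function sendSMS(track, to) {
    // messages.create returns a promise for the created message resource.
    return client.messages.create({
      body: formatTrackMessage(track),
      from: fromNumber,
      to,
    });
  };
}
```

In index.js this could be wired up by creating the client with require('twilio')(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN) and passing it to makeSendSMS together with process.env.TWILIO_NUMBER, which yields a function with the same (track, to) signature the route already uses.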
Deploying the Application
To run your application, use the following command to start your server:
node index.js
Next, expose your local server using Ngrok:
ngrok http 3000
Copy the forwarding URL that Ngrok generates, append /record, and set it as the webhook for incoming calls in your Twilio number’s settings. Once configured, test your service by calling your Twilio number and holding your device near the music source.
Future Enhancements
While this phone service effectively mimics the functionality of Shazam’s initial offering, there remains significant room for improvement:
- Transitioning to the WhatsApp Business API could enable higher-quality voice messages.
- Adding caching to avoid repetitive polling requests could improve efficiency and reduce unnecessary load.
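Short of removing polling altogether, one modest way to reduce load is to cap the number of attempts and space them out. The helper below is a generic sketch of that idea, not code from the service described above. The name pollWithBackoff, its option names, and the default values are all assumptions chosen for illustration.

```javascript
// Resolves after `ms` milliseconds.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `attempt()` until it returns a truthy value or `maxAttempts` is
// reached. The wait doubles after each failed try, starting at
// `initialDelayMs`. Returns null if every try fails.
async function pollWithBackoff(attempt, { maxAttempts = 5, initialDelayMs = 500 } = {}) {
  let waitMs = initialDelayMs;
  for (let i = 0; i < maxAttempts; i++) {
    try {
      const result = await attempt();
      if (result) return result;
    } catch (err) {
      // A thrown or rejected attempt (for example, a request that fails while
      // the recording is still being processed) counts as "not ready yet".
    }
    // Wait before the next try, but not after the final one.
    if (i < maxAttempts - 1) {
      await delay(waitMs);
      waitMs *= 2;
    }
  }
  return null;
}
```

In the /identify route, the recording fetch could then be written as pollWithBackoff(() => axios.get(req.body.RecordingUrl, { responseType: 'arraybuffer' })), with a null result treated the same way as an unidentified song.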
Conclusion
Congratulations! You’ve successfully developed a song identification service akin to Shazam using Twilio. While it may be simpler to resort to existing applications, crafting your own service offers tremendous learning opportunities. I hope this guide provided not only practical coding insights but also inspiration for further exploration and innovation.
For additional Twilio projects, consider exploring these tutorials:
- How to Send Voice-to-SMS Transcripts Using Twilio
- How to Escape Pesky Situations using Twilio
- How to Call an AI Friend using GPT-3 with Twilio
Happy coding!