"Humans will be doing all the serious music transcription for the foreseeable future": Songscription review

This AI-powered transcription tool promises to convert any audio file into sheet music, tab or MIDI. But can it really match the skills of a professional?

songscription
(Image: © Songscription)

MusicRadar Verdict

Pros

  • + Very accurate pitch detection, including grace notes and complex chords
  • + Freemium model gives you basic transcription features for nothing

Cons

  • - Uneven rhythm detection and frequently incorrect or nonsensical time signatures
  • - Charts require a lot of formatting and editing to be usable
  • - Struggles to transcribe tracks with multiple instruments accurately

What is it?

As a music teacher, I am often asked: is there some software that can take an audio recording of a song and transcribe it into sheet music or guitar tab? It seems like that should be possible, right? If AI can write songs and essays and computer code, then surely it should be able to write music notation too.

This is a hot topic for me right now, because I’m teaching a transcription class at NYU. It’s a useful skill for pop musicians, because sheet music is often unavailable, and when it is available, it’s missing important information and full of inaccuracies. If the computer could do our transcription for us, it would save a lot of time. A new browser-based tool called Songscription is the latest attempt to make the dream of automated music transcription a reality. Does it succeed? Let’s find out.

Performance

Previous attempts at automated transcription have not been successful. This is not because computers have a hard time identifying notes. Guitar tuners and pitch correction plugins detect pitches accurately in real time. Detecting the timing of note onsets is also very easy: you just look for the transients, the moments where the amplitude increases sharply within a short window of time.
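
To make the "look for the transients" idea concrete, here is a minimal sketch of energy-based onset detection in plain Python. It is not how Songscription or any particular product works; the function names, the 10 ms window, the 4x energy ratio and the synthetic two-note test signal are all assumptions chosen for illustration.

```python
import math


def detect_onsets(samples, sample_rate, window_ms=10.0, ratio=4.0, floor=1e-4):
    """Return onset times (seconds) where short-window energy jumps sharply.

    All thresholds are illustrative assumptions, not tuned values.
    """
    window = max(1, int(sample_rate * window_ms / 1000))
    # Mean energy of each consecutive, non-overlapping window.
    energies = [
        sum(x * x for x in samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]
    onsets = []
    for k in range(1, len(energies)):
        prev, cur = energies[k - 1], energies[k]
        # A transient: audible energy that is much larger than one window earlier.
        if cur > floor and cur > ratio * max(prev, floor):
            onsets.append(k * window / sample_rate)
    return onsets


def make_test_signal(sample_rate=8000):
    """Hypothetical input: one second of silence with two decaying 440 Hz notes."""
    samples = [0.0] * sample_rate
    for start in (0.25, 0.60):
        first = int(start * sample_rate)
        for i in range(int(0.2 * sample_rate)):
            t = i / sample_rate
            samples[first + i] += math.exp(-12 * t) * math.sin(2 * math.pi * 440 * t)
    return samples, sample_rate


if __name__ == "__main__":
    signal, rate = make_test_signal()
    print(detect_onsets(signal, rate))
```

A detector like this only tells you when something started. It says nothing about where the barlines fall.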

So, computers can detect pitches and timing. Why can’t they transform that information into sheet music or tab? The difficulty lies in interpreting the pitches and timing in a meaningful and human-readable way. This is where software often comes up short. Computers are notoriously bad at deducing meter from a series of note onsets, distinguishing ordinary human timing variation from tempo changes or complex microrhythms, and determining whether a slight pitch fluctuation is part of the “real” melody or not.

But now we have AI! If you train the software on a corpus of accurately notated audio files, can it learn to successfully pattern-match from there? This is the premise of Songscription.

When you upload a file or provide a YouTube link to Songscription, it asks you what instruments it contains. The first choice is piano, for good reason; the company used piano recordings as the bulk of its training data.

There are several more options labeled “beta”, meaning they are not fully finished: flute, acoustic guitar, violin, trumpet, and bass guitar. There's also a Piano Arrangement option that promises to transform an entire track into an arrangement for solo piano.

In the Advanced options tab, you can specify a key signature and time signature, though you can also ask Songscription to figure these out itself. There are many time signatures to choose from, but oddly, you can only choose from among the major keys.

Once an audio file has been transcribed, Songscription displays the notation in a score viewer where the music can be edited and played back at varying speeds. You also have the option to display the notation on a piano roll visualizer – a handy feature for those who can't read conventional notation. Once you're happy with the results, you can export the notation in PDF, MIDI, MusicXML or Guitar Pro formats.

Given that Songscription clearly favors piano recordings, I decided to test some of those first. Click the links to compare Songscription’s output to the source audio.

I started with What'd I Say by Ray Charles, which begins with fifteen seconds of unaccompanied Wurlitzer electric piano playing a single-note blues riff. Then there are thirty more seconds of piano and drums playing a fairly simple R&B groove. At first, it seems like Songscription has transcribed all of this impressively well. It identifies the notes accurately, including the grace notes. The rhythm seems accurate too, until you realize that everything is shifted an eighth note late. That is something you could fix in a notation editor, but it would take some work.
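
For a sense of what that fix involves, here is a rough sketch of pulling every note back by an eighth note. The note tuples, pitches and helper names are made up for illustration and are not taken from Songscription's output; onsets are measured in quarter-note beats and 4/4 is assumed.

```python
from fractions import Fraction

EIGHTH = Fraction(1, 2)  # an eighth note, measured in quarter-note beats


def shift_notes(notes, offset):
    """Move every (onset, duration, pitch) note by `offset` beats."""
    shifted = []
    for onset, duration, pitch in notes:
        new_onset = onset + offset
        if new_onset < 0:
            raise ValueError("shift would move a note before the start of the score")
        shifted.append((new_onset, duration, pitch))
    return shifted


def beat_positions(notes, beats_per_bar=4):
    """Return the 1-based (bar, beat) position of each note."""
    return [
        (int(onset // beats_per_bar) + 1, onset % beats_per_bar + 1)
        for onset, _, _ in notes
    ]


# Hypothetical notes, each written an eighth note too late.
late_notes = [
    (Fraction(1, 2), Fraction(1, 2), 64),
    (Fraction(1), Fraction(1, 2), 67),
    (Fraction(9, 2), Fraction(1), 60),
]

if __name__ == "__main__":
    fixed = shift_notes(late_notes, -EIGHTH)
    print(beat_positions(fixed))
```

Moving the onsets is the easy part. The work in a notation editor comes afterwards: notes that used to be tied across a barline no longer need to be, rests have to be regrouped, and beams have to be redrawn.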

Next, I tried Crazy by Patsy Cline, which starts with a prominent piano part accompanied by bass, guitar and drums. The results are much less impressive. Songscription gets the pitches, more or less, but it has no idea what to make of the rhythm. It writes a measure of 3/4, a measure of 4/4 and a measure of 11/8 before finally settling into 6/4. The song is in regular 4/4 with a triplet feel, which you could also write as 12/8. Technically, you could make a case that 12/8 and 6/4 are equivalent, since both bars contain twelve eighth notes, but it’s not what you would want to see on a chart.
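
The arithmetic behind that last point can be sketched in a few lines. The helper names are invented for this example, and the groupings shown are only one common reading of each signature.

```python
from fractions import Fraction


def bar_in_eighths(numerator, denominator):
    """Length of one bar of numerator/denominator time, counted in eighth notes."""
    return Fraction(numerator, denominator) / Fraction(1, 8)


def beat_groups(beats, eighths_per_beat):
    """Describe a bar as a list of beats, each holding some number of eighths."""
    return [eighths_per_beat] * beats


# One common reading of each signature (an assumption, not the only one):
twelve_eight = beat_groups(4, 3)  # four dotted-quarter beats, three eighths each
six_four = beat_groups(6, 2)      # six quarter-note beats, two eighths each

if __name__ == "__main__":
    print(bar_in_eighths(12, 8), twelve_eight)
    print(bar_in_eighths(6, 4), six_four)
```

Both bars are twelve eighth notes long, but they group them differently. Four beats divided in three match the triplet feel of the song; six beats divided in two do not.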

Maybe the background instruments in the Patsy Cline song were throwing things off too much? I thought Songscription might do better with some unaccompanied classical piano, so I gave it the Prelude in C Major from Bach’s Well-Tempered Clavier, performed by Glenn Gould. Songscription does well at first, aside from a few skipped notes here and there. But then it begins skipping entire phrases, then entire measures, for no reason I can see.

Songscription did better with another well-known piece of beginner piano student repertoire, Beethoven’s Für Elise, performed by Van Cliburn. The transcription is very accurate, except that, as with Ray Charles, everything is shifted an eighth note late.

How about some non-beginner piano? I chose Danseuses de Delphes by Claude Debussy, performed by Debussy himself. The pitches are exactly right, but the interpretation of the rhythms is weird.

Testing classical piano is interesting, but also pointless; no one needs these pieces transcribed when the sheet music is so easily available on IMSLP. I decided to try some jazz piano instead, starting with Functional by Thelonious Monk.

Songscription gets the notes right for the most part, but it is completely lost with the rhythms, and the chord symbols are all over the place. In fairness, this tune is more harmonically and rhythmically complex than entry-level Bach or Beethoven. On the other hand, it’s also the kind of thing that I would actually want help with transcribing.

So, that’s piano. How about other instruments? Violin is an option, and I have some recordings of Bach’s solo violin works, so I tried one of those: the Presto from the Violin Sonata No. 1 In G Minor, performed by Viktoria Mullova. Songscription gets the time signature wrong, but otherwise does well, until Mullova does a tiny bit of expressive bowing, at which point the timing goes completely off track.

Flute is on the list too, so I did another Bach piece, the Partita for Solo Flute in A Minor, BWV 1013, performed by Jean-Pierre Rampal. Like the Presto from the violin sonata, this piece is a continual and uninterrupted stream of notes. Unlike violinists, however, flutists need to breathe. Songscription produces a faithful transcription of Rampal’s performed timing, but it writes the pauses for breath into the score as longer notes. In fairness, a person who isn’t familiar with the piece might well get confused about that too.

Acoustic guitar is next up on the instrument list. I gave Songscription a solo blues guitar track, Honey Babe Your Papa Cares For You by Elizabeth Cotten. The rhythm is shifted over two beats, but otherwise this is the best transcription I got out of any of my test pieces.

PRICING

Songscription offers its users a maximum of ten 3-minute transcriptions per month for free. If you want more, though, you'll need to sign up for a Plus or Pro subscription. Plus ($9.99/month) allows for five 6-minute transcriptions per month and gives you additional export options, while Pro ($29.99/month) boosts that to a maximum of one hundred 15-minute transcriptions per month.

The guitar tab is fairly playable, though it needs some mild correction. This example is unfair, though, because I happen to know that Cotten is playing with the guitar tuned down a whole step, so you would finger the song as if it’s in G rather than F. But you couldn’t expect the AI to know that.
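
The whole-step relationship is simple to state in code. This is a small sketch with made-up helper names, using sharps only and assuming standard tuning as the starting point.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

STANDARD_TUNING = ["E", "A", "D", "G", "B", "E"]  # open strings, low to high
DETUNE = -2  # a whole step down, in semitones


def transpose_name(name, semitones):
    """Transpose a pitch-class name by a number of semitones."""
    return NOTE_NAMES[(NOTE_NAMES.index(name) + semitones) % 12]


def sounding_key(fingered_key, detune=DETUNE):
    """Key you hear when playing `fingered_key` shapes on a detuned guitar."""
    return transpose_name(fingered_key, detune)


def fingered_key(heard_key, detune=DETUNE):
    """Key shapes to play so that a detuned guitar sounds in `heard_key`."""
    return transpose_name(heard_key, -detune)


if __name__ == "__main__":
    print([transpose_name(s, DETUNE) for s in STANDARD_TUNING])
    print(fingered_key("F"))
```

A transcriber who knows about the tuning writes the tab with G shapes. Software that hears only the sounding pitches has no way to tell that from a guitar in standard tuning played in F.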

The real problem is, how many recordings of solo acoustic guitar even exist in the world? I’m imagining that most of Songscription’s target audience would be after transcriptions of full songs. The Piano Arrangement feature is currently in the alpha stage, but it’s the feature that I, and probably most people, would be most interested in, so it’s only fair to test it.

I started with a song that I thought would be easy, Everybody Laughs by David Byrne, a nice simple tune with clear, transparent timbres. Songscription gets the general idea of the melody and rhythm, but as with Elizabeth Cotten, everything is two beats off.

I tried Good Vibrations by the Beach Boys next, expecting that the AI would have a harder time, because it’s a lower-fidelity recording with more complex timbres. As expected, the melody and harmony are substantially in place, but the time is complete chaos.

Songscription struggles even harder with Jumpin’ Jack Flash by the Rolling Stones. It identifies some of the chords, but it can’t locate the melody at all. It similarly can’t find a foothold in Chain of Fools by Aretha Franklin, aside from putting C under all the chords.

For the final challenge, I used Isobel by Björk. Predictably, Songscription can’t do much with the orchestral intro, but once the main song starts, it gets the approximate shape of the vocal melody and the broad strokes of the chord changes. However, it has no idea at all how to fit any of this to the metrical grid, throwing in some mysterious bars of 7/8 time. I can see why the Piano Arrangement is an alpha feature.

In addition to audio files and YouTube links, you can also use Songscription to convert MIDI to sheet music. (Confusingly, it presents MIDI as an “audio” format, which it isn’t.) I’m not sure why Songscription includes this feature, since most notation editors can already turn MIDI into sheet music. I guess what Songscription is really offering here is the ability to generate piano reductions of complex arrangements.

I tested the MIDI feature with a detailed transcription I made of Harder Better Faster Stronger by Daft Punk. I exported my MuseScore file as MIDI and uploaded it. Songscription does fine with the melody and chords, but what is going on with the bass clef? Why is it pedaling C in the bass when the song is in F# minor? I finally realized that it interprets my drum part as a bassline, and dutifully copies the kick drum as a repeated low C.
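
One plausible explanation for the low C, sketched below: in General MIDI, percussion lives on channel 10, and the Bass Drum 1 sound is note number 36, which is a C if you read it as a pitch. Whether the exported file actually used note 36 is an assumption here, and the (channel, note) event tuples and helper names are invented for illustration rather than taken from any MIDI library.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

GM_DRUM_CHANNEL = 9  # General MIDI channel 10, counted from zero
GM_BASS_DRUM_1 = 36  # General MIDI percussion key for the kick drum


def pitch_class(note_number):
    """Name of a MIDI note number if it is read as a pitch."""
    return NOTE_NAMES[note_number % 12]


def split_drums(events):
    """Separate (channel, note) events into pitched and percussion lists."""
    pitched = [e for e in events if e[0] != GM_DRUM_CHANNEL]
    drums = [e for e in events if e[0] == GM_DRUM_CHANNEL]
    return pitched, drums


# Hypothetical events: two pitched notes on channel 1, kick and snare on channel 10.
events = [(0, 42), (GM_DRUM_CHANNEL, GM_BASS_DRUM_1), (GM_DRUM_CHANNEL, 38), (0, 49)]

if __name__ == "__main__":
    naive = [pitch_class(note) for _, note in events]
    pitched, drums = split_drums(events)
    print(naive)
    print([pitch_class(note) for _, note in pitched])
```

Reading every event as a pitch puts a C and a D into a part that should only contain F# and C#. Filtering out the percussion channel before interpreting pitches avoids that.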

Alongside its transcription features, Songscription offers

Verdict

Songscription works best on beginner-level classical repertoire and very simple pop songs recorded on solo piano. If you give it other instruments, more than one instrument at a time, or music with any kind of expressive timekeeping, it struggles.

In other words, Songscription does very well on the kinds of music that are easy for humans to transcribe, and not well on the kinds of songs that are hard for humans to transcribe. The tool is helpful in situations where you don't need help, and unhelpful in situations where you do need help.

Even when it can accurately identify the notes and rhythms, Songscription’s charts need significant editing to be useful. As I said earlier in this review, there’s a big difference between identifying pitches and timing and representing that information in notation.

Sheet music and tab are not precise documentation of the notes as performed; they are abstractions. If a piece of music is like a city, then a notated score is like a street map. You have to leave out a huge amount of information if you want the map to be readable. Deciding what to include and what to omit is an art, not a science.

Songscription is a remarkable tech demo. It’s astonishing that it can detect pitches and durations so well. However, for any kind of real-world use, cleaning up its output would be harder work than writing charts the old-fashioned way.

I notice that the site offers professional transcription services at the bottom of each score – maybe that’s their actual business? It would make sense; humans will be doing all the serious music transcription for the foreseeable future.

Ethan Hein

Ethan Hein has a PhD in music education from New York University. He teaches music education, technology, theory and songwriting at NYU, The New School, Montclair State University, and Western Illinois University. As a founding member of the NYU Music Experience Design Lab, Ethan has taken a leadership role in the development of online tools for music learning and expression, most notably the Groove Pizza. Together with Will Kuhn, he is the co-author of Electronic Music School: a Contemporary Approach to Teaching Musical Creativity, published in 2021 by Oxford University Press.