The developers behind the AI-based sample-editing software Samplab have released a free plugin that lets you generate samples using AI.
TextToSample utilizes Meta's open-source, AI-powered text-to-music sound generator MusicGen to produce short samples in response to a text prompt. Simply type in what you'd like to hear - anything from 'anthemic lead synth line' to 'atonal harmonica solo' will do - and TextToSample will generate a sample based on your instructions.
You can also drag and drop any sample into TextToSample, add a prompt and the software will create a new sample based on the original. This can be used to layer up samples produced by TextToSample, progressively augmenting them until you reach the desired result, or it can be used to put a spin on existing samples from your own library. In the video above, the user drops in a synth pad sample and enters the prompt 'rock guitar'; TextToSample then adds a little flourish of guitar at the tail end of the sample.
The plugin might look simple - aside from the text box, there are just two other controls - but don't be deceived: with TextToSample, Samplab have packaged up one of the most cutting-edge generative AI models for music currently available and hooked it up to your DAW, giving you the power to generate unlimited free samples.
"Our goal with TextToSample is to enable the producer community to experiment with state-of-the-art generative AI models," reads the FAQ on Samplab's website. "We want to show what's currently possible with these models, especially when running them locally on your computer."
We spoke with Gian-Marco from Samplab to find out more about how the plugin works. "We run an AI model in the background that takes your text and/or audio prompt as input and then generates audio based on that," Gian-Marco tells us. "It works similar to things like ChatGPT, but it spits out sound instead of text."
"In particular, the model step by step generates small chunks of audio (20 ms at a time) until the result has the desired length. Whenever it generates a chunk, it considers the audio that came before. The "context" value that you can set, allows you to adjust how many seconds of audio model looks at when generating the next bit. If you give TextToSample audio as the input, the model acts the same as if it generated that input itself and just tries to continue it, also taking the text prompt into account if one exists."