Generative AI systems need to be fed huge amounts of data, with copyrighted materials often on the menu. Musicians may now have a way to fight back with HarmonyCloak, a system that embeds data into songs that human ears can’t pick up, but that scrambles any AI trying to reproduce them.
In order to generate Facebook images that trick your Nan into praising a fake kid who made a giant Jesus out of eggs, AI systems first need to be trained on eye-watering amounts of data. The more content that’s poured into these models, the more detailed, accurate and diverse the end results can be. So companies like OpenAI and Anthropic just scrape the entirety of human creative output – in other words, the internet.
But a little thing called copyright keeps cropping up as a thorn in their sides. Nintendo wasn’t too pleased that Meta AI allowed users to generate emoji stickers depicting Mario characters brandishing rifles, for instance. But while the big content owners have the resources to either sue AI companies or strike up lucrative licensing deals, small creators often get shafted.
Researchers at the University of Tennessee, Knoxville and Lehigh University have now developed a new tool that could help musicians protect their work from being fed into the machine. It’s called HarmonyCloak, and it works by effectively embedding a new layer of noise into music that human ears can’t detect but AI 'ears' can’t tune out.
This extra noise is dynamically created to blend into the specific characteristics of any given piece of music, remaining below the human hearing threshold. But any errant AI models that scrape the music can’t figure out which bits to ignore, so it kind of poisons the well and ruins their attempts at recreation.
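To make that idea a little more concrete, here’s a minimal, hypothetical sketch of the general principle of energy-masked, inaudible noise. It is not the published HarmonyCloak algorithm: the real system shapes its perturbation with a proper psychoacoustic model and optimizes it against generative models, whereas this toy version simply scales random noise to each frame’s loudness so louder passages hide more of it. The `cloak` function, frame length and `rel_level` fraction below are illustrative assumptions.

```python
# Hypothetical sketch only: frame-scaled noise far below the music's own
# energy, a crude stand-in for HarmonyCloak's psychoacoustically masked,
# model-optimized perturbation.
import numpy as np

def cloak(audio: np.ndarray, frame_len: int = 1024,
          rel_level: float = 0.01, seed: int = 0) -> np.ndarray:
    """Add noise scaled to a small fraction of each frame's RMS level."""
    rng = np.random.default_rng(seed)
    out = audio.astype(np.float64).copy()
    for start in range(0, len(out), frame_len):
        frame = out[start:start + frame_len]
        rms = np.sqrt(np.mean(frame ** 2)) + 1e-12      # frame loudness proxy
        noise = rng.standard_normal(len(frame)) * rms * rel_level
        out[start:start + frame_len] = frame + noise
    return np.clip(out, -1.0, 1.0)

if __name__ == "__main__":
    # Example: cloak one second of a 440 Hz tone sampled at 44.1 kHz.
    sr = 44100
    t = np.arange(sr) / sr
    tone = 0.5 * np.sin(2 * np.pi * 440 * t)
    cloaked = cloak(tone)
    print("max perturbation:", np.max(np.abs(cloaked - tone)))
```

The point of scaling the noise to each frame is that the perturbation rides along with the music’s own dynamics rather than sitting at a fixed level where quiet passages would expose it; the actual tool goes much further by choosing the noise adversarially rather than at random.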
The idea is that creators could use the tool to add a layer of protection to their music before uploading it to websites or streaming services, where it might get caught up in AI dragnets. Similar tools are already in use for images.
Here’s a good example of it in action. These two audio clips are generated by a model called MusicLM, based on the same prompt – “generate indie rock track” – and trained on the same set of songs. The difference is that one training set is clean, while the other has been treated with HarmonyCloak.
When trained on the clean music, the AI farts out a serviceable but soulless track. It wouldn’t sound out of place in the background of a car insurance commercial produced by an agency that doesn’t want to pay artists.
But then comes the AI-generated track trained on music protected by HarmonyCloak. The difference is stark – this mess of random noise is genuinely uncomfortable to listen to, and sounds like it was recorded by a three-legged cat hopping across a keyboard.
HarmonyCloak can be used in two different settings. It can be tuned to apply noise targeted at a specific AI model, which gives stronger protection that holds up even after the track has been processed, such as compressed to MP3. Or it can generate noise that works against a range of models, so the original creation is protected against whichever AI attempts to replicate it, including ones that haven't been developed yet.
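For a rough feel of the difference between those two settings, here’s a hypothetical, heavily simplified sketch: the “targeted” case optimizes the perturbation against one known model, while the “model-agnostic” case optimizes it against several models at once. The toy stand-in models, the feature-distortion loss and the `eps` clamp below are all illustrative assumptions, not the paper’s actual optimization.

```python
# Hypothetical sketch contrasting model-targeted vs. model-agnostic cloaking,
# using tiny stand-in "models" instead of a real music generator.
import torch

def toy_model(weights: torch.Tensor):
    """A stand-in for a model's audio encoder: one fixed linear layer."""
    return lambda x: torch.tanh(x @ weights)

def optimize_cloak(audio, models, eps=0.01, steps=200, lr=0.005):
    """Find a small perturbation that maximally disturbs the models' features.

    One entry in `models` ~ the model-targeted setting; several entries ~
    the model-agnostic setting (disturb all of them at once).
    """
    delta = torch.zeros_like(audio, requires_grad=True)
    clean_feats = [m(audio).detach() for m in models]
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Maximize feature distortion, so minimize its negative.
        loss = -sum(((m(audio + delta) - f) ** 2).mean()
                    for m, f in zip(models, clean_feats))
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)   # keep the perturbation tiny (inaudibility proxy)
    return delta.detach()

if __name__ == "__main__":
    torch.manual_seed(0)
    audio = torch.randn(64)                                        # stand-in audio frame
    targeted = [toy_model(torch.randn(64, 16))]                    # one known model
    ensemble = [toy_model(torch.randn(64, 16)) for _ in range(4)]  # range of models
    d1 = optimize_cloak(audio, targeted)
    d2 = optimize_cloak(audio, ensemble)
    print("targeted  max |delta|:", d1.abs().max().item())
    print("agnostic  max |delta|:", d2.abs().max().item())
```

The trade-off the article describes falls out of this setup: a perturbation tuned to a single model can exploit that model’s quirks and be made more robust, while one spread across an ensemble has to work on average against many, including models it has never seen.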
Normally these kinds of protections would create an arms race, where AI models just adapt to get around barriers like this. But in this case, the team says HarmonyCloak operates differently with every song, so an AI would need to know the specific parameters used for each individual track before it could crack the code. It could still be done, but at least it couldn't be done easily en masse.
HarmonyCloak and other tools might end up helping artists survive until AI either completely destroys the value of human expression, or it chokes to death on its own regurgitations of regurgitations – whichever comes first.
The researchers will present their work at the IEEE Symposium on Security and Privacy in May 2025.
Source: MOSIS Lab