The Sound of AI Music

The Hacker Factor Blog
Dr. Neal Krawetz
Thursday, 12 March 2026

I recently gave a closed-door presentation on AI detection. The talk was titled "Detecting Deep Fakes: The Musical!"

Why focus on music? Because it's one of the messiest industries. AI is embedded throughout modern music, and the problems we have with the flood of bad AI music mirror the AI slop found in other industries: books, artwork, online "reviews", etc. If you know how to detect AI in music, then you know the basics needed for detecting AI in photos, videos, and written text.

This isn't a new topic for me. Last year, I blogged about Detecting AI Music, where I focused on beat/tempo detection. This time, I added frequency analysis (great for distinguishing AI from autotune from real human voices) and linguistic analysis -- because the words often leave traces that can distinguish human writing from the lyrics generated by ChatGPT, Suno, Claude, Gemini, Grok, Copilot, etc.
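To make the linguistic-analysis idea concrete, here is a minimal sketch of one such signal: lexical repetition in lyrics. Generated lyrics often reuse stock phrases and repeat whole lines more heavily than human writing. The metrics, thresholds, and sample lyrics below are illustrative assumptions, not the author's actual method.

```python
# Sketch: simple repetition metrics for a block of lyrics.
# The sample lyrics and metric choices are hypothetical.
import re
from collections import Counter

def lyric_stats(lyrics: str) -> dict:
    """Return basic repetition metrics for a block of lyrics."""
    lines = [ln.strip().lower() for ln in lyrics.splitlines() if ln.strip()]
    # Extract words, ignoring punctuation.
    words = [w for ln in lines for w in re.findall(r"[a-z']+", ln)]
    line_counts = Counter(lines)
    # Count every occurrence of any line that appears more than once.
    repeated_lines = sum(c for c in line_counts.values() if c > 1)
    return {
        "type_token_ratio": len(set(words)) / len(words),   # vocabulary richness
        "repeated_line_fraction": repeated_lines / len(lines),
    }

sample = """Neon hearts in the midnight rain
Neon hearts in the midnight rain
We rise, we fall, we rise again
Neon hearts in the midnight rain"""

stats = lyric_stats(sample)
print(stats)  # type_token_ratio 0.4, repeated_line_fraction 0.75
```

A low type/token ratio combined with a high repeated-line fraction is only a weak hint on its own; in practice it would be one feature among many.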
Finding AI

When it comes to music, AI is everywhere. In the old days, you might have manual layering and synthetic (non-AI) music. But today, AI is embedded in everything:

- Voices used to be tuned and adjusted manually. The original auto-tune (1997) used autocorrelation and digital signal processing (DSP), not AI. But today, many pitch correction tools use AI.
- Synthesizers are common in modern music. The original synthesizers were basic tone generators. They evolved to include more pitch control (like when you press a piano key softly or strongly, it changes the intensity of the note). Today, they often use AI for sound design, generative sequencing, and aligning tracks.
- Modern electric guitars and amplifiers include Neural Amp Modeling (NAM), which uses AI to "capture" the tone of vintage tube amps. They also use intelligent impedance tracking to monitor the interaction between the amp and the speaker cabinet. Even the guitar pedals are intelligent, using AI to predict the next settings while actively "listening" to the player.
- Some amplifier software allows description-based or AI-assisted preset selection. These can automatically adjust the equalizers to fit a specific sound profile.
- Mixing boards use AI to separate tracks in real time, automate audio leveling, perform adaptive equalization, and more.
- The mastering software (which combines tracks and produces the final recording) uses AI to seamlessly blend the various inputs.

It's hard to listen to any music today without hearing some kind of "AI".

Labels

There's been a huge push to label AI media. A few examples:

- In 2023, YouTube began labeling AI-generated videos.
- In 2024, Meta began labeling images on Instagram and Facebook.
- Also in 2024, TikTok began automatically labeling AI-generated content. However, doing it automatically means using AI to detect AI, which is a very fragile solution. Independent surveys found that TikTok was only labeling about 30% of AI content (a 70% false-negative rate).
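The arithmetic behind that 30%/70% figure can be sketched as a tiny detection-rate calculation. The survey counts below are hypothetical, chosen only to match the roughly 30% labeling rate cited above.

```python
# Sketch: detection and false-negative rates for an automatic AI labeler,
# computed from hypothetical survey counts of known-AI clips.
def label_rates(ai_labeled: int, ai_unlabeled: int) -> dict:
    total_ai = ai_labeled + ai_unlabeled
    return {
        "detection_rate": ai_labeled / total_ai,        # AI correctly labeled
        "false_negative_rate": ai_unlabeled / total_ai, # AI that slips through
    }

# Hypothetical survey: 1,000 known AI clips, of which 300 carried an AI label.
rates = label_rates(ai_labeled=300, ai_unlabeled=700)
print(rates)  # detection_rate 0.3, false_negative_rate 0.7
```

Note that this says nothing about false positives (human content wrongly labeled as AI), which is a separate failure mode with its own survey requirements.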
TikTok can also incorrectly label non-AI media if there is metadata contamination (such as using AI to touch up one small portion) or if makeup, lighting, and filters cause TikTok's detector to think the visual content was AI.

- By 2025, the streaming music service Deezer began adding AI labels to music. They are also using the labels to demonetize AI content.
- This year (2026), there is a community outcry for Spotify to label AI music.
- This month (March 2026), Apple Music launched "Transparency Tags". This puts the burden of labeling on the media creators (which is a pretty good approach right now).

But here's my question: What are they labeling? Just consider music. Are they labeling the score, instrumentation, lyrics, vocals, the tool used, or something else?

For example, Suno generates AI music. The music can be completely AI-generated (e.g., click on the random demo prompt), or it can be human controlled. This raises the question: what if you have a song with human-written lyrics combined with Suno for vocals, instruments, and score -- all with human creative control? Depending on the service, it could be:

- Labeled as AI: Because it used Suno. (By itself, Suno is detectable based on the sound structure.)
- Labeled as Human: Because a human maintained the creative control.
- Labeled "Human-written with AI assistance": This is closer to the ground truth.
- Or they might use some other label.

Keep in mind, there is no consistency between these different streaming platforms. The same media uploaded to three different platforms may end up with three different labels.

The Real Problem

But let's back up a moment. Why are we labeling it in the first place? It comes down to Sturgeon's Law: "90% of everything is crap." I've spent way too much time listening to music at Suno, Udio, SongGPT, etc. When it comes to music, I think it's closer to 99%. But there's always that 1% gem.

The real problem isn't that "AI music is crap".
The real problem is that there's no moderator or curator to filter out the crap.

In the pre-streaming days of cassette tapes and record players, the music industry had layers of people who could filter out the bad songs before they reached the masses. As a result, most albums had 1-2 bad songs, but most of them were "good". It was rare to find an album like Manfred Mann's The Roaring Silence, where every song is crap -- except one. And that one was a song that underperformed commercially when Bruce Springsteen first released it ("Blinded by the Light"), and it required a synthesizer and vocal layering to make it sound good.

AI doesn't change the percentage of crap music; AI just allows it to be made faster. With no content moderators, the listening public faces an onslaught of unfiltered crap music. Since most of today's audio flood comes from AI, users are pushing for AI labels. But what we really need is a label that says whether the music was curated by a trusted source. If you're listening to unmoderated music, then you're going to hear a lot of crap music, regardless of whether it comes from AI or not.

One of the alternative approaches right now is "artist verification". The label becomes less about what the music is, and more about who stands behind it. (Gee, doesn't that sound like what SEAL does for notarizing content?)

Labels vs Content

There are really four problems we need to address.

The first problem is detection. Detecting AI in music is easy if you know what artifacts to look for. The same approach applies to photos, videos, text, and other content. Finding the artifacts is the hard part, but they are definitely there and detectable. Adding to this problem, we have the complexity of detection. Your typical consumer isn't technical enough to detect AI in lyrics or distinguish a synthesizer from AI-generated sounds.

The second problem is labeling. What are we labeling and why?
And more importantly, what are the false-positive and false-negative error rates for the labeling system? If a platform (like TikTok) labels only 30% of AI content, users don't think, "Oh, 70% is missing." They think, "The 70% that isn't labeled must be real." The label ends up validating the fakes. (If you've taken any psych classes, this is the "Silence as Signal" phenomenon.)

The third problem is disclosure. If you tell people what you are detecting, then the next generation of AI (i.e., in a few months) will be adjusted to avoid detection. (In the AI world, this leads to adversarial hardening, where the AI system becomes harder to detect.) Moreover, people who want to force mislabeling or avoid detection now know exactly what they need to change. (As an aside: I'm focused on music because this industry doesn't seem to care about avoiding detection. I'm not going to disclose all of the subtle artifacts for detecting AI in videos or photos because those are readily used for fraud.)

The fourth problem is the lack of a curator to filter out the crap. And like TikTok, there's always someone who thinks they can use AI to detect and filter AI. However, an automated labeler introduces two additional problems:

Problem 4(A): Reliability. Between hallucinated results, non-authoritative results, unspecified biases, and unspecified accuracy, you're going to end up creating more problems than you solve. Moreover, because the results from most deep learning systems are non-deterministic, an AI system fails any legal requirements (FRE Rule 901, Daubert and Frye standards). Basically: do you want to stake your business's reputation on an unknown filtering system? (FRE Rule 901(c) is a proposed addition to the Federal Rules of Evidence that is currently under consideration by advisory committees. It addresses the authentication of AI-generated or "deepfake" evidence.)

Problem 4(B): Maintenance. AI detection is a moving target.
This means that you need someone to constantly monitor the results and detect when the accuracy declines. It also requires regularly re-training the system to incorporate any new changes. All of this effort becomes a full-time job. However, nobody wants to pay someone to sit around watching AI so they can catch the problem before the customer base begins complaining. Instead, companies will deploy it, wait for the complaints, and then panic-patch their solution -- often making it even worse.

None of these problems are new, and they are not unique to music. For example, cameras and video systems have built-in AI to adjust the lens, exposure, focus, and more. Take the Google Pixel 10 as an extreme example. It's marketed as an "AI Camera" because it uses AI to merge multiple frames for HDR. To an AI detector, these photos often flag as tampered or altered because they lack the natural noise and physics of a single-frame capture.

Whether it's music, photos, video, books, or art, these are the problems that we need to start tackling today, because it's only going to get worse in the future. The current approach of labeling AI content isn't going to be a viable long-term solution.
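The point about multi-frame merges lacking single-frame noise can be sketched numerically: averaging N noisy captures of the same scene shrinks the noise standard deviation by roughly the square root of N, which changes the noise statistics a detector expects from one sensor exposure. The scene value, noise level, and frame count below are made-up illustration values, not measurements from any real camera.

```python
# Sketch: averaging N noisy frames reduces noise stddev by ~sqrt(N).
# All numbers here are hypothetical illustration values.
import random
import statistics

random.seed(42)
TRUE_PIXEL = 128.0   # "real" scene brightness at one pixel
NOISE_SD = 8.0       # per-frame sensor noise (hypothetical)
FRAMES = 16          # frames merged per HDR shot (hypothetical)
TRIALS = 2000

def capture() -> float:
    """One noisy single-frame reading of the pixel."""
    return TRUE_PIXEL + random.gauss(0, NOISE_SD)

# Compare the noise spread of single frames vs. 16-frame merges.
singles = [capture() for _ in range(TRIALS)]
merged = [statistics.mean(capture() for _ in range(FRAMES)) for _ in range(TRIALS)]

print(statistics.stdev(singles))  # close to 8
print(statistics.stdev(merged))   # close to 2 (i.e., 8 / sqrt(16))
```

A forensic tool calibrated to expect the single-frame noise level would see the merged result as unnaturally clean, which is one plausible reason such photos flag as processed.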