Secure coding and AI – Not always best friends

In a previous LinkedIn post about vibe coding gone horribly wrong, I talked about the dangers of just going with the “vibes” and why we should still care about the resulting code. This blog post focuses on a different topic: secure coding when using an AI assistant tool or an application development platform that uses AI-generated code behind the scenes.

Secure coding and AI are not always best friends. Let’s start by explaining a perhaps lesser-known cyber attack called a “supply chain attack” and how it relates to AI-generated code. It is a hacker’s dream to infiltrate the original source code or the libraries used by software applications around the world. The hacker’s mission is to plant malicious code that can steal data and passwords. The software application is built with the malicious code inside and then shipped by a trusted software distributor. Most likely, this malicious code stays unnoticed for a long time, giving the hacker plenty of time to steal your sensitive data.

This “supply chain attack” is not new, and it requires a lot of skill and time to prepare, but when it succeeds, it’s probably the most powerful cyber attack there is. Imagine what could happen if a web server or a popular online meeting application used around the world were compromised at its source. In May 2001, the open-source Apache web server, which at the time hosted over 60% of the world’s websites, nearly fell victim to a supply chain attack. A public server hosting the source code repositories and binary releases of the Apache web server was compromised. Fortunately, the open source community discovered the breach in time to prevent the hackers from injecting malicious code.

A more recent example of a supply chain attack happened just a week ago, on September 8. About 20 npm packages were compromised that together are downloaded 2 billion times a week: a supply chain attack at scale. The compromised npm packages were distributed on the trusted npm registry but contained malicious code that would intercept your username and password to steal your cryptocurrency wallet. The malicious code was also obfuscated to stay hidden as long as possible. The hacker was able to infiltrate by taking control of the npm account of a trusted open source contributor. It all started with a phishing email about enabling MFA (multi-factor authentication) that appeared to be legitimate but was sent from the malicious domain npmjs.help, registered just a couple of days earlier, on September 5.
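
One practical defense against this kind of poisoned release is to pin dependencies to exact versions and install strictly from the lockfile with `npm ci`, so a freshly compromised patch version cannot slip in through a loose semver range. Below is a minimal sketch of such a pre-install check; the file name `check-pinned.ts` and the choice to fail on any non-exact range are my own assumptions, not something from the incident write-ups.

```typescript
// check-pinned.ts - a hypothetical pre-install gate: fail the build when
// any dependency uses a loose semver range, so that a freshly compromised
// patch release cannot be pulled in automatically.
import { readFileSync } from "node:fs";

const pkg = JSON.parse(readFileSync("package.json", "utf8"));
const deps: Record<string, string> = {
  ...pkg.dependencies,
  ...pkg.devDependencies,
};

// Exact versions like "4.17.21" pass; ranges such as "^4.17.0", "~4.17.0",
// "*" or "latest" may resolve to a newer (possibly malicious) version.
const loose = Object.entries(deps).filter(
  ([, range]) => !/^\d+\.\d+\.\d+$/.test(range)
);

if (loose.length > 0) {
  console.error("Loose version ranges found:", loose);
  process.exit(1); // block the install in CI
}
console.log("All dependencies are pinned to exact versions.");
```

Pinning alone doesn’t save you if the exact version you pinned is the poisoned one, but combined with `npm ci` and a short review delay before adopting new releases, it closes the most common window.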

With more and more code generated by an AI agent or a vibe coding platform, is AI-generated code still vulnerable to these supply chain attacks?

The answer is simply yes. While AI agents are getting better at generating syntactically correct code samples, they are still very bad at generating secure code. A recent GenAI security report shows AI getting almost exponentially better over the years at generating syntactically correct code that actually compiles. However, the same graph shows a flat line for AI code violations of the OWASP Top 10 security risks (see the red line). OWASP Top 10 vulnerabilities, such as code injection, are easily detected by static code analysis tools such as the SonarQube OWASP plugin.

2025 GenAI Code Security Report – Veracode
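
To make that flat line concrete: the classic injection flaw below is exactly the kind of OWASP Top 10 violation that keeps showing up in generated code, and that a static analysis tool flags immediately. This is a minimal sketch using the Node `pg` client in an ESM module (so top-level await is allowed); the table and column names are invented for illustration.

```typescript
import { Client } from "pg"; // npm install pg @types/pg

const client = new Client(); // connection settings come from PG* env variables
await client.connect();

// VULNERABLE (OWASP A03: Injection): user input is concatenated straight
// into the SQL string, so input like "' OR '1'='1" returns every row.
async function findUserUnsafe(name: string) {
  return client.query(`SELECT * FROM users WHERE name = '${name}'`);
}

// SAFE: a parameterized query; the driver sends the value separately,
// so it can never be interpreted as SQL.
async function findUserSafe(name: string) {
  return client.query("SELECT * FROM users WHERE name = $1", [name]);
}
```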

You may be wondering: why doesn’t an AI agent generate secure code from the start?

You need to remember that an AI agent is an inferrer; it’s not deterministic like a code compiler. This means that an AI agent will choose the most likely next part (or token, if you like) of the programming code it generates. One of the reasons an agent does not always write secure code is that it was trained on the many public code samples that are freely available in technical articles on the Internet.

These code samples are often used to demonstrate a particular coding technique or programming concept, and are meant for demonstration purposes only. They are tutorials on how to get started quickly, using less secure shortcuts like shared access keys and leaving out fine-grained authorization controls. Such articles pay little attention to secure coding techniques because they want to explain one particular concept. Principles like security by design and zero trust architecture (“never trust, always verify”) are left out to focus on the concept at hand.
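
The pattern is easy to recognize when it gets copied, or inferred, into production code. Below is a hedged sketch of the tutorial-style shortcut next to a slightly safer variant; the key value and variable names are invented for illustration.

```typescript
// Tutorial-style shortcut: a shared access key hardcoded in the source.
// Anyone with read access to the repository (or the shipped bundle) owns it.
const API_KEY = "sk_live_51Habc123..."; // hypothetical key, for illustration

// Safer variant: read the secret from the environment and fail fast if it
// is missing. Production code should go further (a secrets manager,
// short-lived tokens, fine-grained authorization), but even this keeps
// the key out of source control.
const apiKey = process.env.API_KEY;
if (!apiKey) {
  throw new Error("API_KEY is not set; refusing to start.");
}
```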

We’re still on the same topic of supply chain attacks. As we all know, an AI agent can suffer from hallucinations, often giving very convincing answers that are simply wrong. For AI-generated code there is a new term for this: “package hallucinations”. The generated code refers to software packages that are supposed to be used but simply don’t exist in the real world.

The large-scale research published in the academic paper “We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs” tested 576,000 AI-generated code samples. Almost 20% of them contained references to source code libraries that simply do not exist. The results varied somewhat depending on the programming language used to generate the code. Across different LLMs, the same non-existent code libraries kept coming back in the generated code. As a hacker, you could detect this pattern and publish the software packages the AI agent claims you should use.


The difference is that the hallucinated software packages the AI agent refers to now do exist in the real world. This type of supply chain attack is also known as slopsquatting. Hackers register the recommended “hallucinated packages” on trusted distribution platforms, and these software packages contain malicious code to poison your software application at the source. The problem is that you trust the generated code too much and never check whether the suggested code libraries you are about to download are trustworthy.
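
A cheap countermeasure is to look a suggested package up on the registry before installing it, and to be suspicious of packages that don’t exist or were published very recently. Below is a minimal sketch against the public npm registry metadata endpoint (Node 18+ with global fetch, run as an ESM module); the seven-day age threshold is an arbitrary assumption of mine.

```typescript
// vet-package.ts - a minimal sketch: check an AI-suggested package on the
// npm registry before running `npm install`.
async function vetPackage(name: string): Promise<void> {
  const res = await fetch(`https://registry.npmjs.org/${name}`);
  if (res.status === 404) {
    // A package the AI recommended but nobody ever published: exactly
    // the gap a slopsquatter would love to fill.
    console.warn(`${name}: not found on npm - possible hallucination`);
    return;
  }
  const meta = (await res.json()) as { time?: { created?: string } };
  const created = meta.time?.created;
  if (!created) {
    console.warn(`${name}: no creation date in metadata - inspect manually`);
    return;
  }
  const ageDays = (Date.now() - new Date(created).getTime()) / 86_400_000;
  if (ageDays < 7) {
    // Arbitrary threshold: brand-new packages deserve extra scrutiny.
    console.warn(`${name}: only ${ageDays.toFixed(1)} days old - review it first`);
  } else {
    console.log(`${name}: exists since ${created.slice(0, 10)}`);
  }
}

await vetPackage("left-pad"); // example lookup of a well-known package
```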

Speaking of vibe coding: the vibe coding platform Lovable also had its share of security issues with AI-generated code. About 10% of the deployed web applications scanned for security issues allowed hackers to access data such as personal and financial information, and cloud provider API keys were publicly exposed. Cloud platform API keys are particularly attractive to hackers because they allow them to run their phishing sites and other malicious applications on a cloud platform for free. Again, it’s a good reminder to always set a budget limit on your cloud provider account.
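
Exposed keys like these are usually trivial to find in a shipped JavaScript bundle. Below is a hedged sketch of that idea; the regexes cover a couple of well-known key formats (AWS access key IDs, Stripe-style live keys) and are illustrative, nowhere near exhaustive.

```typescript
// scan-bundle.ts - a minimal sketch: grep a built bundle for strings that
// look like cloud or API credentials before deploying it.
import { readFileSync } from "node:fs";

// Illustrative patterns only; real secret scanners ship hundreds of rules.
const patterns: Record<string, RegExp> = {
  "AWS access key ID": /\bAKIA[0-9A-Z]{16}\b/g,
  "Stripe live secret key": /\bsk_live_[0-9a-zA-Z]{24,}\b/g,
  "Hardcoded apiKey assignment": /apiKey\s*[:=]\s*["'][^"']{16,}["']/g,
};

const bundlePath = process.argv[2] ?? "dist/bundle.js"; // assumed build output
const bundle = readFileSync(bundlePath, "utf8");

for (const [label, re] of Object.entries(patterns)) {
  const hits = bundle.match(re) ?? [];
  if (hits.length > 0) {
    console.warn(`${label}: ${hits.length} match(es) - do not deploy this bundle`);
  }
}
```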

Many vibe coding users have no technical background and often have little or no knowledge of secure coding. They trust the vibe coding platform to handle the security aspects of the deployed web application, but these platforms still have a long way to go. As Lovable said in a post on X on May 29, “We’re not where we want to be in terms of security, and we’re committed to continuing to improve the security posture for all Lovable users”.

I may be repeating myself, but an AI agent can be a great tool for a software engineer, as long as you make sure you’re still in the driver’s seat. You should care about the code, whether it is AI-generated or not, and you certainly shouldn’t blindly follow the AI agent.