
Why AI developers are grumpy about containers

John Speed Meyers, Head of Chainguard Labs, and Dan Fernandez, Staff Product Manager

Most code in most software applications was not written by the software engineers building or deploying the application. And software applications built on artificial intelligence (AI) are no exception. AI applications, beneath the topmost layer of code, are often open source code all the way down. And that open source code needs to come from somewhere. Increasingly, that somewhere is "containers," the building blocks of cloud-native applications.


Unfortunately, that means AI-related code is inheriting the worst parts of containers, at least according to interviews we recently conducted. To better understand the experience of AI developers with containers, we interviewed machine learning enthusiasts, AI developers, and data scientists about their usage of AI-related containers. The results suggest that developers, to put it politely, have “issues” with AI-related containers.


The full report can be found in our newly released AI + Containers white paper. The section below summarizes the main findings.


AI + Containers: Better Together?


Five key themes emerged from the interviews we conducted for this study:


  1. AI container users hate “bloat.” Many AI containers, according to interviewees, have too many unnecessary components and are consequently too large. It’s a pain.

  2. Security for AI container users rarely means identifying and fixing CVEs in container images. Although popular AI-related containers carry hundreds of known vulnerabilities (CVEs) on average, interviewees were more concerned about topics like network security and data leakage.

  3. Dependency hell in the world of AI and containers is real. Frustrated interviewees described dependencies in AI applications as “finicky” and “fiddly.”

  4. Machine learning specialists often know little about containers. These professionals appear happy to hand off container-related decisions to platform engineering or machine learning operations teams.

  5. Defaults and templates for AI container usage are powerful. Many container users simply inherit a Dockerfile from past developers and use it; a sketch of this pattern follows this list.
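
To make the inherited-Dockerfile pattern (and the bloat and dependency pain it breeds) concrete, here is a minimal sketch of the kind of file interviewees described inheriting. The base image, package list, and training script are illustrative assumptions, not drawn from any interviewee's project:

    # An inherited AI Dockerfile (illustrative): a large general-purpose
    # base plus every package any past owner of the file ever needed.
    FROM ubuntu:22.04

    # System packages accumulate; nobody is sure which are still used.
    RUN apt-get update && apt-get install -y \
        python3 python3-pip git curl wget vim build-essential

    # Unpinned installs: whatever resolves today, which may differ tomorrow.
    RUN pip3 install torch torchvision numpy pandas scikit-learn jupyter

    COPY . /app
    WORKDIR /app
    CMD ["python3", "train.py"]

Pinning versions (for example, torch==2.2.1 rather than bare torch) and pruning unused packages would ease themes 1 and 3, but each new owner tends to add rather than remove, which is how these images balloon.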


In short, AI in containers has the same problems that containers have in general, and sometimes to an extreme degree: many containers are "bloated," unfixed CVEs are prevalent, and dependencies are a mess to configure and maintain. Most users simply inherit whatever is in place and attempt to make it work, only dimly aware of the future implications. It's a mess!


Chainguard AI Images: Avoid Dependency Hell


These issues are why Chainguard offers Chainguard AI Images, a suite of CPU- and GPU-enabled container images featuring heavily used frameworks and tools like PyTorch, Conda, and Kafka.


These images are designed to be minimal, without the bloat that so many of the AI developers interviewed for this research mentioned. They are also CVE-free, allowing developers to be developers and not security experts. Finally, Chainguard AI Images provide a template for developers to build on: a default where the security issues are already squared away.
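
As a sketch of what building on such a default might look like, the Dockerfile below layers a project's pinned dependencies onto a Chainguard PyTorch image. The image path, tag, and available tooling (such as pip, which varies by image variant) are assumptions for illustration; consult Chainguard's image catalog for the current names:

    # Sketch under assumptions: the image path, tag, and tooling (e.g., pip)
    # vary by image variant; check Chainguard's catalog before relying on this.
    FROM cgr.dev/chainguard/pytorch:latest

    WORKDIR /app

    # The base is assumed to already carry PyTorch and its runtime, so only
    # pinned, project-specific dependencies are layered on top.
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt

    COPY train.py .
    ENTRYPOINT ["python", "train.py"]

Because the base stays minimal, the layers a team adds on top are the layers it actually owns, which keeps both the image size and the CVE surface tied to choices the team made deliberately.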


Interested readers can also check out Chainguard's course on Securing the AI/ML Supply Chain, a resource for data scientists, machine learning engineers, and AI developers who want to become more familiar with safeguarding their production AI environments. And on November 6, Chainguard will host a webinar exploring AI and containers.


If you are interested in Chainguard AI Images and the benefits they can provide to your machine learning, data science, or AI efforts, please get in touch!

