For most AI-powered product teams, the final wording of prompts usually falls to a software engineer. There might be some R&D that goes into designing the prompts before they are handed to the engineers, but once the prompts are hardcoded, maintaining them becomes the engineering department’s problem…
This is a problem because prompts are not like other snippets of code. Designing good prompts combines subject matter expertise, current best practices in prompt engineering, and a technical understanding of how and where to implement these prompts in a codebase. A software engineer cannot be expected to play all three roles.
What we need is a new tool in the software development process that allows subject matter experts, prompt engineers and software engineers to collaborate on the prompt development process.
The Current Landscape: Engineers as Prompt Custodians
Here’s how most AI product teams develop their prompts:
- Someone (maybe the founders or a consultant) comes up with the initial prompts.
- They give these prompts to the engineers.
- Engineers put the prompts into the code.
- From then on, engineers are responsible for maintaining the prompts.
This process looks a lot like how teams handle other software features. On the face of it, it makes sense: engineers own the code, so they should own everything that is in it, right?
But here’s the thing: a prompt is not just another piece of code.
The Problem: Prompts Are Not Just Another Code Snippet
AI, particularly since ChatGPT’s release in 2022, is transforming software development. Consider an app that helps writers improve screenplays:
Pre-2022: Developers would code complex rules about good writing and intricate pattern detection algorithms.
Today: Developers create prompts that instruct AI to evaluate various aspects of a screenplay. These prompts now contain the app’s core logic and value.
The rest of the code merely handles basic tasks like file management and user accounts. It wouldn’t be too much of a stretch to say that, in an increasing number of projects, all the code does is manage the inputs to and outputs from prompts.
In essence, AI has shifted much of a product’s value from traditional code into carefully crafted prompts. This represents a fundamental change in how software delivers its core functionality.
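To make that shift concrete, here is a minimal sketch of what the "rest of the code" can look like. Everything in it is an assumption for illustration: the prompt text, the `review_pacing` function, and the `call_model` parameter, which stands in for whatever LLM client a team actually uses.

```python
from typing import Callable

# Hypothetical prompt. In this sketch, the product's core logic lives here,
# not in the Python around it.
PACING_PROMPT = (
    "You are an experienced script editor. Review the pacing of the "
    "screenplay below and list the three scenes that slow the story most.\n\n"
    "Screenplay:\n{screenplay}"
)


def review_pacing(screenplay: str, call_model: Callable[[str], str]) -> str:
    """Fill in the prompt, send it to the model, and tidy up the reply.

    `call_model` is a stand-in for a real LLM client (an assumption).
    """
    prompt = PACING_PROMPT.format(screenplay=screenplay.strip())
    return call_model(prompt).strip()


def fake_model(prompt: str) -> str:
    """Placeholder model so the sketch runs without any external service."""
    return f"(model reply to a {len(prompt)}-character prompt)"


if __name__ == "__main__":
    print(review_pacing("INT. KITCHEN - DAY", fake_model))
```

The function only prepares input and cleans up output. Change one sentence in `PACING_PROMPT` and the product behaves differently, even though no "code" has changed.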
A small change to one of the prompts at the core of an app like this can have a dramatic effect on how well the app helps someone write a better screenplay. By comparison, you could spend weeks refining the signup experience and it would have a negligible impact on the core product experience. Prompts are incredibly sensitive to tiny linguistic changes and subtle wording choices, often in counterintuitive ways that software engineers cannot be expected to know.
It’s not just about the wording. The way prompts are structured around tasks also impacts their efficacy. Ask a subject matter expert how they think about reviewing screenplays and they might have a whole set of criteria for thinking about plot and then an entirely separate process for evaluating pacing and rhythm. By comparison, a software engineer might not know the difference between plot and pacing and would resort to a much more general set of prompts. Those general prompts cannot capture the kind of analysis that prompts informed by deep industry knowledge and expertise can.
Prompt design is a new and rapidly evolving discipline. One aspect of this discipline is the specific phrasing of words in a prompt. Another is the subject matter expertise that goes into structuring the thought behind the prompts (regardless of how they are phrased). Yet another is the technical understanding of how and where these prompts are effectively integrated into a product.
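Here is a rough sketch of that difference in structure. The criteria below are hypothetical examples of what a subject matter expert might supply, not a recommended rubric, and the names `GENERIC_PROMPT`, `EXPERT_CRITERIA` and `build_prompts` are made up for illustration.

```python
# What a generalist might write: one prompt for everything.
GENERIC_PROMPT = "Review this screenplay and suggest improvements.\n\n{screenplay}"

# Hypothetical criteria from a subject matter expert, kept separate per task.
EXPERT_CRITERIA = {
    "plot": [
        "Is the protagonist's goal clear by the end of the first act?",
        "Does each major turning point follow from a character's choice?",
    ],
    "pacing": [
        "Which scenes run long relative to what they accomplish?",
        "Where does tension drop for more than a few pages?",
    ],
}


def build_prompts(screenplay: str) -> dict[str, str]:
    """Build one focused prompt per aspect instead of a single general one."""
    prompts = {}
    for aspect, questions in EXPERT_CRITERIA.items():
        bullet_list = "\n".join(f"- {q}" for q in questions)
        prompts[aspect] = (
            f"Evaluate only the {aspect} of the screenplay below.\n"
            f"Answer each question in turn:\n{bullet_list}\n\n"
            f"Screenplay:\n{screenplay}"
        )
    return prompts


if __name__ == "__main__":
    for aspect, prompt in build_prompts("INT. KITCHEN - DAY").items():
        print(f"--- {aspect} ---\n{prompt}\n")
```

The Python here is trivial. The value sits in `EXPERT_CRITERIA`, which is exactly the part an engineer is least equipped to write or maintain.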
The Solution: A Collaborative Space for Prompt Management
We need a new way to manage prompts in AI products that takes into account the needs of each of the stakeholders involved: the subject matter experts, the prompt engineers and the software engineers. Expecting one person to play the role of all three might work for basic prompts or a prototype but developing production-grade AI products will necessitate a tool that fosters collaboration around the prompt development process.
To illustrate this need, let’s draw a parallel with content management in online publications. If you were developing an online publication you would never expect writers and editors to have to interact with the source code. Maybe submitting articles as Google Docs and then expecting a software engineer to upload and maintain everything works for a tiny online publication, but go anywhere beyond that and you inevitably arrive at a content management system. A CMS allows writers to write, editors to edit and coders to code. This separation of concerns works because writers, editors and developers each have very different needs in the publication process. The same is true for publishing and maintaining prompts. Given how important prompts are to the core value a product provides, we are going to need specialised tools for managing them that address the needs of each stakeholder in the process.
Challenges to Overcome
There are several prompt management tools on the market today. After having reviewed several of them (Langfuse, Agenta, PortKey, PromptLayer), it is clear that these tools are not designed as collaborative workspaces. These are overly technical developer tools and, from a prompt management perspective, they don’t offer much more value than storing your prompts in a shared Excel sheet.
We need prompt management tools that are easy for non-technical contributors to use, since prompt engineers and subject matter experts won’t always come from a technical background.
Another challenge that I found with the existing tools is that they don’t account for the complexity involved in getting such a diverse group of people to work together. For example, a subject matter expert might change a prompt in a way that fundamentally breaks how the code works. This is not something a subject matter expert would understand, nor should they, but it will ultimately affect whether the prompts can be used or not. Pre-empting these kinds of avoidable errors and communicating them clearly to each stakeholder as they are making changes is one way that a prompt management system could foster collaboration between people who have very different concerns and types of expertise.
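As one example of pre-empting such an error, suppose the code fills in named placeholders like {screenplay} before sending a prompt to the model. That setup is an assumption for this sketch, as are the function names. A tool could then check every edit against the placeholders the code expects and explain any mismatch in plain language:

```python
from string import Formatter


def placeholders(template: str) -> set[str]:
    """Return the named {placeholders} that appear in a prompt template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def check_edit(edited_template: str, required: set[str]) -> list[str]:
    """Describe, in non-technical terms, how an edit would break the app."""
    try:
        found = placeholders(edited_template)
    except ValueError:
        # Raised by the parser for an unmatched "{" or "}".
        return ["A curly brace is unmatched, so the app cannot fill in this prompt."]

    problems = []
    for name in sorted(required - found):
        problems.append(
            f"The placeholder {{{name}}} was removed; "
            "the app can no longer insert that value."
        )
    for name in sorted(found - required):
        problems.append(
            f"The placeholder {{{name}}} is not something the app provides."
        )
    return problems


if __name__ == "__main__":
    edited = "Review the script for {genre} conventions and {tone}."
    for problem in check_edit(edited, required={"screenplay", "genre"}):
        print(problem)
```

A check like this lets the subject matter expert fix the problem at the moment of editing, without ever needing to read the code that consumes the prompt.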
Changing How We Do Things
As the prompt development process matures as a discipline, we need to recognize that engineers should never be the sole custodians of your prompts. The current practice of relegating prompt maintenance to the engineering department overlooks the multifaceted nature of effective prompt design.
To address this, we need specialized tools to manage and develop prompts in production-grade AI products. These tools must treat prompts as unique entities, distinct from regular code, and they must be usable by people other than engineers. The goal is to foster collaboration between subject matter experts, prompt engineers, and software developers. This collaborative approach ensures that prompts benefit from domain expertise, best practices in prompt engineering, and technical implementation knowledge.