2024-10-05 A tool for debugging templates for requests to the OpenAI Chat Completion API
OpenAI's chat completion API is incredibly powerful if you want large language models to do some semi-intelligent job. While the interface models a simple chat with the AI, just like ChatGPT, it allows for much more: passing text and images to the AI, forcing it to respond with JSON in a specified format, or having it decide on tool calls that should be performed to execute a task. You could even program a real AI agent on top of that interface. I've happily used it as the major workhorse for a wide variety of tasks in my AI projects. However, getting it to do what you want sometimes requires quite an amount of experimentation. In this blog post, I'd like to share how I usually go about that - using a little application I wrote for exactly that purpose.
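To give an idea of the shape of such a request, here is a minimal sketch in Python (not from my projects; the model name and prompts are just illustrative placeholders) that forces the model to answer with JSON:

```python
# Minimal sketch of a chat completion call that forces a JSON response.
# Model name and prompts are placeholders, not taken from the article.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model; substitute your own
    response_format={"type": "json_object"},  # forces syntactically valid JSON
    messages=[
        {"role": "system",
         "content": "Answer as a JSON object with the keys 'summary' and 'keywords'."},
        {"role": "user",
         "content": "Summarize: The quick brown fox jumps over the lazy dog."},
    ],
)
print(response.choices[0].message.content)
```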
Why chat completion requests can be nontrivial
In many simple cases, you can just give the AI a prompt to do something, and that's it. Other cases require more care. For instance, the Composum AI has a feature that uses the OpenAI chat completion API to automatically translate whole websites into various languages. An important point in our approach is that it doesn't translate the texts of the individual components a web page is composed of one by one, but translates the whole page in one step, so that the AI can see the relationships between the texts on the page. Now it becomes really interesting when the CMS editor changes just one paragraph of a page. Giving the AI only that paragraph to translate might degrade the translation quality, because the AI doesn't have the context of the whole page. So I'd rather give it the text of the whole page and extract from the translated result only the translation of the changed paragraph. Giving the AI the already translated text of the rest of the page is also important, because it can then try to stay faithful to the existing translation, which might itself have been adapted by the CMS editor. So I'm using a kind of conversation template that gets filled in with the actual texts. Somewhat simplified, it looks like this (a request sketch follows the list):
- System prompt: You are a professional translator translating faithfully.
- User: Print the existing translated text of the page. You can later consider these as examples.
- Assistant: {here comes the already translated text in the target language}
- User: Print the existing edited text you will need to translate.
- Assistant: {here comes the original text we need to translate}
- User: Print the text to translate.
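In terms of the actual API, that template might be sent roughly like this - a simplified sketch, not the real Composum AI code; the model name and the placeholder texts are assumptions:

```python
# Simplified sketch of the "fake chat" template above as a chat completion
# request. The placeholder variables and the model name are illustrative.
from openai import OpenAI

client = OpenAI()

already_translated = "..."  # the page's existing translation (placeholder)
text_to_translate = "..."   # the edited source text (placeholder)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model
    messages=[
        {"role": "system",
         "content": "You are a professional translator translating faithfully."},
        {"role": "user",
         "content": "Print the existing translated text of the page. "
                    "You can later consider these as examples."},
        {"role": "assistant", "content": already_translated},
        {"role": "user",
         "content": "Print the existing edited text you will need to translate."},
        {"role": "assistant", "content": text_to_translate},
        {"role": "user", "content": "Print the text to translate."},
    ],
)
# The model's answer continues the fake chat - ideally with the translation.
print(response.choices[0].message.content)
```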
That is not a real chat that I'm asking the AI to continue via the chat completion API, but a kind of "fake chat" designed to elicit exactly the response I need. If you are curious - here is a full request that shows what that could actually look like in the Composum AI, though it's best viewed by importing it into the app.
Debugging conversation templates
Of course, you can always modify the code and test within your application. There is also the OpenAI Playground, where you can try out how to work with the Chat Completion API. However, when testing an application, I often want to investigate why some particular step went wrong. For such cases, I usually log the JSON requests sent to OpenAI at debug log level, and my command line tools always have a verbose mode (-v) that prints those JSON requests as well. Now, wouldn't it be nice if there was a little application where you could quickly paste in such a JSON request, display it neatly to inspect it, change it, and rerun it against the OpenAI API until it works as it should?
Just joking. Of course, I wrote such an application. It originally started out as an attempt to rebuild a chat application outside ChatGPT, but I extended it to fit exactly this purpose. You can load a request, run it through the AI, modify it, and retry it as often as you want. If you like, you can also set the parameter n to get 10 responses from the AI and inspect all of them at once (see the sketch below). That has often helped me find out how best to phrase the prompts in my applications so that the AI really does what I want it to do. This goes beyond what's usually called "prompt engineering"; I like to call it "conversation engineering".
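The replay idea is easy to sketch in a few lines of Python; the file name here is hypothetical, standing in for a request body dumped from your debug log:

```python
# Sketch of the replay idea: load a logged JSON request body,
# raise n to get several alternative completions, and send it again.
import json

from openai import OpenAI

client = OpenAI()

with open("logged-request.json") as f:  # hypothetical dump from a debug log
    payload = json.load(f)              # contains model, messages, etc.

payload["n"] = 10  # ask for 10 alternative completions in one call

response = client.chat.completions.create(**payload)
for i, choice in enumerate(response.choices):
    print(f"--- response {i} ---")
    print(choice.message.content)
```

Getting several alternatives in one call makes it easy to see how much the responses vary for the very same prompt.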
By the way - this application is part of my ChatGPT Toolbox, which contains many more command line tools around the various OpenAI APIs and more, many of them centered around the Swiss army knife style tool chatgpt for using the chat completion API in many ways.