On November 6th at OpenAI’s DevDay, a new product called GPTs was announced. GPTs offer a quick and easy way to build a ChatGPT extension through a no-code platform, greatly simplifying the development of complex multi-modal chatbots.
We’re rolling out custom versions of ChatGPT that you can create for a specific purpose — called GPTs. GPTs are a new way for anyone to create a tailored version of ChatGPT to be more helpful in their daily life, at specific tasks, at work, or at home — and then share that creation with others. For example, GPTs can help you learn the rules to any board game, help teach your kids math, or design stickers.
Anyone can easily build their own GPT — no coding is required. You can make them for yourself, just for your company’s internal use, or for everyone. Creating one is as easy as starting a conversation, giving it instructions and extra knowledge, and picking what it can do, like searching the web, making images or analyzing data. Try it out at chat.openai.com/create.
Let’s explore what this means by going over the existing functionality and concepts. Then we’ll build our own GPT and see how to add both your own application programming interface (API) and data!
GPTs Overview
To start, let’s review the existing features of GPTs and create a simple GPT before moving on to a more advanced one that uses an API.
*Note: As of publishing this, GPTs are still in beta and existing features and behavior might change very quickly. They’re also limited to a small number of users for the time being.
UI
The UI for GPTs is simple, and a GPT can be built entirely from your browser.

GPT Builder Home Screen
It’s designed to be easy to use and requires no code, but it does provide more complex functionality by giving developers the ability to upload their own datasets and provide APIs.

GPT Builder Configure Screen
GPTs are a multi-modal copy of ChatGPT. They have support for vision, DALL-E, and tools like web browsing, a code interpreter using Python, and custom actions that use public APIs.
This is a very similar concept to what has stemmed from open-source projects like Agents, which LangChain, a popular framework for building LLM applications, describes as follows:
The core idea of agents is to use a language model to choose a sequence of actions to take. In chains, a sequence of actions is hardcoded (in code). In agents, a language model is used as a reasoning engine to determine which actions to take and in which order.
OpenAI has abstracted the building of Agents with GPTs so that no programming is required. They also provide a similar developer API known as Assistants that gives more flexibility for building complex applications like GPTs.
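For developers who do want that flexibility, here’s a minimal sketch of what the Assistants API looks like with the openai Python library. The assistant name, instructions, and model string are just placeholders for illustration:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# Create an assistant, conceptually similar to a GPT: instructions plus tools.
assistant = client.beta.assistants.create(
    name="Reverse Fashion Search",
    instructions="Help the user find clothing similar to what appears in a photo.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-1106-preview",
)

# Conversations live in threads; add a user message and start a run.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Can you help me identify the shirt in this photo?",
)
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)
```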
Simple No-Code GPT Example
Let’s build a simple GPT with no added knowledge or actions. Luckily, OpenAI has built a lot of functionality that handles the “magic” behind how GPTs work and how they extend ChatGPT.
Before, building something similar to GPTs required programming a conversational bot with lots of complexity. Even using the OpenAI API, it still required a solid understanding of the Chat Completions API and tools like LangChain to build bots that could use tools or multiple models. This has been simplified and abstracted away, allowing the quick development of advanced conversational bots.
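To give a sense of that complexity, here’s a rough sketch of what a tool-using bot looked like with the Chat Completions API. The `search_products` function and its schema are hypothetical, and you would still have to execute the tool call yourself and feed the result back to the model:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# A hypothetical tool the model can choose to call.
tools = [{
    "type": "function",
    "function": {
        "name": "search_products",
        "description": "Search online stores for a clothing item.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "e.g. 'white v-neck t-shirt'"}
            },
            "required": ["query"],
        },
    },
}]

messages = [
    {"role": "system", "content": "You help users find clothing they see in photos."},
    {"role": "user", "content": "Find me a white v-neck t-shirt."},
]

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=messages,
    tools=tools,
)

# If the model decided to call the tool, run it yourself, append the result
# as a "tool" message, and call the API again for the final answer.
tool_calls = response.choices[0].message.tool_calls
```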
However, it’s important to note that this simplification comes with a trade-off: reduced flexibility compared to more custom approaches.
Creating the GPT
Let’s demonstrate how to take advantage of the underlying multi-modal architecture of ChatGPT.
We’ll call our GPT prototype “Reverse Fashion Search”: a GPT that lets users upload images of an outfit, has the vision model identify the different clothing pieces, and then attempts to find those same pieces online.
This can be done in a few minutes, something which would’ve previously taken a significant amount of effort.
You can either use the “Create” tab, which uses a GPT Builder (which might itself be a GPT… 🤖?), or the “Configure” tab, which allows you to manually define your GPT.
Instructions Prompt
Let’s start with our prompt, which is the most important part of our GPT.
If you’re unfamiliar with prompts, check out promptingguide.ai, a very useful resource for diving into prompting techniques used with language models like ChatGPT.
Prompt engineering is a relatively new discipline for developing and optimizing prompts to efficiently use language models (LMs) for a wide variety of applications and research topics. Prompt engineering skills help to better understand the capabilities and limitations of large language models (LLMs).
Put simply, a prompt is just the text we give our LLM to follow as instructions.
Our prompt should be well structured, and it should also set the stage for how the LLM should respond to messages.
ChatGPT works by prompting the LLM with a conversational format. The LLM generates text from the input, inferring the meaning of the context from its training data and the prompt. A very simple example of a conversation prompt might look like:
system: You are a helpful AI assistant.
user: Hi!
assistant: Hello, how can I help you today?
Each message in the conversation continues the prompt until a stop sequence is reached or the token limit is hit. Lucky for us, we don’t need to worry about this because ChatGPT handles it for us when building our GPT.
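As a rough illustration (not something you need to do for a GPT), this is what that conversational format looks like when calling the Chat Completions API directly: each turn is appended to the same message list, and the whole history is sent again on every request.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# The conversation is just a growing list of messages.
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hi!"},
]

response = client.chat.completions.create(model="gpt-4", messages=messages)
reply = response.choices[0].message.content  # e.g. "Hello, how can I help you today?"

# Each new turn appends to the history, which keeps consuming tokens
# until the model's context limit is reached.
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "What can you help me with?"})
```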
When thinking about our instructions prompt, we’ll think of it as our system prompt which sets the stage for how the language model will respond.
Here’s a detailed example of what we can use with our Reverse Fashion Search GPT:
Objective:
You're an AI assistant designed to help the user find similar clothing online by analyzing and identifying clothing from example images. These images can be sourced from social media posts, user uploads like screenshots, etc. Your task involves detailed analysis and subsequent search for similar clothing items available for purchase.
Process:
1. Image Acquisition:
- Request the user to provide an image. This can be a direct upload or a screenshot from social media platforms.
- Note: Inform the user that screenshots may be necessary for certain social media platforms that require login, as you cannot access these platforms directly.
2. Identifying the Subject:
- If the image contains multiple people, ask the user to specify whose clothing they are interested in.
- Proceed once the user identifies the subject of interest.
3. Detailed Clothing Analysis:
- Thoroughly describe each piece of clothing worn by the chosen subject in the image.
- Include details such as color, pattern, fabric type, style (e.g., v-neck, button-down), and any distinctive features (e.g., logos, embellishments).
4. Verification:
- Present the clothing description to the user for confirmation.
- If there are inaccuracies or missing details, ask the user to clarify or provide additional information.
5. Search and Present Options:
- Once the description is confirmed, begin web browsing for similar clothing items.
- Ask the user if they prefer to search for all items simultaneously or one at a time.
- Search results can be direct links to a specific item or a search query to another site.
- For each item found, provide a direct purchase link for each line item; the link should be the entire summary of the item. e.g. "[- Amazon: A white t-shirt](link)"
- Try to provide a price if possible for each item
6. User Confirmation and Iteration:
- After presenting each find, ask the user to confirm if it matches their expectations.
- If the user is not satisfied, either adjust the search based on new input (repeat from step 5) or ask if they wish to start the process over with a new image.
Constraints:
- When asking the user questions, prompt in a clear and simple-to-understand format, giving the user a selection of options in a structured manner. e.g. "... Let me know if this is correct, here are the next steps: - Search for all items - Search each item one at a time"
- Format your responses in HTML or MD to be easier to read
- Be concise when possible, remember the user is trying to find answers quickly
- Speak with emojis and be helpful, for example this would be an intro:
"""
# 🌟 Welcome to Your Fashion Search Assistant Powered by ChatGPT! 🌟
Hello! 👋 If you're looking to **find clothing items similar to those in a photo**, I'm here to help. 🛍️👗👔
### Getting Started is Easy:
1. **Upload an Image** 🖼️ or
2. **Provide a Screenshot** from a social media platform. 📱💻 🔍
**Remember:** If it's from a social media platform that requires login, a **screenshot** will be necessary. Let's embark on this fashion-finding journey together! 🚀
"""
This is a detailed prompt to instruct the LLM. It provides an overview of what it should do, a step-by-step process, and constraints. It takes advantage of a few prompting techniques, like few-shot prompting, by providing an example of how to speak. It also provides some detailed reasoning steps for the model to follow.
Longer prompts can be a problem because the GPT-4 model can only process so many tokens across input and output. Once again, this isn’t something we need to worry about because GPTs understand how to paraphrase, summarize, and continue long-running conversations, but it’s important to know, since the quality of the conversation will eventually degrade as more tokens are added. This prompt has 522 words, which works out to roughly 700 tokens using OpenAI’s rule of thumb that a token is about 3/4 of a word.
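If you want an exact count rather than an estimate, you can tokenize the prompt yourself with OpenAI’s tiktoken library. The file path here is just a placeholder for wherever you keep the instructions text:

```python
import tiktoken

# GPT-4 family models use the cl100k_base encoding.
encoding = tiktoken.encoding_for_model("gpt-4")

with open("instructions.txt") as f:  # placeholder path for the prompt above
    prompt = f.read()

print(len(encoding.encode(prompt)))  # exact token count for this prompt
```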
Knowledge & Capabilities
For this example, we don’t need any extra knowledge; instead, we’ll just give it access to “Web Browsing”.
Result
Once you finish adding your prompt and any conversation starters, you can save and publish your GPT! The finished result is a simple GPT where no programming was needed. It has vision capabilities, web browsing, and GPT-4 to help with reverse fashion searching. Nice!
Demo

Demo 3x Speed

A cool DALL-E 3 interpretation of the GPT
Problems
GPTs face a lot of the same underlying issues that ChatGPT and other LLM-backed assistants face today, primarily:
- general problems with trying to get the LLM to be more deterministic (i.e., reliably following instructions)
Other issues specific to GPTs are:
- being prone to errors
- token quotas and throttling kicking in quickly with large system prompts
- how they reason about when to use tools and actions
- how they perform retrieval-augmented generation (RAG) with large and complex datasets
Some of these will be understood more in time, especially as the platform and technology mature. Hopefully things will continue to get better and new features will be released as fast as they have been so far.
Thanks for reading!
