What I’ve Learned Building Two LLM-Powered Products this Year

This essay was written with the help of Kindo’s Claude model.

Building with LLMs

Over the last year, we’ve seen the emergence of large language models (LLMs) like ChatGPT go from an obscure AI research demo to a mainstream phenomenon. This rapid shift has caught many off guard, leaving us to wonder: what just happened?

As someone who has built products (1, 2) using LLMs at both OpenAI and Kindo, I’ve experienced firsthand how quickly these models have evolved. Just a year ago, ChatGPT was a rough “research preview.” Today, it is a leading consumer product with over 100 million users and drives over $1 billion in enterprise sales.

Independent AGI research labs have transformed into full-fledged product-led companies, capitalizing on the insight that full stack co-design with model training and product drives better model performance and cost structure.

The cascading effects of this Cambrian explosion in generative AI are accelerating productivity and sparking urgent policy debates. Amidst the frenzy, one thing is clear: LLM-powered products will continue rapidly transforming as we translate emerging research, regulation, and user needs into thoughtful design.

So what have I learned from shipping two of GenAI products in the last year?

First: Active Design is Built In

LLM-powered products condense the traditional user research cycles. With non-LLM products, the product builder conducts user research before creating the tool to better understand the user’s needs. They observe how others would use their tool, reviewing when the user picks up their tool or which queries they turn to for help.

With LLM-powered products, we gain insights with each user interaction into how people collaborate with LLMs - what their expectations are, how they imagine they can rely on a product, and their own delegation of tasks. Through their collaboration, we can decipher users’ working mental models on how to use the product, what its limitations are, and what they - the end user - accept as success. The products themselves become ongoing exercises in active design.

Chatbots like ChatGPT, Bard, Poe, and CharacterAI showcase this well. They famously streamline knowledge retrieval (e.g. “Plan a trip to Paris”) and writing tasks (e.g. “Summarise this pdf”). Their primary interface is conversation. Users engage in discussion to steer the model towards successful outcomes. This requires grasping the model's capabilities and limitations, where success is predicated on the user's understanding.

Second: Start with Quick Wins

LLMs excel at summarization, categorization, and retrieval. Lean into these strengths early on to demonstrate value. At OpenAI and Kindo, we utilized simple product features like reading digests, tailored nudges, and semantic search to create delightful “a-ha!” moments that educated users on the power of LLMs.

Most jobs involve finding answers and summarizing information. These are horizontal challenges, and can often be dizzying to help the user begin their product experience. By focusing LLM collaboration on one data source, the range of benefits become apparent. For example, a meeting video becomes a golden goose offering many options: transcribe, translate, summarize, write updates, and provide feedback. This single input highlights the LLM's plurality and helps your users imagine what else is possible.

Optionality makes capabilities more discoverable. Design patterns introduce new modes of interaction. Nudges encourage productive patterns. Together, these tweaks streamline collaboration. They distill capabilities into lightweight interactions. The goal is to help users rapidly unlock an LLM's potential.

Third: Design for Human-AI collaboration

One common misconception is that LLMs are plug-and-play. Building a flexible infrastructure around a model is required to transform the model from a research environment to every day use.

Until recently, only engineers and scientists used LLMs. To make collaboration accessible, product builders must abstract complexity.

Researchers use prompt engineering - which refers to providing the model instructions before completing a task - and few shot learning - which refers to showing the model a few examples before asking the model to generate its own - to improve a model’s performance.

These methods of model steering are just the tip of the iceberg on how product and design are required to help non-engineers gain the same benefits.

As a product builder, we can borrow from research and utilise features like system prompts (3), style references, personas, and prompt rewriting (4) to abstract complexity so that anyone can benefit. Resources like DAIR, Anthropic, and Deeplearning.ai help bridge the gap between research and product offering guidance on prompt engineering.

At Kindo, we leverage few shot prompting with Style References to improve our user’s results. Our Style References let users share writing samples and style guides. This customizes the model’s output to match the user’s individual or brand voice versus relying on a generic style.

We also created a Toolbox of Actions supported by backend prompts that include personas and task descriptions. We support Auto-Categorization to drive convenience, with custom models clustering our user’s libraries into categories.

With each interaction, our users better understand how to collaborate with LLMs. As product builders, our goal is to continue to refine a product experience that reflects what LLMs do well, not just supporting the raw model capabilities.

Conclusion

In one short year, LLMs have rapidly evolved from niche research to a mainstream force. Their cascading impacts spark productivity gains, policy debates, and urgent questions.

My key insight? LLM-powered products condense the product design process. With each interaction, we gain insights to inform the next iteration. Moving fast doesn’t mean compromising on thoughtfulness. If we build with users’ mental models in mind, we can shape how these models impact society.

In conclusion, LLMs enable ongoing active learning, with each user interaction providing insight. Quick wins demonstrate strengths like summarization and retrieval, educating through delight. And thoughtful design abstracts complexity, empowering broad collaboration.

As LLMs continue proliferating, these principles will guide the translation of research into products. With emerging regulations and societal impacts, it is an open question where this Cambrian explosion leads next. But one thing is clear: the thoughtful co-design of models and design interfaces will determine how humanity benefits from this new Industrial Revolution.

LLM-powered products are at the very beginning, with the extent of their impact yet unknown. But their development will be guided by the emerging lessons and priorities of today. With open and inclusive design, broad comprehension, and proactive policy, we can steer this technology towards empowering humanity’s limitless potential. The path ahead remains open—with care, we can navigate it together.

——

1 The ChatGPT Enterprise Assistant for Morgan Stanley (”Morgan Stanley is testing an OpenAI-powered chatbot for its 16,000 financial advisors”, March 2023).

2 Kindo, an open source-powered, secure platform for knowledge workers to collaborate with LLMs (VentureBeats, YouTube Trailer).

3 Here is an example of a system prompt from ChatGPT.

4 CopyAI’s Prompt Improver Feature which enhances the user experience with an LLM designed to engaged to analyze and refine a user’s original prompt into something better.