In How I Built a Lenny Chatbot, Dan Shipper explains how, in a few hours, he built a GPT-3 chatbot grounded in Lenny’s Newsletter, and he walks through the steps. He starts with his prompt:


You are a Lenny Rachitsky chat bot. You are warm, friendly, and very smart. You’re the most experienced person in the world at answering questions related to product management, startups, and growth.
Please chat with me.
Our conversation will take the form:
Me: [what i want to say ]
Lenny Bot: [what you want to say]
Please end your responses with /e to indicate you’re finished. You can start however you feel is best.
Lenny Bot: Hi there! How can I help you?


This prompt works well as a first pass, but it still produces quite a few inaccurate or completely wrong responses. His solution is “stuffing context into the prompt.”

His key insight is to include extra information in the prompt as a hint, like an open-book test.

Shipper uses GPT Index, an open-source project that provides data structures designed for use with large language models. The entire source text of “Lenny’s Newsletter” is broken into chunks, and each chunk is converted into a GPT-3 embedding vector. OpenAI’s embeddings API allows measuring the relatedness of text strings, which facilitates search, clustering, recommendations, anomaly detection, diversity measurement, and classification.
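
To make the indexing step concrete, here is a minimal sketch of the idea, assuming the openai Python SDK and a naive fixed-size chunking scheme. The file name, chunk size, and embedding model are placeholders, and GPT Index handles this bookkeeping (with much smarter text splitting) for you:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chunk_text(text: str, chunk_size: int = 1000) -> list[str]:
    # Naive fixed-size chunking; GPT Index uses smarter text splitting.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def embed(texts: list[str]) -> list[list[float]]:
    # Convert each chunk into an embedding vector via OpenAI's embeddings API.
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in response.data]

# Placeholder: the full source text of Lenny's Newsletter, saved locally.
newsletter_text = open("lennys_newsletter.txt").read()
chunks = chunk_text(newsletter_text)
index = list(zip(chunks, embed(chunks)))  # (chunk, embedding_vector) pairs
```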


Once we have this index, we can proceed: we compute the embedding vector for the question and retrieve the indexed chunks closest to it in the embedding space. These chunks are then concatenated onto the prompt.
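
Continuing the sketch above (reusing its embed function and index), retrieval and prompt stuffing might look roughly like this; the example question, the number of chunks k, and the prompt template are illustrative assumptions, not Shipper’s exact code:

```python
import numpy as np

def top_chunks(question: str, index: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    # Embed the question and rank chunks by cosine similarity to it.
    q = np.array(embed([question])[0])
    scored = []
    for chunk, vector in index:
        v = np.array(vector)
        similarity = float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
        scored.append((similarity, chunk))
    scored.sort(reverse=True)
    return [chunk for _, chunk in scored[:k]]

# Illustrative question; the closest chunks become the "open book" in the prompt.
question = "How should I think about pricing a new product?"
context = "\n\n".join(top_chunks(question, index))
prompt = (
    "Answer the question using the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Me: {question}\n"
    "Lenny Bot:"
)
```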


Pretty clever!

Link to the code.
