How to select examples by similarity
The SemanticSimilarityExampleSelector selects examples based on their similarity to the input. It does this by finding the examples whose embeddings have the greatest cosine similarity with the embedding of the input.
The fields of the examples object will be used as parameters to format the examplePrompt passed to the FewShotPromptTemplate.
Each example should therefore contain all required fields for the example prompt you are using.
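The similarity ranking described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the EmbeddedExample type, the selectMostSimilar helper, and the two-dimensional vectors are all hypothetical stand-ins for real embeddings and a real vector store.

```typescript
// Hypothetical example shape and toy embeddings, for illustration only.
type Example = Record<string, string>;

interface EmbeddedExample {
  example: Example;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k examples whose embeddings are most similar to the query embedding.
function selectMostSimilar(
  queryEmbedding: number[],
  stored: EmbeddedExample[],
  k: number
): Example[] {
  return stored
    .map((item) => ({
      item,
      score: cosineSimilarity(queryEmbedding, item.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(({ item }) => item.example);
}

// Toy 2-dimensional vectors standing in for real embeddings.
const stored: EmbeddedExample[] = [
  { example: { input: "sunny", output: "gloomy" }, embedding: [0.9, 0.1] },
  { example: { input: "tall", output: "short" }, embedding: [0.1, 0.9] },
];

const selected = selectMostSimilar([0.8, 0.2], stored, 1);
console.log(selected); // [ { input: 'sunny', output: 'gloomy' } ]
```

In practice the embeddings come from an embedding model and the ranking is delegated to the vector store, so the exact distance metric depends on the store you choose.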
- npm: npm install @langchain/openai @langchain/community
- Yarn: yarn add @langchain/openai @langchain/community
- pnpm: pnpm add @langchain/openai @langchain/community
import { OpenAIEmbeddings } from "@langchain/openai";
import { HNSWLib } from "@langchain/community/vectorstores/hnswlib";
import { PromptTemplate, FewShotPromptTemplate } from "@langchain/core/prompts";
import { SemanticSimilarityExampleSelector } from "@langchain/core/example_selectors";
// Create a prompt template that will be used to format the examples.
const examplePrompt = PromptTemplate.fromTemplate(
  "Input: {input}\nOutput: {output}"
);
// Create a SemanticSimilarityExampleSelector that will be used to select the examples.
const exampleSelector = await SemanticSimilarityExampleSelector.fromExamples(
  [
    { input: "happy", output: "sad" },
    { input: "tall", output: "short" },
    { input: "energetic", output: "lethargic" },
    { input: "sunny", output: "gloomy" },
    { input: "windy", output: "calm" },
  ],
  new OpenAIEmbeddings(),
  HNSWLib,
  { k: 1 }
);
// Create a FewShotPromptTemplate that will use the example selector.
const dynamicPrompt = new FewShotPromptTemplate({
  // We provide an ExampleSelector instead of examples.
  exampleSelector,
  examplePrompt,
  prefix: "Give the antonym of every input",
  suffix: "Input: {adjective}\nOutput:",
  inputVariables: ["adjective"],
});
// Input is about the weather, so it should select, e.g., the sunny/gloomy example
console.log(await dynamicPrompt.format({ adjective: "rainy" }));
/*
  Give the antonym of every input
  Input: sunny
  Output: gloomy
  Input: rainy
  Output:
*/
// Input is a measurement, so should select the tall/short example
console.log(await dynamicPrompt.format({ adjective: "large" }));
/*
  Give the antonym of every input
  Input: tall
  Output: short
  Input: large
  Output:
*/
API Reference:
- OpenAIEmbeddings from @langchain/openai
- HNSWLib from @langchain/community/vectorstores/hnswlib
- PromptTemplate from @langchain/core/prompts
- FewShotPromptTemplate from @langchain/core/prompts
- SemanticSimilarityExampleSelector from @langchain/core/example_selectors
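To make the role of the example fields concrete, here is a simplified sketch of how selected examples might be combined with the prefix and suffix into a final prompt. The helpers fillTemplate and assemblePrompt are hypothetical, and joining the parts with a blank line is an assumption made for illustration; this is not the FewShotPromptTemplate implementation.

```typescript
type Example = Record<string, string>;

// Replace each {name} placeholder with the matching field.
// Fails loudly when an example lacks a field the template needs.
function fillTemplate(template: string, values: Example): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    const value = values[key];
    if (value === undefined) {
      throw new Error(`Missing field "${key}"`);
    }
    return value;
  });
}

// Assumption: parts are joined with a blank line between them.
function assemblePrompt(
  prefix: string,
  exampleTemplate: string,
  examples: Example[],
  suffixTemplate: string,
  input: Example,
  separator = "\n\n"
): string {
  const formattedExamples = examples.map((example) =>
    fillTemplate(exampleTemplate, example)
  );
  return [
    prefix,
    ...formattedExamples,
    fillTemplate(suffixTemplate, input),
  ].join(separator);
}

const prompt = assemblePrompt(
  "Give the antonym of every input",
  "Input: {input}\nOutput: {output}",
  [{ input: "sunny", output: "gloomy" }],
  "Input: {adjective}\nOutput:",
  { adjective: "rainy" }
);
console.log(prompt);
```

The sketch shows why each example must contain every field the example prompt references: a missing field leaves a placeholder that cannot be filled.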
By default, each field in the examples object is concatenated together, embedded, and stored in the vectorstore for later similarity search against user queries.
If you only want to embed specific keys (e.g., you only want to search for examples whose query is similar to the one the user provides), you can pass an inputKeys array in the final options parameter.
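The effect of inputKeys on the embedded text can be illustrated with a small sketch. The textToEmbed helper is hypothetical, and the choice to join values with a single space in sorted key order is an assumption; check the library source if the exact string matters for your use case.

```typescript
type Example = Record<string, string>;

// Build the text that would be embedded for one example.
// Assumption: the selected values are joined with a single space,
// in sorted key order. With no inputKeys, every field is used.
function textToEmbed(example: Example, inputKeys?: string[]): string {
  const keys = inputKeys ?? Object.keys(example);
  return [...keys]
    .sort()
    .map((key) => example[key])
    .filter((value): value is string => value !== undefined)
    .join(" ");
}

const example: Example = { query: "healthy food", output: "galbi" };

console.log(textToEmbed(example)); // "galbi healthy food"
console.log(textToEmbed(example, ["query"])); // "healthy food"
```

Restricting the embedded text to the query keeps the output from influencing which examples are considered similar to a user's question.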
Loading from an existing vectorstore
You can also use a pre-initialized vector store by passing an instance to the SemanticSimilarityExampleSelector constructor
directly, as shown below. You can also add more examples via the addExample method:
// Ephemeral, in-memory vector store for demo purposes
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import { PromptTemplate, FewShotPromptTemplate } from "@langchain/core/prompts";
import { SemanticSimilarityExampleSelector } from "@langchain/core/example_selectors";
const embeddings = new OpenAIEmbeddings();
const memoryVectorStore = new MemoryVectorStore(embeddings);
const examples = [
  {
    query: "healthy food",
    output: `galbi`,
  },
  {
    query: "healthy food",
    output: `schnitzel`,
  },
  {
    query: "foo",
    output: `bar`,
  },
];
const exampleSelector = new SemanticSimilarityExampleSelector({
  vectorStore: memoryVectorStore,
  k: 2,
  // Only embed the "query" key of each example
  inputKeys: ["query"],
});
for (const example of examples) {
  // Format and add an example to the underlying vector store
  await exampleSelector.addExample(example);
}
// Create a prompt template that will be used to format the examples.
const examplePrompt = PromptTemplate.fromTemplate(`<example>
  <user_input>
    {query}
  </user_input>
  <output>
    {output}
  </output>
</example>`);
// Create a FewShotPromptTemplate that will use the example selector.
const dynamicPrompt = new FewShotPromptTemplate({
  // We provide an ExampleSelector instead of examples.
  exampleSelector,
  examplePrompt,
  prefix: `Answer the user's question, using the below examples as reference:`,
  suffix: "User question: {query}",
  inputVariables: ["query"],
});
const formattedValue = await dynamicPrompt.format({
  query: "What is a healthy food?",
});
console.log(formattedValue);
/*
Answer the user's question, using the below examples as reference:
<example>
  <user_input>
    healthy food
  </user_input>
  <output>
    galbi
  </output>
</example>
<example>
  <user_input>
    healthy food
  </user_input>
  <output>
    schnitzel
  </output>
</example>
User question: What is a healthy food?
*/
const model = new ChatOpenAI({});
const chain = dynamicPrompt.pipe(model);
const result = await chain.invoke({ query: "What is a healthy food?" });
console.log(result);
/*
  AIMessage {
    content: 'A healthy food can be galbi or schnitzel.',
    additional_kwargs: { function_call: undefined }
  }
*/
API Reference:
- MemoryVectorStore from langchain/vectorstores/memory
- OpenAIEmbeddings from @langchain/openai
- ChatOpenAI from @langchain/openai
- PromptTemplate from @langchain/core/prompts
- FewShotPromptTemplate from @langchain/core/prompts
- SemanticSimilarityExampleSelector from @langchain/core/example_selectors
Metadata filtering
When adding examples, each field is available as metadata in the produced document. If you would like further control over your
search space, you can add extra fields to your examples and pass a filter parameter when initializing your selector:
// Ephemeral, in-memory vector store for demo purposes
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import { PromptTemplate, FewShotPromptTemplate } from "@langchain/core/prompts";
import { Document } from "@langchain/core/documents";
import { SemanticSimilarityExampleSelector } from "@langchain/core/example_selectors";
const embeddings = new OpenAIEmbeddings();
const memoryVectorStore = new MemoryVectorStore(embeddings);
const examples = [
  {
    query: "healthy food",
    output: `lettuce`,
    food_type: "vegetable",
  },
  {
    query: "healthy food",
    output: `schnitzel`,
    food_type: "veal",
  },
  {
    query: "foo",
    output: `bar`,
    food_type: "baz",
  },
];
const exampleSelector = new SemanticSimilarityExampleSelector({
  vectorStore: memoryVectorStore,
  k: 2,
  // Only embed the "query" key of each example
  inputKeys: ["query"],
  // Filter type will depend on your specific vector store.
  // See the section of the docs for the specific vector store you are using.
  filter: (doc: Document) => doc.metadata.food_type === "vegetable",
});
for (const example of examples) {
  // Format and add an example to the underlying vector store
  await exampleSelector.addExample(example);
}
// Create a prompt template that will be used to format the examples.
const examplePrompt = PromptTemplate.fromTemplate(`<example>
  <user_input>
    {query}
  </user_input>
  <output>
    {output}
  </output>
</example>`);
// Create a FewShotPromptTemplate that will use the example selector.
const dynamicPrompt = new FewShotPromptTemplate({
  // We provide an ExampleSelector instead of examples.
  exampleSelector,
  examplePrompt,
  prefix: `Answer the user's question, using the below examples as reference:`,
  suffix: "User question:\n{query}",
  inputVariables: ["query"],
});
const model = new ChatOpenAI({});
const chain = dynamicPrompt.pipe(model);
const result = await chain.invoke({
  query: "What is exactly one type of healthy food?",
});
console.log(result);
/*
  AIMessage {
    content: 'One type of healthy food is lettuce.',
    additional_kwargs: { function_call: undefined }
  }
*/
API Reference:
- MemoryVectorStore from langchain/vectorstores/memory
- OpenAIEmbeddings from @langchain/openai
- ChatOpenAI from @langchain/openai
- PromptTemplate from @langchain/core/prompts
- FewShotPromptTemplate from @langchain/core/prompts
- Document from @langchain/core/documents
- SemanticSimilarityExampleSelector from @langchain/core/example_selectors
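The interaction between a metadata filter and the k limit can be sketched without a real vector store. The Doc and ScoredDoc types, the filteredTopK helper, and the precomputed scores below are hypothetical; the sketch assumes the filter is applied before ranking, which may differ between vector stores.

```typescript
interface Doc {
  pageContent: string;
  metadata: Record<string, string>;
}

interface ScoredDoc {
  doc: Doc;
  score: number; // Precomputed similarity to the query, for illustration.
}

type DocFilter = (doc: Doc) => boolean;

// Drop documents that fail the filter, then return the k highest-scoring survivors.
function filteredTopK(
  candidates: ScoredDoc[],
  k: number,
  filter?: DocFilter
): Doc[] {
  return candidates
    .filter(({ doc }) => (filter ? filter(doc) : true))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(({ doc }) => doc);
}

const candidates: ScoredDoc[] = [
  {
    doc: {
      pageContent: "healthy food",
      metadata: { output: "lettuce", food_type: "vegetable" },
    },
    score: 0.92,
  },
  {
    doc: {
      pageContent: "healthy food",
      metadata: { output: "schnitzel", food_type: "veal" },
    },
    score: 0.9,
  },
  {
    doc: { pageContent: "foo", metadata: { output: "bar", food_type: "baz" } },
    score: 0.1,
  },
];

const vegetablesOnly = filteredTopK(
  candidates,
  2,
  (doc) => doc.metadata.food_type === "vegetable"
);
console.log(vegetablesOnly.map((doc) => doc.metadata.output)); // [ 'lettuce' ]
```

Note that even with k set to 2, only one example is returned, because only one document passes the filter.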
Custom vectorstore retrievers
You can also pass a vectorstore retriever instead of a vectorstore. This is useful if you want to use a retrieval method other than plain similarity search, such as maximal marginal relevance (MMR):
/* eslint-disable @typescript-eslint/no-non-null-assertion */
// Requires a vectorstore that supports maximal marginal relevance search
import { Pinecone } from "@pinecone-database/pinecone";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import { PineconeStore } from "@langchain/pinecone";
import { PromptTemplate, FewShotPromptTemplate } from "@langchain/core/prompts";
import { SemanticSimilarityExampleSelector } from "@langchain/core/example_selectors";
const pinecone = new Pinecone();
const pineconeIndex = pinecone.Index(process.env.PINECONE_INDEX!);
/**
 * Pinecone allows you to partition the records in an index into namespaces.
 * Queries and other operations are then limited to one namespace,
 * so different requests can search different subsets of your index.
 * Read more about namespaces here: https://docs.pinecone.io/guides/indexes/use-namespaces
 *
 * NOTE: If you have namespace enabled in your Pinecone index, you must provide the namespace when creating the PineconeStore.
 */
const namespace = "pinecone";
const pineconeVectorstore = await PineconeStore.fromExistingIndex(
  new OpenAIEmbeddings(),
  { pineconeIndex, namespace }
);
const pineconeMmrRetriever = pineconeVectorstore.asRetriever({
  searchType: "mmr",
  k: 2,
});
const examples = [
  {
    query: "healthy food",
    output: `lettuce`,
    food_type: "vegetable",
  },
  {
    query: "healthy food",
    output: `schnitzel`,
    food_type: "veal",
  },
  {
    query: "foo",
    output: `bar`,
    food_type: "baz",
  },
];
const exampleSelector = new SemanticSimilarityExampleSelector({
  vectorStoreRetriever: pineconeMmrRetriever,
  // Only embed the "query" key of each example
  inputKeys: ["query"],
});
for (const example of examples) {
  // Format and add an example to the underlying vector store
  await exampleSelector.addExample(example);
}
// Create a prompt template that will be used to format the examples.
const examplePrompt = PromptTemplate.fromTemplate(`<example>
  <user_input>
    {query}
  </user_input>
  <output>
    {output}
  </output>
</example>`);
// Create a FewShotPromptTemplate that will use the example selector.
const dynamicPrompt = new FewShotPromptTemplate({
  // We provide an ExampleSelector instead of examples.
  exampleSelector,
  examplePrompt,
  prefix: `Answer the user's question, using the below examples as reference:`,
  suffix: "User question:\n{query}",
  inputVariables: ["query"],
});
const model = new ChatOpenAI({});
const chain = dynamicPrompt.pipe(model);
const result = await chain.invoke({
  query: "What is exactly one type of healthy food?",
});
console.log(result);
/*
  AIMessage {
    content: 'lettuce.',
    additional_kwargs: { function_call: undefined }
  }
*/
API Reference:
- OpenAIEmbeddings from @langchain/openai
- ChatOpenAI from @langchain/openai
- PineconeStore from @langchain/pinecone
- PromptTemplate from @langchain/core/prompts
- FewShotPromptTemplate from @langchain/core/prompts
- SemanticSimilarityExampleSelector from @langchain/core/example_selectors
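For intuition about how maximal marginal relevance differs from plain similarity search, here is a generic sketch of the technique on toy vectors. It is not the Pinecone or LangChain implementation; the mmr helper, the lambda weighting parameter, and the vectors are assumptions chosen for illustration.

```typescript
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy MMR: repeatedly pick the candidate that maximizes
//   lambda * sim(query, candidate) - (1 - lambda) * max sim(candidate, already selected).
// lambda = 1 reduces to plain similarity ranking; lower values favor diversity.
function mmr(
  query: number[],
  candidates: number[][],
  k: number,
  lambda = 0.5
): number[] {
  const selected: number[] = [];
  const remaining = candidates.map((_, index) => index);
  while (selected.length < k && remaining.length > 0) {
    let bestPos = 0;
    let bestScore = -Infinity;
    remaining.forEach((candidateIdx, pos) => {
      const relevance = cosine(query, candidates[candidateIdx]);
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(
              ...selected.map((s) =>
                cosine(candidates[candidateIdx], candidates[s])
              )
            );
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestPos = pos;
      }
    });
    selected.push(remaining.splice(bestPos, 1)[0]);
  }
  return selected; // Indices into `candidates`, in selection order.
}

const query = [1, 0];
const vectors = [
  [1, 0.1], // Most relevant
  [1, 0.12], // Near-duplicate of the first
  [0.5, -0.85], // Less relevant, but different
];

console.log(mmr(query, vectors, 2, 1)); // [ 0, 1 ]
console.log(mmr(query, vectors, 2, 0.5)); // [ 0, 2 ]
```

With lambda set to 1 the near-duplicate is returned second, while a balanced weighting skips it in favor of a more diverse result, which is the behavior an MMR retriever aims for when selecting few-shot examples.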
Next steps
You've now learned a bit about using similarity in an example selector.
Next, check out this guide on how to use a length-based example selector.