Create your own Notion Chatbot with LangChain, OpenAI and Streamlit

Logan Vendrix
12 min read · Aug 31, 2023


Introduction

Hey! My name is Logan Vendrix and I’m a Data Scientist at Arinti, an AI company based in Belgium. Lately, I have been focusing on LLMs projects, and it’s been fascinating. I have learned a lot thanks to people sharing their knowledge online, and now, I want to give back by teaching you how to build your own Notion chatbot using LangChain, OpenAI, FAISS, and Streamlit!

All project files are available on my GitHub!

Why a Notion Chatbot?

Chatbot embedded in Notion

Problem

At my company, we have all our company content in Notion. As we have a lot of documentation, it can be hard to find the right information quickly. So I built a Notion chatbot to make it easier to find information, using the latest AI technologies.

Solution

First, we use LangChain to load and split the Notion content, which we then convert to vectors using OpenAI embeddings, and store them in a vector database (FAISS in this case). Using LangChain, we build a Conversational Retrieval Chain to link our vector database and OpenAI GPT to answer the user’s question. The answer is based on the most relevant information found in our Notion content. We improve the chatbot by creating a custom system prompt and adding memory to it. Finally, we create a chat interface with Streamlit and embed the app directly in our Notion!

Roadmap

Step 1: Project structure and initialization

We look at the project structure and install the required dependencies. We also retrieve our OpenAI API key and duplicate a public Notion page that will serve as the base for this project.

Step 2: Document Ingestion

We convert all the content from our Notion pages into numerical representations (vectors). Because LLMs like GPT can’t handle very long text, we first need to split the text content into smaller text chunks. We’ll use LangChain for that. Once split, we use OpenAI’s embedding model to convert them into vectors. Finally, we store the vectors in a vector database.

Step 3: Query

The user enters a question, which we convert into a vector using the same embedding model. We use this vector to look for similar vectors in the vector database created earlier. The relevant text content is finally passed along with the user’s question to OpenAI GPT to create an answer.

For a better chat experience, we add memory to our chatbot by storing previous messages in a chat history. During the conversation, the chatbot has access to it.

Step 4: Chatbot application

We create the chatbot interface using Streamlit, which lets us easily build a beautiful chat application that we can then deploy online. Finally, we embed the app directly into our Notion page.

This tutorial is inspired by one from Harrison Chase (LangChain’s founder) on how to chat with Notion content using LangChain. We improve on his implementation by

  • Using specific markdown characters for better content splitting
  • Adding memory to the bot
  • Using Streamlit as our front-end chat application using the new chat features
  • Embedding the Streamlit chat application into a Notion page

Tutorial

High-level overview of the project

1. Project structure and initialization

1.1 Project structure

Project structure

The structure of the project notion-chatbot consists of

  • .streamlit/secrets.toml ⇒ to store your OpenAI API key
  • faiss_index ⇒ FAISS index (vector database) where all the vectors are stored
  • notion_content ⇒ folder containing our Notion content, in markdown files
  • .gitignore ⇒ to ignore your OpenAI API key and Notion content
  • app.py ⇒ script of the Streamlit chat application
  • ingest.py ⇒ script to convert Notion content to vectors and store those in a vector index
  • utils.py ⇒ script to create a Conversational Retrieval Chain
  • requirements.txt ⇒ necessary packages to deploy to Streamlit Community Cloud

We’ll create each file step-by-step during this tutorial, so no need to create them all at once.

Now, let’s initialize the project!

1.2 Project initialization

  • Start by creating a project folder notion-chatbot
  • Create a new environment and install the required dependencies
pip install streamlit langchain openai tiktoken faiss-cpu
  • Create the .gitignore file to specify which files not to track
# .gitignore

notion_content/
.streamlit/
  • Create the folder .streamlit
  • In .streamlit, create the file secrets.toml to store your OpenAI API key as follows
# secrets.toml

OPENAI_API_KEY = 'sk-A1B2C3D4E5F6G7H8I9J'
  • For this tutorial, we’ll use the Blendle Employee Handbook as our knowledge base
  • No Notion account? Head to their website and create one. It’s free!
  • Select Duplicate in the top-right corner to duplicate the handbook into your own Notion workspace
Duplication of a Notion page

2. Document ingestion

2.1 Export your Notion content

  • Head to the main Notion page of the Blendle Employee Handbook
  • In the top-right corner, click on the three dots
  • Choose Export
  • Select Markdown and CSV for Export Format
  • Select Include subpages
  • Save your file as notion_content.zip
  • Unzip your folder
  • Place the content folder notion_content into your project folder notion-chatbot

It’s also possible to use Notion’s API to get your Notion content; for the simplicity of this tutorial, we export the content manually.
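If you’d like to explore the API route, below is a minimal, illustrative sketch using the official notion-client Python SDK. Note that the NOTION_TOKEN environment variable and the page ID are placeholders: you’d need to create an internal integration and share the page with it.

# Illustrative sketch only: fetching a page's blocks via Notion's API
# Requires: pip install notion-client
import os
from notion_client import Client

notion = Client(auth=os.environ["NOTION_TOKEN"])  # placeholder token

# The page ID is the 32-character string at the end of your Notion page URL
blocks = notion.blocks.children.list(block_id="your-page-id")
for block in blocks["results"]:
    print(block["type"])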

Great! Now you should have all the Notion content as .md files in the folder notion_content, within your project folder notion-chatbot.

Exported Notion content as .md files

2.2 Convert Notion content to vectors

Document ingestion flow

To use the content of our Notion page as the knowledge base of our chatbot, we need to convert all the content into vectors and store them.

For this, we’ll use LangChain, the OpenAI embedding model, and FAISS.
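If you’re curious what such a vector looks like, here’s a tiny sketch you can try first (assuming OPENAI_API_KEY is set as an environment variable; the question is made up). Each piece of text becomes a long list of floats.

# Tiny sketch: what an embedding looks like
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()  # reads OPENAI_API_KEY from the environment
vector = embeddings.embed_query("How many vacation days do I get?")
print(len(vector))  # 1536 dimensions with the default text-embedding-ada-002 model
print(vector[:5])   # first few floats of the vector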

Open your project folder in your favorite IDE and create the file ingest.py

# ingest.py

import streamlit as st
import openai
from langchain.document_loaders import NotionDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Load OpenAI API key
openai.api_key = st.secrets["OPENAI_API_KEY"]

# Load the Notion content located in the folder 'notion_content'
loader = NotionDirectoryLoader("notion_content")
documents = loader.load()

# Split the Notion content into smaller chunks
markdown_splitter = RecursiveCharacterTextSplitter(
    separators=["#", "##", "###", "\n\n", "\n", "."],
    chunk_size=1500,
    chunk_overlap=100)
docs = markdown_splitter.split_documents(documents)

# Initialize OpenAI embedding model
embeddings = OpenAIEmbeddings()

# Convert all chunks into vector embeddings using the OpenAI embedding model
# Store all vectors in FAISS index and save to local folder 'faiss_index'
db = FAISS.from_documents(docs, embeddings)
db.save_local("faiss_index")

print('Local FAISS index has been successfully saved.')

Let’s go over the code:

  1. We load the OpenAI API key stored in .streamlit/secrets.toml
  2. We load the Notion content located in the notion_content folder using NotionDirectoryLoader
  3. We split the content into smaller text chunks using RecursiveCharacterTextSplitter. There are different ways to split text. As our Notion content consists of markdown files with headings (# for H1, ## for H2, ### for H3), we choose to split on those specific characters. This ensures that we split the content between paragraphs, and not between sentences of the same paragraph. If the split can’t be done on headings, the splitter falls back to the characters '\n\n', '\n', and '.' that separate paragraphs and sentences. RecursiveCharacterTextSplitter follows the order of the list of separators you provide, moving to the next one in the list until the chunks are small enough (see the toy sketch below). We choose a chunk size of 1500 with an overlap of 100. Experiment with different values to see what works best for your project.
  4. We convert each text chunk into a vector using the OpenAI embedding model
  5. We store all the vectors in a FAISS index
Splitting the text on headings
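If you want to see the splitter’s behavior in isolation (as mentioned in point 3 above), here’s a toy sketch with a made-up markdown string; it prints each chunk so you can see where the cuts happen.

# Toy sketch: how RecursiveCharacterTextSplitter cuts markdown text
from langchain.text_splitter import RecursiveCharacterTextSplitter

sample = "# Handbook\n\nIntro paragraph.\n\n## Vacation\n\nTake at least 20 days off.\n\n## Perks\n\nFree books for everyone."
splitter = RecursiveCharacterTextSplitter(
    separators=["#", "##", "###", "\n\n", "\n", "."],
    chunk_size=40,
    chunk_overlap=0)
for chunk in splitter.split_text(sample):
    print(repr(chunk))

Once ingest.py is ready, run it once from your project folder to build and save the index

python ingest.py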

Nice, that was easy! Now that we have converted our Notion content to vectors and stored them in a vector database, let’s see how we can interact with them!

3. Query

3.1 Flow

Query flow
  1. A chat history is initially created. Whenever the user asks a question or the chatbot gives an answer, we store these messages in the chat history. As the conversation grows, the chatbot keeps track of the previous messages. This is the memory of our chatbot!
  2. The user writes a question. The question is immediately stored in the chat history.
  3. We combine the question and chat history into a stand-alone question.
  4. We convert the stand-alone question into a vector. We use the same embedding model we used in the Document Ingestion phase.
  5. We pass the vector to the vector database and perform a similarity search (also known as vector search). To explain it simply: given a set of vectors (in red) and a query vector (the user’s question, in blue), we need to find the vectors most similar to the question. In the example below, we show a nearest-neighbor search, which looks for the 3 vectors closest to the query vector (you can also try this yourself; see the sketch after this list).
  6. Once the most similar vectors are found, we pass the Notion content linked to these vectors, together with the stand-alone question, to GPT.
  7. GPT formulates an answer following the guidelines of a system prompt.
  8. The chatbot replies to the user with the answer from GPT.
  9. We pass the chatbot’s answer to the chat history.
  10. Repeat!
Similarity search with K-Nearest Neighbor Search (k=3)
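You can try this similarity search yourself against the index we built in step 2. Here’s a minimal sketch (the question is just an example):

# Minimal sketch: querying the saved FAISS index directly
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

db = FAISS.load_local("faiss_index", OpenAIEmbeddings())
docs = db.similarity_search("How many vacation days do I get?", k=3)
for doc in docs:
    print(doc.metadata, doc.page_content[:100])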

Make sense? Nice! Now, let’s code this!

3.2 Query

First, we’ll create a LangChain Conversational Retrieval Chain, which will be the brain of our application. We’ll create the file utils.py with a function load_chain() that returns a Conversational Retrieval Chain.

# utils.py

import streamlit as st
import openai
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferWindowMemory
from langchain.chat_models import ChatOpenAI
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import PromptTemplate
from langchain.prompts.chat import SystemMessagePromptTemplate

# Load OpenAI API key
openai.api_key = st.secrets["OPENAI_API_KEY"]

@st.cache_resource
def load_chain():
    """
    The `load_chain()` function initializes and configures a conversational retrieval chain for
    answering user questions.
    :return: The `load_chain()` function returns a ConversationalRetrievalChain object.
    """
    # Load OpenAI embedding model
    embeddings = OpenAIEmbeddings()

    # Load OpenAI chat model
    llm = ChatOpenAI(temperature=0)

    # Load our local FAISS index as a retriever
    vector_store = FAISS.load_local("faiss_index", embeddings)
    retriever = vector_store.as_retriever(search_kwargs={"k": 3})

    # Create memory 'chat_history'
    memory = ConversationBufferWindowMemory(k=3, memory_key="chat_history")

    # Create system prompt
    template = """
You are an AI assistant for answering questions about the Blendle Employee Handbook.
You are given the following extracted parts of a long document and a question. Provide a conversational answer.
If you don't know the answer, just say 'Sorry, I don't know... 😔'.
Don't try to make up an answer.
If the question is not about the Blendle Employee Handbook, politely inform them that you are tuned to only answer questions about the Blendle Employee Handbook.

{context}
Question: {question}
Helpful Answer:"""

    # Create the Conversational Retrieval Chain
    chain = ConversationalRetrievalChain.from_llm(llm=llm,
                                                  retriever=retriever,
                                                  memory=memory,
                                                  get_chat_history=lambda h: h,
                                                  verbose=True)

    # Add the system prompt to the chain
    # Can only add it at the end for ConversationalRetrievalChain
    QA_CHAIN_PROMPT = PromptTemplate(input_variables=["context", "question"], template=template)
    chain.combine_docs_chain.llm_chain.prompt.messages[0] = SystemMessagePromptTemplate(prompt=QA_CHAIN_PROMPT)

    return chain

Let’s go over the code:

  1. We load the OpenAI API key stored in .streamlit/secrets.toml
  2. We create the function load_chain() that returns a Conversational Retrieval Chain object. We use st.cache_resource to make our application more efficient: load_chain() runs only once at the very beginning, and the result is stored in a local cache. On later reruns, Streamlit skips its execution and reuses the cached chain. Very convenient!
  3. We load the OpenAI embedding model that will convert the user’s queries into vectors.
  4. We load the OpenAI chat model that will generate the answers. To do so, it will use the stand-alone question (combining the user’s question and chat history) and relevant documents. We specify a temperature of 0, meaning that the model will always select the highest probability word. A higher temperature means that the model might select a word with a slightly lower probability, leading to more variation, randomness, and creativity. Play with it to see what works best for you.
  5. We load our local FAISS index as a retriever, meaning that the chain will use it to search for relevant information. We define k=3, meaning that we’ll look for the 3 most relevant documents in the vector database.
  6. We create the memory of our chatbot using ConversationBufferWindowMemory. We define k=3, meaning the chatbot will look at the last 3 interactions when creating the stand-alone question. This keeps a sliding window of the most recent interactions, so the buffer does not grow too large (see the sketch after this list).
  7. We create the system prompt, which acts as the guidelines for our chatbot. We specify how the chatbot should behave, what it should do when it cannot find an answer, or when the user’s question is not within its scope.
  8. We finally create the chain using ConversationalRetrievalChain, linking all the previous elements together. We set verbose=True to see what’s happening under the hood when running the chain. This makes it easier to see what information the chatbot uses to answer the user’s questions.
  9. We add the system prompt to the chain. Currently, it appears that we can only add it after defining the chain when using ConversationalRetrievalChain.from_llm.
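To see the sliding-window memory from point 6 on its own, here’s a small sketch with made-up messages; with k=3, only the last 3 interactions are kept in chat_history.

# Small sketch: ConversationBufferWindowMemory keeps only the last k interactions
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=3, memory_key="chat_history")
memory.save_context({"input": "Hi!"}, {"output": "Hello! How can I help?"})
memory.save_context({"input": "How many vacation days do I get?"},
                    {"output": "Blendle asks you to take at least 20 days off."})
print(memory.load_memory_variables({})["chat_history"])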

4. Chatbot application

4.1 Streamlit application

Now that the brain of our chatbot is built, let’s put it all in a Streamlit application!

# app.py

import time
import streamlit as st
from utils import load_chain

# Custom image for the app icon and the assistant's avatar
company_logo = 'https://www.app.nl/wp-content/uploads/2019/01/Blendle.png'

# Configure Streamlit page
st.set_page_config(
    page_title="Your Notion Chatbot",
    page_icon=company_logo
)

# Initialize LLM chain in session_state
if 'chain' not in st.session_state:
    st.session_state['chain'] = load_chain()

# Initialize chat history
if 'messages' not in st.session_state:
    # Start with first message from assistant
    st.session_state['messages'] = [{"role": "assistant",
                                     "content": "Hi human! I am Blendle's smart AI. How can I help you today?"}]

# Display chat messages from history on app rerun
# Custom avatar for the assistant, default avatar for user
for message in st.session_state.messages:
    if message["role"] == 'assistant':
        with st.chat_message(message["role"], avatar=company_logo):
            st.markdown(message["content"])
    else:
        with st.chat_message(message["role"]):
            st.markdown(message["content"])

# Chat logic
if query := st.chat_input("Ask me anything"):
    # Add user message to chat history
    st.session_state.messages.append({"role": "user", "content": query})
    # Display user message in chat message container
    with st.chat_message("user"):
        st.markdown(query)

    with st.chat_message("assistant", avatar=company_logo):
        message_placeholder = st.empty()
        # Send user's question to our chain
        result = st.session_state['chain']({"question": query})
        response = result['answer']
        full_response = ""

        # Simulate stream of response with milliseconds delay
        for chunk in response.split():
            full_response += chunk + " "
            time.sleep(0.05)
            # Add a blinking cursor to simulate typing
            message_placeholder.markdown(full_response + "▌")
        message_placeholder.markdown(full_response)

    # Add assistant message to chat history
    st.session_state.messages.append({"role": "assistant", "content": response})

Let’s go over the code:

  1. From utils.py, we import load_chain() that loads the Conversational Retrieval Chain we created earlier.
  2. We load an image from a URL, which we use as the app’s page icon and as the assistant’s avatar in the chat application.
  3. We initialize our chain in the session state.
  4. We initialize the chat history in the session state. We also include a first message from the assistant, welcoming the user.
  5. We display all the messages of the chat history. We specify a custom avatar for the assistant, while the user’s avatar is the default one.
  6. We create the chat logic. It
  • receives the query from the user and stores it in the chat history
  • displays the user’s query in the chat
  • passes the user’s query to our chain st.session_state['chain']({"question": query})
  • gets a response back
  • displays the response in the chat, and simulates a human-speed response by slowing down the typing speed
  • stores the response in the chat history
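
Before deploying, you can test everything locally by running the app from your project folder

streamlit run app.py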

4.2 Deployment on Streamlit Cloud

When everything has been tested and you are happy with your chatbot, it’s time to go live! To deploy it on Streamlit Community Cloud:

  • Create a requirements.txt file to store the required dependencies
# requirements.txt

openai
langchain
faiss-cpu
tiktoken
  • Deploy the app and click on Advanced settings. There, specify your Python version and your OpenAI API key (same as in your local secrets.toml file)
Deploy your app to Streamlit

4.3 Embed your Streamlit app in Notion

Embed your Streamlit app in Notion
  • Once your app has been successfully deployed, copy the URL of your app
  • Head to your Notion page
  • Embed your app by selecting Embed in the block options
  • Paste your app’s URL and click on Embed link
  • Voilà! Have fun interacting with your content using your new Notion chatbot!

Congratulations!

In this tutorial, you have learned to

  • Convert your Notion content to vectors using the OpenAI embedding model and store them in a FAISS index
  • Build a Conversational Retrieval Chain using LangChain, with a custom prompt and memory
  • Build and deploy a Streamlit chat application using its latest chat features
  • Embed a Streamlit chat application into your Notion page

If you have any questions, please post them in the comments below. And if you want to learn more about AI and LLMs, let’s connect on LinkedIn!
