Live at: http://217.154.38.177:8501/
Smart ChatBot
This is a simple Streamlit ChatBot app powered by OpenAI, LangChain, RAG (Retrieval-Augmented Generation), and FAISS for vector storage.
The app allows users to upload a text-based PDF document and ask natural language questions related to the content of the uploaded file.
Features
- Ask questions about any uploaded PDF document
- Uses OpenAI (gpt-3.5-turbo) to generate smart answers
- Powered by LangChain’s RAG framework and FAISS vector store
- Dynamically parses PDFs and generates embeddings
- Retrieval-based contextual answering with document chunking
Tech Stack
- Streamlit – UI frontend
- LangChain – Chain and retrieval logic
- OpenAI – LLM for answer generation
- FAISS – Local in-memory vector storage
- PDF document parsing via PyPDFLoader
Folder Structure
your-repo/
├── app.py # Main Streamlit app file
└── README.md # Project documentation
How It Works
- Upload a PDF document (text-based only).
- It gets split into chunks using LangChain’s RecursiveCharacterTextSplitter.
- Each chunk is converted into a vector using OpenAI Embeddings (text-embedding-3-small or similar).
- Vectors are stored in a temporary FAISS index.
- At query time, the most relevant chunks are retrieved and passed as context to GPT.
- GPT returns an answer grounded in the uploaded document.
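The steps above can be illustrated with a small, dependency-free sketch. It is not the code in app.py: word counts stand in for OpenAI embeddings, a sorted list stands in for the FAISS index, fixed-size chunks stand in for RecursiveCharacterTextSplitter, and all function names here are hypothetical. It only shows the shape of the chunk, embed, retrieve, and prompt flow.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Fixed-size character chunks with overlap (a simplification of recursive splitting)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption: stands in for OpenAI embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (the role FAISS plays in the app)."""
    query_vector = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the retrieved chunks and the question into a single prompt for the LLM."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

A typical call chain would be split_into_chunks on the extracted PDF text, then retrieve with the user's question, then build_prompt on the result; the prompt is what would be sent to the model.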
Getting Started
1. Clone the repository
git clone https://github.com/siddharthsingh5010/pdf_rag_chatbot
cd pdf_rag_chatbot
2. Install dependencies
Create a virtual environment and install required packages:
pip install -r requirements.txt
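3. Set your OpenAI API key and build the image
Export the key in your terminal session, or add the export to your shell config (~/.bashrc, ~/.zshrc, etc.). The OpenAI client reads it from the OPENAI_API_KEY environment variable by default.
Then build the Docker image:
docker build -t pdf_rag_app .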
4. Run the app
docker run -p 8501:8501 pdf_rag_app
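A missing key usually only shows up when the first question is asked, so a fail-fast check at startup can save some confusion. The sketch below is an assumption, not part of app.py: require_api_key is a hypothetical helper, and OPENAI_API_KEY is assumed because it is the variable the OpenAI Python client reads by default (this README does not name one).

```python
import os


def require_api_key(env=None, name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or raise with a clear message."""
    # Allow a plain dict to be passed in for testing; default to the real environment.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the app.")
    return key
```

When the app runs inside Docker, a variable exported on the host is not visible in the container unless it is passed in, for example with the -e/--env option of docker run.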
Notes
- This app only supports text-based PDFs (not scanned images).
- For best performance, make sure your OpenAI API key has access to gpt-3.5-turbo.
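Scanned PDFs tend to produce empty or near-empty text when parsed, so a rough check can warn the user before any embeddings are generated. This is a heuristic sketch: looks_text_based is a hypothetical helper, the 20-character threshold is arbitrary, and page_texts is assumed to be the page_content strings of the documents returned by PyPDFLoader.

```python
def looks_text_based(page_texts: list[str], min_chars_per_page: int = 20) -> bool:
    """Heuristic: True if pages average at least min_chars_per_page non-whitespace characters."""
    if not page_texts:
        return False
    # Count characters after removing all whitespace, so blank pages contribute zero.
    total = sum(len("".join(text.split())) for text in page_texts)
    return total / len(page_texts) >= min_chars_per_page
```

If the check returns False, the app could show a message asking for a text-based PDF instead of building an empty index.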
License
MIT License
Author
Siddharth Singh
