# Cache Management Overview
Cache Management is one of the most powerful features of the HFIM Admin Panel. This page explains what caching is, why it matters, and how to use it effectively.
## What is Cache Management?
Cache Management lets you create and maintain a library of pre-written question-and-answer pairs. When users ask questions similar to your cached entries, the chatbot returns the cached answer immediately without searching through documents.
### Think of It Like...
Imagine the chatbot is a librarian:
- Without cache: Every time someone asks "Where is the circulation desk?", the librarian searches through maps, directories, and signs before answering.
- With cache: The librarian memorizes "The circulation desk is on the first floor near the entrance" and immediately answers without searching.
Cache entries are like the librarian's memorized answers - faster, consistent, and always ready.
## Why Use Cache Management?
Cache management provides three major benefits:
### 1. ⚡ Faster Responses

- Without cache: 2-8 seconds (AI search + generation)
- With cache: 50-500 milliseconds (instant retrieval)
Users get answers 10-15x faster when questions match cache entries.
### 2. ✅ Consistent Answers
Cache ensures the chatbot gives the same accurate answer every time:
- Official program requirements stay consistent
- Important deadlines are always correct
- Contact information never varies
Example: Without cache, "How many credits is HFIM?" might generate slightly different responses ("120 credits" vs. "120 total credits" vs. "120 semester hours"). With cache, every user gets the exact same verified answer.
### 3. 💰 Lower Costs
Each cached response avoids:
- OpenAI API calls for embedding and generation
- Pinecone vector searches
- Database lookups
Impact: A cache hit rate of 70% can reduce AI costs by 60-80%.
If 100 users ask "What is HFIM?" in a week:
- Without cache: 100 AI generation calls = ~$0.50-$1.00
- With cache: 1 AI call (first time) + 99 cache hits = ~$0.005
Savings: 99%+ cost reduction for frequently asked questions!
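The arithmetic above can be sketched in a few lines. The 1-cent-per-call figure is hypothetical, chosen at the top of the ~$0.50-$1.00-per-100-calls range quoted above; integer cents keep the math exact.

```python
# Illustrative weekly-cost comparison for one frequently asked question.
# COST_PER_AI_CALL_CENTS is a hypothetical figure, not a measured rate.
COST_PER_AI_CALL_CENTS = 1

def weekly_cost_cents(asks: int, cache_enabled: bool) -> int:
    """Estimated AI spend (in cents) for a question asked `asks` times."""
    if cache_enabled:
        # Only the first ask triggers AI generation; the rest hit the cache.
        return COST_PER_AI_CALL_CENTS
    return asks * COST_PER_AI_CALL_CENTS

without_cache = weekly_cost_cents(100, cache_enabled=False)  # 100 cents ($1.00)
with_cache = weekly_cost_cents(100, cache_enabled=True)      # 1 cent
savings = 1 - with_cache / without_cache                     # 0.99 -> 99% cheaper
```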
## How Cache Works

### The Cache Matching Process
When a user asks a question, the chatbot follows this process:

1. **User asks:** "What is the HFIM program?"
2. **Cache check:** the chatbot looks for similar cached questions.
3. **Outcome:**
   - **Match found** → return the cached response. Done in 50-500 ms.
   - **No match** → search documents and generate a response. Takes 2-8 seconds.
### What Makes a Good Match?
The chatbot considers:
- Similarity score - How closely the question matches
- Confidence level - How confident you are in the answer
- Status - Whether the cache entry is active
- Variations - Alternative ways to ask the same question
Matching threshold: Questions must be at least 70% similar to return the cached response.
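As a rough sketch of how these factors combine, here is a minimal lookup with a 70% threshold. Word-overlap (Jaccard) similarity stands in for whatever similarity measure the panel actually uses, and the entry field names are illustrative assumptions, not the real schema.

```python
# Minimal sketch of the cache-matching step, under assumed field names.
def similarity(a: str, b: str) -> float:
    """Fraction of shared words between two questions (0.0-1.0)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def lookup(question: str, cache: list, threshold: float = 0.70):
    """Return the best-matching active entry at or above the threshold."""
    best, best_score = None, 0.0
    for entry in cache:
        if entry["status"] != "active":  # inactive entries never match
            continue
        # Compare against the main question and every variation.
        for candidate in [entry["question"], *entry.get("variations", [])]:
            score = similarity(question, candidate)
            if score >= threshold and score > best_score:
                best, best_score = entry, score
    return best  # None means: fall back to document search (RAG)

cache = [{
    "question": "What is the HFIM program?",
    "variations": ["Tell me about HFIM"],
    "response": "HFIM stands for Hospitality and Food Industry Management...",
    "status": "active",
}]
lookup("What is the HFIM program?", cache)    # identical wording: match
lookup("Where is the parking garage?", cache)  # below threshold -> None
```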
## Cache Entry Components
Each cache entry has several fields:
### Core Fields
| Field | Purpose | Example |
|---|---|---|
| Question | Main question this entry answers | "What is the HFIM program?" |
| Response | The answer to return | "HFIM stands for Hospitality and Food Industry Management..." |
| Confidence | Your confidence in accuracy (0-1) | 0.95 (very confident) |
| Status | Active (used) or Inactive (disabled) | Active |
| Sources | JSON array of source documents | [{"filename": "HFIM_Handbook.pdf", "page": 5}] |
### Optional Fields
| Field | Purpose | Example |
|---|---|---|
| Question Variations | Alternative ways to ask | "Tell me about HFIM\nExplain the Hospitality program" |
| TTL | How long answer stays valid (days) | 90 days |
| Admin Notes | Internal notes for other admins | "Updated for Fall 2026 requirements" |
The "sources" field uses JSON format to list documents used to create the response. This helps with transparency and allows you to update entries when source documents change.
Format: `[{"filename": "Document.pdf", "page": 12, "section": "Overview"}]`
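Because `sources` is plain JSON, it can be parsed with any standard JSON library. A hypothetical helper for rendering citations (the field layout follows the format shown above; the function name is illustrative):

```python
import json

# Hypothetical helper that turns the JSON "sources" array into readable
# citations for display in the admin panel.
def describe_sources(raw: str) -> list:
    """Parse the sources JSON and format one citation per document."""
    return [f'{src["filename"]}, p. {src["page"]}' for src in json.loads(raw)]

sources_json = '[{"filename": "HFIM_Handbook.pdf", "page": 5, "section": "Overview"}]'
describe_sources(sources_json)  # -> ['HFIM_Handbook.pdf, p. 5']
```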
## Cache Statistics
The admin panel tracks performance metrics for each cache entry:
- Times Served: How many times this entry was returned to users
- Success Rate: Percentage of times it successfully matched questions
- Last Updated: When you last modified this entry
- Last Served: When it was most recently used
These metrics help you identify:
- ✅ High-value entries (frequently used)
- ⚠️ Unused entries (never or rarely matched)
- 📅 Entries needing updates (old "Last Updated" dates)
Learn more: Performance Metrics
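The triage these metrics enable can be sketched as a small classifier. The field names (`times_served`, `last_updated`) and the 90-day staleness cutoff are illustrative assumptions, not the panel's actual schema:

```python
from datetime import date, timedelta

# Hypothetical triage of a cache entry using the metrics listed above.
def triage(entry: dict, today: date, stale_after_days: int = 90) -> str:
    """Classify an entry as 'needs-update', 'unused', or 'high-value'."""
    if today - entry["last_updated"] > timedelta(days=stale_after_days):
        return "needs-update"  # old "Last Updated" date
    if entry["times_served"] == 0:
        return "unused"        # never matched a user question
    return "high-value"        # actively serving responses

triage({"times_served": 42, "last_updated": date(2026, 1, 10)},
       today=date(2026, 2, 1))  # -> 'high-value'
```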
## Getting Started with Cache

### Recommended Workflow
For new admin users, follow this process:

1. **Week 1: Observe**
   - Review existing cache entries
   - Check Dashboard metrics (hit rate, popular questions)
   - Read Best Practices
2. **Week 2: Maintain**
   - Edit outdated entries
   - Fix entries with low confidence
   - Add variations to underperforming entries
3. **Week 3+: Expand**
   - Create new entries for frequently asked questions
   - Use the variation generator
   - Analyze conversation feedback
### What to Cache First

**Highest Priority** (cache these immediately):

- ✅ Program overview and mission
- ✅ Admission requirements
- ✅ Core faculty contact information
- ✅ Course prerequisites
- ✅ Internship requirements

**Medium Priority** (cache within 1-2 months):

- ⚠️ Popular questions (check Dashboard)
- ⚠️ Questions with negative feedback
- ⚠️ Degree requirements and pathways

**Lower Priority** (cache as time allows):

- ⬜ Rarely asked questions
- ⬜ Questions with consistent positive feedback (AI already answers well)
Don't try to cache everything at once! Focus on the 10-20 most common questions first, then expand gradually based on usage patterns.
## Common Cache Management Tasks
Quick links to specific guides:
| Task | Guide | Frequency |
|---|---|---|
| Find a cache entry | Searching Cache | Daily |
| Edit an entry | Editing Entries | Weekly |
| Create variations | Generating Variations | Bi-weekly |
| Bulk updates | Bulk Operations | Monthly |
| Check performance | Performance Metrics | Weekly |
| Follow best practices | Best Practices | Always! |
| Fix problems | Troubleshooting | As needed |
## Cache vs. RAG: When to Use Each
Understanding when to use cache vs. letting the AI search (RAG) helps you make better decisions.
### Use Cache For:

- ✅ **Frequently asked questions** - asked 10+ times
- ✅ **Official information** - program requirements, policies, contact info
- ✅ **Consistent answers** - the same answer every time
- ✅ **Time-sensitive info** - deadlines, current semester dates (with an appropriate TTL)
- ✅ **Approved messaging** - when specific phrasing matters
### Let AI Search (RAG) For:

- ❌ **Rarely asked questions** - asked fewer than 5 times
- ❌ **Context-dependent questions** - "What about MY situation?"
- ❌ **Exploratory questions** - "Tell me about career paths"
- ❌ **Complex queries** - require synthesizing multiple sources
- ❌ **Recent updates** - brand-new information not yet cached
Caching too many entries can:
- Make management difficult (hundreds of entries to maintain)
- Increase false matches (similar questions get wrong cached answers)
- Reduce flexibility (different users may need slightly different answers)
Rule of thumb: If a question is asked < 5 times per month, consider letting RAG handle it.
## Understanding Cache Lifecycle
Cache entries go through several stages:
### 1. Creation
- Manual creation: You write question + response
- Conversion: From good conversations
- Import: Bulk upload (coming soon)
### 2. Active Use
- Serves responses to users
- Tracks "Times Served" metric
- Generates performance data
### 3. Maintenance
- Review based on TTL
- Update for accuracy
- Add variations if underperforming
### 4. Retirement
- Set to "Inactive" when outdated
- Delete if permanently irrelevant
- Archive notes explain why
Learn more: Best Practices - Status Management
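The TTL review in the maintenance stage can be sketched as a simple date check, assuming TTL is stored in days and each entry records when it was last updated (as described in the Optional Fields table):

```python
from datetime import date, timedelta

# Sketch of a TTL review check; the function name is illustrative.
def is_due_for_review(last_updated: date, ttl_days: int, today: date) -> bool:
    """True once the entry's TTL window has elapsed since its last update."""
    return today > last_updated + timedelta(days=ttl_days)

is_due_for_review(date(2026, 1, 1), 90, today=date(2026, 5, 1))  # True: past 90 days
is_due_for_review(date(2026, 1, 1), 90, today=date(2026, 2, 1))  # False: still fresh
```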
## Next Steps
Ready to start working with cache? Here's what to do next:
- Learn to search cache entries - Find existing entries
- Practice editing - Update an entry safely
- Generate variations - Improve matching
- Read best practices - Avoid common mistakes
Start by editing a low-impact entry (something with few "Times Served") to get comfortable with the interface before modifying high-traffic entries.
## Questions?
Remember: Cache management is a powerful tool for improving the chatbot. Start small, follow best practices, and expand gradually based on usage data!