Perplexity AI Complete Guide — Architecture, RAG, Models & AI-Powered Search

Best for Research | AI Search | 30 min read

By ChatWhole Team | 2025-02-05

Perplexity AI is fundamentally different from ChatGPT or Claude. Instead of relying solely on training data, it performs real-time web search and synthesizes information from multiple sources using Retrieval-Augmented Generation (RAG).


Perplexity vs Traditional LLMs

Architecture Diagram
Traditional LLM (ChatGPT, Claude):
-------------------------------------
Training Data -> Model -> Response
(Static knowledge)
(Training cutoff: months ago)
(No citations)
(May hallucinate facts)

Perplexity (RAG-based):
-------------------------------------
User Question
    |
    v
Web Search (Real-time)
    |
    v
Source Retrieval (Multiple sources)
    |
    v
LLM Synthesis (Combine information)
    |
    v
Response + Citations
(Current as of search time)
(Cited sources)
(Fewer hallucinations)

The RAG Architecture

What is Retrieval-Augmented Generation?

RAG combines the strengths of retrieval (finding relevant information) with generation (creating coherent answers):

Architecture Diagram
RAG Pipeline:

1. Query Understanding
   User: "What are the latest developments in quantum computing?"
   |
   v
2. Query Expansion
   Sub-queries:
   - "quantum computing breakthroughs 2025"
   - "quantum computing research papers"
   - "quantum computing industry news"
   |
   v
3. Document Retrieval
   Search engines + Custom index
   -> 50+ candidate documents
   |
   v
4. Relevance Ranking
   Rank by: relevance, freshness, authority
   -> Top 15-30 documents
   |
   v
5. Chunk Extraction
   Extract relevant paragraphs/chunks
   -> 100+ text chunks
   |
   v
6. LLM Synthesis
   Combine chunks into coherent answer
   Maintain source attribution
   |
   v
7. Citation Generation
   Link claims to sources
   [1] [2] [3] inline citations
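
The seven steps above can be sketched end to end in a few lines of Python. This is a toy illustration, not Perplexity's implementation: the in-memory `CORPUS`, the word-overlap scoring, and the helper names `expand_query`, `retrieve`, and `build_prompt` are all assumptions made for the example, and each document is treated as a single chunk.

```python
import re
from dataclasses import dataclass


@dataclass
class Doc:
    url: str
    text: str


# Hypothetical in-memory corpus standing in for live web search results.
CORPUS = [
    Doc("https://example.com/a", "Quantum computing breakthroughs include new error correction codes."),
    Doc("https://example.com/b", "Industry news: quantum computing startups raise funding."),
    Doc("https://example.com/c", "A recipe for sourdough bread with a long fermentation."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def expand_query(query: str) -> list[str]:
    """Step 2: naive query expansion by appending fixed suffixes."""
    return [query, f"{query} research papers", f"{query} industry news"]


def retrieve(sub_queries: list[str], corpus: list[Doc]) -> list[Doc]:
    """Steps 3-4: score each document by its best word overlap with any
    sub-query, drop documents with no overlap, and rank the rest."""
    scored = []
    for doc in corpus:
        doc_words = tokenize(doc.text)
        score = max(len(tokenize(q) & doc_words) for q in sub_queries)
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


def build_prompt(query: str, docs: list[Doc]) -> str:
    """Steps 5-7: number the sources so the LLM can cite them as [1], [2], ..."""
    sources = "\n".join(f"[{i}] {d.text} ({d.url})" for i, d in enumerate(docs, 1))
    return (
        "Answer using only the sources below and cite them inline.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    question = "quantum computing"
    ranked = retrieve(expand_query(question), CORPUS)
    print(build_prompt(question, ranked))
```

The prompt returned by `build_prompt` is what would be sent to the synthesis model. Numbering the sources up front is one simple way to let the model emit `[1]`-style citations that map back to URLs.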

Perplexity's Custom RAG Stack

Architecture Diagram
Perplexity Architecture:

+-----------------------------------------------------+
|  User Query: "Compare GPT-4o and Claude 3.5"       |
|                                                      |
|  +---------------------------------------------+    |
|  |  Query Processing                           |    |
|  |                                              |    |
|  |  1. Intent classification                   |    |
|  |     -> "Comparison" query                    |    |
|  |                                              |    |
|  |  2. Sub-query decomposition                 |    |
|  |     -> "GPT-4o capabilities"                 |    |
|  |     -> "Claude 3.5 capabilities"             |    |
|  |     -> "GPT-4o vs Claude 3.5 comparison"     |    |
|  |                                              |    |
|  |  3. Search strategy                         |    |
|  |     -> Web search + academic sources         |    |
|  +---------------+-----------------------------+    |
|                  |                                   |
|                  v                                   |
|  +---------------------------------------------+    |
|  |  Multi-Source Retrieval                     |    |
|  |                                              |    |
|  |  Sources:                                   |    |
|  |  +- Web search (Google, Bing)              |    |
|  |  +- Academic papers (Semantic Scholar)     |    |
|  |  +- News articles                          |    |
|  |  +- Documentation                          |    |
|  |  +- Perplexity's custom index              |    |
|  |                                              |    |
|  |  Result: 50+ candidate documents           |    |
|  +---------------+-----------------------------+    |
|                  |                                   |
|                  v                                   |
|  +---------------------------------------------+    |
|  |  Source Ranking & Filtering                 |    |
|  |                                              |    |
|  |  Criteria:                                  |    |
|  |  +- Relevance to query                     |    |
|  |  +- Source authority (DA score)             |    |
|  |  +- Freshness (recency weight)             |    |
|  |  +- Content quality                        |    |
|  |  +- Diversity (avoid duplicates)           |    |
|  |                                              |    |
|  |  Result: Top 15-30 sources                 |    |
|  +---------------+-----------------------------+    |
|                  |                                   |
|                  v                                   |
|  +---------------------------------------------+    |
|  |  LLM Synthesis                             |    |
|  |                                              |    |
|  |  Model: GPT-4 / Claude / Perplexity Small  |    |
|  |                                              |    |
|  |  Task:                                      |    |
|  |  1. Read all retrieved chunks              |    |
|  |  2. Identify key information               |    |
|  |  3. Synthesize coherent answer             |    |
|  |  4. Maintain source attribution            |    |
|  |  5. Generate citations                     |    |
|  +---------------+-----------------------------+    |
|                  |                                   |
|                  v                                   |
|  +---------------------------------------------+    |
|  |  Output: Answer + Citations                |    |
|  |                                              |    |
|  |  "GPT-4o and Claude 3.5 both excel at...   |    |
|  |   GPT-4o has better multimodal support [1], |    |
|  |   while Claude 3.5 excels at coding [2].   |    |
|  |   According to benchmarks [3]..."           |    |
|  |                                              |    |
|  |  [1] openai.com/gpt-4o                     |    |
|  |  [2] anthropic.com/claude                   |    |
|  |  [3] arxiv.org/paper/2024.xxxxx            |    |
|  +---------------------------------------------+    |
+-----------------------------------------------------+
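
The "Source Ranking & Filtering" stage in the diagram can be approximated with a weighted score plus a per-domain cap. The sketch below is an assumption-laden illustration: the weights (0.6 / 0.2 / 0.2), the 180-day freshness half-life, and the `Candidate` fields are invented for the example and are not Perplexity's actual values.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlparse


@dataclass
class Candidate:
    url: str
    relevance: float  # 0..1, assumed to come from the retriever
    authority: float  # 0..1, hypothetical per-domain authority score
    published: date


def freshness(published: date, today: date, half_life_days: float = 180.0) -> float:
    """Exponential decay: a document loses half its freshness every half-life."""
    age_days = max((today - published).days, 0)
    return 0.5 ** (age_days / half_life_days)


def score(c: Candidate, today: date) -> float:
    """Weighted blend of relevance, authority, and freshness (weights are illustrative)."""
    return 0.6 * c.relevance + 0.2 * c.authority + 0.2 * freshness(c.published, today)


def rank_sources(candidates: list[Candidate], today: date,
                 top_k: int = 3, max_per_domain: int = 1) -> list[Candidate]:
    """Sort by score, then enforce diversity by capping results per domain."""
    ranked = sorted(candidates, key=lambda c: score(c, today), reverse=True)
    picked: list[Candidate] = []
    per_domain: dict[str, int] = {}
    for c in ranked:
        domain = urlparse(c.url).netloc
        if per_domain.get(domain, 0) >= max_per_domain:
            continue  # skip near-duplicates from an already represented domain
        per_domain[domain] = per_domain.get(domain, 0) + 1
        picked.append(c)
        if len(picked) == top_k:
            break
    return picked
```

Capping results per domain is a crude stand-in for the "Diversity (avoid duplicates)" criterion; a production system would more likely compare content similarity than hostnames.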

Search Modes

Quick Search

Feature     Details
-----------------------------------------
Speed       2-3 seconds
Sources     5-10 sources
Depth       Surface-level
Use case    Simple factual questions
Cost        Free (the 5/day limit applies to Pro searches)

Pro Search

Feature     Details
-----------------------------------------
Speed       5-10 seconds
Sources     15-30 sources
Depth       Deep analysis
Use case    Complex research questions
Cost        $20/month (unlimited)

Pro Search Process

Architecture Diagram
Pro Search Deep Dive:

1. Question Decomposition
   "Compare GPT-4o and Claude for coding"
   |
   +-- "GPT-4o coding benchmarks"
   +-- "Claude 3.5 coding benchmarks"
   +-- "GPT-4o vs Claude coding comparison"
   +-- "User experiences GPT-4o coding"
   +-- "User experiences Claude coding"

2. Parallel Search
   All 5 sub-queries executed simultaneously
   -> 200+ candidate documents

3. Cross-Reference Analysis
   Find information that appears in multiple sources
   -> Higher confidence in claims

4. Contradiction Detection
   Identify conflicting information
   -> Present both sides with sources

5. Synthesis
   Combine all information into comprehensive answer
   -> 500-1000 word response with 15-20 citations
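
Steps 2 and 3 above (parallel search and cross-reference analysis) can be sketched with the standard library. Everything here is a simplified assumption: `FAKE_INDEX` is canned data standing in for a real search backend, and claims are matched by exact text, whereas a real system would need semantic matching to recognize that two differently worded sentences make the same claim.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical canned results: sub-query -> list of (source, claim) pairs.
FAKE_INDEX = {
    "GPT-4o coding benchmarks": [
        ("site-a.example", "GPT-4o is strong at coding"),
    ],
    "Claude 3.5 coding benchmarks": [
        ("site-b.example", "Claude 3.5 is strong at coding"),
    ],
    "GPT-4o vs Claude coding comparison": [
        ("site-c.example", "Claude 3.5 is strong at coding"),
        ("site-c.example", "GPT-4o is strong at coding"),
    ],
}


def search(sub_query: str) -> list[tuple[str, str]]:
    """Stand-in for a network call to a search backend."""
    return FAKE_INDEX.get(sub_query, [])


def parallel_search(sub_queries: list[str]) -> list[tuple[str, str]]:
    """Run every sub-query concurrently and flatten the hits into one list."""
    with ThreadPoolExecutor(max_workers=max(1, len(sub_queries))) as pool:
        results = list(pool.map(search, sub_queries))  # map preserves input order
    return [hit for hits in results for hit in hits]


def cross_reference(hits: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group identical claims and list the distinct sources backing each one."""
    support: dict[str, set[str]] = defaultdict(set)
    for source, claim in hits:
        support[claim.lower()].add(source)
    return {claim: sorted(sources) for claim, sources in support.items()}


if __name__ == "__main__":
    queries = list(FAKE_INDEX) + ["User experiences GPT-4o coding"]
    for claim, sources in cross_reference(parallel_search(queries)).items():
        print(f"{len(sources)} source(s): {claim}")
```

A claim backed by more independent sources can then be treated as higher confidence, which is the idea behind step 3.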

Model Stack

Perplexity uses different models for different tasks:

Model Selection:

Task                    Model Used
-----------------------------------------
Quick Search            Perplexity Small (proprietary)
Pro Search              GPT-4 / Claude
Reasoning tasks         o1 / Claude
Code questions          Specialized code model
Academic research       Perplexity + Scholar
General knowledge       Perplexity Small

Why multiple models?
- Different models excel at different tasks
- Cost optimization (use cheap model when possible)
- Latency optimization (use fast model when possible)
- Quality optimization (use best model when needed)
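
The cost, latency, and quality trade-off listed above is often implemented as a small router in front of the models. The sketch below shows the general pattern only: the keyword rules, the model names in `ROUTES`, and the relative cost figures are hypothetical placeholders, not Perplexity's actual routing logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    name: str
    relative_cost: int  # hypothetical cost units, for illustration only


# Placeholder model names; swap in whatever models a deployment actually uses.
ROUTES = {
    "quick": ModelChoice("small-fast-model", 1),
    "pro": ModelChoice("large-general-model", 10),
    "reasoning": ModelChoice("reasoning-model", 30),
    "code": ModelChoice("code-model", 8),
}

CODE_HINTS = ("traceback", "def ", "compile", "function")
REASONING_HINTS = ("prove", "step by step", "why")


def classify(query: str, pro_mode: bool = False) -> str:
    """Very rough keyword-based task classification."""
    q = query.lower()
    if any(hint in q for hint in CODE_HINTS):
        return "code"
    if any(hint in q for hint in REASONING_HINTS):
        return "reasoning"
    # Fall back to the cheap model unless the user asked for Pro Search.
    return "pro" if pro_mode else "quick"


def route(query: str, pro_mode: bool = False) -> ModelChoice:
    """Pick a model for the query based on its classified task type."""
    return ROUTES[classify(query, pro_mode)]


if __name__ == "__main__":
    print(route("What is the capital of France?"))
    print(route("Prove that sqrt(2) is irrational step by step"))
```

Defaulting to the cheapest model and escalating only on a signal (a keyword here, a trained classifier in practice) is what keeps average cost and latency low.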

Use Cases

Use Case              Why Perplexity            Comparison
----------------------------------------------------------------------
Research              Always current, cited     Better than ChatGPT
Fact-checking         Real-time verification    Better than Google
News monitoring       Latest information        Better than RSS
Academic research     Source attribution        Better than Scholar
Market research       Current data              Better than reports
Technical docs        Up-to-date                Better than docs sites
Competitive analysis  Real-time intelligence    Better than manual

Perplexity vs ChatGPT

Feature Comparison:

                    Perplexity         ChatGPT
-------------------------------------------------
Knowledge source    Web (real-time)    Training data
Citations           Yes (always)       No (usually)
Current events      Excellent          Poor (old data)
Factual accuracy    Higher             Lower (hallucination)
Creative writing    Poor               Excellent
Code generation     Good               Excellent
Conversation        Limited            Excellent
File upload         Limited            Yes
Image generation    No                 Yes (DALL-E)
Voice               No                 Yes
Price               $20/month (Pro)    $20/month (Plus)

Pricing

Plan        Price       Features
--------------------------------------------------------
Free        $0          5 Pro searches/day, basic features
Pro         $20/month   Unlimited Pro, $5 API credit
Enterprise  Custom      Team features, SSO, admin

API Usage

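import os

import requests

# Perplexity API (OpenAI-compatible chat completions endpoint).
# Read the key from the environment instead of hard-coding it;
# PERPLEXITY_API_KEY is an assumed variable name.
api_key = os.environ["PERPLEXITY_API_KEY"]

response = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        # The llama-3.1-sonar-* model names were retired in early 2025;
        # "sonar" is the current search-enabled model.
        "model": "sonar",
        "messages": [
            {"role": "user", "content": "Latest AI news?"}
        ]
    },
    timeout=30,  # avoid hanging forever on a stalled connection
)
response.raise_for_status()  # surface 4xx/5xx errors before parsing the body

print(response.json()["choices"][0]["message"]["content"])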

Key Takeaways

  1. Perplexity searches the web in real time, so answers reflect what is online at query time
  2. RAG architecture combines retrieval with generation
  3. Every answer includes citations to source material
  4. Pro Search decomposes complex questions into sub-queries
  5. Perplexity uses multiple LLMs optimized for different tasks
  6. Best for research and fact-checking — not creative writing
  7. Free tier gives 5 Pro searches per day
  8. API access available for developers (OpenAI-compatible)
  9. Perplexity is a search engine, not a replacement for ChatGPT
  10. Cited answers can be checked against their sources, unlike ChatGPT's uncited recall from training data
