100% Local Perplexity AI Clone Powered By LM Studio

Nov 2025 · 9 min read · Perplexica · LM Studio · Docker · Privacy · Local Search

Perplexica is a fully open-source, privacy-first alternative to Perplexity AI. It scours the internet using a private search engine, passes results to a local LLM, and returns comprehensive answers with cited sources — all running on your own machine. No API keys, no tracking, no cloud.

The Stack

  - 🔍 SearxNG (search engine): a 100% open-source meta search engine with no tracking and no ads, private by default
  - 🧠 LM Studio (LLM backend): runs local LLMs behind an OpenAI-compatible API endpoint
  - 🌐 Perplexica (UI + orchestrator): the front end and orchestration layer that brings SearxNG and your LLM together
  - 🐳 Docker (runtime): packages and runs the entire Perplexica stack in isolated containers

Key Features

  - 🔒 Fully private: all searches stay on your machine, with no data sent to external servers
  - 📄 Cited sources: answers include links to the web pages that informed the response
  - 💬 Conversational: follow-up questions keep full context, like chatting with a search engine
  - 🎯 Focus modes: limit searches to Reddit, YouTube, or your own local documents

Prerequisites

  - Docker Desktop, installed and running
  - LM Studio, installed
  - Git, for cloning the repository

Step 1 — Clone Perplexica and Configure

  1. Create a new folder for the project, open it in File Explorer, click the address bar, type cmd, and press Enter to open a terminal in that folder.
  2. Clone the repository and move into it:
git clone https://github.com/ItzCrazyKns/Perplexica
cd Perplexica
  3. Inside the Perplexica folder, find sample.config.toml. Copy it and rename the copy to config.toml; this is your configuration file. You can leave it as-is for now and configure providers through the UI later.
Windows permissions note: If saving config.toml directly into a protected directory fails, save it to your Documents folder first, then move it into the Perplexica folder.

Step 2 — Start Perplexica with Docker

Make sure Docker Desktop is running, then in your terminal (inside the Perplexica folder) run:

docker compose up

This builds the Docker image and starts all required containers. The first run downloads everything needed — it takes a few minutes. Once it's done, open your browser and go to:

http://localhost:3000
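Before opening the browser, you can run a quick sanity check from a second terminal (this assumes the containers from `docker compose up` are still running; the port is Perplexica's default):

```shell
# Perplexica's default UI address.
PERPLEXICA_URL="http://localhost:3000"

# Print only the HTTP status code; 200 means the UI is up.
curl -s -o /dev/null -w "%{http_code}\n" "$PERPLEXICA_URL"
```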

Step 3 — Set Up LM Studio

  1. Install and launch LM Studio.
  2. Download the recommended model. In LM Studio, click the Discover icon and search for Qwen2.5 3B. Download the Qwen2.5 3B GGUF Q8 version; it's lightweight, capable, and has a large context window.
  3. Configure context length. Go to the Server tab, select the Qwen model, click Load → Context Length and set it to at least 12,000 tokens (higher is better for web search responses). Reload the model after changing this.
  4. Start the API server. The LM Studio API server usually starts automatically when you load a model. Confirm it's running in the Server tab.
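To confirm the server is actually listening, you can query LM Studio's OpenAI-compatible model list from the same machine (port 1234 is LM Studio's default; adjust if you changed it):

```shell
# LM Studio's default local API endpoint.
LMSTUDIO_URL="http://localhost:1234/v1"

# Lists the models LM Studio is currently serving, as JSON.
curl -s "$LMSTUDIO_URL/models"
```

If you get JSON back that names your loaded Qwen model, the server is ready for Perplexica.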

Step 4 — Connect Perplexica to LM Studio

  1. In the Perplexica UI at localhost:3000, click the gear icon (bottom left) to open Settings.
  2. Under Chat Model Provider, select Custom OpenAI.
  3. Work around the UI bug: The Custom OpenAI settings fields sometimes collapse before you can fill them in. To fix this, temporarily enter any API key for another provider to populate the dropdown, then re-select Custom OpenAI — the fields will reappear.
  4. Set the Base URL to:
http://host.docker.internal:1234/v1

This is the LM Studio API URL as seen from inside Docker. Enter any value for the model name and a dummy string for the API key (LM Studio doesn't require authentication by default).
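You can also test the endpoint end to end from your host machine, where LM Studio's address is plain localhost rather than host.docker.internal. This is a minimal OpenAI-style chat request; the model name here is a placeholder, since LM Studio answers with whichever model is loaded:

```shell
# "local-model" is a placeholder name; LM Studio uses the loaded model regardless.
PAYLOAD='{
  "model": "local-model",
  "messages": [{"role": "user", "content": "Say hello in one word."}]
}'

# Send a chat completion to LM Studio's OpenAI-compatible endpoint.
curl -s http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"
```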

(Optional) Add an Embedding Model for Better Search

An embedding model helps Perplexica process and rank search results more efficiently before passing data to the LLM, resulting in more accurate and relevant answers.

A great lightweight option is mxbai-embed-large via Ollama:

  1. Install Ollama and pull the mxbai-embed-large model.
  2. In Perplexica Settings, set Embedding Model Provider to Ollama.
  3. Set the Ollama API URL to: http://host.docker.internal:11434
  4. Select mxbai-embed-large as your embedding model.
Other embedding models: Explore options on the MTEB embedding leaderboard, which compares hundreds of embedding models across many languages and tasks.
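The steps above can also be done from a terminal. Pulling the model is a one-time download, and a test call to Ollama's native embeddings endpoint confirms it is being served before you point Perplexica at it:

```shell
EMBED_MODEL="mxbai-embed-large"

# One-time download of the embedding model.
ollama pull "$EMBED_MODEL"

# Request a test embedding; a JSON vector in the response means it works.
curl -s http://localhost:11434/api/embeddings \
  -H "Content-Type: application/json" \
  -d "{\"model\": \"$EMBED_MODEL\", \"prompt\": \"hello world\"}"
```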

Using Perplexica

Once everything is configured, start a new chat session, type your search query, and hit Enter. Perplexica will:

  1. Use SearxNG to privately fetch web results.
  2. Pass those results to your local Qwen model in LM Studio.
  3. Return a structured answer with cited sources.

You can ask follow-up questions to explore topics in depth — the full conversation context is maintained throughout the session.

Focus Modes

Perplexica supports focused search modes that restrict results to specific platforms or sources:

  - Reddit: limit results to Reddit threads and discussions
  - YouTube: limit results to YouTube videos
  - Local documents: search your own uploaded files instead of the web

Restarting Perplexica Later

You don't need to run terminal commands every time. Once the containers are set up:

  1. Open Docker Desktop.
  2. Find the Perplexica container and click Run.
  3. Navigate to localhost:3000 — you're back up immediately.
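If you prefer the terminal over Docker Desktop, the same thing can be done with Docker Compose from the Perplexica folder; stop and start reuse the existing containers, so nothing is rebuilt:

```shell
# Run these from the Perplexica folder (where the compose file lives).
docker compose stop    # shut the Perplexica containers down
docker compose start   # bring the same containers back up later
```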
Model recommendation: Qwen2.5 3B GGUF Q8 is a great starting model for Perplexica — fast, lightweight, and strong at reasoning and tool use. If you have more VRAM available, larger Qwen models will produce even better search summaries.

📦 Want to skip the setup?

The Local Lab offers pre-configured AI installer packages so you can get running in minutes, not hours.

Browse the Store →