Running Strix with local models allows for completely offline, privacy-first security assessments. Data never leaves your machine, making this ideal for sensitive internal networks or air-gapped environments.

Privacy vs Performance

Feature     Local Models                     Cloud Models (GPT-5 / Claude 4.5)
Privacy     🔒 Data stays local              Data sent to provider
Cost        Free (hardware only)             Pay-per-token
Reasoning   Lower (struggles with agents)    State-of-the-art
Setup       Complex (GPU required)           Instant
Compatibility Note: Strix relies on advanced agentic capabilities (tool use, multi-step planning, self-correction). Most local models, especially those under 70B parameters, struggle with these complex tasks. For critical assessments, we strongly recommend state-of-the-art cloud models such as Claude 4.5 Sonnet or GPT-5. Use local models only when privacy is the absolute priority.

Ollama

Ollama is the easiest way to run local models on macOS, Linux, and Windows.

Setup

  1. Install Ollama from ollama.ai
  2. Pull a high-performance model:
    ollama pull qwen3-vl
    
  3. Configure Strix:
    export STRIX_LLM="ollama/qwen3-vl"            # model name with the ollama/ provider prefix
    export LLM_API_BASE="http://localhost:11434"  # Ollama's default API endpoint
    
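Before launching Strix, it is worth confirming that the Ollama server is up and the model is available. A minimal sanity check, assuming the default Ollama port and the qwen3-vl model pulled above:

    # List the models Ollama has pulled (Ollama's REST API tags endpoint)
    curl http://localhost:11434/api/tags

    # Confirm the model loads and responds
    ollama run qwen3-vl "Reply with OK"
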
We recommend these models for the best balance of reasoning and tool use (a short example of switching between them follows the list):
  • Qwen3 VL (ollama pull qwen3-vl)
  • DeepSeek V3.1 (ollama pull deepseek-v3.1)
  • Devstral 2 (ollama pull devstral-2)
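
To switch models, pull the one you want and point STRIX_LLM at the matching ollama/ name; the API base stays the same. A sketch using DeepSeek V3.1:

    ollama pull deepseek-v3.1
    export STRIX_LLM="ollama/deepseek-v3.1"
    export LLM_API_BASE="http://localhost:11434"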

LM Studio / OpenAI Compatible

If you use LM Studio, vLLM, or another OpenAI-compatible server:
export STRIX_LLM="openai/local-model"
export LLM_API_BASE="http://localhost:1234/v1"  # Adjust port as needed
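
For instance, vLLM can expose a model through an OpenAI-compatible endpoint that Strix uses the same way. A minimal sketch, assuming vLLM is installed and using Qwen/Qwen2.5-7B-Instruct purely as a placeholder model (the identifier after openai/ should generally match what the server reports at /v1/models):

# Serve a model via vLLM's OpenAI-compatible API (default port 8000)
vllm serve Qwen/Qwen2.5-7B-Instruct --port 8000

# Point Strix at the vLLM endpoint
export STRIX_LLM="openai/Qwen/Qwen2.5-7B-Instruct"
export LLM_API_BASE="http://localhost:8000/v1"

# Quick check that the endpoint is live and serving the model
curl http://localhost:8000/v1/models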