In this tutorial, you’ll learn how to build an AI Retrieval-Augmented Generation (RAG) agent on the NVIDIA Jetson AGX Orin Dev Kit using n8n, a powerful workflow automation platform. AI agents use large language models (LLMs) such as GPT to decide dynamically which tool to use, based on input and context.
Unlike traditional chatbots that rely solely on pre-trained knowledge, RAG agents retrieve and use relevant external information at query time. This project implements a fully local RAG assistant on the NVIDIA Jetson AGX Orin 64GB development kit. The system leverages:
- Qwen/Qwen3-4B large language model served via SGLang for low-latency, high-throughput inference.
- Snowflake-Arctic-Embed2 (via Ollama) for efficient and accurate text embeddings.
- Qdrant as the local vector database to store and retrieve semantic embeddings.
- n8n as the workflow automation engine to orchestrate the RAG pipeline, including document processing, query handling, and response generation.
- Telegram as the real-time messenger, enabling users to interact with the RAG agent through a familiar chat environment.
- WireGuard VPN for secure remote access without exposing services to the public internet.
Here's a brief overview of the scenario I would like to create:
My ISP does not provide a public IP address for my LAN. Third-party tools such as Cloudflare Tunnel and ngrok can expose local services, but they are not entirely safe for private data.
With WireGuard integrated, you can securely connect to your NVIDIA Jetson device from anywhere. Queries sent via Telegram are routed through the Telegram API to the local RAG pipeline, processed by the Qwen3 model, and returned in real time, making this setup ideal for edge AI, privacy-sensitive workloads, and low-connectivity environments. The n8n web interface is also accessible remotely, allowing you to troubleshoot any issues from anywhere.
Run n8n Using Docker

n8n is a powerful, open-source workflow automation tool that will act as the brain of your AI agent, connecting Telegram, LLMs, and vector databases.
docker run -it --rm \
-p 5678:5678 \
-v ~/.n8n:/home/node/.n8n \
-e N8N_BASIC_AUTH_ACTIVE=true \
-e N8N_BASIC_AUTH_USER=admin \
-e N8N_BASIC_AUTH_PASSWORD=your_password \
n8nio/n8n
Open your browser and go to:
http://<jetson-ip>:5678
Log in with the credentials above.
💡 Tip: Change your_password to a secure one for production use.
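If you prefer a persistent service over an interactive container, the same settings can be expressed as a Docker Compose file. This is a minimal sketch based on the command above; the service name and restart policy are assumptions, not part of the original setup:

```yaml
services:
  n8n:
    image: n8nio/n8n
    restart: unless-stopped
    ports:
      - "5678:5678"
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=your_password
    volumes:
      - ~/.n8n:/home/node/.n8n
```

Start it with `docker compose up -d` and the container survives reboots.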
Start the SGLang Server (for LLM Inference)

We'll use SGLang to serve a local LLM (e.g., Qwen3-4B) capable of tool calling.
python3 -m sglang.launch_server \
--model-path Qwen/Qwen3-4B \
--host 0.0.0.0 \
--port 30000 \
--mem-fraction-static 0.5 \
--tool-call-parser qwen25
✅ This server will handle dynamic tool selection and reasoning.
Function tool calling is a capability that enables LLMs to go beyond simple text generation by interacting with external tools and real-world applications.
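To make this concrete, here is a minimal Python sketch of the request body a client would send to the SGLang endpoint started above. SGLang exposes an OpenAI-compatible `/v1/chat/completions` API; the `get_current_weather` tool here is purely illustrative and not part of this tutorial's workflow.

```python
import json

# Illustrative tool definition in the OpenAI-compatible "tools" schema
# that SGLang's tool-call parser understands. The function name and
# parameters are hypothetical examples, not from this tutorial.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Body you would POST to http://<jetson-ip>:30000/v1/chat/completions.
# With tool_choice="auto", the model decides whether to answer directly
# or emit a tool call for the client to execute.
request_body = {
    "model": "Qwen/Qwen3-4B",
    "messages": [{"role": "user", "content": "What's the weather in Munich?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",
}

print(json.dumps(request_body, indent=2))
```

n8n's AI Agent node builds requests of exactly this shape under the hood when it connects to the SGLang credential.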
Run Ollama for Embeddings

Ollama powers the embedding model used by the RAG system to convert text into vectors.
Pull the embedding model:
ollama pull snowflake-arctic-embed2
Set Ollama host and start server:
export OLLAMA_HOST=0.0.0.0
ollama serve
This model (snowflake-arctic-embed2) will encode user queries and documents for similarity search in Qdrant.
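"Similarity search" here means comparing embedding vectors, typically by cosine similarity. The sketch below shows the math on toy 3-dimensional vectors standing in for the much larger vectors snowflake-arctic-embed2 actually produces; the numbers are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors: a query, a semantically related document, and an unrelated one.
query_vec = [0.1, 0.9, 0.2]
doc_vec   = [0.2, 0.8, 0.1]
unrelated = [0.9, 0.0, 0.0]

print(cosine_similarity(query_vec, doc_vec))    # high similarity
print(cosine_similarity(query_vec, unrelated))  # low similarity
```

Qdrant performs this comparison at scale, returning the stored documents whose vectors are closest to the query vector.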
Run the Qdrant Vector Database

Qdrant stores and retrieves vector embeddings efficiently.
Pull the Qdrant Docker image:
docker pull qdrant/qdrant
Then, run the service:
docker run -p 6333:6333 -p 6334:6334 \
-v "$(pwd)/qdrant_storage:/qdrant/storage:z" \
qdrant/qdrant
The dashboard is now available at:
http://<jetson-ip>:6333/dashboard
You can now store and query embeddings from n8n workflows.
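For orientation, these are the shapes of the two REST requests the workflow will effectively make against Qdrant: one to upsert embedded documents, one to search them. The 3-dimensional vectors and payload text are toy stand-ins; in the real pipeline the vectors come from snowflake-arctic-embed2.

```python
import json

# PUT http://<jetson-ip>:6333/collections/test/points
# Stores an embedded document together with its original text as payload.
upsert_body = {
    "points": [
        {
            "id": 1,
            "vector": [0.1, 0.9, 0.2],  # toy stand-in for a real embedding
            "payload": {"text": "Jetson AI Lab research summary"},
        }
    ]
}

# POST http://<jetson-ip>:6333/collections/test/points/search
# Finds the stored points closest to the query embedding.
search_body = {
    "vector": [0.1, 0.8, 0.3],  # embedding of the user's query
    "limit": 5,                 # top-k results to return
    "with_payload": True,       # include the stored text alongside scores
}

print(json.dumps(search_body))
```

n8n's Qdrant Vector Store node issues equivalent requests for you, so you never have to hand-craft these bodies.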
Set Up WireGuard VPN

To securely access your NVIDIA Jetson from outside your local network, bring up the WireGuard tunnel:
sudo wg-quick up wg0
This setup assumes you’ve already configured wg0.conf in /etc/wireguard/. In this example, I’m using the wg-easy project.
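For reference, a minimal client-side wg0.conf looks roughly like this. All keys, addresses, and the endpoint are placeholders you would take from your own wg-easy setup:

```ini
[Interface]
PrivateKey = <client-private-key>
Address = 10.8.0.2/24
DNS = 10.8.0.1

[Peer]
PublicKey = <server-public-key>
PresharedKey = <preshared-key>
AllowedIPs = 10.8.0.0/24
Endpoint = <your-home-router>:51820
PersistentKeepalive = 25
```

With `AllowedIPs` restricted to the VPN subnet, only traffic destined for the Jetson and its services goes through the tunnel.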
This approach provides secure remote access to SGLang, Qdrant, and other services running on the NVIDIA Jetson dev kit, without exposing them to the public internet or relying on third-party tools like Cloudflare Tunnel or ngrok.
Create a Telegram Bot via BotFather

Use Telegram’s BotFather to create a chatbot interface for your AI agent.
Open Telegram and search for @BotFather
Start a chat and send:
/newbot
- Follow the prompts to:
  - Choose a name (e.g., MyJetsonAIBot)
  - Choose a username (must end with bot, e.g., MyJetsonAIBot)
- After creation, BotFather will send an HTTP API token. Save this token securely!
The workflow connects all components:
- Data Ingestion: From websites/PDFs > HTML extraction > text splitting
- Vector Storage: Documents processed into embeddings > stored in Qdrant
- Query Handling: Telegram messages trigger the RAG pipeline
- AI Response: LLM generates answers using retrieved context > replies via Telegram
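The text-splitting step above can be sketched in a few lines. This is a simplified greedy splitter, not the exact Recursive Character Text Splitter node n8n uses, but it illustrates the chunkSize/chunkOverlap behavior configured in the workflow (chunkSize = 2000):

```python
def split_text(text, chunk_size=2000, chunk_overlap=200):
    """Greedy fixed-size splitter with overlap — a simplified stand-in
    for n8n's Recursive Character Text Splitter node."""
    if chunk_size <= chunk_overlap:
        raise ValueError("chunk_size must exceed chunk_overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Step forward by chunk_size minus the overlap so consecutive
        # chunks share trailing/leading context.
        start += chunk_size - chunk_overlap
    return chunks
```

Overlap matters for RAG: without it, a sentence cut at a chunk boundary may never be retrieved as a coherent unit.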
This workflow uses the NVIDIA Jetson AI Lab website as the RAG input, specifically the page https://www.jetson-ai-lab.com/research.html
The JSON below defines the structure of the workflow:
{
"name": "RAG example",
"nodes": [
{
"parameters": {},
"id": "aef807ca-2036-4316-b776-a86bd5ffdf35",
"name": "When clicking ‘Test workflow’",
"type": "n8n-nodes-base.manualTrigger",
"position": [
-2640,
200
],
"typeVersion": 1
},
{
"parameters": {
"jsonMode": "expressionData",
"jsonData": "={{ $('Extract from HTML').item.json.text }}",
"options": {}
},
"id": "e33dad49-6bcc-457f-ba9b-e3ad42b56a56",
"name": "Default Data Loader",
"type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader",
"position": [
-1700,
400
],
"typeVersion": 1
},
{
"parameters": {
"chunkSize": 2000,
"chunkOverlap": {},
"options": {}
},
"id": "026a9342-83c0-4b0c-8e80-b8a835ed100a",
"name": "Recursive Character Text Splitter",
"type": "@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter",
"position": [
-1720,
620
],
"typeVersion": 1
},
{
"parameters": {
"content": "## 1.1 Extract from HTML\n\n",
"height": 335,
"width": 640,
"color": 4
},
"id": "c2ffc4de-8e78-4de7-8a21-f260a4005b20",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
-2680,
80
],
"typeVersion": 1
},
{
"parameters": {
"content": "## 2. Create Vector Store using selfhosted Qdrant\n",
"height": 691,
"width": 615,
"color": 3
},
"id": "239ae893-4cc2-46f1-b897-7cfabb232000",
"name": "Sticky Note1",
"type": "n8n-nodes-base.stickyNote",
"position": [
-2020,
80
],
"typeVersion": 1
},
{
"parameters": {
"content": "## 3.1 Use the chat trigger\n\n\n",
"height": 304,
"width": 547
},
"id": "ec796a8c-4ce1-4c91-8a8e-2575b26bbdc6",
"name": "Sticky Note2",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1360,
80
],
"typeVersion": 1
},
{
"parameters": {
"content": "## 4. Local AI Agent\n\n\n",
"height": 669,
"width": 627,
"color": 5
},
"id": "91405c7c-b6ba-4ecd-a15f-3d176a6adcb1",
"name": "Sticky Note5",
"type": "n8n-nodes-base.stickyNote",
"position": [
-760,
80
],
"typeVersion": 1
},
{
"parameters": {
"operation": "extractHtmlContent",
"extractionValues": {
"values": [
{
"key": "text",
"cssSelector": "body"
}
]
},
"options": {}
},
"type": "n8n-nodes-base.html",
"typeVersion": 1.2,
"position": [
-2240,
200
],
"id": "6e23e029-9188-4bfc-9eb6-fcb32d085d7a",
"name": "Extract from HTML"
},
{
"parameters": {
"mode": "insert",
"qdrantCollection": {
"__rl": true,
"value": "test",
"mode": "id"
},
"options": {}
},
"id": "d4e54663-8453-4766-8b67-429c04a2878f",
"name": "Qdrant Vector Store",
"type": "@n8n/n8n-nodes-langchain.vectorStoreQdrant",
"position": [
-1860,
200
],
"typeVersion": 1,
"credentials": {
"qdrantApi": {
"id": "IOKo0hbR1y8P3Qmq",
"name": "QdrantApi account"
}
}
},
{
"parameters": {
"mode": "retrieve-as-tool",
"toolDescription": "Retrieve information about NVIDIA Jetson AI Lab and people.\n",
"qdrantCollection": {
"__rl": true,
"value": "test",
"mode": "list",
"cachedResultName": "test"
},
"topK": 20,
"includeDocumentMetadata": false,
"options": {}
},
"type": "@n8n/n8n-nodes-langchain.vectorStoreQdrant",
"typeVersion": 1.3,
"position": [
-420,
400
],
"id": "dc6a9b66-2afb-4abf-a737-777f1926b77c",
"name": "Qdrant Vector Store1",
"credentials": {
"qdrantApi": {
"id": "IOKo0hbR1y8P3Qmq",
"name": "QdrantApi account"
}
}
},
{
"parameters": {
"content": "# Local AI RAG Assistant with n8n on NVIDIA Jetson AGX Orin 64GB Dev Kit\n \n## Deploy Qwen/Qwen3-4B Using Sglang and Snowflake-Arctic-Embed2 with Ollama for Embeddings and Qdrant as the Vector Store",
"height": 915,
"width": 3200,
"color": 7
},
"id": "c36d57f3-9cec-4684-88c5-565e56f1b02d",
"name": "Sticky Note3",
"type": "n8n-nodes-base.stickyNote",
"position": [
-2720,
-100
],
"typeVersion": 1
},
{
"parameters": {
"sessionIdType": "customKey",
"sessionKey": "=chat_with_{{ $('Listen for incoming events').first().json.message.chat.id }}"
},
"type": "@n8n/n8n-nodes-langchain.memoryBufferWindow",
"typeVersion": 1.3,
"position": [
-580,
420
],
"id": "31b28a96-3ad2-4257-ab57-bceb85607e17",
"name": "Simple Memory"
},
{
"parameters": {
"options": {}
},
"type": "@n8n/n8n-nodes-langchain.chatTrigger",
"typeVersion": 1.1,
"position": [
-1180,
200
],
"id": "5d3257c6-94ea-4c09-a887-1fc4d2a2f4aa",
"name": "When chat message received",
"webhookId": "153df2fe-1039-4c5f-8c97-be5e16edcfe4",
"disabled": true
},
{
"parameters": {
"promptType": "define",
"text": "={{ $json.message.text }}",
"options": {
"systemMessage": "=/no_think You are an assistant working for a Nvidia Jetson AI Lab Research Group. Provide information about people using tool. \n\nYour primary goal is to provide precise, contextually relevant, and concise answers based on the tools and resources available.\n\n### TOOL\nUse the \"Retrieve information about NVIDIA Jetson AI Lab and people\" tool to:\n- perform semantic similarity searches and retrieve information about Nvidia Jetson AI Lab Research Group\n relevant to the user's query.\n- access detailed information about Nvidia Jetson AI Lab Research Group when additional context or specifics are required.\n\n### Key Instructions\n1. **Response Guidelines**:\n - Clearly explain how the retrieved information addresses the user's query, if applicable.\n - If no relevant information is found, respond with: \"I cannot find the answer in the available resources.\"\n\n2. **Focus and Relevance**:\n - Ensure all responses are directly aligned with the user's question.\n - Avoid including extraneous details or relying solely on internal knowledge.\n"
}
},
"id": "c6d22128-8508-4a35-bbec-ab271c7fc707",
"name": "AI Agent",
"type": "@n8n/n8n-nodes-langchain.agent",
"position": [
-640,
200
],
"typeVersion": 1.6
},
{
"parameters": {
"model": "snowflake-arctic-embed2:latest"
},
"type": "@n8n/n8n-nodes-langchain.embeddingsOllama",
"typeVersion": 1,
"position": [
-1920,
420
],
"id": "95974449-6510-4219-a12b-3d5351ca1cfd",
"name": "Embeddings snowflake-arctic-embed2",
"credentials": {
"ollamaApi": {
"id": "WuO7GZoCJoeRCgMl",
"name": "Ollama account"
}
}
},
{
"parameters": {
"model": "snowflake-arctic-embed2:latest"
},
"type": "@n8n/n8n-nodes-langchain.embeddingsOllama",
"typeVersion": 1,
"position": [
-480,
600
],
"id": "fd92a103-15b0-4537-b01d-a8f9318c207e",
"name": "Embeddings snowflake-arctic-embed",
"credentials": {
"ollamaApi": {
"id": "WuO7GZoCJoeRCgMl",
"name": "Ollama account"
}
}
},
{
"parameters": {
"model": {
"__rl": true,
"value": "Qwen/Qwen3-4B",
"mode": "list",
"cachedResultName": "Qwen/Qwen3-4B"
},
"options": {}
},
"type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
"typeVersion": 1.2,
"position": [
-720,
420
],
"id": "dac4409d-c43d-4893-b717-449466bd5e33",
"name": "SGLang",
"credentials": {
"openAiApi": {
"id": "wIBNk1p41L1TWzWu",
"name": "Local LLM"
}
}
},
{
"parameters": {
"url": "https://www.jetson-ai-lab.com/research.html",
"options": {}
},
"id": "5c01e26e-24f4-4551-bdd8-06e5cd19b088",
"name": "Get data from website",
"type": "n8n-nodes-base.httpRequest",
"position": [
-2440,
200
],
"typeVersion": 4.2
},
{
"parameters": {
"updates": [
"message"
],
"additionalFields": {}
},
"id": "081d1e46-8136-4a15-b3a9-979f6b18acb1",
"name": "Listen for incoming events",
"type": "n8n-nodes-base.telegramTrigger",
"position": [
-1180,
540
],
"webhookId": "322dce18-f93e-4f86-b9b1-3305519b7834",
"typeVersion": 1,
"credentials": {
"telegramApi": {
"id": "iuLi4kjMG1YACFTy",
"name": "Telegram account"
}
}
},
{
"parameters": {
"content": "## 3.2 Use the telegram bot\n\n\n\n",
"height": 344,
"width": 547
},
"id": "6d2ad98d-a651-47db-9c50-44236139936c",
"name": "Sticky Note4",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1360,
420
],
"typeVersion": 1
},
{
"parameters": {
"chatId": "={{ $('Listen for incoming events').first().json.message.from.id }}",
        "text": "={{ $('AI Agent').item.json.output.replace(/&/g, \"&amp;\").replace(/>/g, \"&gt;\").replace(/</g, \"&lt;\").replace(/\"/g, \"&quot;\") }}",
"additionalFields": {
"appendAttribution": false,
"parse_mode": "HTML"
}
},
"id": "c89c55ba-5d7a-4090-a0d4-b1a7db95bef5",
"name": "Correct errors",
"type": "n8n-nodes-base.telegram",
"position": [
180,
360
],
"typeVersion": 1.1,
"webhookId": "cd30e054-0370-4aef-b7bf-483c1e73c449",
"credentials": {
"telegramApi": {
"id": "iuLi4kjMG1YACFTy",
"name": "Telegram account"
}
}
},
{
"parameters": {
"chatId": "={{ $('Listen for incoming events').first().json.message.from.id }}",
"text": "={{ $json.output }}",
"additionalFields": {
"appendAttribution": false,
"parse_mode": "HTML"
}
},
"id": "06b704a1-ce7b-437d-950a-0e5ba4253ac1",
"name": "Telegram1",
"type": "n8n-nodes-base.telegram",
"position": [
-20,
200
],
"typeVersion": 1.1,
"webhookId": "df3e62fe-25fb-481f-a4ad-255a167818bb",
"credentials": {
"telegramApi": {
"id": "iuLi4kjMG1YACFTy",
"name": "Telegram account"
}
},
"onError": "continueErrorOutput"
},
{
"parameters": {
"content": "## 5. Telegram\n\n\n",
"height": 669,
"width": 487,
"color": 6
},
"id": "2228d980-da5a-4c8f-b44b-30d049153e7e",
"name": "Sticky Note6",
"type": "n8n-nodes-base.stickyNote",
"position": [
-80,
80
],
"typeVersion": 1
},
{
"parameters": {
"content": "## 1.2 Download PDF",
"height": 335,
"width": 640,
"color": 4
},
"id": "46e1bdca-0874-481d-ae51-1fe9272fb268",
"name": "Sticky Note7",
"type": "n8n-nodes-base.stickyNote",
"position": [
-2680,
440
],
"typeVersion": 1
},
{
"parameters": {
"formTitle": "Add your file here",
"formFields": {
"values": [
{
"fieldLabel": "File",
"fieldType": "file",
"acceptFileTypes": ".pdf",
"requiredField": true
}
]
},
"options": {}
},
"id": "53581c98-d112-469d-91aa-ef358761167f",
"name": "On form submission",
"type": "n8n-nodes-base.formTrigger",
"position": [
-2540,
580
],
"webhookId": "4e1e20d4-f759-42c8-8439-87b93f43aa7c",
"typeVersion": 2.2
}
],
"pinData": {},
"connections": {
"When clicking ‘Test workflow’": {
"main": [
[
{
"node": "Get data from website",
"type": "main",
"index": 0
}
]
]
},
"Default Data Loader": {
"ai_document": [
[
{
"node": "Qdrant Vector Store",
"type": "ai_document",
"index": 0
}
]
]
},
"Recursive Character Text Splitter": {
"ai_textSplitter": [
[
{
"node": "Default Data Loader",
"type": "ai_textSplitter",
"index": 0
}
]
]
},
"Extract from HTML": {
"main": [
[
{
"node": "Qdrant Vector Store",
"type": "main",
"index": 0
}
]
]
},
"Qdrant Vector Store1": {
"ai_tool": [
[
{
"node": "AI Agent",
"type": "ai_tool",
"index": 0
}
]
]
},
"Simple Memory": {
"ai_memory": [
[
{
"node": "AI Agent",
"type": "ai_memory",
"index": 0
}
]
]
},
"When chat message received": {
"main": [
[]
]
},
"Embeddings snowflake-arctic-embed2": {
"ai_embedding": [
[
{
"node": "Qdrant Vector Store",
"type": "ai_embedding",
"index": 0
}
]
]
},
"Embeddings snowflake-arctic-embed": {
"ai_embedding": [
[
{
"node": "Qdrant Vector Store1",
"type": "ai_embedding",
"index": 0
}
]
]
},
"SGLang": {
"ai_languageModel": [
[
{
"node": "AI Agent",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"Qdrant Vector Store": {
"main": [
[]
]
},
"Get data from website": {
"main": [
[
{
"node": "Extract from HTML",
"type": "main",
"index": 0
}
]
]
},
"Telegram1": {
"main": [
[],
[
{
"node": "Correct errors",
"type": "main",
"index": 0
}
]
]
},
"Listen for incoming events": {
"main": [
[
{
"node": "AI Agent",
"type": "main",
"index": 0
}
]
]
},
"AI Agent": {
"main": [
[
{
"node": "Telegram1",
"type": "main",
"index": 0
}
]
]
},
"On form submission": {
"main": [
[
{
"node": "Qdrant Vector Store",
"type": "main",
"index": 0
}
]
]
}
},
"active": false,
"settings": {
"executionOrder": "v1"
},
"versionId": "a7cd7783-ffb3-421e-b0cd-a24b082136fb",
"meta": {
"templateCredsSetupCompleted": true,
"instanceId": "adc1857ea07e92ce7e9be3a75c7bb0fe7997ac27b02fe8d5b142a598aee663eb"
},
"id": "UPXF0jR4WYDNUYNS",
"tags": []
}
Telegram serves as the real-time messenger, enabling users to interact with the RAG agent through a familiar chat environment. Here is an example output:
This setup transforms your NVIDIA Jetson AGX Orin into a fully offline, secure AI assistant capable of understanding context, retrieving relevant data, and interacting naturally through Telegram.