How much does a typical RAG pipeline cost?

About $0.017 per 5k-token page for ingestion (ScrapePay $0.010 + MarkdownOpt $0.005 + EmbedPay ~$0.0003 + MemoryServe $0.001 + MEMSCRUB $0.001). Retrieval costs ~$0.002 per query when one chunk is scanned (EmbedPay $0.0003 + MemoryServe query $0.001 + MEMSCRUB $0.001 per chunk); each additional chunk passed through MEMSCRUB adds $0.001. All prices are in USDC on Base, settled in ~2 seconds.

Can I use my own embeddings instead of EmbedPay?

Yes. MemoryServe accepts a pre-computed vector in the /memory/write request body. If you already have an embedding from OpenAI direct, Voyage AI, or a self-hosted model, pass it through and skip the EmbedPay call entirely. The MemoryServe doc page shows the exact field.

What happens if a step fails mid-pipeline?

Every service in the fleet charges on success only: payment settles only on an HTTP 2xx response with non-empty content. A ScrapePay 422 (JS crash), a MarkdownOpt 422 (empty HTML), an EmbedPay 503 (OpenAI outage), or a MemoryServe 503 (Qdrant outage) does not settle, so failed calls cost nothing and your retry budget is preserved.
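Because failed calls do not settle, a retry loop is cheap to add. The following is a minimal sketch, not part of any official client: call_with_retry is a hypothetical helper, and the choice to retry only on 503 (and to give up on 422, where resending the same input is unlikely to help) is an assumption.

```python
import time
from typing import Callable, Tuple

# Assumption: only 503 (upstream outage) is worth retrying.
RETRYABLE = {503}


def call_with_retry(
    call: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Run `call` until it returns a 2xx with a non-empty body.

    `call` performs one request and returns (status_code, body).
    The last (status, body) pair is returned if every attempt fails.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = 0, ""
    for attempt in range(max_attempts):
        status, body = call()
        if 200 <= status < 300 and body:
            return status, body  # success: the only case that settles
        if status not in RETRYABLE:
            break  # e.g. 422: the input itself was rejected
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff
    return status, body
```

Passing sleep as a parameter keeps the helper easy to exercise without real delays.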

How do I prevent prompt injection from retrieved RAG content?

Run MEMSCRUB ($0.001 per chunk) on every retrieved chunk before injecting it into the LLM prompt. MEMSCRUB detects ten patterns, including HTML comment injection, invisible Unicode, fake system messages, role-replacement attempts, exfiltration instructions, and persona overrides. It returns a risk_level (safe/low/medium/high/critical) plus, optionally, sanitised content with the injections stripped.

Does this work with LangChain, LlamaIndex or CrewAI?

Yes — every service is plain HTTP, so any agent framework that can make a POST works. The fastest path is via the MCP wrapper: install @melis-ai/x402-tools-mcp once and all 22 services appear as MCP tools in Claude Desktop, Cursor, Cline, Continue, or any other MCP-aware client. No framework lock-in.


The canonical x402 RAG pipeline

ScrapePay → MarkdownOpt → EmbedPay → MemoryServe → MEMSCRUB. ~$0.017 per 5k-token page. No accounts. No subscriptions.

Overview

This pipeline ingests a web page into a secure, semantically searchable memory store — then protects your LLM from indirect prompt injection when retrieving chunks. Every step is a separate x402 microservice, billed per call in USDC on Base. You pay only for what you use.

ScrapePay ($0.010): fetch the live page via Playwright
→ MarkdownOpt ($0.005): strip HTML noise, ~70% token reduction
→ EmbedPay (~$0.0003): generate a 1536-dim embedding
→ MemoryServe ($0.001): store in Qdrant + SQLite
→ MEMSCRUB ($0.001): scan retrieved chunks before they reach the LLM

Total ingestion cost: ~$0.017 per 5k-token page

Step 1 — Fetch the page (ScrapePay)

ScrapePay renders the page via Playwright (JS execution included), enforces robots.txt, and charges on success only: you are not billed if the page returns an error.

POST https://scrapepay.melis.ai/scrape
{
  "url": "https://example.com/article",
  "format": "html"
}

Returns raw HTML. Pass this directly to MarkdownOpt.
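The request above can be assembled with the Python standard library. This sketch only builds the request object and does not send it; build_json_post is a hypothetical helper name, and the payment step is deliberately left out because it depends on the x402 client you use.

```python
import json
import urllib.request

SCRAPEPAY_URL = "https://scrapepay.melis.ai/scrape"  # endpoint from the request above


def build_json_post(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_json_post(
    SCRAPEPAY_URL,
    {"url": "https://example.com/article", "format": "html"},
)
# Sending is omitted: an unpaid call to an x402 service is expected to be
# answered with HTTP 402, and handling that is the job of your x402 client.
```

The same helper works for the other endpoints in this recipe, since every service accepts a plain JSON POST.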

Step 2 — Clean to markdown (MarkdownOpt)

MarkdownOpt converts HTML to clean, LLM-ready markdown, stripping nav, ads, boilerplate, and inline styles. This reduces the token count by ~70% before embedding, which lowers the EmbedPay cost.

POST https://markdownopt.melis.ai/markdown
{
  "html": "<html>...</html>"
}

Returns { markdown, token_estimate, compression_ratio }. Use the markdown string as input to EmbedPay.

Step 3 — Embed (EmbedPay)

EmbedPay calls OpenAI's text-embedding-3-small and returns a 1536-dimensional vector. Billing is per 1k tokens (cl100k_base tokenisation). For a 5k-token page after MarkdownOpt compression, this costs roughly $0.0003.

POST https://embedpay.melis.ai/embed
{
  "text": "cleaned markdown content...",
  "model": "text-embedding-3-small"
}

Returns { embedding: number[], model, tokens_used, dimensions }. Pass the embedding array to MemoryServe — or let MemoryServe call EmbedPay internally (it does so automatically when you POST content to /memory/write).
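If you call EmbedPay yourself, it can be worth checking the response before storing the vector. The sketch below uses only the response fields listed above; the function names are hypothetical, and the per-1k-token rate is an assumption derived from the "$0.0003 for a 5k-token page" figure with linear pricing, not a published price.

```python
EXPECTED_DIMENSIONS = 1536  # text-embedding-3-small, per the text above

# Assumption: linear pricing at the rate implied by $0.0003 per 5k tokens.
ASSUMED_USD_PER_1K_TOKENS = 0.0003 / 5


def check_embedding(response: dict) -> list:
    """Return the vector if the response is internally consistent, else raise."""
    vector = response["embedding"]
    if response["dimensions"] != len(vector):
        raise ValueError("dimensions field does not match vector length")
    if len(vector) != EXPECTED_DIMENSIONS:
        raise ValueError(
            f"expected {EXPECTED_DIMENSIONS} dimensions, got {len(vector)}"
        )
    return vector


def estimated_cost_usd(response: dict) -> float:
    """Rough cost estimate from tokens_used under the assumed linear rate."""
    return response["tokens_used"] / 1000 * ASSUMED_USD_PER_1K_TOKENS
```

Treat the cost estimate as a sanity check against the settled amount rather than as a billing formula.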

Step 4 — Store (MemoryServe)

MemoryServe stores the content in Qdrant (vector search) and SQLite (full content + metadata). It calls EmbedPay internally — you do not need to pre-embed if you POST content directly.

POST https://memoryserve.melis.ai/memory/write
{
  "content": "cleaned markdown content...",
  "agent_id": "my-research-agent",
  "metadata": {
    "source_url": "https://example.com/article",
    "ingested_at": "2026-05-08T00:00:00Z"
  }
}

Returns { id, agent_id, created_at, vector_id }. The id is the SQLite row ID; use it for deletion (GDPR compliance).
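The write body can be assembled by a small helper so that every record carries the same metadata. This is a sketch under the request shape shown above; build_write_payload is a hypothetical name, and the empty-content guard is an added convenience rather than documented service behaviour.

```python
from datetime import datetime, timezone
from typing import Optional


def build_write_payload(
    content: str,
    agent_id: str,
    source_url: str,
    now: Optional[datetime] = None,
) -> dict:
    """Assemble a /memory/write body with a UTC ingested_at timestamp."""
    if not content.strip():
        raise ValueError("content is empty; nothing to store")
    # Pass a timezone-aware datetime; a naive one is interpreted as local time.
    now = now or datetime.now(timezone.utc)
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "content": content,
        "agent_id": agent_id,
        "metadata": {"source_url": source_url, "ingested_at": stamp},
    }
```

Storing the returned id next to source_url on your side makes a later deletion request straightforward.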

Querying memory

POST https://memoryserve.melis.ai/memory/query
{
  "query": "what is the refund policy?",
  "agent_id": "my-research-agent",
  "top_k": 5
}

Returns an array of the top-k semantically similar chunks, each with score, content, and metadata. Pass retrieved chunks through MEMSCRUB before sending to your LLM.
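Since MEMSCRUB is billed per chunk, it can help to drop weak matches before scanning. The sketch below assumes that a higher score means a closer match and uses an illustrative threshold of 0.3; neither is stated in the MemoryServe description above, and select_chunks is a hypothetical helper.

```python
from typing import List


def select_chunks(results: List[dict], min_score: float = 0.3) -> List[dict]:
    """Keep results at or above min_score, best match first.

    Assumes each result has a numeric "score" where higher is better.
    """
    kept = [r for r in results if r["score"] >= min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)
```

Tune min_score against your own data; a threshold that is too high silently discards relevant context.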

Step 5 — Scan for injection (MEMSCRUB)

Indirect prompt injection is an attack in which instructions are planted in third-party content to hijack your agent when it reads that content. MEMSCRUB runs 10 heuristic rules across each retrieved chunk before it reaches your LLM.

POST https://memscrub.melis.ai/scrub
{
  "content": "retrieved chunk text...",
  "sanitize": true
}

Returns { risk_score, risk_level, flagged, safe, sanitized }. risk_level is one of safe | low | medium | high | critical.

If safe is true, pass the content to the LLM.
If risk_level is medium or higher, log the chunk and optionally skip it.
If you sent sanitize: true, MEMSCRUB returns a cleaned version with injection patterns removed; use sanitized instead of the original.
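The three rules above can be combined into one policy function. This is one possible reading, not the service's prescribed behaviour: it assumes safe is a boolean and sanitized is a string, it always skips chunks at or above a configurable level, and it prefers the sanitized text for anything below that level. text_for_llm and skip_at are hypothetical names.

```python
from typing import Optional

RISK_ORDER = ["safe", "low", "medium", "high", "critical"]


def text_for_llm(original: str, scrub: dict, skip_at: str = "medium") -> Optional[str]:
    """Return the text to pass to the LLM, or None to drop the chunk."""
    if scrub.get("safe"):
        return original
    level = scrub["risk_level"]
    if RISK_ORDER.index(level) >= RISK_ORDER.index(skip_at):
        print(f"skipping chunk: risk_level={level}")  # stand-in for real logging
        return None
    # Below the skip threshold: prefer the cleaned text when one was returned.
    return scrub.get("sanitized") or original
```

Raising skip_at to "critical" keeps more context at the cost of relying on the sanitized output for medium and high chunks.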

What MEMSCRUB detects

10 heuristic rules covering:

HTML comment injection (instructions hidden inside <!-- --> comments)
Invisible Unicode (zero-width characters used to hide payloads)
Fake tool responses ([TOOL_RESULT], [FUNCTION_OUTPUT])
Metadata injection (system_context:, assistant_config:)
Conditional triggers (if the user asks about X, respond Y)
Chain-of-thought hijacking (thinking step by step planted in content)
Exfiltration instructions (send all conversation history to)
Persona injection (you are now, your new identity is)
Fake system messages ([SYSTEM], SYSTEM OVERRIDE)
Base64 payload detection
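To make one of these rules concrete, here is a rough local approximation of the invisible-Unicode check. It is an illustration only and says nothing about how MEMSCRUB implements its rules; the character set and both function names are assumptions.

```python
# Zero-width code points that render as nothing and can hide text.
ZERO_WIDTH = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\u2060": "WORD JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_zero_width(text: str) -> list:
    """Return (index, name) for each zero-width character in text."""
    return [(i, ZERO_WIDTH[ch]) for i, ch in enumerate(text) if ch in ZERO_WIDTH]


def strip_zero_width(text: str) -> str:
    """Remove the zero-width characters listed above."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)
```

Some of these characters have legitimate uses (for example, joiners in emoji sequences and certain scripts), so blanket stripping can damage benign text.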

Cost summary

Step	Service	Cost (5k-token page)
Fetch	ScrapePay	$0.010
Clean	MarkdownOpt	$0.005
Embed	EmbedPay	~$0.0003
Store	MemoryServe	$0.001
Scan	MEMSCRUB	$0.001
Total per page		~$0.017
Query (per call)	MemoryServe /memory/query	$0.001 + ~$0.00001 EmbedPay
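The figures in the table can be turned into a small budgeting helper. The prices are copied from the table; the per-query function additionally assumes that every retrieved chunk is passed through MEMSCRUB at $0.001 each, which is a usage assumption rather than a requirement. Function and constant names are hypothetical.

```python
from decimal import Decimal

# Per-call prices in USD, copied from the table above.
INGEST_PRICES = {
    "ScrapePay": Decimal("0.010"),
    "MarkdownOpt": Decimal("0.005"),
    "EmbedPay": Decimal("0.0003"),   # approximate, for a 5k-token page
    "MemoryServe": Decimal("0.001"),
    "MEMSCRUB": Decimal("0.001"),
}
QUERY_PRICE = Decimal("0.001")          # MemoryServe /memory/query
QUERY_EMBED_PRICE = Decimal("0.00001")  # approximate EmbedPay share of a query
SCRUB_PRICE = Decimal("0.001")          # MEMSCRUB, per scanned chunk


def ingestion_cost(pages: int = 1) -> Decimal:
    """Total ingestion cost for `pages` pages of roughly 5k tokens each."""
    return sum(INGEST_PRICES.values(), Decimal("0")) * pages


def query_cost(chunks_scanned: int) -> Decimal:
    """Cost of one query plus a MEMSCRUB scan of each retrieved chunk."""
    return QUERY_PRICE + QUERY_EMBED_PRICE + SCRUB_PRICE * chunks_scanned
```

Under these assumptions, a query with top_k set to 5 and all five chunks scanned comes to about $0.006, so MEMSCRUB scanning is the main driver of per-query cost.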

MCP usage

If you've installed @melis-ai/x402-tools-mcp, call the pipeline steps as MCP tools:

scrapepay({ url: "https://example.com/article" })
markdownopt({ html: result.html })
memoryserve_write({ content: result.markdown, agent_id: "my-agent", metadata: {...} })
memscrub({ content: retrieved_chunk, sanitize: true })

EmbedPay is called internally by MemoryServe — no separate MCP call needed.
