<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI related on Apache Dubbo</title><link>https://deploy-preview-3202--dubbo.netlify.app/en/overview/reference/pixiu/ai/</link><description>Recent content in AI related on Apache Dubbo</description><generator>Hugo</generator><language>en</language><atom:link href="https://deploy-preview-3202--dubbo.netlify.app/en/overview/reference/pixiu/ai/index.xml" rel="self" type="application/rss+xml"/><item><title>configure the MCP</title><link>https://deploy-preview-3202--dubbo.netlify.app/en/overview/reference/pixiu/ai/mcp/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-3202--dubbo.netlify.app/en/overview/reference/pixiu/ai/mcp/</guid><description>&lt;h2 id="mcp-model-context-protocol-gateway-configuration">MCP (Model Context Protocol) Gateway Configuration&lt;/h2>
&lt;p>This document explains how to configure the MCP (Model Context Protocol) filters within your gateway, enabling you to securely expose backend HTTP APIs as callable &amp;ldquo;tools&amp;rdquo; for AI Agents.&lt;/p>
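&lt;p>As an illustration, exposing a backend HTTP API as a callable tool might look like the sketch below. This is a hypothetical example: the filter name &lt;code>dgp.filter.ai.mcp&lt;/code> and every field name shown are illustrative assumptions, not a definitive schema.&lt;/p>
&lt;pre>&lt;code class="language-yaml">http_filters:
  - name: dgp.filter.ai.mcp        # illustrative filter name (assumption)
    config:
      tools:
        - name: get_weather         # tool name exposed to the AI Agent
          description: "Query current weather by city"
          upstream:
            method: GET
            path: /api/v1/weather   # backend HTTP API mapped to this tool
&lt;/code>&lt;/pre>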
&lt;h3 id="introduction">Introduction&lt;/h3>
&lt;p>The Model Context Protocol (MCP) serves as an intelligent bridge between AI Agents and your existing backend services. It dynamically translates a simple, unified protocol into standard HTTP requests, allowing agents to interact with your APIs as if they were native functions or tools. This approach simplifies agent development and provides a centralized point for security, control, and observability.&lt;/p></description></item><item><title>configure upstream endpoints</title><link>https://deploy-preview-3202--dubbo.netlify.app/en/overview/reference/pixiu/ai/endpoint/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-3202--dubbo.netlify.app/en/overview/reference/pixiu/ai/endpoint/</guid><description>&lt;h2 id="llm-gateway-endpoint-configuration">LLM Gateway Endpoint Configuration&lt;/h2>
&lt;p>This document explains how to configure upstream endpoints for Large Language Models (LLMs) within your gateway&amp;rsquo;s routing configuration.&lt;/p>
&lt;h3 id="endpoint-structure">Endpoint Structure&lt;/h3>
&lt;p>Each endpoint within a cluster is defined by an &lt;code>id&lt;/code> and can contain an &lt;code>llm_meta&lt;/code> block for custom behavior:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-yaml" data-lang="yaml">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#268bd2">clusters&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - &lt;span style="color:#268bd2">name&lt;/span>: &lt;span style="color:#2aa198">&amp;#34;my_llm_cluster&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#268bd2">endpoints&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - &lt;span style="color:#268bd2">id&lt;/span>: &lt;span style="color:#2aa198">&amp;#34;provider-1-main&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#268bd2">socket_address&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#268bd2">domains&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - api.deepseek.com
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#268bd2">llm_meta&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#586e75"># ... other LLM-specific configuration goes here ...&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - &lt;span style="color:#268bd2">id&lt;/span>: &lt;span style="color:#2aa198">&amp;#34;provider-2-fallback&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#268bd2">socket_address&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#268bd2">domains&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - api.openai.com/v1
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#268bd2">llm_meta&lt;/span>:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#586e75"># ... other LLM-specific configuration goes here ...&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="llm_meta-configuration-fields">&lt;code>llm_meta&lt;/code> Configuration Fields&lt;/h3>
&lt;p>The &lt;code>llm_meta&lt;/code> block holds all configuration specific to how the gateway should treat this LLM endpoint.&lt;/p></description></item><item><title>KVCache offload</title><link>https://deploy-preview-3202--dubbo.netlify.app/en/overview/reference/pixiu/ai/kvcache/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-3202--dubbo.netlify.app/en/overview/reference/pixiu/ai/kvcache/</guid><description>&lt;h2 id="ai-kvcache-filter-configuration">AI KVCache Filter Configuration&lt;/h2>
&lt;p>This document explains how to configure and use the &lt;code>dgp.filter.ai.kvcache&lt;/code> filter in Dubbo-go-Pixiu.&lt;/p>
&lt;p>The filter integrates with vLLM (&lt;code>/tokenize&lt;/code>) and LMCache controller APIs (&lt;code>/lookup&lt;/code>, &lt;code>/pin&lt;/code>, &lt;code>/compress&lt;/code>, &lt;code>/evict&lt;/code>) to:&lt;/p>
&lt;ul>
&lt;li>provide cache-aware routing hints&lt;/li>
&lt;li>trigger cache-management actions asynchronously&lt;/li>
&lt;li>keep the main request path non-blocking&lt;/li>
&lt;/ul>
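&lt;p>A minimal configuration sketch follows. Only the filter name &lt;code>dgp.filter.ai.kvcache&lt;/code> comes from this document; the field names under &lt;code>config&lt;/code> are illustrative assumptions about how the vLLM and LMCache endpoints might be wired in.&lt;/p>
&lt;pre>&lt;code class="language-yaml">http_filters:
  - name: dgp.filter.ai.kvcache
    config:
      vllm_tokenize_url: "http://vllm:8000/tokenize"   # illustrative field name
      lmcache_controller_url: "http://lmcache:9000"    # serves /lookup, /pin, /compress, /evict
&lt;/code>&lt;/pre>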
&lt;hr>
&lt;h3 id="architecture-and-request-flow">Architecture and Request Flow&lt;/h3>
&lt;p>&lt;code>dgp.filter.ai.kvcache&lt;/code> is an HTTP decode filter. A typical request flow is:&lt;/p>
&lt;ol>
&lt;li>Parse the request body and extract &lt;code>model&lt;/code> and &lt;code>prompt&lt;/code> (falling back to &lt;code>messages&lt;/code> if &lt;code>prompt&lt;/code> is absent).&lt;/li>
&lt;li>Record local hotness statistics (&lt;code>model + prompt&lt;/code>) in the token manager.&lt;/li>
&lt;li>Try cache-aware routing:
&lt;ul>
&lt;li>read token cache for prompt&lt;/li>
&lt;li>call LMCache &lt;code>/lookup&lt;/code>&lt;/li>
&lt;li>set a preferred endpoint hint in context (&lt;code>llm_preferred_endpoint_id&lt;/code>)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Start an async cache-management goroutine (best-effort):
&lt;ul>
&lt;li>call vLLM &lt;code>/tokenize&lt;/code>&lt;/li>
&lt;li>call LMCache &lt;code>/lookup&lt;/code> if needed&lt;/li>
&lt;li>execute strategy decisions (&lt;code>compress&lt;/code> / &lt;code>pin&lt;/code> / &lt;code>evict&lt;/code>)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Continue the filter chain immediately (main request is not blocked by cache management).&lt;/li>
&lt;/ol>
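&lt;p>For the &lt;code>llm_preferred_endpoint_id&lt;/code> hint set in step 3 to take effect, the endpoint &lt;code>id&lt;/code> values configured in the cluster should match the instance ids that LMCache &lt;code>/lookup&lt;/code> reports. A sketch, with illustrative ids and domains:&lt;/p>
&lt;pre>&lt;code class="language-yaml">clusters:
  - name: "llm_cluster"
    endpoints:
      - id: "vllm-instance-0"   # should equal the instance id returned by LMCache /lookup
        socket_address:
          domains:
            - vllm-0.internal   # illustrative upstream address
&lt;/code>&lt;/pre>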
&lt;hr>
&lt;h3 id="routing-contract-important">Routing Contract (Important)&lt;/h3>
&lt;p>Current cache-aware routing uses &lt;strong>instance id matching&lt;/strong>:&lt;/p></description></item><item><title>register service</title><link>https://deploy-preview-3202--dubbo.netlify.app/en/overview/reference/pixiu/ai/registry/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-3202--dubbo.netlify.app/en/overview/reference/pixiu/ai/registry/</guid><description>&lt;h2 id="llm-service-discovery-and-registration">LLM Service Discovery and Registration&lt;/h2>
&lt;p>This document guides LLM service providers through dynamically registering their service instances with the LLM Gateway via a Nacos registry. By following these guidelines, the gateway can automatically discover your service and apply appropriate routing, retry, and fallback strategies based on the metadata you provide.&lt;/p>
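&lt;p>For example, an instance registered in Nacos might carry metadata like the following. All keys shown are illustrative assumptions about what a provider could attach, not a definitive metadata contract:&lt;/p>
&lt;pre>&lt;code class="language-yaml">service_name: my-llm-service
ip: 10.0.0.5
port: 8000
metadata:
  model: "deepseek-chat"   # model served by this instance (illustrative key)
  weight: "100"            # illustrative routing weight
&lt;/code>&lt;/pre>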
&lt;h3 id="registration-mechanism-overview">Registration Mechanism Overview&lt;/h3>
&lt;p>The core mechanism of service discovery is that your LLM service registers as a &lt;strong>Nacos instance&lt;/strong> and provides a specific set of &lt;strong>metadata&lt;/strong> upon registration. The LLM Gateway listens for service changes in Nacos, reads this metadata, and dynamically converts it into a fully functional gateway &lt;code>endpoint&lt;/code> configuration.&lt;/p></description></item></channel></rss>