# Lingyi Wanwu MCP

> Lingyi Wanwu connects your AI agent directly to the Yi LLM ecosystem. This MCP handles chat completions, generates semantic embeddings for RAG pipelines, and provides real-time account usage monitoring. You get a single point of control over high-performance bilingual models like Yi-Large.

## Overview
- **Category:** industry-titans
- **Price:** Free
- **Tags:** yi-models, 01-ai, llm-api, embeddings, chinese-ai, kai-fu-lee

## Description

Lingyi Wanwu connects your AI agent to the Yi LLM ecosystem and gives you a single point of control over high-performance bilingual models such as Yi-Large. This MCP covers the full workflow: running chats, generating vectors, and tracking what you spend.

Use the `chat_completions` tool to send a prompt to any available Yi model; it returns a generated response and maintains context across multiple turns of the conversation. Before sending a prompt or returning a response, you can pass the text through `check_moderation`. This tool checks prompts and responses against policy filters and flags anything that violates usage guidelines.

When you need to build a semantic search index or a Retrieval-Augmented Generation (RAG) system, use `get_embeddings`. It takes input text and generates a high-dimensional numerical vector representing its semantic meaning. For model selection, the `list_models` tool fetches the complete list of accessible Yi models along with their technical specifications.

Lastly, tracking costs is simple. The `get_usage` tool retrieves your current account metrics. It shows you how many tokens you've consumed and what your remaining balance is. You can keep an eye on your operational spending without having to check a dashboard manually.

## Tools

### list_models
Fetches a list of all accessible Yi model names and their technical specifications.

### chat_completions
Sends a prompt message to one of the Yi models and returns the generated response.

### check_moderation
Runs content through policy filters, flagging any text that violates usage guidelines.

### get_embeddings
Takes input text and generates a numerical vector representing its semantic meaning.

### get_usage
Retrieves current account metrics, including tokens consumed and remaining balance.

## Prompt Examples

**Prompt:** 
```
Chat with the Yi-Large model and ask 'Explain the impact of AI on the future of work'.
```

**Response:** 
```
Connecting to Yi engine... Yi-Large responded: 'The impact of AI on work is multi-faceted, ranging from task automation to the creation of new specialized roles...' Would you like me to summarize the key points?
```

**Prompt:** 
```
Generate embeddings for my company's mission statement.
```

**Response:** 
```
Retrieving embeddings... I've generated a semantic vector for your mission statement using the Yi-embedding model. The data is now available for your RAG search workflows.
```

**Prompt:** 
```
Check my current account balance in Lingyi Wanwu.
```

**Response:** 
```
Checking billing status... Your current account balance is 50.00 CNY. This is sufficient for approximately 2 million tokens on the Yi-Large model.
```
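
The token estimate in the last response is simple arithmetic. As a rough sketch, the figures in that example (50.00 CNY covering about 2 million tokens) imply a flat rate of 25 CNY per million tokens. That rate is inferred from the example rather than taken from a published price list, and `estimate_tokens` is a hypothetical helper, not part of this MCP.

```python
def estimate_tokens(balance_cny: float, price_per_million_cny: float) -> int:
    """Estimate how many tokens a balance can buy at a flat per-million-token price."""
    if price_per_million_cny <= 0:
        raise ValueError("price_per_million_cny must be positive")
    return int(balance_cny / price_per_million_cny * 1_000_000)


# Rate implied by the example response above; an assumption, not a published price.
IMPLIED_PRICE_CNY_PER_MILLION = 25.0

print(estimate_tokens(50.00, IMPLIED_PRICE_CNY_PER_MILLION))  # 2000000
```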

## Capabilities

### Generate conversation responses
Send prompts to Yi models (like Yi-34B-Chat or Yi-Large) and receive structured text outputs, maintaining context across turns.

### Create semantic vectors
Take any piece of text and generate a high-dimensional embedding vector for use in search indexes and RAG systems.

### Check content compliance
Pass outgoing prompts or generated responses through the moderation tool to check for policy violations before they are sent.

### View available models
List all Yi model versions and retrieve their specific technical details, helping you choose the right model for the job.

### Track token usage
Retrieve current account statistics, including consumed tokens and remaining balance, keeping your operational costs clear.

## Use Cases

### Building a secure internal knowledge bot
A user needs an agent that answers questions based on private documents. First, they run `get_embeddings` on the text extracted from their 100 PDFs to create vectors. Then, when a question comes in, the agent embeds the question, retrieves the closest documents by vector similarity, and calls `chat_completions` to synthesize an answer from them. Every step runs through the same set of tools, so the retrieved sources behind each answer can be checked.
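
The retrieval step in this flow can be sketched as follows. This is a minimal illustration, assuming `embed` and `chat` are thin wrappers you write around the `get_embeddings` and `chat_completions` tools; those wrapper names, the prompt wording, and the in-memory document store are all hypothetical. Cosine similarity is used here as one common ranking choice, not something the service mandates.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Vector, doc_vecs: Dict[str, Vector], k: int) -> List[str]:
    """Return the IDs of the k documents most similar to the query vector."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


def answer_question(
    question: str,
    documents: Dict[str, str],
    embed: Callable[[str], Vector],  # hypothetical wrapper around get_embeddings
    chat: Callable[[str], str],      # hypothetical wrapper around chat_completions
    k: int = 2,
) -> str:
    """Embed the documents, retrieve the k closest to the question, then ask the model."""
    doc_vecs = {doc_id: embed(text) for doc_id, text in documents.items()}
    best_ids = top_k(embed(question), doc_vecs, k)
    context = "\n\n".join(documents[doc_id] for doc_id in best_ids)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return chat(prompt)
```

In a real deployment you would embed the documents once and store the vectors, rather than re-embedding them on every question as this sketch does.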

### Developing an automated content moderation pipeline
A platform needs to filter all user-submitted comments before saving them. The agent first runs a pre-check using `check_moderation`. If clean, it proceeds with the main task via `chat_completions`; otherwise, it flags the failure and stops.
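
The gate described above can be sketched as a small function. This assumes you wrap the two tools in callables, and that the moderation wrapper returns `True` when the text is flagged; the function name, the wrapper signatures, and that return convention are hypothetical.

```python
from typing import Callable, Optional


def moderated_reply(
    comment: str,
    is_flagged: Callable[[str], bool],  # hypothetical wrapper around check_moderation
    chat: Callable[[str], str],         # hypothetical wrapper around chat_completions
) -> Optional[str]:
    """Return the model's reply, or None if the comment fails the moderation pre-check."""
    if is_flagged(comment):
        # Stop here: the chat tool is never called for flagged content.
        return None
    return chat(comment)
```

Returning `None` keeps the "flag and stop" branch explicit, so the caller decides how to log or report the rejected comment.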

### Optimizing cost for an enterprise app
A developer suspects their service is running out of budget. They immediately call `get_usage` to see the current token count. This insight guides them to use a cheaper model found via `list_models` instead of defaulting to Yi-Large.

### Integrating new LLM features into a client app
A team needs to test an entirely new feature that requires complex chat logic. Before writing any code, they call `list_models` to confirm the model ID exists and then use `chat_completions` in a sandboxed environment.

## Benefits

- Stop guessing which model to use. Use `list_models` first to see every available version of the Yi LLM, then select the specific one you need for the task.
- Control your spend right from your agent. The `get_usage` tool lets you pull real-time token counts and balances before running expensive chat completions.
- Build better search systems. Instead of basic keyword matching, use `get_embeddings` to convert company documents into semantic vectors, making RAG searches far more accurate.
- Keep your outputs clean. Run any generated text through `check_moderation` right after the `chat_completions` call, so flagged content can be blocked before it reaches the user.
- Manage complex conversations easily. The `chat_completions` tool handles persistent context, so you don't have to resend the entire chat history with every follow-up message.

## How It Works

Once configured, you get a single, authenticated connection point for running Yi model workflows, with no credential handling inside your agent logic.

1. Subscribe to the server. Then log into the Lingyi Wanwu Developer Platform.
2. Generate a new API Key within the platform's 'API Keys' section.
3. Insert your unique API Key into the field provided here. Your agent can now manage Yi model workflows.

## Frequently Asked Questions

**How do I check my token usage using the `get_usage` tool?**
Call `get_usage()` in your agent workflow. It will return a JSON object detailing your current consumption and remaining balance for the Yi models.

**What is the difference between chat completions and embeddings?**
Chat completions generate text based on prompts (like having a conversation). Embeddings (`get_embeddings`) convert text into numerical vectors, which are used by search engines to find semantic matches.

**Does the `list_models` tool list all LLMs?**
No, `list_models` only lists the available Yi models. For a complete picture of every model on the market, you'll need to consult external documentation.

**Can I use `check_moderation` before running `chat_completions`?**
Yes. It's best practice to run a user prompt through `check_moderation` first. If the prompt is flagged, you stop the workflow and prevent the chat call from ever happening.

**How do I handle rate limits when running `chat_completions`?**
The service enforces standard API rate limits. If your agent exceeds the quota, the call fails with a specific HTTP error code that indicates how long to wait before retrying. Your workflow logic must handle this by retrying with exponential backoff.
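
A minimal sketch of that retry logic is shown below. The `RateLimitError` class and its `retry_after` attribute are hypothetical stand-ins for however your client surfaces the rate-limit error; the delays and attempt count are arbitrary defaults, not values documented by the service.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error for a rate-limited call; not part of any real SDK."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("rate limit exceeded")
        self.retry_after = retry_after  # seconds suggested by the service, if any


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimitError, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Prefer the wait the service asked for; otherwise back off exponentially.
            backoff = min(max_delay, base_delay * 2 ** attempt)
            delay = err.retry_after if err.retry_after is not None else backoff
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the logic can be tested without real waiting. With many concurrent agents, adding random jitter to each delay is a common refinement.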

**Does `get_embeddings` handle bilingual text, specifically Chinese characters?**
Yes, the embedding model is optimized for both English and Mandarin (EN/CN). You can pass mixed English and Chinese text in a single input; it generates one semantic vector that accounts for both languages.

**If I use an outdated model name in `chat_completions`, how does `list_models` help?**
The `list_models` tool provides the definitive, currently active names and versions of all supported Yi models. Run this first to guarantee you are using the correct identifier before submitting any chat request.
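
As an illustration, the check can be a small lookup against the IDs returned by `list_models`. This sketch assumes you have already reduced the tool's result to a plain collection of ID strings; `resolve_model_id` is a hypothetical helper, and the case-insensitive match is a convenience choice rather than documented service behavior.

```python
from typing import Iterable


def resolve_model_id(requested: str, available_ids: Iterable[str]) -> str:
    """Return the listed ID matching `requested`, ignoring case and surrounding whitespace."""
    wanted = requested.strip().lower()
    for model_id in available_ids:
        if model_id.lower() == wanted:
            return model_id
    raise ValueError(f"Model {requested!r} is not in the current model list")
```

Raising on an unknown name fails fast before any chat request is submitted.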

**What is the expected input format when running `check_moderation`?**
The tool expects either a single string or an array of strings in the payload. It checks all provided text elements against policy rules and returns a status flag for every item you send.
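
A small helper can normalize both accepted shapes before the call and line up the per-item results afterward. This is a sketch under assumptions: both function names are hypothetical, and representing each status flag as a boolean is a guess at the response shape, not a documented format.

```python
from typing import List, Sequence, Tuple, Union


def normalize_moderation_input(payload: Union[str, Sequence[str]]) -> List[str]:
    """Turn a single string or a sequence of strings into a list of strings."""
    if isinstance(payload, str):
        return [payload]
    items = list(payload)
    if not all(isinstance(item, str) for item in items):
        raise TypeError("every item must be a string")
    return items


def pair_with_flags(texts: Sequence[str], flags: Sequence[bool]) -> List[Tuple[str, bool]]:
    """Match each input text with its status flag (assumed here to be a boolean)."""
    if len(texts) != len(flags):
        raise ValueError("expected one flag per input item")
    return list(zip(texts, flags))
```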

**Which Yi model is best for complex reasoning?**
For complex reasoning and high-quality outputs, `yi-large` is recommended. For faster response times and cost efficiency, `yi-medium` or `yi-spark` are excellent alternatives.

**Can I automatically retrieve my remaining account balance?**
Yes. Use the `get_usage` tool. Your agent will connect to the Lingyi Wanwu billing service and return your current remaining balance.

**How do I list all the technical specs for the Yi models?**
Use the `list_models` tool. Your agent will retrieve a list of all models currently available on the platform, along with their IDs and capabilities.