# RisingWave MCP

> RisingWave provides direct access to your real-time streaming database via an MCP Server. You can execute full SQL statements (DDL/DML), stream raw JSON events into tables, and list all data components—including sources, sinks, and materialized views—all through natural conversation with your AI client.

## Overview
- **Category:** databases
- **Price:** Free
- **Tags:** streaming-database, real-time-analytics, sql, event-ingestion, materialized-views

## Description

**RisingWave** gives your AI client direct hands-on control of your real-time streaming database. You don't need to jump between a SQL console, an API playground, or documentation just to run a query. This MCP Server puts the entire data lifecycle—from source connection to final sink output—right into natural conversation with your agent. It lets you manage everything in one place.

When you use this server, your agent treats the database like a fully executable toolset. You'll find that managing complex streaming pipelines is straightforward because you can inspect every piece of data going in and out, and you can modify it as it flows through the system.

To start, you need to know what data you're dealing with. You can use `list_sources` to check all the external systems connected to RisingWave; this shows you exactly where your raw data is coming from, whether it's an S3 bucket or another database connection. Next, if you want a full picture of your setup, `list_tables` gives you a straightforward list of every standard, user-defined table in the schema. You can also check out what processed data ends up as materialized views; running `list_materialized_views` shows you the names and current status of all those complex, pre-calculated datasets.

Data doesn't just appear magically. The server lets you track where it goes once it’s done being processed. You can run `list_sinks` to view every configured external destination—that’s where your final output gets written—and `list_sources` helps confirm the input endpoints. If you need a complete inventory of the pipeline's components, you've got all the details right here.

Once you know what data exists and where it flows, you can start manipulating it. You have full power over running SQL statements. The `execute_sql` tool runs any full DDL or DML statement against your database. This means you can execute things like `CREATE`, `SELECT`, or `INSERT` commands directly through chat. You're not limited to just reading data; you can write and change it live.

For raw, real-time input, the `ingest_events` tool lets you feed streaming JSON records straight into a specific table for immediate processing. You don't need separate scripts or endpoints; you just tell your agent what data to send, and it handles the stream ingestion process instantly.

Basically, this server gives you total visibility and control. You run `execute_sql` to change the schema or write new records. You use `ingest_events` to push raw JSON objects into a table as they arrive. If you need to know what external systems feed the system, you check `list_sources`. To see where your cleaned data goes next, you check `list_sinks`. And if you wanna audit all the complex datasets that have been pre-calculated for you, just run `list_materialized_views`.

## Tools

### execute_sql
Runs any full SQL statement against the database, allowing reading and writing of data.

### ingest_events
Feeds streaming JSON records directly into a target table for immediate processing.

### list_materialized_views
Retrieves the names and status of all pre-calculated materialized views in your database.

### list_sinks
Shows a list of external destinations configured for data output from your pipelines.

### list_sources
Lists all connected external systems or initial data sources feeding into RisingWave.

### list_tables
Provides a straightforward list of every user-created table within the database schema.

## Prompt Examples

**Prompt:** 
```
List all the materialized views currently active in RisingWave.
```

**Response:** 
```
I've retrieved the catalog. You have 3 active materialized views: 'daily_sales_summary', 'user_login_stats', and 'realtime_inventory_levels'. Which one would you like to inspect?
```

**Prompt:** 
```
Ingest this JSON event into the 'user_activity' table: {"user_id": 123, "action": "login"}.
```

**Response:** 
```
I've successfully sent the event to the 'user_activity' table via the RisingWave Events API. The data is now available for streaming processing.
```

**Prompt:** 
```
Show me all external data sinks configured in the system.
```

**Response:** 
```
Querying the catalog... I found 2 configured sinks: 's3_archived_logs' (S3) and 'postgres_reporting_db' (JDBC).
```

## Capabilities

### Run full SQL queries
Execute DDL and DML statements (like CREATE, SELECT, or INSERT) against your database.

### Stream raw data events
Send single JSON objects or arrays of objects directly into a specified table for real-time processing.

### List user tables
Retrieve the names and basic metadata for all standard, user-defined database tables.

### Identify external data sources
List configured connections to outside systems, like S3 buckets or other databases.

### View data sinks
Show all defined endpoints where processed data is written (the output side of your pipelines).

### Inspect materialized views
List and check the status of pre-calculated, complex derived datasets.

## Use Cases

### Debugging broken event pipelines
A backend developer notices the 'user_activity' table is empty. They ask their agent to run `list_sources` first, confirming the S3 connection is active. Then, they use `ingest_events` to send three test records directly into the target table using known good JSON structure. The system confirms ingestion success, allowing them to pinpoint the failure source outside of RisingWave.

### Schema exploration for a new dashboard
An analytics engineer needs data from a newly integrated dataset but isn't sure what tables exist. They prompt their agent with 'Show me everything.' The agent runs `list_tables` and then uses `list_materialized_views`, providing the full list of available datasets, accelerating the dashboard build time.

### Verifying data output paths
A data team member finishes building a new real-time stream. Before deploying, they use the agent to run `list_sinks`. This confirms that both the 'S3 archive' and the 'reporting database' outputs are correctly configured targets for the processed data.

### Running ad-hoc validation queries
A developer needs to check if a specific user ID (123) was logged in today. Instead of writing a Python script, they ask their agent to run `SELECT * FROM user_activity WHERE user_id = 123 AND date > 'today'`. The agent executes the `execute_sql` tool and returns the result set instantly.

## Benefits

- **Immediate Data Verification:** Don't guess if your pipeline works. You use `list_materialized_views` to see the exact status of derived datasets before running a full report.
- **Zero-Friction Testing:** Need to test an endpoint? Use `ingest_events` to stream JSON data right into the table from your chat, bypassing manual API calls and giving instant feedback.
- **Full Data Map Visibility:** Stop guessing where data comes from or goes. Running `list_sources`, `list_sinks`, and `list_tables` gives you a complete, auditable topology of the entire system.
- **Direct Query Execution:** You don't need to export schemas just to check column types. Use `execute_sql` to run DDL statements or simple `SELECT` queries directly against the live stream data.
- **Efficiency Over Manual Steps:** Instead of opening a console, running a query, copying results, and then logging into a separate monitoring dashboard, your agent handles all steps in one flow using multiple tools.

## How It Works

The bottom line is: you use natural language instructions to perform complex, low-level database operations without writing code.

1. Subscribe to this MCP Server and provide your RisingWave connection URL.
2. Your AI agent uses the provided tools (e.g., `list_tables` or `execute_sql`) to query the database's state.
3. The server executes the command, returns the results (like a table list or data rows), and passes that output back to your chat interface.

## Frequently Asked Questions

**Can I run complex JOIN queries on my materialized views?**
Yes! Use the `execute_sql` tool to run any valid RisingWave SQL query. You can perform complex analytical queries or inspect the results of your real-time aggregations.

**How do I check if my Kafka or Pulsar sources are correctly connected?**
Use the `list_sources` tool. It queries the internal catalog to provide a list of all external data sources currently configured in your RisingWave instance.

**Is there a way to push data directly into a table without using an external source?**
Absolutely. Use the `ingest_events` tool to send JSON objects or arrays directly to a specific table via the Events API, perfect for testing or low-latency ingestion.

**When I use the execute_sql tool, how do I handle schema changes or create new tables?**
You manage structural changes using standard DDL commands within `execute_sql`. Simply run an ALTER TABLE statement to modify existing columns, or a CREATE TABLE command if you need a brand-new data structure. The system accepts all valid SQL dialects for these operations.

**What are the rate limits or maximum throughput when using the ingest_events tool?**
The `ingest_events` tool handles high volumes of JSON objects, typically supporting batch processing up to thousands of records per call. For extreme sustained loads, check your RisingWave configuration for specific API quotas, as those govern the actual ingestion rate.

**If I run list_materialized_views, does the result show me the current operational status (e.g., running or failed)?**
Yes, the output from `list_materialized_views` includes metadata about each view's health. You'll see indicators that tell you if the stream is active, paused, or if it has encountered recent errors requiring attention.

**After running list_sources, how do I verify which external systems are connected and authorized?**
The `list_sources` tool provides a catalog of configured endpoints. To confirm connectivity, you must check the accompanying connection details provided in the output for credentials or status flags. If it's not listed, the source hasn't been set up yet.

**When I run execute_sql, what is the recommended way to ensure my queries don't impact production data?**
The best practice is always to test complex SELECT statements or DML operations against a staging or development instance of your database. If you must use the live environment, wrap all read operations in transactions and verify results before committing changes.