Midjourney MCP. Generate and refine professional AI art in your terminal.

Works with every AI agent you already use

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client.

Just plug in your AI agents and start using Vinkius.

Midjourney AI gives you full control over generative art. Use it with your AI client to create high-fidelity images from text prompts with generate_image, upscale specific tiles, generate variations on existing concepts, and blend multiple source photos into one composition.

What your AI agents can do

Generate high-fidelity images from text

The agent takes a descriptive prompt and initiates an image generation job, providing a unique Job ID for tracking.

Isolate and upscale specific art tiles

Extracts one specific square tile from a 2x2 grid and renders it into a final high-resolution file.

Create aesthetic variations of an image

Takes an existing generated image and creates several new structural versions, allowing you to iterate on the initial concept without writing a new prompt.

Combine multiple source images

Merges two to five different input images into one cohesive composition that bridges various artistic styles.

Analyze image content for prompts

Reads the visual data of any public image URL and returns up to four candidate text prompts that describe it.

Control camera movement in generated scenes

Simulates cinematic movements, allowing you to pan across or zoom out from a scene's perspective.

Free for Subscribers

Midjourney AI (Generative Image Arts) MCP Server: 10 Tools for Visual Art

Access all core generative tools, including image generation, upscaling, blending, and camera control. Build complex visual assets with precise commands.

blend_images

Merges 2-5 input images into a single, unique visual composition using specified URLs.

describe_image

Analyzes an image URL and returns up to four candidate text prompts that describe the image's contents and style.

generate_image

Starts a new generative image job based on a provided text prompt, returning a Job ID for tracking.

generate_variation

Creates several alternative versions (variants) from an existing Midjourney grid output using a specific image reference.

get_job

Checks the current processing status of any ongoing Midjourney job using its assigned Job ID.

list_jobs

Retrieves a history of previously run prompts and their corresponding job IDs.

pan_image

Expands the view by simulating a lateral camera pan across the image borders in a specified direction.

reroll_job

Re-runs an identical generation prompt, providing a fresh set of outputs without changing the original parameters.

upscale_image

Extracts and renders a single specific tile from a 2x2 grid into its highest possible resolution.

zoom_out_image

Widens the overall perspective of an image, simulating a camera zoom-out effect.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Midjourney AI (Generative Image Arts), then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 4,700+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

What you can do with this MCP connector

Your AI client can use this server to manage every step of your generative art workflow. You tell it what you want, and the tools handle the heavy lifting.

To kick off a new piece, you pass in a descriptive text prompt using generate_image. The agent then starts an image generation job and gives you a unique Job ID for tracking. Don't sweat the waiting; you can check that job’s status at any time with get_job or review your full history of previous prompts and their IDs by calling list_jobs.

If the first set of results isn't right, don't worry. You can re-run the exact same prompt using reroll_job, which gives you a fresh batch of outputs without changing a single parameter.

When you get that initial 2x2 grid output, you've got options. If one specific tile is fire, use upscale_image to pull out just that square and render it at the highest resolution possible. You can also take an image you already generated and run generate_variation on it; this creates several structural versions of your concept, letting you iterate without having to rewrite the original prompt.

For a whole new vibe, you can merge two to five different source images into one cohesive composition using blend_images. This tool bridges multiple artistic styles from disparate sources.

Want to control the camera? You don't just get flat pictures; you can widen the frame as if the camera had moved. Use pan_image to extend the scene sideways like a lateral camera sweep, or use zoom_out_image to widen the perspective and pull back from the action.

When you can't figure out what inspired an image online, pass any public image URL to describe_image. The server analyzes the visuals and returns up to four candidate text prompts that could describe that art.

Basically, this gives your AI client full control over the whole pipeline: from getting a concept started with a prompt, tracking its status, refining specific tiles, creating variations, combining source materials, simulating camera moves, right down to figuring out what kind of prompt created a picture you found elsewhere.

How Midjourney MCP Works

1. Subscribe to the server and provide your Midjourney API Key.
2. Ask your AI agent for an image (e.g., 'Generate a cyberpunk city'). The system calls generate_image and returns a Job ID.
3. Once the job is active, you can ask the agent to track it with get_job, or refine the result using tools like upscale_image.
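
The three steps above can be sketched as a short script. This is only an illustration under stated assumptions: `call_tool` is a hypothetical stand-in for however your MCP client invokes a tool, and the argument names (`prompt`, `job_id`, `index`) and response fields (`job_id`, `status`) are guesses, not the server's documented schema. Only the tool names come from this page.

```python
from typing import Any, Callable

# Hypothetical shape of "invoke a tool by name with arguments".
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def generate_then_upscale(call_tool: ToolCaller, prompt: str, tile: int) -> dict[str, Any]:
    """Start a job, check it once, and upscale one tile if it is done."""
    job = call_tool("generate_image", {"prompt": prompt})  # assumed argument name
    job_id = job["job_id"]                                 # assumed response field
    status = call_tool("get_job", {"job_id": job_id})
    if status["status"] != "completed":
        # Not ready yet: hand the Job ID back so the caller can check again later.
        return {"job_id": job_id, "status": status["status"]}
    return call_tool("upscale_image", {"job_id": job_id, "index": tile})


def fake_call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """In-memory stand-in so the sketch runs without a real server."""
    if name == "generate_image":
        return {"job_id": "job-1", "status": "pending"}
    if name == "get_job":
        return {"job_id": arguments["job_id"], "status": "completed"}
    return {"job_id": arguments["job_id"], "status": "pending", "tool": name}


result = generate_then_upscale(fake_call_tool, "a cyberpunk city", tile=2)
print(result["tool"])
```

In practice your AI client makes these calls for you; the sketch only shows the order in which the tools are used.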

The bottom line: you tell your AI client what art you want, and the server manages the multi-step generation process from start to finish.

Who Is Midjourney MCP For?

This tool is for creative professionals who run on visual concepts. If your job involves mood boards, storyboarding, or generating marketing assets that need a 'wow' factor—you need this. It’s perfect for the Art Director tired of waiting days for photography and the Marketing Specialist needing dozens of photorealistic options by end-of-day.

Digital Artist

Uses generate_image to rapidly prototype visual concepts, then runs upscale_image to get final print-quality assets without leaving the chat interface.

Creative Director

Commands the agent to run multi-stage workflows: 'Blend these three mood board images' with blend_images, then asks for variations with generate_variation to nail down a final aesthetic direction.

Marketing Content Lead

Generates photorealistic backgrounds or product mockups by running prompts and immediately passing the result to an agent that can run describe_image against it for copy inspiration.

What Changes When You Connect

Control the final quality. Instead of settling for a low-res preview, you can use upscale_image to pull out one specific tile from your generated grid and render it into a high-fidelity asset. This saves manual post-processing time.
Build complex concepts instantly. Don't just generate one image. Use blend_images to merge 2-5 existing images, such as mood board photos or product shots, into a single composition that bridges distinct styles.
Iterate without losing context. If the first batch isn't quite right, you don't start over. Run generate_variation on an existing grid to immediately spin up new structural ideas based on what already worked.
Direct your camera work. Need a cinematic shot? Use pan_image or zoom_out_image. These tools let you extend the frame in a chosen direction or pull back for a wider view, so a static image reads like a still from a camera move.
Understand visual prompts. If you find an image online and love the look but don't know how to recreate it, run describe_image against its URL. It returns up to four candidate prompts for you.
Manage large jobs easily. With get_job and list_jobs, your agent keeps track of every generation task—pending, completed, or failed—so you never lose sight of the workflow.

Real-World Use Cases

Creating a cinematic story panel.

A concept artist needs to visualize a scene that involves wide shots and focus shifts. They ask their agent to generate an initial image, then use zoom_out_image to pull back the perspective for a grander view, and finally use pan_image on the result to simulate a slow, dramatic sweep across the environment.
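
The chain in this use case, where each step works on the previous step's result, could look roughly like the sketch below. The `call_tool` callable, the `job_id` and `direction` argument names, and the response shape are all assumptions made for illustration. A real run would also wait for each job to complete before starting the next one.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def story_panel(call_tool: ToolCaller, prompt: str, direction: str) -> list[str]:
    """Generate a scene, zoom out from it, then pan the widened result."""
    base = call_tool("generate_image", {"prompt": prompt})
    wide = call_tool("zoom_out_image", {"job_id": base["job_id"]})
    swept = call_tool("pan_image", {"job_id": wide["job_id"], "direction": direction})
    return [base["job_id"], wide["job_id"], swept["job_id"]]


def fake_call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Stand-in: each call yields a new job id derived from its parent job."""
    parent = arguments.get("job_id", "root")
    return {"job_id": f"{parent}>{name}"}


job_ids = story_panel(fake_call_tool, "a ruined cathedral at dawn", "left")
print(job_ids[-1])
```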

Building a product marketing campaign.

A marketing team has several source photos (a model, a background texture, a specific prop). Instead of hiring a composite artist, they use blend_images to merge all three elements into one photorealistic asset in minutes. They then run describe_image on the result for ad copy inspiration.

Visualizing an architectural concept.

An architect generates a base image using a prompt. The client dislikes the overall lighting, so the architect uses reroll_job to re-run the exact same prompt arguments, hoping for a different aesthetic take without altering any text.

Refining character design for a game.

The designer generates three initial characters. They like Character A but need it in two more poses. Instead of writing new prompts, they use generate_variation on Character A's grid to get structural alternatives, maximizing their time.

The Tradeoffs

Only running a simple prompt.

The user just types 'A cat wearing sunglasses.' and waits. The result is often low-resolution or aesthetically flat, requiring manual cropping and upscaling outside the agent.

Forgetting job tracking.

Submitting five prompts back-to-back without checking status. The user wastes time trying to interact with jobs that are still processing in the background queue.

Mixing up refinement tools.

Trying to use blend_images when they actually need a wider view of a single scene. This results in a messy collage instead of a cohesive, expanded shot.

Instead: upscale the tile you want with upscale_image rather than cropping by hand. Always track your jobs using get_job or list_jobs. For composition, use blend_images; for expanding the frame, use pan_image or zoom_out_image.

When It Fits, When It Doesn't

Use this server if your core need is high-fidelity visual creation and iterative concept refinement. If you are generating images based on text input and need to manipulate those results (upscaling, blending, panning), this toolset handles the full pipeline.

Don't use it if:
1. You just need a quick stock photo—there are simpler image libraries for that.
2. Your goal is merely generating a single, perfect shot on one try—you still need to manage job flow and variations using generate_variation or reroll_job to guarantee quality.
3. You're trying to automate simple data visualization (charts/graphs)—use dedicated graphing tools instead.

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Midjourney. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: managed infra
V8 Isolated: sandboxed per request
Zero-Trust Proxy: no stored credentials
DLP Enforced: policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This server provides 10 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.

Available Capabilities

blend_images, describe_image, generate_image, generate_variation, get_job, list_jobs, pan_image, reroll_job, upscale_image, zoom_out_image
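
For readers curious what a tool invocation looks like on the wire: MCP clients call tools with a JSON-RPC 2.0 request whose method is `tools/call`. The sketch below builds such a request for `generate_image`. The envelope follows the protocol, but the `prompt` argument name is an assumption, since this page does not list the tool's input schema.

```python
import json
from typing import Any


def tools_call_request(request_id: int, tool: str, arguments: dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request as used by MCP."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# "prompt" is an assumed argument name, used here only for illustration.
request = tools_call_request(1, "generate_image", {"prompt": "a cyberpunk city"})
print(request)
```

Your MCP client normally builds and sends this for you, so you rarely write it by hand.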

Visual assets shouldn't require a graphic designer’s degree to assemble.

Right now, making professional-grade mood boards is a painful sequence of manual tasks. You find three images online—a model shot, a texture sample, and a location photo. Then you have to download all three into Photoshop, manually align them, mask the edges, adjust the color grading so they look like they belong together, and finally export them at 300 DPI.

With this MCP server, your agent handles that entire pipeline. You give it the source images, tell it what you're trying to achieve (e.g., 'Make these three elements look like a vintage magazine ad'), and the `blend_images` tool executes the composition. You get the final, integrated asset—no manual alignment required.

Midjourney AI (Generative Image Arts) MCP Server: Control every frame.

Before this server, if you generated a beautiful image but felt it needed more scope or depth, you were stuck with the original crop. You might have to re-prompt entirely just to widen the view, losing your initial composition's quality and focus.

Now, after running `generate_image`, you simply tell your agent to execute `pan_image` or `zoom_out_image`. The server intelligently extrapolates the surrounding visual data, giving you a wider, more cinematic frame without compromising the core subject matter. It’s an instant depth pass.

Common Questions About Midjourney MCP

How do I get the highest quality version of my generated image using upscale_image?

You must use upscale_image on a specific tile from your 2x2 grid. This tool isolates that single tile and runs it through an advanced render pipeline, giving you the final high-resolution output.

Can I save my favorite aesthetic style for later? Does generate_variation help?

Yes. After generating a good image, use generate_variation on that specific grid result. This tool creates new structural branches based only on the visual data of your preferred image, letting you refine the look without changing the prompt.

What is the difference between generate_image and reroll_job?

generate_image starts a brand new concept from scratch. reroll_job, however, uses the exact same prompt arguments as a previous job, giving you a fresh set of outputs while maintaining the original parameters.

Does describe_image work on images I took myself?

Yes, once the photo is online. describe_image requires a valid absolute URL, so upload or host your photo first; the server reads from the public link you provide.

When I use `generate_image` repeatedly, how do I know if my job is still running or if it failed?

You must poll the API using get_job. Don't rely on immediate responses; check the status periodically. The tool returns the current state—pending, completed, or error—so your agent knows exactly when to proceed.
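
A polling loop along these lines is one way to act on that advice. It is a minimal sketch: `get_job` here is any callable you supply that returns the job record, the `status` field name is an assumption, and the interval and timeout values are arbitrary defaults rather than recommendations from Midjourney.

```python
import time
from typing import Any, Callable


def wait_for_job(
    get_job: Callable[[str], dict[str, Any]],
    job_id: str,
    interval_s: float = 5.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the job leaves 'pending'; raise on error or timeout."""
    waited = 0.0
    while True:
        job = get_job(job_id)
        state = job["status"]  # assumed field name
        if state == "completed":
            return job
        if state == "error":
            raise RuntimeError(f"job {job_id} failed")
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} still {state} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Stand-in for the real tool: pending twice, then completed.
_states = iter(["pending", "pending", "completed"])


def fake_get_job(job_id: str) -> dict[str, Any]:
    return {"job_id": job_id, "status": next(_states)}


naps: list[float] = []
done = wait_for_job(fake_get_job, "job-1", sleep=naps.append)
print(done["status"], len(naps))
```

Passing `sleep` as a parameter keeps the example fast to run and easy to test; in real use you would leave it at the default.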

What is the required format for passing multiple images into `blend_images`?

Pass all source URLs as a single string of 2 to 5 valid, absolute URLs separated strictly by commas (e.g., url1,url2,url3). This ensures your agent passes the data correctly.
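
If you assemble that string yourself, a small helper can catch mistakes before the call is made. The sketch below checks the 2-5 count and joins the URLs with commas. The function name is hypothetical, and accepting only `http`/`https` is an assumption about what counts as a valid absolute URL here.

```python
from urllib.parse import urlparse


def blend_urls_argument(urls: list[str]) -> str:
    """Validate 2-5 absolute URLs and join them with commas."""
    if not 2 <= len(urls) <= 5:
        raise ValueError(f"blend_images takes 2-5 images, got {len(urls)}")
    cleaned = []
    for url in urls:
        url = url.strip()
        parts = urlparse(url)
        # Assumption: only http(s) links with a host count as absolute URLs.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an absolute URL: {url!r}")
        if "," in url:
            raise ValueError(f"URL contains a comma and would break the list: {url!r}")
        cleaned.append(url)
    return ",".join(cleaned)


print(blend_urls_argument([
    "https://example.com/model.jpg",
    "https://example.com/texture.png ",
]))
```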

If I need to debug a complex creative workflow, how can `list_jobs` help me review past outputs?

The list_jobs tool provides a history of all previously executed prompts and jobs. It lets you quickly check the original arguments used for specific generations, helping you trace why an output looked the way it did.
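
Once you have that history, narrowing it down is ordinary list filtering. The example below assumes each history entry is a dictionary with `job_id` and `prompt` keys; the actual shape returned by list_jobs is not documented on this page, so treat the record layout as hypothetical.

```python
from typing import Any


def find_jobs_by_prompt(jobs: list[dict[str, Any]], needle: str) -> list[dict[str, Any]]:
    """Return history entries whose prompt contains `needle`, ignoring case."""
    needle = needle.lower()
    return [job for job in jobs if needle in job.get("prompt", "").lower()]


# Hypothetical history, standing in for a list_jobs result.
history = [
    {"job_id": "job-1", "prompt": "A cyberpunk city at night"},
    {"job_id": "job-2", "prompt": "A cat wearing sunglasses"},
    {"job_id": "job-3", "prompt": "Cyberpunk alley, rain"},
]
matches = find_jobs_by_prompt(history, "cyberpunk")
print([job["job_id"] for job in matches])
```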

What happens if I try to run too many image generation tasks with `generate_image` in quick succession?

Rate limits are governed by your Midjourney account plan. While the agent manages job queuing, rapid calls can hit external API caps, so always add a small delay between generate_image calls.
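
One simple way to space out submissions is to pause between consecutive calls, as in this sketch. The `submit` callable stands in for whatever triggers generate_image in your setup, and the 2-second default is an arbitrary placeholder; the right delay depends on your plan's limits.

```python
import time
from typing import Any, Callable


def submit_spaced(
    submit: Callable[[str], Any],
    prompts: list[str],
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[Any]:
    """Submit prompts one at a time, pausing between consecutive calls."""
    results = []
    for i, prompt in enumerate(prompts):
        if i > 0:
            sleep(delay_s)  # no pause before the first call
        results.append(submit(prompt))
    return results


pauses: list[float] = []
ids = submit_spaced(lambda p: f"job:{p}", ["a", "b", "c"], sleep=pauses.append)
print(ids, pauses)
```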

How do I check if my Midjourney image is ready?

Use the get_job tool with the Job ID provided. Your agent will poll the API and report the current state (pending, completed, or error). Once the job has finished, it returns the final image URL or the 2x2 grid, depending on your task.

Can I edit an image by zooming out or panning the camera through the agent?

Absolutely. Use the pan_image and zoom_out_image tools with a completed Job ID. Your agent will command Midjourney to expand the canvas or shift the focus in your specified direction, creating professional cinematographic edits.

What does the 'describe' tool do?

The describe_image tool reverse-engineers prompts. Provide an image URL, and your agent will retrieve up to four candidate text descriptions from Midjourney, suggesting words and styles that could recreate that visual concept.

Use it with your favorite AI tools

Connect this server to Cursor, Claude, VS Code, and more.

OpenAI Agents SDK (Python)

Google ADK (Python)

Pydantic AI (Python)

Vercel AI SDK (TypeScript)