#LLM Inference MCP Servers

Discover 8 MCP servers tagged with LLM Inference on the Vinkius App Catalog.

Groq MCP

Empower LLM applications via Groq. Perform ultra-fast LPU-accelerated chat completions, handle audio transcription and translation, and use JSON mode directly from any AI agent.

Groq MCP

10 tools

Run large language models at unprecedented speed with custom LPU hardware that delivers real-time AI inference at massive scale.

Cerebras Inference MCP

15 tools

Access lightning-fast AI inference via the Cerebras Wafer-Scale Engine. Generate chat completions, manage models, and run batch jobs at record speeds.

Anyscale MCP

7 tools

Orchestrate your Anyscale infrastructure. Manage LLM queries, vectors, services, and cluster batch jobs directly from your AI agent.

Fireworks AI MCP

6 tools

Empower LLM applications via Fireworks AI. Perform ultra-fast chat completions, generate embeddings and images, and transcribe audio directly from any AI agent.