Supercharge your AI with N-Gram Frequency Engine. Count phrase occurrences with mathematical precision.

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel


Works with every AI agent you already use

…and any MCP-compatible client

Connect to your AI in seconds.

The N-Gram Frequency Engine precisely counts word phrases. It extracts unigrams, bigrams (two words), and trigrams (three words) from huge documents using native V8 JavaScript.

Stop relying on LLMs to approximate phrase counts; this server returns exact, reproducible frequency counts every time.

What your AI can do

Extract ngram frequencies

This tool pulls the most frequent word groups (N-Grams) from text using deterministic counting.

Count Word Phrases

It calculates how many times specific sequences of words (bigrams, trigrams) appear in your text.

Handle Large Texts

The engine processes large documents without hitting the token limits that trip up standard language models.

Extract Specific N-Grams

You specify the size of the word group (N) and the tool pulls out only those specific patterns.


Compatible AI Apps

OAuth 2.0 Compatible

Claude

ChatGPT

Cursor

Gemini

VS Code

JetBrains

Vercel

Zendesk

+ any other MCP app

Included with Plan


N-Gram Frequency Engine MCP Server: 1 Tool for Text Analysis

Use the extract_ngram_frequencies tool to calculate deterministic frequency counts of word sequences in large bodies of text.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using N-Gram Frequency Engine on Vinkius


Connect to your AI in seconds. Security and governance baked right in.

Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there is no routing to configure yourself.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The N-Gram Frequency Engine integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "n-gram-frequency-engine": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the N-Gram Frequency Engine tools with full Vinkius guardrails applied.

VS Code Copilot


One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and run MCP: Open User Configuration (or create .vscode/mcp.json in your workspace).

Add Server Config

Add the Vinkius endpoint under the servers key of your mcp.json file:

"servers": {
  "n-gram-frequency-engine": {
    "type": "http",
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

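Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint over Streamable HTTP:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius; get_tools() is a coroutine, so run it in an event loop
client = MultiServerMCPClient({"n-gram-frequency-engine": {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable_http",
}})
tools = asyncio.run(client.get_tools())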

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

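from crewai import Agent
from crewai_tools import MCPServerAdapter  # pip install 'crewai-tools[mcp]'

# Connect securely to Vinkius over Streamable HTTP
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

# Assign the discovered tools to an Agent while the connection is open;
# build and run your Crew inside this block
with MCPServerAdapter(server_params) as vinkius_tools:
    researcher = Agent(
        role='Data Researcher',
        goal='Report exact phrase frequencies',  # illustrative value
        backstory='An analyst who relies on measured counts.',  # illustrative value
        tools=vinkius_tools,
    )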

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on every call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with N-Gram Frequency Engine, then connect any of our 5,000+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,000+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week


Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by natural. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on every call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 1 capability that interfaces natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

Counting recurring phrases in large documents isn't simple.

Today, when you have a massive text—say, 50 pages of user reviews—and you want to know the top five common two-word phrases, you usually throw it all into an AI prompt. The LLM tries its best, but because of context limitations and how large language models process data, it approximates the count. You end up with a 'pretty good guess' that might be off by twenty percent.

With the N-Gram Frequency Engine, you pass that same 50-page document to `extract_ngram_frequencies`. It runs the count in V8 JavaScript and returns the exact top phrases and their counts. No guessing required. Just hard numbers.

N-Gram Frequency Engine MCP Server: Count phrase occurrences with precision.

Manual analysis requires you to copy sections, use spreadsheet formulas for bigram counts, and then manually cross-reference data across different sources. It's slow, prone to formula errors, and doesn't scale past a few hundred words.

Now, route the entire corpus through this server. You get one clean API call that returns every phrase count you need, structured for immediate use in any database or script. The process is instant.
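
If you are wiring this up by hand rather than through an AI client, the "one clean API call" is an MCP `tools/call` request. The sketch below builds that JSON-RPC body in Python. The envelope and method name follow the Model Context Protocol; the argument names `text`, `n`, and `top_k` are assumptions for illustration only, since this page does not document the tool's input schema.

```python
import json


def build_tool_call(text: str, n: int = 2, top_k: int = 5, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 body for an MCP tools/call request."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "extract_ngram_frequencies",
            # Hypothetical argument names; check the tool's published schema.
            "arguments": {"text": text, "n": n, "top_k": top_k},
        },
    }
    return json.dumps(payload)


body = build_tool_call("slow loading and slow loading again", n=2)
print(body)
```

An MCP client library normally builds and sends this body for you, so treat it as a picture of what crosses the wire, not something you need to write yourself.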

Support 24/7 support@vinkius.com ↗

Security Vinkius Trust Center ↗

SLA Service Level Agreement ↗

Report Listing Send Report ↗

What your AI can actually do with this

N-Gram Frequency Engine - Count Word Phrases

You need to know exactly how often specific word combinations, like "core business strategy" or "Q3 revenue forecast", show up in massive reports. Standard language models can't handle that; they approximate the count, or they run into token limits and miss entire phrases. That's guesswork.

The N-Gram Frequency Engine fixes that problem completely. It pulls data directly using native V8 JavaScript, giving you mathematically perfect counts for bigrams (two words), trigrams (three words), and any custom word group size (N) every time. Forget estimations; this is a deterministic count of word patterns across huge bodies of text.

The `extract_ngram_frequencies` Tool

The primary tool, extract_ngram_frequencies, deterministically calculates the most frequent N-Grams in any source text. You feed it your documents, and it doesn't just skim the surface; it processes them fully.

When you run this engine, you get immediate access to three core capabilities. First, you can count word phrases by specifying whether you want bigrams or trigrams, knowing that each sequence is counted precisely. Second, because it runs on V8 JavaScript rather than inside a model's context window, the tool handles huge documents without tripping over token limits, so you don't lose data just because a document is too long for a typical AI client. Third, you can specify exactly how large a word group (the N value) you want to count, letting you pull out only those specific patterns and ignore everything else.

This isn't about general text analysis; it's surgical counting. You're not asking your agent for a summary—you're demanding precise data points showing exactly how many times 'supply chain management' or 'regulatory compliance risk' appears across thousands of pages of transcripts. The engine delivers that structured list detailing the top N-Grams and their exact counts.

Think of it this way: you hand over a massive corpus—say, all the meeting minutes from the last year—and your agent doesn't waste time trying to summarize the vibe. Instead, it uses extract_ngram_frequencies to generate a list that tells you, definitively, which three-word phrases dominated the conversation and how many times each one appeared.

You get these numbers back immediately.

The ability to specify N means you control the scope of the count. Need only two-word pairs? Set N=2. Only looking for key concepts spread over three words? Set N=3. The tool handles all those parameters using native JS power, guaranteeing that every instance of your target phrase gets tallied correctly, no exceptions.
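
To make the effect of N concrete, here is a minimal Python sketch of the sliding window that turns a token list into N-Grams. It illustrates the general technique only; the function name `ngrams` is hypothetical and this is not the server's own code, which runs in JavaScript.

```python
def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Return every run of n adjacent tokens, in order of appearance."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A text of T tokens yields T - n + 1 windows (none if T < n).
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "supply chain management drives supply chain costs".split()
print(ngrams(tokens, 2))  # 6 two-word windows
print(ngrams(tokens, 3))  # 5 three-word windows
```

Raising N by one removes one window and makes each remaining window more specific, which is why trigram counts are usually lower than bigram counts on the same text.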

Built · Hosted · Managed by Vinkius N-Gram Frequency Engine - Count Word Phrases

Server ID 019e38c4-6e3f-72cf-9100-be8c3f0f58e9

Vinkius Inspector

Compliance Grade A+

Score 100/100

Report View Report ↗

Here's how it actually works

The bottom line is you get reliable, mathematically perfect phrase counts without relying on an LLM's memory or approximation.

Feed the engine a large body of text. This can be everything from transcripts to full articles.

The server runs extract_ngram_frequencies in V8 JavaScript, which tallies every N-Gram in the text and ranks them by frequency.

You get back a list that shows the top phrases and their precise frequency count.
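
The three steps above can be sketched locally in a few lines of Python. This is an illustration of deterministic N-Gram counting under stated assumptions, not the server's implementation: the tokenization rule (lowercase, then runs of letters, digits, and apostrophes) and the names `top_ngrams`, `n`, and `k` are chosen for the example.

```python
import re
from collections import Counter


def top_ngrams(text: str, n: int = 2, k: int = 5) -> list[tuple[str, int]]:
    """Return the k most frequent n-grams in text with their exact counts."""
    # Assumed tokenization: lowercase, then runs of letters, digits, apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    # Slide a window of n tokens across the text and join each window.
    grams = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    # Counter tallies every window; most_common ranks them by count.
    return Counter(grams).most_common(k)


sample = "Login error again. The login error page shows a login error."
print(top_ngrams(sample, n=2, k=1))  # [('login error', 3)]
```

Because every step is plain counting, the same input always produces the same output, which is the property the engine is built around.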

See it in action

01

Analyzing Competitor Content

An SEO analyst needs to map a competitor's keyword strategy from 10 linked articles. Running the text through extract_ngram_frequencies finds the exact top 10 most frequent trigrams, showing where they are focusing their content efforts. This is impossible to do reliably using only an LLM prompt.

02

Mining Customer Feedback

A product manager collects thousands of user reviews. They use the engine to extract bigram frequencies, identifying phrases like 'slow loading' or 'login error,' which pinpoints exactly where users are struggling across the entire dataset.

03

Academic Corpus Review

A linguist is studying a niche field. They feed the engine an entire corpus of historical documents and use extract_ngram_frequencies to get deterministic counts on specific academic terminology, verifying patterns that standard summarization tools would miss.

04

Identifying Core Themes in Legal Docs

A compliance officer needs to check thousands of meeting transcripts for recurring legal phrases. They use the engine to calculate trigram frequencies, providing a verifiable count of key terms like 'non-disclosure agreement' or 'liability waiver'.

The honest tradeoffs

Asking an LLM directly

Anti-pattern

Prompting your agent: 'Find the top 5 bigrams from this 100-page PDF.' The model will try its best, but it’ll likely fail or give you a guess because of token limits.

The Fix

Instead, route the text to extract_ngram_frequencies. This dedicated tool bypasses LLM limitations and gives you an exact count. Use the specific N-Gram counting tool instead of relying on general prompting.

When It Fits, When It Doesn't

Use this server if your primary goal is a precise, verifiable count of word sequences (unigrams, bigrams, trigrams). If you need to know how many times 'deep learning' appears in 10,000 documents, use extract_ngram_frequencies.
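
For the "how many times does 'deep learning' appear" case, the underlying operation is an exact match on adjacent tokens, summed over documents. A minimal Python sketch follows; the function name `count_phrase` and the tokenization rule are assumptions for illustration, not the server's API.

```python
import re


def count_phrase(documents: list[str], phrase: str) -> int:
    """Count exact, case-insensitive occurrences of phrase across documents."""
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    n = len(target)
    if n == 0:
        return 0
    total = 0
    for doc in documents:
        # Tokenize each document separately so a match never spans two documents.
        tokens = re.findall(r"[a-z0-9']+", doc.lower())
        total += sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target)
    return total


docs = [
    "Deep learning beats shallow models.",
    "We teach deep learning; deep learning is popular.",
    "Learning deep",
]
print(count_phrase(docs, "deep learning"))  # 3
```

Word order matters here: the third document contains both words but not the phrase, so it contributes nothing to the total.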

Don't use this if your goal is general summarization ('What are the main topics?') or semantic understanding ('Why did they feel frustrated?'). For those tasks, you need a general-purpose LLM. This server is purely a counting mechanism—it tells you what words cluster together, not why.

It's all about metric accuracy versus conceptual depth.

Questions you might have

How does N-Gram Frequency Engine MCP Server count phrases? +

It uses native V8 JavaScript to perform deterministic counting on the source text, guaranteeing accurate counts for unigrams, bigrams, and trigrams. This process bypasses LLM token limits entirely.

Can I use extract_ngram_frequencies to count phrases in PDFs? +

Yes, as long as the PDF content is first extracted into a plain text string, the extract_ngram_frequencies tool can process it. The engine works on raw text data.

Is this better than just asking my agent to summarize the document? +

Yes, because summarizing describes concepts; counting is factual. This server gives you hard metrics (the frequency count), while a summary only provides qualitative takeaways. They solve different problems.

How do I change the N-Gram size using extract_ngram_frequencies? +

You set the desired 'N' value in your prompt or function call. For example, setting N=2 counts bigrams (two words), and N=3 counts trigrams (three words).

When I use `extract_ngram_frequencies`, what is the maximum size of text it can process? +

The engine handles extremely large texts, limited primarily by available memory. You don't need to worry about typical token limits or length restrictions. Since it uses native V8 JavaScript, processing speed remains high even with massive inputs.

Can `extract_ngram_frequencies` handle text that has complex formatting or mixed characters? +

It requires raw, clean plain text input for the most accurate results. If your source material includes HTML tags or unusual symbols, it’s best practice to strip those out first. This ensures the engine focuses only on meaningful word sequences.
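
One way to do that cleanup before calling the tool is Python's standard-library HTML parser. The sketch below keeps only visible text and drops tags plus the contents of script and style elements; the helper names are hypothetical and the approach is one reasonable option, not a requirement of the engine.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_html(markup: str) -> str:
    parser = _TextOnly()
    parser.feed(markup)
    parser.close()
    # Join the text nodes and collapse all runs of whitespace to single spaces.
    return " ".join(" ".join(parser.parts).split())


print(strip_html("<p>Slow <b>loading</b> times</p><script>var x = 1;</script>"))
# Slow loading times
```

Joining text nodes with a space keeps words in neighboring elements apart, at the cost of splitting a word that has inline markup in the middle of it.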

What security measures govern the data used by `extract_ngram_frequencies`? +

Your text input is processed securely within the Vinkius infrastructure for computation. We do not retain your source documents or use them to train our models; you only receive the calculated frequency output.

If I run `extract_ngram_frequencies` with an empty string, what error response should I expect? +

It handles null or empty inputs gracefully. Instead of throwing an error, it returns a zero count for all N-Grams. This makes the tool reliable for conditional logic within your agent workflows.

What are Bigrams and Trigrams? +

A bigram is a sequence of two adjacent words (e.g., 'machine learning'). A trigram is three (e.g., 'natural language processing').

Does it lowercase the text automatically? +

Yes, all text is automatically lowercased and tokenized natively to ensure accurate aggregation of phrases.
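
A short Python sketch shows why lowercasing matters for aggregation: without it, "Machine learning" at the start of a sentence and "machine learning" mid-sentence are tallied as different bigrams. The function name and the whitespace tokenization are assumptions for the example.

```python
from collections import Counter


def bigram_counts(text: str, fold_case: bool) -> Counter:
    """Count adjacent word pairs, optionally lowercasing first."""
    words = (text.lower() if fold_case else text).split()
    return Counter(zip(words, words[1:]))


text = "Machine learning matters and machine learning scales"
print(bigram_counts(text, fold_case=False)[("machine", "learning")])  # 1
print(bigram_counts(text, fold_case=True)[("machine", "learning")])   # 2
```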

Is this faster than asking Claude? +

Significantly faster and 100% accurate. LLMs cannot count occurrences across thousands of tokens reliably.
