# Hugging Face MCP

> Hugging Face MCP Server. Connect your AI agent directly to the world's largest AI model hub. Search, inspect, and manage thousands of models, datasets, and demo apps (Spaces) without leaving your chat client. Use tools like `list_models` and `get_model_tags` to find specific artifacts, track model file structures, or check community discussions for bug reports.

## Overview
- **Category:** loved-by-devs
- **Price:** Free
- **Tags:** model-discovery, machine-learning, datasets, model-metadata, ai-research, pipeline-tasks

## Description

You're hooking up your AI agent straight to the massive Hugging Face model hub. That means you can search, check out, and manage thousands of models, datasets, and demo apps (Spaces) right from your chat window. You don't have to leave your client to do it.

**Finding Models**

*   You can use `list_models` to search the entire hub for models. You can filter by search terms or authors, and each result shows its pipeline task tags.
*   If you need to dig into a model, `get_model` pulls all the metadata for a specific model ID. You can also run `get_model_tags` to list every technical tag and pipeline detail attached to that model.
*   To see what files a model actually has—like weights, configs, or tokenizers—you run `list_model_files`. This shows you the full file structure and sizes without downloading any data.

**Digging into Datasets**

*   `list_datasets` lists datasets on the Hub, filtered by search terms or authors, and shows each dataset's download and like counts.
*   Once you pick a dataset, you can use `list_dataset_files` to get a list of all the filenames and paths inside that dataset's repository directory.
*   Datasets are often grouped into collections. `get_collection` pulls detailed info about a collection based on its slug.

**Checking Out Spaces and Community**

*   `list_spaces` shows you available ML demo apps (Spaces) on the Hub, giving you the author, the SDK they use, and when it was made. To get the specific details and runtime status of one Space, you use `get_space`. 
*   If you want to know what people are saying about a model, `list_model_discussions` shows you the active discussion threads for that model, including the titles and how many replies there are. To start a new conversation on a model, dataset, or Space, use `create_discussion`.

**User Info and Organization**

*   You can verify the logged-in Hugging Face user and pull their name, plan, and token metadata using `get_user`. 
*   `list_collections` lists available Hugging Face collections, and you can filter that list by author or title.

## Tools

### create_discussion
Starts a new conversation thread on a specified Hugging Face repository (model, dataset, or space).

### get_collection
Retrieves detailed information about a specific Hugging Face collection using its slug.

### get_model
Fetches all metadata for a specific Hugging Face model using its full ID.

### get_model_tags
Lists all technical tags and pipeline information for a given model ID.

### get_space
Retrieves details and runtime status for a specific Hugging Face demo Space.

### get_user
Verifies the authenticated Hugging Face user by returning their name, plan, and token metadata.

### list_collections
Lists available Hugging Face collections, allowing filtering by author or title.

### list_dataset_files
Returns a list of filenames and paths within a specified dataset repository directory.

### list_datasets
Lists datasets on the Hub, filtered by search term or author, providing download and like counts.

### list_model_discussions
Lists the active discussion threads for a given model, including thread titles and reply counts.

### list_model_files
Returns a list of files and their sizes within a specific model repository, ideal for inspecting weights.

### list_models
Lists models on the Hub, allowing filtering by search term or author, and showing pipeline task tags.

### list_spaces
Lists available demo Spaces on the Hub, providing details on the author, SDK, and creation date.

## Prompt Examples

**Prompt:** 
```
Find popular text generation models with over 1000 likes.
```

**Response:** 
```
Here are top text-generation models by likes: meta-llama/Llama-3.1-70B (45k likes), mistralai/Mixtral-8x7B (12k likes), microsoft/Phi-3-mini (8k likes) and google/gemma-2-9b (6k likes). Would you like the file structure for any of these?
```

**Prompt:** 
```
Show me what files are in the bert-base-uncased model.
```

**Response:** 
```
The bert-base-uncased repo contains: config.json (443 bytes), pytorch_model.bin (440 MB), tf_model.h5 (440 MB), flax_model.msgpack (440 MB), tokenizer.json (466 KB), tokenizer_config.json (48 bytes), vocab.txt (232 KB) and README.md. It has weights in PyTorch, TensorFlow and Flax formats.
```

**Prompt:** 
```
What discussions are happening on the Llama-3 model page?
```

**Response:** 
```
There are 23 active discussions on meta-llama/Llama-3-8B. Top threads include: 'Fine-tuning with PEFT/LoRA — memory requirements' (18 replies), 'Quantization to 4-bit — GGUF format' (14 replies) and 'Comparison with Mistral-7B on reasoning tasks' (9 replies).
```

## Capabilities

### Discover and Filter Models
Search the entire model hub using `list_models` to find artifacts by search term or author, with pipeline task tags shown for each result.

### Inspect Model Metadata and Tags
Use `get_model` and `get_model_tags` to retrieve detailed information, including the model's primary task, framework, and license.

### List Model Files and Structures
Run `list_model_files` to get a file tree (weights, configs, tokenizers) for a model without downloading any data.

### Explore Datasets and File Trees
List available datasets using `list_datasets` and then inspect their internal structure with `list_dataset_files`.

### Manage Community Discussion
Review conversations on a specific model with `list_model_discussions`, or start a new thread on a model, dataset, or Space with `create_discussion`.

### Monitor Demo Applications
View details and runtime status of ML demo apps (Spaces) using `get_space`.

## Use Cases

### The Model Vet: Checking Artifact Readiness
An ML Engineer needs to know if a specific model, `google/gemma-2-9b`, is ready for production. They ask the agent to run `list_model_files` first. The agent returns the file tree, confirming the presence of `config.json` and the correct weight format. Next, they run `get_model_tags` to verify the required framework (e.g., PyTorch). Done. The model is verified without any manual downloads or browsing.
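
The file check in this use case can be sketched in a few lines. This is only an illustration: it assumes the agent has already turned the `list_model_files` result into a plain list of paths, and the helper name `check_repo_files` is made up for the example. The weight suffixes are the ones mentioned elsewhere on this page.

```python
# Weight file extensions mentioned in the FAQ; extend as needed.
WEIGHT_SUFFIXES = (".safetensors", ".bin")

def check_repo_files(paths: list[str]) -> dict:
    """Report whether a repo listing has a config and which weight files it holds.

    `paths` is assumed to be a flat list of file paths taken from a
    `list_model_files` result (the real response shape may differ).
    """
    return {
        "has_config": "config.json" in paths,
        "weight_files": [p for p in paths if p.endswith(WEIGHT_SUFFIXES)],
    }

# Hypothetical listing, for illustration only.
listing = ["config.json", "model.safetensors", "tokenizer.json", "README.md"]
report = check_repo_files(listing)
print(report)
```

A check like this only tells you the expected files exist. It says nothing about whether the weights load or the model performs well.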

### The Data Explorer: Finding Related Data
A Researcher needs data for a new NLP project. They first use `list_datasets` to find potential candidates. When they narrow it down to 'MedicalReports', they use `list_dataset_files` to confirm the data is in parquet format, saving them hours of manual inspection.

### The Debugger: Assessing Model Stability
A Developer is considering a new model, but isn't sure if it's stable. They ask the agent to run `list_model_discussions` for the model. The agent shows 23 active discussions, allowing the developer to immediately see top threads about 'quantization' and 'memory requirements' before committing to the integration.

### The Project Manager: Scouting Demo Apps
A PM wants to show stakeholders the best ML demo apps. They use `list_spaces` to get a list of available demos, checking the SDK (Streamlit vs. Gradio) and the creation date. They then use `get_space` to check the live runtime status of the top candidate.

## Benefits

- Find models by criteria, not by memory. Instead of browsing the Hub by hand, use `list_models` to search thousands of models by term or author and read each result's pipeline task tags. You get a structured list in your chat.
- Stop guessing the model structure. When you run `list_model_files`, you see the exact contents of the repo—the `config.json`, the weights, the tokenizer—without having to download a single byte.
- Keep your research contained. Use `list_datasets` and `list_collections` to browse data sources and related models. You stay in your chat client and don't have to switch tabs to read dataset metadata.
- Understand community consensus. Review bug reports and usage tips by calling `list_model_discussions`. You get a count of active threads and the top topics, helping you decide if a model is stable.
- Quickly check live demos. Use `list_spaces` to find an ML demo app and `get_space` to see whether it is currently running. You find out that a demo is down before you build on it.
- Verify your credentials before you start. Run `get_user` to confirm your token is active and linked to the correct account details.

## How It Works

Your AI client handles the API calls and presents the Hub data right where you are working.

1. Subscribe to the Hugging Face server and provide your Hugging Face Access Token.
2. Select your AI client (e.g., Claude, Cursor, or your own agent) and initiate a query.
3. The agent calls the appropriate tool (e.g., `list_models` or `list_dataset_files`) and returns the structured data directly to your chat window.
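
Under the hood, step 3 is an MCP `tools/call` request, which is a JSON-RPC 2.0 message. The sketch below shows roughly what your client builds on your behalf. The helper `build_tool_call` is made up for this example, and the argument names (`search`, `author`) are assumptions; the server's tool schema is the source of truth.

```python
import json
from itertools import count

_request_ids = count(1)  # each JSON-RPC request carries its own id

def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build the body of an MCP `tools/call` request (JSON-RPC 2.0)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Argument names here are assumed, not taken from the server's schema.
request = build_tool_call("list_models", {"search": "bert", "author": "google-bert"})
print(json.dumps(request, indent=2))
```

You normally never write this yourself. It is useful mainly when you debug a custom agent and want to see what goes over the wire.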

## Frequently Asked Questions

**How do I use the `list_models` tool to find a specific type of model?**
You can filter the results by providing a search term or author name in the tool call. This narrows the thousands of available models down to the ones that match.

**Does `get_model_tags` give me the framework and task type?**
Yes. It returns a detailed breakdown, including the model's primary task tag (like 'text-generation') and the framework it uses (like 'pytorch' or 'tensorflow').

**What is the difference between `list_model_files` and `list_dataset_files`?**
`list_model_files` shows the artifacts inside a model repo (weights, configs). `list_dataset_files` shows the files inside a dataset repo (data splits, READMEs). Both map out the structure.

**Can I check discussions on a model using `list_model_discussions`?**
Yes. This tool lists active threads, giving you the title, author, and comment count. This helps you gauge community interest and spot common bugs.

**How do I check the status of a demo app using `get_space`?**
You provide the Space ID. The tool returns details and the current runtime status, letting you know if the demo app is live and working.

**How do I find datasets using the `list_datasets` tool?**
The `list_datasets` tool returns dataset IDs, authors, and descriptions. You can filter results by search term or author to narrow down your search.

**What information does `get_user` provide about my Hugging Face account?**
It returns metadata about your user account, including your plan, organization memberships, and access token type. This confirms your credentials are set up correctly.

**How can I view the file structure of a model using `list_model_files`?**
The tool lists filenames, file sizes, and paths for a given model. You can also optionally specify a subdirectory to inspect specific folders within the model repo.

**How do I get a Hugging Face Access Token?**
Log in to [**Hugging Face**](https://huggingface.co), go to **Settings > Access Tokens**, click **New token**, give it a name, and select scopes (read is sufficient for browsing, write if you need to create repos). Copy the token immediately. It starts with `hf_`.
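
If you keep the token in an environment variable, a quick shape check can catch copy-paste mistakes before you hand it to the server. This is a minimal sketch: the variable name `HF_TOKEN` and the helper `looks_like_hf_token` are assumptions for the example, and the check only looks at the `hf_` prefix. It does not prove the token is valid; `get_user` does that.

```python
import os
from typing import Optional

def looks_like_hf_token(token: Optional[str]) -> bool:
    """Cheap sanity check: the value is non-empty and starts with `hf_`."""
    return bool(token) and token.startswith("hf_")

# Assumed variable name; use whatever your setup stores the token under.
token = os.environ.get("HF_TOKEN")
if not looks_like_hf_token(token):
    print("Token is missing or does not start with 'hf_'")
```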

**Can I search models by task type (e.g. text-generation)?**
Yes! Use `list_models` with a search query. While the search endpoint doesn't directly filter by pipeline_tag, you can search by task name (e.g. search='text-generation') and then use `get_model` or `get_model_tags` to verify the pipeline_tag of specific models.
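
The verify step can be sketched as a client-side filter. The example assumes each result is a dict with an `id` and an optional `pipeline_tag` field; those field names, the helper `filter_by_pipeline_tag`, and the sample entries are illustrative, not the server's documented output.

```python
from typing import Iterable

def filter_by_pipeline_tag(models: Iterable[dict], task: str) -> list[dict]:
    """Keep only the models whose pipeline_tag equals the requested task."""
    return [m for m in models if m.get("pipeline_tag") == task]

# Hypothetical search results; a text search can return models of other tasks.
sample_results = [
    {"id": "example-org/model-a", "pipeline_tag": "text-generation"},
    {"id": "example-org/model-b", "pipeline_tag": "fill-mask"},
    {"id": "example-org/model-c"},  # no pipeline_tag reported
]

matches = filter_by_pipeline_tag(sample_results, "text-generation")
print([m["id"] for m in matches])
```

Models without a tag are dropped here. If you would rather inspect them by hand, collect them in a separate list instead.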

**Can I see what files are in a model repository?**
Yes! Use `list_model_files` with the model ID (e.g. 'google-bert/bert-base-uncased') to see the complete file tree including model weights (.safetensors, .bin), config files, tokenizer files and README. Optionally set a path to browse a specific subdirectory like 'onnx' or 'pytorch'.

**Can I create discussions on model pages?**
Yes! Use `create_discussion` with the repo type ('model', 'dataset' or 'space'), the repo ID and a title. This creates a new discussion thread on the repository. You can use `list_model_discussions` first to check existing threads before creating a new one.
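
Checking existing threads first can be as simple as comparing titles. The sketch below assumes you already have the thread titles from `list_model_discussions` as a list of strings. The helper names and sample titles are made up, and the match is a plain case-insensitive comparison, so it will not catch rewordings.

```python
def normalize(title: str) -> str:
    """Lowercase the title and collapse runs of whitespace."""
    return " ".join(title.casefold().split())

def has_duplicate(existing_titles: list[str], new_title: str) -> bool:
    """Return True if a thread with the same normalized title already exists."""
    target = normalize(new_title)
    return any(normalize(t) == target for t in existing_titles)

# Hypothetical titles, for illustration only.
existing = ["Quantization to 4-bit", "Fine-tuning with LoRA"]
print(has_duplicate(existing, "  quantization TO 4-bit "))
print(has_duplicate(existing, "Tokenizer bug"))
```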