# Hugging Face MCP

> Hugging Face MCP connects your AI agent directly to the Hugging Face Hub, the world's largest hub for machine learning resources. Use it to find, inspect, and discuss models, datasets, and live demo apps (Spaces) in one conversation. You can search by task type, review model file structures without downloading anything, follow community discussions, or list available datasets, all from your preferred AI client.

## Overview
- **Category:** loved-by-devs
- **Price:** Free
- **Tags:** model-discovery, machine-learning, datasets, model-metadata, ai-research, pipeline-tasks

## Description

Connecting your agent to the Hugging Face Hub lets you run ML research as a conversation. Instead of switching tabs and searching across separate pages, you ask the agent to find models for a specific task or to locate datasets that match your criteria. You can inspect model metadata (tags, download counts, and file structures) before deciding what is useful for your project. Need to check the status of a live demo app? Just ask. If you're working with Vinkius, this MCP makes the whole ecosystem of ML resources accessible from a single point of entry, so your AI client handles the lookups. You can also browse existing community discussions or open new ones, which keeps questions and bug reports on the repository they concern.

## Tools

### list_dataset_files
Lists all filenames within a specific Hugging Face dataset repository, helping you map out its structure.

### create_discussion
Allows you to open a brand new conversation thread on any model, dataset, or space page.

### get_collection
Retrieves the details of a Hugging Face collection identified by its slug.

### get_model
Fetches core metadata about any specified model ID in the 'author/name' format.
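
As a minimal sketch of what the 'author/name' format implies, a client could check an ID before calling get_model. The regular expression and the function name `is_author_name_id` are assumptions made for illustration; they are not the Hub's official validation rule and not part of this MCP.

```python
import re

# Assumed pattern: two segments separated by one slash, each made of
# letters, digits, '.', '_' or '-'. Illustrative only, not the Hub's own rule.
_AUTHOR_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*/[A-Za-z0-9][A-Za-z0-9._-]*")


def is_author_name_id(model_id: str) -> bool:
    """Return True if model_id looks like 'author/name'."""
    return _AUTHOR_NAME.fullmatch(model_id) is not None


print(is_author_name_id("meta-llama/Llama-3.1-70B"))  # True
print(is_author_name_id("bert-base-uncased"))         # False
```

An ID without an author prefix, such as the bert-base-uncased used in the prompt examples, fails this check, so a client may prefer to pass such IDs through unchanged rather than reject them.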

### get_model_tags
Provides detailed tags and pipeline information for a model, showing its framework, license, and primary task type.

### get_space
Retrieves all operational details about a specific Hugging Face demo application (Space).

### list_collections
Lists curated collections of models, datasets, and Spaces available on the Hub.

### list_datasets
Provides a list of datasets, along with their author, download counts, and creation dates.

### list_model_discussions
Lists active discussion threads on a model page so you can review community feedback or bug reports.

### list_model_files
Shows the full file list, with sizes and paths, for a specified model repository.

### list_models
Searches and returns a list of models on the Hub based on search terms or authors.

### list_spaces
Lists available demo applications (Spaces), showing their SDK, title, and author.

### get_user
Checks your authenticated user account details to confirm the token is working correctly.

## Prompt Examples

**Prompt:** 
```
Find popular text generation models with over 1000 likes.
```

**Response:** 
```
Here are top text-generation models by likes: meta-llama/Llama-3.1-70B (45k likes), mistralai/Mixtral-8x7B (12k likes), microsoft/Phi-3-mini (8k likes) and google/gemma-2-9b (6k likes). Would you like the file structure for any of these?
```

**Prompt:** 
```
Show me what files are in the bert-base-uncased model.
```

**Response:** 
```
The bert-base-uncased repo contains: config.json (443 bytes), pytorch_model.bin (440 MB), tf_model.h5 (440 MB), flax_model.msgpack (440 MB), tokenizer.json (466 KB), tokenizer_config.json (48 bytes), vocab.txt (232 KB) and README.md. It has weights in PyTorch, TensorFlow and Flax formats.
```

**Prompt:** 
```
What discussions are happening on the Llama-3 model page?
```

**Response:** 
```
There are 23 active discussions on meta-llama/Llama-3-8B. Top threads include: 'Fine-tuning with PEFT/LoRA — memory requirements' (18 replies), 'Quantization to 4-bit — GGUF format' (14 replies) and 'Comparison with Mistral-7B on reasoning tasks' (9 replies).
```

## Capabilities

### Search and find models
Search the models on the Hub and narrow the results by name, task type, framework, or author.

### Inspect model files
View the exact filenames, sizes, and paths within a model repository without having to download any weights or artifacts.
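
Once a file listing is available, a short script can estimate how much would be downloaded. The sketch below assumes each entry is a dictionary with `path` and `size` (in bytes) keys; those key names, the suffix list, and the sample values are illustrative assumptions rather than the tool's documented output.

```python
# File suffixes treated as model weights in this sketch (an assumption).
WEIGHT_SUFFIXES = (".bin", ".safetensors", ".h5", ".msgpack")


def summarize_files(files: list[dict]) -> dict:
    """Total the sizes of all files and of the weight files only."""
    total = sum(f["size"] for f in files)
    weights = sum(f["size"] for f in files if f["path"].endswith(WEIGHT_SUFFIXES))
    return {"total_bytes": total, "weight_bytes": weights}


# Hypothetical listing; the sizes are illustrative, not real measurements.
sample = [
    {"path": "config.json", "size": 443},
    {"path": "pytorch_model.bin", "size": 440_000_000},
    {"path": "vocab.txt", "size": 232_000},
]

summary = summarize_files(sample)
print(summary)
```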

### Review datasets
List available datasets on the Hub with their author, download counts, and creation dates, and list the files in a dataset repository to inspect its layout.

### Check live demos (Spaces)
Retrieve information about ML demo applications, including whether they are currently running or down.

### Manage discussions
Read existing community threads for bug reports or feature requests, and create new discussion topics on a model, dataset, or Space page.

## Use Cases

### Vetting a new NLP model for production
A developer needs to know if a candidate model supports the right framework. They ask their agent to run get_model_tags on several candidates and immediately see which ones are PyTorch and which are TensorFlow before writing any integration code.
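
The grouping step in this scenario can be sketched in a few lines. The example assumes the tags for each candidate have already been collected into a dictionary; the tag strings `pytorch` and `tf`, the model IDs, and the function name `group_by_framework` are assumptions chosen for illustration.

```python
# Assumed mapping from framework tag to display label.
FRAMEWORK_TAGS = {"pytorch": "PyTorch", "tf": "TensorFlow"}


def group_by_framework(model_tags: dict[str, list[str]]) -> dict[str, list[str]]:
    """Group model IDs by the framework tags they carry."""
    groups = {label: [] for label in FRAMEWORK_TAGS.values()}
    for model_id, tags in model_tags.items():
        for tag, label in FRAMEWORK_TAGS.items():
            if tag in tags:
                groups[label].append(model_id)
    return groups


# Hypothetical candidates and tags, standing in for get_model_tags results.
candidates = {
    "example-org/model-a": ["pytorch", "text-classification"],
    "example-org/model-b": ["tf", "text-classification"],
    "example-org/model-c": ["pytorch", "tf", "fill-mask"],
}

groups = group_by_framework(candidates)
print(groups)
```

A model that ships weights for both frameworks appears in both groups, which matches how a developer would want to see it during vetting.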

### Comparing dataset structures
A researcher needs two datasets but isn't sure how the files are stored. They run list_dataset_files on both and compare the file paths (e.g., 'train.parquet' vs 'data/raw') in a single view.
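
If the two listings are available as plain lists of paths, the comparison reduces to set operations. This is a sketch under that assumption; the file names below are hypothetical and the function `compare_file_lists` is not part of the MCP.

```python
def compare_file_lists(first: list[str], second: list[str]) -> dict[str, list[str]]:
    """Report which paths are unique to each listing and which are shared."""
    a, b = set(first), set(second)
    return {
        "only_in_first": sorted(a - b),
        "only_in_second": sorted(b - a),
        "in_both": sorted(a & b),
    }


# Hypothetical listings standing in for two list_dataset_files results.
dataset_one = ["README.md", "train.parquet", "test.parquet"]
dataset_two = ["README.md", "data/raw/train.csv"]

diff = compare_file_lists(dataset_one, dataset_two)
print(diff)
```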

### Checking if a demo app is operational
A team lead wants to know if the internal ML dashboard is working before a meeting. They ask the agent to call get_space, and it reports the application's current runtime status.

### Documenting a research finding
After testing several models, an engineer uses list_model_discussions to gather context on common bugs or optimal usage tips that others have already shared in the community threads.

## Benefits

- Stop leaving your AI client to check model tags or browse discussions. The MCP lets you get the full pipeline info (get_model_tags) and read community feedback (list_model_discussions) without opening a browser.
- When vetting models, don't guess what files are inside. Use list_model_files to map out every single artifact—like config.json or model weights—before you download anything.
- Need to find a good starting point? You can use list_models and filter by task type, quickly narrowing down thousands of options to only those relevant for your current project.
- The MCP centralizes discovery. Whether you're listing datasets (list_datasets) or checking out live demo apps (get_space), everything is accessible from one conversation stream.
- You can open new threads with create_discussion, so bug reports and feature requests can be filed directly through your AI agent.

## How It Works

You get an AI assistant that acts like a dedicated ML researcher, pulling data from Hugging Face without leaving your current interface.

1. Subscribe to this MCP and provide your Hugging Face Access Token.
2. Tell your AI client what you're looking for, for instance, 'show me all image classification models using PyTorch.'
3. Your agent returns the relevant model metadata, file lists, or discussion threads directly in the chat.
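
Behind step 2, an MCP client turns the request into a tool call. MCP messages use JSON-RPC 2.0, and tools are invoked with the `tools/call` method, so the message could be sketched as below. The argument names `search` and `limit` are assumptions for illustration; the actual parameters are defined by this server's tool schema, which the source does not list.

```python
import json
from itertools import count

# Simple counter for JSON-RPC request IDs.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP 'tools/call' request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical arguments; the real parameter names may differ.
message = build_tool_call(
    "list_models",
    {"search": "image classification pytorch", "limit": 5},
)
print(message)
```

In practice the AI client builds and sends this message for you; the sketch only shows what travels between the client and the server.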

## Frequently Asked Questions

**How do I use Hugging Face MCP to find all available models?**
Use list_models, which lets you filter by search term or author. It returns the model ID, task tag, and download count for quick comparisons.

**Can I use Hugging Face MCP to check a dataset's file structure?**
Yes, run list_dataset_files on your desired dataset repository. This gives you a clear list of every filename, like 'train.parquet', helping you understand the data layout.

**What is the best way to check model tags using Hugging Face MCP?**
Use get_model_tags and provide the full author/name ID for the model. This tool gives detailed information on its framework, license, and primary task tag in one go.

**How do I find live demo apps with Hugging Face MCP?**
Run list_spaces to get a catalog of all available demo applications. You can then use get_space on a specific ID to confirm its current runtime status.

**Can I create discussions using the Hugging Face MCP?**
Yes, you can initiate conversations with create_discussion. Just provide the repo type (model, dataset or space), the ID, and your title to start a new thread.
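
A client-side check of those three inputs might look like the sketch below. The key names `repo_type`, `repo_id`, and `title`, as well as the function `discussion_args`, are assumptions for illustration; only the three allowed repo types come from the answer above.

```python
ALLOWED_REPO_TYPES = ("model", "dataset", "space")


def discussion_args(repo_type: str, repo_id: str, title: str) -> dict:
    """Validate the inputs and return the arguments for a new discussion."""
    if repo_type not in ALLOWED_REPO_TYPES:
        raise ValueError(f"repo_type must be one of {ALLOWED_REPO_TYPES}")
    if not title.strip():
        raise ValueError("title must not be empty")
    # Key names are hypothetical; the real tool schema may use others.
    return {"repo_type": repo_type, "repo_id": repo_id, "title": title.strip()}


args = discussion_args("model", "example-org/model-a", "  Tokenizer mismatch  ")
print(args)
```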