7 common mistakes developers make when building MCP servers with FastMCP - from missing tool annotations and poor error handling to token-wasteful responses and security gaps - with concrete fixes for each.
The Model Context Protocol has seen explosive adoption - over 10,000 server repositories created in its first year, with 518+ servers in the official registry. Many of them are built with FastMCP, a Python framework that makes it straightforward to expose tools to LLM agents. But "straightforward" doesn't mean "hard to get wrong." After reviewing dozens of MCP server implementations - including Google's own Google Ads MCP server - the same mistakes keep showing up. Here are seven of the most common ones and how to fix them.
1. Not Marking Mutating Operations
Every MCP tool call is a black box to the client until you tell it otherwise. Without annotations, clients like Claude Code and ChatGPT have two bad options: prompt the user for confirmation on every single call (friction), or auto-approve everything (dangerous). The 2025-03-26 MCP spec revision introduced ToolAnnotations (PR #185) to solve this:
```python
from mcp.server.fastmcp import FastMCP
from mcp.types import ToolAnnotations

mcp = FastMCP("google-ads")

READ_ONLY = ToolAnnotations(readOnlyHint=True)
MUTATING = ToolAnnotations(readOnlyHint=False, destructiveHint=True)

@mcp.tool(annotations=READ_ONLY)
def get_campaigns(...): ...

@mcp.tool(annotations=MUTATING)
def create_campaign(...): ...
```
These annotations are hints, not guarantees - the spec is explicit that clients should not trust them from untrusted servers. But for trusted servers, they let clients skip confirmation prompts for read-only tools and flag destructive ones for explicit approval. The spec also provides idempotentHint and openWorldHint for finer-grained signaling. Mark every tool. It takes one line and prevents accidental data loss.
2. Exposing Raw API Primitives Instead of Outcome-Oriented Tools
A common pattern: wrap the API's query language in a search() tool and call it a day. Now the LLM has to compose GAQL queries, GraphQL mutations, or construct protobuf dicts from scratch. It will get them wrong. Repeatedly.
The Google Ads MCP server is a cautionary example. It exposes exactly three tools: a raw search tool that accepts GAQL queries, a list_accessible_customers tool, and a get_resource_metadata tool. That's it. To create a campaign, the LLM has to compose GAQL mutations from scratch. To pull campaign performance, it needs to know which GAQL fields to select, how to format date ranges, and how to structure conditions. This is the API-primitive trap in practice - the server mirrors the Google Ads API surface instead of wrapping it in outcome-oriented tools like get_campaigns or create_campaign that handle the query construction internally.
Design tools around what the agent wants to achieve, not around raw API endpoints. A well-designed Google Ads MCP server would offer get_campaigns that pre-selects the right GAQL fields and handles date filtering, or create_campaign that builds the full mutation chain (budget, campaign, ad group) with sensible defaults. The raw search tool can still exist for power users who need full flexibility, but convenience tools should handle the common use cases without requiring the LLM to learn a query language.
As Docker's MCP best practices guide puts it, the end user of the tool is the agent or LLM, not the human - design accordingly.
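One way to see the difference: the query-language knowledge moves out of the LLM's head and into the server. The sketch below shows a hypothetical `build_campaign_query` helper that an outcome-oriented `get_campaigns` tool might call internally - the field list and date handling are assumptions for illustration, not the Google Ads MCP server's actual code.

```python
def build_campaign_query(start_date: str, end_date: str) -> str:
    """Compose the GAQL internally so the LLM never writes a query.

    The agent only supplies two dates; the server owns the field
    selection, the FROM clause, and the date-range syntax.
    """
    fields = [
        "campaign.id",
        "campaign.name",
        "campaign.status",
        "metrics.clicks",
        "metrics.impressions",
        "metrics.cost_micros",
    ]
    return (
        f"SELECT {', '.join(fields)} FROM campaign "
        f"WHERE segments.date BETWEEN '{start_date}' AND '{end_date}'"
    )
```

The tool's schema then exposes two well-documented date parameters instead of a free-form query string - a much smaller surface for the LLM to get wrong.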
3. Missing Safe Defaults
An MCP tool that creates a Google Ads campaign with ENABLED status by default will burn real money the moment an LLM calls it. Any campaign-creation tool should default to PAUSED status - the agent or user must explicitly enable it. This is a one-line fix that prevents potentially expensive accidents:
```python
@mcp.tool(annotations=MUTATING)
def create_campaign(name: str, budget_amount: float, ...):
    """Create a new campaign. Defaults to PAUSED status."""
    campaign.status = client.enums.CampaignStatusEnum.PAUSED
    ...
```
The same principle applies broadly. Creating draft resources instead of live ones, requiring explicit confirmation for irreversible deletes, defaulting to conservative rate limits - any operation where the cost of an accidental execution is high should default to the safe path.
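For irreversible operations, the same idea can be expressed as an explicit confirmation gate. This is a minimal sketch - the `remove_campaign` tool and its `confirm` parameter are hypothetical, but the pattern (refuse by default, tell the agent how to proceed) applies to any destructive tool:

```python
def remove_campaign(campaign_id: str, confirm: bool = False) -> str:
    """Irreversibly remove a campaign. Requires confirm=True."""
    if not confirm:
        # Safe default: refuse, and tell the agent exactly what is
        # needed so it can ask the user rather than retry blindly.
        return (
            "Refused: removal is irreversible. "
            "Call again with confirm=True after the user approves."
        )
    # ... issue the actual removal mutation here ...
    return f"Campaign {campaign_id} removed."
```

The refusal message doubles as documentation: the LLM learns the escalation path from the response itself.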
4. Poor Tool Documentation
LLMs read your tool schemas. Every parameter name, description, and type constraint is part of the prompt the agent works with. A parameter called enable_slow_ramp with no description is a coin flip for the LLM. FastMCP's documentation recommends using Annotated with Pydantic Field to produce self-documenting schemas:
```python
from typing import Annotated
from pydantic import Field

CustomerId = Annotated[str, Field(
    description="Google Ads customer ID, numeric string without dashes",
    pattern=r"^\d{10}$",
    examples=["1234567890"],
)]

DateRange = Annotated[str, Field(
    description="Date in YYYY-MM-DD format",
    pattern=r"^\d{4}-\d{2}-\d{2}$",
)]
```
Beyond parameters, tool docstrings should document what the response looks like - not just what the tool does. Include a ## Return Format section with example output. Without it, the LLM calls the tool speculatively and may misinterpret the response structure.
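A docstring following that advice might look like the sketch below (the tool and its output columns are illustrative, not from a real server):

```python
def get_campaigns(start_date: str, end_date: str) -> str:
    """Fetch campaign performance for the given date range.

    ## Return Format
    CSV with a header row, one campaign per line:

        campaign.id,campaign.name,metrics.clicks
        123,Brand,500

    Costs are reported in micros (divide by 1,000,000).
    """
    ...
```

With the example output in the docstring, the LLM knows the response shape before the first call instead of discovering it by trial and error.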
For static reference material (API field lists, enum values, query language syntax), use MCP Resources instead of cramming everything into tool descriptions. Resources are application-driven - the host fetches them only when needed, keeping the context window clean.
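As a sketch of what such a resource might contain - the URI scheme and field list here are invented for illustration - the payload is just a function returning static reference text, which FastMCP would expose via its resource decorator:

```python
# In a real server this would be registered with something like
# @mcp.resource("reference://gaql/campaign-fields") so the host
# fetches it on demand instead of bloating every tool description.
def gaql_campaign_fields() -> str:
    """Static reference: commonly selected GAQL campaign fields."""
    return "\n".join([
        "campaign.id          -- numeric campaign ID",
        "campaign.name        -- display name",
        "campaign.status      -- ENABLED | PAUSED | REMOVED",
        "metrics.clicks       -- clicks in the selected date range",
        "metrics.cost_micros  -- cost in micros (divide by 1e6)",
    ])
```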
5. Swallowing Error Messages
Many APIs return generic errors like "Bad Request" or "Internal Server Error" with no details. When an MCP server passes these through unchanged, the LLM has nothing to work with. It enters a self-correction loop - retrying with slight variations, burning tokens, and often exhausting the API request budget without ever fixing the actual problem.
The fix is two-fold. First, when you detect a generic error message (the usual suspects: "bad request", "internal server error", "not found"), append the full response body. The raw response typically contains field-level validation errors or diagnostic info the LLM can act on. Second, don't aggressively truncate error responses. A 50-character error limit made sense for human UIs; LLMs need the full context to self-correct.
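The first half of the fix can be a small helper at the response boundary. This is a minimal sketch - `enrich_error` and its generic-error list are assumptions, not a FastMCP API:

```python
# Error strings that carry no actionable detail on their own.
GENERIC_ERRORS = {"bad request", "internal server error", "not found"}

def enrich_error(message: str, response_body: str) -> str:
    """Append the full response body when the error is a generic one.

    The raw body usually contains field-level validation errors the
    LLM can act on; specific errors pass through untruncated.
    """
    if message.strip().lower() in GENERIC_ERRORS and response_body:
        return f"{message}\n\nFull API response:\n{response_body}"
    return message
```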
6. Wasteful Token Usage in Responses
Every tool response eats into the context window. JSON responses on tabular data are particularly wasteful - key names repeat for every single row:
```json
[{"campaign.name": "Brand", "campaign.id": 123, "metrics.clicks": 500}, ...]
```
The same data as CSV:
```csv
campaign.name,campaign.id,metrics.clicks
Brand,123,500
```
CSV eliminates repeated keys, braces, brackets, and quotes - typically saving 40-60% of tokens on tabular data. Consider adding a response_type parameter to read-only tools, defaulting to "csv" with JSON available when structured data is needed for programmatic processing. Implementation is straightforward: a format_rows_as_csv() utility that converts list[dict] to CSV strings, with nested values (lists, dicts) JSON-serialized into cells.
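A minimal sketch of that utility using only the standard library, with nested values JSON-serialized into cells as described above:

```python
import csv
import io
import json

def format_rows_as_csv(rows: list[dict]) -> str:
    """Convert a list of flat dicts to a CSV string.

    Column order follows the first row's keys; nested values
    (lists, dicts) are JSON-serialized into their cells.
    """
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    for row in rows:
        writer.writerow({
            key: json.dumps(value) if isinstance(value, (list, dict)) else value
            for key, value in row.items()
        })
    return buf.getvalue()
```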
7. Ignoring Security Fundamentals
An Endor Labs study of 2,614 MCP implementations found that 82% use file system operations vulnerable to path traversal, 67% use APIs susceptible to code injection, and 34% are prone to command injection. These aren't theoretical risks:
| Incident | Impact |
|---|---|
| Asana MCP data exposure (June 2025) | Customer data leaked across tenants via flawed isolation; MCP disabled for two weeks |
| CVE-2025-6514 (mcp-remote) | Critical (CVSS 9.6) OAuth command injection enabling arbitrary OS command execution |
| mcp-server-git CVEs (Jan 2026) | Path traversal and argument injection in Anthropic's own reference server |
The attack surface is broad: prompt injection embeds malicious instructions in content the LLM processes, tool poisoning manipulates tool metadata, and consent fatigue exploits users who auto-approve after seeing dozens of benign requests. Validate all inputs at the boundary. Never store credentials in plaintext or expose them in URLs. Treat every string from an LLM as untrusted user input - because that's exactly what it is.
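For the path-traversal class specifically - the 82% figure above - boundary validation can be a few lines. This sketch assumes a hypothetical allowed root and helper name; the key move is resolving the candidate path before comparing it to the sandbox root:

```python
from pathlib import Path

# Hypothetical sandbox root for file-serving tools.
ALLOWED_ROOT = Path("/srv/mcp-data").resolve()

def safe_resolve(user_path: str) -> Path:
    """Reject any LLM-supplied path that escapes the allowed root."""
    candidate = (ALLOWED_ROOT / user_path).resolve()
    if not candidate.is_relative_to(ALLOWED_ROOT):
        # Resolving first defeats "../" sequences and symlink tricks.
        raise ValueError(f"Path escapes allowed root: {user_path}")
    return candidate
```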
Key Takeaways
- Annotate every tool with `ToolAnnotations` from the MCP 2025-03-26 spec. One line per tool prevents accidental mutations.
- Design outcome-oriented tools that wrap raw API primitives. LLMs work better with `get_campaigns` than with a generic query builder.
- Default to safe states - PAUSED campaigns, draft resources, conservative limits.
- Document parameters exhaustively using `Annotated` + `Field`. Include return format examples in docstrings.
- Surface full error messages to the LLM. Generic errors cause expensive retry loops.
- Return CSV by default for tabular data. It cuts token usage by 40-60% with no information loss.
- Validate all inputs, store no secrets in plaintext, and treat LLM-generated strings as untrusted. The first year of MCP produced 30+ CVEs - don't add to the count.