Mastodon Mention Tracking via the Streaming API

For solo founders and small teams, keeping an ear to the ground for what people are saying about your brand is critical. Every mention, positive or negative, is a data point. While traditional social media platforms offer various APIs, the federated nature of Mastodon presents a unique challenge and opportunity. You don't just want to know what's happening on one server; you need a broader view. This article will walk you through leveraging Mastodon's streaming API to track mentions, offering a raw, real-time feed of public activity.

Why Mastodon? Why the Streaming API?

Mastodon, as part of the Fediverse, operates on a decentralized model. Unlike centralized platforms, there's no single "firehose" of all public activity. Instead, thousands of independent instances federate, sharing content among themselves. This distributed architecture means that mentions of your brand could be happening on any number of instances, making a comprehensive overview challenging with traditional search methods.

For a solo founder, the community-driven nature of Mastodon often means more direct, honest feedback than you might find elsewhere. People discuss products, services, and ideas openly. Missing these conversations means missing out on valuable insights, potential customers, and early warning signs of issues.

The Mastodon API offers both a REST API for polling and a Streaming API for real-time events. While the REST API is excellent for historical data retrieval or specific queries, it's inefficient for real-time monitoring due to rate limits and the overhead of constant polling. The Streaming API, on the other hand, provides a persistent WebSocket connection, pushing events to you as they happen. This is exactly what you need for live mention tracking – cheap, real-time insights without hammering API limits.

Getting Started: Authentication and Authorization

Before you can connect to the Streaming API, you need an application registered with a Mastodon instance and an access token. The token identifies your application and grants it permission to act on behalf of a user; in our case, it only needs to read public data.

Here's how to get one:

  1. Choose an Instance: You'll need an account on some Mastodon instance. Many developers use their primary account instance or create a dedicated bot account for this purpose. The API token you generate will be tied to this instance.
  2. Register an Application:
    • Log into your Mastodon account.
    • Go to Preferences (usually via the gear icon).
    • Navigate to Development (or "Applications").
    • Click "New Application".
    • Provide an application name (e.g., "Mentionly Tracker"), a website (can be a placeholder if you don't have one yet), and redirect URIs (for our purposes, urn:ietf:wg:oauth:2.0:oob or http://localhost is fine, as we'll use a direct token).
    • Crucially, select the necessary Scopes. For public mention tracking, read:statuses and read:accounts are generally sufficient. Avoid requesting unnecessary scopes for security reasons.
  3. Generate Your Access Token: After registering, you'll be presented with your Client Key, Client Secret, and Access Token. The Access Token is what you'll use to authenticate your streaming connection. Treat this token like a password; do not hardcode it directly into public repositories or share it. Use environment variables or a secrets management system.

For example, if you register an app on mastodon.social, your token is valid only on mastodon.social. To stream directly from other instances, repeat this process on each one.
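Before opening a streaming connection, it can help to confirm that the token works at all. The sketch below is one way to do that with only the standard library, assuming the token was generated for a user account with the read:accounts scope; it calls the REST endpoint /api/v1/accounts/verify_credentials. The helper names build_verify_request and verify_token are illustrative, not part of any Mastodon client library.

```python
import json
import os
import urllib.request


def build_verify_request(instance: str, token: str) -> urllib.request.Request:
    """Build a GET request that asks the instance who the token belongs to."""
    url = f"https://{instance}/api/v1/accounts/verify_credentials"
    # The token travels in the Authorization header, not in the URL.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def verify_token(instance: str, token: str, timeout: float = 10.0) -> dict:
    """Return the account JSON for the token; raises urllib.error.HTTPError on 401/403."""
    request = build_verify_request(instance, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Only touch the network when a token is actually configured.
if __name__ == "__main__" and os.getenv("MASTODON_ACCESS_TOKEN"):
    account = verify_token(
        os.getenv("MASTODON_INSTANCE", "mastodon.social"),
        os.environ["MASTODON_ACCESS_TOKEN"],
    )
    print(f"Token is valid for @{account.get('acct')}")
```

If the request fails with an HTTP 401, the token is wrong or was revoked. A 403 usually points to a missing scope.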

Connecting to the Streaming API

The Mastodon Streaming API uses WebSockets. You'll establish a persistent connection to an instance, and it will push new events to you. There are several streams available:

  • public: The federated public timeline as seen by the instance. This includes public statuses from users on that instance and public statuses from other instances that have federated to it. This is your primary source for broad mention tracking on a specific instance.
  • public:local: Only public statuses from users directly on that instance.
  • hashtag: Public statuses containing a specific hashtag. The hashtag is passed as an additional tag query parameter.
  • user: Events related to the authenticated user (notifications, home timeline, etc.). Not typically used for general brand monitoring.

For brand mention tracking, you'll primarily be interested in the public stream or potentially hashtag streams if your brand uses specific hashtags.

The WebSocket endpoint typically looks like this: wss://[your.instance]/api/v1/streaming?stream=[stream_name]&access_token=[YOUR_ACCESS_TOKEN]

Let's use Python with the websockets library as a concrete example.

import asyncio
import websockets
import json
import os

# Replace with your Mastodon instance and access token
# It's best practice to load these from environment variables
MASTODON_INSTANCE = os.getenv("MASTODON_INSTANCE", "mastodon.social")
ACCESS_TOKEN = os.getenv("MASTODON_ACCESS_TOKEN", "YOUR_SECRET_TOKEN_HERE") # !!! Use environment variable

async def connect_and_stream():
    uri = f"wss://{MASTODON_INSTANCE}/api/v1/streaming?stream=public&access_token={ACCESS_TOKEN}"
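    # Log the target without the query string so the access token never reaches logs
    print(f"Connecting to wss://{MASTODON_INSTANCE}/api/v1/streaming (stream=public)")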

    async with websockets.connect(uri) as websocket:
        print("Connected to Mastodon streaming API.")
        try:
            async for message in websocket:
                data = json.loads(message)
                event_type = data.get("event")

                if event_type == "update":
                    # An 'update' event signifies a new status
                    status = json.loads(data.get("payload"))

                    # Basic information about the status
                    status_id = status.get("id")
                    account_display_name = status.get("account", {}).get("display_name")
                    account_username = status.get("account", {}).get("acct") # e.g., "user@instance.social"
                    status_url = status.get("url")

                    # The actual content, HTML-formatted
                    content = status.get("content")

                    print(f"--- New Status ({status_id}) ---")
                    print(f"From: {account_display_name} (@{account_username})")
                    print(f"URL: {status_url}")
                    print(f"Content (HTML): {content}")
                    print("------------------------")
                elif event_type == "delete":
                    # A 'delete' event means a status was removed
                    deleted_id = data.get("payload")
                    print(f"Status {deleted_id} was deleted.")
                # You might also see 'filters_changed', 'announcement', etc.
                else:
                    print(f"Received unknown event type: {event_type} with payload: {data.get('payload')}")

        except websockets.exceptions.ConnectionClosedOK:
            print("Connection closed gracefully.")
        except Exception as e:
            print(f"An error occurred: {e}")

if __name__ == "__main__":
    # For demonstration, ensure you set your environment variables
    # export MASTODON_INSTANCE="mastodon.social"
    # export MASTODON_ACCESS_TOKEN="your_actual_token_here"
    asyncio.run(connect_and_stream())

This script connects to the public stream of mastodon.social (or your specified instance) and prints out basic details for every new status that appears on that instance's public timeline.

Filtering and Processing Mentions

The public stream can be incredibly noisy, especially on larger instances. You'll receive every public status posted or federated to that instance. To find brand mentions, you need to implement robust filtering.

Here's how you might extend the previous example to filter for specific keywords, like your brand name "Mentionly" or a product name:

```python
import asyncio
import websockets
import json
import os
import re  # For regular expressions to clean HTML

MASTODON_INSTANCE = os.getenv("MASTODON_INSTANCE", "mastodon.social")
ACCESS_TOKEN = os.getenv("MASTODON_ACCESS_TOKEN", "YOUR_SECRET_TOKEN_HERE")

# Define your brand keywords (case-insensitive)
BRAND_KEYWORDS = ["mentionly", "myproduct", "yourbrandname"]

# Regex to strip HTML tags from content for easier searching

HTML