
Set Up Automatic Robots.txt for Your AWS Website

Overview

This guide shows you how to use a Lambda@Edge function that calls the Known Agents REST API to fetch Automatic Robots.txt rules and append them to your existing robots.txt file. Contact us if you need help.

Step 1: Create a Lambda@Edge Function

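// Access token sent as a Bearer token to the Known Agents API
const KNOWN_AGENTS_ACCESS_TOKEN = "YOUR_ACCESS_TOKEN" // TODO: Swap in your access token
// Path sent to the API as the "disallow" value
const ROBOTS_TXT_DISALLOW_PATH = "/"
// Abort the API request after this long and serve no managed rules instead
const FETCH_TIMEOUT_IN_MILLISECONDS = 3000
// Agent types sent to the API as "agent_types"
const ROBOTS_TXT_AGENT_TYPES = [
    // TODO: Add blocked agent types
]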

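export const handler = async (event) => {
    // CloudFront response from the triggering event
    const thisResponse = event.Records[0].cf.response
    // Managed rules from the API; a network error or timeout falls back to an empty response
    const thatResponse = await fetchRobotsTXT().catch(() => {
        return new Response("")
    })
    // Existing robots.txt content, or an empty string if the event carries no body
    const thisRobotsTXT = thisResponse.body || ""
    // A non-2xx API response also falls back to an empty string
    const thatRobotsTXT = thatResponse.ok ? await thatResponse.text() : ""

    // Append the managed rules to the existing content, wrapped in marker comments
    const robotsTXT = [
        thisRobotsTXT.trim(),
        "# BEGIN Known Agents Managed Content",
        thatRobotsTXT.trim(),
        "# END Known Agents Managed Content"
    ].join("\n\n")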

    return {
        ...thisResponse,
        status: "200",
        statusDescription: "OK",
        headers: {
            ...thisResponse.headers,
            "content-type": [{ value: "text/plain" }]
        },
        body: robotsTXT
    }
}

// Requests the generated robots.txt rules for the configured agent types and disallow path
async function fetchRobotsTXT() {
    return fetch("https://api.knownagents.com/robots-txts", {
        method: "POST",
        signal: AbortSignal.timeout(FETCH_TIMEOUT_IN_MILLISECONDS),
        headers: {
            "Authorization": `Bearer ${KNOWN_AGENTS_ACCESS_TOKEN}`,
            "Content-Type": "application/json"
        },
        body: JSON.stringify({
            agent_types: ROBOTS_TXT_AGENT_TYPES,
            disallow: ROBOTS_TXT_DISALLOW_PATH
        })
    })
}
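
To see what the handler's merge step produces before deploying anything, the join logic can be pulled into a small standalone function and run locally. The sketch below is only an illustration: mergeRobotsTXT is a hypothetical helper name that does not appear in the function above, and ExampleBot is a placeholder agent, not a real rule returned by the API.

```javascript
// Hypothetical helper that mirrors the array join in the handler above.
// Both inputs are trimmed, then joined with blank lines and wrapped in markers.
function mergeRobotsTXT(existingRobotsTXT, managedRobotsTXT) {
    return [
        (existingRobotsTXT || "").trim(),
        "# BEGIN Known Agents Managed Content",
        (managedRobotsTXT || "").trim(),
        "# END Known Agents Managed Content"
    ].join("\n\n")
}

// Placeholder inputs: an existing robots.txt and a made-up managed rule set
const merged = mergeRobotsTXT(
    "User-agent: *\nAllow: /\n",
    "User-agent: ExampleBot\nDisallow: /"
)

console.log(merged)
```

Because the handler falls back to an empty string when the API request fails or times out, the two marker comments are still emitted in that case, just with nothing between them. Calling mergeRobotsTXT("", "") shows that output.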

Step 2: Block Agents by Category

Step 3: Create a Distribution Behavior

Step 4: Test Your Integration

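If your website is correctly connected, you should see the new rules in your website's robots.txt, between the "# BEGIN Known Agents Managed Content" and "# END Known Agents Managed Content" comments.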
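
The check can also be scripted. The following is a minimal sketch, assuming Node.js 18 or later (for the built-in fetch); extractManagedBlock, describeRobotsTXT, and checkSite are hypothetical names introduced here, and the URL in the usage note is a placeholder for your own domain.

```javascript
const BEGIN_MARKER = "# BEGIN Known Agents Managed Content"
const END_MARKER = "# END Known Agents Managed Content"

// Returns the trimmed text between the two markers, or null if either is missing
function extractManagedBlock(robotsTXT) {
    const begin = robotsTXT.indexOf(BEGIN_MARKER)
    if (begin === -1) return null
    const contentStart = begin + BEGIN_MARKER.length
    const end = robotsTXT.indexOf(END_MARKER, contentStart)
    if (end === -1) return null
    return robotsTXT.slice(contentStart, end).trim()
}

// Summarizes what was found in a robots.txt body
function describeRobotsTXT(robotsTXT) {
    const block = extractManagedBlock(robotsTXT)
    if (block === null) return "markers missing"
    if (block === "") return "markers present, but no rules between them"
    return "rules present"
}

// Fetches a robots.txt URL and reports the state of the managed block
async function checkSite(url) {
    const response = await fetch(url, { signal: AbortSignal.timeout(5000) })
    if (!response.ok) throw new Error(`Unexpected status ${response.status}`)
    return describeRobotsTXT(await response.text())
}
```

For example, checkSite("https://example.com/robots.txt").then(console.log) prints one of the three status strings. Seeing "markers present, but no rules between them" suggests the function is running but the API request failed, timed out, or returned no rules.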