<title>` (50–60 chars) and a unique\n `<meta name=\"description\">` (120–160 chars). Duplicate titles and descriptions\n cause Google to de-rank or rewrite them. Check both before marking a task done.\n- Every page MUST render a `<link rel=\"canonical\" href=\"...\">` pointing to its own\n canonical URL. This is required even on pages that are not duplicated — it is a\n signal, not just a deduplication tool.\n- Never generate two pages with the same slug or URL path. Before adding a new content\n file or route, confirm the path does not already exist in `src/pages/` or the\n content collection.\n- Every blog post or article MUST include JSON-LD structured data: at minimum\n `Article` with `headline`, `datePublished`, `dateModified`, and `author`.\n Product pages need `Product` schema. FAQ pages need `FAQPage` schema.\n- Open Graph tags (`og:title`, `og:description`, `og:image`, `og:url`) must be\n present on every page. The `og:image` must be an absolute URL (not a relative path).\n- Do NOT use `noindex` on pages that should rank. Do NOT remove `noindex` from pages\n in `src/pages/api/`, `src/pages/admin/`, or any route that should not be crawled.\n\n## URL and routing conventions\n- URLs are lowercase and hyphen-separated. Use no trailing slash, or always use one if\n that is the framework default; pick one form and enforce it with a redirect.\n- Never rename a published URL without adding a 301 redirect from the old path.\n Inbound links to a URL that now 404s stop passing ranking signals.\n- Paginated series use `/page/2/` style paths, not query strings (`?page=2`).\n Google can index query-string pages, but path-style URLs are easier to keep\n consistent and canonical.\n\n## Content and performance\n- All images must have descriptive `alt` text that includes the target keyword where\n natural. Empty `alt=\"\"` is only correct for decorative images.\n- Images must be served in WebP or AVIF format. 
Do not ship a bare JPEG or PNG: wrap it in a `<picture>`\n element that offers a WebP or AVIF source and keeps the JPEG or PNG as the fallback.\n- Every page must load without render-blocking scripts. No `<script>` without `defer`\n or `async` in the `<head>` unless it is a critical inline script.\n- Internal links must use the full path and must not 404. Before adding a link, verify\n the target page exists.\n\n## Definition of done\n- `astro check` or `tsc --noEmit` passes.\n- `astro build` completes without warnings.\n- Spot-check: `curl -s <page-url> | grep -o 'rel=\"canonical\"' | wc -l` prints 1.\n- No duplicate `<title>` values across built HTML\n (`grep -rho '<title>[^<]*' dist/ | sort | uniq -d` prints nothing).\n- JSON-LD is present and valid (use the Schema.org validator).\n```\n\n## Why these rules\n\n- **Unique title and description per page** is among the highest-impact SEO rules for content sites. Agents that generate content pages in bulk often reuse the same metadata template, producing dozens of pages that are technically distinct but present identical titles and snippets to crawlers, so search engines may rewrite the titles or struggle to pick the right page for a query.\n- **Canonical on every page, not just duplicates** is frequently misunderstood. Agents that read SEO documentation often add canonicals only where content is clearly duplicated (e.g. pagination). In practice, every page should self-reference its canonical so that URL variants created by tracking or session parameters do not split link equity.\n\n## Good fit\n\n- Blogs, documentation sites, SEO content hubs, and marketing sites where organic search is the primary acquisition channel.\n\n## Not a fit\n\n- Internal tools, dashboards, or applications where SEO is irrelevant; the canonical and structured-data requirements add overhead with no benefit." }

{ "id": "ai-rules-for-seo-content-sites", "type": "rules", "category": "rules", "locale": "en", "url": "/rules/ai-rules-for-seo-content-sites", "title": "AI Coding Rules for SEO Content Sites", "description": "AGENTS.md rules for SEO-focused content sites that prevent duplicate metadata, enforce structured data, and keep agents from breaking crawlability.", "tools": [ "Cursor", "Claude Code", "Codex", "Windsurf" ], "stack": [ "Astro", "Next.js", "TypeScript" ], "tags": [ "agents-md", "seo", "astro", "nextjs", "typescript", "conventions" ], "difficulty": null, "updated": "2026-06-08", "markdown": "Drop this in your repo root as `AGENTS.md`. It targets any content-heavy site where organic search traffic is the primary growth lever — blogs, documentation sites, marketing sites, and resource libraries.\n\n## AGENTS.md\n\n```md title=\"AGENTS.md\"\n# Project Rules — SEO Content Site\n\n## Stack\n- Astro (static) or Next.js (App Router, static export or ISR).\n- TypeScript strict. Content schema enforced via Zod (content collections or manual).\n- Tailwind CSS for styling.\n\n## Hard rules — SEO correctness\n- Every page MUST have a unique `<title>` (50–60 chars) and a unique\n `<meta name=\"description\">` (120–160 chars). Duplicate titles and descriptions\n cause Google to de-rank or rewrite them. Check both before marking a task done.\n- Every page MUST render a `<link rel=\"canonical\" href=\"...\">` pointing to its own\n canonical URL. This is required even on pages that are not duplicated — it is a\n signal, not just a deduplication tool.\n- Never generate two pages with the same slug or URL path. Before adding a new content\n file or route, confirm the path does not already exist in `src/pages/` or the\n content collection.\n- Every blog post or article MUST include JSON-LD structured data: at minimum\n `Article` with `headline`, `datePublished`, `dateModified`, and `author`.\n Product pages need `Product` schema. 
FAQ pages need `FAQPage` schema.\n- Open Graph tags (`og:title`, `og:description`, `og:image`, `og:url`) must be\n present on every page. The `og:image` must be an absolute URL (not a relative path).\n- Do NOT use `noindex` on pages that should rank. Do NOT remove `noindex` from pages\n in `src/pages/api/`, `src/pages/admin/`, or any route that should not be crawled.\n\n## URL and routing conventions\n- URLs are lowercase and hyphen-separated. Use no trailing slash, or always use one if\n that is the framework default; pick one form and enforce it with a redirect.\n- Never rename a published URL without adding a 301 redirect from the old path.\n Inbound links to a URL that now 404s stop passing ranking signals.\n- Paginated series use `/page/2/` style paths, not query strings (`?page=2`).\n Google can index query-string pages, but path-style URLs are easier to keep\n consistent and canonical.\n\n## Content and performance\n- All images must have descriptive `alt` text that includes the target keyword where\n natural. Empty `alt=\"\"` is only correct for decorative images.\n- Images must be served in WebP or AVIF format. Do not ship a bare JPEG or PNG: wrap it in a `<picture>`\n element that offers a WebP or AVIF source and keeps the JPEG or PNG as the fallback.\n- Every page must load without render-blocking scripts. No `<script>` without `defer`\n or `async` in the `<head>` unless it is a critical inline script.\n- Internal links must use the full path and must not 404. Before adding a link, verify\n the target page exists.\n\n## Definition of done\n- `astro check` or `tsc --noEmit` passes.\n- `astro build` completes without warnings.\n- Spot-check: `curl -s <page-url> | grep -o 'rel=\"canonical\"' | wc -l` prints 1.\n- No duplicate `<title>` values across built HTML\n (`grep -rho '<title>[^<]*' dist/ | sort | uniq -d` prints nothing).\n- JSON-LD is present and valid (use the Schema.org validator).\n```\n\n## Why these rules\n\n- **Unique title and description per page** is among the highest-impact SEO rules for content sites. 
Agents that generate content pages in bulk often reuse the same metadata template, producing dozens of pages that are technically distinct but present identical titles and snippets to crawlers, so search engines may rewrite the titles or struggle to pick the right page for a query.\n- **Canonical on every page, not just duplicates** is frequently misunderstood. Agents that read SEO documentation often add canonicals only where content is clearly duplicated (e.g. pagination). In practice, every page should self-reference its canonical so that URL variants created by tracking or session parameters do not split link equity.\n\n## Good fit\n\n- Blogs, documentation sites, SEO content hubs, and marketing sites where organic search is the primary acquisition channel.\n\n## Not a fit\n\n- Internal tools, dashboards, or applications where SEO is irrelevant; the canonical and structured-data requirements add overhead with no benefit." }