
AI Coding Rules for SEO Content Sites

AGENTS.md rules for SEO-focused content sites that prevent duplicate metadata, enforce structured data, and keep agents from breaking crawlability.

Cursor, Claude Code, Codex, Windsurf, Astro, Next.js, TypeScript
Updated June 8, 2026

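Place this file in the repository root and name it AGENTS.md. It applies to any content-heavy site where organic search traffic is the primary growth lever: blogs, documentation sites, marketing sites, and resource libraries.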

AGENTS.md

# Project Rules — SEO Content Site
## Stack
- Astro (static) or Next.js (App Router, static export or ISR).
- TypeScript strict. Content schema enforced via Zod (content collections or manual).
- Tailwind CSS for styling.
## Hard rules — SEO correctness
- Every page MUST have a unique `<title>` (50–60 chars) and a unique
`<meta name="description">` (120–160 chars). Duplicate titles and descriptions
make pages hard to tell apart in search results, and Google may rewrite them.
Check both before marking a task done.
- Every page MUST render a `<link rel="canonical" href="...">` pointing to its own
canonical URL. This is required even on pages that are not duplicated — it is a
signal, not just a deduplication tool.
- Never generate two pages with the same slug or URL path. Before adding a new content
file or route, confirm the path does not already exist in `src/pages/` or the
content collection.
- Every blog post or article MUST include JSON-LD structured data: at minimum
`Article` with `headline`, `datePublished`, `dateModified`, and `author`.
Product pages need `Product` schema. FAQ pages need `FAQPage` schema.
- Open Graph tags (`og:title`, `og:description`, `og:image`, `og:url`) must be
present on every page. The `og:image` must be an absolute URL (not a relative path).
- Do NOT use `noindex` on pages that should rank. Do NOT remove `noindex` from pages
in `src/pages/api/`, `src/pages/admin/`, or any route that should not be crawled.
## URL and routing conventions
- URLs are lowercase and hyphen-separated. Pick one trailing-slash policy (none, or
always if that is the framework default) and enforce it with a redirect.
- Never rename a published URL without adding a 301 redirect from the old path.
Without the redirect, inbound links to the old URL 404 and their ranking signals
are lost.
- Paginated series use `/page/2/` style paths, not query strings (`?page=2`).
Google can crawl query-string URLs, but path-based pages build to static files
and are easier to give a stable canonical URL.
## Content and performance
- All images must have descriptive `alt` text that includes the target keyword where
natural. Empty `alt=""` is only correct for decorative images.
- Images must be served in WebP or AVIF format. JPEG or PNG is allowed only as the
`<img>` fallback inside a `<picture>` element whose `<source>` entries offer WebP
or AVIF.
- Every page must load without render-blocking scripts. No `<script>` without `defer`
or `async` in the `<head>` unless it is a critical inline script.
- Internal links must use the full path and must not 404. Before adding a link, verify
the target page exists.
## Definition of done
- `astro check` or `tsc --noEmit` passes.
- `astro build` (or `next build`) completes without warnings.
- Spot-check: `curl -s <page-url> | grep -o 'rel="canonical"' | wc -l` prints 1.
- No duplicate `<title>` values across built HTML:
`grep -rhoE '<title>[^<]*</title>' dist/ | sort | uniq -d` prints nothing.
- JSON-LD is present and valid (check it with the Schema.org validator).
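
The metadata-length and structured-data rules in the file above can also be checked in code instead of by eye. The sketch below is one possible way to do that in plain TypeScript. The `PostMeta` shape and all function names are hypothetical, and typing the author as a `Person` is an assumption (an organization byline would use `Organization`).

```typescript
// Hypothetical shape for one post's metadata; the field names are assumptions.
interface PostMeta {
  title: string;
  description: string;
  publishedAt: string; // ISO 8601 date string
  updatedAt: string; // ISO 8601 date string
  authorName: string;
}

// Returns a list of problems; an empty list means both lengths are in range.
function checkMetaLengths(meta: PostMeta): string[] {
  const problems: string[] = [];
  if (meta.title.length < 50 || meta.title.length > 60) {
    problems.push(`title is ${meta.title.length} chars, expected 50-60`);
  }
  if (meta.description.length < 120 || meta.description.length > 160) {
    problems.push(`description is ${meta.description.length} chars, expected 120-160`);
  }
  return problems;
}

// Builds the minimum Article object the rules ask for.
function buildArticleJsonLd(meta: PostMeta): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Article",
    headline: meta.title,
    datePublished: meta.publishedAt,
    dateModified: meta.updatedAt,
    author: { "@type": "Person", name: meta.authorName },
  };
}

// Serializes the object for embedding in the page.
function renderJsonLd(data: Record<string, unknown>): string {
  // Escape "<" so content such as "</script>" cannot end the tag early.
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```

A layout component could call `checkMetaLengths` at build time and fail the build on a non-empty result, then emit `renderJsonLd(buildArticleJsonLd(meta))` into the `<head>`. The output should still be run through a validator, since this sketch only covers the four required fields.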

Why these rules

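  • Unique per-page titles and descriptions are the highest-impact SEO rule for content sites. Agents that bulk-generate content pages tend to reuse the same metadata template, producing dozens of pages that are technically different but look identical to a crawler, which triggers a soft deduplication penalty.
  • Setting a canonical URL on every page, not only on duplicated ones, is frequently misunderstood. Agents that read SEO documentation tend to add a canonical only where content is obviously duplicated (for example, pagination). In practice every page should self-reference its canonical URL so that crawl variants created by injected parameters do not split link equity.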
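
Both points can be illustrated with a short sketch. The code below assumes a hypothetical `PageMeta` record per built page and a lowercase, no-trailing-slash URL policy. The function names and the `example.com` origin are placeholders, not part of any framework API.

```typescript
// Hypothetical record for one built page; the field names are assumptions.
interface PageMeta {
  path: string;
  title: string;
  description: string;
}

// Groups page paths by a metadata field and keeps only values used more than once.
function findDuplicates(
  pages: PageMeta[],
  field: "title" | "description"
): Record<string, string[]> {
  // A prototype-free object, so a title such as "constructor" is a safe key.
  const byValue: Record<string, string[]> = Object.create(null);
  for (const page of pages) {
    const key = page[field].trim().toLowerCase();
    const paths = byValue[key] || [];
    paths.push(page.path);
    byValue[key] = paths;
  }
  const duplicates: Record<string, string[]> = Object.create(null);
  for (const key of Object.keys(byValue)) {
    if (byValue[key].length > 1) duplicates[key] = byValue[key];
  }
  return duplicates;
}

// Builds the self-referencing canonical URL for a requested path.
// Query strings and fragments are dropped, so parameter-injected
// variants of a page all point back to the same canonical.
function selfCanonical(origin: string, requestPath: string): string {
  const pathOnly = requestPath.split(/[?#]/)[0].toLowerCase();
  // Assumed policy: no trailing slash, except on the site root.
  const trimmed = pathOnly.replace(/\/+$/, "");
  return origin.replace(/\/+$/, "") + (trimmed === "" ? "/" : trimmed);
}
```

For example, `selfCanonical("https://example.com", "/Blog/Post/?utm_source=x")` yields `https://example.com/blog/post`, and a build step could fail when `Object.keys(findDuplicates(pages, "title"))` is non-empty.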

When to use

  • Blogs, documentation sites, SEO content hubs, and marketing sites where organic search is the primary acquisition channel.

When not to use

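  • Internal tools, dashboards, or applications where SEO does not matter. There, the canonical URL and structured data requirements add overhead with no benefit.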