Prompt-to-PR: Add a Sitemap and robots.txt
SOP for adding a dynamic XML sitemap and robots.txt to a Next.js or Astro project — correct lastmod, priority, and crawl rules for production SEO.
Cursor, Claude Code, Codex, Windsurf | Next.js, Astro, TypeScript
Sitemaps and robots.txt are the first SEO primitives an agent touches, and they are frequently wrong — wrong lastmod format, missing Sitemap: directive in robots, or blocked pages inadvertently included. This playbook gets them right.
1. Requirement
Produce an XML sitemap covering all public routes (static + dynamic content from the database) and a robots.txt that blocks admin/API paths and references the sitemap. Works for both Next.js App Router and Astro; choose the correct approach for your framework.
2. First Prompt
Add a sitemap.xml and robots.txt to this project. Use the correct approach for the framework detected below.
### If Next.js 14+:
1. Create `src/app/sitemap.ts` using the Next.js `MetadataRoute.Sitemap` return type. Include:
   - All static routes: /, /pricing, /blog, /about (hardcoded is fine).
   - All dynamic blog posts: fetch slugs from the DB using the existing query helper, return lastModified from the post's updatedAt field.
   - Use `process.env.NEXT_PUBLIC_APP_URL` as the base URL.
   - Correct W3C datetime format for lastModified (ISO 8601).
2. Create `src/app/robots.ts` using MetadataRoute.Robots.
   - Allow: all routes.
   - Disallow: /admin, /api, /dashboard.
   - Add `sitemap: process.env.NEXT_PUBLIC_APP_URL + "/sitemap.xml"`.
### If Astro:
1. Add `@astrojs/sitemap` integration. In astro.config.ts, add `sitemap({ filter: (page) => !page.includes("/admin") })` and set `site: process.env.SITE_URL`.
2. Create `public/robots.txt`:
   User-agent: *
   Disallow: /admin
   Disallow: /api
   Sitemap: <SITE_URL>/sitemap-index.xml
Do not create a custom sitemap endpoint if the integration handles it.
Do not block / or any public content pages.
3. Expected File Changes
### Next.js
src/app/sitemap.ts (new: dynamic MetadataRoute.Sitemap)
src/app/robots.ts (new: MetadataRoute.Robots)
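As a rough illustration of what the two new Next.js files might compute, the sketch below builds the sitemap entries and the robots object as plain functions. It is a sketch under assumptions, not the project's actual code: `PostRow`, `buildSitemap`, `buildRobots`, and `normalizeBaseUrl` are hypothetical names, and the local types only mirror a subset of the shapes of `MetadataRoute.Sitemap` and `MetadataRoute.Robots` so the block runs without Next.js installed.

```typescript
// Local stand-ins for a subset of Next.js `MetadataRoute.Sitemap` entries and
// `MetadataRoute.Robots`; in a real project, import the types from "next".
type SitemapEntry = { url: string; lastModified?: string | Date };
type RobotsConfig = {
  rules: { userAgent: string; allow?: string; disallow?: string[] };
  sitemap?: string;
};

// Hypothetical row shape returned by the project's existing query helper.
type PostRow = { slug: string; updatedAt?: Date | null; createdAt?: Date | null };

// Hardcoded static routes, as the prompt allows.
const STATIC_PATHS = ["/", "/pricing", "/blog", "/about"];

// Strip trailing slashes so joined paths never contain "//".
function normalizeBaseUrl(raw: string): string {
  return raw.replace(/\/+$/, "");
}

// Combine static routes with one entry per blog post.
function buildSitemap(
  baseUrl: string,
  posts: PostRow[],
  now: Date = new Date()
): SitemapEntry[] {
  const base = normalizeBaseUrl(baseUrl);
  const staticEntries = STATIC_PATHS.map((path) => ({
    url: path === "/" ? base : `${base}${path}`,
    lastModified: now,
  }));
  const postEntries = posts.map((post) => ({
    url: `${base}/blog/${post.slug}`,
    // Fall back so a missing updatedAt never renders as "undefined".
    lastModified: post.updatedAt ?? post.createdAt ?? now,
  }));
  return [...staticEntries, ...postEntries];
}

// Allow everything except admin, API, and dashboard paths, and point
// crawlers at the sitemap with an absolute URL.
function buildRobots(baseUrl: string): RobotsConfig {
  return {
    rules: { userAgent: "*", allow: "/", disallow: ["/admin", "/api", "/dashboard"] },
    sitemap: `${normalizeBaseUrl(baseUrl)}/sitemap.xml`,
  };
}
```

In `src/app/sitemap.ts`, the default export would presumably be an async function that awaits the query helper and returns something like `buildSitemap(process.env.NEXT_PUBLIC_APP_URL, posts)`; `src/app/robots.ts` would return the robots object in the same way.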
### Astro
astro.config.ts (add sitemap integration + filter)
public/robots.txt (new)
.env.example (SITE_URL added if missing)
4. Review Checklist
- Base URL comes from an env var, not hardcoded as `http://localhost:3000`.
- `lastModified` is a JavaScript `Date` object (Next.js converts to ISO 8601) or already a valid ISO string, not `"undefined"` or missing.
- `/admin`, `/api`, and `/dashboard` are in the Disallow list.
- The `Sitemap:` directive in `robots.txt` uses an absolute URL.
- Dynamic routes (blog posts) are included via a DB query, not just static routes.
- The sitemap does not include 404, redirect, or noindex pages.
- `bun run build` then `curl /sitemap.xml` returns valid XML (check with `xmllint`).
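Several of these checks can be automated against the generated entries. The following is a minimal sketch, assuming the entries are available as plain objects; `checkSitemapEntries`, `Entry`, and `BLOCKED_PREFIXES` are hypothetical names, and the regex is a deliberately simple absolute-URL test rather than a full URL parser.

```typescript
type Entry = { url: string; lastModified?: string | Date };

// Paths that should never appear in the sitemap.
const BLOCKED_PREFIXES = ["/admin", "/api", "/dashboard"];

// Captures the host and the path of an absolute http(s) URL.
const ABSOLUTE_URL = /^https?:\/\/([^\/:?#]+)(?::\d+)?([^?#]*)/i;

// Returns a list of human-readable problems; an empty list means the
// entries pass the base URL, blocked path, and lastModified checks.
function checkSitemapEntries(entries: Entry[]): string[] {
  const problems: string[] = [];
  for (const entry of entries) {
    const match = ABSOLUTE_URL.exec(entry.url);
    if (!match) {
      problems.push(`${entry.url}: not an absolute URL`);
      continue;
    }
    const host = match[1].toLowerCase();
    const path = match[2];
    if (host === "localhost" || host === "127.0.0.1") {
      problems.push(`${entry.url}: base URL points at localhost`);
    }
    const blocked = BLOCKED_PREFIXES.some(
      (prefix) => path === prefix || path.indexOf(prefix + "/") === 0
    );
    if (blocked) {
      problems.push(`${entry.url}: disallowed path is listed in the sitemap`);
    }
    const lastModified = entry.lastModified;
    if (lastModified === undefined) {
      problems.push(`${entry.url}: lastModified is missing`);
    } else {
      // Date.parse("undefined") yields NaN, which catches the stringified case.
      const time =
        lastModified instanceof Date ? lastModified.getTime() : Date.parse(lastModified);
      if (isNaN(time)) {
        problems.push(`${entry.url}: lastModified is not a valid date`);
      }
    }
  }
  return problems;
}
```

This does not cover the 404, redirect, or noindex check, which needs real HTTP requests against the built site.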
5. Test Commands
bun run build && bun run start
# or for Astro (preview typically serves on port 4321, and the integration emits /sitemap-index.xml, so adjust the URLs below):
bun run build && bun run preview
# Validate sitemap XML
curl -s http://localhost:3000/sitemap.xml | xmllint --format - | head -40
# Confirm robots.txt
curl http://localhost:3000/robots.txt
# Confirm admin is disallowed and sitemap directive is present
grep -E "Disallow|Sitemap" <(curl -s http://localhost:3000/robots.txt)
# Check the response headers returned to a Googlebot user agent (a rough stand-in, not a real URL Inspection run)
curl -A "Googlebot" http://localhost:3000/sitemap.xml -I
6. Common Failures
- `lastModified` is `"undefined"`: the post's `updatedAt` field is null or missing from the query result. Guard: `lastModified: post.updatedAt ?? post.createdAt ?? new Date()`.
- Sitemap returns 404: `src/app/sitemap.ts` is missing or placed outside the `app` directory.
- All routes disallowed: agent adds `Disallow: /` by mistake. Confirm only admin/API paths are blocked.
- `Sitemap:` directive has relative URL: Google ignores it. Must be absolute: `https://example.com/sitemap.xml`.
- Static sitemap only: agent hardcodes blog slugs instead of querying the DB. Confirm the sitemap function is `async` and fetches real data.
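The two robots.txt failures above are easy to catch mechanically. Here is a minimal sketch, assuming the served robots.txt has already been fetched into a string; `checkRobotsTxt` and its `required` parameter are hypothetical, and the parser handles only simple `Field: value` lines with no per-user-agent grouping.

```typescript
// Returns a list of problems found in a robots.txt body; an empty list
// means none of the failure modes checked here were detected.
function checkRobotsTxt(text: string, required: string[] = ["/admin", "/api"]): string[] {
  const problems: string[] = [];
  const disallows: string[] = [];
  const sitemaps: string[] = [];

  for (const rawLine of text.split(/\r?\n/)) {
    // Drop comments, then split on the first colon only, because the
    // Sitemap value itself contains "://".
    const line = rawLine.replace(/#.*$/, "").trim();
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "disallow") disallows.push(value);
    if (field === "sitemap") sitemaps.push(value);
  }

  if (disallows.indexOf("/") !== -1) {
    problems.push("blanket 'Disallow: /' blocks every route");
  }
  for (const path of required) {
    if (disallows.indexOf(path) === -1) {
      problems.push(`missing 'Disallow: ${path}'`);
    }
  }
  if (sitemaps.length === 0) {
    problems.push("no Sitemap: directive");
  }
  for (const url of sitemaps) {
    if (!/^https?:\/\//i.test(url)) {
      problems.push(`Sitemap URL is not absolute: ${url}`);
    }
  }
  return problems;
}
```

For a Next.js project, `required` would presumably also include `/dashboard`, matching the Disallow list in the first prompt.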
7. Fix Prompt
The sitemap.xml at /sitemap.xml includes every blog post with lastModified "undefined" (rendered as the string).
Fix in src/app/sitemap.ts:
  const posts = await getBlogPosts();
  return posts.map((post) => ({
    url: `${BASE_URL}/blog/${post.slug}`,
    lastModified: post.updatedAt ?? post.createdAt ?? new Date(),
    changeFrequency: "weekly",
    priority: 0.8,
  }));
Ensure getBlogPosts() returns rows that include updatedAt and createdAt.
8. PR Description
## SEO: Add dynamic sitemap.xml and robots.txt
**Next.js**: `src/app/sitemap.ts` + `src/app/robots.ts` using built-in `MetadataRoute` types. Sitemap includes static routes + all published blog posts with correct `lastModified` timestamps from the DB.
**robots.txt** disallows `/admin`, `/api`, `/dashboard`; includes absolute `Sitemap:` directive pointing to the generated `/sitemap.xml`.
Base URL read from `NEXT_PUBLIC_APP_URL` — no localhost URLs in production.