Dynamic Sitemaps in Next.js: Thousands of Pages, One File
How to generate dynamic sitemaps in Next.js for sites with thousands of pages. Multi-sitemap index, database queries, and revalidation strategies.
title: "Dynamic Sitemaps in Next.js: sitemap.ts for Large Sites"
description: "How to generate dynamic sitemaps in Next.js for sites with thousands of pages. Multi-sitemap index, database queries, and revalidation strategies."
tags: ["sitemap", "nextjs", "technical-seo"]
You launched a massive programmatic SEO campaign. 15,000 new pages generated from your database. Three weeks later, Google Search Console shows exactly 42 pages indexed. The culprit: your static public/sitemap.xml file hasn't been updated since 2022, and Googlebot gave up trying to crawl your site's infinite pagination.
You need a dynamic sitemap that scales with your database. Writing manual XML strings is error-prone, and third-party packages add unnecessary dependencies. Next.js provides built-in API primitives to handle this directly in the App Router.
What is the Next.js sitemap.ts convention?
The sitemap.ts file in the Next.js App Router automatically generates an XML sitemap route at build time or request time by returning an array of URL objects.
Drop a sitemap.ts file in the root of your app directory. Next.js executes the default exported function and compiles the returned array into valid XML. You no longer need next-sitemap or custom API routes returning Content-Type: text/xml.
// app/sitemap.ts
import type { MetadataRoute } from 'next'
export default function sitemap(): MetadataRoute.Sitemap {
  return [
    {
      url: 'https://indxel.com',
      lastModified: new Date(),
    },
    {
      url: 'https://indxel.com/docs',
      lastModified: new Date(),
    },
    {
      url: 'https://indxel.com/blog',
      lastModified: new Date(),
    },
  ]
}
When you navigate to /sitemap.xml, Next.js outputs:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://indxel.com</loc>
    <lastmod>2024-05-12T14:32:00.000Z</lastmod>
  </url>
  <!-- ... -->
</urlset>
How do you generate dynamic sitemaps from a database?
Fetch your records inside the sitemap() function using your ORM, map the results to the Next.js MetadataRoute.Sitemap type, and return the combined array.
Since sitemap.ts runs only on the server (Next.js treats it as a special Route Handler), you can query your database directly. Fetch your static routes, fetch your dynamic routes, combine them, and return the array.
// app/sitemap.ts
import type { MetadataRoute } from 'next'
import { db } from '@/lib/db'
export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const baseUrl = 'https://indxel.com'
  // Fetch all published blog posts
  const posts = await db.post.findMany({
    where: { published: true },
    select: { slug: true, updatedAt: true },
  })
  // Map database records to sitemap format
  const postUrls = posts.map((post) => ({
    url: `${baseUrl}/blog/${post.slug}`,
    lastModified: post.updatedAt,
  }))
  // Define static routes
  const staticUrls = [
    {
      url: baseUrl,
      lastModified: new Date(),
    },
    {
      url: `${baseUrl}/pricing`,
      lastModified: new Date(),
    },
  ]
  return [...staticUrls, ...postUrls]
}
Use absolute URLs. Google rejects sitemaps containing relative paths like /blog/my-post. Always prepend your production domain.
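One way to keep every entry absolute is to resolve paths against a single origin with the standard URL constructor. This is a minimal sketch; `BASE_URL` and `absoluteUrl` are illustrative names, not part of Next.js.

```typescript
// Hypothetical helper: resolve any path against one production origin.
const BASE_URL = 'https://indxel.com' // assumption: your canonical origin

function absoluteUrl(path: string, base: string = BASE_URL): string {
  // The URL constructor resolves `path` against `base` and
  // percent-encodes characters that are not valid in a URL path.
  return new URL(path, base).toString()
}

// Example: build sitemap URLs from database slugs.
const exampleUrls = ['my-post', 'hello world'].map((slug) =>
  absoluteUrl(`/blog/${slug}`),
)
```

Note that an input that is already absolute overrides the base, so stored full URLs pass through unchanged.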
How do you handle sites with more than 50,000 URLs?
Use the generateSitemaps function alongside sitemap() to split your URLs into multiple sitemap files, each served at its own URL.
The sitemap protocol limits a single XML file to 50,000 URLs and 50MB uncompressed. If you query 120,000 products and return them in a single sitemap.ts file, Google Search Console will reject the file. Next.js handles this via the generateSitemaps export.
First, generateSitemaps returns an array of objects, each with an id, typically representing pagination offsets or category IDs. Then, Next.js calls your sitemap() function once per object, passing that id as an argument.
// app/sitemap.ts
import type { MetadataRoute } from 'next'
import { db } from '@/lib/db'
const ITEMS_PER_SITEMAP = 50000
// 1. Tell Next.js how many sitemaps to generate
export async function generateSitemaps() {
  const totalProducts = await db.product.count()
  const totalSitemaps = Math.ceil(totalProducts / ITEMS_PER_SITEMAP)
  // e.g. 120,000 products -> [{ id: 0 }, { id: 1 }, { id: 2 }]
  return Array.from({ length: totalSitemaps }).map((_, i) => ({
    id: i,
  }))
}
// 2. Generate the specific sitemap based on the ID
export default async function sitemap({
  id,
}: {
  id: number
}): Promise<MetadataRoute.Sitemap> {
  const skip = id * ITEMS_PER_SITEMAP
  const products = await db.product.findMany({
    // Assumption: `id` is a unique, stable column. Without an explicit order,
    // skip/take pages can overlap or miss rows between chunks.
    orderBy: { id: 'asc' },
    skip,
    take: ITEMS_PER_SITEMAP,
    select: { slug: true, updatedAt: true },
  })
  return products.map((product) => ({
    url: `https://example.com/products/${product.slug}`,
    lastModified: product.updatedAt,
  }))
}
Each chunk is then served at /sitemap/0.xml, /sitemap/1.xml, and so on.
Check what /sitemap.xml returns on your Next.js version before relying on it. The generateSitemaps documentation describes the per-chunk URLs only and does not mention an automatically generated sitemap index. If the root URL returns a 404, either publish your own index file or list every chunk URL in robots.txt and Google Search Console.
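If /sitemap.xml does not return an index on your Next.js version, one option is a small Route Handler that emits the index yourself. The sketch below is an example under stated assumptions, not a Next.js convention: the file path `app/sitemap-index.xml/route.ts`, the helper names, and the hard-coded chunk count are all hypothetical.

```typescript
// Sketch for a hand-written index, e.g. app/sitemap-index.xml/route.ts
// (hypothetical path; Next.js does not reserve this name).

// Escape XML special characters so URLs containing "&" stay valid.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;')
}

// Build a <sitemapindex> document pointing at /sitemap/0.xml, /sitemap/1.xml, ...
function buildSitemapIndex(baseUrl: string, chunkCount: number): string {
  const entries = Array.from({ length: chunkCount }, (_, id) => {
    const loc = escapeXml(`${baseUrl}/sitemap/${id}.xml`)
    return `  <sitemap><loc>${loc}</loc></sitemap>`
  })
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</sitemapindex>',
  ].join('\n')
}

// Route Handler entry point.
export async function GET(): Promise<Response> {
  // Assumption: replace with the same count that generateSitemaps computes.
  const chunkCount = 3
  const xml = buildSitemapIndex('https://example.com', chunkCount)
  return new Response(xml, { headers: { 'Content-Type': 'application/xml' } })
}
```

Deriving the chunk count from the same query as generateSitemaps keeps the index and the chunk files in step as the catalog grows.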
Should you include changefreq and priority?
No. Google ignores the changefreq and priority fields, relying instead on the lastmod date to determine if a page needs recrawling.
Developers often waste compute power writing complex algorithms to assign a 0.9 priority to the homepage and 0.5 to older blog posts. Google's Gary Illyes confirmed years ago that Googlebot ignores these values. The only field that influences crawl behavior is lastmod, and Google uses it only when it is consistently and verifiably accurate, so set it from real update timestamps.
| Field | Required | Google Support | Developer Action |
|---|---|---|---|
| loc (URL) | Yes | Full | Pass absolute URL. |
| lastmod | No | High | Pass the updatedAt timestamp from your database. |
| changefreq | No | Ignored | Omit. Saves bytes. |
| priority | No | Ignored | Omit. Saves compute. |
If you must satisfy a legacy auditing tool that flags missing priorities, you can add them. But for actual search engine behavior, strictly map your database updatedAt fields to lastModified and drop the rest.
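As a rough illustration of "map updatedAt and drop the rest", the helper below builds an entry with only a URL and, when a real timestamp exists, a lastModified date. The `SitemapEntry` and `PageRecord` types are local stand-ins invented for this sketch, and the `/blog/` prefix is an assumption about your routes.

```typescript
// Local stand-in for the subset of sitemap entry fields used here.
type SitemapEntry = { url: string; lastModified?: Date }

// Hypothetical record shape; adjust to your schema.
type PageRecord = { slug: string; updatedAt: Date | null }

function toEntry(baseUrl: string, record: PageRecord): SitemapEntry {
  const entry: SitemapEntry = { url: `${baseUrl}/blog/${record.slug}` }
  // Emit lastmod only when a real timestamp exists;
  // an invented date would make the field inaccurate.
  if (record.updatedAt) {
    entry.lastModified = record.updatedAt
  }
  // changeFrequency and priority are deliberately left out.
  return entry
}
```

Omitting lastModified for rows without a timestamp avoids falling back to `new Date()`, which would report every page as freshly modified on each regeneration.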
How do you include images in a Next.js sitemap?
Add an images array of absolute image URLs to each URL object in your sitemap.ts return array.
Image sitemaps tell Google which images belong to your page content. If you run an e-commerce store with React image galleries that load client-side, Googlebot might miss them. Listing them in the sitemap helps Google discover them.
// app/sitemap.ts
import type { MetadataRoute } from 'next'
export default function sitemap(): MetadataRoute.Sitemap {
  return [
    {
      url: 'https://example.com/products/mechanical-keyboard',
      lastModified: new Date(),
      images: [
        'https://example.com/images/keyboard-front.jpg',
        'https://example.com/images/keyboard-side.jpg',
      ],
    },
  ]
}
Next.js will compile this into the image:image XML namespace automatically.
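In practice the image list usually comes from the database rather than a literal array. The sketch below assumes a hypothetical `Product` record with an `imagePaths` column and uses a local `ImageEntry` type as a stand-in for the sitemap entry shape; neither comes from the original setup.

```typescript
// Local stand-in for a sitemap entry that carries images.
type ImageEntry = { url: string; lastModified: Date; images: string[] }

// Hypothetical product record; `imagePaths` may hold relative or absolute paths.
type Product = { slug: string; updatedAt: Date; imagePaths: string[] }

function productToEntry(baseUrl: string, product: Product): ImageEntry {
  // Resolve each stored path against the site origin so every image URL is absolute.
  const absolute = product.imagePaths.map((path) =>
    new URL(path, baseUrl).toString(),
  )
  return {
    url: `${baseUrl}/products/${product.slug}`,
    lastModified: product.updatedAt,
    // Drop duplicates while keeping the original order.
    images: Array.from(new Set(absolute)),
  }
}
```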
How do you revalidate sitemaps in Next.js?
Export a revalidate route segment config variable from your sitemap.ts file to cache the XML output and regenerate it in the background via ISR.
Database queries for 50,000 URLs are slow. If your database query takes 3 seconds, you block the request every time Googlebot (or an SEO crawler) hits /sitemap.xml. You fix this by treating your sitemap like any other Next.js page leveraging Incremental Static Regeneration (ISR).
// app/sitemap.ts
import type { MetadataRoute } from 'next'
import { db } from '@/lib/db'
// Cache the sitemap for 1 hour (3600 seconds)
export const revalidate = 3600
export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  // Query executes once per hour, regardless of traffic
  const products = await db.product.findMany()
  // ...
}
The first request generates the XML and caches it. Subsequent requests serve the cached file instantly. After one hour, the next request serves the stale cache while Next.js triggers a background regeneration.
How does robots.ts interact with your dynamic sitemap?
A robots.ts file dynamically outputs your robots.txt directives and explicitly points search engine crawlers to the absolute URL of your sitemap index.
Sitemaps are useless if crawlers can't find them. While you can submit sitemaps manually in Google Search Console, declaring them in robots.txt handles discovery for Bing, Ahrefs, Semrush, and other bots. Next.js provides a robots.ts convention exactly like sitemap.ts.
// app/robots.ts
import type { MetadataRoute } from 'next'
export default function robots(): MetadataRoute.Robots {
  return {
    rules: {
      userAgent: '*',
      allow: '/',
      disallow: '/private/',
    },
    sitemap: 'https://indxel.com/sitemap.xml',
  }
}
How do you measure the impact of dynamic sitemaps?
Track the ratio of "Discovered - currently not indexed" to "Indexed" URLs in Google Search Console after deploying your dynamic sitemap.
Consider a Next.js e-commerce app with 120,000 SKUs that takes 4.5 seconds to generate its sitemap dynamically on every request. After switching to generateSitemaps with a 24-hour ISR cache (revalidate = 86400), server response time drops to 50ms, and the site reaches a 94% indexation rate within 14 days.
Before you ship to production, validate that your dynamically generated URLs actually return a 200 OK status. Feeding Google a sitemap full of 404s wastes crawl budget on dead pages.
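To make the check concrete, here is a minimal sketch of what such validation involves: request each URL without following redirects and collect anything that is not a plain 200. The names `findBadUrls`, `FetchLike`, and `UrlProblem` are invented for this example, and the fetch function is passed in so the logic can be exercised with a stub.

```typescript
// Minimal fetch-shaped signature so the checker can run against a stub.
type FetchLike = (
  url: string,
  init: { method: string; redirect: 'manual' },
) => Promise<{ status: number }>

type UrlProblem = { url: string; status: number }

// Return every URL whose response status is not exactly 200.
async function findBadUrls(
  urls: string[],
  fetchFn: FetchLike,
): Promise<UrlProblem[]> {
  const problems: UrlProblem[] = []
  for (const url of urls) {
    // redirect: 'manual' surfaces 3xx responses instead of following them,
    // because a sitemap should list the final URL, not a redirect.
    const res = await fetchFn(url, { method: 'HEAD', redirect: 'manual' })
    if (res.status !== 200) {
      problems.push({ url, status: res.status })
    }
  }
  return problems
}
```

On Node 18+ the global `fetch` should fit this signature, so `findBadUrls(urls, fetch)` is one way to run it. Requests go out one at a time to keep load on a preview deployment low, and some servers reject HEAD, in which case GET is the fallback.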
Use the Indxel CLI to validate your local or preview sitemap in CI. The CLI outputs warnings in the same format as ESLint — one line per issue, with the file path and rule ID.
npx indxel check --sitemap https://preview-url.vercel.app/sitemap.xml
Indxel Validation Report
Checking 3,402 URLs from sitemap index...
✖ 3 critical errors found
/products/discontinued-item [404 Not Found]
/blog/draft-post [403 Forbidden]
/categories/old-category [Redirects to /categories/new] - Sitemaps must contain terminal URLs
Frequently Asked Questions
Can I use both a static sitemap.xml and a dynamic sitemap.ts?
No. Next.js will throw a build error if you include both a sitemap.xml and a sitemap.ts file in the same route segment. Delete your static XML file when migrating to the TypeScript convention.
How do I submit my dynamic sitemap to Google Search Console?
Add the sitemap URL to your robots.txt file and submit it in the GSC dashboard under the "Sitemaps" tab. If /sitemap.xml serves a sitemap index, submitting that one URL is enough: Google parses the index and discovers /sitemap/0.xml and subsequent files. If it does not, submit each chunk URL individually.
How do I exclude certain pages from the sitemap?
Filter them out of your database query or omit them from the returned array in sitemap.ts. If you have a noindex column in your database, add a where: { noindex: false } clause to your ORM query. Sitemaps should only contain canonical, indexable URLs that return a 200 status code.
Does sitemap.ts work with the Pages Router?
No. The sitemap.ts convention is exclusive to the Next.js App Router. If you are using the Pages Router, you must create a custom API route or use getServerSideProps in a pages/sitemap.xml.js file to manually set the text/xml header and return an XML string.
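A minimal sketch of the getServerSideProps approach could look like this. `ResLike` is a small structural stand-in for the response object so the example stays self-contained (a real project would import the types from 'next'), and the single hard-coded entry is placeholder data.

```typescript
// pages/sitemap.xml.ts (sketch for the Pages Router)

// Stand-in for the parts of the Node response object used below.
type ResLike = {
  setHeader(name: string, value: string): void
  write(chunk: string): void
  end(): void
}

type Entry = { url: string; lastModified?: Date }

// Escape characters that are not allowed raw inside XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

// Serialize entries into a <urlset> document.
function buildUrlset(entries: Entry[]): string {
  const urls = entries.map((entry) => {
    const lastmod = entry.lastModified
      ? `<lastmod>${entry.lastModified.toISOString()}</lastmod>`
      : ''
    return `  <url><loc>${escapeXml(entry.url)}</loc>${lastmod}</url>`
  })
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    '</urlset>',
  ].join('\n')
}

export async function getServerSideProps({ res }: { res: ResLike }) {
  // Placeholder data; replace with your database query.
  const entries: Entry[] = [
    { url: 'https://example.com/', lastModified: new Date('2024-05-12T14:32:00.000Z') },
  ]
  res.setHeader('Content-Type', 'text/xml')
  res.write(buildUrlset(entries))
  res.end()
  return { props: {} }
}

// The response is already sent, so the page component renders nothing.
export default function SitemapPage() {
  return null
}
```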
Validate your sitemap setup in CI before Googlebot crawls your dead links. Add this to your GitHub Actions workflow:
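name: SEO Validation
# environment_url only exists on deployment_status events;
# a pull_request event carries no deployment_status payload.
on: [deployment_status]
jobs:
  validate-sitemap:
    # Run only once the preview deployment has succeeded
    if: github.event.deployment_status.state == 'success'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Indxel Sitemap Check
        run: npx indxel check --sitemap ${{ github.event.deployment_status.environment_url }}/sitemap.xml --ci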