Setting Up a Custom Sitemap Generator for Sitecore XM Cloud

The Challenge: Generating Sitemaps for Multiple Websites and Languages

Recently, as the Solution Architect on a Sitecore XM Cloud project with multiple websites and a complex language setup, I ran into a challenge: how do we efficiently generate sitemaps for multiple websites, each supporting multiple languages, in a headless environment?

Sitecore XM Cloud doesn’t have built-in sitemap generation like its XP counterpart, and since we were using a headless architecture with Next.js, we needed a custom solution that would:

  1. Dynamically generate XML sitemaps for each website and language variation.
  2. Optimize performance by leveraging caching and API calls instead of querying Sitecore directly every time.
  3. Ensure search engines like Google can easily crawl and index the content.

Here’s how I tackled the problem, complete with configurations and code snippets.


Step 1: Understanding the Sitemap Structure

Each website needed its own sitemap, with URLs structured based on the site’s language variations. A typical sitemap for our setup would look like this:

https://www.example.com/sitemap.xml (Main sitemap index)
https://www.example.com/sitemap-en.xml (English sitemap)
https://www.example.com/sitemap-fr.xml (French sitemap)        

Since we had 38 domains in scope, each with multiple language variants, the sitemaps had to be generated dynamically rather than maintained by hand.
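To make the structure above concrete, here is a minimal sketch of how the main sitemap index could be assembled for one domain. The function names (escapeXml, buildSitemapIndex) and the inputs are illustrative assumptions, not code from the project; only the sitemap-<language>.xml naming comes from the example above.

```javascript
// Escape the five XML special characters so URLs are safe inside <loc>.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build a sitemap index that points at one child sitemap per language.
// `hostname` and `languages` are illustrative inputs, not project values.
function buildSitemapIndex(hostname, languages) {
  const base = hostname.replace(/\/+$/, ''); // drop trailing slashes
  const entries = languages
    .map((lang) => {
      const loc = escapeXml(`${base}/sitemap-${lang}.xml`);
      return `  <sitemap><loc>${loc}</loc></sitemap>`;
    })
    .join('\n');
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    '</sitemapindex>',
  ].join('\n');
}

console.log(buildSitemapIndex('https://www.example.com', ['en', 'fr']));
```

The index only lists the child sitemaps; each sitemap-<language>.xml still has to be generated separately, which is what the following steps cover.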


Step 2: Creating the Sitemap Generator API in Next.js

Since our front-end was built with Next.js, we created a serverless API route to dynamically generate the sitemap.

Setting Up the Sitemap API Route

In your Next.js project, create a new API route at pages/api/sitemap.js:

import { SitemapStream, streamToPromise } from 'sitemap';
import fetch from 'node-fetch';

// NextApiRequest/NextApiResponse are TypeScript types, so they are not
// imported in a .js route. fetchSitecoreRoutes is defined in the next section.
export default async function handler(req, res) {
    try {
        const sitemap = new SitemapStream({ hostname: 'https://www.example.com' });
        // Each route is expected to expose its URL as `loc`.
        const urls = await fetchSitecoreRoutes();
        urls.forEach(url => {
            sitemap.write({ url: url.loc, changefreq: 'daily', priority: 0.8 });
        });
        sitemap.end();
        // streamToPromise resolves to a Buffer holding the finished XML.
        const sitemapXml = await streamToPromise(sitemap);
        res.setHeader('Content-Type', 'application/xml');
        res.status(200).send(sitemapXml.toString());
    } catch (error) {
        console.error('Error generating sitemap:', error);
        res.status(500).end();
    }
}
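The route above hard-codes a single hostname. With many domains served by one application, the hostname could instead be derived from the incoming request. The sketch below is one possible way to do that; the SITE_ORIGINS map and the resolveOrigin helper are hypothetical names, and the use of the x-forwarded-host header assumes a proxy or CDN in front of the app that sets it.

```javascript
// Hypothetical allow-list: request host -> canonical origin. With dozens of
// domains this would more likely be loaded from configuration.
const SITE_ORIGINS = {
  'www.example.com': 'https://www.example.com',
  'www.example2.com': 'https://www.example2.com',
};

// Return the canonical origin for the incoming request, or null when the
// host is not one we serve. Node lower-cases header names, so plain keys work.
function resolveOrigin(req) {
  const forwarded = req.headers['x-forwarded-host'];
  const raw =
    (Array.isArray(forwarded) ? forwarded[0] : forwarded) ||
    req.headers.host ||
    '';
  // Take the first value, normalise case, and strip any port suffix.
  const host = raw.split(',')[0].trim().toLowerCase().replace(/:\d+$/, '');
  return Object.prototype.hasOwnProperty.call(SITE_ORIGINS, host)
    ? SITE_ORIGINS[host]
    : null;
}
```

Inside the handler, the result would replace the literal hostname passed to SitemapStream, and a null result could be answered with a 404. Looking the host up in an allow-list, rather than echoing the header back, keeps a spoofed header from ending up in the generated URLs.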

Fetching Sitecore Routes

Since we were using headless Sitecore, we had an API that returned the list of valid URLs for each website and language. Here’s how we fetched those URLs:

async function fetchSitecoreRoutes() {
    const response = await fetch('https://your-sitecore-api-endpoint/api/routes');
    if (!response.ok) {
        throw new Error(`Failed to fetch routes: ${response.statusText}`);
    }
    return await response.json();
}        

The endpoint returns all published pages from Sitecore, and the handler turns each entry into a sitemap URL.
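Because the handler reads url.loc from every entry, it can help to validate the response before writing it to the sitemap. The following is a minimal sketch under the assumption that the endpoint returns an array of objects with a loc string; the normalizeRoutes name and the handling of relative paths are illustrative, not part of the original solution.

```javascript
// Keep only entries with a usable `loc`, resolve relative paths against the
// site origin, and drop duplicates. The { loc } shape mirrors how the handler
// reads `url.loc`; anything else about the payload is an assumption.
function normalizeRoutes(payload, origin) {
  if (!Array.isArray(payload)) return [];
  const seen = new Set();
  const routes = [];
  for (const item of payload) {
    if (!item || typeof item.loc !== 'string' || item.loc.trim() === '') {
      continue; // ignore malformed entries
    }
    let absolute;
    try {
      absolute = new URL(item.loc.trim(), origin).href;
    } catch {
      continue; // skip values that cannot be parsed as a URL
    }
    if (seen.has(absolute)) continue;
    seen.add(absolute);
    routes.push({ ...item, loc: absolute });
  }
  return routes;
}
```

Calling normalizeRoutes(await fetchSitecoreRoutes(), 'https://www.example.com') before the forEach loop would keep a single malformed entry from breaking the whole sitemap.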


Step 3: Handling Multi-Site and Multi-Language Configurations

Each site in our Sitecore instance had its own root node and language versions. To dynamically fetch pages for each site and language, we updated our fetchSitecoreRoutes function to include parameters for site and language.

async function fetchSitecoreRoutes(site, language) {
    // Encode both values so unusual site names or locales cannot break the query string.
    const query = `site=${encodeURIComponent(site)}&lang=${encodeURIComponent(language)}`;
    const apiUrl = `https://your-sitecore-api-endpoint/api/routes?${query}`;
    const response = await fetch(apiUrl);
    if (!response.ok) {
        throw new Error(`Failed to fetch routes for ${site} (${language}): ${response.statusText}`);
    }
    return await response.json();
}

Now, when generating sitemaps, we iterate over each site and language:

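const sites = [
    { name: 'example', languages: ['en', 'fr', 'es'] },
    { name: 'example2', languages: ['en', 'de'] }
];

// forEach does not wait for async callbacks, so use for...of inside an
// async function to make sure every sitemap finishes and errors surface.
async function generateAllSitemaps() {
    for (const site of sites) {
        for (const lang of site.languages) {
            const urls = await fetchSitecoreRoutes(site.name, lang);
            // generateSitemap builds the XML for one site/language pair.
            await generateSitemap(urls, site.name, lang);
        }
    }
}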

This ensures that each website gets a properly structured sitemap for every language.
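With many sites and languages, it may also be worth running the site/language pairs in parallel and making sure one failing pair does not stop the others. The sketch below shows one way to do that with Promise.allSettled. The names buildSitemapJobs and generateSitemapsInParallel are hypothetical, and the fetch and generate functions are passed in as parameters so the example does not depend on the real implementations.

```javascript
// Flatten the site configuration into one job per site/language pair.
function buildSitemapJobs(sites) {
  return sites.flatMap((site) =>
    site.languages.map((language) => ({ site: site.name, language }))
  );
}

// Run every job in parallel and report failures without aborting the rest.
// `fetchRoutes` and `generate` are injected stand-ins for the functions
// used in this article (fetchSitecoreRoutes and generateSitemap).
async function generateSitemapsInParallel(sites, fetchRoutes, generate) {
  const jobs = buildSitemapJobs(sites);
  const results = await Promise.allSettled(
    jobs.map(async ({ site, language }) => {
      const urls = await fetchRoutes(site, language);
      return generate(urls, site, language);
    })
  );
  // Collect the site/language pairs whose promise was rejected.
  const failed = jobs.filter((_, i) => results[i].status === 'rejected');
  return { total: jobs.length, failed };
}
```

Starting every request at once may be too aggressive for a large number of domains, so a real implementation would probably process the jobs in batches. The returned failed list gives a natural place to log or retry individual pairs.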


Step 4: Caching the Sitemap for Performance

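Generating the sitemap on every request would be inefficient, so we used the revalidation interval from Next.js Incremental Static Regeneration (ISR) to cache the route data for a fixed duration.

Note that getStaticProps only runs in page files, not in API routes, so this logic lives in a page rather than in pages/api/sitemap.js. For the API route itself, a Cache-Control header such as s-maxage=86400 lets the CDN cache the response for the same period.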

export async function getStaticProps() {
    const urls = await fetchSitecoreRoutes();
    return {
        props: { urls },
        revalidate: 86400, // Regenerate once per day
    };
}        

This drastically reduced the load on our API and ensured the sitemap stayed updated daily.
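For cases where a CDN or ISR is not available, a small in-memory cache with a time-to-live gives a similar effect inside a single server process. This is a minimal sketch of that alternative technique, not the approach used in the project; createTtlCache and getOrLoad are hypothetical names, and the clock is injectable only to make the expiry logic easy to exercise.

```javascript
// Minimal in-memory cache with a time-to-live. `now` is injectable so the
// expiry logic can be exercised without waiting.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return {
    // Return the cached value for `key`, or call `loader` and store the result.
    getOrLoad(key, loader) {
      const hit = entries.get(key);
      if (hit && hit.expiresAt > now()) return hit.value;
      const value = loader();
      entries.set(key, { value, expiresAt: now() + ttlMs });
      return value;
    },
  };
}

// One day, matching the daily regeneration interval used above.
const sitemapCache = createTtlCache(24 * 60 * 60 * 1000);
```

A call such as sitemapCache.getOrLoad('example:en', () => fetchSitecoreRoutes('example', 'en')) would store the returned promise, including a rejected one, so real code would want to evict the entry on failure. In a serverless deployment each instance has its own memory and loses it on a cold start, which is why a shared CDN cache is usually the better fit.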


Step 5: Submitting the Sitemap to Search Engines

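Finally, we needed to submit the sitemap to Google Search Console and Bing Webmaster Tools.

  1. Google Search Console: submit the sitemap URL in the Sitemaps report of the verified property.
  2. Bing Webmaster Tools: submit the sitemap URL in the Sitemaps section for the site.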

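For automation, we used the following curl commands:

curl -X POST "https://www.google.com/ping?sitemap=https://www.example.com/sitemap.xml"
curl -X POST "https://www.bing.com/ping?sitemap=https://www.example.com/sitemap.xml"

At the time, this notified search engines of sitemap updates. Google retired its sitemap ping endpoint in 2023 and Bing has deprecated anonymous sitemap pings, so today it is more reliable to submit the sitemap in each console, keep lastmod values accurate, and reference the sitemap from robots.txt.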
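Crawlers can also discover a sitemap through a Sitemap line in robots.txt, which is part of the sitemaps.org protocol and needs no per-update call. The sketch below generates such a file for one origin; buildRobotsTxt is a hypothetical helper, and the permissive Allow rule is only a placeholder for whatever crawl rules a site actually needs.

```javascript
// Build a robots.txt body that advertises the sitemap index for one origin.
// The crawl rules here are placeholders; only the Sitemap line matters.
function buildRobotsTxt(origin) {
  const base = origin.replace(/\/+$/, ''); // drop trailing slashes
  return [
    'User-agent: *',
    'Allow: /',
    '',
    `Sitemap: ${base}/sitemap.xml`,
    '',
  ].join('\n');
}

console.log(buildRobotsTxt('https://www.example.com'));
```

In a multi-domain setup, the same function could be called with each site's origin so that every domain advertises its own sitemap index.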


Building a custom sitemap generator for Sitecore XM Cloud in a multi-site, multi-language headless environment was an exciting challenge. By leveraging Next.js API routes, dynamic route fetching, caching, and search engine submission, we ensured a scalable and efficient solution.

This approach is extendable – you can add priority logic, exclude certain pages, or even generate image sitemaps using the same principles.
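As an illustration of the first two extensions, the sketch below filters out excluded sections and derives a priority from how deep a page sits in the URL tree. The rules, the toSitemapEntries name, and the specific priority values are hypothetical; it also assumes every loc is already an absolute URL.

```javascript
// Hypothetical rules: skip paths under excluded prefixes and lower the
// priority as pages sit deeper in the tree (1.0, 0.8, 0.6, then 0.4).
// Assumes each `loc` is an absolute URL; new URL() throws otherwise.
function toSitemapEntries(routes, excludedPrefixes = []) {
  return routes
    .filter(({ loc }) => {
      const path = new URL(loc).pathname;
      return !excludedPrefixes.some((prefix) => path.startsWith(prefix));
    })
    .map(({ loc }) => {
      // Count non-empty path segments; a language prefix counts as one level.
      const depth = new URL(loc).pathname.split('/').filter(Boolean).length;
      const priority = Math.max(0.4, 1 - depth * 0.2);
      return {
        url: loc,
        changefreq: 'daily',
        priority: Number(priority.toFixed(1)), // avoid float noise like 0.39999
      };
    });
}
```

The returned objects use the same url, changefreq, and priority fields that the handler passes to sitemap.write, so the output could be written to the stream directly.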

🚀 Have you built a custom sitemap generator for Sitecore XM Cloud? Let me know your approach and any optimizations you've implemented!


About the author

Ashish Kapoor

Global Director of Marketing Technology | Chief Technology Advisor | Architecting the Future with SaaS MACH & Agentic AI | 2x Sitecore Ambassador MVP

  • 21+ years in enterprise product architecture
  • Sitecore MVP Ambassador (2023, 2024)
  • Global digital delivery across 40+ countries
  • 100+ AI agents shipped in production
  • $2M+ MarTech rationalisation savings