The BBC's RSS Feed — Stuart Breckenridge

Due to the incorrect way the BBC’s RSS 2.0 feed handles guids, RSS readers are repeatedly left displaying duplicate articles.

Let’s have a look at why this happens with a sample article from their feed:

<item>
    <title>
        <![CDATA[
            'We fell off the face of the earth': Dad-daughter duo who took on 7,500 miles for TV
        ]]>
    </title>
    <description>
        <![CDATA[
            Molly Clifford and her father are part of this year's line up for the BBC's Race Across the World.
        ]]>
    </description>
    <link>
        https://www.bbc.com/news/articles/c9951jrr18no?at_medium=RSS&at_campaign=rss
    </link>
    <guid isPermaLink="false">https://www.bbc.com/news/articles/c9951jrr18no#3</guid>
    <pubDate>Fri, 03 Apr 2026 05:19:07 GMT</pubDate>
    <media:thumbnail width="240" height="135" url="https://ichef.bbci.co.uk/ace/standard/240/cpsprodpb/bb22/live/0bdf4fa0-2db9-11f1-934f-036468834728.jpg"/>
</item>

Specifically, let’s focus on the guid:

<guid isPermaLink="false">https://www.bbc.com/news/articles/c9951jrr18no#3</guid>

What I’ve seen the BBC doing is incrementing the suffix after the #. Because the RSS 2.0 specification, quoted below, lets aggregators use the guid to decide whether an item is new, RSS readers tend to treat each incremented guid as a new entry:

guid stands for globally unique identifier. It’s a string that uniquely identifies the item. When present, an aggregator may choose to use this string to determine if an item is new.

Gobbler has fetched the above article twice, and the title changed between fetches:

guid	title	content hash
https://www.bbc.com/news/articles/c9951jrr18no#2	’We fell off the face of the earth’: Dad and daughter raced across world but had to keep it secret	a8159e96
https://www.bbc.com/news/articles/c9951jrr18no#3	’We fell off the face of the earth’: Dad-daughter duo who took on 7,500 miles for TV	17cbc6b7
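To make the suffix behaviour concrete, here is a minimal Python sketch of how a reader could separate the stable article URL from the incrementing suffix. It relies on the guid happening to be URL-shaped, which the specification does not guarantee, and the helper name split_guid is my own illustration rather than anything Gobbler or the BBC provides.

```python
from urllib.parse import urldefrag


def split_guid(guid: str) -> tuple[str, str]:
    """Split a BBC-style guid into (article URL, suffix after the #).

    Assumption: the guid is a URL with the revision counter in the fragment.
    """
    article_url, suffix = urldefrag(guid)
    return article_url, suffix


# The two guids fetched for the same article in the table above.
guids = [
    "https://www.bbc.com/news/articles/c9951jrr18no#2",
    "https://www.bbc.com/news/articles/c9951jrr18no#3",
]

# Both guids collapse to a single article URL once the suffix is removed.
article_urls = {split_guid(guid)[0] for guid in guids}
```

A reader that keyed entries on the article URL rather than the raw guid would see one article here instead of two, at the cost of special-casing one publisher's feed.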

Strictly speaking, the RSS 2.0 specification doesn’t prohibit a guid from changing. Additionally, there are no update semantics available (e.g., an updatedDate element) in the 2.0 specification. So, in this scenario with a change of title, an incremented guid is almost justifiable.

However, this isn’t always the case. Let’s look at a different example in the Gobbler database:

guid	title	content hash
https://www.bbc.com/news/articles/cyv1q9gz39do#0	How English-only condolences undid one of Canada’s top CEOs	8845f9d6
https://www.bbc.com/news/articles/cyv1q9gz39do#1	How English-only condolences undid one of Canada’s top CEOs	8845f9d6
https://www.bbc.com/news/articles/cyv1q9gz39do#3	How English-only condolences undid one of Canada’s top CEOs	8845f9d6

Gobbler has fetched this article three times. The article hasn’t changed at all: same title, same content, and same published date¹, all validated by the content_hash. This is simply not justifiable. There is no reason to change the guid if the article hasn’t changed.
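The check behind that claim can be sketched in a few lines of Python. This is an illustration using the rows from the table above, not Gobbler's actual code: the function name redundant_guids is hypothetical, and the hashes are taken as given because how Gobbler computes them isn't covered here.

```python
from collections import defaultdict
from urllib.parse import urldefrag

# (guid, content hash) pairs copied from the table above.
rows = [
    ("https://www.bbc.com/news/articles/cyv1q9gz39do#0", "8845f9d6"),
    ("https://www.bbc.com/news/articles/cyv1q9gz39do#1", "8845f9d6"),
    ("https://www.bbc.com/news/articles/cyv1q9gz39do#3", "8845f9d6"),
]


def redundant_guids(rows):
    """Return guids whose content hash was already seen for the same article URL."""
    seen = defaultdict(set)  # article URL -> content hashes seen so far
    redundant = []
    for guid, content_hash in rows:
        article_url, _suffix = urldefrag(guid)
        if content_hash in seen[article_url]:
            # New guid, identical content: nothing justified the change.
            redundant.append(guid)
        else:
            seen[article_url].add(content_hash)
    return redundant


duplicates = redundant_guids(rows)
```

For these rows, the #1 and #3 guids are flagged: each one announces a "new" item whose content is identical to the #0 entry.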

What could the BBC do differently?

First, don’t change the guid when the article content hasn’t changed. Just don’t.

Second, if the article has been updated, use <atom:updated> in the <item>. The feed declares the Atom namespace and already uses it:

<atom:link href="https://feeds.bbci.co.uk/news/uk/rss.xml" rel="self" type="application/rss+xml"/>

Lastly, and this is a bit of a stretch goal, put the full content of each article in the feed instead of a summary.

Footnotes

¹ I couldn’t fit everything in the table.