How Open RSS feeds work
Open RSS feeds are just RSS feeds for websites that don't already offer them. Once you use them, you'll find that they're not very different from using any other RSS feed on the web, although they do have a few slight differences in how they work.
When you add openrss.org to the beginning of the URL of a web page you're on, Open RSS builds an RSS feed from that page's content and shows the resulting feed to you in the browser. Below is an example of the Star Wars News feed.
It may not look like the glob of XML gibberish that many RSS feeds do, but it is indeed an RSS feed. It's just presented in a more user-friendly view.
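As a rough illustration of the URL step, here is a minimal Python sketch. It assumes that "adding openrss.org to the beginning" produces a URL of the form `https://openrss.org/<host><path>` with the original scheme dropped; that shape, the function name `to_open_rss_url`, and the `example.com` address are assumptions for illustration, not something this page specifies.

```python
from urllib.parse import urlsplit

OPEN_RSS_HOST = "openrss.org"  # host named in the text above


def to_open_rss_url(page_url: str) -> str:
    """Prefix a web page URL with openrss.org (assumed URL shape)."""
    # Accept URLs typed without a scheme, e.g. "example.com/news".
    candidate = page_url if "://" in page_url else "https://" + page_url
    parts = urlsplit(candidate)
    if not parts.netloc:
        raise ValueError(f"not a web page URL: {page_url!r}")
    target = parts.netloc + parts.path
    if parts.query:
        target += "?" + parts.query
    return f"https://{OPEN_RSS_HOST}/{target}"


if __name__ == "__main__":
    print(to_open_rss_url("https://example.com/news"))
```

In practice you would simply type the prefix into the browser's location bar; the function only makes the transformation explicit.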
Populating feeds
The content in each Open RSS feed is populated from some third-party source (usually the web page that matches the URL entered in your browser's location bar). Once you subscribe to the feed, it keeps going back to its content source throughout the day to check for new content. When it detects new content, it adds it to the RSS feed.
Requesting content
When requesting new content for a feed, the Open RSS server makes the request from one of our IP addresses and sends the following unique user agent string to let the source know it's us requesting the content.
Open RSS uses multiple IP addresses when making requests for content, and they can change at any given time. Because of this, we don't publish any public list of IPs.
Storing content
We don't request new content for a feed from its source website every time the feed is accessed in an RSS reader. That could overload the website with too many requests and may cause serious harm to its server, given the large number of feed requests we get every day. The version of the feed stored in our database is shown instead.
This limits unnecessary requests when there is likely no new content available, and serving the stored version of a feed when its content hasn't changed makes the feed respond much faster. Only after the frequency threshold we set has passed will the feed request new content from the source website. We then add the new content to the feed in our database and show the updated feed the next time it is accessed in an RSS reader. This process repeats multiple times a day.
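The storing behavior described above resembles a time-based cache. The following Python sketch shows the general idea under simplifying assumptions: the class and parameter names (`FeedCache`, `min_interval`, `fetch`) are hypothetical, the store is an in-memory dictionary rather than a database, and the refresh happens synchronously on access.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StoredFeed:
    content: str
    fetched_at: float  # seconds since the epoch


class FeedCache:
    """Serve the stored copy of a feed; refetch only after min_interval seconds."""

    def __init__(self, fetch: Callable[[str], str], min_interval: float):
        self._fetch = fetch                # hypothetical function that contacts the source
        self._min_interval = min_interval  # frequency threshold, in seconds
        self._store: dict[str, StoredFeed] = {}

    def get(self, url: str, now: float) -> str:
        stored = self._store.get(url)
        # Contact the source only if nothing is stored or the threshold has passed.
        if stored is None or now - stored.fetched_at >= self._min_interval:
            stored = StoredFeed(content=self._fetch(url), fetched_at=now)
            self._store[url] = stored
        return stored.content
```

With a 20-minute threshold (`min_interval=1200`), two reader requests ten minutes apart would produce only one request to the source website.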
Storage limits
The more data we store, the more it costs. As a nonprofit organization, we must keep our costs low. So we don't store feed content indefinitely. Generally, we delete feed data after a few months or whenever we deem it necessary.
Cleanup
The content in Open RSS feeds is taken from many different sources across the web. The format and structure of that content vary greatly between sources, which can cause issues if it is used directly in an RSS reader. Our systems therefore perform a few sanitization steps before rendering the content into an RSS feed:
- Excessive, irrelevant content (e.g. related content, sponsored messages, etc.) is filtered out
- Privacy-invasive content that can track you, including beacons, links, and embedded media, is removed
- Ads are removed
- Inaccurate titles and descriptions are improved
- Missing or incomplete summaries and content are filled in with the full text
- Low-resolution images are replaced with high-res ones and misaligned images are readjusted for optimal viewing
- All unnecessary code and styling is stripped
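To make a couple of these steps concrete, here is a small Python sketch that uses only the standard library. It covers just two of the items above (stripping code and styling, and dropping a likely tracking beacon); the 1x1-image heuristic and all names are assumptions for illustration and are not a description of Open RSS's actual rules.

```python
from html import escape
from html.parser import HTMLParser

DROP_WITH_CONTENT = {"script", "style", "iframe"}  # removed together with their contents
DROP_ATTRS = {"style", "class"}                    # styling attributes stripped from kept tags
VOID_TAGS = {"img", "br", "hr"}                    # tags that never get a closing tag


class Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth += 1
            return
        if self._skip_depth:
            return
        attr_map = dict(attrs)
        # Assumed heuristic: a 1x1 image is probably a tracking beacon.
        if tag == "img" and attr_map.get("width") == "1" and attr_map.get("height") == "1":
            return
        kept = ""
        for name, value in attrs:
            if name in DROP_ATTRS or name.startswith("on"):
                continue  # drop styling and inline event handlers
            kept += f' {name}="{escape(value or "")}"'
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
            return
        if not self._skip_depth and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html: str) -> str:
    parser = Sanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

A production sanitizer would more likely rely on a maintained allow-list library; the point here is only the shape of the filtering.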
Feed content
Content sources
The content in Open RSS feeds comes from many different third-party sources, including various APIs, websites, and even other RSS feeds, though mostly websites. The content obtained from these sources will always be content that is, or at some point has been, publicly accessible. The organization doesn't support showing content in RSS feeds that is intentionally private or behind a paywall.
When determining what source to use for a feed's content, we always use the source that gives the fastest response time with the least effort. For instance, if a user attempts to generate an Open RSS feed for a web page and the page already has an associated RSS feed, we'll use that RSS feed as the source and just improve its content for use in the final Open RSS feed. If no RSS feed for the web page exists, we'll use its available API. If there's no API, we'll resort to scraping the web page, and so on.
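The fallback order described above (existing feed, then API, then scraping) can be sketched as a simple priority check. The types and field names below (`PageInfo`, `feed_url`, `api_url`) are hypothetical, and how a feed or API is actually discovered is outside the scope of this sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SourceKind(Enum):
    EXISTING_FEED = "existing feed"
    API = "api"
    SCRAPE = "scrape"


@dataclass
class PageInfo:
    url: str
    feed_url: Optional[str] = None  # feed already associated with the page, if any
    api_url: Optional[str] = None   # API endpoint for the page's content, if any


def choose_source(page: PageInfo) -> tuple[SourceKind, str]:
    """Pick the cheapest available source, in the order described in the text."""
    if page.feed_url:
        return SourceKind.EXISTING_FEED, page.feed_url
    if page.api_url:
        return SourceKind.API, page.api_url
    return SourceKind.SCRAPE, page.url  # last resort: scrape the page itself
```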
XML formats
When subscribed to in an RSS reader, Open RSS feeds will be in either RSS 2.0 or Atom format, depending on the source of their content. To ensure feeds are compatible with the largest number of RSS readers, Open RSS feeds are served in the RSS 2.0 format if the source is a website or web page, or if the feed is derived from another RSS 2.0 feed. However, if the feed is derived from a feed that already uses the Atom format, it will be served as an Atom feed as well.
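Expressed as code, the format rule reduces to a single branch. This is only a restatement of the paragraph above in Python; the `SourceType` enum and `output_format` function are hypothetical names.

```python
from enum import Enum


class SourceType(Enum):
    WEB_PAGE = "website or web page"
    RSS2_FEED = "RSS 2.0 feed"
    ATOM_FEED = "Atom feed"


def output_format(source: SourceType) -> str:
    """Atom sources stay Atom; everything else is served as RSS 2.0."""
    return "Atom" if source is SourceType.ATOM_FEED else "RSS 2.0"
```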
Content limitations
The number of items shown in a feed depends on how frequently the feed has new content, but most feeds only show the most recent 4-10 items at a time. We don't store unlimited feed content, and all content is purged after a few months. To access feed content beyond what is shown in an Open RSS feed, you'll have to use an RSS reader that can store content indefinitely.
Update frequency
When an RSS reader is subscribed to an Open RSS feed, the feed needs to repeatedly request content from its original source to ensure it always has the most up-to-date content. However, requesting new content doesn't happen every second of the day. There's a delay in requesting new content from the website to avoid overloading it with too many requests or making unnecessary requests for content that likely hasn't changed. This means that even though the website that powers the feed may have new content available, there may be a delay in that new content showing up in the feed.
The frequency at which new content for a feed is requested from its source website is determined automatically based on the average publish time of items in the feed's history. For instance, if the source only publishes new content every hour, that's likely how often new content will be requested and shown in the feed. However, there are limits: new content is requested at least once every 6 hours and no more often than once every 20 minutes.
Here are examples of three different Open RSS feeds that show how the average publish time is calculated from the last four items they published to determine how frequently new content is requested from their source.
Feed | First item published | Second item published | Third item published | Fourth item published | Average publish time | Requests new content |
---|---|---|---|---|---|---|
Feed 1 | 5/10/2024 5:00 PM | 5/10/2024 5:25 PM | 5/10/2024 5:50 PM | 5/10/2024 6:15 PM | 25 minutes | Every 25 minutes |
Feed 2 | 12/1/2023 5:00 PM | 12/1/2023 5:01 PM | 12/1/2023 5:02 PM | 12/1/2023 5:03 PM | 1 minute | Every 20 minutes |
Feed 3 | 2/15/2022 1:00 AM | 2/25/2022 1:00 AM | 2/25/2022 8:00 AM | 2/25/2022 4:00 PM | About 3.5 days (85 hours) | Every 6 hours |
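The rule described in this section can be sketched in Python as follows. It assumes that "average publish time" means the mean gap between consecutive items and that the result is clamped to the 20-minute and 6-hour limits stated above; the function names are hypothetical and the real calculation may differ in detail.

```python
from datetime import datetime, timedelta

MIN_INTERVAL = timedelta(minutes=20)  # never request more often than this
MAX_INTERVAL = timedelta(hours=6)     # always request at least this often


def average_publish_gap(published: list[datetime]) -> timedelta:
    """Mean gap between consecutive items, i.e. (last - first) / (n - 1)."""
    if len(published) < 2:
        raise ValueError("need at least two items to compute an average")
    ordered = sorted(published)
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def request_interval(published: list[datetime]) -> timedelta:
    """Average gap, clamped to the range [MIN_INTERVAL, MAX_INTERVAL]."""
    return min(MAX_INTERVAL, max(MIN_INTERVAL, average_publish_gap(published)))


if __name__ == "__main__":
    hourly = [datetime(2024, 1, 1, hour) for hour in (9, 10, 11, 12)]
    print(request_interval(hourly))  # a feed publishing hourly is polled hourly
```

A feed that publishes every minute is clamped up to 20 minutes, and one that publishes every few days is clamped down to 6 hours.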
Although most Open RSS feeds use the approach above, there are exceptional cases where we set aside the average calculation and manually adjust a feed's update frequency, for example if there aren't enough items in the feed to calculate an accurate average, or if a feed's source informs us that we're requesting content too often.
Open RSS feeds that are unused will no longer be updated with new content. Each feed needs to be actively used in an RSS reader that requests it on a frequent and consistent basis, because those requests are what trigger a request for content from the source.
Messaging
If we need to communicate some information about an Open RSS feed, it'll be shown in a banner on the feed preview when accessing the feed from a web browser. Each banner has a different color depending on the type of communication.
The information in each banner is dynamic, which means a different banner can be added, removed, or changed depending on when the feed is viewed or the timing of a situation that warrants it to be shown.
Type | Description |
---|---|
Info | Helpful information to note |
Warning | Something to be aware of, but doesn't really impact your use of the feed |
Error | A high-priority issue that will likely impact direct usage of the feed |
Source code
While we love open source, we don't intend to open-source the code while the organization is in existence. This is to protect against companies that might wish to retaliate against us for having made their content available in RSS feeds. However, all Open RSS code will be open-sourced and released to the public under an appropriate license if the organization were to cease to exist, per our Bylaws.
Last Updated: 9 months ago