Requesting feed content from the wrong location (critical)
It's common for feed applications to pull content from an unconventional location on a website to generate a feed when the website doesn't provide one. In this case, however, the site already tells the app where its feeds are located, yet the app is still attempting to extract content from other areas of the website that clearly aren't feeds.
Why it's a problem
When a feed app doesn't request content from the right location, or tries to get content from areas of a site not designated for feed consumption, the activity looks suspicious. This is likely to cause website owners to block the app from accessing their content.
What it means for users
Feed content obtained from the wrong location can be unpredictable and may not display correctly in the app. If websites block this app because of the behavior, which is likely, feeds won't work at all.
How to fix it
When the app visits a webpage to retrieve feed content, it should:
- Get the website's feed location from the autodiscovery link element of the page and use that location for all subsequent feed requests
- Update any user subscription in the application that points to the incorrect location so that it uses the correct location, without making any further requests to the website for this information
- Ensure the request isn't made to a location on the website that has been explicitly disallowed in its robots.txt file
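The steps above can be sketched roughly as follows. This is a minimal illustration, not the app's actual implementation: it assumes the page HTML and the robots.txt text have already been fetched, and the names `discover_feeds`, `allowed_by_robots`, `FEED_TYPES`, and the `ExampleFeedApp` user agent are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

# MIME types commonly used by feed autodiscovery links (assumed set).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects href values of <link rel="alternate"> feed elements."""

    def __init__(self):
        super().__init__()
        self.feed_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = {name: (value or "") for name, value in attrs}
        rels = a.get("rel", "").lower().split()
        mime = a.get("type", "").strip().lower()
        if "alternate" in rels and mime in FEED_TYPES and a.get("href"):
            self.feed_hrefs.append(a["href"])


def discover_feeds(page_url, html):
    """Return absolute feed URLs advertised by the page, in document order."""
    parser = FeedLinkParser()
    parser.feed(html)
    return [urljoin(page_url, href) for href in parser.feed_hrefs]


def allowed_by_robots(robots_txt, user_agent, url):
    """Check a URL against already-fetched robots.txt text."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = '<head><link rel="alternate" type="application/rss+xml" href="/feed.xml"></head>'
    robots = "User-agent: *\nDisallow: /private/\n"
    for feed_url in discover_feeds("https://example.com/blog/", page):
        if allowed_by_robots(robots, "ExampleFeedApp", feed_url):
            # Store feed_url against the subscription and reuse it for
            # later requests instead of re-scraping the page.
            print(feed_url)
```

Storing the discovered URL against the user's subscription means the page only needs to be parsed once; later polls go straight to the feed.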
Too many requests to non-existent content (critical)
A large number of requests are being made by this application for feed content across websites where no feed content exists.
Why it's a problem
It causes unnecessary strain on websites and can negatively impact their performance. Because bad bots often behave in this same way, websites will likely block the app for being associated with this activity, regardless of the application's intent.
What it means for users
When a website blocks this application for this behavior, feeds will stall or stop working entirely.
How to fix it
The application should reduce the number of requests it makes to pages on a website that don't exist, and consider using the website's sitemap or robots.txt file, if available, to learn which locations exist and may be requested.
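One way to approach this, sketched under assumptions: read the sitemap to learn which URLs the site says exist, and remember URLs that returned 404 or 410 so they are not retried on every poll. The names `urls_from_sitemap` and `MissingUrlCache` and the one-week retry interval are hypothetical choices for illustration, and the sitemap XML is assumed to have been fetched already.

```python
import time
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text):
    """Return the set of URLs listed in <loc> elements of a sitemap."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


class MissingUrlCache:
    """Remembers URLs that returned 404/410 and suppresses early retries."""

    def __init__(self, retry_after_seconds=7 * 24 * 3600, clock=time.time):
        self.retry_after = retry_after_seconds  # assumed interval
        self.clock = clock
        self._missing = {}  # url -> time the "missing" status was recorded

    def record_status(self, url, status):
        if status in (404, 410):
            self._missing[url] = self.clock()
        else:
            # The URL responded, so forget any earlier "missing" record.
            self._missing.pop(url, None)

    def should_request(self, url):
        seen = self._missing.get(url)
        return seen is None or self.clock() - seen >= self.retry_after


if __name__ == "__main__":
    cache = MissingUrlCache()
    url = "https://example.com/rss"
    if cache.should_request(url):
        status = 404  # stand-in for the status of a real HTTP request
        cache.record_status(url, status)
    print(cache.should_request(url))
```

Checking a candidate URL against the sitemap set before requesting it, and against the cache afterwards, keeps speculative requests to a minimum.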