Log analysis and crawl budget
What data do Google's robots collect?
The information collected by Google's bots includes:
- the request date
- the URL of the crawled page
- the response code of the page (200, 301, 404, etc.)
- the user agent
- the IP address
- the referer (the source URL)
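As an illustration, here is a minimal sketch of extracting these fields from a single log line, assuming the common "combined" log format used by Apache and Nginx (the sample line is hypothetical; adjust the pattern to your server's actual format):

```python
import re

# Regex for the Apache/Nginx "combined" log format (an assumption;
# adapt it if your server logs use a different layout).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_log_line(line):
    """Return the request date, URL, status code, IP, referer and user agent."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# Hypothetical log line for a Googlebot visit:
sample = ('66.249.66.1 - - [10/Mar/2024:06:25:24 +0000] '
          '"GET /products/shoes HTTP/1.1" 200 5316 "-" '
          '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"')

entry = parse_log_line(sample)
```

Each parsed entry then holds exactly the fields listed above, ready to be aggregated across millions of lines.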
All of this data is recorded in log files, and it is partly through analyzing these logs that we can determine and better understand the behavior of Google's bots on a website. This analysis tells us which pages Google crawls the most, which pages are not being crawled at all, how often it visits the site, and how this affects organic (SEO) traffic.
The underlying question is whether Google perceives the site the same way we do, and whether it puts the site's main pages forward, passing popularity (or "Google Juice") along to them.
How can you exploit server logs?
To carry out a log analysis, once the server logs have been retrieved, you need an SEO platform that can decode the log files and explain how Google browses the site in question. Several platforms can process server logs, such as Botify, Oncrawl or Logs Data Platform (OVH).
Once you have chosen a platform and connected your log files to it, all that remains is to understand, translate and analyze what Google sees, or at least what its robots explore.
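Even without a platform, the first step such tools perform can be sketched in a few lines: isolating the bot's hits from all other traffic. A minimal sketch, assuming entries shaped like the parsed log fields above (note that user agents can be spoofed, so production tools also verify the bot's identity, for example with a reverse DNS lookup on the IP):

```python
from collections import Counter

def googlebot_hits(entries):
    """Keep only hits whose user agent identifies as Googlebot.
    Caveat: the user agent alone can be spoofed; confirm the IP
    really belongs to Google before trusting the data."""
    return [e for e in entries if 'Googlebot' in e.get('user_agent', '')]

# Hypothetical parsed entries (URL + user agent only, for brevity):
entries = [
    {'url': '/a', 'user_agent': 'Mozilla/5.0 (compatible; Googlebot/2.1)'},
    {'url': '/b', 'user_agent': 'Mozilla/5.0 (Windows NT 10.0) Chrome/120.0'},
    {'url': '/a', 'user_agent': 'Mozilla/5.0 (compatible; Googlebot/2.1)'},
]

hits = googlebot_hits(entries)
crawled_urls = Counter(e['url'] for e in hits)  # how often each URL is crawled
```

Counting hits per URL in this way is the raw material for every metric described below.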
What should you look for in a log analysis?
Google allocates a certain amount of time to each site for its robots to explore pages properly; this allocation is called the crawl budget.
The purpose of log analysis is to optimize the crawl budget so that all pages of the site are crawled by the robots and therefore indexed by search engines.
To do this, we study the various data points that can be extracted from the server log files. Log analysis gives us:
- Crawl ratio: the number of pages crawled by Google's bots vs. the number of uncrawled pages
- Global crawl frequency: the number of times pages are visited by bots per day
- Inactive pages (receiving no traffic) vs. active pages
- The number of SEO visits to orphan pages (pages not linked within the site structure) vs. SEO visits to pages in the structure
- The number of newly crawled pages
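The first two metrics can be sketched as follows. The inputs are hypothetical: `site_urls` would come from your own crawl of the site's structure, and `bot_hits` from the Googlebot entries extracted from the server logs:

```python
from collections import Counter

# Hypothetical inputs: the full URL inventory from your own site crawl,
# and Googlebot hits (URL + day) parsed from the server logs.
site_urls = {'/', '/products', '/products/shoes', '/about', '/blog/old-post'}
bot_hits = [
    {'url': '/', 'day': '2024-03-10'},
    {'url': '/products', 'day': '2024-03-10'},
    {'url': '/products/shoes', 'day': '2024-03-10'},
    {'url': '/', 'day': '2024-03-11'},
    {'url': '/products', 'day': '2024-03-11'},
]

crawled = {hit['url'] for hit in bot_hits}
crawl_ratio = len(crawled & site_urls) / len(site_urls)  # share of the site Google crawls
uncrawled = site_urls - crawled                          # pages the bots never visit
hits_per_day = Counter(hit['day'] for hit in bot_hits)   # global crawl frequency
```

In this toy example Google crawls 3 of the 5 known pages (a crawl ratio of 0.6), and the uncrawled set immediately points at the pages where crawl budget is being lost.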
Dialekta can therefore assist you in analyzing your site's logs, helping you better understand how Google sees your site and which pages are most popular in its eyes.