Check the page as the robot sees it. How to upgrade to the new version of Search Console



Robot crawlers are essentially stand-alone browser-like programs. They visit a site, scan the contents of its pages, make a text copy, and send it to the search engine's database. How your site is indexed in a search engine depends on what the crawlers see on it. There are also more specialized spider programs.

  • “Mirrorers” detect duplicate (mirror) resources.
  • “Woodpeckers” check whether the site is accessible.
  • Robots for reading frequently updated resources, such as news sites. There are also programs that scan images and icons, determine how often to revisit a site, and measure other characteristics.

What does the robot see on the site?

  1. Resource text.
  2. Internal and external links.
  3. HTML code of the page.
  4. Server response.
  5. The robots.txt file. This is the main document for working with the spider: in it you can draw the robot's attention to some sections and, on the contrary, close others from viewing. The crawler also consults this file on its return visits to the site (a small sketch of checking it programmatically follows this list).
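As a quick illustration (a sketch, not part of the original article), here is how you can check what robots.txt allows a given crawler to fetch, using only Python's standard library. The domain and paths are placeholders:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical site used only for illustration.
robots_url = "https://example.com/robots.txt"

parser = RobotFileParser()
parser.set_url(robots_url)
parser.read()  # downloads and parses the robots.txt file

# Ask whether Googlebot may crawl a couple of example paths.
for path in ("https://example.com/", "https://example.com/admin/"):
    allowed = parser.can_fetch("Googlebot", path)
    print(f"{path} -> {'allowed' if allowed else 'disallowed'} for Googlebot")
```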

In what form does the robot see the site page?

There are several ways to look at a resource through the eyes of a crawler. If you are the website owner, Google built Search Console for you.

  • Add the resource to the service (read how this can be done).
  • After that, select the “Fetch as Google” tool.
  • Click “Fetch and Render”. After scanning, you will see the result.

This method displays the most complete and accurate picture of how the robot sees the site. If you are not the owner of the resource, then there are other options for you.

The simplest is to open the page's saved (cached) copy in the search engine.


Now let's assume the resource has not yet been indexed and you cannot find it in a search engine. In this case, to find out how the robot sees the site, follow these steps.

  • Install Mozilla Firefox.
  • Add a plugin to this browser.
  • A toolbar will appear below the URL field, in which we:
    under “Cookies”, select “Disable Cookies”;
    under “Disable”, click “Disable JavaScript” and then “Disable ALL JavaScript”.
  • Be sure to reload the page.
  • In the same toolbar:
    under “CSS”, click “Disable Styles” and then “Disable All Styles”;
    under “Images”, check the “Display ALT Attributes” and “Disable ALL Images” boxes. Done! (A scripted alternative is sketched right after this list.)
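If you prefer scripting to browser plugins, a rough equivalent is simply to download the raw HTML the way a crawler first receives it, before any JavaScript runs or CSS is applied. The sketch below uses only Python's standard library; the URL and the User-Agent string are illustrative assumptions, not real crawler values:

```python
import urllib.request

# Illustrative URL; replace with the page you want to inspect.
url = "https://example.com/"

# Present ourselves with a bot-like User-Agent; real crawler strings differ.
request = urllib.request.Request(url, headers={"User-Agent": "MyTestBot/1.0"})

with urllib.request.urlopen(request, timeout=10) as response:
    raw_html = response.read().decode("utf-8", errors="replace")

# This is the markup a spider sees before JavaScript or CSS is processed.
print(raw_html[:1000])
```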

Why do you need to check how the robot sees the site?

When the search engine sees one set of information on your site and the user sees another, the resource ends up in the wrong search results. The user will quickly leave without finding the information they were looking for, and if enough visitors do this, your site will sink to the very bottom of the results.

You need to check at least 15-20 pages of the site and try to cover all types of pages.

Some cunning people deliberately pull off scams of this kind: instead of a website about soft toys, for example, they promote some “Kukan” casino. Sooner or later the search engine detects this (it always does) and puts the resource under filters.


The world has gone crazy over robotics news: almost every day there are reports about the beginning of the robot revolution. But how justified are all this hype, excitement, and occasional fear? Is the robot revolution really starting?

In response, we can note that in some areas of our lives we are likely to see new robots appear in the near future. But in reality, we shouldn't expect dozens of robots to take to the streets or roam our offices any time soon.

And one of the main reasons for this is that robots do not have the ability to truly see the world. But before we talk about how robots in the future will be able to see the world, we first need to understand what vision actually involves.

How do we see?

Most people have two eyes and we use them to collect light that reflects off objects around us. Our eyes convert this light into electrical signals, which are transmitted along the optic nerves and immediately processed by our brain.

Our brain somehow determines what is around us based on all these electrical impulses and our own experience. All of this creates a picture of the world, allows us to navigate, helps us pick things up, lets us recognize each other's faces, and supports a million other things we take for granted. Everything from collecting light in our eyes to understanding the world around us is what provides the ability to see.

Researchers estimate that up to 50% of our brain volume is used to service vision. Almost all animals have eyes and can partially see. At the same time, most animals and insects have a much simpler brain than humans. But it works well.

Thus, some forms of vision can be achieved without the massive, computer-level power of the mammalian brain. The ability to see is clearly dictated by its essential usefulness in the process of evolution.

Robot vision

So it's no surprise that many robotics researchers are predicting that if a robot can see, we'll likely actually see a boom in robotics development. And robots may finally become real human assistants, which is what many people so want.

How do we teach robots to see? The first part of the answer to this question is very simple. We use a video camera, just like the one in your smartphone, to capture a constant stream of images. Robot video camera technology itself is a serious subject of research. But for now let's just imagine a standard video camera. We feed these images into the computer and then there are different options.

Since the 1970s, developers have been improving computer vision systems for robots and studying the characteristic features of images. These can be lines or points of interest such as corners or certain textures. Programmers create algorithms to find these signatures and track them frame by frame in the video stream.

This significantly reduces the amount of data from millions of pixels in an image to several hundreds or thousands of characteristic fragments.
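As a rough illustration of this classic approach (my sketch, not from the original article), the snippet below uses OpenCV's corner detector to reduce an image to a few hundred characteristic points. The file name is a placeholder and the parameter values are arbitrary:

```python
import cv2

# Placeholder file name; any photo or video frame will do.
frame = cv2.imread("frame.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Shi-Tomasi corner detection: keep at most 500 "interesting" points.
corners = cv2.goodFeaturesToTrack(gray, maxCorners=500,
                                  qualityLevel=0.01, minDistance=7)
found = 0 if corners is None else len(corners)
print(f"{gray.size} pixels reduced to {found} characteristic points")

# In a video stream these points would then be tracked frame by frame
# (for example with cv2.calcOpticalFlowPyrLK) instead of reprocessing
# every pixel of every frame.
```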

In the recent past, when computing power was more limited, this was very important. Next, engineers think about what the robot is likely to see and what it should do. They are creating software that will simply recognize patterns to help the robot understand what is around it.

Environment

The software can only create a basic picture of the environment in which the robot operates, or it can attempt to match the found characteristic features with a library of primitives from the built-in software.

Essentially, robots are programmed by humans to see the things that humans think the robot needs to see. There are many successful implementations of such computer vision systems, but in practice there are no robots today that can navigate their environment using machine vision alone.

Such systems are not yet reliable enough to keep a robot from falling or colliding while it moves. Self-driving cars, which have been the talk of the town lately, use lasers or radar in addition to machine vision.

In the last five to ten years, research and development of a new generation of machine vision systems has begun. These efforts have produced systems that are not programmed in the old way but instead learn from what they see. Vision systems for robots have been designed by analogy with how scientists believe vision works in animals: they use the concept of neural layers, as in animal brains. Developers create the structure of the system but do not hard-code the algorithm it operates on; in other words, they leave it to the robot to improve itself.

This method is known as machine learning. Such technologies are now beginning to be implemented due to the fact that serious computing power has become available at a reasonable cost. Investments in these technologies are occurring at an accelerated pace.
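To make the idea of designing the structure but not the algorithm concrete, here is a minimal sketch (my illustration, not any specific research system) of stacked neural layers in PyTorch. The layer sizes are arbitrary; the weights are what the system learns from examples rather than what a programmer specifies:

```python
import torch
from torch import nn

# The developer chooses the structure: a few convolutional "vision" layers
# followed by a classifier. The behaviour is not hand-coded; it emerges
# from the weights, which are adjusted during training on labelled images.
vision_net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 10),  # e.g. 10 object classes, for 64x64 input
)

dummy_image = torch.randn(1, 3, 64, 64)   # one fake 64x64 RGB frame
print(vision_net(dummy_image).shape)      # -> torch.Size([1, 10])
```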

Collective mind

The importance of robot learning also lies in the fact that robots can easily share their knowledge. Each robot will not have to learn everything from scratch, like a newborn animal: a new robot can act on the experience of other robots and take their accumulated knowledge into account.

Equally important, robots that share experiences can also learn together. For example, each of a thousand robots can observe different cats and share this data with each other via the Internet. This way they can learn to classify all the cats together. This is an example of distributed learning.

The fact that robots in the future will be able to learn collaboratively and in a distributed manner has profound implications and, while frightening to some, captures the imagination of others.

Real robot revolution

Today there are many applications for robots that can see. It is not difficult to find areas in our lives where such robots can help.

The first uses of robots that can see are likely to be in industries that are experiencing labor shortages, such as agriculture, or are inherently unattractive to humans and could be dangerous. For example, search work after natural disasters, evacuating people from dangerous areas, or working in confined and hard-to-reach spaces.

People also find it hard to stay attentive during long periods of observation, a task that a robot able to see could take over. Our future robotic companions at home will be much more useful if they can see us.

And in the operating room, apparently, we will soon see robots assisting surgeons. A robot's perfect vision and super-precise clamps and arms will allow surgeons to focus on the main task: making decisions.

Upgrade guide for legacy users

We are developing a new version of Search Console, which will eventually replace the old service. In this guide, we will cover the main differences between the old and new versions.

General changes

In the new version of Search Console we have implemented the following improvements:

  • Search traffic data can be viewed for 16 months instead of the previous three.
  • Search Console now provides detailed information about specific pages. This information includes canonical URLs, indexing status, degree of mobile optimization, etc.
  • The new version includes tools that allow you to monitor the crawling of your web pages, fix related errors and submit requests for re-indexing.
  • The updated service offers both completely new tools and reports, as well as improved old ones. All of them are described below.
  • The service can be used on mobile devices.

Comparison of tools and reports

We're constantly working to improve various Search Console tools and reports, and you can already use many of them in the updated version of this service. Below, the new versions of reports and tools are compared with the old ones. The list will be updated.

Old report → analogue in the new version of Search Console → what has changed:

  • Search query analysis → Performance report. The new report provides data for 16 months, and it has become more convenient to work with.
  • Helpful hints → rich results status reports. The new reports provide detailed information that helps troubleshoot errors and makes it easy to request rescans.
  • Links to your site and Internal links → Links. We have merged the two old reports into one new one and improved the accuracy of link counting.
  • Indexing status → indexing report. The new report contains all the data from the old one, plus detailed information about each URL's status in the Google index.
  • Sitemap report → Sitemap report. The data in the report remains the same, but we have improved its design. The old report supported testing a sitemap without submitting it; the new one does not.
  • Accelerated Mobile Pages (AMP) → AMP page status report. The new report adds new error types for which you can view information and also lets you send a rescan request.
  • Manual actions → manual actions. The new version of the report provides a history of manual actions taken, including submitted review requests and review results.
  • Fetch as Google → URL Inspection tool. In the URL Inspection tool you can view information about the indexed version of a URL and the version available online, and submit a crawl request. It adds information about canonical URLs, noindex and nocrawl blocks, and whether the URL is present in the Google index.
  • Mobile usability → mobile usability. The data in the report remains the same, but working with it has become more convenient. We have also added the ability to request a rescan of a page after mobile usability issues have been fixed.
  • Crawl errors → indexing report and URL Inspection tool (details below).

Site-level crawl errors are shown in the new indexing report. To find errors at the individual page level, use the new URL inspection tool. New reports help you prioritize issues and group pages with similar issues to identify common causes.

The old report showed all errors for the last three months, including irrelevant, temporary and insignificant ones. A new report highlights issues important to Google that have been uncovered over the past month. You will only see issues that could cause the page to be removed from the index or prevent it from being indexed.

Issues are shown based on priority. For example, 404 errors are only marked as errors if you requested the page to be indexed through a sitemap or other method.

With these changes, you'll be able to focus more on the issues that affect your site's position in Google's index, rather than having to deal with a list of every error Googlebot has ever found on your site.

In the new indexing report, the following errors have been converted or are no longer shown:

URL Errors - For Desktop Users

Old error type → analogue in the new version:

  • Server error → in the indexing report, all server errors are marked Server error (5xx).
  • Soft 404 → one of the following:
    Error: the submitted URL seems to be a soft 404;
    Excluded: soft 404.
  • Access denied → one of the following, depending on whether you requested processing for this URL:
    Error: the submitted URL returns a 401 (unauthorized request) error;
    Excluded: the page was not indexed due to a 401 (unauthorized request) error.
  • Not found → one of the following, depending on whether you requested processing for this URL:
    Error: the submitted URL was not found (404);
    Excluded: not found (404).
  • Other → marked in the indexing report as a crawl error.

URL Errors – For Smartphone Users

Currently, errors occurring on smartphones are not shown, but we hope to include them in the report in the future.

Site errors

In the new version of Search Console, site errors are not shown.

  • Security issues report → new security issues report. The new report retains much of the functionality of the old one and adds a history of issues on the site.
  • Structured data → Rich Results Test and rich results status reports. To check individual URLs, use the Rich Results Test or the URL Inspection tool. Site-wide information is available in the rich results status reports for your site. Not all rich result data types are covered yet, but the number of reports is growing.
  • HTML improvements → no similar report in the new version. To create informative page titles and descriptions, follow our guidelines.
  • Blocked resources → URL Inspection tool. There is no way to view blocked resources across the entire site, but the URL Inspection tool shows blocked resources for each individual page.
  • Android apps → starting March 2019, Search Console no longer supports Android apps.
  • Property sets → starting March 2019, Search Console no longer supports property sets.

There is no need to submit the same information twice: data and requests made in one version of Search Console are automatically reflected in the other. For example, if you submitted a review request or a sitemap in the old Search Console, you do not need to submit it again in the new one.

New ways to perform common tasks

The new version of Search Console performs some legacy operations differently. The main changes are listed below.

Features not currently supported

The following features are not yet implemented in the new version of Search Console. To use them, return to the previous interface.

  • Scanning statistics (the number of pages scanned per day, their loading time, the number of kilobytes downloaded per day).
  • Checking the robots.txt file.
  • Managing URL parameters in Google Search.
  • Marker tool.
  • Reading and managing messages.
  • "Change Address" tool.
  • Specifying the primary domain.
  • Linking a Search Console property to a Google Analytics property.
  • Disavowing links.
  • Removing obsolete data from the index.



Promoting your website should include optimizing your pages to attract the attention of search engine spiders. Before you start creating a search engine friendly website, you need to know how bots view your site.

Search spiders are not actually spiders, but small programs that are sent to analyze your site after they learn the URL of one of your pages. They can also reach your site through links to your website left on other Internet resources.

As soon as the robot reaches your website, it will immediately begin indexing pages by reading the contents of the BODY tag. It also fully reads all HTML tags and links to other sites.

Search engines then copy the site's content into a main database for indexing. This process in total can take up to three months.
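Here is a toy sketch of this crawling step (an illustration of the idea, not how any real search engine's spider is implemented): fetch a page, collect the links it contains, and keep the markup for later indexing. Only Python's standard library is used, and the starting URL is a placeholder:

```python
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collects href targets from <a> tags while the page is parsed."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

start_url = "https://example.com/"  # placeholder seed URL
with urllib.request.urlopen(start_url, timeout=10) as response:
    html = response.read().decode("utf-8", errors="replace")

collector = LinkCollector(start_url)
collector.feed(html)
print(f"Found {len(collector.links)} links to visit next")
```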

Search engine optimization is not such an easy matter. You must create a site that is spider friendly. Bots pay no attention to Flash web design; they just want information. If you looked at a website through the eyes of a search robot, it would look pretty crude.

It is even more interesting to look at your competitors' websites through the eyes of a spider: not only competitors in your field, but simply popular resources that may not need any search engine optimization at all. In general, it is very interesting to see how different sites look to robots.

Text only

Search robots see your site much the way text browsers do. They love text and ignore the information contained in images. A spider can learn about an image only if you remember to add an ALT attribute with a description. This deeply disappoints web designers who build elaborate sites full of beautiful pictures and very little text.

In fact, search engines simply love any text. They can only read HTML code. If your page is full of forms, JavaScript, or anything else that might prevent a search engine from reading the HTML, the spider will simply ignore it.
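To see roughly what "text only" means in practice, the sketch below strips a page down to its visible text and the ALT descriptions of its images, skipping scripts and styles. This is my simplification of what a spider extracts, and the HTML snippet is invented for the example:

```python
from html.parser import HTMLParser

class TextAndAltExtractor(HTMLParser):
    """Keeps visible text and image ALT attributes, skips scripts and styles."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(f"[image: {alt}]")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

# Placeholder markup standing in for a real page.
sample = ('<h1>Soft toys</h1><script>var x = 1;</script>'
          '<img src="bear.png" alt="teddy bear"><p>Plush bears for kids.</p>')
extractor = TextAndAltExtractor()
extractor.feed(sample)
print(" ".join(extractor.parts))
# -> Soft toys [image: teddy bear] Plush bears for kids.
```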

What search robots want to see

When a search engine crawls your page, it looks for a number of important things. Having archived your site, the search robot will begin to rank it in accordance with its algorithm.

Search engines guard and frequently change their algorithms so that spammers cannot adapt to them. It is very difficult to design a website that will rank high in every search engine, but you can gain some advantage by including the following elements in all your web pages:

  • Keywords
  • META tags
  • Titles
  • Links
  • Emphasized (highlighted) text

Read like a search engine

After you have developed a website, all that remains is to maintain it and promote it in search engines. But looking at the site only in a browser is not the best approach: it is hard to evaluate your own work impartially.

It is much better to look at your creation through the eyes of a search simulator. In this case, you will get much more information about the pages and how the spider sees them.

We have created a search engine simulator that is not bad, in our humble opinion. It lets you see a web page as a search spider sees it, and it also shows the number of keywords, internal and outbound links, and so on.
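For the keyword-counting part, a few lines of Python are enough to approximate what such a simulator reports. This is a rough sketch of the idea, not the tool mentioned above, and the sample text is invented:

```python
import re
from collections import Counter

# Invented sample: in practice this would be the text extracted from a page.
page_text = "Soft toys for kids. Buy soft toys online. Plush toys shipped fast."

words = re.findall(r"[a-zA-Z]+", page_text.lower())
counts = Counter(words)

# Show the most frequent words, the way a simple simulator might.
for word, count in counts.most_common(5):
    print(f"{word}: {count}")
```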






