SEO Myths: All About the Last-Modified Header


Hello, dear blog readers. We continue with one of the most important SEO factors. This article touches on the finer points of internal optimization: we will talk about the response code that search engines and visitors receive when they request a page.

Correct server response

Although this is a rather small detail in building and optimizing a website, it is very important. Specifically, a page that has not changed since the last visit of a robot or a person should return the 304 code, which means the page is unchanged. When the server answers with this code, the script stops before the page body is generated; the client takes the page from its cache instead, which significantly reduces the load on the server and speeds up page loading for the user.

Thus, by configuring the correct answers from our server, we kill at least five birds with one stone:

  • We speed up page loading for visitors (people).
  • We reduce the load on the server.
  • Search results (in Yandex, at least) will show the date the page was last updated, which can attract the user's attention, especially if the date is recent.
  • The site's pages will take part in search engines' sorting by date.
  • We significantly speed up the indexing of the site by search engines!

For some reason, the last point seems to me the sweetest (since it affects SEO and increases the credibility of your site with search engines), although without a doubt the rest of the points are also extremely important.

How to configure 304 and 200 server responses?

We have already said that in response to a request for an unchanged page the server should send 304 Not Modified. But what code should it send if the client requests the page for the first time, or requests a page that has changed? In those cases the server should return the status 200 OK. You do not need to send this code specially: if everything is in order with the page, it returns 200 by default.

Therefore we only need to take care of the 304 code, since the server will not send it without our intervention. Two things will help us here: the Last-Modified header and the If-Modified-Since request.

The Last-Modified header

Last-Modified is a header that we send using PHP. It contains the exact time of the last page change, to the second. To store that time we use a common measure: the Unix time stamp.

The Unix time stamp is the number of seconds since the beginning of the Unix era: January 1, 1970. At the time of this writing, the Unix time stamp is 1370597447, which is June 7, 2013, 09:30:47 GMT (+00:00).

That is, all we need to do is send the Last-Modified header from PHP with the desired date:

header("Last-Modified: " . gmdate("D, d M Y H:i:s", $last_modified_time) . " GMT");

Here header is the construct for sending an HTTP header, Last-Modified is what we send, and immediately after the colon comes its value:

gmdate("D, d M Y H:i:s", $last_modified_time) . " GMT"

The Last-Modified value is produced by the gmdate() function, which receives a variable I named $last_modified_time (you can name it whatever you like). The variable $last_modified_time holds the time of the last change as a Unix time stamp, and gmdate() converts that date into the proper form (Greenwich Mean Time).

For clarity, here is an example: if we pass the value 1365003142 to gmdate(), the output is: Wed, 03 Apr 2013 15:32:22.
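The conversion described above can be tried on its own. This is a minimal sketch; the helper name http_date is made up for illustration and is not part of the article's code. It only wraps the gmdate() call and appends " GMT".

```php
<?php
// Hypothetical helper (not from the article): format a Unix time stamp
// as an HTTP date. gmdate() always works in GMT, whatever the server's
// local time zone is.
function http_date(int $timestamp): string
{
    return gmdate("D, d M Y H:i:s", $timestamp) . " GMT";
}

echo http_date(1365003142), "\n"; // Wed, 03 Apr 2013 15:32:22 GMT
echo http_date(1370597447), "\n"; // Fri, 07 Jun 2013 09:30:47 GMT
```

Both time stamps are the ones mentioned in the text, so the output can be compared with the dates given there.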

Now that we have seen how the whole process works, a question may arise: "So do we need to specify the time of the last change manually for each page?" The answer is "Yes!" Personally, I do just that, manually; it is the most reliable option. However, for this blog specifically I have covered the other cases too: for example, if a new comment appears on a page, the time the comment was added is written to $last_modified_time. This is done so that search engines can index new comments and know that the site is "live". Every site is different, and you will have to come up with your own algorithm for determining the date of the last page change, or always specify it manually.

To emphasize again, my algorithm is as follows:

1) I enter the creation date of the material manually. If I change something in the article (fix typos or add to it), I again manually enter the new time of the last update.

2) If a visitor adds a comment, the time the comment was added is written to $last_modified_time automatically, without my involvement, since this is in fact the date of the last change of the page.

What I did not take into account: the right column of the site shows fresh articles, recommended articles and the top 10. They change constantly, and for all pages at once. If I changed the date of the last page change (automatically or manually, it doesn't matter) every time the right column changed, the whole point of the exercise would be lost. I decided that these changes are not worth tracking or taking into account when setting $last_modified_time, as they bring no SEO benefit.
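The two rules above boil down to "take the latest of the manual edit time and the comment times". Here is a rough sketch of that idea; the function name and both parameters are assumptions made for illustration, since the article does not show how the blog actually stores these values.

```php
<?php
// Hypothetical helper: the page's last-modified time is the later of the
// manually entered edit time and the newest comment time.
// Sidebar changes are deliberately not passed in at all.
function page_last_modified(int $manualEditTime, array $commentTimes): int
{
    // With no comments the array holds one element, the edit time itself.
    return max(array_merge([$manualEditTime], $commentTimes));
}

// Article edited at 1365003142, two comments added later.
$last_modified_time = page_last_modified(1365003142, [1365010000, 1370597447]);
echo $last_modified_time, "\n"; // 1370597447
```

If the article is edited after the last comment, the edit time wins, which matches rule 1.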

As I already wrote, I cannot tell you exactly how to automate the date of the last page modification on your site, but I will tell you how NOT to do it!

Errors when specifying the date of the last modification

The first thing most people think of is to send, in the header, the modification date of the file that holds the page content. My article texts are stored in files, not in a database, so for me this might seem like an excellent way to avoid entering the Unix time stamp manually every time. But no! Most hosting providers, maybe even all of them, report the file's creation date as the date of its last change and do not take later changes into account.

I think the consequences in such cases are clear to you. One popular Ukrainian hosting provider (and I think it is not alone) writes something like this in its FAQ: "Instead of the date of the last file modification, use the time() function, which returns the current time in Unix time stamp format". How absurd! You might as well shoot yourself on the spot! And this provider is considered "one of the best". After reading that, I immediately decided not to become their client.

It is pure anti-SEO. Think for yourself: a search engine comes to your page and sees: "Wow! The page was last changed just now, so I picked the perfect moment to come, great!" It comes back to the same page a couple of days later: "Look, it has just changed again, what a coincidence... Wait, why don't I see any changes? Okay, I'll come back another time." It comes again: "Well, no, guys, this is not funny anymore, you definitely cannot be trusted." Such is the fairy tale :)

And then people wonder why their positions in the search results are not what they would like. It is because your site has simply lost confidence (trust). Just like in the parable "About the Shepherd and the Wolves".

So, we have sorted out the main mistakes: you must not specify the current time, and I do not advise specifying the file's modification time. Now let's continue analyzing how it all works.

Setting up the Last-Modified header is exactly one third of the job. We still have to answer the If-Modified-Since request and enable page caching. Neither action takes much time or many lines of code.

If-Modified-Since is a request from the client to your server in which the client asks: "Has the page changed since my last visit?" If the page has not changed, we must stop any further processing of the page (the code below uses die for this).

In that case the body of the page must not start rendering: all of this happens BEFORE the first output of anything to the page! At the same time, the server must return the response 304 Not Modified to the client, thereby saying that the page should be taken from the cache. Let's get straight to the point:

if (isset($_SERVER["HTTP_IF_MODIFIED_SINCE"]) && strtotime($_SERVER["HTTP_IF_MODIFIED_SINCE"]) >= $last_modified_time) {
    header("HTTP/1.1 304 Not Modified");
    die;
}
header("Last-Modified: " . gmdate("D, d M Y H:i:s", $last_modified_time) . " GMT");

So, in the first line we check whether an HTTP_IF_MODIFIED_SINCE request came to our server, and at the same time whether the number of seconds in the received HTTP_IF_MODIFIED_SINCE is greater than or equal to $last_modified_time. If it is, the date the client sent is no earlier than the date of the last page change, from which we draw the purely logical conclusion that the page has not changed. So in the second line we send the server response 304 Not Modified, and in the third line we kill (stop) the execution of all scripts on the page. In other words, we stop loading it.

If the client did not send us an HTTP_IF_MODIFIED_SINCE request, or the date it sent is earlier than the date of the last page change, then we send the code 200 OK (by default) and, in the fifth line, send the CURRENT date of the page change to replace the one the client had.

I have told you everything you need about IF_MODIFIED_SINCE and how the code works, except what the strtotime() function does:

strtotime($_SERVER["HTTP_IF_MODIFIED_SINCE"])

An attentive and savvy reader may already have guessed that this function converts an ordinary date into a Unix time stamp. The $last_modified_time variable is stored in that format, so for the comparison we need to bring both values to a common denominator, one system of measurement.
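To see the comparison in isolation, here is a small sketch that separates the decision from the sending of headers. The function is_not_modified is a hypothetical name introduced only for this example. The extra check for a failed strtotime() is my addition, so that a malformed date simply results in a normal 200 response.

```php
<?php
// Hypothetical helper: decide whether a 304 may be sent.
// $ifModifiedSince is the raw header value, or null if the client sent none.
function is_not_modified(?string $ifModifiedSince, int $lastModifiedTime): bool
{
    if ($ifModifiedSince === null) {
        return false;               // first visit: nothing to compare with
    }
    $since = strtotime($ifModifiedSince);   // HTTP date -> Unix time stamp
    if ($since === false) {
        return false;               // unparseable date: answer 200 as usual
    }
    return $since >= $lastModifiedTime;
}

// Page last changed at 1365003142 (Wed, 03 Apr 2013 15:32:22 GMT).
var_dump(is_not_modified("Wed, 03 Apr 2013 15:32:22 GMT", 1365003142)); // bool(true)
var_dump(is_not_modified("Wed, 03 Apr 2013 15:32:21 GMT", 1365003142)); // bool(false)
var_dump(is_not_modified(null, 1365003142));                            // bool(false)
```

On a real page the first argument would come from $_SERVER["HTTP_IF_MODIFIED_SINCE"] ?? null.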

And the last thing: we just have to enable caching. This is done with the following lines:

header("Cache-Control: public");
header("Expires: " . gmdate("D, d M Y H:i:s", time() + 10800) . " GMT");

Where the number 10800 is the time (in seconds) for which we want to cache the page, that is, in this example, 3 hours.
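As a quick check of the arithmetic, the sketch below builds the two caching headers for a fixed moment instead of time(), so the result is reproducible. The helper name cache_headers is an assumption for illustration. It formats the date with gmdate() so that Expires is expressed in GMT, the form HTTP dates use.

```php
<?php
// Hypothetical helper: build the caching headers for a given lifetime.
// $now is passed in explicitly so the example gives the same output every run.
function cache_headers(int $now, int $lifetimeSeconds): array
{
    return [
        "Cache-Control: public",
        "Expires: " . gmdate("D, d M Y H:i:s", $now + $lifetimeSeconds) . " GMT",
    ];
}

foreach (cache_headers(1365003142, 10800) as $line) {
    echo $line, "\n";
}
// Cache-Control: public
// Expires: Wed, 03 Apr 2013 18:32:22 GMT
```

On a live page you would pass time() as the first argument and hand each line to header().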

And as always, for those who did not follow, here is the whole thing as it is set up on my blog:

<?php
// $last_modified_time must already hold the page's last change as a Unix time stamp
if (isset($_SERVER["HTTP_IF_MODIFIED_SINCE"]) && strtotime($_SERVER["HTTP_IF_MODIFIED_SINCE"]) >= $last_modified_time) {
    header("HTTP/1.1 304 Not Modified");
    die; /* killed everything below */
}
header("Last-Modified: " . gmdate("D, d M Y H:i:s", $last_modified_time) . " GMT");
?>
And then comes the rest of the page

I think you may have noticed that this whole Last-Modified story is analogous to the lastmod tag in sitemap.xml. But lastmod is only informational, a recommendation, whereas nobody will argue with your server's responses. Naturally, it is not uncommon for lastmod in the sitemap to differ from the Last-Modified header, but from now on they should be the same for you! After all, we did not study this whole science together just to end up like the hapless webmasters who never got further than sitemap.xml.

Personally, at the moment I do not use the lastmod tag in my sitemaps at all. Maybe I will reconsider later, but so far I see no reason to be that meticulous when I have correct Last-Modified headers :)

And finally, you can check that Last-Modified and If-Modified-Since are set up correctly with this service: click.

Thank you for your attention, and special thanks to the ever-growing number of subscribers; for me this is the greatest incentive to blog more often. So if you have not subscribed to new articles yet, you are welcome to!

The Last-Modified HTTP header tells the client when the page (object) was last modified. If the client (browser or search robot) has received a Last-Modified header, then the next time it requests the same address, provided the page (object) is in its local cache, it adds an If-Modified-Since header to the request, asking whether the page has changed since the date it received in Last-Modified. The server, having received the If-Modified-Since request, must compare the received timestamp with the time the page was last modified and, if the page has not changed, respond with 304 Not Modified.

Saving Traffic

If the page has not changed, the server stops transmitting data after sending the headers with the 304 Not Modified code; the body of the page (or other requested object) is not transmitted.

Reducing server load

Correctly implemented checking of the last page change time can significantly (by up to 30% or more) reduce the load on the server. Correct implementation means checking the time before page generation starts on a dynamic site. In that case none of the work of generating the page (requesting content from the database, parsing templates, fetching comments, etc.) is performed. This is especially relevant for sites with high traffic and long user visits. Example: a user is on a sports news site and keeps refreshing the home page, waiting for the match result to be published. Within a few minutes the page can be requested dozens of times. If the Last-Modified header is sent and the If-Modified-Since request is processed correctly, the page is actually transmitted once, and all subsequent requests receive a 304 Not Modified response.
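The sports-site example can be simulated in a few lines. This is only a sketch: respond() and the $render callback are hypothetical names, and $render stands in for whatever expensive work (database queries, templates) a real site does. The point it illustrates is that the check runs before generation, so generation happens once.

```php
<?php
// Hypothetical sketch: $render is only called when a full response is needed.
function respond(?string $ifModifiedSince, int $lastModified, callable $render): array
{
    $since = $ifModifiedSince === null ? false : strtotime($ifModifiedSince);
    if ($since !== false && $since >= $lastModified) {
        return ["status" => 304, "headers" => [], "body" => ""];
    }
    return [
        "status"  => 200,
        "headers" => ["Last-Modified: " . gmdate("D, d M Y H:i:s", $lastModified) . " GMT"],
        "body"    => $render(),
    ];
}

// One first visit followed by nine refreshes of an unchanged page.
$renders = 0;
$render = function () use (&$renders): string {
    $renders++;                      // count how often the page is generated
    return "<html>match result pending</html>";
};
$lastModified = 1365003142;
$ims = null;                         // the client has nothing cached yet
$statuses = [];
for ($i = 0; $i < 10; $i++) {
    $response = respond($ims, $lastModified, $render);
    $statuses[] = $response["status"];
    if ($response["status"] === 200) {
        // The client remembers the date and sends it back next time.
        $ims = substr($response["headers"][0], strlen("Last-Modified: "));
    }
}
echo implode(" ", $statuses), "\n";          // 200 304 304 304 304 304 304 304 304 304
echo "page generated $renders time(s)\n";    // page generated 1 time(s)
```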

Speeding up indexing by search engines

In their webmaster guides, search engines recommend sending the Last-Modified header and handling If-Modified-Since correctly.

Make sure your web server supports the If-Modified-Since HTTP header. This header will allow the web server to tell Google if the site's content has changed since the last crawl. Support for this feature will reduce bandwidth usage and overhead.

Google: Webmaster's Guide

Make sure the HTTP headers are correct. In particular, the content of the response that the server sends to the If-Modified-Since request is important. The Last-Modified header must give the correct date when the document was last modified. Even if the server does not return the date of the last modification of the document (Last-Modified), your site will be indexed. However, in this case, the following should be considered:

  • the date will not be shown in search results next to the pages of your site;
  • when sorted by date, the site will not be visible to most users;
  • the robot will not be able to get information about whether the site page has been updated since the last indexing. And since the number of pages received by the robot from the site in one visit is limited, the changed pages will be re-indexed less often.

Last-Modified and If-Modified-Since Headers for WordPress

Few people pay attention to the Last-Modified and If-Modified-Since HTTP headers when optimizing a site, and that is a mistake! It is important that a page whose content has not changed since the search robot's last visit returns the 304 code, which indicates that nothing was added to this particular page: you did not edit or extend the text, no comments were added to the entry, and so on.

If this HTTP header is absent, then in Yandex, when results are sorted by date, the site will not be visible to most users.

That is why it is important not only to set it up correctly, but also to update the date to the current one every time you edit an entry. This has to be done manually.

With comments it is simpler: when a visitor adds a comment, the time the comment was added is written to the variable $last_modified_time automatically, and this becomes the date of the last page change.

Why are the Last-Modified and If-Modified-Since headers needed?

1. When the server sends the 304 code, the rest of the page's PHP code is not executed. The page is loaded from the cache, and this, as you understand, greatly reduces the load on the server, to the great joy of your hoster, and speeds up page loading for the visitor, which is also good news.

How does this happen?

When crawling the Internet, Google and Yandex spiders save a copy of each site in their database. This copy serves as a kind of reference for comparison: is everything the same, or have there been changes? If the Last-Modified and If-Modified-Since headers are not configured, or are configured incorrectly, new pages of the site do get indexed, but the main page in the search engines' cache is not updated for a long time, and neither is the comment feed.

But for frequently updated pages (news feeds updated many times a day, actively commented blogs, etc.), caching has one drawback: the information in the cache becomes outdated too quickly, and a person, even after reloading the page, does not see fresh news or new comments. But that is not the worst of it. The trouble is that the robot does not see them either, unless the correct Last-Modified header is sent:

header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");

If your site is updated frequently (for example, your posts are often commented on), you can disable caching with the following set of headers:

header("Expires: " . gmdate("D, d M Y H:i:s", time() + 7200) . " GMT");
header("Cache-Control: no-cache, must-revalidate");

This means that the validity of the stored copy must be rechecked on every request.

How does browser caching work?

If caching is not prohibited by calling the no_cache function, then in Firefox and IE the page is saved in the cache, and it is this cached page that is returned for all subsequent requests.

To refresh the page and get its latest version, you need to press Ctrl + F5; the usual "Refresh" button (F5) does not work. And I must say, documents can stay in the IE cache for a very, very long time.

In Opera, the cached page is refreshed by pressing the Refresh button or the F5 key. The combination Ctrl + F5 in Opera reloads all open tabs. As you understand, if you have a lot of them open, you may grow a beard while waiting.

If you disable caching of the page with the no_cache function, then Opera and Firefox, when accessing such a page, use the mechanism with the If-Modified-Since header. Thus caching still takes place, but the browser asks the server whether the page has actually changed or not, which is the right way to put the question.

Therefore, you need to add handling for this header as well. I will not describe what each function means; I will just give code that sends the headers correctly and does not cause conflicts on most of the hosting services I have worked with. This construction works on sweb.ru, eomy.net, timeweb.ru, fastvps.ru and startlogic.com:

<?php
// $file_name: path to the file with the page content; $text: the content itself
header("Expires: " . gmdate("D, d M Y H:i:s", time() + 7200) . " GMT");
header("Cache-Control: no-cache, must-revalidate");
header("Vary: Accept-Encoding");
$mt = filemtime($file_name);
// the file's modification time as an HTTP date
$mt_str = gmdate("D, d M Y H:i:s", $mt) . " GMT";
if (isset($_SERVER["HTTP_IF_MODIFIED_SINCE"]) &&
    strtotime($_SERVER["HTTP_IF_MODIFIED_SINCE"]) >= $mt)
{
    header("HTTP/1.1 304 Not Modified");
    die;
}
header("Last-Modified: " . $mt_str);
// every header() call must come before the first output
echo $text;
?>

So all you need to do is copy this code and add it to your theme's header.php file ABOVE ... That is, this code goes at the very top of the file, BEFORE all the rest of the code.

Attention! Before adding anything, save a copy of this file on your computer so that you can restore the original version if your hosting does not allow this configuration of headers.

We check the result on the service for checking the Last-Modified and If-Modified-Since headers http://last-modified.com/ru/if-modified-since.html


  • If the result is positive, we wipe the sweat from our forehead and go to drink tea.
  • If the result is negative, the same construction can be added to the index.php file at the root of your WordPress installation (I ran into this on timeweb.ru hosting), likewise above everything else in it. Just do not forget about it when you update WordPress: the index file will be overwritten with its standard version.
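If you prefer to run the check yourself rather than rely on an online service, the same test can be sketched in PHP. Everything named here (last_status, header_value, check_conditional_get) is hypothetical and written for this example only. It relies on the standard get_headers() and stream_context_create() functions and assumes allow_url_fopen is enabled. The script requests the page, then repeats the request with If-Modified-Since set to the Last-Modified value it just received.

```php
<?php
// Hypothetical helper: status code of the last status line in the list
// returned by get_headers() (redirects add earlier status lines).
function last_status(array $headerLines): ?int
{
    $status = null;
    foreach ($headerLines as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Hypothetical helper: find a header value by name, case-insensitively.
function header_value(array $headerLines, string $name): ?string
{
    $value = null;
    foreach ($headerLines as $line) {
        if (stripos($line, $name . ":") === 0) {
            $value = trim(substr($line, strlen($name) + 1));
        }
    }
    return $value;
}

// Hypothetical checker: a correctly configured page answers the repeated
// request with 304 Not Modified.
function check_conditional_get(string $url): string
{
    $first = get_headers($url);
    if ($first === false) {
        return "request failed";
    }
    $lastModified = header_value($first, "Last-Modified");
    if ($lastModified === null) {
        return "no Last-Modified header";
    }
    $context = stream_context_create([
        "http" => ["header" => "If-Modified-Since: " . $lastModified],
    ]);
    $second = get_headers($url, false, $context);
    if ($second === false) {
        return "second request failed";
    }
    return last_status($second) === 304
        ? "OK: 304 Not Modified"
        : "no 304 on the repeated request";
}
```

Usage would be a single line such as echo check_conditional_get("https://example.com/"); with your own page address in place of the example URL.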

Voila! By properly configuring the Last-Modified and If-Modified-Since headers, we got a bunch of bonuses:

  • We increased the speed of loading pages, which is important for the Google robot and pleasant for people.
  • We reduced the load on the server, which made the hoster happy.
  • Yandex search results will display the date of the last page refresh, which in some cases is very important for people, and therefore indirectly this will have a positive effect on behavioral factors.
  • Pages of our site will take part in search engines' sorting by date, and yes, advanced users do use this.
  • And, as a consequence of all of the above, the indexing of our site by search engines will greatly accelerate.






2021 gtavrl.ru.