Speed up indexing by search engines

The HTTP Last-Modified header tells the client when the document (web page) was last changed. On a later visit the client (a browser or search robot) sends that date back in the If-Modified-Since header. If the page has not changed since that date, the server answers with the status "304 Not Modified" and does not send the page body. If the page has changed (or Last-Modified handling is not configured), the server answers "200 OK" and sends the whole page. In other words, instead of downloading the page again and refreshing its cache, the client receives only the 304 response. The client saves traffic and the server sends less data: mutual savings.
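The exchange described above can be sketched in a few lines. This is only an illustration in Python, not tied to any particular web server; the function name respond and the sample timestamp are assumptions made for the example.

```python
from email.utils import formatdate, parsedate_to_datetime


def respond(last_modified_ts, if_modified_since=None):
    """Return (status, headers) for a page last changed at last_modified_ts (Unix time)."""
    headers = {"Last-Modified": formatdate(last_modified_ts, usegmt=True)}
    if if_modified_since:
        try:
            since_ts = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            since_ts = None  # unparsable date: treat the request as unconditional
        if since_ts is not None and int(last_modified_ts) <= since_ts:
            return 304, headers  # not modified: no page body is sent
    return 200, headers  # modified or unconditional: the full page follows


PAGE_CHANGED = 1294844676  # hypothetical modification time (Unix timestamp)

first_status, first_headers = respond(PAGE_CHANGED)
repeat_status, _ = respond(PAGE_CHANGED, first_headers["Last-Modified"])
print(first_status, repeat_status)
```

The first call models a first visit (no If-Modified-Since), the second models a repeat visit that sends back the date received earlier.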

But the point of this article is the benefit that setting the Last-Modified header brings for speeding up site indexing. It is not hard to see that 10 pages of a site will be indexed faster than 1,000. The same principle that optimizes page loading also works for indexing: a search engine does not need to re-fetch 1,000 pages to find 10 new ones. Thanks to Last-Modified, only new (or updated) pages are left for the robot. The robot comes to the site and first takes what it needs, and then everything else.

Setting the Last-Modified header

Make sure your HTTP headers are correct. What matters in particular is the response the server gives to a request carrying If-Modified-Since. The Last-Modified header must indicate the correct date the document was last modified. Even if the server does not send the last-modified date of the document, your site will be indexed. However, in this case you should consider the following:

  • the search results will not show the date next to the pages of your site;
  • when sorting by date, the site will not be visible to most users;
  • the robot will not be able to find out whether a page has been updated since the last indexing. And since the number of pages a robot receives from a site in one visit is limited, changed pages will be reindexed less often.

Make sure your web server supports the If-Modified-Since HTTP header. This header allows the web server to tell Google whether the site's content has changed since the last time it was crawled. Supporting this feature reduces overhead and bandwidth usage.

Here are examples of how to configure the last-modified header to be sent and If-Modified-Since to be handled correctly.

How to set up meta Last-Modified for static html pages

How to set up Last-Modified in php

<?php
...
if ($IfModifiedSince && $IfModifiedSince >= $LastModified_unix) {
    header($_SERVER["SERVER_PROTOCOL"] . " 304 Not Modified");
    exit;
}
header("Last-Modified: " . $LastModified);
?>

How to configure Last-Modified .htaccess

RewriteRule .* - RewriteRule .* -

How to configure Last-Modified nginx + php

location ~ \.php$ {
    ...
    if_modified_since off;
    fastcgi_pass fcgi;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME /<path>/web$fastcgi_script_name;
    ...
    fastcgi_pass_header Last-Modified;
    include fastcgi_params;
}

Check Last-Modified

Once passing the header to the client is configured, it doesn't hurt to check Last-Modified for correctness. You can check Last-Modified on your own or a third-party website through online services.

Or make your own check for correct processing of the Last-Modified header:
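One possible way to do this is sketched below in Python, using only the standard library. The function names fetch, verdict and check are made up for the example, and the script assumes the page is reachable with a plain GET request without redirects. It requests the page once, remembers the Last-Modified value, and then repeats the request with that value in If-Modified-Since.

```python
import http.client
from urllib.parse import urlsplit


def fetch(url, headers=None):
    """Send one GET request; return (status code, Last-Modified value or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers=headers or {})
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return response.status, response.getheader("Last-Modified")
    finally:
        conn.close()


def verdict(first_status, last_modified, second_status):
    """Turn the results of the two requests into a human-readable verdict."""
    if first_status != 200:
        return "first request did not return 200 OK"
    if not last_modified:
        return "Last-Modified header is missing"
    if second_status == 304:
        return "OK: conditional request answered with 304 Not Modified"
    return "Last-Modified is sent, but the conditional request did not return 304"


def check(url):
    """Run a plain request and then a conditional one against the same URL."""
    first_status, last_modified = fetch(url)
    second_status = None
    if first_status == 200 and last_modified:
        second_status, _ = fetch(url, {"If-Modified-Since": last_modified})
    return verdict(first_status, last_modified, second_status)
```

A call such as print(check("https://example.com/")) prints the verdict for that address; example.com is a placeholder, substitute your own page.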

Setting the Last-Modified header and processing the If-Modified-Since header will be extremely useful for a site of any size. The gain in page processing speed can be significant. The setup is relatively simple and should not cause problems, especially since ready-made solutions exist for popular CMSs such as Joomla, WordPress, MODX, etc.

There are a lot of different myths floating around in the field of search engine optimization (SEO). Some of them have a basis, some of them came from nowhere. In this note we will look at one of them: using the Last-Modified response header.

Some time ago we received a document entitled “Ingate Recommendations for Web Studios on Promoted Sites.” And one of the “recommendations” was the following:

After a redesign or on a new site being developed, the date of the last modification of the site pages (Last Modified) must be indicated.

To add the date of the last page modification to a PHP site, insert the following script at the very beginning of the source code of each page:


<?php
header("Last-Modified: " . date("D, d M Y H:i:s", time()) . " GMT");
?>

It was this utter nonsense and frankly crazy code that prompted me to write this note. Here I will try to explain what Last-Modified is, why it is needed and how browsers and search engines use it.

What is Last-Modified

When transmitting information to the client (a browser or search robot), the web server reports quite a lot of additional data. It can be viewed in the browser console, for example:

configure the server to issue correct response headers (for example, if the page does not exist, issue a 404 error, and if an If-Modified-Since request is received, then issue a 304 code if the page has not been changed since the date specified in the request).

You can also see that if the server does not react to a conditional GET request in any way, the request is no different from a regular one. That is, a Last-Modified header with the current time, which is also incorrectly formed (hello, Ingate!), is not needed at all!

So is Last-Modified necessary or not?

Generally necessary. But it is important to understand that it is not the header itself that plays any role, but the entire conditional request scenario, which must be fully implemented by the site. It is in this case that we will get a high speed of site indexing.

But it is often very difficult to implement this in a ready-made CMS. This may require quite significant changes to the code of the CMS itself.

Although for a number of CMS this can be achieved by enabling page caching. If the CMS caches pages, creating and serving essentially static files, then the web server itself will respond correctly to conditional requests. For example, in WordPress this can be achieved using the WP Super Cache plugin:

Let's check it in action. I enabled this plugin, opened the browser in anonymous mode and made two requests for the same page. It is clearly seen that the second answer is correct - 304 Not Modified:

Instead of a conclusion

Thus, we have dealt with the Last-Modified header. First, it must convey information about the date and time the document was actually modified. Secondly, the server’s response to a conditional request with the If-Modified-Since header is extremely important.

Well, listen less to SEOs who don’t know the basics of how the Internet works.

One of the stages of optimizing a website for proper operation and successful promotion is server-side optimization. Among other things, it includes configuring the server to send a correct Last-Modified response header. Setting this parameter correctly can increase the loading speed of the site and have a positive effect on its indexing by search robots.

What is Last-Modified and why is it needed?

As the name suggests, the Last-Modified header tells the client (the site visitor) when a particular page of the site was last modified. If the visitor is a search robot and the Last-Modified response for the requested page is not configured (or is configured incorrectly) on the server, the robot has no choice but to index all pages of the site again and again on every visit, creating a load on the hosting server. What if the number of pages is in the hundreds or even thousands? Depending on the server's capabilities, you may run into errors on the hosting side. In addition, the search robot has a limit on the number of pages it indexes in one visit, so without correct Last-Modified settings we risk that unmodified pages will be indexed by the robot while the new pages we need will not.

According to the RFC 2616 specification, which describes the Hypertext Transfer Protocol (HTTP), a client can "ask" the server whether a page has changed since a certain date by sending the server an "If-Modified-Since" header. If the requested page has not changed, the server returns "304 Not Modified"; the browser does not download the page and the web server sends very little data. Otherwise (if the page has changed since the previous request), the server returns a "200 OK" response followed by the code of the page itself.

In addition to the above, we mention Yandex’s recommendations: “The robot will not be able to obtain information about whether the site page has been updated since the last indexing. And since the number of pages the robot receives from the site in one visit is limited, changed pages will be re-indexed less often.”

In summary: the important purpose of the “Last-Modified” header is to inform the site visitor and the search robot of the date any document was last modified.

Why do you need to properly configure Last-Modified?

By correctly configuring the Last-Modified server response, we can achieve several positive results for our site:

  • Pages load faster for people: if the user has already visited the page and it has not changed by the next visit, the visitor’s browser will not download it again but will display its cached copy;
  • The load on the hosting platform (server) is reduced: with this algorithm the server only has to transfer pages that have actually changed;
  • The date of the document's last modification is displayed in the search results: a “fresh” date can attract visitors to your site;
  • Sort by date: site pages will take part in sorting by date in search results;
  • Indexing of the site by search robots is significantly accelerated: because your site quickly reports the date of crawled pages, old (already indexed) pages are “thrown aside”, giving way to “fresh” documents. This point is the most significant when promoting a site, because high indexing speed increases the level of trust in the site among search robots.

How to check if Last-Modified is configured correctly?

One of the services where you can check the correctness (and indeed the existence) of the configured Last-Modified server response is the eponymous last-modified.com

In the input field you need to write the address of your website or a specific page and click on the “Check” button. The result of the service will be a demonstration of your site’s response to a request for the “Last-Modified” and “304 Not Modified” headers. An example of such a check:

Setting up Last-Modified

Let's look at the implementation of the Last-Modified HTTP header response in PHP.

On the Internet you can often find the following recommendations for setting up Last-Modified:

I just want to exclaim: “We don’t need this kind of hockey!” Let's figure out why. In response to a user request, the gmdate function returns the current date in Greenwich Mean Time (GMT). This happens on every request from a user or search robot: the server returns exactly its current date. As a result, every time search engines visit your site, they see that the requested page has just been updated. This may only be “useful” a few times... After a while, the search engine will realize that it is being “fooled” and will lose any trust in your site. Accordingly, such an implementation does not suit us.
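The effect is easy to reproduce. The following Python sketch uses made-up visit times and a hypothetical helper is_not_modified, and assumes a server that does honour conditional requests. It compares the date the robot remembered from its first visit with what the server reports on the second visit. When Last-Modified is always "now", the page always looks newer than the robot's copy, so a 304 answer is never possible.

```python
from email.utils import formatdate, parsedate_to_datetime


def is_not_modified(last_modified_header, if_modified_since_header):
    """True if the client's copy is still current, i.e. the server may answer 304."""
    page_time = parsedate_to_datetime(last_modified_header)
    copy_time = parsedate_to_datetime(if_modified_since_header)
    return page_time <= copy_time


FIRST_VISIT = 1294844676            # hypothetical time of the robot's first visit
SECOND_VISIT = FIRST_VISIT + 3600   # the robot comes back an hour later

# Faulty approach: the server reports the current time as Last-Modified.
seen_on_first_visit = formatdate(FIRST_VISIT, usegmt=True)
reported_on_second_visit = formatdate(SECOND_VISIT, usegmt=True)
always_now = is_not_modified(reported_on_second_visit, seen_on_first_visit)

# Honest approach: the page really changed a day before the first visit.
real_change = formatdate(FIRST_VISIT - 86400, usegmt=True)
real_time = is_not_modified(real_change, real_change)

print(always_now, real_time)
```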

Let's turn to the above-mentioned resource, last-modified.com, for help. It also presents an implementation of the Last-Modified HTTP header in PHP. It looks like this:

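$LastModified_unix = 1294844676;
$LastModified = gmdate("D, d M Y H:i:s \G\M\T", $LastModified_unix);
$IfModifiedSince = false;

if (isset($_ENV["HTTP_IF_MODIFIED_SINCE"]))
    $IfModifiedSince = strtotime(substr($_ENV["HTTP_IF_MODIFIED_SINCE"], 5));

if (isset($_SERVER["HTTP_IF_MODIFIED_SINCE"]))
    $IfModifiedSince = strtotime(substr($_SERVER["HTTP_IF_MODIFIED_SINCE"], 5));

if ($IfModifiedSince && $IfModifiedSince >= $LastModified_unix) {
    header($_SERVER["SERVER_PROTOCOL"] . " 304 Not Modified");
    exit;
}

header("Last-Modified: " . $LastModified);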

Let's look at how this code works. The variable $LastModified_unix is set manually in Unix timestamp format (the number of seconds that have passed since the beginning of the Unix epoch, January 1, 1970).

The gmdate function formats that timestamp as a GMT date in the form Day, DD Mon YYYY HH:MM:SS GMT.

Next, knowing the time the page was last modified, we check whether If-Modified-Since is present. If it is, and its date is not earlier than the modification time, we send 304 Not Modified and stop the script. Otherwise we generate the Last-Modified header and send the entire page.

In the above option it is proposed to set the time $LastModified_unix manually. But what if there are a lot of pages on the site? To do this, let’s “modernize” the script a little, replacing just the first line in it:

$LastModified_unix = strtotime(date("D, d M Y H:i:s", filectime($_SERVER["SCRIPT_FILENAME"])));
$LastModified = gmdate("D, d M Y H:i:s \G\M\T", $LastModified_unix);
$IfModifiedSince = false;

// substr(..., 5) drops the leading day name ("Wed, ") before parsing the date
if (isset($_ENV["HTTP_IF_MODIFIED_SINCE"]))
    $IfModifiedSince = strtotime(substr($_ENV["HTTP_IF_MODIFIED_SINCE"], 5));

if (isset($_SERVER["HTTP_IF_MODIFIED_SINCE"]))
    $IfModifiedSince = strtotime(substr($_SERVER["HTTP_IF_MODIFIED_SINCE"], 5));

// The client's copy is not older than the page: answer 304 and send no body
if ($IfModifiedSince && $IfModifiedSince >= $LastModified_unix) {
    header($_SERVER["SERVER_PROTOCOL"] . " 304 Not Modified");
    exit;
}

header("Last-Modified: ". $LastModified);

In the updated version, the first line of code converts an English-language text representation of the date into a Unix timestamp, so the script uses the automatically determined change time of the current page's file.
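The same idea, taking the date from the file instead of hard-coding it, can be sketched in Python. The function name last_modified_for is made up for the example. The sketch uses the file's content modification time (the counterpart of PHP's filemtime), on the assumption that this is the time of interest; the PHP code above uses filectime. A file time is already a Unix timestamp, so no round trip through a text date is needed here.

```python
import os
from email.utils import formatdate


def last_modified_for(path):
    """Return (unix_timestamp, http_date) for the last modification of the file."""
    mtime = int(os.path.getmtime(path))  # whole seconds: HTTP dates carry no fractions
    return mtime, formatdate(mtime, usegmt=True)
```

The first element of the result is what gets compared with If-Modified-Since, the second is the value for the Last-Modified header.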

That's it! Now all we have to do is paste the resulting code into the section ... of each page of the site and enjoy fast page loading and fast indexing by search robots.


Syntax

If-Modified-Since: <day-name>, <day> <month> <year> <hour>:<minute>:<second> GMT

Directives

<day-name>
One of "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", or "Sun" (case-sensitive).

<day>
2 digit day number, e.g. "04" or "23".

<month>
One of "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" (case-sensitive).

<year>
4 digit year number, e.g. "1990" or "2016".

<hour>
2 digit hour number, e.g. "09" or "23".

<minute>
2 digit minute number, e.g. "04" or "59".

<second>
2 digit second number, e.g. "04" or "59".

GMT
Greenwich Mean Time. HTTP dates are always expressed in GMT, never in local time.
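As an illustration of this format, the Python standard library can produce and parse such dates. The sample moment below is arbitrary, and the UTC+3 local zone is an assumption chosen for the example; the point is that a local time has to be converted to GMT before it goes into the header.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

# A moment expressed directly in GMT (UTC)
moment = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
header_value = format_datetime(moment, usegmt=True)

# The same moment expressed in a local zone (UTC+3) must be converted first
local = datetime(2015, 10, 21, 10, 28, 0, tzinfo=timezone(timedelta(hours=3)))
from_local = format_datetime(local.astimezone(timezone.utc), usegmt=True)

# Parsing the header value gives back the original moment
parsed = parsedate_to_datetime(header_value)

print(header_value)
```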

Examples

If-Modified-Since: Wed, 21 Oct 2015 07:28:00 GMT

Specifications

RFC 7232, section 3.3: If-Modified-Since (specification title: Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests)

Browser compatibility

If-Modified-Since has full support in Chrome, Edge (since version 12), Firefox, Internet Explorer, Opera, Safari, Android WebView, Chrome for Android, Firefox for Android, Opera for Android, Safari on iOS and Samsung Internet.

Last-Modified and If-Modified-Since Headers for WordPress

Few people pay attention to the Last-Modified and If-Modified-Since HTTP headers when optimizing a site, and that is a mistake! It is important that a page whose content has not changed since the search robot's last visit returns a 304 code, which indicates that nothing has been added to this particular page: you have not edited or supplemented the text, no comments have been added to the post, and so on.

If this http header is missing, then in Yandex, when sorting results by date, the site will not be visible to most users.

That is why it is important that you not only set it up correctly, but also update the date to the current one every time you edit a record. This will need to be done manually.

With comments it’s simpler: when a visitor adds a comment, the time the comment was added is written to the variable $last_modified_time automatically, and this becomes the date the page was last modified.
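The rule behind this can be put in a few lines: the page counts as modified at the later of two moments, the last edit of the post and the most recent comment. The Python sketch below only illustrates that rule; it is not WordPress code, and the function name and timestamps are made up for the example.

```python
from email.utils import formatdate


def page_last_modified(post_modified_ts, comment_timestamps):
    """Latest of the post's edit time and all comment times (Unix timestamps)."""
    return max([post_modified_ts, *comment_timestamps])


post_edited = 1294844676                              # hypothetical last edit of the post
comments = [post_edited - 500, post_edited + 7200]    # hypothetical comment times

latest = page_last_modified(post_edited, comments)
print(formatdate(latest, usegmt=True))
```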

Why do we need the Last-Modified and If-Modified-Since headers?

When the server responds with the 304 code, the rest of the page's PHP scripts are not even executed. The page is loaded from the cache, and this, as you understand, very significantly reduces the load on the server, much to the delight of your hoster, and speeds up page loading for the visitor, which is also good news.

How does this happen?

When crawling the Internet, Google and Yandex spiders save a copy of each site in their database. This copy serves as a kind of sample for comparison: is everything still the same or have changes occurred. And if the Last-Modified and If-Modified-Since headers are not configured or are configured incorrectly, new pages on the site are indexed, and the main page in the search engine cache is not updated for a long time, just as the comment feed is not updated.

But for frequently updated pages (news feeds updated many times a day, actively commented blogs, etc.) caching has one drawback: the information in the cache becomes outdated too quickly, and a person, even after reloading the page, does not see the latest news or new comments. But that's not so bad. The trouble is that the robot doesn’t see this either, unless the correct Last-Modified header is enabled.

header("Last-Modified: ".gmdate("D, d M Y H:i:s ")."GMT");

If your site is updated frequently (for example, your posts are often commented on), you can disable caching with the following set of headers:

header("Expires: ".gmdate("D, d M Y H:i:s", time() + 7200)." GMT");

This means that the validity of the stored copy must be double-checked with each request.

How does caching work in browsers?

If it is not disabled by calling the no_cache function, then in Firefox and IE the page is stored in the cache, and for all subsequent requests it is this page that is returned.

To refresh the page and get the latest version, you need to press the key combination Ctrl+F5, the usual “Update” button (F5) does not work. And I must say, documents in the IE cache can be stored for a very, very long time.

In Opera, the cached page is cleared by pressing the “Refresh” button or the F5 key. The combination Ctrl+F5 in Opera reloads all open tabs. As you understand, if you have many of them open, you may grow a beard while waiting.

If you disable page caching with the no_cache function, then Opera and Firefox, when accessing such a page, use the mechanism with the If-Modified-Since header. Thus, caching occurs, but the browser asks the server whether the page has actually changed or not - this is the correct way to pose the question.

Therefore, you need to enable processing of this parameter as well. I won’t describe what this function means, I’ll just give code that sends headers correctly and doesn’t cause conflicts on most hosting sites I’ve worked with. This design works on sweb.ru, eomy.net, timeweb.ru, fastvps.ru, startlogic.com

<?php
// $file_name and $text are expected to be set by the surrounding code
header("Expires: " . gmdate("D, d M Y H:i:s", time() + 7200) . " GMT");
header("Cache-Control: no-cache, must-revalidate");
header("Vary: Accept-Encoding");
header("Accept-Encoding:gzip,deflate,sdch");
$mt = filemtime($file_name);
// Format the file's modification time, not the current time
$mt_str = gmdate("D, d M Y H:i:s", $mt) . " GMT";
if (isset($_SERVER["HTTP_IF_MODIFIED_SINCE"]) &&
    strtotime($_SERVER["HTTP_IF_MODIFIED_SINCE"]) >= $mt) {
    header("HTTP/1.1 304 Not Modified");
    die;
}
header("Last-Modified: " . $mt_str);
// All headers must be sent before any output
echo $text;
?>

So all you need to do is copy this code and add it to your theme's header.php file, ABOVE everything else. That is, this code goes at the very top of the file, BEFORE all the rest of the code.


Attention! Before adding anything, save this file on your computer so that you can restore the original version if your hosting does not allow such a header configuration.

We check the result using the Last-Modified and If-Modified-Since header checking service http://last-modified.com/ru/if-modified-since.html


  • If the result is positive, we wipe the sweat from our forehead and go drink tea.
  • If the result is negative, the same construction can be added to the file index.php in the root of your WordPress (I encountered this on the hosting timeweb.ru). Likewise, above everything else in it. Just don’t forget about this when you update - the index file will be overwritten in its standard form.

Voila! By correctly setting the Last-Modified and If-Modified-Since headers, we got a bunch of bonuses:

  • We increased the page loading speed, which is important for the Google robot and pleasant for people.
  • We reduced the load on the server, which pleased the hoster.
  • The date of the page's last update will be displayed in Yandex search results, which in some cases is very important for people, and therefore indirectly this will have a positive effect on behavioral factors.
  • The pages of our site will take part in search engine sorting by date - yes, yes, this is what advanced users use.
  • And, as a consequence of all of the above, the indexing of our site by search engines will greatly accelerate.


