What an XML sitemap looks like. XML Sitemap: A Complete Guide


01.03.2012, 14:41

Comrades!
The sitemap generator gave me a file that lists both site.com/ and site.com/index.html.
Naturally, this is the same page.
What's the best way to handle this for Google? Leave both lines or cut one of them? And if I cut one, which?

01.03.2012, 14:55

site.com/ is the home page.
site.com/index.html is a duplicate; you can redirect it to the home page or mark it with rel="canonical".

01.03.2012, 15:28

Why is the root page in the sitemap at all? Do you think the bot will index the sitemap and not look at the main page? :)
Sitemap is of no use. It serves only to inform the search engine about the presence of a page and is needed only in cases where a certain page cannot be reached with internal links.

God-bearer

01.03.2012, 17:44

In general, idiocy is also found on websites (http://www.google.com/search?q=site:romip.ru+inurl:index.html), where everyone is an expert - by definition. And even on megaprojects (http://www.google.com/search?q=site:yandex.ru/index.html).

01.03.2012, 18:38

Sitemap is no use

02.03.2012, 00:11

I would venture to express the opinion that it is possible to speed up the indexing of new pages by Google.
When updating the sitemap in Google Webmaster, the bot immediately takes it, I checked in the server logs.
I added a new page to the sitemap, updated it in WMT, and the next day the page was already in the index.
And after 2 months, half of the pages in the index are no longer there. So?))

God-bearer

02.03.2012, 01:22

Naturally, this is the same page.
These are different pages... mirrors of a sort... and even if you don’t add /index.html to the sitemap, it can still get indexed... and you need to prevent that by every possible means.

02.03.2012, 08:42

and even if you don’t add it to the sitemap /index.html, it can be indexed for you
If you remove index.html from the links on all pages and there are no external links to it, search engines will drop it from the index.
I had this situation. Every page linked to the home page with the short link index.html, while the external links used the form http://хххххххх.ru/. Both were in the index: http://хххххххх.ru/ and http://хххххххх.ru/index.html.
I changed the link on all internal pages to http://ххххххххх.ru/, and after several updates http://ххххххххх.ru/index.html was gone from the search results of both Yandex and Google.

Added 03/02/2012 at 09:49 ----------

Using robots.txt?
No. You just need to replace index.html with http://хххххххх.ru/ on every page that links to it. I believe such links are called absolute,
and the short internal ones relative.
But I’m afraid I will be reproached again for introducing new terms.....)))
And you don’t need to link to http://хххххххх.ru/index.html anywhere.

Businessman:)

02.03.2012, 09:20

If you remove index.html from all pages, and there are no external links to it, then search engines will remove it from the index....

I agree, I have the same experience. You can also block it in robots.txt just in case ;)

02.03.2012, 11:16

I would venture to express the opinion that it is possible to speed up the indexing of new pages by Google.
When updating the sitemap in Google Webmaster, the bot immediately takes it, I checked in the server logs.
I added a new page to the sitemap, updated it in WMT, and the next day the page was already in the index.

You can also use an RSS feed and ping for this.

God-bearer

02.03.2012, 13:56

Using robots.txt?
As you prefer.

http://www.bdbd.ru/index.php
http://www.bdbd.ru/index.html
must respond with a 301.

http://www.unmedia.ru/index.html
Request data
GET /index.html HTTP/1.1
User-Agent: Opera/9.80 (Windows NT 5.1; U; ru) Presto/2.10.229 Version/11.61
Host: www.unmedia.ru
Accept: text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/webp, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
Accept-Language: ru-RU,ru;q=0.9,en;q=0.8
Accept-Encoding: gzip, deflate
Cookie: PHPSESSID=cc2a67ae9b5ae208cd2b96470619d10b; BITRIX_SM_GUEST_ID=100454; BITRIX_SM_LAST_VISIT=03/02/2012+14%3A53%3A27
Connection: Keep-Alive
Request body

Response data
HTTP/1.1 301 Moved Permanently
Server: nginx/0.6.32
Date: Fri, 02 Mar 2012 10:53:45 GMT
Content-Type: text/html; charset=iso-8859-1
Connection: keep-alive
Location: http://www.unmedia.ru/

If you remove index.html from all pages, and there are no external links to it
Then (http://www.google.com/search?q=site:yandex.ru/index.html) may still remain in the index (http://www.google.com/search?q=site:platon.ya.ru+%D0%B1%D0%BE%D1%82) ...
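
The advice in the thread (keep one entry for the home page and leave the index.html duplicate out of the sitemap) can be sketched in Python. This is only an illustration: the list of index file names and both function names are assumptions, not something any poster or generator uses.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed set of file names that the server treats as the directory index.
INDEX_NAMES = ("index.html", "index.htm", "index.php")


def canonical_url(url: str) -> str:
    """Map .../index.html style URLs to the bare directory URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last in INDEX_NAMES:
        path = head + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def dedupe_for_sitemap(urls):
    """Return canonical URLs in first-seen order, without duplicates."""
    seen = set()
    result = []
    for url in urls:
        canon = canonical_url(url)
        if canon not in seen:
            seen.add(canon)
            result.append(canon)
    return result
```

Running the generator's output through `dedupe_for_sitemap` before writing the file leaves a single line for the home page. It does not replace the 301 redirect or rel="canonical" discussed above, since the duplicate can still be indexed through links.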


XML Sitemap

A Sitemap is a special file in .xml format, stored in the root directory of the server. Website owners often ask why a Sitemap is needed and whether its presence or absence affects search engine promotion. To answer these questions, consider the purpose and structure of a sitemap.

So, what is an XML Sitemap and why do you need one? A sitemap is a kind of directory: a list of links leading to all sections and pages of the site. It helps search engines index the site more quickly and efficiently, which is especially important when the site runs to thousands or tens of thousands of pages.


Important! The sitemap should include only pages that need to be in the search index. Documents closed to indexing or containing proprietary information should not be included. The sitemap also omits tag pages and dynamic URLs.

Does Sitemap affect promotion?

Search engines will not penalize a site for lacking this file. In theory, the robot should crawl all the pages of the site on its own and include them in the search. However, the crawler may fail and miss some web documents. Typically, the “problem areas” are sections that can only be reached through a long chain of links, and dynamically generated URLs.

From an SEO perspective, a sitemap matters because it can speed up indexing. It also raises the likelihood that your pages are indexed before unscrupulous competitors have time to copy and publish the content. Search engines give preference to the original source, while copied content is demoted.

How to create a sitemap

The easiest way is to use one of the specialized services. For example, the online generator http://www.mysitemapgenerator.com/ creates a sitemap of up to 500 pages for free, and paid plans generate Sitemaps without restrictions. The webmaster only needs to enter the address of the site and then place the resulting file in the root folder of the server.

You can also use the SiteMap Generator program. Enter the address of the site's home page in the http:// field, click the “start” button and wait until the sitemap is generated. Then go to the “Google Sitemap/XML” tab, copy the resulting code and paste it into a file in .xml format.

Sitemap xml file - available directives

  • The lastmod tag tells the robot when the document was last updated.
  • The priority tag indicates the priority of the document relative to other pages on the site.
  • The loc tag holds the URL of the page.
  • The changefreq tag indicates how often the page is expected to change (documents marked “never” will be visited very rarely by the robot).
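
As a rough illustration of how these four tags fit together, here is a minimal Python sketch that writes a sitemap with the standard library. The function name `build_urlset` and the dictionary layout of the entries are assumptions made for this example; only the tag names and the namespace come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries):
    """Build sitemap XML from dicts with 'loc' and optional
    'lastmod', 'changefreq' and 'priority' keys."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit the tags in a fixed order; skip the optional ones that are absent.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_urlset([
        {"loc": "http://yoursite.ru/", "lastmod": "2017-02-05",
         "changefreq": "monthly", "priority": 0.8},
        {"loc": "http://yoursite.ru/contacts.html"},
    ]))
```

ElementTree escapes characters such as & in URLs automatically, which is easy to forget when a sitemap is assembled by hand.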

How to inform a search engine about a sitemap

To notify Yandex, you can add a sitemap directive to the robots.txt file. The code will look like this:

Sitemap: http://yoursite.ru/sitemap.xml
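
To check that the directive is readable the way a crawler would read it, one option is Python's standard `urllib.robotparser`. This is a small sketch that assumes Python 3.8 or newer (where `RobotFileParser.site_maps()` is available); the helper name is made up for the example.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt: str):
    """Return the Sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow:\nSitemap: http://yoursite.ru/sitemap.xml\n"
    print(sitemaps_from_robots(sample))
```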

In addition, you can transfer a site map through the Yandex.Webmaster interface. To do this, you need to go to the “Indexing settings” >> “Sitemap files” tab, and then specify the sitemap address to the system.

You can notify Google in the same way. In Google's webmaster tools, go to the “Crawling” >> “Sitemaps” tab.



  1. Select a site from the list.
  2. In the field, enter the URL where the file is available. For example, https://example.com/sitemap.xml.
  3. Click the Add button.

After adding the file, it is queued for processing. The robot will download it within two weeks. Each added file, including those attached to the Sitemap index file, is processed by the robot separately.

After downloading, next to each file you will see one of the statuses:

Status | Description | Note
"OK" | The file is formed correctly and loaded into the robot database. | The date of the last download will appear next to the file. Indexed pages will appear in search results within two weeks.
"Redirect" | The specified URL redirects to another address. | Remove the redirect and notify the robot about the update.
"Error" | The file is not formed correctly. | Click the Error link for details. After making changes to the file, notify the robot about the update.
"Not indexed" | When accessing the Sitemap, the server returns an HTTP code other than 200. | Check whether the file is accessible to the robot using the Check Server Response tool, specifying the full path to the file. If the file is not available, contact the administrator of the site or server on which it is located.
"Not indexed" | Access to the file is denied in robots.txt using the Disallow directive. | Allow access to the Sitemap and notify the robot about the update.

Sitemap update

If you have changed the Sitemap file added to Yandex.Webmaster, you do not need to delete it and upload it again - the robot regularly checks the file for updates and errors.

To speed up crawling a file, click the icon. If you are using a Sitemap index file, you can start processing each file listed in it. The robot will download the data within three days. You can use the function up to 10 times for one host.

Once you have used up all attempts, the next one will be available 30 days after the first. The exact date is displayed in the Webmaster interface.

Removing Sitemap

In the Yandex.Webmaster interface, you can delete the files that were added on the Sitemap Files page. If a Sitemap directive was added to the robots.txt file, delete it as well. After the changes are made, information about the Sitemap will disappear from the robot and Yandex.Webmaster database within a few weeks.

When getting acquainted with a project, an optimization specialist should type “sitemap.xml” after the site name in the address bar. Analyzing the sitemap helps explain why particular content is not indexed. I will tell you how to create and implement a competent sitemap.xml in the next issue.

What is an XML map

An XML sitemap is a file with information for search engines about the pages that need to be indexed. In other words, a sitemap is a list of all pages, in XML format, available for crawling by a search robot. An XML map should be distinguished from the regular HTML sitemap, which is located at http://site.com/sitemap/.

Using XML maps you can define:

  • the location of site pages;
  • the time each page was last updated;
  • how frequently each page is updated;
  • the importance (priority) of pages relative to other pages in the site structure.

What elements does an XML map consist of?

The first line of the document indicates the XML version and the encoding, UTF-8.

Special XML tags are also used:

  • sitemapindex - parent tag at the beginning and end of an index file;
  • sitemap - parent tag for each sitemap listed in the index file; it is a child of sitemapindex;
  • url - a block that contains the URL itself and its other elements;
  • loc - the page URL itself;
  • changefreq - how often this page may change. Possible values: always, hourly, daily, weekly, monthly, yearly, never;
  • priority - helps determine which pages have higher priority for crawling. It takes a value from 0.0 to 1.0, for example: 0.5;
  • lastmod - time of the last page content update, an optional parameter. For sites with static content, it is enough to use changefreq.

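An example of a file structure with an XML sitemap:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/</loc>
    <lastmod>2017-02-05</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>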

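For large sites, it is better to generate several XML maps. For example, this XML index includes two sitemap files:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap1.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap2.xml</loc>
  </sitemap>
</sitemapindex>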
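
A sitemap index of this kind is simple to generate. The following Python sketch is one possible approach; `build_sitemap_index` and its `lastmod` parameter are names invented for the example, and the same date is applied to every listed file purely to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls, lastmod=None):
    """Build a sitemapindex document that points at the given sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        node = ET.SubElement(index, "sitemap")
        ET.SubElement(node, "loc").text = sitemap_url
        if lastmod:
            # Optional: date the listed sitemap file was last changed.
            ET.SubElement(node, "lastmod").text = lastmod
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap_index([
        "http://www.example.com/sitemap1.xml",
        "http://www.example.com/sitemap2.xml",
    ]))
```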

XML sitemap for images

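Separate XML maps are often created for image indexing. They are relevant only for Google; Yandex does not recognize image tags.

XML map data can help search engines find content that might not otherwise be discovered (for example, images loaded using JavaScript), and specify the images to be scanned and indexed.

Tags used for image maps are:

  • image:image - container for the data about one image;
  • image:loc - URL of the image.

Additionally, you can use optional tags:

  • image:caption - image caption;
  • image:geo_location - where the image was taken;
  • image:title - image title;
  • image:license - URL of the image license.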

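Example XML map for images:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>http://example.com/page.html</loc>
    <image:image>
      <image:loc>http://example.com/pic1.jpg</image:loc>
    </image:image>
    <image:image>
      <image:loc>http://example.com/pic1.jpg</image:loc>
    </image:image>
  </url>
</urlset>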

If your site contains unique video content, you can also create a separate XML map for it.

Interestingly, video URLs listed in this map can appear in Google Video search. The results show a video thumbnail, which, by the way, can be customized, along with other information specified in the map, such as the title.

What video information can be sent to Google using a sitemap:

  • title;
  • description;
  • duration;
  • thumbnail, and so on.

Required tags:

  • loc - page where the video is located;
  • video:title - video title, up to 100 characters;
  • video:player_loc - location of the video player;
  • video:content_loc - location of a specific video;
  • video:thumbnail_loc - preview (thumbnail) of the video, no less than 120x90 px;
  • video:video - container for the video description;
  • video:description - video description, up to 2000 characters.

In addition, you can use other tags, which are optional and advisory in nature:

  • video:duration - video duration, up to 8 hours, written in seconds;
  • video:category - video category, for example, technology;
  • video:uploader - the name of the person (company) who added the video. Only one name can be specified;
  • video:requires_subscription - indicates whether a subscription, paid or free, is required to watch the video. Available values: yes, no;
  • video:publication_date - date of publication, in the format YYYY-MM-DD or YYYY-MM-DDThh:mm:ss+TZD;
  • video:family_friendly - indicates whether the video may be shown with safe search enabled;
  • video:restriction - a list of countries in which the video may or may not be played. Valid values are country codes in ISO 3166 format. Only one such tag can be given for each video. If the tag is absent, the video is assumed to be playable in all territories;
  • video:gallery_loc - link to the gallery;
  • video:expiration_date - date and time when the video becomes irrelevant;
  • video:price - cost, with the currency in ISO 4217 format;
  • video:tag - video tags;
  • video:view_count - number of video views;
  • video:rating - video rating (from 0 to 5);
  • video:platform - a list of platforms where the video can and cannot be played. Available values: web, mobile, tv. If there is no tag, the video is assumed to be playable on all platforms;
  • video:live - indicates whether the video is a live broadcast. Available values: yes, no.

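<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>http://www.example.com/videos/video_1.html</loc>
    <video:video>
      <video:thumbnail_loc>http://www.example.com/thumbs/video_1.jpg</video:thumbnail_loc>
      <video:title>Xiaomi Redmi 3 Note Pro smartphone review</video:title>
      <video:description>A detailed review of the appearance and features of the Xiaomi Redmi 3 Note Pro smartphone from the Example online store.</video:description>
      <video:content_loc>http://www.example.com/video123.flv</video:content_loc>
      <video:player_loc>http://www.example.com/videoplayer.swf?video=123</video:player_loc>
      <video:duration>600</video:duration>
      <video:rating>4.3</video:rating>
      <video:view_count>1223</video:view_count>
      <video:publication_date>2017-01-05T19:20:30+03:00</video:publication_date>
      <video:family_friendly>yes</video:family_friendly>
      <video:live>no</video:live>
    </video:video>
  </url>
</urlset>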

Google "supports" the following formats:

  • .mpg, .mpeg, .mp4, .m4v;
  • .wmv;
  • .asf, .avi;
  • .ra, .ram, .rm;
  • .mov;
  • .flv.

XML map for Google News

For news sites, you can create a separate sitemap that is generated dynamically and updated daily. These files only work for resources included in Google News. If the site is not on the list, you can submit a request to add it.

The sitemap file should only contain URLs of articles published in the last two days. Articles published more than two days ago can be removed from the file, but they will remain in the Google News index for 30 days.

This sitemap can contain a maximum of 1000 URLs. If your site has more content in two days, you can create a sitemap index file for multiple maps.
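
The two rules above (only the last two days, at most 1000 URLs per file) amount to a small filter applied before the file is written. A minimal Python sketch, assuming each article is a dictionary with a timezone-aware `published` datetime; that layout and the function name are assumptions for the example.

```python
from datetime import datetime, timedelta, timezone


def recent_articles(articles, now, max_age=timedelta(days=2), limit=1000):
    """Keep articles published within max_age of now, newest first,
    cut off at limit entries for a single news sitemap file."""
    cutoff = now - max_age
    fresh = [a for a in articles if a["published"] >= cutoff]
    fresh.sort(key=lambda a: a["published"], reverse=True)
    return fresh[:limit]


if __name__ == "__main__":
    now = datetime(2017, 5, 10, 12, 0, tzinfo=timezone.utc)
    articles = [
        {"loc": "http://example.ua/news/new.html", "published": now - timedelta(hours=5)},
        {"loc": "http://example.ua/news/old.html", "published": now - timedelta(days=3)},
    ]
    print([a["loc"] for a in recent_articles(articles, now)])
```

The sketch simply truncates at the limit; a site that publishes more than 1000 articles in two days would instead spread them over several files listed in a sitemap index, as described above.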

Required tags:

  • news:publication - a general tag that indicates the publication. It has two required child tags:
    • news:name - title of the publication;
    • news:language - language in ISO 639 format;
  • news:publication_date - date of publication in W3C format, with at least the full date. The Google robot understands dates down to fractions of a second, for example:
YYYY-MM-DDThh:mm:ss.s±hh:mm (2017-05-10T19:20:30.45+01:00)
  • news:title - the title of the article, matching the title on the website.

In addition, there are optional tags:

  • news:genres - properties of the article. Valid values:
    • PressRelease - official press release;
    • Satire - an article that presents its subject in a comic form;
    • Blog - any article that is published on a blog or in a blog format;
    • OpEd - any article expressing personal opinion and posted in the editor's column;
    • Opinion - any article expressing personal opinion and not included in the editor's column. This includes both columnist reviews and interviews;
    • UserGenerated - material created by a user that has undergone official editorial review;
  • news:keywords - keywords on the topic of the article;
  • news:stock_tickers - a list of stock/financial symbols (no more than five, separated by commas). Suitable for articles about business. Each symbol must be preceded by the name of the relevant exchange that matches the Google Finance entry, such as NASDAQ:AMAT or BOM:500325.

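Example sitemap for Google News:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>http://example.ua/news/wow55.html</loc>
    <news:news>
      <news:publication>
        <news:name>News</news:name>
        <news:language>ru</news:language>
      </news:publication>
      <news:genres>Blog</news:genres>
      <news:publication_date>2017-05-10</news:publication_date>
      <news:title>The diet of the average student</news:title>
      <news:keywords>students, food, Mivina, dumplings, revo</news:keywords>
    </news:news>
  </url>
</urlset>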

How to build an XML map for multilingual sites

Sitemap files can be used to pass the rel="alternate" hreflang="x" attribute to Google. Thanks to this, users are shown pages in the right language and with URLs for the right region.

The XHTML namespace should be specified like this:

xmlns:xhtml="http://www.w3.org/1999/xhtml"

You also need to create a separate url element for each address. In turn, each element must include:

  1. The loc tag, which holds the URL;
  2. An xhtml:link rel="alternate" hreflang="XX" subelement for each alternative version of the page, including the current version itself.

For example, the site has a section in Russian, intended for users from all over the world. In addition, there are two versions of this page: in Ukrainian and in English.

The full set of URLs looks like this:

  • example.com/ua/
  • example.com/ru/
  • example.com/en/

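The sitemap fragment shown in the example below tells Google that the page example.com/ru/ has corresponding versions in Ukrainian and English:

<url>
  <loc>http://example.com/ru/</loc>
  <xhtml:link rel="alternate" hreflang="ru" href="http://example.com/ru/"/>
  <xhtml:link rel="alternate" hreflang="uk" href="http://example.com/ua/"/>
  <xhtml:link rel="alternate" hreflang="en" href="http://example.com/en/"/>
</url>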
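
Because every language version must list all the alternatives, including itself, the entries are repetitive and easy to generate. Below is a Python sketch under stated assumptions: the function name is made up, the language codes (ru, uk, en) are an assumption about how the three URLs above would be labelled, and the output is only the url elements, to be placed inside a urlset that declares the xhtml namespace shown earlier.

```python
from xml.sax.saxutils import escape, quoteattr


def hreflang_url_entries(alternates):
    """alternates maps an hreflang code to the URL of that language version.
    Returns one <url> element per version, each listing every alternative."""
    blocks = []
    for url in alternates.values():
        lines = ["  <url>", f"    <loc>{escape(url)}</loc>"]
        for lang, href in alternates.items():
            # quoteattr adds the surrounding quotes and escapes the value.
            lines.append(
                f"    <xhtml:link rel=\"alternate\" hreflang={quoteattr(lang)} href={quoteattr(href)}/>"
            )
        lines.append("  </url>")
        blocks.append("\n".join(lines))
    return "\n".join(blocks)


if __name__ == "__main__":
    print(hreflang_url_entries({
        "ru": "http://example.com/ru/",
        "uk": "http://example.com/ua/",
        "en": "http://example.com/en/",
    }))
```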

Yandex supports two sitemap file formats:

  • XML (recommended);
  • text file.

Requirements for Yandex sitemaps:

  • the uncompressed size should not exceed 10 MB;
  • Yandex recognizes Punycode both in encoded form and in the original.

Key limits:

  • up to 50,000 links to sitemap files;
  • total size up to 50 MB (uncompressed).

Formats that Google supports as a sitemap:

  • XML - standard file;
  • RSS, media RSS and Atom 1.0 - suitable for blogs with an RSS or Atom feed;
  • Google Sites. If your site is created and verified using Google Sites, a sitemap file is created automatically. You can't change it, but you can send it to Google to get reporting information. If there are more than 1000 pages in one subdirectory, the sitemap may not display correctly.
  • Text file (.txt).

Basic requirements for text files:

  • UTF-8 encoding;
  • the file should not contain anything other than a list of URLs;
  • the text file can be given any name, but only with the extension .txt (for example, sitemap.txt).

How to embed an XML map

  1. The XML sitemap file should be placed in the root directory of the site: http://<site address>/sitemap.xml.
  2. If there are several sitemaps, create a sitemap index that lists links to all the XML files.

How to find errors in XML maps

How to analyze a sitemap in Yandex.Webmaster

In Yandex.Webmaster, to work with XML maps, go to “Indexing” - “Sitemap files”.

Separately, the Tools section has a “Sitemap File Analyzer”, where you can paste text, enter a URL or attach the file itself for verification. The check reports the file type and size, the number of links and any errors.

In the Google webmaster panel, the “Crawling” section has a “Sitemap files” item.

Here you can:

  • add or check sitemap files;
  • track the number of pages of various types submitted and indexed;
  • see errors and problems in sitemaps;
  • resubmit XML maps or delete them.

Conclusions

An XML sitemap helps search robots find all your pages. It contains the URLs of pages on the site, as well as data related to them, such as when they were last updated, how often they are updated, and their importance relative to other pages on the site. Separate maps can be created for images and videos, and special XML markup can be used for Google News.

There is no need to create a map manually: use free generators or specialized programs. You can check maps for errors in the Yandex and Google webmaster panels.

Do you have any questions? I will be happy to answer in the comments.

A sitemap is a page that search robots need. Some will say that it is not needed, because all sections are already displayed. However, such a page is needed once the site contains fifty pages or more. For search engines and users, it serves as a guide to where this or that information is located.

XML and HTML files

Since a sitemap is used not only by search robots but also by users visiting the site, two maps are usually compiled: one in XML format and one in HTML.

An XML file is used to create a Sitemap for robots. Thanks to it, robots add new pages to their search database. Without a map, a large number of pages on a multi-page site may stay unindexed, sometimes for a very long time.

An HTML file is used to create a sitemap for users. The importance of this map lies in the fact that its convenience directly determines whether the user will find the information he is interested in. Therefore, such a map is created for sites whose sections and subsections do not all fit in the main menu.

How to create a Sitemap XML

There are three ways to solve this problem:

    Buying a sitemap generator.

    Creating a Sitemap using online services.

    Writing the file manually.

Generators are offered as a way to save a lot of time. If twenty to thirty dollars for a license is a small expense for a webmaster, buying one won’t hurt, especially for a large site, since the sitemap will then not need to be created manually.

For a site containing several hundred pages, online services are recommended: to create a Sitemap, you only need to enter the site address and download the result.

The best option is to create the map manually. To do this, you need to know the tags url, urlset, loc, lastmod, changefreq and priority. The first three are mandatory; the last three are optional.

Creating a Sitemap in Joomla

Joomla and WordPress, like most well-known content management systems, have special add-ons for creating a Sitemap, either manually or automatically. For large sites that update their content constantly, such an add-on is very convenient.

In Joomla it is called Xmap, in WordPress - Google XML Sitemaps.

Automatic sitemap creation

Free online services help to create a Sitemap automatically if the site has no more than five hundred pages. Here's how easy it is to generate a sitemap:

    On one of these services, find the “Generate Sitemap” item and click the “Create” button to have the Sitemap file created automatically.

    Find “Site URL” and enter the address of the site for which the map is being created.

    The system may ask for a verification code. Enter it and click “Start”.

    Upload the finished map to the website.

Manual way to create a map

This method is the most laborious and time-consuming, but it is also the most reliable, and it is used where the other options are not suitable. For example, if the site has many pages that do not need to be in the sitemap but would end up there automatically, the manual method spares the map an “overdose” of such pages. Another reason for choosing this method is poor site navigation.

To create a map manually:

    Collect the pages to include in the map.

    In an Excel file, put all the addresses in the third column.

    Put the opening <url> and <loc> tags in the 1st and 2nd columns.

    Put the closing </loc> and </url> tags in the 4th and 5th columns.

    Use the concatenation function (CONCATENATE) to join the five columns.

    Create a sitemap.xml file.

    Add the opening <urlset> and closing </urlset> tags to this file.

    Insert the joined column between them.
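
The same five-column trick can be done without a spreadsheet. Here is a minimal Python sketch of the steps above (wrap each address in url and loc tags, then wrap the rows in urlset); the function name is an assumption, and the escaping step is an addition that the spreadsheet method does not perform for you.

```python
from xml.sax.saxutils import escape


def rows_to_sitemap(addresses):
    """Join opening tags, address and closing tags for each row,
    then place the rows between the urlset tags."""
    rows = ["<url><loc>" + escape(address) + "</loc></url>" for address in addresses]
    return "\n".join([
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
        *rows,
        "</urlset>",
    ])


if __name__ == "__main__":
    print(rows_to_sitemap([
        "http://example.com/",
        "http://example.com/catalog?page=2&sort=price",
    ]))
```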

The resulting file must be checked. This can be done, for example, in Yandex, in the webmaster panel.

How to create a Sitemap for Yandex and Google

After the sitemap is created, it is added to the site. The file should be named sitemap.xml and placed in the root directory. To help search robots find it as quickly as possible, Google and Yandex have special tools. They are called “Webmaster Tools” (in Google) and “Yandex Webmaster” (in Yandex).

Adding a Sitemap to Google

Adding a Sitemap to Yandex

Likewise, you must first log in to Yandex Webmaster. Then go to Indexing/Sitemap files, specify the file path there and click the “Add” button.

    Search robots currently accept only files that contain no more than fifty thousand URLs.

    If the map exceeds ten megabytes, it is better to split it into several files, so that the server is not overloaded.

    If there are several files, list them all in an index file, using the sitemapindex, sitemap, loc and lastmod tags.

    All pages must be written consistently, either with or without the “www” prefix.

    The required file encoding is UTF-8.

    The file must also declare the XML namespace (the xmlns attribute).
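
The fifty-thousand-URL limit is the easiest of these requirements to automate. A short Python sketch follows; both function names and the sitemapN.xml naming scheme are assumptions for the example, and the size limit in megabytes is not checked here.

```python
def split_urls(urls, max_urls=50_000):
    """Split a list of URLs into chunks of at most max_urls each."""
    urls = list(urls)
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def sitemap_file_urls(chunk_count, base_url):
    """Addresses of the numbered sitemap files to list in the index file."""
    return [f"{base_url}/sitemap{n}.xml" for n in range(1, chunk_count + 1)]


if __name__ == "__main__":
    pages = [f"http://example.com/page{i}.html" for i in range(120_001)]
    chunks = split_urls(pages)
    print([len(chunk) for chunk in chunks])
    print(sitemap_file_urls(len(chunks), "http://example.com"))
```

Each chunk becomes one sitemap file, and the returned addresses are what the index file would list.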

How to create a sitemap for users

Since such a map is created for users, it should be as simple and clear as possible. At the same time, it must accurately convey the whole structure of the site.

HTML maps generally have a structure familiar to users, consisting of sections and subsections highlighted in some way, e.g. with CSS styles and graphic elements.

For a large site, splitting is recommended here too, as with an XML map. In this case it takes the form of separate tabs, which keeps the map from becoming bulky.

JavaScript can improve the functionality of the page; it may be used in this map, since the map is created not for search engine robots but for users.

Order for a sitemap file

It is advisable to keep the Sitemap file clean and tidy at all times, especially if the site has a large number of pages. Since search engine robots scan sitemaps very quickly, they may simply not have enough time to view the entire file of a large site.

Therefore, if you get used to adding pages to the sitemap at the top rather than at the bottom, then, on the one hand, the search robot is sure to have time to view the addresses of new pages, and on the other hand, it becomes much easier to keep track of all pages.







2024 gtavrl.ru.