What to do if a document opens in the wrong encoding, and how to change it. How to solve encoding problems in Windows and MS Office


I have been asked several times to write about how to change the encoding of a site. The task consists of several parts, so it cannot be explained in a nutshell. So I decided to write this article, which describes clearly what is required to change the encoding of a site.

Let's convert a site to UTF-8 together. If you want to convert to any other encoding, the steps are similar. Here is the procedure:

  1. Re-encode all text files (html, php, js, txt, and in general anything that contains text) into UTF-8. This is easy to do in Notepad++ with the "Convert to UTF-8 without BOM" item in the "Encoding" menu. Convert every file, even those that do not output anything to the page.
  2. Place a .htaccess file in the root of the site containing the line AddDefaultCharset UTF-8.
  3. Change the encoding in the meta tag to UTF-8.
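
For a site with many files, step 1 can also be scripted instead of done by hand in Notepad++. The following is a minimal sketch, assuming the files are currently in Windows-1251; the names SOURCE_ENCODING, TEXT_SUFFIXES, convert_file_to_utf8 and convert_site are illustrative and not part of any tool mentioned above. Back up the site before running anything like this.

```python
from pathlib import Path

# Assumption: files that are not yet UTF-8 are in Windows-1251.
SOURCE_ENCODING = "cp1251"
# Assumption: these extensions cover the text files of the site.
TEXT_SUFFIXES = {".html", ".php", ".js", ".txt", ".css"}


def convert_file_to_utf8(path: Path, source_encoding: str = SOURCE_ENCODING) -> bool:
    """Re-encode one file to UTF-8 without BOM. Returns True if it was rewritten."""
    raw = path.read_bytes()
    try:
        # Bytes that already decode as UTF-8 (including plain ASCII) are left
        # untouched, so running the script twice does not convert a file twice.
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        pass
    text = raw.decode(source_encoding)
    # Python's "utf-8" codec writes no BOM ("utf-8-sig" would add one).
    path.write_bytes(text.encode("utf-8"))
    return True


def convert_site(root: Path) -> list[Path]:
    """Convert every text file under root and return the files that changed."""
    converted = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in TEXT_SUFFIXES:
            if convert_file_to_utf8(path):
                converted.append(path)
    return converted
```

Calling convert_site(Path("path/to/site")) returns the list of files that were rewritten, which is worth reviewing before uploading anything.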

If your site does not use a database, then at this stage you can finish changing the encoding. But if there is a database, then you also need to take the following steps:

  1. Immediately after connecting to the database, run the query: SET NAMES utf8 (MySQL calls this encoding utf8; the spelling UTF-8 with a hyphen is not accepted).
  2. In phpMyAdmin, change the database collation to utf8_general_ci in its settings.
  3. Set the collation of all tables to utf8_general_ci.
  4. Set the same collation, utf8_general_ci, for all text fields.
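
The same database changes can be expressed as SQL statements rather than clicks in phpMyAdmin. Below is a small sketch that only builds the statements as strings, so they can be reviewed before being run; the function name utf8_migration_sql and the database and table names in the usage note are hypothetical. It assumes a MySQL database and a backup taken beforehand.

```python
def utf8_migration_sql(database: str, tables: list[str]) -> list[str]:
    """Build MySQL statements that switch a database and its tables to utf8."""
    charset, collation = "utf8", "utf8_general_ci"
    statements = [
        # Sets the default for tables created in this database later on.
        f"ALTER DATABASE `{database}` CHARACTER SET {charset} COLLATE {collation};"
    ]
    for table in tables:
        # CONVERT TO also changes the character set of the table's text columns.
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements
```

For example, utf8_migration_sql("shop", ["users", "orders"]) yields one ALTER DATABASE statement followed by one ALTER TABLE statement per table.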

That's it: the encoding of your site has been changed. I will only add that if somewhere in the code you converted from one encoding to another (for example, with the iconv() function), check that place carefully, because it may now cause problems.
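
To see why a leftover conversion step is dangerous, here is a small Python sketch (not PHP, and not the site's actual code) of what happens when text that is already UTF-8 is treated as Windows-1251 again. The function names are illustrative.

```python
def misread_utf8_as_cp1251(text: str) -> str:
    """Simulate a leftover conversion that assumes Windows-1251 input."""
    # The text is already stored as UTF-8, but the old code still decodes
    # the bytes as Windows-1251, so every Cyrillic letter becomes two
    # unrelated characters.
    return text.encode("utf-8").decode("cp1251")


def undo_misread(garbled: str) -> str:
    """Reverse the wrong step, which works only if no bytes were lost."""
    return garbled.encode("cp1251").decode("utf-8")


original = "Привет"
garbled = misread_utf8_as_cp1251(original)   # unreadable, twice as long
recovered = undo_misread(garbled)            # readable again
```

Note that the strict decoding used here may raise an error instead of producing garbage, because byte 0x98 is unassigned in Windows-1251 and appears in the UTF-8 form of some Cyrillic letters.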

Files and documents created on a computer always have their own encoding. It often happens that when files are exchanged or downloaded from the Internet, our computer cannot read the encoding in which they were created. The reasons vary: either the program we use to open the file lacks the necessary encoding, or some program components are missing (an additional font package, for example).

Below we will look at how to change the encoding of an unreadable file or document in different programs.

Changing the encoding on a browser page

For Google Chrome

  1. Open the menu “Settings” → “Tools”.
  2. Hover over the “Encoding” line; a list of available encodings appears.
  3. Select “Windows 1251” for Russian sites. If that does not help, try “Automatic”.

For Opera

  1. Click “Opera” → “Settings”.
  2. In the left menu, select “Websites”, then in the second section, “Display”, click “Customize fonts”.
  3. In the “Encoding” field, select “Cyrillic (Windows 1251)”.

For Firefox

  1. Firefox → Settings → Content.
  2. Opposite the “Default font” line, click the “Advanced” button.
  3. At the bottom of the window, select “Encoding” → “Cyrillic (Windows 1251)”.

Changing encoding in Word

Let's look at the procedure for changing the encoding using Word 2010 as an example.

  1. Open the document.
  2. “File” tab → “Options”.
  3. Select the “Advanced” line. In the “General” section, opposite the line “Confirm file format conversion when opening”, check the box. Click OK.
  4. Next, the “File Conversion” window will open. Select “Encoded Text” and click OK.
  5. In the window that opens, select “Other” and choose from the list the encoding that displays the text correctly. The “Sample” field shows how the text looks in the encoding currently selected.

If the above procedure did not help display the document, you can try changing the font. Sometimes a document may appear as “squares” or other symbols if the program does not have the appropriate font.

Changing encoding in Excel

Let's look at the procedure for changing the encoding for Excel 97-2003 and 2007:

  1. Open an unreadable document using Notepad++.
  2. Select the menu Encoding → Convert to UTF-8.
  3. The characters will not change, only the encoding at the bottom of the screen will change. Next, select a character set. If it is Russian: Encoding → Character sets → Cyrillic → Windows-1251.
  4. Click "Save". Open the file in Excel. If the text is not readable, try repeating steps 3-4.
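
The Notepad++ steps above amount to re-saving the file in an encoding that Excel reads correctly. A minimal Python sketch of the same idea follows; the function name resave and the file names in the usage note are assumptions. Besides Windows-1251, the target "utf-8-sig" (UTF-8 with a byte order mark) is worth trying, since Excel generally recognizes the mark when opening CSV files.

```python
from pathlib import Path


def resave(src: Path, dst: Path, source_encoding: str, target_encoding: str = "cp1251") -> None:
    """Read src in source_encoding and write the same text to dst in target_encoding."""
    text = src.read_bytes().decode(source_encoding)
    # Encoding is strict: a character missing from the target encoding raises
    # UnicodeEncodeError instead of being silently replaced.
    dst.write_bytes(text.encode(target_encoding))
```

For example, resave(Path("report.csv"), Path("report_1251.csv"), "utf-8") writes a Windows-1251 copy and leaves the original file untouched.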

Changing text encoding

  1. Open the file in the standard text editor Notepad.
  2. Click “Save As”.
  3. In the save dialog, choose where to save the file, set the document type to text, and select a different encoding.
  4. Save.
  5. Try opening the document again.

How does a computer perceive, separate and recognize so many different characters? All the symbols we use are stored as numbers. In other words, each letter and every other sign has its own numeric code. This makes it much easier and faster for the computer to process information. But there are many languages in the world, and a single-byte encoding has room for only 256 characters. That's why there are different encodings.
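
The 256-character limit is easy to observe in practice. The sketch below uses Python's built-in codecs; can_encode is an illustrative helper, not a standard function.

```python
def can_encode(char: str, encoding: str) -> bool:
    """Return True if the character has a code in the given encoding."""
    try:
        char.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Every character is a number: ord() returns its Unicode code.
code_of_a = ord("A")

# In a single-byte encoding a letter occupies exactly one byte (0..255),
# so the table only has room for some alphabets.
cyrillic_fits_1251 = can_encode("Я", "cp1251")
accent_fits_1251 = can_encode("é", "cp1251")
accent_fits_1252 = can_encode("é", "cp1252")
```

Windows-1251 has a slot for the Cyrillic letter but none for the accented Latin one, while Windows-1252 is the other way round.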

An encoding is a rule that matches characters to the numbers used to store them. If we see a set of incomprehensible letters on the screen, the encoding was chosen incorrectly: the same 256 numbers are being shown as the characters that a different encoding, for a different language, assigns to them. When this happens, the program opening the file may offer to switch to another encoding it supports. Typically, the default encoding is determined automatically by the language selected on the computer.

Changing the encoding in the browser if a web page is displayed incorrectly

Sometimes incomprehensible letters appear on web pages as well, for example when using search engines. Just as in a document, we can change the page encoding. To do this, open the “View” menu in Internet Explorer, click “Encoding”, then select “Advanced”, and a list of available encodings appears. Click the encoding you need. Internet Explorer is configured to use six encodings: Windows-1251 and UTF-8 (the most commonly used), ISO-8859-5, KOI8-U, KOI8-R and Mac.

* Changing the encoding, using the Mozilla Firefox browser as an example

Developers of websites and other Internet resources rely on this and use the same encodings. The keyboard language affects only the language in which you type into the search box, not the results that the search engine returns. By the way, Windows-1251 is used to encode pages in Russian. This is the main type of encoding for Russian-language sites. For other languages, the number at the end of the name changes. For example, Windows-1252 covers English and other Western European languages, and Windows-1250 covers Central European languages.
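
The difference between the Windows-125x code pages can be seen by decoding one and the same byte with each of them. A short sketch using Python's built-in codecs (decode_with_each is an illustrative name):

```python
def decode_with_each(data: bytes, encodings: list[str]) -> dict[str, str]:
    """Decode the same bytes with several encodings and collect the results."""
    return {name: data.decode(name) for name in encodings}


# One byte, three code pages, three different letters.
readings = decode_with_each(b"\xe0", ["cp1250", "cp1251", "cp1252"])
```

Only the Windows-1251 reading is a Cyrillic letter; the other two are accented Latin letters, which is exactly what a Russian page looks like when opened in the wrong code page.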

Changing the encoding on the site

* How to change (or rather convert) the encoding of an individual website page in the Notepad++ editor

The problem becomes much more serious if the encoding of the entire site is incorrect. Two encodings are the most popular in Russia. The first is Unicode, which websites normally use in the form of UTF-8. Unicode has several encoding forms: UTF-8, UTF-16 and UTF-32, of which UTF-8 is the most widely used on the web. It covers the characters of a huge variety of languages. The second is Windows-1251, a single-byte encoding that covers the Cyrillic alphabets most used by Russian speakers and residents of the CIS countries.
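
The practical difference between these encodings is how many bytes the same text takes. A quick sketch (encoded_sizes is an illustrative helper; the "-le" codec variants are used so that no byte order mark is counted):

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the number of bytes the text occupies in several encodings."""
    names = ("cp1251", "utf-8", "utf-16-le", "utf-32-le")
    return {name: len(text.encode(name)) for name in names}


sizes_russian = encoded_sizes("Привет")
sizes_english = encoded_sizes("Hello")
```

A six-letter Russian word takes 6 bytes in Windows-1251, 12 in UTF-8 and UTF-16, and 24 in UTF-32, while English text in UTF-8 stays at one byte per letter.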

Many experienced computer users believe that Windows-1251 is practically obsolete and will soon fade into the background. A large-scale transition from one encoding to the other is already noticeable, although it is happening gradually. Evidence of this is the use of UTF-8 abroad and by major Russian Internet resources.

Let's say you decide to convert a site from Windows-1251 to Unicode. This takes several operations. First, specify the encoding in the settings: in the “Management” tab, select the “Web server settings” section and replace the previous encoding with the new one, in this case utf-8. That completes this simple step, and the site can now be used with the new encoding. In the browser menu (as described in the section on changing the encoding in the browser above), choose “Select automatically” in the “Encoding” section, so that all pages and sites open in the encoding their settings specify.

Then you need to correct the entry in the meta tags. This is not difficult: just before the closing bracket of the tag, remove Windows-1251 and enter utf-8 instead. To complete the recoding, this must be done on every page of the site; otherwise some pages will still display meaningless text. How long the recoding takes depends on the number of pages, that is, on the amount of information on the site. Still, it is better to avoid having to spend time on this at all.

* Underlined in red is the meta tag that is responsible for the encoding of the site.
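
Since the meta tag has to be corrected on every page, the edit can be scripted. Here is a minimal sketch using a regular expression; the names CHARSET_RE and declare_utf8 are illustrative. It changes only the declaration, so the bytes of the file itself still have to be converted to UTF-8 separately.

```python
import re

# Matches "charset=windows-1251" with optional spaces and an optional quote,
# in either the http-equiv form or the short <meta charset="..."> form.
CHARSET_RE = re.compile(r"(charset\s*=\s*[\"']?)windows-1251", re.IGNORECASE)


def declare_utf8(html: str) -> str:
    """Replace a Windows-1251 charset declaration with utf-8."""
    return CHARSET_RE.sub(r"\g<1>utf-8", html)
```

Pages that declare a different encoding, or none at all, pass through unchanged.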

To avoid the difficulties of changing the encoding of an existing Internet resource, choose the correct encoding before creating the site. If unreadable text appears when a page opens, the visitor will not want to waste time on such a site, let alone decipher the page manually.

And then the site will lose its visitors, which threatens its further existence. Competitors will not wait until the site owner fixes all the problems; they will take advantage of the situation. It will be difficult to win back the lost audience later. That is why website creation should be approached responsibly: you will spend less time setting the site up properly than correcting it afterwards.

In this article we will talk about how to change the encoding of a website, what encodings exist, and which encoding is the best choice.

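The site encoding is set using a meta tag. We have already discussed what meta tags are and why they are needed in a separate article. The page encoding is set as follows (shown here for UTF-8):

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

This line is placed between the <head> and </head> tags.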

Note: In addition to specifying the required encoding, it is recommended to specify the language of the content on the page, to help search engines correctly determine the language of the site.
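
To check which encoding a page actually declares in its meta tag, the HTML can be parsed with Python's standard html.parser module. This is a sketch only; the class CharsetFinder and the function declared_charset are illustrative names, and the sketch looks at meta tags alone, not at HTTP headers.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Remember the first charset declared in a meta tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # Short form: <meta charset="utf-8">
            self.charset = attributes["charset"].lower()
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # Long form: content="text/html; charset=utf-8"
            for part in (attributes.get("content") or "").split(";"):
                key, _, value = part.strip().partition("=")
                if key.lower() == "charset" and value:
                    self.charset = value.strip().lower()


def declared_charset(html: str):
    """Return the declared charset in lower case, or None if there is none."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset
```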

Main types of encodings on the site

Since we are targeting a Russian-speaking audience, we will talk about the most popular encodings that support the Russian language. These include:

  • UTF-8 (Unicode) - currently the most popular encoding for websites (variable length, built from 8-bit units);
  • Windows-1251 - one of the most common encodings (8 bits);
  • KOI8-R - the standard for Cyrillic in Unix-like systems (8 bits).

Unicode is a character encoding standard that can represent the characters of almost all written languages (as well as mathematical, musical and other symbols). Unicode is implemented in the formats UTF-8, UTF-16 and UTF-32, which differ in how they store data. For the best compatibility with older systems, the 8-bit form, UTF-8, is used.
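
The compatibility argument can be checked directly: text that contains only ASCII characters has exactly the same bytes in UTF-8 as in ASCII, which is not true of the other two formats. A small sketch (same_bytes_as_ascii is an illustrative name):

```python
def same_bytes_as_ascii(text: str, encoding: str) -> bool:
    """True if ASCII-only text has the same bytes in `encoding` as in ASCII."""
    return text.encode(encoding) == text.encode("ascii")


utf8_compatible = same_bytes_as_ascii("Hello", "utf-8")
utf16_compatible = same_bytes_as_ascii("Hello", "utf-16-le")
utf32_compatible = same_bytes_as_ascii("Hello", "utf-32-le")
```

Only the UTF-8 check succeeds, so older software that expects ASCII can still read the English parts of a UTF-8 file.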

Changing text encoding using Notepad

To change the encoding of arbitrary text, you can use the ordinary Notepad. Let's say you need to change the text encoding from KOI8 to Windows-1251. To do this:

  • Copy the text into the standard Notepad editor;
  • In the menu "View" -> "Encoding" select "Cyrillic (Windows)".

When saving a file using Notepad, you can select the desired encoding.
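
The KOI8 to Windows-1251 conversion from this example takes one line in Python: decode the bytes with the old encoding, then encode the text with the new one. The sketch assumes the KOI8-R variant; transcode is an illustrative name.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert raw bytes from one encoding to another."""
    return data.decode(source).encode(target)


koi8_bytes = "Привет".encode("koi8_r")
cp1251_bytes = transcode(koi8_bytes, "koi8_r", "cp1251")
```

The text stays the same; only the bytes that represent it change.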

In the Nubex website builder, all websites are created in UTF encoding, which allows different characters and languages to be displayed correctly.

The character set used to turn the stored bytes of a document into the text we see on the screen is called an encoding. When it is set incorrectly, instead of clear, familiar letters and numbers you will see incoherent symbols. This problem was common in the early days of computing, but modern word processors can select a suitable character set automatically. The emergence and spread of UTF-8, an encoding of Unicode that covers a great many characters, including Russian ones, also played a role. Documents in this encoding usually need no changes or configuration, since they display the text correctly by default.

Modern text editors detect the encoding when opening a document

On the other hand, this situation still occurs occasionally, and receiving an unreadable document is very annoying, especially if it is important. For such cases, Microsoft Word lets you specify the encoding of the text, which returns it to readable form.

Forced change

If you have received a text file from some source but cannot read its contents, you need to change the encoding manually. To do this, go to the “Options” section in the “File” tab. The global recognition and display settings are collected here; if you change them while a document is open, they apply to that document only and do not change for others. Let's take advantage of this. In the “Advanced” section of the window that appears, find the “General” heading and check the “Confirm file format conversion when opening” checkbox. Confirm your changes and close Word. Now open the document again so that the setting takes effect, and the file conversion window will appear. It lists the possible formats; choose “Encoded Text”, and the following dialog opens.

This new window has three radio buttons. The first, selected by default, is CP-1251, the Windows encoding. The second is MS-DOS. We need the third, manual selection, to the right of which the available character sets are listed. As a rule, the user does not know which character set the author used, so at the bottom of the window there is a “Sample” field that shows a fragment of the text in real time as you select each character set. This is very convenient, because you do not have to close and reopen the document each time to find the right one.

Go through the options one by one, watching the text in the “Sample” field, and select the encoding in which the characters are Russian. But note that Russian letters alone are not enough: check carefully that they form meaningful words. There is more than one encoding for the Russian language, and text written in one of them will not display correctly in another. So be careful.
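
The same trial-and-error approach works outside Word. The sketch below decodes the beginning of a file with several candidate encodings so that the samples can be compared by eye; the CANDIDATES list and the preview function are assumptions made for illustration, not an exhaustive list of Russian encodings.

```python
# Assumption: encodings commonly used for Russian text.
CANDIDATES = ["cp1251", "koi8_r", "cp866", "iso8859_5", "utf-8"]


def preview(data: bytes, encodings=CANDIDATES, length: int = 40) -> dict[str, str]:
    """Decode the same bytes with each candidate and return short samples."""
    samples = {}
    for name in encodings:
        # errors="replace" keeps the preview going when some bytes are
        # invalid in a candidate encoding.
        samples[name] = data.decode(name, errors="replace")[:length]
    return samples


unknown = "Привет, мир".encode("koi8_r")
samples = preview(unknown)
```

Several samples may consist of Cyrillic letters, but only one of them forms real words, which is the check described above.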

It must be said that such problems rarely arise with files made in modern word processors. However, the modern information society has another scourge: format incompatibility. There are a number of text editors, and different people use different ones. Some people do not need the functionality of Word, some do not consider it necessary to pay for it, and so on. There may be many reasons.

If, when saving the document, the author chose a format that is compatible with MS Word, there should be no problems. But that doesn't happen often. For example, if the text is saved with the .rtf extension, the encoding selection dialog will appear as soon as you open the text. Older versions of Word will not even open the native formats of another popular word processor, OpenOffice, so if you use it, don't forget to choose “Save As” and pick a Word-compatible format when sending the file to an Office user.

Saving with encoding specified

Sometimes the user needs to save a document in a specific encoding, for example because the recipient of the document requires it. In this case, save the document as plain text through the "File" menu. For its own formats, Word uses encodings tied to the global system settings, but for “Plain Text” there is no such link. Therefore Word will ask you to choose the encoding yourself, showing the file conversion window that is already familiar to us. Choose the encoding you need, save, and you can send or transfer the document. Naturally, the recipient will need to select the same encoding in their text editor in order to read your text.

Conclusion

Ordinary users rarely need to change the encoding of Word documents. As a rule, the word processor automatically determines the character set required and displays the text in readable form. But there are exceptions to any rule, so it is useful to be able to do it yourself. Fortunately, the process in Word is quite simple.

What we've covered is also valid for other programs in the Office suite. They may also have problems caused by, say, incompatible file formats. The user will have to perform the same steps there, so this article can help not only those who work in Word. Consistent settings across all programs in the Microsoft Office suite help you avoid confusion when working with any type of document, be it text, tables or presentations.

Finally, it must be said that the encoding is not always to blame. Perhaps everything is much simpler. Many users, in pursuit of “pretty things,” forget about standardization. If such an author picks a font installed on his own computer, types a document in it and saves it, the text will display correctly for him. But when the document reaches a person who does not have that font installed, an unreadable set of characters will appear on the screen. This looks very much like a “lost” encoding, so it is easy to make a mistake. So before you try to decode text in Word, first try simply changing the font.







2024 gtavrl.ru.