Hieroglyphs instead of file names and text: what to do. Selecting text encoding when opening and saving files


This was the first time I had seen anything like it: the files and folders on the flash drive had disappeared, and in their place were files with incomprehensible gibberish names ("kryakozyabry" in Russian slang); let's call them hieroglyphs.

The flash drive was opened using standard Windows tools and, unfortunately, this did not give positive results.

All files on the flash drive are gone, except one. Several files appeared with strange names: &, t, n-&, etc.

The files on the flash drive have disappeared, but Windows shows that the space is still in use. This suggests that although the files we are interested in are not displayed, they are still on the flash drive.

Although the files have disappeared, the space is occupied. In this particular case, 817 MB are occupied.
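The same comparison, used space on the volume versus the total size of the files that are actually visible, can be made with a short script. Below is a minimal Python sketch; the function names are invented for illustration, and the drive letter in the usage note is a hypothetical example.

```python
import os
import shutil


def visible_bytes(root):
    """Sum the sizes of all files that can be reached by walking `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # Entries with damaged names may be unreadable; skip them.
                pass
    return total


def unaccounted_bytes(root):
    """Used space on the volume minus the space taken by visible files."""
    used = shutil.disk_usage(root).used
    return used - visible_bytes(root)
```

Calling `unaccounted_bytes("E:\\")` on the flash drive (assuming it is mounted as E:) would be expected to return a large value in a situation like this one. A small gap is normal, because the file system itself takes up some space.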

My first thought was that a virus was to blame. In earlier virus cases I had used the FAR Manager file manager, which as a rule sees all files (hidden and system). This time, however, FAR Manager saw only what the standard Windows Explorer did...

Even FAR Manager could not see the “lost” files

Since Windows does not see the missing files at all, there was no point in trying the usual trick of changing file attributes from the command line with attrib -S -H /S /D.

What will Linux see?

In this situation, as an experiment, I decided to try a Linux-based operating system. In this particular case, a disc with Ubuntu 10.04.3 was used.

Important! There is no need to install Ubuntu on your computer - just boot from a CD, just like you do with .

After booting Ubuntu, the desktop will appear and you can work with folders and files in exactly the same way as in Windows.

As expected, Ubuntu saw more files compared to Windows.

Ubuntu also displays those files that were not visible from Windows

Next, to avoid fiddling with file attributes, I did the simplest thing: selected all the displayed files and copied them to the local drive “D” (of course, you can also copy the files to the system drive “C”).

Now you can boot Windows again and check what happened.

Now Windows sees several Word files. Please note that file names are also displayed correctly

Unfortunately, the problem is not solved, since there were clearly more files on the flash drive (judging by the volume of 817 MB) than we were able to extract. For this reason, let's try to check the flash drive for errors.

Troubleshooting flash drive errors

To find and fix errors on disks, Windows has a standard utility.

Step 1. Right-click the flash drive icon and select “Properties”.

Step 2. Go to the “Tools” tab and click the “Check now” button.

Step 3. Click the “Start” button.

After the file system errors have been checked and corrected, a corresponding message will appear.

Message: "Some errors have been found and fixed"

After eliminating the errors, the files with hieroglyphs disappeared, and a hidden folder named FOUND.000 appeared in the root directory of the flash drive.

Inside the FOUND.000 folder there were 264 files with the CHK extension. CHK files can hold fragments of files of various types recovered from hard drives and flash drives by the ScanDisk or CHKDSK utilities.

If all the files on the flash drive were of the same type, for example Word documents with the docx extension, then in the Total Commander file manager you can select all the files and press Ctrl + M (Files - Multi-Rename Tool), then specify which extension to look for and what to change it to.
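The same bulk rename can be scripted. The following is a minimal Python sketch, not a description of what Total Commander does internally; the function name and default extensions are assumptions chosen for illustration, and it only makes sense when every CHK file really is of one type.

```python
from pathlib import Path


def rename_extensions(folder, old_ext=".chk", new_ext=".docx"):
    """Rename every file in `folder` whose extension matches `old_ext`.

    Returns the list of new file names, in sorted order.
    """
    renamed = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() == old_ext.lower():
            target = path.with_suffix(new_ext)
            if target.exists():
                continue  # never overwrite an existing file
            path.rename(target)
            renamed.append(target.name)
    return renamed
```

It is safer to run this on a copy of the FOUND.000 folder rather than on the flash drive itself.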

In this particular case, I only knew that the flash drive contained Word documents and PowerPoint presentations. Changing extensions at random is impractical, so it is better to use specialized programs that determine for themselves what type of data is stored in each file. One such program is a free utility that does not require installation on your computer.

Specify the source folder (I had copied the CHK files to my hard drive). Next, I chose the option that places files with different extensions in different folders.

All you have to do is click “Start”

When the utility finished, three folders appeared:

  1. DOC - with Word documents;
  2. JPG - with pictures;
  3. ZIP - with archives.

The contents of eight files remained unrecognized. However, the main task was accomplished: the Word documents and photographs were restored.

The downside is that the original file names could not be restored, so the Word documents will have to be renamed by hand. As for the pictures, names such as FILE0001.jpg, FILE0002.jpg, etc. will do.
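Utilities of this kind typically recognize a file by the "magic" bytes at its start. How the utility used here works internally is not known, so the following Python sketch is only an illustration of the general idea; the function name and the short signature table are assumptions, covering just the three types seen above.

```python
# Well-known leading byte signatures for the three file types found above.
SIGNATURES = [
    (b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", "doc"),  # OLE2 container (old Word, PowerPoint, Excel)
    (b"\xFF\xD8\xFF", "jpg"),                      # JPEG image
    (b"PK\x03\x04", "zip"),                        # ZIP archive
]


def guess_extension(data):
    """Return a likely extension for a file's leading bytes, or None."""
    for magic, ext in SIGNATURES:
        if data.startswith(magic):
            return ext
    return None


def guess_file(path):
    """Read only the first bytes of a file and guess its type."""
    with open(path, "rb") as f:
        return guess_extension(f.read(8))
```

Note that docx and pptx files are themselves ZIP containers, so in this sketch they would be reported as zip; telling them apart would require looking inside the archive.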

Good day.

Probably every PC user has run into this problem: you open a web page or a Microsoft Word document, and instead of text you see hieroglyphs (assorted gibberish, “kryakozyabry” in Russian slang: unfamiliar letters, numbers, and so on).

It’s fine if the document with hieroglyphs is not particularly important to you, but what if you need to read it?! I am asked for help with opening such texts quite often. In this short article I want to look at the most common reasons hieroglyphs appear (and, of course, how to get rid of them).

Hieroglyphs in text files (.txt)

This is the most common problem. A text file (usually in txt format, but there are other formats too: php, css, info, etc.) can be saved in different encodings.

An encoding is a set of characters needed to write text in a specific alphabet (including digits and special characters), together with the numeric codes assigned to them. More details here: https://ru.wikipedia.org/wiki/Character_set

Most often one thing happens: the document is simply opened in the wrong encoding, so the codes stored in the file are mapped to different characters than the author intended, and assorted strange symbols appear on the screen (see Fig. 1).

Fig. 1. Notepad - encoding problem
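The effect is easy to reproduce. Here is a minimal Python sketch that saves a Russian phrase in one encoding and reads it back in another; the function name `misread` and the sample phrase are made up for illustration.

```python
def misread(text, saved_as, opened_as):
    """Encode `text` with one encoding and decode the bytes with another."""
    raw = text.encode(saved_as)    # the bytes as they sit on disk
    return raw.decode(opened_as)   # what a program shows if it guesses wrong


sample = "Привет, мир"

# Saved as Windows-1251, opened as Western European (Windows-1252):
garbled = misread(sample, "cp1251", "cp1252")   # "Ïðèâåò, ìèð"

# Opened with the encoding it was saved in, the text is intact:
intact = misread(sample, "cp1251", "cp1251")
```

The bytes on disk never change; only the table used to interpret them does, which is why picking the right encoding in the editor is enough to make the text readable again.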

How to deal with this?

In my opinion, the best option is to install an advanced notepad, such as Notepad++ or Bred 3. Let's take a closer look at each of them.

Notepad++

Official website: https://notepad-plus-plus.org/

One of the best notepads for both beginners and professionals. Pros: it is free, supports Russian, works very quickly, highlights code, opens all common file formats, and has a huge number of options that let you customize it for yourself.

Notepad++ handles encodings well: there is a separate “Encoding” menu (see Fig. 2). Just try switching from ANSI to UTF-8, for example.

After changing the encoding, my text document became normal and readable - the hieroglyphs disappeared (see Fig. 3)!

Fig. 3. The text has become readable... Notepad++

Bred 3

Official website: http://www.astonshell.ru/freeware/bred3/

Another great program designed to completely replace the standard Windows Notepad. It also works with many encodings and switches between them easily, supports a huge number of file formats, and runs on newer versions of Windows (8, 10).

By the way, Bred 3 is very helpful with “old” files saved in MS-DOS encodings. Where other programs show only hieroglyphs, Bred 3 opens them easily and lets you work with them normally (see Fig. 4).
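If an editor is not at hand, an old DOS-era file can also be re-saved in a modern encoding with a few lines of code. The Python sketch below assumes the file was written in code page 866 (the usual MS-DOS Cyrillic code page); that default, the function name and the file names in the usage note are assumptions for illustration.

```python
def convert_file(src, dst, src_encoding="cp866", dst_encoding="utf-8"):
    """Read `src` in one encoding and write the same text to `dst` in another."""
    # newline="" keeps the original line endings untouched in both directions.
    with open(src, "r", encoding=src_encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding=dst_encoding, newline="") as fout:
        fout.write(text)
    return len(text)  # number of characters converted
```

For example, `convert_file("README.DOS", "readme_utf8.txt")` would leave the original untouched and write a UTF-8 copy next to it. If the result is still unreadable, the source encoding was probably something other than cp866.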

If there are hieroglyphs instead of text in Microsoft Word

The very first thing you need to pay attention to is the file format. The fact is that starting with Word 2007, a new format appeared - “docx” (previously it was just “doc”). Usually, new file formats cannot be opened in the “old” Word, but sometimes it happens that these “new” files open in the old program.

Just open the file properties, and then look at the “Details” tab (as in Fig. 5). This way you will find out the file format (in Fig. 5 - the “txt” file format).

If the file format is docx - and you have an old Word (below version 2007) - then simply update Word to 2007 or higher (2010, 2013, 2016).

Next, pay attention when opening the file: Word will ask which encoding to open it in (this option is enabled by default, unless you have some unusual custom build; the prompt appears at the slightest hint of a problem with the file, see Fig. 6).

Fig. 6. Word - file conversion

Most often Word determines the required encoding automatically, but the text is not always readable. In that case, switch the selector to the encoding at which the preview becomes readable. Sometimes you literally have to guess how the file was saved in order to read it.

Fig. 7. Word - the file is normal (the encoding is chosen correctly)!
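The same guessing game can be played outside Word by previewing the file's bytes under several candidate encodings and picking the one that reads correctly. This is a minimal Python sketch; the candidate list is an assumption suited to Russian text, and the function name is made up for illustration.

```python
# Encodings commonly met in Russian-language files (an assumed shortlist).
CANDIDATES = ["utf-8", "cp1251", "cp866", "koi8_r", "cp1252"]


def previews(raw, candidates=CANDIDATES, width=60):
    """Decode `raw` bytes with each candidate encoding.

    Returns a dict mapping encoding name to the first `width` characters,
    or to None when the bytes are not valid in that encoding.
    """
    result = {}
    for name in candidates:
        try:
            result[name] = raw.decode(name)[:width]
        except UnicodeDecodeError:
            result[name] = None
    return result
```

Single-byte encodings such as cp1251 or cp866 accept almost any byte sequence, so a successful decode proves nothing by itself; a human still has to look at the previews, just as with the preview area in Word.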

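Changing the encoding in the browser

When the browser detects the encoding of a web page incorrectly, you will see exactly the same hieroglyphs (see Fig. 8).

Fig. 8. The browser detected the wrong encoding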

To fix the display of the site: change the encoding. This is done in the browser settings:

  1. Google Chrome: options (icon in the upper right corner)/advanced options/encoding/Windows-1251 (or UTF-8);
  2. Firefox: left ALT button (if you have the top panel turned off), then view/page encoding/select the desired one (most often Windows-1251 or UTF-8);
  3. Opera: Opera (red icon in the upper left corner)/page/encoding/select the desired one.

PS

Thus, in this article, the most common cases of the appearance of hieroglyphs associated with an incorrectly defined encoding were analyzed. Using the above methods, you can solve all the main problems with incorrect encoding.

I would be grateful for additions on the topic. Good Luck :)

When you open a text file in Microsoft Word or another program (for example, on a computer whose operating system language differs from the language of the text in the file), the encoding tells the program how the text should be displayed on the screen so that it is readable.


Understanding text encoding

Text that appears as characters on the screen is actually stored in a text file as numeric values. The computer translates the numeric values into visible characters, using an encoding standard to do so.

An encoding is a numbering scheme that assigns a specific numeric value to each text character in a set. An encoding may contain letters, digits and other symbols. Different languages often use different character sets, so many existing encodings are designed to represent the character sets of particular languages.
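To see the numbers behind a piece of text, one can encode it and list the resulting byte values. A minimal Python sketch follows; the function name is made up for illustration.

```python
def numeric_values(text, encoding):
    """Return the list of byte values used to store `text` in `encoding`."""
    return list(text.encode(encoding))


latin = numeric_values("Hi", "ascii")            # [72, 105]
cyrillic = numeric_values("Привет", "cp1251")    # one byte per letter
unicode_bytes = numeric_values("Привет", "utf-8")  # two bytes per Cyrillic letter
```

The same six letters take six bytes in Windows-1251 and twelve in UTF-8, which shows that the stored numbers depend entirely on the encoding chosen.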

Different encodings for different alphabets

The encoding information saved with the text file is used by the computer to display text on the screen. For example, in the "Cyrillic (Windows)" encoding, the character "Й" corresponds to the numeric value 201. When you open a file containing this character on a computer that uses the "Cyrillic (Windows)" encoding, the computer reads the number 201 and displays the character "Й".

However, if the same file is opened on a computer that uses a different encoding by default, the screen will show whatever character corresponds to the number 201 in that encoding. For example, if the computer uses the "Western European (Windows)" encoding, the character "Й" from the Cyrillic source file will be displayed as "É", since that is the character assigned to the number 201 in this encoding.
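The value 201 can be checked directly. In this short Python sketch, `cp1251` and `cp1252` are Python's codec names for the "Cyrillic (Windows)" and "Western European (Windows)" code pages; the function name is made up for illustration.

```python
def char_for(value, encoding):
    """Return the character that a single byte value stands for in `encoding`."""
    return bytes([value]).decode(encoding)


in_cyrillic = char_for(201, "cp1251")   # "Й"
in_western = char_for(201, "cp1252")    # "É"
```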

Unicode: a single encoding for different alphabets

To avoid problems with encoding and decoding text files, you can save them in Unicode. This encoding includes most characters from all languages in common use on modern computers.

Since Word is based on Unicode, all files in it are automatically saved in this encoding. Unicode files can be opened on any computer with an English operating system, regardless of the language of the text. In addition, on such a computer you can save files in Unicode that contain characters that are not in Western European alphabets (for example, Greek, Cyrillic, Arabic or Japanese).

Selecting encoding when opening a file

If the text in the open file is distorted or appears as question marks or squares, Word may have incorrectly detected the encoding. You can specify the encoding to be used for displaying (decoding) text.

    Open the File tab.

    Click Options.

    Click Advanced.

    Go to the General section and select the Confirm file format conversion on open check box.

    Note: When this check box is selected, Word displays the Convert File dialog box every time you open a file in a format other than a Word format (that is, a file that does not have a DOC, DOT, DOCX, DOCM, DOTX, or DOTM extension). If you work with such files frequently but rarely need to choose an encoding, clear this check box so that the dialog box does not keep appearing.

    Close and then reopen the file.

    In the Convert File dialog box, select Encoded Text.

    In the File Conversion dialog box, select Other encoding and choose the desired encoding from the list.

    In the Preview area, check whether the text displays correctly in the selected encoding.

If almost all of the text looks the same (for example, squares or dots), your computer may not have the correct font installed. In this case, you can install additional fonts.

To install additional fonts, do the following:

    Click the Start button and select Control Panel.

    Do one of the following:

    On Windows 7

      In Control Panel, select Uninstall a program.

      In the list of programs, click Microsoft Office, or Microsoft Word if it was installed separately from Microsoft Office, and then click Change.

    On Windows Vista

      In Control Panel, select Uninstall a program.

      In the list of programs, click Microsoft Office, or Microsoft Word if it was installed separately from Microsoft Office, and then click Change.

    On Windows XP

      In Control Panel, click Add or Remove Programs.

      In the Currently installed programs list, click Microsoft Office, or Microsoft Word if it was installed separately from Microsoft Office, and then click Change.

    Under Change your installation of Microsoft Office, click Add or Remove Features, and then click Continue.

    Under Installation Options, expand Office Shared Features, and then expand International Support.

    Select the font you want, click the arrow next to it, and select Run from My Computer.

Tip: When opening a text file in a particular encoding, Word uses the fonts defined in the Web Options dialog box. (To open the Web Options dialog box, click the Microsoft Office Button, click Word Options, and select the Advanced category. Under General, click Web Options.) On the Fonts tab of the Web Options dialog box, you can set the font for each encoding.

Selecting encoding when saving a file

If you do not select an encoding when saving the file, Unicode will be used. In general, Unicode is recommended because it supports most characters in most languages.

If you plan to open the document in a program that does not support Unicode, you can select the desired encoding. For example, on an English operating system, you can create a document in Traditional Chinese using Unicode. However, if such a document will be opened in a program that supports Chinese but does not support Unicode, the file can be saved in the "Chinese Traditional (Big5)" encoding. As a result, the text will display correctly when you open the document in a program that supports Traditional Chinese.

Note: Because Unicode is the most comprehensive standard, some characters may not appear when saving text in other encodings. For example, suppose that a Unicode document contains text in both Hebrew and Cyrillic. If you save the file in the "Cyrillic (Windows)" encoding, the Hebrew text will not be displayed, and if you save it in the "Hebrew (Windows)" encoding, the Cyrillic text will not be displayed.
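The Hebrew and Cyrillic example can be reproduced in a few lines. This Python sketch saves mixed text in a single-language code page and reads it back; characters the code page lacks are replaced with question marks here, which is Python's `errors="replace"` behavior rather than anything specific to Word. The function name is made up for illustration.

```python
def lossy_save(text, encoding):
    """Simulate saving `text` in `encoding` and opening it again."""
    raw = text.encode(encoding, errors="replace")  # unsupported characters become "?"
    return raw.decode(encoding)


# Hebrew "shalom" (written with escapes to avoid right-to-left display issues)
# followed by Russian "Privet".
mixed = "\u05e9\u05dc\u05d5\u05dd Привет"

as_cyrillic = lossy_save(mixed, "cp1251")   # Hebrew lost:   "???? Привет"
as_hebrew = lossy_save(mixed, "cp1255")     # Cyrillic lost: Hebrew word + " ??????"
as_unicode = lossy_save(mixed, "utf-8")     # nothing lost
```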

If you select an encoding standard that doesn't support some characters in the file, Word will mark them in red. You can preview the text in the selected encoding before saving the file.

When you save a file as encoded text, the text for which the Symbol font is selected, as well as the field codes, are removed from the file.

Encoding selection

    Open the File tab and click Save As.

    In the File name box, enter a name for the new file.

    In the Save as type box, select Plain Text.

    Click Save.

    If the Microsoft Office Word Compatibility Checker dialog box appears, click Continue.

    In the File Conversion dialog box, select the appropriate encoding.

      To use the standard encoding, select Windows (Default).

      To use MS-DOS encoding, select MS-DOS.

      To set a different encoding, select Other encoding and choose the desired item from the list. In the Preview area, you can check whether the text displays correctly in the selected encoding.

      Note: To enlarge the document display area, you can resize the File Conversion dialog box.

    If the message "Text highlighted in red cannot be saved correctly in the selected encoding" appears, you can select a different encoding or select the Allow character substitution check box.

    If character substitution is enabled, characters that cannot be displayed are replaced with the nearest equivalent characters in the selected encoding. For example, an ellipsis is replaced by three dots, and curly quotes are replaced by straight ones.

    If the selected encoding has no equivalent for the characters highlighted in red, they are saved as out-of-context characters (for example, as question marks).

    If the document will be opened in a program that does not wrap text from one line to the next, you can add hard line breaks to it. To do this, select the Insert line breaks check box and specify the desired break characters (carriage return (CR), line feed (LF), or both) in the End lines with box.
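The options in this dialog (character substitution, question marks for anything left over, and hard line breaks) can be imitated in code. The Python sketch below is only an approximation under stated assumptions: the substitution table is hypothetical and far smaller than whatever Word uses, and the function and parameter names are invented for illustration.

```python
import textwrap

# Hypothetical substitution table; Word's real table is not described in this article.
SUBSTITUTES = {
    "\u2026": "...",  # ellipsis -> three dots
    "\u201c": '"',    # left curly quote -> straight quote
    "\u201d": '"',    # right curly quote -> straight quote
}


def to_encoded_text(text, encoding, allow_substitution=True,
                    width=None, line_ending="\r\n"):
    """Return `text` as bytes in `encoding`, roughly like saving encoded text."""
    if allow_substitution:
        fixed = []
        for ch in text:
            try:
                ch.encode(encoding)  # representable as-is
                fixed.append(ch)
            except UnicodeEncodeError:
                fixed.append(SUBSTITUTES.get(ch, ch))
        text = "".join(fixed)
    if width is not None:
        # Insert hard line breaks so programs that do not wrap text stay readable.
        lines = []
        for paragraph in text.splitlines():
            lines.extend(textwrap.wrap(paragraph, width) or [""])
        text = line_ending.join(lines)
    # Anything still unsupported is stored as a question mark.
    return text.encode(encoding, errors="replace")
```

For example, `to_encoded_text("Wait\u2026", "cp866")` yields `b"Wait..."`, while the same call with `allow_substitution=False` yields `b"Wait?"`. Passing `line_ending="\n"` or `"\r"` corresponds to choosing LF only or CR only.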

Finding encodings available in Word

Word recognizes multiple encodings and supports encodings that are included with the system software.

Below is a list of scripts and their associated encodings (code pages).

Writing system | Encodings | Font used
Multilingual | Unicode (UCS-2 little endian, UTF-8, UTF-7) | Standard font for the "Normal" style of the localized version of Word
Arabic | Windows 1256, ASMO 708 | -
Chinese (Simplified) | GB2312, GBK, EUC-CN, ISO-2022-CN, HZ | -
Chinese (Traditional) | BIG5, EUC-TW, ISO-2022-TW | -
Cyrillic | Windows 1251, KOI8-R, KOI8-RU, ISO8859-5, DOS 866 | -
English, Western European and others based on the Latin alphabet | Windows 1250, 1252-1254, 1257, ISO8859-x | -
Greek | - | -
Japanese | Shift-JIS, ISO-2022-JP (JIS), EUC-JP | -
Korean | Wansung, Johab, ISO-2022-KR, EUC-KR | -
Vietnamese | - | -
Indian: Tamil | - | -
Indian: Nepali | ISCII 57002 (Devanagari) | -
Indian: Konkani | ISCII 57002 (Devanagari) | -
Indian: Hindi | ISCII 57002 (Devanagari) | -
Indian: Assamese | - | -
Indian: Bengali | - | -
Indian: Gujarati | - | -
Indian: Kannada | - | -
Indian: Malayalam | - | -
Indian: Oriya | - | -
Indian: Marathi | ISCII 57002 (Devanagari) | -
Indian: Punjabi | - | -
Indian: Sanskrit | ISCII 57002 (Devanagari) | -
Indian: Telugu | - | -

    To use Indian languages, the operating system must support them and the appropriate OpenType fonts must be installed.

    Only limited support is available for Nepali, Assamese, Bengali, Gujarati, Malayalam and Oriya.








2024 gtavrl.ru.