Hieroglyphs instead of text: what to do when they appear in Word, a browser or a text document, and how to turn them back into text

Hello, dear readers, admirers and other good people!

Have you ever received a letter written in some unreadable “language”, or opened a website and seen solid gibberish instead of the usual letters? If so, this article is for you: we will talk about text encoding, its formats, why garbled text appears and how to avoid incomprehensible hieroglyphs in the future.

So, today's article is not a light software review but a fairly technical one, so get ready: we are going to face some harsh realities.
Let's go.

What is text encoding and what is it used for?

I would like to start with the fact that this article might never have existed, because the computer life of the author of these lines had been proceeding quite calmly and with dignity. But one fine day, browsing the Internet from someone else's PC, I came across a strange phenomenon on some sites. Instead of the familiar Russian alphabet and clear, readable text, I saw nonsense in the form of an incomprehensible sequence of symbols. It looked something like this (see image).

At first I thought that my beloved Firefox had overheated and needed an ambulance, but then I began to understand that the problem was most likely on the website's side and lay in an incorrectly configured encoding. This really turned out to be the case, and after a bit of trial and error the problem was resolved. Today's material is the result of all that tinkering. So let's look into the details.

All information presented in digital form on the global web has to be considered from two sides: first, from the user's side (beautiful, well-groomed text on the monitor screen) and second, from the search engine's side (program code consisting of tags, meta tags, character tables, etc.).

If you are at least a little familiar with the hypertext markup language (HTML), you should be aware that search engines (Google, Yandex) see a site not as ordinary text but as a structured document made up of sequences of tags. To make it clearer what I am talking about, let's take a look at our favorite site, the “Notes from Sys.Admin” project, not through the eyes of an ordinary user but through the “eyes” of a search engine. To do this, press Ctrl+U (in Firefox and Chrome) and you will see the following picture (see image):

What we have before us is the machine version of the website: it is presented to search engines in this unpresentable form, and this is the form in which they consume it. If we simply pasted in article text from Notepad or Word as plain text, the machines would not just choke on it, they would refuse to eat it at all. So, we have the main page of the project in HTML form. Pay attention to the line that says UTF-8: this is the notorious encoding of the page text. It determines how the bytes of the page are turned into characters, and thanks to it we see normal text in the browser.

Now let's figure out why we sometimes see gibberish on the monitor screen. It is very simple: the file is being opened in the wrong encoding. In everyday terms, you were sent to the store for milk and came back with bread, which is also edible but is a completely different product.

So, now let's understand the theory and for this we will introduce some definitions.

  1. Encoding (or “charset”) – a correspondence between a set of characters and a set of numeric values. It is needed to put information onto the Internet, i.e. to convert text into bits of data;
  2. Code page (“codepage”) – a 1-byte (8-bit) encoding;
  3. The number of values one byte can take is 256 (two to the eighth power).
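To make the three definitions less abstract, here is a minimal Python sketch. The sample word and the variable names are my own illustration, not something from the tools discussed in this article.

```python
# A character set maps characters to numbers; an encoding turns those numbers into bytes.
text = "Привет"

# The numeric value (Unicode code point) assigned to each character.
code_points = [ord(ch) for ch in text]

# The same text as bytes in a single-byte code page (Windows-1251) and in UTF-8.
cp1251_bytes = text.encode("cp1251")
utf8_bytes = text.encode("utf-8")

# One byte is 8 bits, so it can take 2 ** 8 distinct values.
values_per_byte = 2 ** 8

print(code_points)
print(len(cp1251_bytes), "bytes in cp1251,", len(utf8_bytes), "bytes in UTF-8")
print(values_per_byte, "possible values per byte")
```

The code page spends one byte per letter, while UTF-8 spends two bytes on each Cyrillic letter, which is why the two byte strings have different lengths.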

The “symbol-image” correspondence is specified using special code tables, where each symbol is already assigned its own specific numeric code. There are quite a lot of such tables, and in different tables the same symbol can be identified differently (it can have different numeric codes).

Encodings differ in how many bytes they use per character and in which byte values each character of the source text is converted into.

Note:
Decoding is an operation that results in the conversion of a symbol code into an image. As a result of this operation, information is displayed on the user's monitor screen.

So, we have sorted out the definitions; now let's find out what kinds of encodings there are.

Types of text encodings

There are, on the whole, quite a few of them.

  • ASCII

One of the most “ancient” is the American code table ASCII (pronounced “ask-ee”), adopted as an American national standard. It uses 7 bits, so its 128 values hold the English alphabet (in lower and upper case), digits, punctuation and control characters. It suited English-speaking users and was not universal.

  • Cyrillic

A domestic version of the encoding, which uses the second half of the code table (values 128 to 255) for Cyrillic letters. Designed for a Russian-speaking audience.

  • MS Windows family encodings: Windows 1250-1258.

These 8-bit encodings appeared along with the most popular operating system, Windows. The numbers from 1250 to 1258 indicate the languages each one is tailored for: for example, 1250 is for the languages of Central Europe and 1251 is for the Cyrillic alphabet.

  • 8-bit Code for Information Exchange – KOI8

KOI8-R (Russian) and KOI8-U (Ukrainian) are the standard Cyrillic encodings on Unix-like operating systems; KOI-7 is their older 7-bit relative.

  • Unicode

A universal character encoding standard that can describe the characters of almost all written languages. Characters are designated “U+xxxx”, where xxxx are hexadecimal digits. The most common encodings belong to the UTF (Unicode Transformation Format) family: UTF-8, UTF-16 and UTF-32.

Currently, as they say, UTF-8 “rules”: it provides the best compatibility with older systems that used 8-bit characters, because plain ASCII text is already valid UTF-8. The majority of sites on the Internet use UTF-8, and it is this standard that is universal (it covers both Cyrillic and Latin).
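Here is a small Python sketch of what “the wrong encoding” means in practice. It is only an illustration: the sample word is mine, and the codec names are the ones Python uses for Windows-1251, KOI8-R and UTF-8.

```python
word = "Привет"

# The same word turns into different bytes depending on the encoding.
encoded = {name: word.encode(name) for name in ("cp1251", "koi8_r", "utf-8")}
for name, raw in encoded.items():
    print(name, raw.hex(" "))

# Reading UTF-8 bytes with the Windows-1251 table gives no error, only gibberish.
wrong = encoded["utf-8"].decode("cp1251")
right = encoded["utf-8"].decode("utf-8")
print(wrong, "<->", right)

# Plain English text is byte-for-byte identical in ASCII and UTF-8.
same_for_ascii = "Hello".encode("ascii") == "Hello".encode("utf-8")
```

Notice that nothing “breaks” in the technical sense: the bytes are intact, they are just being looked up in the wrong table.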

Of course, I have not listed every encoding, only the most popular ones. If you want to know them all for general development, the full list can be found in the browser itself. To do this, just go to the “View – Encoding – Select list” menu and get acquainted with all the available options (see image).

I think a reasonable question has arisen: “Why the hell are there so many encodings?” Their abundance and the reasons for it can be compared to the cross-browser/cross-platform problem, when the same website is displayed differently in different browsers and on different devices. By the way, the “Notes from Sys.Admin” site, as you may have noticed, has no problems with this :).

All these encodings are working solutions that developers created “to suit themselves” and solve their own problems. When their number exceeded all reasonable limits, and search engines began to receive queries like “How to get rid of gibberish in the browser?”, the developers began to rack their brains over how to bring all this mess to a single standard so that, so to speak, everyone would be happy. And Unicode, on the whole, managed it well. Now, if such problems arise, they are local in nature, and only quite inexperienced users do not know how to fix them (however, encoding and display problems on sites often appear because the webmaster specified the wrong encoding on the server side, and you have to switch the encoding in the browser).

Well, that is all the basic theory you need in order not to “flounder” in encoding issues. Now let's move on to the practical part of the article.

Solving problems with encoding, or how to get rid of garbled characters

So, our article would be incomplete if we did not touch on practical, everyday issues. Let's look at them, starting with how (and with what) you can view the encoding.

Windows has a built-in character table (Character Map). You do not need to download or install anything: it is there out of the box, under “Start – Programs – Accessories – System Tools – Character Map”. It is a table of the vector glyphs of all the fonts installed on your operating system.

By selecting the advanced view (Unicode character set) and the font you are interested in, you will see the full set of characters the font contains. By clicking on any character, you will see its code, consisting of 4 hexadecimal digits (for these characters the Unicode code point matches the UTF-16 value) (see image).
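The same code can be obtained without the character table. A minimal Python sketch follows; `describe` is a helper name I made up for this example.

```python
def describe(ch: str) -> str:
    """Return the 'U+xxxx' label for a single character."""
    return f"U+{ord(ch):04X}"

print(describe("A"))   # Latin capital A
print(describe("Я"))   # Cyrillic capital Ya

# For characters like these, the UTF-16 code unit is the same number as the code point.
utf16_hex = "Я".encode("utf-16-be").hex()
print(utf16_hex)

# Characters beyond U+FFFF need five or more hex digits, and UTF-16 stores them as two code units.
emoji_units = len("😀".encode("utf-16-be")) // 2
```

So the “4 hexadecimal digits” rule holds for Cyrillic, Latin and most everyday characters, but not for the whole of Unicode.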

Now a few words about how to get rid of garbled characters. They can occur in two cases:

  1. On the user's side – when reading information on the Internet (for example, when visiting a website);
  2. Or, as mentioned just above, on the webmaster's side (for example, when creating or editing source files in Notepad++, or when the wrong encoding is specified in the site code).

Let's consider both options.

No. 1. Hieroglyphs from the user's side.
Let's say you started the OS and in some applications you see the notorious gibberish. To fix this, go to “Start – Control Panel – Regional and Language Options – Change the language” and select “Russia” from the list.

Also check in all tabs that the localization is “Russia/Russian” - this is the so-called system locale.

If you opened a site and realized that hieroglyphs are preventing you from reading it, change the encoding in the browser (“View – Encoding”). To which one? It all depends on what the gibberish looks like. Refer to the following cheat sheet (see image).
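The cheat-sheet idea can also be automated, at least roughly. Below is a Python sketch under my own assumptions: the list of candidate encodings and the scoring rule (the share of characters that are Cyrillic letters or plain ASCII) are illustrative guesses, not an established algorithm. It takes text that is already garbled, undoes the wrong decoding and tries another table.

```python
CANDIDATES = ("cp1251", "koi8_r", "cp866", "latin-1", "utf-8")

def plausibility(text: str) -> float:
    """Share of characters that are Cyrillic letters or printable ASCII."""
    if not text:
        return 0.0
    good = sum(
        1 for ch in text
        if " " <= ch <= "~" or "а" <= ch.lower() <= "я" or ch in "ёЁ"
    )
    return good / len(text)

def guess_repairs(garbled: str):
    """Try every (shown_as, really) pair and rank the results, best first."""
    results = []
    for shown_as in CANDIDATES:
        for really in CANDIDATES:
            if shown_as == really:
                continue
            try:
                fixed = garbled.encode(shown_as).decode(really)
            except (UnicodeEncodeError, UnicodeDecodeError):
                continue  # this pair cannot explain the garbled text
            results.append((plausibility(fixed), shown_as, really, fixed))
    results.sort(key=lambda item: item[0], reverse=True)
    return results

# Simulate a UTF-8 page that was displayed as Windows-1251.
garbled = "Привет, мир".encode("utf-8").decode("cp1251")
score, shown_as, really, fixed = guess_repairs(garbled)[0]
print(f"shown as {shown_as}, really {really}: {fixed}")
```

This works well when UTF-8 is involved. Single-byte Cyrillic encodings (Windows-1251, KOI8-R, CP866) are much easier to confuse with one another, because almost any byte is a valid letter in each of them, so in those cases a human still has to look at the candidates.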

No. 2. Hieroglyphs from the webmaster's side.
Very often, novice website developers do not attach much importance to the encoding of the document being created, as a result of which they later encounter the above-mentioned problem. Here are some simple basic tips for webmasters to fix the problem.

To prevent this from happening, open the document in the Notepad++ editor and select “Encoding” from the menu. This menu will help you convert the existing document. The question is, to which encoding? Most often (if the site runs on WordPress or Joomla) the right choice is “Convert to UTF-8 without BOM” (see image).

Having made such a conversion, you will see changes in the program status line.
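If there are many files, the same conversion can be scripted. This is a minimal Python sketch, assuming the source files are in Windows-1251; the function name and the default encoding are my assumptions, so check them against your own files (and keep a backup, because the file is overwritten).

```python
from pathlib import Path

def convert_to_utf8(path, source_encoding="cp1251"):
    """Re-save a text file as UTF-8 without a BOM, overwriting it in place."""
    file = Path(path)
    # Work with raw bytes so that line endings are left exactly as they were.
    text = file.read_bytes().decode(source_encoding)
    # The "utf-8" codec writes no byte order mark ("utf-8-sig" would add one).
    file.write_bytes(text.encode("utf-8"))

# Example call (the file name is hypothetical):
# convert_to_utf8("header.php")
```

If the source encoding is guessed wrong, the script will faithfully convert gibberish into UTF-8 gibberish, so it is worth opening one converted file before processing the rest.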

Also, to avoid gibberish, you need to state the encoding explicitly in the site header. This tells the browser to read the site in the prescribed encoding. A novice webmaster needs to understand that encoding mix-ups most often occur because of a mismatch between the server settings and the site settings, i.e. the database on the server uses one encoding while the site sends pages to the browser in a completely different one.

To do this, write the following line explicitly in the site header (most often in the header.php file), between the <head> and </head> tags:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

By writing such a line, you will force the browser to correctly interpret the encoding, and the hieroglyphs will disappear.

You may also need to adjust the data output from the database (MySQL). This is done like this:

// The mysql_* functions were removed in PHP 7; on newer versions use mysqli or PDO instead.
mysql_query("SET NAMES utf8");
mysql_query("SET CHARACTER SET utf8");
mysql_query("SET COLLATION_CONNECTION='utf8_general_ci'");

Alternatively, you can also make a knight’s move and write the following lines in the .htaccess file:

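# BEGIN UTF8
AddDefaultCharset utf-8
AddCharset utf-8 *
# CharsetSourceEnc and CharsetDefault belong to mod_charset (Russian Apache), so they are guarded by IfModule.
<IfModule mod_charset.c>
CharsetSourceEnc utf-8
CharsetDefault utf-8
</IfModule>
# END UTF8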
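What AddDefaultCharset changes is the Content-Type header the server sends with text pages: a charset parameter is added to it. Here is a Python sketch of how a client might read that parameter. The helper name is mine, and the header strings are sample values rather than output from a real server.

```python
from email.message import Message

def charset_from_content_type(header_value, default=None):
    """Extract the charset parameter from a Content-Type header value."""
    message = Message()
    message["Content-Type"] = header_value
    # get_content_charset() returns the charset in lower case, or the default if absent.
    return message.get_content_charset(default)

print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html; charset=windows-1251"))
print(charset_from_content_type("text/html"))
```

When the last case happens (no charset in the header), the browser has to rely on the meta tag or on its own guess, which is exactly when hieroglyphs tend to show up.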

All of the above methods (or some of them) will most likely help you and your future visitors get rid of the hated hieroglyphs and encoding problems. Unfortunately, we won't go into more detail on webmaster matters here; I think webmasters will work out the details if they want to (after all, our site has a slightly different topic).

Well, now the practical part of the article is finished, all that remains is to sum up some small results.

Afterword

Today we got acquainted with the concept of text encoding. I am sure that now, when scribbles appear on your computer monitor, you will not give up, but remember all the methods given here and resolve the issue in your favor!

That's all, thank you for your attention and see you again.

Good day.

Probably every PC user has encountered this problem: you open a web page or a Microsoft Word document and, instead of text, you see hieroglyphs (various gibberish, unfamiliar letters, numbers and so on, as in the picture on the left).

It is fine if the document with hieroglyphs is not particularly important to you, but what if you need to read it?! I am asked for help with opening such texts quite often. In this short article I want to look at the most common reasons why hieroglyphs appear (and, of course, how to eliminate them).

Hieroglyphs in text files (.txt)

The most common problem. The fact is that a text file (usually in txt format, but php, css, info and other formats are affected too) can be saved in different encodings.

An encoding is a set of characters needed to write text in a specific alphabet (including digits and special characters), together with the codes assigned to them. More details here: https://ru.wikipedia.org/wiki/Character_set

Most often one thing happens: the document is simply opened in the wrong encoding, so the character codes are looked up in the wrong table and other characters are shown instead. Various strange symbols appear on the screen (see Fig. 1)…

Fig. 1. Notepad - encoding problem

How to deal with this?

In my opinion, the best option is to install an advanced notepad, such as Notepad++ or Bred 3. Let's take a closer look at each of them.

Notepad++

One of the best notepads for both beginners and professionals. Pros: free program, supports Russian language, works very quickly, code highlighting, opens all common file formats, a huge number of options allow you to customize it for yourself.

As for encodings, everything is in perfect order here: there is a separate “Encoding” menu (see Fig. 2). Just try switching from ANSI to UTF-8 (for example).

After changing the encoding, my text document became normal and readable - the hieroglyphs disappeared (see Fig. 3)!

Fig. 3. The text has become readable... Notepad++
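The “try another encoding until it becomes readable” routine can be sketched in Python as well. The order of attempts here is my assumption: UTF-8 first (it rejects most text saved in other encodings), then Windows-1251 as the fallback for Russian files.

```python
def decode_best_effort(data: bytes, fallbacks=("cp1251",)):
    """Return (text, encoding) using the first encoding that decodes without errors."""
    # "utf-8-sig" reads UTF-8 both with and without a byte order mark.
    for encoding in ("utf-8-sig",) + tuple(fallbacks):
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to some character, so this last resort never fails.
    return data.decode("latin-1"), "latin-1"

print(decode_best_effort("Привет".encode("utf-8")))
print(decode_best_effort("Привет".encode("cp1251")))
```

Keep in mind that a single-byte encoding such as Windows-1251 accepts almost any bytes, so “decoded without errors” does not guarantee “readable”: a KOI8-R file would pass this check and still look like gibberish.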

Bred 3

Another great program designed to completely replace the standard Notepad in Windows. It also works with many encodings and switches between them easily, supports a huge number of file formats and runs on newer Windows versions (8, 10).

By the way, Bred 3 is very helpful when working with “old” files saved in MS DOS formats. When other programs show only hieroglyphs, Bred 3 easily opens them and allows you to calmly work with them (see Fig. 4).

If there are hieroglyphs instead of text in Microsoft Word

The very first thing to pay attention to is the file format. The fact is that Word 2007 introduced a new format, “docx” (previously it was just “doc”). Usually files in the new format cannot be opened in an “old” Word at all, but sometimes these “new” files do open in the old program.

Just open the file properties, and then look at the “Details” tab (as in Fig. 5). This way you will find out the file format (in Fig. 5 - the “txt” file format).

If the file format is docx - and you have an old Word (below version 2007) - then simply update Word to 2007 or higher (2010, 2013, 2016).

Next, note that when opening the file, Word will ask you which encoding to open it in (this option is enabled by default, unless, of course, you have some obscure custom build; the prompt appears at any hint of a problem with the file, see Fig. 6).

Fig. 6. Word - file conversion

Most often Word determines the required encoding automatically, but the text is not always readable. In that case, switch to another encoding in the dialog and keep trying until the text becomes readable. Sometimes you literally have to guess how the file was saved in order to read it.

Fig. 8. The browser detected the wrong encoding

To fix the display of the site: change the encoding. This is done in the browser settings:

  1. Google Chrome: options (icon in the upper right corner)/advanced options/encoding/Windows-1251 (or UTF-8);
  2. Firefox: left ALT button (if you have the top panel turned off), then view/page encoding/select the desired one (most often Windows-1251 or UTF-8);
  3. Opera: Opera (red icon in the upper left corner)/page/encoding/select the desired one.

PS

Thus, in this article, the most common cases of the appearance of hieroglyphs associated with an incorrectly defined encoding were analyzed. Using the above methods, you can solve all the main problems with incorrect encoding.

I would be grateful for additions on the topic. Good Luck :)

What should you do if the text on your computer or in your browser is displayed as a combination of incomprehensible symbols, in simple terms - hieroglyphs? We solve the problem in this article.

First, let's clarify one detail... We are not talking about Japanese or other hieroglyphs, but about those incomprehensible combinations of letters, numbers and symbols that appear instead of text. The hieroglyphs (as we will conventionally call them) will look approximately as shown in the picture below.

If you try to open a library or program file (for example, with the extension .dll or .exe) in Notepad, you will be shown exactly these hieroglyphs. But this does not mean that your computer is faulty. It is just that some files need to be opened in the appropriate programs. That is, if you try to open a PDF book in Notepad, you will get hieroglyphs instead of text. This leads to the first rule: open and edit files only in programs suitable for them!
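A program can make a rough guess about whether a file is text at all before trying to show it. The Python sketch below uses a common heuristic (looking for zero bytes at the start of the file); the function name and the sample size are my own choices, and the check is only a rule of thumb.

```python
def looks_binary(data: bytes, sample_size: int = 4096) -> bool:
    """Heuristic: zero bytes almost never occur in ordinary 8-bit or UTF-8 text."""
    return b"\x00" in data[:sample_size]

print(looks_binary(b"MZ\x90\x00\x03\x00\x00\x00"))   # typical start of an .exe or .dll
print(looks_binary("Привет".encode("cp1251")))       # ordinary text in a code page
```

The rule has a known blind spot: text saved as UTF-16 contains plenty of zero bytes and would be reported as binary, so it is a hint, not a verdict.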

However, it also happens that absolutely all text on your computer is displayed incorrectly. A virus may have damaged some files, or the cause may be quite harmless (you installed some badly made program, or something was updated or localized into Russian incorrectly). In this case there is no need to panic and beat your chest. To solve the problem, simply configure your computer.

To configure it, we need to find the “Regional and Language Options” utility in the Control Panel.
In Windows XP: Start - Settings - Control Panel - Date, Time, Language, and Regional Options - Regional and Language Options.
In Windows 7, everything is a little simpler: Start - Control Panel - Region and Language.

After opening this utility, set Russian as the main language on all tabs.

If you have set all the values (or they were already set) and the text is still displayed incorrectly, try System Restore. Roll the system back to a restore point from before the day the problem appeared. The system should restore the damaged system files, and the text will be displayed normally.

There are times when text is displayed incorrectly only in a specific application or (most often) in an Internet browser. Then you need to rummage through the settings of your browser or application, selecting Russian as the main language. Typically, browsers have a “default” option in their text encoding. If you have already selected this item, but problems still arise, then try changing the encoding to Cyrillic. If nothing helps, then simply reinstall your browser. In most modern browsers, when you reinstall them, all your settings and bookmarks are saved (and saved passwords too). Therefore, you can safely reinstall the browser without fear of losing any data.

If the above methods still don’t help, then try asking your friends or acquaintances - maybe they have already encountered a similar problem. Or look for information on various forums and websites. In general, whoever seeks will always find. Unlike most problems, font problems can have many root causes.

The methods described above are basic, and therefore I hope they will help you.



Set the character set

Meta tag

You need to add a special meta tag to each page (or header template) that tells the browser what set of characters to use to display texts. This tag is standard and usually looks like this:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

<meta charset="utf-8" /> (option for HTML 5)

You need to paste it into the <head> section, preferably at the very beginning, right after the opening <head> tag:

Meta encoding tag
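To check which encoding a page actually declares, the tag can be looked up programmatically. Below is a Python sketch built on the standard html.parser module; the class and function names are mine, and it handles only the two tag forms described in this section.

```python
from html.parser import HTMLParser

class MetaCharsetFinder(HTMLParser):
    """Remember the first charset declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # HTML5 form: <meta charset="utf-8">
            self.charset = attributes["charset"].strip().lower()
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # Older form: <meta http-equiv="Content-Type" content="text/html; charset=...">
            content = (attributes.get("content") or "").lower()
            if "charset=" in content:
                self.charset = content.split("charset=", 1)[1].split(";")[0].strip()

def declared_charset(html: str):
    finder = MetaCharsetFinder()
    finder.feed(html)
    return finder.charset

print(declared_charset('<head><meta charset="utf-8" /></head>'))
```

If the function returns None, the page declares nothing in its markup and the browser is left with the server header or its own guess.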

Via .htaccess (if all else fails)

Usually the first two options are enough and browsers display the text correctly. But some of them may still have problems, in which case you can resort to the .htaccess file.

To do this, you need to write the following line in it:

AddDefaultCharset utf-8

That's all. If you apply all 3 of these methods of setting the encoding to your project, the likelihood that everything will be displayed as it should is close to 100%.

How to “see” what is hidden behind strange symbols on a website?

If you go to a web page, see “crazy words” and want to see normal text, then there are only two ways:

  • inform the site owner so that everything is configured properly
  • try to guess the encoding yourself. This is done using standard browser tools. In Chrome, for example, you need to click on the menu "Tools => Encoding" and from a huge list select the appropriate set of characters (i.e. guess).

Fortunately, almost all modern web projects are done in UTF-8 encoding, which is “universal” for different alphabets and therefore it is less and less likely to see these strange characters on the Internet.

This article discusses why squares, incomprehensible symbols, gibberish, question marks, dots, scribbles or cubes appear instead of Russian letters in Windows 7, Vista and XP.

What to do to get rid of these phenomena? There is no universal recipe. A lot depends on the version of Windows, and the build itself.

The first reason why this happens is an encoding failure. The integrity of the registry is compromised and crashes occur. But this is not always the main source.

It often happens that even on a freshly installed operating system, after launching some programs, squares, incomprehensible symbols, gibberish, question marks, dots, scribbles or cubes appear instead of Russian letters.

If the problem is with numbers, then it’s quick, and this one will help you get rid of question marks instead of normal letters.

This happens especially often after installing unofficial patches and translations. Their home-grown “craftsmen” do not take everything into account, and may have made the translation for only one operating system.

Nor is the encoding always to blame: maybe the program simply does not support certain letters.

Although this is surprising, by default the Windows 7 operating system instead of Russian letters in some programs displays squares, incomprehensible symbols, gibberish, question marks, dots, scribbles or cubes.

I always make changes to the registry after reinstallation, even if everything works fine. There will be no problems with unclear symbols in the future.

Troubleshooting the problem through the registry

This manipulation is very easy to do. To do this, download and run the first file.

I emphasize: only the first one. Use the second only if, after the first, the incomprehensible symbols, hieroglyphs or gibberish do not disappear and normal Russian letters do not appear.

Just don't forget to restart your computer after making changes to the system registry, otherwise you will not see any changes.

There are several other ways to change the encoding, but it is better not to use them, since they only move the problem from one sore spot to another.

The program that currently displays gibberish, hieroglyphs and generally incomprehensible things may start working, but Russian letters in others will be broken.

Just in case, you can try renaming the files “c_1252.nls….. c_1255.nls” by adding “.bak” to the very end of each name, so that it looks like this: “c_1252.nls.bak”. Do this for all four. They are located in C:\Windows\System32.

I would like to say that I have installed Windows 7 at least 100 times. True, almost all of them were the 32-bit (x86) Ultimate edition. Problems with displaying Russian letters did occur.

This was especially true for programs. In some cases, questions appeared, squares, incomprehensible symbols, gibberish, question marks, dots, scribbles or cubes, but the very first method described always helped.

Also, squares, strange symbols, gibberish, question marks, dots, scribbles or cubes may appear in or skyrim.

This happens due to a mismatch of formats (encodings). They can be set manually, separately for each case. Look at the figure:

At the very top, click “file”, then move the cursor to the “encoding” place and click change. Good luck.



