How to estimate the amount of information according to Hartley. Information, data, signals


Lesson planning and lesson materials, 11th grade: planning lessons for the academic year (based on the textbook by K.Yu. Polyakov and E.A. Eremina, full in-depth course, 4 hours per week). Topic: Amount of information

Lessons 2 - 3
Information and probability. Hartley's formula. Shannon's formula
(§1. Amount of information)

It became possible to answer this question only after logarithms were studied in the mathematics course. From the formula

N = 2^I

it immediately follows that I is the power to which 2 must be raised to obtain N, i.e. the logarithm:

I = log₂ N.
This formula is called Hartley's formula in honor of the American engineer Ralph Hartley, who proposed it in 1928.

Let there be, for example, 10 planes at the airfield (numbered from 1 to 10), and let it be known that one of them is flying to St. Petersburg.

How much information is there in the message “Plane No. 2 is flying to St. Petersburg”? We have 10 options, from which one is selected, so according to Hartley's formula, the amount of information is equal to

I = log₂ 10 ≈ 3.322 bits.

Note that for values of N that are not equal to an integer power of 2, the amount of information in bits is a fractional number.

Using Hartley's formula, you can calculate the theoretical amount of information in a message. Let us assume that the alphabet (the full set of valid characters) includes 50 characters (in this case we say that the power of the alphabet equals 50). Then the information received with each symbol is

I = log₂ 50 ≈ 5.644 bits.

If a message contains 100 characters, its total information volume is approximately equal to

5.644 × 100 = 564.4 bits.

In general, the size of a message of length L characters using an alphabet of N characters is equal to I = L · log₂ N.

Such an approach to determining the amount of information is called alphabetical. Of course, in practice it is impossible to use a non-integer number of bits to encode a character, so the smallest integer not less than the theoretically calculated value is used. For example, with an alphabet of 50 characters, each character will be encoded with 6 bits (50 ≤ 2⁶ = 64).

How many different messages can be sent if the alphabet and message length are known? Let's assume that 4 letters are used to encode a message, such as "A", "B", "C" and "D", and the message consists of two characters. Since each character can be selected in 4 different ways, for every choice of the first character there are 4 choices of the second. Therefore, the total number of different two-letter messages is 4 · 4 = 4² = 16. If one more character is added to the message, then for each of the 16 combinations of the first two characters the third one can be chosen in four ways, so the number of different three-character messages is 4 · 4 · 4 = 4³ = 64.

In general, if an alphabet of N characters is used, then the number of different possible messages of length L characters is Q = N^L.
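As a rough check of these formulas, here is a short Python sketch using the example values from the text (an alphabet of 50 characters and a message of 100 characters); the variable names are ours and purely illustrative.

```python
# A minimal sketch of the alphabetical approach, using the example values
# from the text above (an alphabet of 50 characters, a message of 100 characters).
import math

N = 50    # power of the alphabet (number of valid characters)
L = 100   # message length in characters

bits_per_char = math.log2(N)                   # I = log2(N) ~ 5.644 bits
total_bits = L * bits_per_char                 # I = L * log2(N) ~ 564.4 bits
bits_in_practice = math.ceil(bits_per_char)    # 6 bits: smallest integer not less than log2(N)
message_count = N ** L                         # Q = N^L different messages of length L

print(f"{bits_per_char:.3f} bits per character, {total_bits:.1f} bits in total")
print(f"{bits_in_practice} bits per character in practice, about {message_count:.2e} messages")
```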


When studying various phenomena and objects of the surrounding world, people sought to associate numbers with these objects and introduce their quantitative measure. People learned to measure distances, weigh various objects, calculate the areas of figures and volumes of bodies. Having learned to measure time and its duration, we are still trying to understand its nature. The thermometer was invented many years before scientists understood what it measured: approximately three centuries passed from the invention of the first thermometer to the development of thermodynamics. The quantitative study of a certain phenomenon or object may be ahead of its qualitative study, and the process of forming the corresponding concept may follow the quantitative study.

A similar situation has developed with regard to information. R. Hartley in 1928, and then K. Shannon in 1948, proposed formulas for calculating the amount of information, but they never answered the question of what information is. In communication theory, information appears in the form of various messages: for example, letters or numbers, as in telegraphy, or as a continuous function of time, as in telephony or radio broadcasting. In any of these examples, ultimately, the task is to convey the semantic content of human speech. In turn, human speech can be represented in sound vibrations or in written form.

This is another property of this type of information: the ability to represent the same semantic content in different physical forms. W. Ashby was the first to draw special attention to this. Representing information in different physical forms is called encoding. In order to communicate with other people, a person has to constantly engage in encoding, recoding and decoding. It is obvious that information can be transmitted through communication channels in a variety of coding systems.

R. Hartley was the first to introduce the methodology of “measuring the amount of information” into the theory of information transfer. At the same time, R. Hartley believed that the information that he was going to measure was “... a group of physical symbols - words, dots, dashes, etc., which by general agreement have a known meaning for the corresponding parties.” Thus, Hartley set himself the task of introducing some kind of measure to measure encoded information.

Let a sequence of n characters a₁a₂a₃…aₙ be transmitted, each of which belongs to an alphabet Aₘ containing m characters. What is the number K of different variants of such sequences? If n = 1 (one character is transmitted), then K = m; if n = 2 (a sequence of 2 characters is transmitted), then K = m · m = m²; in the general case, for a sequence of n characters we get

K = m^n.
Hartley proposed calculating the amount of information contained in such a sequence as the logarithm of the number K to base 2:

I = log₂ K, (2.1)

where K = m^n.

That is, the amount of information contained in a sequence of n characters from the alphabet Aₘ, in accordance with Hartley's formula, is equal to

I = log₂(m^n) = n · log₂ m. (2.2)

Remark 1. Hartley assumed that all symbols of the alphabet Aₘ can occur with equal probability (frequency) anywhere in the message. This condition is violated for natural-language alphabets: for example, not all letters of the Russian alphabet occur in text with the same frequency.

Remark 2. Any message of length n in the alphabet Aₘ will contain the same amount of information. For example, in the alphabet {0; 1}, the messages 00111, 11001 and 10101 contain the same amount of information. This means that when calculating the amount of information contained in a message, we abstract from its semantic content. A "meaningful" message and a message obtained from it by an arbitrary permutation of symbols will contain the same amount of information.

Example. A telegraph message uses two symbols, a dot (.) and a dash (-), i.e. the alphabet consists of m = 2 characters. Then, when one character is transmitted (n = 1), the amount of information is I = log₂ 2 = 1. This amount was taken as the unit of measurement of information and is called 1 bit (from the English binary digit = bit). If a telegraph message in the alphabet {. ; -} contains n characters, then the amount of information is I = n · log₂ 2 = n (bits).

The symbols 0 and 1 are used to encode information in a computer and for transmission over computer networks, i.e. the alphabet consists of the two characters {0; 1}; one symbol in this case also carries I = log₂ 2 = 1 bit of information, so a message of n characters in the alphabet {0; 1}, in accordance with Hartley's formula (2.2), contains n bits of information.

If we consider the transmission of messages in the Russian alphabet, consisting of 33 letters, then the amount of information contained in a message of n characters, calculated using Hartley's formula, is I = n · log₂ 33 ≈ 5.044 · n bits. The English alphabet contains 26 letters; one character carries log₂ 26 ≈ 4.7 bits, so a message of n characters contains n · log₂ 26 ≈ 4.7 · n bits of information. However, this result is not quite correct, since not all letters occur in text with the same frequency. In addition, separating characters must be added to the letters of the alphabet: space, period, comma, etc.
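The per-symbol figures quoted above are easy to verify with a few lines of Python, keeping the equal-frequency assumption of Remark 1:

```python
# A quick check of the per-symbol estimates quoted above (equal-frequency assumption).
import math

for name, m in [("Russian (33 letters)", 33), ("English (26 letters)", 26), ("binary {0; 1}", 2)]:
    print(f"{name}: log2({m}) = {math.log2(m):.4f} bits per symbol")
    # a message of n symbols then carries n * log2(m) bits, as in formula (2.2)
```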

Formula (2.1) superficially resembles the Boltzmann formula for calculating the entropy of a system with N equally probable microstates:

S = −k · ln(W), (2.3)

where k is Boltzmann's constant, k ≈ 1.38 · 10⁻²³ J/K, and W is the probability that the system spontaneously assumes one of its microstates during the unit time t = 10⁻¹³ s, with W = 1/N, i.e.

S = −k · ln(1/N) = k · ln(N), (2.4)

which fully agrees with formula (2.1) except for the factor k and the base of the logarithm. Because of this external similarity, the quantity log₂ K in information theory is also called entropy and is denoted by the symbol H. Information entropy is a measure of the uncertainty of the state of some random variable (physical system) with a finite or countable number of states. A random variable (r.v.) is a quantity that, as a result of an experiment or observation, takes on a numerical value that is not known in advance.

So let X be a random variable that can take N different values x₁, x₂, …, x_N; if all values of the r.v. X are equally probable, then the entropy (measure of uncertainty) of the quantity X is

H(X) = log₂ N. (2.5)

Comment. If a random variable (system) can only be in one state (N=1), then its entropy is equal to 0. In fact, it is no longer a random variable. The greater the number of possible equally probable states, the higher the uncertainty of a system.

Entropy and the amount of information are measured in the same units - bits.

Definition. 1 bit is the entropy of a system with two equally probable states.

Let a system X have two states x₁ and x₂ with equal probability, i.e. N = 2; then its entropy is H(X) = log₂ 2 = 1 bit. An example of such a system is a coin: when it is tossed, either heads (x₁) or tails (x₂) comes up. If the coin is "fair", then the probabilities of heads and tails are the same and equal to 1/2.

Let us give another definition of the unit of measurement of information.

Definition. The answer to a question of any kind contains 1 bit of information if, with equal probability, it can be "yes" or "no".

Example. The guessing game "which hand?". You hide a small object in one hand and ask your partner to guess which hand it is in. He asks, "Is it in your left hand?" (or simply picks a hand: left or right). You answer "yes" if he guessed correctly, or "no" otherwise. With either answer the partner receives 1 bit of information, and the uncertainty of the situation is completely removed.

Hartley's formula can be used when solving problems of determining the selected element of a given set. This result can be formulated as the following rule.

If in a given set M, consisting of N elements, some element x is selected, about which nothing else is known, then to determine this element it is necessary to obtain log₂ N bits of information.

Let's consider several problems using Hartley's formula.

Problem 1. Someone has thought of a natural number in the range from 1 to 32. What is the minimum number of questions that must be asked in order to be guaranteed to guess the chosen number? The answers can only be "yes" or "no".

Comment. You could try to guess the number by simple enumeration. If you are lucky, you will only have to ask one question, but in the worst case you will have to ask 31 questions. The problem asks for the minimum number of questions that guarantee that the chosen number will be determined.

Solution. Using Hartley's formula, we can calculate the amount of information that must be obtained to determine the selected element x of the set of integers {1, 2, 3, …, 32}: we need to obtain H = log₂ 32 = 5 bits of information. Questions must be asked in such a way that the answers to them are equally probable; then the answer to each such question brings 1 bit of information. For example, we can divide the numbers into two equal groups, from 1 to 16 and from 17 to 32, and ask which group the chosen number is in. Next, we do the same with the selected group, which now contains only 16 numbers, and so on. Suppose, for example, the chosen number is 7.

Question No. 1: Does the chosen number belong to the set {17, …, 32}? The answer "no" brings 1 bit of information. We now know that the number belongs to the set {1, …, 16}.

Question No. 2: Does the chosen number belong to the set {1, …, 8}? The answer "yes" brings 1 more bit of information. We now know that the number belongs to the set {1, …, 8}.

Question No. 3: Does the chosen number belong to the set {1, …, 4}? The answer "no" brings 1 more bit of information. We now know that the number belongs to the set {5, …, 8}.

Question No. 4: Does the chosen number belong to the set {7, 8}? The answer "yes" brings 1 more bit of information. We now know that the number belongs to the set {7, 8}.

Question No. 5: Is the chosen number equal to 8? The answer "no" brings 1 more bit of information. We now know that the chosen number is 7. The problem is solved: five questions were asked, 5 bits of information were received in response, and the chosen number was determined.
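The questioning strategy can be sketched in Python as a binary search. The question order below differs slightly from the dialogue above, but the idea (halve the remaining range at each step) and the total of five questions are the same; `secret` stands in for the number that was thought of.

```python
# A sketch of the halving strategy from Problem 1.

def guess(secret, low=1, high=32):
    questions = 0
    while low < high:
        mid = (low + high) // 2
        questions += 1
        # question: "Does the number belong to the set {mid + 1, ..., high}?"
        if secret > mid:          # answer "yes" -> 1 bit of information
            low = mid + 1
        else:                     # answer "no"  -> also 1 bit of information
            high = mid
    return low, questions

print(guess(7))   # -> (7, 5): five yes/no questions, i.e. 5 bits, as computed above
```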

Problem 2 (the counterfeit coin problem). There are 27 coins, of which 26 are real and one is counterfeit. What is the minimum number of weighings on a balance scale with which the counterfeit coin can be reliably identified, given that the counterfeit coin is lighter than a real one?

A balance scale has two pans; with it you can only determine whether the contents of the pans weigh the same and, if not, which pan's contents are heavier.

Solution. This is a problem of identifying one selected element out of 27. Using Hartley's formula, we can immediately determine the amount of information that needs to be obtained to identify the counterfeit coin: it is I = log₂ 27 = log₂(3³) = 3 · log₂ 3 bits. Note that even without knowing the weighing strategy yet, we can already say how much information must be obtained to solve the problem.

If you put an equal number of coins on the scales, then three equally probable outcomes are possible:

1. The left pan is heavier than the right (L > R);

2. The left pan is lighter than the right (L < R);

3. The left pan balances the right (L = R).

The "balance scale" system can thus be in three equally probable states, so one weighing gives log₂ 3 bits of information. In total, to solve the problem we need to obtain I = 3 · log₂ 3 bits of information, which means three weighings are needed to find the counterfeit coin. We already know the minimum number of weighings, but we do not yet know how to carry them out. The strategy should be such that each weighing provides the maximum amount of information. Let us divide all the coins into three equal piles A, B and C, 9 coins each. The counterfeit coin, denoted by the letter f, can be in any of the three piles with equal probability. Let us choose any two of them, for example A and B, and weigh them.

There are three possible outcomes:

1) A is heavier than B (A > B), which means f ∈ B;

2) A is lighter than B (A < B), which means f ∈ A;

3) A balances B (A = B), which means f ∈ C.

Whatever the outcome, we determine which pile the counterfeit coin f is in, and that pile now contains only 9 coins. Divide it into three equal piles A₁, B₁, C₁ of 3 coins each, choose any two of them and weigh them. As in the previous step, we determine the pile that contains the counterfeit coin, but now the pile consists of only three coins. Choose any two of those coins and weigh them. This will be the last, third weighing, after which the counterfeit coin is found.
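The same strategy can be sketched in Python. Here a weighing is simulated simply by checking which third of the current pile contains the fake coin (a real balance scale would report which pan is lighter); the index of the counterfeit coin is chosen arbitrarily for the demonstration.

```python
# A sketch of the three-weighing strategy from Problem 2.

def find_fake(coins, fake):
    group = list(coins)
    weighings = 0
    while len(group) > 1:
        third = len(group) // 3
        a, b, c = group[:third], group[third:2 * third], group[2 * third:]
        weighings += 1                 # one weighing distinguishes 3 outcomes: log2(3) bits
        if fake in a:                  # pan with A is lighter
            group = a
        elif fake in b:                # pan with B is lighter
            group = b
        else:                          # pans balance: the fake is in the pile set aside
            group = c
    return group[0], weighings

print(find_fake(range(27), fake=13))   # -> (13, 3): three weighings are enough
```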

Problem 3. Without using a calculator, estimate, to within one bit, the entropy of a system that can be in any of 50 equally probable states.

Solution. By Hartley's formula, H = log₂ 50. Let us estimate this value.

Obviously, 32 < 50 < 64; taking base-2 logarithms of this inequality gives log₂ 32 < log₂ 50 < log₂ 64, i.e. 5 < log₂ 50 < 6. Thus the entropy of the system, to within 1 bit, satisfies 5 < H < 6.

Problem 4. It is known that the entropy of a system is 7 bits. Determine the number of states of this system if it is known that they are all equally probable.

Solution. Let N denote the number of states of the system. Since all states are equally probable, H = log₂ N, hence N = 2^H, i.e. N = 2⁷ = 128.
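Both answers are easy to confirm numerically (a calculator is, of course, not allowed in Problem 3 itself):

```python
# A quick numeric check of Problems 3 and 4.
import math

H = math.log2(50)
print(f"5 < {H:.3f} < 6")   # Problem 3: the entropy lies between 5 and 6 bits

print(2 ** 7)               # Problem 4: H = 7 bits -> N = 2^7 = 128 states
```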

Information can exist in the form of:

    texts, drawings, diagrams, photographs;

    light or sound signals;

    radio waves;

    electrical and nerve impulses;

    magnetic recordings;

    gestures and facial expressions;

    smells and taste sensations;

    chromosomes, through which the characteristics and properties of organisms are inherited, etc.

Objects, processes and phenomena, whether material or immaterial, considered from the point of view of their information properties, are called information objects.

1.4. How is information transmitted?

Information is transmitted in the form of messages from some source of information to its receiver through a communication channel between them. The source sends a transmitted message, which is encoded into a transmitted signal. This signal is sent over a communication channel. As a result, a received signal appears at the receiver, which is decoded and becomes the received message.

    A message containing information about the weather forecast is transmitted to the receiver (TV viewer) from the source - a meteorologist - through a communication channel - television transmitting equipment and a TV.

    A living being perceives information from the outside world with its sense organs (eyes, ears, skin, tongue, etc.), processes it into a certain sequence of nerve impulses, transmits the impulses along nerve fibres, stores it in memory as states of the neural structures of the brain, reproduces it in the form of sounds, movements, etc., and uses it in the course of its life.

The transmission of information over communication channels is often accompanied by interference, causing distortion and loss of information.

1.5. How is the amount of information measured?

How much information is contained in the works of great poets and writers, or in the human genetic code? Science does not provide answers to these questions and, in all likelihood, will not provide them soon. Is it possible to measure the amount of information objectively? The most important result of information theory is the following conclusion:

Under certain, very broad conditions, it is possible to neglect the qualitative features of information, express its quantity as a number, and also compare the amounts of information contained in different groups of data.

Currently, approaches to defining the concept of "amount of information" have become widespread that are based on the idea that the information contained in a message can be loosely interpreted in terms of its novelty or, in other words, the reduction of the uncertainty of our knowledge about an object. These approaches rely on the mathematical concepts of probability and logarithm.

Approaches to determining the amount of information. Hartley and Shannon formulas.

In 1928, the American engineer R. Hartley considered the process of obtaining information as the selection of one message from a finite, predetermined set of N equally probable messages, and defined the amount of information I contained in the selected message as the binary logarithm of N.

Hartley's formula: I = log₂ N

Let's say you need to guess one number from the set of numbers from one to one hundred. Using Hartley's formula, you can calculate how much information is required for this: I = log₂ 100 ≈ 6.644. Thus, a message about the correctly guessed number contains an amount of information approximately equal to 6.644 units (bits) of information.

Let us give other examples of equally probable messages:

    when tossing a coin: "heads came up" and "tails came up";

    on a page of a book: "the number of letters is even" and "the number of letters is odd".

Let us now determine whether the messages "A woman will be the first to leave the building" and "A man will be the first to leave the building" are equally probable. It is impossible to answer this question unambiguously. It all depends on what kind of building we are talking about. If it is, for example, a cinema, then the probability of being the first out of the door is the same for a man and a woman, but if it is a military barracks, then for a man this probability is much higher than for a woman.

For problems of this kind, the American scientist Claude Shannon proposed in 1948 another formula for determining the amount of information, taking into account the possible unequal probability of messages in the set.

Shannon's formula: I = −(p₁ log₂ p₁ + p₂ log₂ p₂ + … + p_N log₂ p_N), where pᵢ is the probability that exactly the i-th message is selected from the set of N messages.

It is easy to see that if the probabilities p₁, …, p_N are equal, then each of them equals 1/N, and Shannon's formula turns into Hartley's formula.
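This reduction is easy to check numerically; in the sketch below N = 10 is an arbitrary example.

```python
# For equal probabilities p_i = 1/N, Shannon's formula gives the same value
# as Hartley's log2(N).
import math

def shannon(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

N = 10
print(shannon([1 / N] * N), math.log2(N))   # both ~ 3.3219 bits
```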

In addition to the two considered approaches to determining the amount of information, there are others. It is important to remember that any theoretical results are applicable only to a certain range of cases, outlined by the initial assumptions.

As the unit of information, Claude Shannon proposed taking one bit (from the English bit, binary digit).

A bit in information theory is the amount of information necessary to distinguish between two equally probable messages (such as "heads" or "tails", "even" or "odd", etc.). In computing, a bit is the smallest "portion" of computer memory required to store one of the two characters "0" and "1" used for the internal machine representation of data and instructions.

A bit is too small a unit of measurement; in practice a larger unit is more often used: the byte, equal to eight bits. Exactly eight bits are required to encode any of the 256 characters of the computer keyboard alphabet (256 = 2⁸).

Even larger derived units of information are also widely used:

    1 Kilobyte (KB) = 1024 bytes = 2¹⁰ bytes,

    1 Megabyte (MB) = 1024 KB = 2²⁰ bytes,

    1 Gigabyte (GB) = 1024 MB = 2³⁰ bytes.

Recently, due to the increase in the volumes of processed information, the following derived units have also come into use:

    1 Terabyte (TB) = 1024 GB = 2⁴⁰ bytes,

    1 Petabyte (PB) = 1024 TB = 2⁵⁰ bytes.
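The powers of two behind these definitions can be checked with a trivial loop:

```python
# A one-line check of the unit definitions listed above.
for name, power in [("KB", 10), ("MB", 20), ("GB", 30), ("TB", 40), ("PB", 50)]:
    print(f"1 {name} = 2^{power} = {2 ** power} bytes")
```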

As the unit of information, one could instead take the amount of information needed to distinguish between, for example, ten equally probable messages. This would be not a binary unit (bit) but a decimal unit (dit).

Shannon's formula, like Hartley's formula, is used in computer science to calculate the total amount of information, but in the case of events with different probabilities.

An example of unequal probabilities is people leaving the barracks in a military unit: a soldier, an officer or even a general may come out. But the numbers of soldiers, officers and generals in the barracks are different: soldiers are the most numerous, then come officers, and generals are the rarest. Since the probabilities are not equal for the three ranks, to calculate how much information such an event carries we use Shannon's formula.

For equally probable events, such as a coin toss (the probabilities of heads and tails are the same, 50% each), Hartley's formula is used.

Now let's look at the application of these formulas using a specific example.

Which message contains the least information (count in bits)?

  1. Vasily ate 6 sweets, 2 of which were barberries.
  2. There are 10 folders on the computer, the required file was found in folder 9.
  3. Baba Luda made 4 pies with meat and 4 pies with cabbage. Gregory ate 2 pies.
  4. Africa has 200 days of dry weather and 165 days of monsoon rain. The African hunted 40 days a year.

Note that options 1, 2 and 3 are easy to count, since the events are equally probable; for these we will use Hartley's formula I = log₂ N (Fig. 1). But what about option 4, where the distribution of days is clearly uneven (with a preponderance of dry weather)? For such events Shannon's formula, or information entropy, is used: I = −(p₁ log₂ p₁ + p₂ log₂ p₂ + … + p_N log₂ p_N) (Fig. 3).

FORMULA FOR THE AMOUNT OF INFORMATION (HARTLEY'S FORMULA, FIG. 1): I = log₂(1/p)

where:

  • I is the amount of information;
  • p is the probability that the event will happen.

The events that interest us in our problem are

  1. There were two barberries out of six (2/6)
  2. There was one folder in which the required file was found in relation to the total number (1/10)
  3. There were eight pies in total, of which Gregory ate two (2/8)
  4. and, finally, forty days of hunting in relation to two hundred dry days, and forty days of hunting in relation to one hundred and sixty-five rainy days: 40/200 and 40/165.

Thus we get:

FORMULA FOR THE PROBABILITY OF AN EVENT: p = K/N,

where K is the number of outcomes of interest to us and N is the total number of outcomes. As a self-check: the probability of an event can never be greater than one, because the number of favourable outcomes never exceeds the total number of outcomes.

SHANNON'S FORMULA FOR CALCULATING INFORMATION (FIG. 3)

Let's return to our problem and calculate how much information each message contains.

By the way, when calculating the logarithm it is convenient to use the website - https://planetcalc.ru/419/#

  • For the first case: p = 2/6 ≈ 0.33, so I = −log₂ 0.33 ≈ 1.6 bits
  • For the second case: p = 1/10 = 0.10, so I = −log₂ 0.10 ≈ 3.322 bits
  • For the third case: p = 2/8 = 0.25, so I = −log₂ 0.25 = 2 bits
  • For the fourth case: 40/200 = 0.2 and 40/165 ≈ 0.24, so by Shannon's formula I = −(0.2 · log₂ 0.2) − (0.24 · log₂ 0.24) ≈ 0.959 bits

Thus, the answer to our problem is option 4: it contains the least information.
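For completeness, here is a short Python sketch that reproduces these calculations. It follows the problem's setup exactly (in particular, for case 4 the two probabilities are taken as in the text and do not sum to 1); small discrepancies with the figures above come only from rounding.

```python
# A sketch reproducing the calculations for the four messages above.
import math

def hartley(p):
    """Information of one event with probability p: I = log2(1/p) = -log2(p)."""
    return -math.log2(p)

def shannon(probs):
    """Shannon's formula for a list of probabilities, as used for case 4."""
    return -sum(p * math.log2(p) for p in probs)

print(hartley(2 / 6))                 # case 1: ~1.585 bits (the text rounds p to 0.33)
print(hartley(1 / 10))                # case 2: ~3.322 bits
print(hartley(2 / 8))                 # case 3: 2.0 bits
print(shannon([40 / 200, 40 / 165]))  # case 4, probabilities as set up in the text: ~0.96 bits
```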


1. Information theory

Information theory (or the mathematical theory of communication) is a branch of cybernetics that studies the processes of storage, transformation and transmission of information; like any mathematical theory, it operates with mathematical models rather than with real physical objects (sources and communication channels). It mainly uses the mathematical apparatus of probability theory and mathematical statistics.

Claude Shannon (1916-2001) is called the “father of information theory.”

Information theory is based on a certain way of measuring the amount of information. Emerging from problems in communication theory, information theory is sometimes considered as a mathematical theory of information transmission systems. Based on the seminal work of K. Shannon (1948), information theory establishes the basic boundaries of the capabilities of information transmission systems, sets the initial principles for their development and practical implementation.

The basic properties of information can be described using a mathematical model that reflects many of the characteristic features of an information measure as it is usually understood intuitively. The source of information and the communication channel through which information is transmitted can be modeled using probabilistic representations. The entropy of an information source is equal to the logarithm of the (effective) number of messages it generates. This is a measure of the complexity of the source description (or, as is sometimes said, a measure of the uncertainty of the message). This understanding of entropy is closely related to the concept of entropy used in thermodynamics.

Physically, the transfer of information can be represented as the induction of the required physical state in the receiving device. The sender intends to transmit a message to the recipient. The essence of the transmission is to reproduce the transmitted message at the output of the communication channel. At the time of transmission, the sender selects the desired message from a list of all possible messages. The recipient does not know in advance which one will be selected. (If he had been informed about this in advance, then there would be no need to send the message.) The communication channel introduces random noise into the process of transmitting information, which distorts the message and thereby makes it difficult to read. At the beginning of the communication process, the recipient is in complete uncertainty as to which message is selected from a list of possible ones. By the end of the communication, the recipient knows this, i.e. the exact description of the selected message becomes known.

The ability of a communication channel to transmit information is characterized by a certain number, the throughput (capacity), equal to the logarithm of the effective number of messages distinguishable at its output. The process of information transmission can be considered reliable if the message transmission rate is less than the channel capacity; otherwise, reliable transmission is impossible. The main result of information theory is the statement: if the entropy of the source is less than the channel capacity, then the original message can be reproduced at the output with an arbitrarily small error; if the entropy of the source exceeds the channel capacity, then the error cannot be made small.

The difficulty of conveying a message does not depend on its content; it is no less difficult to convey meaningless messages than meaningful ones. For example, the number 23 in one context may be the price of a barrel of oil and in another the number of the winner of a race. The meaning of a message depends on context and semantics, while the difficulty of its transmission is determined only by the list of possible messages (and their probabilities).

Any information transmission system can be considered as consisting of a message source, a transmitter, a communication channel, a receiving device and a recipient. For example, during a telephone conversation the source is the speaker and the message is his speech; the communication channel is the wires that carry the electrical signal from the speaker to the listener, the recipient of the message. A communication channel is a medium for transmitting a signal from a transmitter to a receiver. As the signal passes through the channel, it may be affected by interference, which introduces distortions into the values of the information parameters of the signal.

Between the sender of the message and the communication channel there may be devices that convert the message into a form convenient for transmission over the communication channel. A decoder installed at the other end of the channel reconstructs the received message.

The study of information transmission systems begins with the source of messages. A wide variety of information can be transmitted through a communication channel: text, live speech, music or images. For each source, you can specify a list of messages that it can generate. For example, the source of telegraphic or telex messages transmits only letters and does not contain, say, musical notations. If live speech is transmitted over a communication channel, the signal loses useful content at a frequency above 20,000 Hz, the upper limit perceived by human hearing. These facts can be used when designing the input of a communication channel.

To estimate the amount of information in a message, information theory uses a logarithmic measure introduced by R. Hartley, whose probabilistic interpretation was given in the works of Shannon. If the probability of message x appearing is p(x), and 0 < p(x) < 1, then the amount of information I(x) contained in the message is determined by the formula:

I(x) = log₂(1/p(x)) = −log₂ p(x).

2. Hartley and Shannon formulas

In 1928, the American engineer Ralph Hartley considered the process of obtaining information as the choice of one message from a finite, given set of N equally probable events.

Hartley's formula:

K = log₂ N,

where K is the amount of information and N is the number of equally probable events.

Hartley's formula can also be written as follows: N = 2^K.

Since the occurrence of each of the N events has the same probability P, we have P = 1/N, where P is the probability of the event occurring. Then Hartley's formula can be written differently:

K = log₂(1/P) = −log₂ P.

In 1948, the American scientist Claude Shannon proposed a different formula for determining the amount of information, taking into account the possible unequal probability of events in the set.

Shannon's formula:

K = −(p₁ · log₂ p₁ + p₂ · log₂ p₂ + p₃ · log₂ p₃ + … + p_N · log₂ p_N),

where pᵢ is the probability that exactly the i-th message is selected from the set of N messages.

This formula can also be written as K = −Σ pᵢ · log₂ pᵢ, where the sum is taken over i from 1 to N.

The modern science of the properties of information and the regularities of information processes is called information theory. The content of the concept of "information" can be revealed using the two historically first approaches to measuring the amount of information, those of Hartley and Shannon: the first is based on set theory and combinatorics, the second on probability theory.

Information can be understood and interpreted in different problems and subject areas in different ways. As a result, there are different approaches to defining the measurement of information and different ways of introducing a measure of the amount of information.

The amount of information is a numerical value that adequately characterizes the updated information in terms of diversity, complexity, structure (orderliness), certainty, and choice of states of the displayed system.

If we are considering a system that can be in one of n possible states, then the practical task is to evaluate the choice made, i.e. the outcome. Such an evaluation can serve as a measure of the information (of the event).

A measure is a continuous, real-valued, non-negative function defined on a set of events that is additive.

Measures can be static or dynamic, depending on what kind of information they allow us to evaluate: static information (not actualized; in effect, messages are evaluated without taking into account the resources and form of actualization) or dynamic information (actualized, i.e. the resource costs of actualizing the information are also taken into account).

There are different approaches to determining the amount of information. The most commonly used are volumetric and probabilistic.

Volume approach.

The binary number system is used because in a technical device it is most simple to implement two opposite physical states: magnetized / not magnetized, on / off, charged / not charged, and others.

The amount of information recorded in binary characters in computer memory or on an external storage medium is calculated simply by the number of binary characters required for such recording. In this case, a non-integer number of bits is not possible.

For ease of use, larger units of information quantity than bits have been introduced. Thus, an eight-character binary word contains one byte of information, 1024 bytes form a kilobyte (KB), 1024 kilobytes form a megabyte (MB), and 1024 megabytes form a gigabyte (GB).

Entropy (probability) approach.

This approach is adopted in information and coding theory. This measurement method is based on the following model: the recipient of the message has a certain idea about the possible occurrence of certain events. These ideas are generally unreliable and are expressed by the probabilities with which he expects this or that event. The general measure of uncertainty is called entropy. Entropy is characterized by some mathematical dependence on the total probability of the occurrence of these events.

The amount of information in a message is determined by how much this measure has decreased after receiving the message: the greater the entropy of the system, the greater the degree of its uncertainty. An incoming message completely or partially removes this uncertainty; therefore, the amount of information can be measured by how much the entropy of the system has decreased after receiving the message. The same entropy, but with the opposite sign, is taken as a measure of the amount of information.

R. Hartley's approach is based on fundamental set-theoretic, essentially combinatorial foundations, as well as several intuitively clear and quite obvious assumptions.

If there is a set of elements and one of them is selected, then a certain amount of information is thereby communicated or generated. This information consists in the fact that, while before the selection it was not known which element would be chosen, after the selection it becomes known. It is necessary to find the form of the function that connects the amount of information obtained when choosing an element from a set with the number of elements in that set, i.e. with its cardinality (power).

If the set of elements from which a choice is made consists of one single element, then it is clear that its choice is predetermined, i.e. there is no uncertainty of choice - zero amount of information.

If the set consists of two elements, then the uncertainty of choice is minimal. In this case, the amount of information is minimal.

The more elements in the set, the greater the uncertainty of choice, the more information.

Thus, the logarithmic measure of information proposed by Hartley simultaneously satisfies the conditions of monotonicity and additivity. Hartley himself arrived at his measure on the basis of heuristic considerations similar to those just outlined, but it has now been rigorously proven that the logarithmic measure for the amount of information follows unambiguously from these two conditions he postulated.

In 1948, exploring the problem of rational transmission of information through a noisy communication channel, Claude Shannon proposed a revolutionary probabilistic approach to understanding communications and created the first truly mathematical theory of entropy. His sensational ideas quickly became the basis for the development of two major fields: information theory, which uses the concept of probability and ergodic theory to study the statistical characteristics of data and communication systems, and coding theory, which uses mainly algebraic and geometric tools to develop efficient codes.

Claude Shannon suggested that the gain in information is equal to the loss of uncertainty, and set requirements for its measurement:

1. the measure must be continuous, that is, a small change in the value of a probability should cause only a small change in the value of the function;

2. when all options (for example, letters) are equally probable, increasing the number of options should always increase the value of the function;

3. it should be possible to make a choice (in our example, of letters) in two steps, and the value of the function for the final result should be the sum of the values of the function for the intermediate results.

Therefore, the entropy function H must satisfy the following conditions:

1. H(p₁, …, pₙ) is defined and continuous for all p₁, …, pₙ, where pᵢ ∈ [0, 1] for every i and p₁ + … + pₙ = 1. (It is easy to see that this function depends only on the probability distribution, but not on the alphabet.)

2. For positive integers n, the following inequality must hold:

H(1/n, …, 1/n) < H(1/(n+1), …, 1/(n+1)).

3. For positive integers bᵢ, where b₁ + … + bₖ = n, the following equality must hold:

H(1/n, …, 1/n) = H(b₁/n, …, bₖ/n) + Σ (bᵢ/n) · H(1/bᵢ, …, 1/bᵢ), where the sum is taken over i = 1, …, k.
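Assuming the standard statement of these conditions as reconstructed above, the third (grouping) property is easy to verify numerically; splitting n = 6 outcomes into groups of sizes 2 and 4 is an arbitrary example.

```python
# A numeric check of the grouping condition for n = 6 split into groups b = (2, 4).
import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

n, b = 6, (2, 4)
lhs = H([1 / n] * n)                                   # H(1/6, ..., 1/6) = log2(6)
rhs = H([bi / n for bi in b]) + sum((bi / n) * H([1 / bi] * bi) for bi in b)
print(round(lhs, 6), round(rhs, 6))                    # both ~ 2.584963
```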


Shannon showed that a measurement of entropy applied to an information source can determine the minimum channel capacity required to reliably transmit the information in the form of encoded binary digits. To derive Shannon's formula, one calculates the mathematical expectation of the "amount of information" contained in a symbol from the information source. Shannon's entropy measure expresses the uncertainty of the realization of a random variable. Thus, entropy is the difference between the information contained in a message and the part of the information that is exactly known (or well predicted) in advance. An example of this is the redundancy of language: there are obvious statistical patterns in the appearance of letters, pairs of consecutive letters, triplets, and so on.
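The remark about redundancy can be illustrated with a small sketch that estimates the per-letter entropy of a piece of English text; the sample string below is arbitrary and far too short for a reliable estimate, it merely shows that uneven letter frequencies push the entropy below log₂ 26.

```python
# Empirical per-letter entropy of a short text versus the equal-frequency bound log2(26).
import math
from collections import Counter

text = "information theory studies the transmission storage and processing of information"
letters = [c for c in text.lower() if c.isalpha()]
counts = Counter(letters)
total = len(letters)

H = -sum((n / total) * math.log2(n / total) for n in counts.values())
print(f"empirical entropy ~ {H:.2f} bits per letter, versus log2(26) ~ {math.log2(26):.2f}")
```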

