Literature      17.09.2021

Typical discrete distributions of random variables. Normal law of probability distribution. Determining the distribution of a random variable

The normal distribution law is the most common one in practice. The main feature that distinguishes it from other laws is that it is a limiting law, which other distribution laws approach under very common, typical conditions (see Chap. 6).

Definition. A continuous random variable X has a normal distribution law (Gauss law) with parameters $a$ and $\sigma^2$ if its probability density has the form

$\varphi_N(x)=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-a)^2}{2\sigma^2}}.$ (4.26)

The term "normal" is not entirely apt. Many characteristics obey the normal law, for example, a person's height, the range of a projectile, and so on. But if some characteristic obeys a distribution law other than the normal one, this does not at all indicate any "abnormality" of the phenomenon associated with that characteristic.

The curve of the normal distribution law is called the normal, or Gaussian, curve. Fig. 4.6 a, b shows the normal curve $\varphi_N(x)$ with parameters $a$ and $\sigma^2$, i.e. $N(a;\ \sigma^2)$, and the graph of the distribution function of a random variable X that has a normal law. Note that the normal curve is symmetric about the line $x = a$, has a maximum at the point $x = a$ equal to $\frac{1}{\sigma\sqrt{2\pi}}$, i.e. $\varphi_N(a)=\frac{1}{\sigma\sqrt{2\pi}}$, and has two inflection points $x = a \pm \sigma$ with ordinate $\varphi_N(a\pm\sigma)=\frac{1}{\sigma\sqrt{2\pi e}}$.
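The maximum and the inflection-point ordinate of the normal curve can be checked numerically. The following is only a minimal sketch: the helper name `normal_pdf` and the parameter values `a = 173`, `sigma = 6` are assumptions chosen for illustration.

```python
import math

def normal_pdf(x, a, sigma):
    """Density of the normal law N(a; sigma^2), formula (4.26)."""
    return math.exp(-(x - a) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

a, sigma = 173.0, 6.0  # illustrative parameters (an assumption of this sketch)

peak = normal_pdf(a, a, sigma)                # value at the center of symmetry x = a
inflection = normal_pdf(a + sigma, a, sigma)  # value at the inflection point x = a + sigma

# Compare with the closed-form ordinates 1/(sigma*sqrt(2*pi)) and 1/(sigma*sqrt(2*pi*e)).
print(peak, 1 / (sigma * math.sqrt(2 * math.pi)))
print(inflection, 1 / (sigma * math.sqrt(2 * math.pi * math.e)))
```

By symmetry, the density at $x = a - \sigma$ is the same as at $x = a + \sigma$.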

Note that in the expression for the density of the normal law the parameters are denoted by the letters $a$ and $\sigma^2$, which we also use for the mathematical expectation $M(X)$ and the variance $D(X)$. This coincidence is not accidental. Let us consider a theorem establishing the probabilistic meaning of the parameters of the normal law.

Theorem. The mathematical expectation of a random variable X distributed according to the normal law is equal to the parameter $a$ of this law, i.e. $M(X) = a$, and its variance is equal to the parameter $\sigma^2$, i.e. $D(X) = \sigma^2$.

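Mathematical expectation of the random variable X:

$M(X)=\int_{-\infty}^{+\infty}x\,\varphi_N(x)\,dx=\frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{+\infty}x\,e^{-\frac{(x-a)^2}{2\sigma^2}}\,dx.$

We make a change of variable by setting $t=\frac{x-a}{\sigma\sqrt{2}}$. Then $x=a+\sigma\sqrt{2}\,t$, $dx=\sigma\sqrt{2}\,dt$, the limits of integration do not change, and hence

$M(X)=\frac{1}{\sqrt{\pi}}\int_{-\infty}^{+\infty}\left(a+\sigma\sqrt{2}\,t\right)e^{-t^2}\,dt=\frac{\sigma\sqrt{2}}{\sqrt{\pi}}\int_{-\infty}^{+\infty}t\,e^{-t^2}\,dt+\frac{a}{\sqrt{\pi}}\int_{-\infty}^{+\infty}e^{-t^2}\,dt=0+\frac{a}{\sqrt{\pi}}\cdot\sqrt{\pi}=a$

(the first integral is zero as the integral of an odd function over an interval symmetric about the origin, and the second is the Euler-Poisson integral, equal to $\sqrt{\pi}$).

Variance of the random variable X:

$D(X)=\int_{-\infty}^{+\infty}(x-a)^2\,\varphi_N(x)\,dx=\frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{+\infty}(x-a)^2\,e^{-\frac{(x-a)^2}{2\sigma^2}}\,dx.$

We make the same change of variable $x=a+\sigma\sqrt{2}\,t$ as in the calculation of the previous integral. Then

$D(X)=\frac{2\sigma^2}{\sqrt{\pi}}\int_{-\infty}^{+\infty}t^2e^{-t^2}\,dt.$

Applying integration by parts (with $u=t$, $dv=t\,e^{-t^2}dt$), we obtain

$D(X)=\frac{2\sigma^2}{\sqrt{\pi}}\left(\left.-\frac{t\,e^{-t^2}}{2}\right|_{-\infty}^{+\infty}+\frac{1}{2}\int_{-\infty}^{+\infty}e^{-t^2}\,dt\right)=\frac{2\sigma^2}{\sqrt{\pi}}\cdot\frac{\sqrt{\pi}}{2}=\sigma^2.$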
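The result of the theorem, $M(X)=a$ and $D(X)=\sigma^2$, can also be checked by direct numerical integration of the density. This is a rough sketch only: the midpoint-rule helper `integrate`, the truncation of the infinite range to $\pm 12\sigma$, and the parameter values are all assumptions of the illustration.

```python
import math

def normal_pdf(x, a, sigma):
    """Density of the normal law N(a; sigma^2)."""
    return math.exp(-(x - a) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi] with n equal subintervals."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

a, sigma = 2.0, 1.5  # hypothetical parameters
lo, hi = a - 12 * sigma, a + 12 * sigma  # the tails beyond 12 sigma are negligible

mean = integrate(lambda x: x * normal_pdf(x, a, sigma), lo, hi)
var = integrate(lambda x: (x - a) ** 2 * normal_pdf(x, a, sigma), lo, hi)
print(mean, var)  # should be close to a and sigma**2
```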

Let us find out how the normal curve changes when the parameters $a$ and $\sigma^2$ (or $\sigma$) change. If $\sigma$ = const and the parameter $a$ changes ($a_1 < a_2 < a_3$), i.e. the center of symmetry of the distribution moves, then the normal curve shifts along the x-axis without changing its shape (Fig. 4.7).

If $a$ = const and the parameter $\sigma^2$ (or $\sigma$) changes, then the ordinate of the maximum of the curve, $\varphi_N(a)=\frac{1}{\sigma\sqrt{2\pi}}$, changes. As $\sigma$ increases, the ordinate of the maximum decreases, but since the area under any distribution curve must remain equal to one, the curve becomes flatter, stretching along the x-axis; as $\sigma$ decreases, on the contrary, the normal curve stretches upward while shrinking at the sides. Fig. 4.8 shows normal curves with parameters $\sigma_1$, $\sigma_2$ and $\sigma_3$, where $\sigma_1 < \sigma_2 < \sigma_3$. Thus, the parameter $a$ (the mathematical expectation) characterizes the position of the center, and the parameter $\sigma^2$ (the variance) characterizes the shape of the normal curve.

The normal distribution law of a random variable X with parameters $a = 0$, $\sigma^2 = 1$, i.e. $X \sim N(0;\ 1)$, is called standard or normalized, and the corresponding normal curve is called standard or normalized.

The difficulty of directly finding the distribution function of a normally distributed random variable by formula (3.23), and the probability of its falling into a given interval by formula (3.22), stems from the fact that the integral of the function (4.26) cannot be expressed in elementary functions. Therefore, they are expressed through the function

$\Phi(x)=\frac{2}{\sqrt{2\pi}}\int_{0}^{x}e^{-t^2/2}\,dt,$ (4.29)

the Laplace function (probability integral), for which tables have been compiled. Recall that we already encountered the Laplace function when considering the integral theorem of Moivre - Laplace (see Section 2.3), where its properties were also considered. Geometrically, the Laplace function $\Phi(x)$ is the area under the standard normal curve on the segment $[-x;\ x]$ (Fig. 4.9)¹.

Fig. 4.10

Fig. 4.9
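Where a table of the Laplace function is not at hand, its values can be computed from the error function, since the area under the standard normal curve on $[-x;\ x]$ equals $\operatorname{erf}(x/\sqrt{2})$. A minimal sketch follows; the function name `laplace` is an assumption, and `math.erf` is the standard-library error function.

```python
import math

def laplace(x):
    """Laplace function: area under the standard normal curve on [-x, x]."""
    return math.erf(x / math.sqrt(2))

# Print a few values in the four-decimal form used by printed tables.
for x in (0.5, 1.0, 1.5, 2.0, 3.0):
    print(f"Phi({x}) = {laplace(x):.4f}")
```

The function is odd, so `laplace(-x)` equals `-laplace(x)`, in line with the oddness property used in the text.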

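Theorem. The distribution function of a random variable X distributed according to the normal law is expressed in terms of the Laplace function $\Phi(x)$ by the formula

$F_N(x)=\frac{1}{2}+\frac{1}{2}\Phi\left(\frac{x-a}{\sigma}\right).$ (4.30)

By formula (3.23), the distribution function is

$F_N(x)=\int_{-\infty}^{x}\varphi_N(x)\,dx=\frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x}e^{-\frac{(x-a)^2}{2\sigma^2}}\,dx.$

Let us make the change of variable $t=\frac{x-a}{\sigma}$, so that $x=a+\sigma t$, $dx=\sigma\,dt$, and $t\to-\infty$ as $x\to-\infty$; hence

$F_N(x)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\frac{x-a}{\sigma}}e^{-t^2/2}\,dt=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{0}e^{-t^2/2}\,dt+\frac{1}{\sqrt{2\pi}}\int_{0}^{\frac{x-a}{\sigma}}e^{-t^2/2}\,dt.$

The first integral is

$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{0}e^{-t^2/2}\,dt=\frac{1}{2}\cdot\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}e^{-t^2/2}\,dt=\frac{1}{2}\cdot\frac{1}{\sqrt{2\pi}}\cdot\sqrt{2\pi}=\frac{1}{2}$

(due to the evenness of the integrand and the fact that, by the Euler-Poisson integral, $\int_{-\infty}^{+\infty}e^{-t^2/2}\,dt=\sqrt{2\pi}$).

The second integral, taking into account formula (4.29), is $\frac{1}{2}\Phi\left(\frac{x-a}{\sigma}\right)$, which gives formula (4.30).

¹ Along with the probability integral of the form (4.29), which represents the function $\Phi(x)$, other tabulated functions are also used in the literature:

$\frac{1}{\sqrt{2\pi}}\int_{0}^{x}e^{-t^2/2}\,dt,\quad \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-t^2/2}\,dt,\quad \frac{2}{\sqrt{\pi}}\int_{0}^{x}e^{-t^2}\,dt,$

which are the areas under the standard normal curve on the intervals $(0;\ x]$, $(-\infty;\ x]$ and $[-x\sqrt{2};\ x\sqrt{2}]$, respectively.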

Geometrically, the distribution function is the area under the normal curve on the interval $(-\infty,\ x)$ (Fig. 4.10). As you can see, it consists of two parts: the first, on the interval $(-\infty,\ a)$, is equal to 1/2, i.e. half of the entire area under the normal curve, and the second, on the interval $(a,\ x)$, is equal to $\frac{1}{2}\Phi\left(\frac{x-a}{\sigma}\right)$.
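The expression of the distribution function through the Laplace function, $F_N(x)=\frac12+\frac12\Phi\left(\frac{x-a}{\sigma}\right)$, can be compared against an independent implementation. In the sketch below the names `laplace` and `normal_cdf` and the test parameters are assumptions; `statistics.NormalDist` is the standard-library normal distribution (Python 3.8+).

```python
import math
from statistics import NormalDist

def laplace(x):
    """Laplace function: area under the standard normal curve on [-x, x]."""
    return math.erf(x / math.sqrt(2))

def normal_cdf(x, a, sigma):
    """Distribution function of N(a; sigma^2) via the Laplace function."""
    return 0.5 + 0.5 * laplace((x - a) / sigma)

a, sigma = 173.0, 6.0  # hypothetical parameters
for x in (160.0, 173.0, 176.0, 190.0):
    # Both columns should agree to within floating-point rounding.
    print(x, normal_cdf(x, a, sigma), NormalDist(a, sigma).cdf(x))
```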

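Consider the properties of a random variable distributed according to the normal law.

1. The probability that a random variable X distributed according to the normal law falls into the interval $[x_1,\ x_2]$ is equal to

$P(x_1\le X\le x_2)=\frac{1}{2}\left[\Phi(t_2)-\Phi(t_1)\right],$ (4.32)

where $t_1=\frac{x_1-a}{\sigma}$, $t_2=\frac{x_2-a}{\sigma}$. (4.33)

Considering that, by property (3.20), the probability $P(x_1\le X\le x_2)=F(x_2)-F(x_1)$, and using formula (4.30), we obtain

$P(x_1\le X\le x_2)=\left[\frac{1}{2}+\frac{1}{2}\Phi(t_2)\right]-\left[\frac{1}{2}+\frac{1}{2}\Phi(t_1)\right]=\frac{1}{2}\left[\Phi(t_2)-\Phi(t_1)\right],$

where $t_1$ and $t_2$ are determined by formula (4.33) (Fig. 4.11).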

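2. The probability that the deviation of a random variable X distributed according to the normal law from its mathematical expectation $a$ does not exceed $\Delta > 0$ in absolute value is equal to

$P(|X-a|\le\Delta)=\Phi(t),$ (4.34)

where $t=\Delta/\sigma$.

Since $P(|X-a|\le\Delta)=P(a-\Delta\le X\le a+\Delta)$, using property (4.32) as well as the oddness of the Laplace function, we obtain

$P(|X-a|\le\Delta)=\frac{1}{2}\left[\Phi\left(\frac{\Delta}{\sigma}\right)-\Phi\left(-\frac{\Delta}{\sigma}\right)\right]=\Phi(t),$

where $t=\Delta/\sigma$ (Fig. 4.12).

Figs. 4.11 and 4.12 give a geometric interpretation of these properties of the normal law.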
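The two properties, the interval probability $\frac12[\Phi(t_2)-\Phi(t_1)]$ and the deviation probability $\Phi(\Delta/\sigma)$, translate directly into code. The sketch below is illustrative only; the function names and the parameter values are assumptions.

```python
import math

def laplace(x):
    """Laplace function: area under the standard normal curve on [-x, x]."""
    return math.erf(x / math.sqrt(2))

def prob_interval(x1, x2, a, sigma):
    """P(x1 <= X <= x2) for X ~ N(a; sigma^2), property 1."""
    t1 = (x1 - a) / sigma
    t2 = (x2 - a) / sigma
    return 0.5 * (laplace(t2) - laplace(t1))

def prob_deviation(delta, sigma):
    """P(|X - a| <= delta) for X ~ N(a; sigma^2), property 2."""
    return laplace(delta / sigma)

a, sigma, delta = 10.0, 2.0, 3.0  # hypothetical values
# For an interval symmetric about a, the two properties must give the same number.
print(prob_interval(a - delta, a + delta, a, sigma), prob_deviation(delta, sigma))
```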

Remark. The approximate integral formula of Moivre - Laplace (2.10) considered in Chap. 2 follows from property (4.32) of a normally distributed random variable with $x_1 = a$, $x_2 = b$, mathematical expectation $np$ and $\sigma=\sqrt{npq}$, since the binomial distribution law of the random variable $X = m$ with parameters $n$ and $p$, for which this formula was obtained, tends to the normal law as $n\to\infty$ (see Chap. 6).

Similarly, the corollaries (2.13), (2.14) and (2.16) of the Moivre - Laplace integral formula for the number $X = m$ of occurrences of an event in $n$ independent trials and for its frequency $m/n$ follow from properties (4.32) and (4.34) of the normal law.

Let us calculate by formula (4.34) the probabilities $P(|X-a|\le\Delta)$ for various values of $\Delta$ (using Table II of the appendices). We get:

for $\Delta=\sigma$: $P(|X-a|\le\sigma)=\Phi(1)=0.6827$;

for $\Delta=2\sigma$: $P(|X-a|\le 2\sigma)=\Phi(2)=0.9545$;

for $\Delta=3\sigma$: $P(|X-a|\le 3\sigma)=\Phi(3)=0.9973$.

This is where the "three-sigma rule" comes from.

If a random variable X has a normal distribution law with parameters $a$ and $\sigma^2$, i.e. $N(a;\ \sigma^2)$, then it is practically certain that its values lie in the interval $(a-3\sigma,\ a+3\sigma)$.

A violation of the "three-sigma rule", i.e. a deviation of a normally distributed random variable X by more than $3\sigma$ (in absolute value), is a practically impossible event, since its probability is very small:

$P(|X-a|>3\sigma)=1-P(|X-a|\le 3\sigma)=1-0.9973=0.0027.$

Note that the deviation $\Delta_{\text{prob}}$ for which $P(|X-a|\le\Delta_{\text{prob}})=\frac{1}{2}$ is called the probable deviation. For the normal law $\Delta_{\text{prob}}\approx 0.675\sigma$, i.e. the interval $(a-0.675\sigma,\ a+0.675\sigma)$ accounts for half of the total area under the normal curve.
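The numbers behind the three-sigma rule and the probable deviation can be reproduced without tables. This is a small sketch under the assumption that the Laplace function is computed as `erf(x / sqrt(2))`; `NormalDist().inv_cdf` is the standard-library quantile function of the standard normal law.

```python
import math
from statistics import NormalDist

def laplace(x):
    """Laplace function: area under the standard normal curve on [-x, x]."""
    return math.erf(x / math.sqrt(2))

# Probabilities of deviating from a by at most one, two and three sigma.
for k in (1, 2, 3):
    print(f"P(|X - a| <= {k} sigma) = {laplace(k):.4f}")

# Probability of violating the three-sigma rule.
tail = 1 - laplace(3)

# Probable deviation in units of sigma: P(|X - a| <= t*sigma) = 1/2
# is equivalent to the standard normal CDF at t being 0.75.
probable = NormalDist().inv_cdf(0.75)
print(f"tail = {tail:.4f}, probable deviation = {probable:.4f} sigma")
```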

Let us find the skewness coefficient and the kurtosis of a random variable X distributed according to the normal law.

Obviously, due to the symmetry of the normal curve about the vertical line $x = a$ passing through the distribution center $a = M(X)$, the skewness coefficient of the normal distribution is $A = 0$.

The kurtosis of a normally distributed random variable X is found by formula (3.37), i.e.

$E=\frac{\mu_4}{\sigma^4}-3=\frac{3\sigma^4}{\sigma^4}-3=0,$

where we took into account that the fourth-order central moment, found by formula (3.30) using definition (4.26), is

$\mu_4=\int_{-\infty}^{+\infty}(x-a)^4\,\varphi_N(x)\,dx=3\sigma^4$

(we omit the calculation of the integral).

Thus, the kurtosis of the normal distribution is zero, and the peakedness of other distributions is measured relative to the normal one (as already mentioned in Section 3.7).

Example 4.9. Assuming that the height of men of a certain age group is a normally distributed random variable X with parameters $a = 173$ and $\sigma^2 = 36$:

  • 1) Find: a) the expressions for the probability density and the distribution function of the random variable X; b) the shares of suits of the 4th height (176-182 cm) and of the 3rd height (170-176 cm) that must be provided for in the total production volume for this age group; c) the quantile $x_{0.7}$ and the 10% point of the random variable X.
  • 2) Formulate the "three-sigma rule" for the random variable X.

Solution. 1, a) Using formulas (4.26) and (4.30), we write

$\varphi_N(x)=\frac{1}{6\sqrt{2\pi}}\,e^{-\frac{(x-173)^2}{2\cdot 36}};\quad F_N(x)=\frac{1}{2}+\frac{1}{2}\Phi\left(\frac{x-173}{6}\right).$

1, b) The share of suits of the 4th height (176-182 cm) in the total production volume is determined by formula (4.32) as the probability

$P(176\le X\le 182)=\frac{1}{2}\left[\Phi(t_2)-\Phi(t_1)\right]=\frac{1}{2}\left[\Phi(1.50)-\Phi(0.50)\right]=\frac{1}{2}(0.8664-0.3829)=0.2418$

(Fig. 4.14), since by formulas (4.33)

$t_1=\frac{176-173}{6}=0.50,\quad t_2=\frac{182-173}{6}=1.50.$

The share of suits of the 3rd height (170-176 cm) could be determined similarly by formula (4.32), but it is easier to use formula (4.34), given that this interval is symmetric about the mathematical expectation $a = M(X) = 173$, i.e. the inequality $170\le X\le 176$ is equivalent to the inequality $|X-173|\le 3$:

$P(170\le X\le 176)=P(|X-173|\le 3)=\Phi\left(\frac{3}{6}\right)=\Phi(0.50)=0.3829$

(see Fig. 4.14).

1, c) The quantile $x_{0.7}$ (see Section 3.7) of the random variable X is found from equation (3.29) taking into account formula (4.30):

$F_N(x_{0.7})=\frac{1}{2}+\frac{1}{2}\Phi\left(\frac{x_{0.7}-173}{6}\right)=0.7,$

whence $\Phi(t)=0.40$, where $t=\frac{x_{0.7}-173}{6}$.

From Table II of the appendices we find $t = 0.524$ and

$x_{0.7}=6t+173=6\cdot 0.524+173\approx 176.$

This means that 70% of men in this age group are under 176 cm tall.

The 10% point is the quantile $x_{0.9}$ = 181 cm (found similarly), i.e. 10% of men are at least 181 cm tall.

  • 2) It is practically certain that the height of men of this age group lies between $a-3\sigma=173-3\cdot 6=155$ and $a+3\sigma=173+3\cdot 6=191$ (cm), i.e. $155\le X\le 191$ (cm).
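The answers of Example 4.9 can be cross-checked with the standard library. The sketch below assumes only the parameters given in the example ($a = 173$, $\sigma = 6$); the variable names are illustrative, and small differences from the table-based values are due to rounding in four-digit tables.

```python
from statistics import NormalDist

height = NormalDist(mu=173, sigma=6)  # a = 173, sigma^2 = 36

# 1, b) shares of suits of the 4th and 3rd heights
share_4th = height.cdf(182) - height.cdf(176)
share_3rd = height.cdf(176) - height.cdf(170)

# 1, c) quantile x_0.7 and the 10% point (quantile x_0.9)
x_07 = height.inv_cdf(0.7)
x_09 = height.inv_cdf(0.9)

# 2) three-sigma limits
low, high = 173 - 3 * 6, 173 + 3 * 6

print(f"4th height: {share_4th:.4f}, 3rd height: {share_3rd:.4f}")
print(f"x_0.7 = {x_07:.1f} cm, x_0.9 = {x_09:.1f} cm, limits: {low}..{high} cm")
```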

    Due to the features of the normal distribution law noted at the beginning of the paragraph (and in Chapter 6), it occupies a central place in the theory and practice of probabilistic-statistical methods. The great theoretical significance of the normal law lies in the fact that with its help a number of important distributions are obtained, which are considered below.

    • The arrows in Figs. 4.11-4.13 conventionally mark the areas of the corresponding figures under the normal curve.
    • The values of the Laplace function $\Phi(x)$ are determined from Table II of the appendices.

In many problems related to normally distributed random variables, one has to determine the probability that a random variable $X$ obeying the normal law with parameters $m$, $\sigma$ falls into the interval from $\alpha$ to $\beta$. To calculate this probability, we use the general formula

$P(\alpha<X<\beta)=F(\beta)-F(\alpha),$ (6.3.1)

where $F(x)$ is the distribution function of the quantity $X$.

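Let us find the distribution function $F(x)$ of a random variable $X$ distributed according to the normal law with parameters $m$, $\sigma$. The distribution density of $X$ is

$f(x)=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-m)^2}{2\sigma^2}}.$ (6.3.2)

From here we find the distribution function

$F(x)=\int_{-\infty}^{x}f(x)\,dx=\frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x}e^{-\frac{(x-m)^2}{2\sigma^2}}\,dx.$ (6.3.3)

Let us make the change of variable $\frac{x-m}{\sigma}=t$ in the integral (6.3.3) and bring it to the form

$F(x)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\frac{x-m}{\sigma}}e^{-\frac{t^2}{2}}\,dt.$ (6.3.4)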

The integral (6.3.4) cannot be expressed in terms of elementary functions, but it can be computed through a special function expressing the definite integral of $e^{-t^2}$ or $e^{-\frac{t^2}{2}}$ (the so-called probability integral), for which tables have been compiled. There are many varieties of such functions, for example:

$\frac{2}{\sqrt{\pi}}\int_{0}^{x}e^{-t^2}\,dt;\quad \frac{2}{\sqrt{2\pi}}\int_{0}^{x}e^{-\frac{t^2}{2}}\,dt$

etc. Which of these functions to use is a matter of taste. We will choose as such a function

$\Phi^*(x)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-\frac{t^2}{2}}\,dt.$ (6.3.5)

It is easy to see that this function is nothing other than the distribution function of a normally distributed random variable with parameters $m=0$, $\sigma=1$.

We agree to call the function $\Phi^*(x)$ the normal distribution function. The appendix (Table 1) gives tables of values of $\Phi^*(x)$.

Let us express the distribution function (6.3.3) of the quantity $X$ with parameters $m$ and $\sigma$ in terms of the normal distribution function $\Phi^*(x)$. Obviously,

$F(x)=\Phi^*\left(\frac{x-m}{\sigma}\right).$ (6.3.6)

Now let us find the probability that the random variable $X$ falls into the section from $\alpha$ to $\beta$. According to formula (6.3.1),

$P(\alpha<X<\beta)=\Phi^*\left(\frac{\beta-m}{\sigma}\right)-\Phi^*\left(\frac{\alpha-m}{\sigma}\right).$ (6.3.7)

Thus, we have expressed the probability that a random variable $X$, distributed according to the normal law with arbitrary parameters, falls into a given section in terms of the standard distribution function $\Phi^*(x)$, corresponding to the simplest normal law with parameters 0, 1. Note that the arguments of $\Phi^*$ in formula (6.3.7) have a very simple meaning: $\frac{\beta-m}{\sigma}$ is the distance from the right end of the section $\beta$ to the center of dispersion, expressed in standard deviations; $\frac{\alpha-m}{\sigma}$ is the same distance for the left end of the section, and this distance is considered positive if the end lies to the right of the center of dispersion and negative if it lies to the left.
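Formula (6.3.7) reduces any interval probability to two evaluations of the standard normal distribution function at the standardized end points. A minimal sketch, assuming the standard-library `NormalDist().cdf` in the role of the tabulated function; the helper name `prob_between` and the numbers are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal law with parameters 0, 1

def prob_between(alpha, beta, m, sigma):
    """P(alpha < X < beta) for X ~ N(m, sigma), via standardized end points."""
    return STD.cdf((beta - m) / sigma) - STD.cdf((alpha - m) / sigma)

m, sigma = 10.0, 2.0      # hypothetical parameters
alpha, beta = 9.0, 13.0   # hypothetical section
# End points in standard deviations: (9 - 10)/2 = -0.5 and (13 - 10)/2 = 1.5.
print(f"{prob_between(alpha, beta, m, sigma):.4f}")
```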

Like any distribution function, the function $\Phi^*(x)$ has the following properties:

1. $\Phi^*(-\infty)=0$.

2. $\Phi^*(+\infty)=1$.

3. $\Phi^*(x)$ is a non-decreasing function.

In addition, from the symmetry of the normal distribution with parameters $m=0$, $\sigma=1$ about the origin, it follows that

$\Phi^*(-x)=1-\Phi^*(x).$ (6.3.8)

Using this property, it would in fact be possible to limit the tables of $\Phi^*(x)$ to positive values of the argument only, but in order to avoid an unnecessary operation (subtraction from one), Table 1 of the appendix gives values of $\Phi^*(x)$ for both positive and negative arguments.

In practice, one often encounters the problem of calculating the probability that a normally distributed random variable falls into a section symmetric about the center of dispersion $m$. Consider such a section of length $2l$ (Fig. 6.3.1). Let us calculate the probability of hitting this section by formula (6.3.7):

$P(m-l<X<m+l)=\Phi^*\left(\frac{l}{\sigma}\right)-\Phi^*\left(-\frac{l}{\sigma}\right).$ (6.3.9)

Taking into account property (6.3.8) of the function $\Phi^*(x)$ and writing the left side of formula (6.3.9) in a more compact form, we obtain a formula for the probability that a random variable distributed according to the normal law falls into a section symmetric about the center of dispersion:

$P(|X-m|<l)=2\Phi^*\left(\frac{l}{\sigma}\right)-1.$ (6.3.10)

Let us solve the following problem. Let us lay off successive segments of length $\sigma$ from the center of dispersion $m$ (Fig. 6.3.2) and calculate the probability that the random variable $X$ falls into each of them. Since the curve of the normal law is symmetric, it is enough to lay off such segments in one direction only.

By formula (6.3.7) we find:

$P(m<X<m+\sigma)=\Phi^*(1)-\Phi^*(0)\approx 0.341;$
$P(m+\sigma<X<m+2\sigma)=\Phi^*(2)-\Phi^*(1)\approx 0.136;$
$P(m+2\sigma<X<m+3\sigma)=\Phi^*(3)-\Phi^*(2)\approx 0.021;$
$P(m+3\sigma<X<m+4\sigma)=\Phi^*(4)-\Phi^*(3)\approx 0.001.$ (6.3.11)

As can be seen from these data, the probabilities of hitting each of the following segments (fifth, sixth, etc.) are equal to zero to within 0.001.

Rounding the probabilities of hitting the segments to 0.01 (to 1%), we get three numbers that are easy to remember:

0.34; 0.14; 0.02.

The sum of these three values is 0.5. This means that for a normally distributed random variable, practically all of the dispersion (up to fractions of a percent) fits into the section $m\pm 3\sigma$.
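The probabilities of hitting successive one-sigma segments laid off from the center of dispersion can be recomputed as differences of the standard normal distribution function. A short sketch, assuming `NormalDist().cdf` from the standard library in place of the printed table:

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal distribution function

# Probability of falling into the k-th segment (m + k*sigma, m + (k+1)*sigma), k = 0, 1, ...
segments = [phi(k + 1) - phi(k) for k in range(6)]

for k, p in enumerate(segments, start=1):
    print(f"segment {k}: {p:.3f}")

# The three easy-to-remember numbers, rounded to 1%.
rounded = [round(p, 2) for p in segments[:3]]
print(rounded)
```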

This makes it possible, knowing the standard deviation and the mathematical expectation of a random variable, to indicate approximately the range of its practically possible values. This method of estimating the range of possible values of a random variable is known in mathematical statistics as the three-sigma rule. The three-sigma rule also implies an approximate method for determining the standard deviation of a random variable: take the maximum practically possible deviation from the mean and divide it by three. Of course, this rough method can be recommended only if there are no other, more accurate ways to determine $\sigma$.

Example 1. A random variable $X$, distributed according to the normal law, is the error in measuring a certain distance. The measurement has a systematic error of 1.2 (m) in the direction of overestimation; the standard deviation of the measurement error is 0.8 (m). Find the probability that the deviation of the measured value from the true value does not exceed 1.6 (m) in absolute value.

Solution. The measurement error is a random variable $X$ obeying the normal law with parameters $m=1.2$ and $\sigma=0.8$. We need to find the probability that this quantity falls into the interval from $\alpha=-1.6$ to $\beta=+1.6$. By formula (6.3.7) we have:

$P(-1.6<X<1.6)=\Phi^*\left(\frac{1.6-1.2}{0.8}\right)-\Phi^*\left(\frac{-1.6-1.2}{0.8}\right)=\Phi^*(0.5)-\Phi^*(-3.5).$

Using the tables of the function $\Phi^*(x)$ (Appendix, Table 1), we find:

$\Phi^*(0.5)=0.6915$; $\Phi^*(-3.5)=0.0002$,

whence $P(-1.6<X<1.6)=0.6915-0.0002=0.6913$.

Example 2. Find the same probability as in the previous example, but under the condition that there is no systematic error.

Solution. By formula (6.3.10), setting $l=1.6$, we find:

$P(|X|<1.6)=2\Phi^*\left(\frac{1.6}{0.8}\right)-1=2\Phi^*(2)-1\approx 0.955.$
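Examples 1 and 2 can be recomputed without the printed table. The sketch below uses the standard-library `NormalDist().cdf` as the standard normal distribution function; the variable names are illustrative assumptions, while the numbers come from the two examples.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal distribution function

# Example 1: systematic error m = 1.2 m, sigma = 0.8 m, limits -1.6 .. +1.6 m.
m, sigma, limit = 1.2, 0.8, 1.6
p_with_bias = phi((limit - m) / sigma) - phi((-limit - m) / sigma)

# Example 2: no systematic error (m = 0), symmetric section, formula (6.3.10).
p_no_bias = 2 * phi(limit / sigma) - 1

print(f"with systematic error:    {p_with_bias:.4f}")
print(f"without systematic error: {p_no_bias:.4f}")
```

The comparison shows how strongly the systematic error lowers the probability of staying within the tolerance.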

Example 3. At a target that looks like a strip (freeway), the width of which is 20 m, shooting is carried out in a direction perpendicular to the freeway. Aiming is carried out along the center line of the highway. The standard deviation in the firing direction is equal to m. There is a systematic error in the firing direction: the undershoot is 3 m. Find the probability of hitting the freeway with one shot.

We can single out the most common laws of distribution of discrete random variables:

  • Binomial distribution law
  • Poisson distribution law
  • Geometric distribution law
  • Hypergeometric distribution law

For these distributions of discrete random variables, the probabilities of their values, as well as the numerical characteristics (mathematical expectation, variance, etc.), are computed by ready-made formulas. It is therefore very important to know these types of distributions and their basic properties.


1. Binomial distribution law.

A discrete random variable $X$ obeys the binomial probability distribution law if it takes the values $0,\ 1,\ 2,\ \dots ,\ n$ with probabilities $P\left(X=k\right)=C^k_n\cdot p^k\cdot {\left(1-p\right)}^{n-k}$. In effect, the random variable $X$ is the number of occurrences of the event $A$ in $n$ independent trials. The probability distribution law of the random variable $X$:

$\begin{array}{|c|c|c|c|c|}
\hline
X_i & 0 & 1 & \dots & n \\
\hline
p_i & P_n\left(0\right) & P_n\left(1\right) & \dots & P_n\left(n\right) \\
\hline
\end{array}$

For such a random variable, the expectation is $M\left(X\right)=np$ and the variance is $D\left(X\right)=np\left(1-p\right)$.

Example. There are two children in a family. Assuming that the probabilities of the birth of a boy and of a girl are both equal to $0.5$, find the distribution law of the random variable $\xi$, the number of boys in the family.

Let the random variable $\xi$ be the number of boys in the family. The values that $\xi$ can take are $0,\ 1,\ 2$. The probabilities of these values can be found by the formula $P\left(\xi =k\right)=C^k_n\cdot p^k\cdot {\left(1-p\right)}^{n-k}$, where $n=2$ is the number of independent trials and $p=0.5$ is the probability of the event occurring in each of the $n$ trials. We get:

$P\left(\xi =0\right)=C^0_2\cdot {0.5}^0\cdot {\left(1-0.5\right)}^{2-0}={0.5}^2=0.25;$

$P\left(\xi =1\right)=C^1_2\cdot 0.5\cdot {\left(1-0.5\right)}^{2-1}=2\cdot 0.5\cdot 0.5=0.5;$

$P\left(\xi =2\right)=C^2_2\cdot {0.5}^2\cdot {\left(1-0.5\right)}^{2-2}={0.5}^2=0.25.$

Then the distribution law of the random variable $\xi$ is the correspondence between the values $0,\ 1,\ 2$ and their probabilities, i.e.:

$\begin{array}{|c|c|c|c|}
\hline
\xi & 0 & 1 & 2 \\
\hline
P(\xi) & 0.25 & 0.5 & 0.25 \\
\hline
\end{array}$

The sum of the probabilities in the distribution law must be equal to $1$, i.e. $\sum_{i=1}^{n}P(\xi_i)=0.25+0.5+0.25=1$.

Expectation $M\left(\xi \right)=np=2\cdot 0.5=1$, variance $D\left(\xi \right)=np\left(1-p\right)=2\cdot 0.5\cdot 0.5=0.5$, standard deviation $\sigma \left(\xi \right)=\sqrt{D\left(\xi \right)}=\sqrt{0.5}\approx 0.707$.
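The distribution law from the two-children example can be built programmatically from the binomial formula. A minimal sketch; the helper name `binomial_pmf` is an assumption, and `math.comb` is the standard-library binomial coefficient.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) for the binomial law with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 2, 0.5  # two children, probability of a boy 0.5
law = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

# Numerical characteristics computed directly from the distribution law.
mean = sum(k * pk for k, pk in law.items())
var = sum((k - mean) ** 2 * pk for k, pk in law.items())

print(law)
print(mean, var, round(sqrt(var), 3))
```

The directly computed mean and variance agree with the general formulas $np$ and $np(1-p)$.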

2. Poisson distribution law.

If a discrete random variable $X$ can take only non-negative integer values $0,\ 1,\ 2,\ \dots$ with probabilities $P\left(X=k\right)=\frac{{\lambda }^k}{k!}\cdot e^{-\lambda }$, then it is said to obey the Poisson distribution law with parameter $\lambda$. For such a random variable, the mathematical expectation and the variance are equal to each other and to the parameter $\lambda$, that is, $M\left(X\right)=D\left(X\right)=\lambda$.

Remark. A characteristic feature of this distribution is the following: if the estimates of $M\left(X\right)$ and $D\left(X\right)$ obtained from experimental data are close to each other, then there is reason to assert that the random variable obeys the Poisson distribution law.

Example. Examples of random variables obeying the Poisson distribution law are: the number of cars that will be serviced by a gas station tomorrow; the number of defective items in a batch of manufactured products.

Example. A plant sent $500$ products to a warehouse. The probability that a product is damaged in transit is $0.002$. Find the distribution law of the random variable $X$, the number of damaged products, and find $M\left(X\right),\ D\left(X\right)$.

Let the discrete random variable $X$ be the number of damaged products. Such a random variable obeys the Poisson distribution law with parameter $\lambda =np=500\cdot 0.002=1$. The probabilities of its values are $P\left(X=k\right)=\frac{{\lambda }^k}{k!}\cdot e^{-\lambda }$. Obviously, it is impractical to list the probabilities of all values $X=0,\ 1,\ \dots ,\ 500$, so we restrict ourselves to the first few values.

$P\left(X=0\right)=\frac{1^0}{0!}\cdot e^{-1}=0.368;$

$P\left(X=1\right)=\frac{1^1}{1!}\cdot e^{-1}=0.368;$

$P\left(X=2\right)=\frac{1^2}{2!}\cdot e^{-1}=0.184;$

$P\left(X=3\right)=\frac{1^3}{3!}\cdot e^{-1}=0.061;$

$P\left(X=4\right)=\frac{1^4}{4!}\cdot e^{-1}=0.015;$

$P\left(X=5\right)=\frac{1^5}{5!}\cdot e^{-1}=0.003;$

$P\left(X=6\right)=\frac{1^6}{6!}\cdot e^{-1}=0.001;$

$P\left(X=k\right)=\frac{{\lambda }^k}{k!}\cdot e^{-\lambda }.$

The distribution law of the random variable $X$:

$\begin{array}{|c|c|c|c|c|c|c|c|c|c|}
\hline
X_i & 0 & 1 & 2 & 3 & 4 & 5 & 6 & \dots & k \\
\hline
P_i & 0.368 & 0.368 & 0.184 & 0.061 & 0.015 & 0.003 & 0.001 & \dots & \frac{{\lambda }^k}{k!}\cdot e^{-\lambda } \\
\hline
\end{array}$

For such a random variable, the mathematical expectation and variance are equal to each other and equal to the parameter $\lambda $, i.e. $M\left(X\right)=D\left(X\right)=\lambda =1$.
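The Poisson probabilities of the damaged-products example, and the quality of the Poisson approximation to the exact binomial law with $n = 500$, $p = 0.002$, can be checked with a short script. This is only a sketch; the helper names are assumptions.

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for the Poisson law with parameter lam."""
    return lam**k / factorial(k) * exp(-lam)

def binomial_pmf(k, n, p):
    """Exact binomial probability, used here only for comparison."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 500, 0.002
lam = n * p  # Poisson parameter, equal to 1 up to floating-point rounding

for k in range(7):
    print(k, f"{poisson_pmf(k, lam):.3f}", f"{binomial_pmf(k, n, p):.3f}")
```

The two columns differ by less than 0.001 for the listed values, which is why the Poisson law is a convenient approximation when $n$ is large and $p$ is small.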

3. Geometric law of distribution.

If a discrete random variable $X$ can take only natural values $1,\ 2,\ 3,\ \dots$ with probabilities $P\left(X=k\right)=p{\left(1-p\right)}^{k-1},\ k=1,\ 2,\ 3,\ \dots$, then we say that such a random variable $X$ obeys the geometric probability distribution law. In effect, the geometric distribution describes Bernoulli trials carried out until the first success.

Example . Examples of random variables that have a geometric distribution can be: the number of shots before the first hit on the target; number of tests of the device before the first failure; the number of coin tosses before the first heads up, and so on.

The mathematical expectation and the variance of a random variable obeying the geometric distribution are respectively $M\left(X\right)=1/p$ and $D\left(X\right)=\left(1-p\right)/p^2$.

Example. On the way of fish to their spawning place there are $4$ locks. The probability that a fish passes through each lock is $3/5$. Construct the distribution series of the random variable $X$, the number of locks passed by the fish before the first stop at a lock. Find $M\left(X\right),\ D\left(X\right),\ \sigma \left(X\right)$.

Let the random variable $X$ be the number of locks passed by the fish before the first stop at a lock. Such a random variable obeys the geometric probability distribution law. The values that $X$ can take are $1,\ 2,\ 3,\ 4$. The probabilities of these values are calculated by the formula $P\left(X=k\right)=pq^{k-1}$, where $p=2/5$ is the probability that the fish is stopped at a lock, $q=1-p=3/5$ is the probability that the fish passes through a lock, and $k=1,\ 2,\ 3,\ 4$. Since there are only four locks, for $k=4$ the probability $q^4$ that the fish passes all four locks is added.

$P\left(X=1\right)=\frac{2}{5}\cdot {\left(\frac{3}{5}\right)}^0=\frac{2}{5}=0.4;$

$P\left(X=2\right)=\frac{2}{5}\cdot \frac{3}{5}=\frac{6}{25}=0.24;$

$P\left(X=3\right)=\frac{2}{5}\cdot {\left(\frac{3}{5}\right)}^2=\frac{2}{5}\cdot \frac{9}{25}=\frac{18}{125}=0.144;$

$P\left(X=4\right)=\frac{2}{5}\cdot {\left(\frac{3}{5}\right)}^3+{\left(\frac{3}{5}\right)}^4=\frac{27}{125}=0.216.$

$\begin{array}{|c|c|c|c|c|}
\hline
X_i & 1 & 2 & 3 & 4 \\
\hline
P\left(X_i\right) & 0.4 & 0.24 & 0.144 & 0.216 \\
\hline
\end{array}$

Expected value:

$M\left(X\right)=\sum^n_{i=1}{x_ip_i}=1\cdot 0.4+2\cdot 0.24+3\cdot 0.144+4\cdot 0.216=2.176.$

Variance:

$D\left(X\right)=\sum^n_{i=1}{p_i{\left(x_i-M\left(X\right)\right)}^2}=0.4\cdot {\left(1-2.176\right)}^2+0.24\cdot {\left(2-2.176\right)}^2+0.144\cdot {\left(3-2.176\right)}^2+$

$+\ 0.216\cdot {\left(4-2.176\right)}^2\approx 1.377.$

Standard deviation:

$\sigma \left(X\right)=\sqrt{D\left(X\right)}=\sqrt{1.377}\approx 1.173.$
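The distribution series and the numerical characteristics of the locks example can be reproduced with exact fractions. The sketch follows the solution as written, including the extra term for a fish that passes all four locks; the variable names are assumptions.

```python
from fractions import Fraction
from math import sqrt

stop = Fraction(2, 5)  # probability that the fish is stopped at a lock
go = Fraction(3, 5)    # probability that the fish passes through a lock

# Geometric probabilities for k = 1, 2, 3; for k = 4 the probability
# of passing all four locks (go**4) is added, as in the worked solution.
law = {k: stop * go ** (k - 1) for k in (1, 2, 3)}
law[4] = stop * go**3 + go**4

mean = sum(k * pk for k, pk in law.items())
var = sum((k - mean) ** 2 * pk for k, pk in law.items())

print({k: float(pk) for k, pk in law.items()})
print(float(mean), round(float(var), 3), round(sqrt(var), 3))
```

Working in `Fraction` keeps the check free of rounding until the final printout.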

4. Hypergeometric distribution law.

Suppose there are $N$ objects, among which $m$ objects have a given property. $n$ objects are drawn at random without replacement, and among them $k$ objects turn out to have the given property. The hypergeometric distribution makes it possible to find the probability that exactly $k$ objects in the sample have the given property. Let the random variable $X$ be the number of objects in the sample that have the given property. Then the probabilities of the values of the random variable $X$ are:

$P\left(X=k\right)=\frac{C^k_mC^{n-k}_{N-m}}{C^n_N}$

Remark. The statistical function HYPGEOMDIST (ГИПЕРГЕОМЕТ in the Russian localization) of the Excel $f_x$ Function Wizard makes it possible to find the probability that a given number of trials will be successful.

$f_x\to$ statistical $\to$ HYPGEOMDIST $\to$ OK. A dialog box will appear that you need to fill in. In the field Number_of_successes_in_sample specify the value of $k$. Sample_size equals $n$. In the field Number_of_successes_in_population specify the value of $m$. Population_size equals $N$.

The mathematical expectation and the variance of a discrete random variable $X$ obeying the hypergeometric distribution law are $M\left(X\right)=nm/N$, $D\left(X\right)=\frac{nm\left(1-\frac{m}{N}\right)\left(1-\frac{n}{N}\right)}{N-1}$.

Example. The credit department of a bank employs 5 specialists with a higher financial education and 3 specialists with a higher legal education. The management of the bank decided to send 3 specialists for advanced training, selecting them at random.

a) Construct the distribution series of the number of specialists with a higher financial education who may be sent for advanced training;

b) Find the numerical characteristics of this distribution.

Let the random variable $X$ be the number of specialists with a higher financial education among the three selected. The values that $X$ can take are $0,\ 1,\ 2,\ 3$. This random variable $X$ is distributed according to the hypergeometric law with the following parameters: $N=8$ is the population size, $m=5$ is the number of successes in the population, $n=3$ is the sample size, and $k=0,\ 1,\ 2,\ 3$ is the number of successes in the sample. Then the probabilities $P\left(X=k\right)$ can be calculated by the formula $P(X=k)=\frac{C_{m}^{k}\cdot C_{N-m}^{n-k}}{C_{N}^{n}}$. We have:

$P\left(X=0\right)=\frac{C^0_5\cdot C^3_3}{C^3_8}=\frac{1}{56}\approx 0.018;$

$P\left(X=1\right)=\frac{C^1_5\cdot C^2_3}{C^3_8}=\frac{15}{56}\approx 0.268;$

$P\left(X=2\right)=\frac{C^2_5\cdot C^1_3}{C^3_8}=\frac{15}{28}\approx 0.536;$

$P\left(X=3\right)=\frac{C^3_5\cdot C^0_3}{C^3_8}=\frac{5}{28}\approx 0.179.$

Then the distribution series of the random variable $X$:

$\begin{array}{|c|c|c|c|c|}
\hline
X_i & 0 & 1 & 2 & 3 \\
\hline
p_i & 0.018 & 0.268 & 0.536 & 0.179 \\
\hline
\end{array}$

Let us calculate the numerical characteristics of the random variable $X$ by the general formulas of the hypergeometric distribution.

$M\left(X\right)=\frac{nm}{N}=\frac{3\cdot 5}{8}=\frac{15}{8}=1.875.$

$D\left(X\right)=\frac{nm\left(1-\frac{m}{N}\right)\left(1-\frac{n}{N}\right)}{N-1}=\frac{3\cdot 5\cdot \left(1-\frac{5}{8}\right)\cdot \left(1-\frac{3}{8}\right)}{8-1}=\frac{225}{448}\approx 0.502.$

$\sigma \left(X\right)=\sqrt{D\left(X\right)}=\sqrt{0.502}\approx 0.7085.$
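The hypergeometric probabilities and the numerical characteristics of the bank example can be verified with exact arithmetic. A minimal sketch; the helper name `hypergeom_pmf` is an assumption, and `math.comb` supplies the binomial coefficients.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, m, n):
    """P(X = k): k marked objects in a sample of n drawn from N objects, m of them marked."""
    return Fraction(comb(m, k) * comb(N - m, n - k), comb(N, n))

N, m, n = 8, 5, 3  # 8 specialists, 5 with a financial education, 3 selected
law = {k: hypergeom_pmf(k, N, m, n) for k in range(n + 1)}

# Numerical characteristics computed directly from the distribution series.
mean = sum(k * pk for k, pk in law.items())
var = sum((k - mean) ** 2 * pk for k, pk in law.items())

print({k: str(pk) for k, pk in law.items()})
print(mean, var, float(var))
```

The directly computed values coincide with the general formulas $nm/N$ and $\frac{nm(1-m/N)(1-n/N)}{N-1}$.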

Normal law of probability distribution

Without exaggeration, it can be called a philosophical law. Observing various objects and processes of the world around us, we often find that some things are scarce and that there is a norm:


Here is the general form of the density function of the normal probability distribution, and I welcome you to this most interesting lesson.

What examples can be given? There are countless ones. These are, for example, the height and weight of people (and not only people), their physical strength, mental abilities, etc. There is a "main mass" (by one measure or another) and there are deviations in both directions.

These are various characteristics of inanimate objects (the same dimensions and weight). This is the random duration of processes, for example, the time of a hundred-meter race or of the transformation of resin into amber. From physics, air molecules come to mind: some of them are slow, some are fast, but most move at "standard" speeds.

Next, we move away from the center by one more standard deviation and calculate the height:

We mark the points on the drawing (in green) and see that this is quite enough.

At the final stage, we carefully draw the graph, taking particular care to reflect its convexity / concavity! And you have probably realized long ago that the x-axis is a horizontal asymptote, and the graph must never "cross" it!

When the solution is prepared electronically, the graph is easy to build in Excel, and, unexpectedly for myself, I even recorded a short video on this topic. But first, let's talk about how the shape of the normal curve changes depending on the values of $a$ and $\sigma$.

When "a" increases or decreases (with "sigma" unchanged), the graph keeps its shape and moves right or left respectively. So, for example, when $a=0$ the function takes the form $f(x)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{x^2}{2\sigma^2}}$ and our graph "moves" 3 units to the left, exactly to the origin:


A normally distributed quantity with zero mathematical expectation has received a completely natural name: centered. Its density function is even, and the graph is symmetric about the y-axis.

If "sigma" changes (with "a" constant), the graph "stays in place" but changes shape. As sigma increases, it becomes lower and more stretched out, like an octopus spreading its tentacles. Conversely, as sigma decreases, the graph becomes narrower and taller, and we get a "surprised octopus". Thus, when "sigma" is halved, the previous graph becomes twice as narrow and twice as tall:

Everything is in full accordance with geometric transformations of graphs.

The normal distribution with unit "sigma" is called normalized, and if it is also centered (our case), then such a distribution is called standard. It has an even simpler density function, which we already encountered in the local Laplace theorem: $\varphi(x)=\frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}$. The standard distribution is widely used in practice, and very soon we will finally understand its purpose.

Now let's watch a movie:

Yes, quite right: the probability distribution function has somehow undeservedly remained in the shadows. Recall its definition:
$F(x)=P(X<x)$ is the probability that the random variable $X$ takes a value LESS than the variable $x$, which "runs through" all real values up to "plus" infinity.

Inside the integral, a different letter is usually used so that there are no "overlaps" in the notation, because here each value $x$ is assigned the improper integral $F(x)=\frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x}e^{-\frac{(t-a)^2}{2\sigma^2}}\,dt$, which is equal to some number from the interval $(0;\ 1)$.

Almost none of these values can be calculated exactly, but, as we have just seen, with modern computing power this is not difficult. For the distribution function of the standard distribution, the corresponding Excel function takes just one argument:

=NORMSDIST(z)

One, two, and you're done:

The drawing clearly shows that all properties of the distribution function hold; among the technical details, pay attention to the horizontal asymptotes and the inflection point.

Now let's recall one of the key tasks of the topic, namely, how to find $P(\alpha<X<\beta)$, the probability that a normal random variable $X$ takes a value from the interval $(\alpha;\ \beta)$. Geometrically, this probability is equal to the area between the normal curve and the x-axis over the corresponding section:

but grinding out an approximate value each time is unreasonable, so it is more rational to use the "easy" formula:
$P(\alpha<X<\beta)=F(\beta)-F(\alpha)$.

! Also recall that for a continuous random variable $P(\alpha\le X\le\beta)=P(\alpha\le X<\beta)=P(\alpha<X\le\beta)=P(\alpha<X<\beta)$.

Here you can use Excel again, but there are a couple of significant “buts”: firstly, it is not always at hand, and secondly, “ready-made” values, most likely, will raise questions from the teacher. Why?

I have talked about this many times before: at one time (and not so long ago) an ordinary calculator was a luxury, and the "manual" method of solving this problem survives in the educational literature to this day. Its essence is to standardize the values "alpha" and "beta", that is, to reduce the solution to the standard distribution:

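Note: the standard distribution function is easy to obtain from the general case using the linear substitution $z = \frac{t - a}{\sigma}$. Then $t = a + \sigma z$, $dt = \sigma\,dz$, and:

$F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{(t-a)^2}{2\sigma^2}}\,dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\frac{x-a}{\sigma}} e^{-\frac{z^2}{2}}\,dz$

and the substitution itself, $z = \frac{x - a}{\sigma}$, gives the formula for passing from the values of an arbitrary distribution to the corresponding values of the standard distribution.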
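The transition to the standard distribution can be checked numerically. In the sketch below the parameters a = 10 and sigma = 2, as well as the helper name standardize, are hypothetical and chosen only for illustration:

```python
from statistics import NormalDist

def standardize(x: float, a: float, sigma: float) -> float:
    """Map a value of an arbitrary normal law to the standard one."""
    return (x - a) / sigma

a, sigma = 10.0, 2.0                    # hypothetical parameters
general = NormalDist(mu=a, sigma=sigma)
standard = NormalDist()                 # a = 0, sigma = 1

for x in (7.0, 10.0, 13.5):
    z = standardize(x, a, sigma)
    print(f"x = {x}: z = {z:+.2f}, "
          f"F_general(x) = {general.cdf(x):.6f}, F_standard(z) = {standard.cdf(z):.6f}")
```

Both columns coincide, which is exactly why one table for the standard distribution is enough for every "a" and "sigma".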

Why is this needed? The point is that the values of the standard distribution function were scrupulously calculated by our ancestors and collected in a special table, which can be found in many books on probability theory. But even more common is the table of values of the Laplace function, which we have already dealt with in the integral Laplace theorem:

If we have a table of values of the Laplace function $\Phi(x)$ at our disposal, then we solve through it:

Fractional values are traditionally rounded to 4 decimal places, as is done in the standard table. And for a check there is Item 5 of the layout.

I remind you that $\Phi(-x) = -\Phi(x)$, and to avoid confusion, always keep track of WHICH function's table is in front of your eyes.

The answer must be given as a percentage, so the calculated probability has to be multiplied by 100 and the result supplied with a meaningful comment:

- approximately 15.87% of the shells will overshoot by 5 to 70 m.

Practice on your own:

Example 3

The diameter of bearings manufactured at the factory is a random variable normally distributed with an expectation of 1.5 cm and a standard deviation of 0.04 cm. Find the probability that the size of a randomly taken bearing ranges from 1.4 to 1.6 cm.

In the sample solution and below, I will use the Laplace function as the most common option. By the way, note that according to the wording, here you can include the ends of the interval in the consideration. However, this is not critical.

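Already in this example we have met a special case, when the interval is symmetric about the mathematical expectation. In such a situation it can be written in the form $(a - \delta;\ a + \delta)$ and, using the oddness of the Laplace function, the working formula simplifies:
$P(a - \delta < X < a + \delta) = \Phi\left(\frac{\delta}{\sigma}\right) - \Phi\left(-\frac{\delta}{\sigma}\right) = 2\Phi\left(\frac{\delta}{\sigma}\right)$

The parameter "delta" is called the deviation from the mathematical expectation, and the double inequality can be "packed" using the absolute value:

$P(|X - a| < \delta) = 2\Phi\left(\frac{\delta}{\sigma}\right)$ is the probability that the value of the random variable deviates from the mathematical expectation by less than $\delta$.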

Well, the solution fits in one line :)
$P(|X - 1.5| < 0.1) = 2\Phi\left(\frac{0.1}{0.04}\right) = 2\Phi(2.5) = 2 \cdot 0.4938 = 0.9876$ is the probability that the diameter of a bearing taken at random differs from 1.5 cm by no more than 0.1 cm.
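For self-checking, Example 3 can be reproduced without a table. The sketch below assumes the Laplace function is the integral of the standard density from 0 to x (the odd function used in this lesson); the names laplace and prob_interval are my own:

```python
import math

def laplace(x: float) -> float:
    """Laplace function: integral of the standard normal density from 0 to x."""
    return 0.5 * math.erf(x / math.sqrt(2.0))

def prob_interval(alpha: float, beta: float, a: float, sigma: float) -> float:
    """P(alpha < X < beta) for a normal X with mean a and standard deviation sigma."""
    return laplace((beta - a) / sigma) - laplace((alpha - a) / sigma)

# Example 3: bearing diameter, a = 1.5 cm, sigma = 0.04 cm, interval from 1.4 to 1.6 cm
p = prob_interval(1.4, 1.6, a=1.5, sigma=0.04)
print(round(p, 4))  # 0.9876
```

The general two-term formula is used here on purpose: for a symmetric interval it automatically gives the same result as the shortened formula with the doubled Laplace function.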

The result of this problem turned out to be close to one, but we would like even more reliability - namely, to find the boundaries within which the diameters of almost all bearings lie. Is there a criterion for this? There is! The question is answered by the so-called

three sigma rule

Its essence is that it is practically certain that a normally distributed random variable takes a value from the interval $(a - 3\sigma;\ a + 3\sigma)$.

Indeed, the probability that the deviation from the expectation is less than $3\sigma$ is:
$P(|X - a| < 3\sigma) = 2\Phi(3) = 2 \cdot 0.49865 = 0.9973$, or 99.73%

In terms of "bearings": out of 10,000 pieces, 9973 have a diameter from 1.38 to 1.62 cm and only 27 are "substandard".
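A short Python sketch shows where the "three" in the rule comes from by comparing deviations of one, two and three "sigma". The function name prob_within is hypothetical; the bearing parameters are taken from Example 3:

```python
import math

def prob_within(delta: float, sigma: float) -> float:
    """P(|X - a| < delta) for a normal X; equal to twice the Laplace function of delta/sigma."""
    return math.erf(delta / (sigma * math.sqrt(2.0)))

a, sigma = 1.5, 0.04  # bearings from Example 3, in cm

for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k * sigma, sigma):.4f}")

low, high = a - 3 * sigma, a + 3 * sigma
print(f"three-sigma interval: from {low:.2f} to {high:.2f} cm")
```

The probability depends only on the ratio of the deviation to "sigma", so the same three numbers are obtained for any normal distribution.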

In practical research, the "three sigma" rule is usually applied in the opposite direction: if it is found statistically that almost all values of the random variable under study fit into an interval 6 standard deviations long, then there are good reasons to believe that this variable is distributed according to the normal law. The check is carried out using the theory of statistical hypotheses.

We continue to solve the harsh Soviet tasks:

Example 4

The random value of the weighing error is distributed according to the normal law with zero mathematical expectation and a standard deviation of 3 grams. Find the probability that the next weighing will be carried out with an error not exceeding 5 grams in absolute value.

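The solution is very simple. By the condition, $a = 0$, $\sigma = 3$, and we immediately note that at the next weighing (of something or someone) we will get a result accurate to within 9 grams with almost 100% probability. But the problem asks about a narrower deviation, and by the formula $P(|X - a| < \delta) = 2\Phi\left(\frac{\delta}{\sigma}\right)$:

$P(|X| < 5) = 2\Phi\left(\frac{5}{3}\right) \approx 2\Phi(1.67) = 2 \cdot 0.4525 = 0.905$ - the probability that the next weighing will be carried out with an error not exceeding 5 grams.

Answer: $\approx 0.905$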
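A quick check of Example 4 in Python. The sketch compares the value computed without rounding to the "table-style" value, where the argument is first rounded to two decimal places as in a paper table; the function name prob_abs_deviation is my own:

```python
import math

def prob_abs_deviation(delta: float, sigma: float) -> float:
    """P(|X - a| < delta) for a normal X: twice the Laplace function of delta/sigma."""
    return math.erf(delta / (sigma * math.sqrt(2.0)))

sigma, delta = 3.0, 5.0                      # grams, from the condition of Example 4

exact = prob_abs_deviation(delta, sigma)     # argument 5/3 is not rounded
z_table = round(delta / sigma, 2)            # 1.67, as it would be looked up in a table
table_style = prob_abs_deviation(z_table, 1.0)

print(f"without rounding: {exact:.4f}")
print(f"table-style:      {table_style:.4f}")
```

The two values differ slightly in the third decimal place. This is the usual price of rounding the argument, and for educational problems it is considered acceptable.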

The solved problem is fundamentally different from the seemingly similar Example 3 of the lesson on the uniform distribution. There the error came from rounding the measurement results; here we are talking about the random error of the measurements themselves. Such errors arise from the technical characteristics of the instrument itself (the range of permissible errors is, as a rule, stated in its data sheet), and also through the fault of the experimenter - when, for example, we read the pointer of the same scales "by eye".

Among others, there are also so-called systematic measurement errors. These are non-random errors that occur because the device is set up or operated incorrectly. For example, unadjusted floor scales can consistently "add" a kilogram, and a seller can systematically short-weigh customers. Or not systematically, since one can also shortchange them. In any case, such an error is not random, and its expectation is different from zero.

…I am urgently developing a sales training course =)

Let's solve the problem on our own:

Example 5

The roller diameter is a normally distributed random variable; its standard deviation is mm. Find the length of the interval, symmetric about the mathematical expectation, into which the roller diameter will fall with probability .

Item 5* of the layout will help. Please note that the mathematical expectation is not known here, but this does not in the least interfere with solving the problem.
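Problems of this type are "inverse": the probability is given and the deviation has to be found. A possible sketch in Python is shown below. The numbers sigma = 0.4 and p = 0.95 are hypothetical and are NOT the data of Example 5; the function name symmetric_interval_length is also my own.

```python
import math
from statistics import NormalDist

def symmetric_interval_length(sigma: float, p: float) -> float:
    """Length of the interval (a - delta, a + delta) that contains X with probability p."""
    # P(|X - a| < delta) = p  is equivalent to  F_standard(delta / sigma) = (1 + p) / 2
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)
    delta = sigma * z
    return 2.0 * delta

sigma, p = 0.4, 0.95        # hypothetical values, for illustration only
length = symmetric_interval_length(sigma, p)
print(f"interval length: {length:.3f}")

# Reverse check: the found deviation must return the original probability
delta = length / 2.0
print(f"check: {math.erf(delta / (sigma * math.sqrt(2.0))):.4f}")
```

Note that the mathematical expectation never appears in the calculation: only "sigma" and the probability are needed.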

And a test problem, which I strongly recommend for consolidating the material:

Example 6

A normally distributed random variable $X$ is given by its parameters $a$ (mathematical expectation) and $\sigma$ (standard deviation). Required:

a) write down the probability density and sketch its graph;
b) find the probability that $X$ takes a value from the interval ;
c) find the probability that $X$ deviates from $a$ in absolute value by no more than ;
d) applying the "three sigma" rule, find the values of the random variable $X$.

Such problems are offered everywhere, and over my years of practice I have solved hundreds and hundreds of them. Be sure to practice drawing by hand and using paper tables ;)

Well, let me work through an example of increased difficulty:

Example 7

The probability distribution density of a random variable has the form . Find , mathematical expectation , variance , distribution function , plot density and distribution functions, find .

Solution: first of all, note that the condition says nothing about the nature of the random variable. The presence of an exponential by itself means nothing: the distribution could turn out to be, for example, exponential or an entirely arbitrary continuous distribution. Therefore, the "normality" of the distribution still needs to be justified:

Since the function $f(x)$ is defined for every real $x$ and can be reduced to the form $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-a)^2}{2\sigma^2}}$, the random variable is distributed according to the normal law.

Let us carry out the reduction. To do this, we complete the square and arrange a three-level fraction:


Be sure to perform a check by returning the exponent to its original form:

which is what we wanted to see.

Thus:
- by the rules for powers we "pinch off" the constant factor. And here we can immediately write down the obvious numerical characteristics:

Now let's find the value of the parameter. Since the factor in front of the exponential in the normal density has the form $\frac{1}{\sigma\sqrt{2\pi}}$ and , then:
, from which we express and substitute into our function:
, after which we look over the result once more and make sure that the resulting function has the form .
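The same chain of steps (complete the square, read off "a" and "sigma", then match the constant factor) can be sketched in general form. Here the density is assumed to look like gamma * exp(-(A*x^2 + B*x + C)) with A > 0. The symbols A, B, C and gamma, the function name normal_params and the sample coefficients are all hypothetical and are NOT the data of Example 7.

```python
import math

def normal_params(A: float, B: float, C: float) -> tuple[float, float, float]:
    """For f(x) = gamma * exp(-(A*x**2 + B*x + C)) return (a, sigma, gamma) of the normal law."""
    if A <= 0:
        raise ValueError("A must be positive, otherwise the function is not a density")
    # Completing the square: A*x^2 + B*x + C = A*(x + B/(2A))^2 + C - B^2/(4A)
    a = -B / (2.0 * A)
    sigma = 1.0 / math.sqrt(2.0 * A)             # because A = 1 / (2 * sigma^2)
    # The constant factor must equal 1 / (sigma * sqrt(2*pi))
    gamma = math.exp(C - B * B / (4.0 * A)) / (sigma * math.sqrt(2.0 * math.pi))
    return a, sigma, gamma

A, B, C = 2.0, -8.0, 3.0                         # hypothetical coefficients
a, sigma, gamma = normal_params(A, B, C)
print(f"a = {a}, sigma = {sigma}, gamma = {gamma:.6f}")

# Check: the original form and the standard normal form must coincide at any point
x = 1.3
original = gamma * math.exp(-(A * x * x + B * x + C))
standard_form = math.exp(-(x - a) ** 2 / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
print(f"{original:.10f} vs {standard_form:.10f}")
```

The final comparison plays the same role as the manual check above: if the two numbers differ, there is a mistake in completing the square.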

Let's plot the density:

and the plot of the distribution function :

If neither Excel nor even an ordinary calculator is at hand, the last graph is easily built by hand! At the point $x = a$ the distribution function takes the value $F(a) = 0.5$.

The distribution function of a random variable X is the function F(x) that expresses, for each x, the probability that the random variable X takes a value less than x: $F(x) = P(X < x)$.

Example 2.5. A distribution series of a random variable is given

Find its distribution function and plot it. Solution. According to the definition,

$F(x) = 0$ for $x \le x_1$, where $x_1$ is the smallest value in the series;
$F(x) = 0.4$ for $x_1 < x \le 4$;
$F(x) = 0.4 + 0.1 = 0.5$ for $4 < x \le 5$;
$F(x) = 0.5 + 0.5 = 1$ for $x > 5$.

So (see Fig. 2.1):
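The construction of such a step function is easy to sketch in code. In the example below the probabilities 0.4, 0.1, 0.5 and the values 4 and 5 come from the solution above, while the smallest value 1 is an assumption made only for illustration; the function name distribution_function is also my own.

```python
def distribution_function(values, probs):
    """Build F(x) = P(X < x) for a discrete random variable given by its distribution series."""
    pairs = sorted(zip(values, probs))

    def F(x: float) -> float:
        # Add up the probabilities of all values strictly less than x
        return sum(p for v, p in pairs if v < x)

    return F

values = [1, 4, 5]          # the smallest value 1 is an assumption, see the note above
probs = [0.4, 0.1, 0.5]
F = distribution_function(values, probs)

for x in (0, 1, 2, 4, 4.5, 5, 6):
    print(f"F({x}) = {F(x):.1f}")
```

Since the definition uses the strict inequality X < x, each jump of the graph "belongs" to the right-hand piece: at x = 4 the function still equals 0.4 and only then rises to 0.5.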


Distribution function properties:

1. The distribution function of a random variable is a non-negative function bounded between zero and one: $0 \le F(x) \le 1$.

2. The distribution function of a random variable is a non-decreasing function on the whole number line, i.e. for $x_2 > x_1$, $F(x_2) \ge F(x_1)$.

3. At minus infinity the distribution function equals zero, and at plus infinity it equals one, i.e. $F(-\infty) = 0$, $F(+\infty) = 1$.

4. The probability that a continuous random variable X falls into the interval [a, b] equals the definite integral of its probability density from a to b (see Fig. 2.2), i.e. $P(a \le X \le b) = \int_a^b \varphi(x)\,dx$.


Fig. 2.2

3. The distribution function of a continuous random variable (see Fig. 2.3) can be expressed in terms of the probability density by the formula:

$F(x) = \int_{-\infty}^{x} \varphi(t)\,dt$. (2.10)

4. The improper integral of the probability density of a continuous random variable over infinite limits equals one: $\int_{-\infty}^{+\infty} \varphi(x)\,dx = 1$.

Geometrically, properties 1 and 4 of the probability density mean that its graph, the distribution curve, lies no lower than the x-axis, and the total area of the figure bounded by the distribution curve and the x-axis equals one.

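For a continuous random variable X, the expected value M(X) and the variance D(X) are determined by the formulas:

$M(X) = \int_{-\infty}^{+\infty} x\,\varphi(x)\,dx$

(if the integral converges absolutely);

$D(X) = \int_{-\infty}^{+\infty} (x - M(X))^2 \varphi(x)\,dx$, or $D(X) = \int_{-\infty}^{+\infty} x^2 \varphi(x)\,dx - [M(X)]^2$

(if these integrals converge).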
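These integral formulas can be illustrated numerically. The sketch below uses, purely as an assumption, a density equal to 1/2 on the segment [0, 2] and zero elsewhere, together with a simple midpoint rule; the helper names integrate and density are hypothetical.

```python
def integrate(g, lo: float, hi: float, n: int = 10_000) -> float:
    """Approximate the integral of g over [lo, hi] by the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def density(x: float) -> float:
    """Assumed density: 1/2 on [0, 2], zero elsewhere."""
    return 0.5 if 0.0 <= x <= 2.0 else 0.0

total = integrate(density, 0.0, 2.0)                                  # should be 1
mean = integrate(lambda x: x * density(x), 0.0, 2.0)                  # M(X)
var_def = integrate(lambda x: (x - mean) ** 2 * density(x), 0.0, 2.0)  # D(X) by definition
var_short = integrate(lambda x: x * x * density(x), 0.0, 2.0) - mean ** 2

print(f"area = {total:.6f}, M(X) = {mean:.6f}")
print(f"D(X) by definition = {var_def:.6f}, by the second formula = {var_short:.6f}")
```

Both variance formulas give the same number, and the unit area confirms that the chosen function is indeed a probability density.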

Along with the numerical characteristics noted above, the concept of quantiles and percentage points is used to describe a random variable.

The quantile of level q (or q-quantile) is the value $x_q$ of a random variable at which its distribution function takes the value q, i.e.

$F(x_q) = P(X < x_q) = q$. (2.16)

The 100q% point is the quantile $x_{1-q}$.

Example 2.8.

Using the data of Example 2.6, find the quantile $x_{0.3}$ and the 30% point of the random variable X.

Solution. By definition (2.16), $F(x_{0.3}) = 0.3$, i.e.

$\frac{x_{0.3}}{2} = 0.3$, whence the quantile $x_{0.3} = 0.6$. The 30% point of the random variable X, or the quantile $x_{1-0.3} = x_{0.7}$, is found similarly from the equation $\frac{x_{0.7}}{2} = 0.7$, whence $x_{0.7} = 1.4$.
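When the equation F(x) = q cannot be solved by hand, the quantile can be found numerically. The sketch below uses bisection. The distribution function F(x) = x/2 on [0, 2] is inferred from the numbers in the solution above and should be treated as an assumption, and the function name quantile is my own.

```python
def F(x: float) -> float:
    """Assumed distribution function: x/2 on [0, 2], 0 to the left, 1 to the right."""
    if x <= 0.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return x / 2.0

def quantile(F, q: float, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Solve F(x) = q on [lo, hi] by bisection, assuming F is non-decreasing there."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if F(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

x_03 = quantile(F, 0.3, 0.0, 2.0)            # quantile of level 0.3
point_30 = quantile(F, 1.0 - 0.3, 0.0, 2.0)  # the 30% point is the quantile of level 0.7
print(f"x_0.3 = {x_03:.4f}, 30% point = {point_30:.4f}")
```

Bisection works here because a distribution function never decreases, so the sign of F(x) - q tells us on which side of the quantile we are.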

Among the numerical characteristics of a random variable are the initial moments $\nu_k$ and the central moments $\mu_k$ of order k, defined for discrete and continuous random variables by the formulas: