kedarraj

Statistics Class 10 Notes Mathematics


Methods of finding the mean of ungrouped frequency distribution

In an earlier class, you studied the definition of the mean of raw data and of an ungrouped frequency distribution. Recall that the mean of the n values x1, x2, …, xn of a variate (or variable), denoted by \overline X or \overline x, is defined as

\overline X =\frac{{{x_1} + {x_2} + ... + {x_n}}}{n}  =\frac{1}{n}\sum\limits_{i = 1}^n {{x_i}}                                               …..(i)

If the raw data x1, x2, …, xn have frequencies f1, f2, …, fn respectively, then the mean \overline X is defined as

\overline X =\frac{{{x_1}{f_1} + {x_2}{f_2} + ... + {x_n}{f_n}}}{N}  =\frac{1}{N}\sum\limits_{i = 1}^n {{x_i}{f_i}}                                    …..(ii)

where     N = \sum\limits_{i = 1}^n {{f_i}}                                                                                               …..(iii)

We shall now discuss different methods of finding the mean of ungrouped frequency distribution. These methods are as follows :

(i) Direct method

(ii) Short-cut method or assumed mean method

(iii) Step-deviation method

Direct Method

The formula used in this method for calculating the (arithmetic) mean is the one already given in equation (ii) above. This method can be laborious for numerical computation, particularly when xi or fi, or both, are large numbers. In such cases we can apply the short-cut or the step-deviation method.
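As a minimal sketch of the direct method, the following Python function evaluates formula (ii). The function name `direct_mean` is chosen here only for illustration; the data are those of Ex. 2 in these notes.

```python
def direct_mean(values, freqs):
    """Mean by the direct method: sum(f_i * x_i) / N, where N = sum(f_i)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total_freq = sum(freqs)  # N
    if total_freq == 0:
        raise ValueError("total frequency must be non-zero")
    weighted_sum = sum(f * x for x, f in zip(values, freqs))  # sum of f_i * x_i
    return weighted_sum / total_freq


# Data of Ex. 2 in these notes
x = [10, 30, 50, 70, 89]
f = [7, 8, 10, 15, 10]
mean_ex2 = direct_mean(x, f)
print(mean_ex2)
```

The products fi xi grow quickly when the values are large, which is the practical reason for the two alternative methods.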

Short cut method (or assumed mean method)

In this method, we subtract an arbitrary constant ‘a’, called the assumed mean from each value of xi. The reduced value is xi – a. We denote it by di and write

di = xi – a.

di is called the deviation of xi from the assumed mean a.

Hence, xi = a + di

⇒           fi xi = a fi + di fi

⇒           \sum\limits_{i = 1}^n {{f_i}} {x_i} = \sum\limits_{i = 1}^n {a{f_i}} + \sum\limits_{i = 1}^n {{d_i}{f_i}}

⇒           \sum\limits_{i = 1}^n {{f_i}} {x_i} = a\sum\limits_{i = 1}^n {{f_i}} +\sum\limits_{i = 1}^n {{d_i}{f_i}}   = aN + \sum\limits_{i = 1}^n {{d_i}{f_i}}                        [ \sum\limits_{i = 1}^n {{f_i}} = N]

⇒           \frac{1}{N}\sum\limits_{i = 1}^n {{f_i}} {x_i} = a +\frac{1}{N}\sum\limits_{i = 1}^n {{d_i}{f_i}}                                                   [Dividing both sides by N]

⇒          \overline X = a + \frac{1}{N}\sum\limits_{i = 1}^n {{d_i}{f_i}}                                                                        …..(iv)
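Formula (iv) can be checked numerically. The sketch below (the function name `assumed_mean` is an illustrative choice) uses the plant-height data of Ex. 3 and tries several assumed means, including one that is not among the observations, to show that the result does not depend on the choice of a.

```python
def assumed_mean(values, freqs, a):
    """Mean by the short-cut method: a + (1/N) * sum(f_i * d_i), d_i = x_i - a."""
    n_total = sum(freqs)  # N
    sum_fd = sum(f * (x - a) for x, f in zip(values, freqs))  # sum of f_i * d_i
    return a + sum_fd / n_total


# Data of Ex. 3 in these notes
heights = [57, 69, 73, 74, 77]
plants = [8, 18, 41, 22, 11]

for a in (69, 73, 0):
    # Every choice of a gives the same mean (up to floating-point rounding)
    print(a, round(assumed_mean(heights, plants, a), 2))
```

Choosing a near the middle of the data keeps the deviations di small, which is what makes the hand calculation easier.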

Step-deviation method

            In the short-cut method, we subtract an arbitrary number a from each observation xi so that the numbers entering the calculation become smaller and the arithmetic simpler. We can make these differences smaller still if they are all divisible by a common non-zero number, say h. We can, therefore, modify the formula of the short-cut method as follows:

Let ui = \frac{{{x_i}-a}}{h}, i = 1, 2, 3, …, n.

∴          xi = a + h ui

⇒          fi xi = a fi + h ui fi

⇒           \sum\limits_{i = 1}^n {{f_i}{x_i}} = a\sum\limits_{i = 1}^n {{f_i}} + h \sum\limits_{i = 1}^n {{u_i}{f_i}} = aN + h\sum\limits_{i = 1}^n {{u_i}{f_i}}

⇒           \frac{1}{N} \sum\limits_{i = 1}^n {{f_i}{x_i}} = a +\frac{h}{N} \sum\limits_{i = 1}^n {{u_i}{f_i}}                                    \left[ {divide\,by\,\,:\sum\limits_{i = 1}^n {{f_i}\,\, = \,\,N} } \right]

⇒          \overline X = a +\frac{h}{N} \sum {{u_i}{f_i}}                                                                          …..(v)

Here, h is any arbitrary non-zero constant and a is an assumed mean which is also arbitrary.
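A small sketch of formula (v), using the data of Ex. 5 in these notes with a = 585 and h = 20. The function name `step_deviation_mean` is hypothetical and used only for this illustration.

```python
def step_deviation_mean(values, freqs, a, h):
    """Mean by the step-deviation method: a + (h/N) * sum(f_i * u_i)."""
    if h == 0:
        raise ValueError("h must be non-zero")
    n_total = sum(freqs)  # N
    u = [(x - a) / h for x in values]  # coded values u_i = (x_i - a) / h
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))
    return a + h * sum_fu / n_total


# Data of Ex. 5 in these notes
x = [525, 545, 565, 585, 605, 625, 645]
f = [19, 6, 7, 24, 10, 14, 20]
mean_ex5 = step_deviation_mean(x, f, a=585, h=20)
print(round(mean_ex5, 1))
```

Here the coded values ui are the small integers –3, –2, …, 3, so the sum of fi ui is easy to form by hand.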

Mean of grouped data

You studied the frequency distribution of grouped data in class IX. You will now learn how to find the arithmetic mean of a grouped frequency distribution. To find the mean of grouped data, we first assume that the frequency in each class is centred at its class-mark (i.e., at the middle of the class-interval). In other words, we assume that all values falling in a class-interval are equal to the class-mark, so the individual values in a class lose their identity. We denote the class-marks by yi. If there are k class-intervals, then we get k class-marks y1, y2, …, yk with frequencies f1, f2, …, fk, which are the frequencies of the k classes. We can, therefore, treat the data y1, y2, …, yk with frequencies f1, f2, …, fk respectively as an ungrouped frequency distribution and find their mean by the three methods discussed above. If we denote the mean of the grouped data by Mg or ȳ, then the different formulae for the calculation of the mean of grouped data can be listed as follows:

Mg (or \overline y ) =\frac{1}{N} \sum\limits_{i = 1}^k {{f_i}{y_i}}                                                                        …..(i)

              (Direct formula)

and         Mg (or \overline y ) = a +\frac{1}{N}\sum\limits_{i = 1}^k {{f_i}{d_i}} ,                                                     …..(ii)

where a is the assumed mean and

di = yi – a               (i = 1, 2, …, k)                                                         …..(iii)

and         yi =\frac{1}{2}        (lower class-limit + upper class-limit)                               …..(iv)

The formula (ii) is the short-cut formula for the mean.

and         Mg (or \overline y ) = a +\frac{h}{N}\sum\limits_{i = 1}^k {{f_i}{u_i}} , where ui = \frac{{{y_i}-a}}{h},                              …..(v)

The formula (v) is the step-deviation formula for the mean.

Here,     N = \sum\limits_{i = 1}^k {{f_i}}                                                                                                …..(vi)

The formula (v) can also be written as

Mg = a + h Mu,                                                                                      …..(vii)

where Mu = \frac{1}{N}\sum\limits_{i = 1}^k {{f_i}{u_i}} is the mean of the ui’s, called the coded mean.
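The following sketch illustrates the class-mark assumption and formula (vii) on the grouped data of Ex. 6 in these notes. The variable names are illustrative, and a = 25, h = 10 are the values used in that example.

```python
# Grouped data of Ex. 6 in these notes: (lower limit, upper limit) and frequency
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [20, 30, 40, 40, 20]

# Class-mark y_i = (lower class-limit + upper class-limit) / 2
class_marks = [(lower + upper) / 2 for lower, upper in classes]
n_total = sum(freqs)  # N

# Direct formula: Mg = (1/N) * sum(f_i * y_i)
direct = sum(f * y for f, y in zip(freqs, class_marks)) / n_total

# Coded mean Mu = (1/N) * sum(f_i * u_i), then Mg = a + h * Mu
a, h = 25, 10
u = [(y - a) / h for y in class_marks]
coded_mean = sum(f * ui for f, ui in zip(freqs, u)) / n_total
via_coded = a + h * coded_mean

print(class_marks)
print(round(direct, 2), round(via_coded, 2))
```

Both routes give the same value because the coding ui = (yi – a)/h is a linear change of variable.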

ILLUSTRATIONS

Ex. 1.     If the algebraic sum of the deviations of the observations xi (i = 1, 2, 3, …, n) from 12 is – 10 and that from 3  is 62, find the value of n and the mean of the raw data x1, x2, …, xn.

Answer
Sol.   It is given that

               \sum\limits_{i = 1}^n {\left( {{x_i}-12} \right)} = – 10                        …..(i)

and         \sum\limits_{i = 1}^n {\left( {{x_i}-3} \right)} = 62                              …..(ii)

From (i), we have

               \sum\limits_{i = 1}^n {{x_i}} – 12 n = – 10      ⇒ \frac{1}{n}\sum\limits_{i = 1}^n {{x_i}} – 12 = – \frac{{10}}{n}

⇒            \overline X – 12 = –\frac{{10}}{n}                …..(iii)                     [ Mean \overline X = \frac{{\sum {{x_i}} }}{n}]

Similarly, from (ii), we have

              \sum\limits_{i = 1}^n {{x_i}} – 3 n = 62 ⇒ \frac{1}{n}\sum\limits_{i = 1}^n {{x_i}} – 3 =\frac{{62}}{n}

⇒           \overline X – 3 =\frac{{62}}{n}                           ......(iv)

Subtracting (iii) from (iv), we get

              9 =  \frac{{10}}{n}+\frac{{62}}{n}  = \frac{{72}}{n}    ⇒  9n = 72

⇒           n = \frac{{72}}{9} = 8

Putting the value of n in (iii), we get

             \overline X = 12 –\frac{{10}}{8}  = 12 –\frac{5}{4}  = 12 – 1.25 = 10.75

Hence, the required values of n and the mean are 8 and 10.75 respectively.

Ex. 2.     Find the mean of the following distribution:

x:            10           30           50           70           89

f:             7              8              10           15           10

Answer
Sol.

We prepare the following table for the computation of the mean.

[Table: xi, fi and fi xi, with totals N = 50 and ∑fi xi = 2750]

∴           Mean =\frac{{\sum {{f_i}{x_i}} }}{N} =\frac{{2750}}{{50}}  = 55

Ex. 3.     Find the mean height of plants from the following frequency distribution by both direct and short-cut methods.

Heights (in cm):         57           69           73           74           77

Number of Plants:      8              18           41           22           11

Answer
Sol.         We prepare the following table for the computation of the mean by the direct method and the short cut method.

[Table: xi, fi, fi xi, di = xi – 69 and fi di, with totals N = 100, ∑fi xi = 7166 and ∑fi di = 266]

(i) By direct method, we have

          \overline X =\frac{1}{N}\sum {{f_i}{x_i}}   = \frac{1}{{100}} × 7166 cm = 71.66 cm.

(ii) By short-cut method, we have

         \overline X = a +\frac{1}{N}\sum {{f_i}{d_i}}   = (69 +\frac{1}{{100}}  × 266) cm = (69 + 2.66) cm = 71.66 cm.

Ex. 4.     Find the value of p, if the mean of the following distribution is 20.

x:            15           17           19           20 + p           23

f:             2              3              4              5p               6

Answer
Sol.         We first prepare the following table for the calculation of mean in terms of p.

[Table: xi, fi and fi xi in terms of p, with totals N = 15 + 5p and ∑fi xi = 5p² + 100p + 295]

∴          Mean =\frac{1}{N}\sum {{f_i}{x_i}}   = \frac{{5{p^2} + 100p + 295}}{{15 + 5p}}

Hence, by the condition of the problem, we have

           \frac{{5{p^2} + 100p + 295}}{{15 + 5p}} = 20

⇒           5 p2 + 100 p + 295 = 300 + 100 p

⇒           5 p2 = 300 – 295 = 5

⇒           p2 = 1  ⇒   p = ± 1

Since one of the frequencies is 5 p and the frequency cannot be negative, hence, we reject p = – 1. Hence, the required value of p is 1.
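The answer to Ex. 4 can be checked by substitution. The sketch below uses exact fractions and a helper `mean_and_freqs`, a name invented for this check. It shows that both roots satisfy the mean condition, but only p = 1 leaves every frequency non-negative.

```python
from fractions import Fraction


def mean_and_freqs(p):
    """Return the exact mean of the Ex. 4 distribution and its frequencies."""
    xs = [15, 17, 19, 20 + p, 23]
    fs = [2, 3, 4, 5 * p, 6]
    mean = Fraction(sum(f * x for x, f in zip(xs, fs)), sum(fs))
    return mean, fs


valid = []
for p in (1, -1):  # the two roots of p^2 = 1
    mean, fs = mean_and_freqs(p)
    frequencies_ok = all(f >= 0 for f in fs)
    print(p, mean, frequencies_ok)
    if mean == 20 and frequencies_ok:
        valid.append(p)

print(valid)
```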

Ex.5.      Find the mean of the following frequency distribution by the step-deviation method:

x:      525         545         565         585         605         625         645

f:       19           6              7              24           10           14           20

Answer
Sol.         We prepare the following table for the calculation of mean by the step-deviation method.

[Table: xi, fi, ui = (xi – 585)/20 and fi ui, with totals N = 100 and ∑fi ui = 22]

Here N = 100, \sum {{f_i}{u_i}} = 22, a = 585 and h = 20.

Hence, \overline X = a + \frac{h}{N}\sum {{u_i}{f_i}}

             = 585 + \frac{{20}}{{100}} × 22 = 585 + 4.4 = 589.4

Hence, the required mean is 589.4.

Ex. 6.     The following frequency distribution table shows marks secured by 150 students in an examination:

Marks :    0 – 10         10 - 20            20 – 30         30 – 40       40 - 50

Number of students :20           30              40           40                 20

Calculate the mean marks by using all the three methods, viz., direct method, short-cut or assumed mean method and step-deviation method.

Answer
Sol.         (i) Direct method :

We construct the following table for the calculation of mean:

[Table: class-intervals, class-marks yi, fi and fi yi, with totals N = 150 and ∑fi yi = 3850]

Here,      N = 150 and  \sum\limits_{i = 1}^5 {{f_i}{y_i}} = 3850

Hence,   mean M =\frac{1}{N}\sum\limits_{i = 1}^5 {{f_i}{y_i}}   =  \frac{1}{{150}}× 3850 ≈ 25.67.

(ii)           Short-cut method :

We prepare the following table:

[Table: class-marks yi, fi, di = yi – 25 and fi di, with totals N = 150 and ∑fi di = 100]

Here, a = 25, N = 150 and \sum {{f_i}{d_i}} = 100

Hence, mean, M = a +\frac{1}{N}\sum {{f_i}{d_i}}   = 25 +\frac{{100}}{{150}} ≈ 25 + 0.67 = 25.67.

(iii)         Step-deviation method :

We prepare the following table:

[Table: class-marks yi, fi, ui = (yi – 25)/10 and fi ui, with totals N = 150 and ∑fi ui = 10]

Here, N = 150, a = 25, h = class-size = 10, \sum {{f_i}{u_i}} = 10

Hence, mean M = a + \frac{h}{N}\sum {{u_i}{f_i}} = 25 +\frac{{10}}{{150}}  × 10 ≈ 25 + 0.67 = 25.67.

Thus, by each method, the required mean is approximately 25.67.

Ex.7.      Find the mean weight of oranges from the following ‘less than’ cumulative frequency distribution:

Weights of oranges (in g):    Less than 30    Less than 40    Less than 50    Less than 60    Less than 70    Less than 80

No. of oranges:               4               8               13              16              18              20

Answer
Sol.         We first convert the above ‘less than’ cumulative frequency distribution into an ordinary frequency distribution before calculating the mean. In the absence of any other information, we can assume that the class-interval corresponding to “less than 30” is 20 – 30.

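[Table: class-intervals 20 – 30 to 70 – 80, class-marks yi, fi, ui = (yi – 55)/10 and fi ui, with totals N = 20 and ∑fi ui = – 19]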

Here,      we have

N = 20, h = 10, a = 55 and \sum {{f_i}{u_i}} = – 19

∴          Mean = a +\frac{h}{N} \sum {{f_i}{u_i}} = (55 –\frac{{10}}{{20}}  × 19) g = (55 – 9.5) g = 45.5 g

Hence, the required mean of the given frequency distribution is 45.5 g.
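The conversion step in Ex. 7 can be sketched as follows: each class frequency is the difference between consecutive ‘less than’ cumulative frequencies. The variable names are illustrative, and the class-size h = 10 is the one assumed in the solution.

```python
# 'Less than' cumulative data of Ex. 7 in these notes
upper_limits = [30, 40, 50, 60, 70, 80]
cum_freqs = [4, 8, 13, 16, 18, 20]
h = 10  # class-size, so the first class is taken as 20 - 30

# Ordinary frequency = cumulative frequency minus the previous cumulative frequency
freqs = [cum_freqs[0]] + [b - a for a, b in zip(cum_freqs, cum_freqs[1:])]

# Class-mark = upper limit minus half the class-size
class_marks = [upper - h / 2 for upper in upper_limits]

n_total = sum(freqs)
mean_weight = sum(f * y for f, y in zip(freqs, class_marks)) / n_total

print(freqs)
print(class_marks)
print(mean_weight)
```

The direct formula is used here for brevity; it agrees with the step-deviation result obtained in the solution.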

Ex. 8.     Find the mean marks of the following ‘more than’ cumulative frequency distribution :

Marks:              More than 0    More than 10    More than 20    More than 40    More than 50    More than 60    More than 70    More than 80    More than 90

No. of students:    400            387             365             329             253             156             71              32              6

Answer
Sol.        

We first convert the above ‘more than’ cumulative frequency distribution into an ordinary frequency distribution before calculating the mean. In the absence of any other information, we can assume that the class-interval corresponding to “more than 90” is 90 – 100.

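[Table: class-intervals, class-marks yi, fi, ui = (yi – 45)/10 and fi ui, with totals N = 400 and ∑fi ui = 17]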

Here,    N = 400, h = 10, a = 45 and \sum {{f_i}{u_i}} = 17

∴ Mean, M = a +\frac{h}{N}\sum {{f_i}{u_i}}  

                 = 45 +  \frac{{10}}{{400}}× 17 = 45 + 0.425 = 45.425

Hence, the required mean = 45.425.

Median

            If the data are arranged in ascending or descending order of magnitude, then the value of observation which lies in the middle is called the median of the data. Thus, a median divides the data into two equal parts. It is the value of the variable such that the number of observations above it is equal to the number of observations below it.

            The Median of ungrouped data

             Let x1, x2, x3, …, xn be n values of a variable X. To find the median, we use the following steps :

            Step I :  Arrange the observations x1, x2, x3, …, xn in ascending or descending order of magnitude.

            Step II :  Determine the total number of observations, say, n.

            Step  III :  If n is odd, then median is the value of  {\left[ {\frac{{n + 1}}{2}} \right]^{th}}observation.

            If n  is even, then median is the mean of the values of {\left[ {\frac{n}{2}} \right]^{th}} and {\left[ {\frac{n}{2} + 1} \right]^{th}} observations.
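Steps I to III can be sketched in Python as below. The function name `median_ungrouped` and the two small data sets are made up for illustration and do not come from these notes.

```python
def median_ungrouped(data):
    """Median of raw data: middle value (n odd) or mean of the two middle values (n even)."""
    ordered = sorted(data)  # Step I: arrange in ascending order
    n = len(ordered)        # Step II: number of observations
    if n == 0:
        raise ValueError("data must not be empty")
    mid = n // 2
    if n % 2 == 1:
        # ((n + 1) / 2)-th observation, counting from 1, is index n // 2 counting from 0
        return ordered[mid]
    # mean of the (n / 2)-th and (n / 2 + 1)-th observations
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_ungrouped([7, 3, 9, 1, 5]))  # hypothetical data, n odd
print(median_ungrouped([4, 1, 3, 2]))     # hypothetical data, n even
```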

            The Median of a discrete frequency distribution

            In case of a discrete frequency distribution, we calculate the median by using the following steps.

            Step I :  Find the cumulative frequencies (c.f.).

            Step II : Find N/2, where N = \sum\limits_{i = 1}^n {{f_i}} .

            Step III : See the cumulative frequency (c.f.) just greater than N/2 and determine the corresponding value of the variable. This value is the required median.
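A minimal sketch of these three steps, applied for illustration to the discrete distribution of Ex. 5 in these notes (the notes themselves only compute its mean). The function name `median_discrete` is hypothetical.

```python
from itertools import accumulate


def median_discrete(values, freqs):
    """Value whose cumulative frequency is the first one greater than N/2."""
    pairs = sorted(zip(values, freqs))              # make sure values are in ascending order
    cum = list(accumulate(f for _, f in pairs))     # Step I: cumulative frequencies
    half = cum[-1] / 2                              # Step II: N / 2
    for (x, _), c in zip(pairs, cum):               # Step III: first c.f. just greater than N/2
        if c > half:
            return x
    raise ValueError("total frequency must be positive")


x = [525, 545, 565, 585, 605, 625, 645]
f = [19, 6, 7, 24, 10, 14, 20]
print(median_discrete(x, f))
```

The cumulative frequencies here are 19, 25, 32, 56, 66, 80, 100 and N/2 = 50, so the first cumulative frequency exceeding 50 is 56.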

            Median of grouped data in the form of classes

            For a grouped data in the form of classes, we calculate the median by using the following steps :

            Step I :  Prepare cumulative frequency table for the given data and obtain N = ∑fi.

            Step II :  Find N/2.

            Step III :  Find the cumulative frequency just greater than N/2 and determine the corresponding class. This class is known as the median class.

            Step IV :  Use the following formula :

                            Median = l + \left[ {\frac{{{\rm{N}}/2 - {\rm{Cf}}}}{f}} \right] \times h

            Where,  l = Lower limit of the median class;          f  = Frequency of the median class

                        h = Width (size) of the median class        Cf = Cumulative frequency of the class preceding the median class

                        N = ∑fi
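The formula in Step IV can be sketched as follows. The function name `grouped_median` is illustrative, and the data are the marks distribution of Ex. 6, used here only as sample input since the notes do not compute its median.

```python
def grouped_median(classes, freqs):
    """Median = l + ((N/2 - Cf) / f) * h for the median class."""
    half = sum(freqs) / 2       # N / 2
    cum_before = 0              # Cf: cumulative frequency of the preceding classes
    for (lower, upper), f in zip(classes, freqs):
        if cum_before + f > half:  # first class whose c.f. exceeds N/2 is the median class
            width = upper - lower  # h
            return lower + (half - cum_before) / f * width
        cum_before += f
    raise ValueError("total frequency must be positive")


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [20, 30, 40, 40, 20]
median_marks = grouped_median(classes, freqs)
print(median_marks)
```

With N/2 = 75 the median class is 20 – 30, so l = 20, Cf = 50, f = 40 and h = 10.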

            Merits and Demerits of median

            The following are some merits and demerits of median :

            Merits

            (i)     It is easy to compute and understand.

            (ii)    It is well defined, as an ideal average should be.

            (iii)   It can also be computed in case of frequency distribution with open ended classes.

            (iv)   It is not affected by extreme values.

            (v)    It can be determined graphically.

            (vi)   It is proper average for qualitative data where items are not measured but are scored.

            Demerits

            (i)     For computing the median, the data need to be arranged in ascending or descending order.

            (ii)    It is not based on all the observations of the data.

            (iii)   It cannot be given further algebraic treatment.

            (iv)   It is affected by fluctuations of sampling.

            (v)    It is not accurate when the data is not large.

            (vi)   In some cases median is determined approximately as the mid-point of two observations whereas for mean this does not happen.

Mode

            Mode is defined as the most frequently occurring observation. For example, in the set of data

                  2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 7, 8, 8

            5 occurs four times and is, therefore, the mode.

            A data set need not have only one mode. A data set having only one mode is called unimodal, one having two modes is called bimodal, and one having more than two modes is called multimodal.

            The Mode of ungrouped data

            In order to compute the mode of ungrouped data, we first prepare a frequency distribution table. From the frequency distribution table, the observation having maximum frequency is called the mode.
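As a sketch, the frequency table can be built with Python's `collections.Counter`. The function name `modes` is illustrative; it returns every value that attains the maximum frequency, so a bimodal or multimodal data set gives more than one value. The second data set is made up.

```python
from collections import Counter


def modes(data):
    """Return all observations that occur with the maximum frequency."""
    counts = Counter(data)  # frequency distribution table: value -> frequency
    if not counts:
        return []
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)


# Data set quoted in the definition of mode in these notes
data = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 7, 8, 8]
print(modes(data))

# Hypothetical bimodal data
print(modes([1, 1, 2, 2, 3]))
```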

            The Mode of a discrete frequency distribution

            In a frequency distribution, the mode is easily recognised as the value with the highest frequency.

            The mode of grouped data in the form of classes

            In case of a grouped frequency distribution with equal class intervals, the class having maximum frequency is called modal class. After determining the modal class, we calculate the mode by using the following formula.

                                                     Mode = l + \frac{{f - {f_1}}}{{2f - {f_1} - {f_2}}} \times h

            where,                                       l = lower limit of the modal class

                                                            f = frequency of the modal class

                                                            h = width of the modal class

                                                           f1 = frequency of the class preceding the modal class

                                                           f2 = frequency of the class following the modal class
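The formula can be sketched as below. The function name `grouped_mode` and the data are hypothetical. Two choices are assumptions made for this sketch rather than rules stated in these notes: a missing neighbouring class is given frequency 0, and when several classes share the maximum frequency the first one is used.

```python
def grouped_mode(classes, freqs):
    """Mode = l + (f - f1) / (2f - f1 - f2) * h for the modal class."""
    # Index of the modal class (first class with the maximum frequency: an assumption)
    i = max(range(len(freqs)), key=freqs.__getitem__)
    lower, upper = classes[i]
    f = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0               # preceding class (0 if none: an assumption)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # following class (0 if none: an assumption)
    denominator = 2 * f - f1 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2f - f1 - f2 = 0")
    return lower + (f - f1) / denominator * (upper - lower)


# Hypothetical grouped data with equal class intervals
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 8, 12, 7, 3]
print(round(grouped_mode(classes, freqs), 2))
```

Here the modal class is 20 – 30 with f = 12, f1 = 8 and f2 = 7, so the mode is 20 + (4/9) × 10.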

            Merits of Mode

  1. It can be easily understood and is easy to calculate.
  2. It is not affected by extreme values.
  3. It can be found by inspection in some cases.
  4. It can be determined in distributions with open classes.
  5. It can be represented graphically.

            Demerits of mode

  1. It is ill-defined. It is not always possible to find a clearly defined mode.
  2. It is not based upon all the observation.
  3. It is affected to a greater extent by fluctuations of sampling.

Cumulative Frequency Curve

            In the previous class, we learnt about the graphical representation of frequency distributions using bar graphs, histograms and frequency polygons. In this section, we will represent a cumulative frequency distribution graphically.

            Cumulative frequency curve (ogive)

            If we plot points taking the upper limits of the class intervals as x-coordinates and the cumulative frequencies as y-coordinates, and then join these plotted points by a free-hand smooth curve, the curve so obtained is called a cumulative frequency curve or an ogive of the data.

            There are two methods of constructing a cumulative frequency curve or ogive.

            (i)   Less than method.

            (ii)   More than method.

            Less than Method

            To construct an ogive by less than method, we follow the procedure given below :

            Step I :  Construct the cumulative frequency table by adding class frequencies.

            Step II : Mark upper class limits along the x-axis on a suitable scale.

            Step III :  Mark cumulative frequencies along the y-axis on a suitable scale.

            Step IV :  Plot the points (xi, fi ), where xi is the upper limit of a class and fi is the corresponding cumulative frequency.

            Step V : Join the points obtained in step IV by a free-hand smooth curve to get the ogive. To get the cumulative frequency polygon, join the same points by line segments.

            More than method

            To construct a cumulative frequency polygon and an ogive by more than method, we follow the procedure given below:

            Step I :  Construct the cumulative frequency table by successively subtracting the class frequencies from the total frequency, so that each entry counts the observations at or above the lower limit of a class.

            Step II : Mark lower class limits along the x-axis on a suitable scale.

            Step III :  Mark cumulative frequencies along the y-axis on a suitable scale.

            Step IV : Plot the points (xi, fi ), where xi is the lower limit of a class and fi is the corresponding cumulative frequency.

            Step V :  Join the points obtained in step IV by a free-hand smooth curve to get the ogive. To get the cumulative frequency polygon, join the same points by line segments.
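The points to be plotted in Step IV of each method can be generated as in the sketch below. The function names are illustrative, the marks distribution of Ex. 6 is used as sample input, and the plotting itself is left out.

```python
def less_than_points(classes, freqs):
    """(upper limit, cumulative frequency) pairs for the 'less than' ogive."""
    points, running = [], 0
    for (_, upper), f in zip(classes, freqs):
        running += f  # add class frequencies one by one
        points.append((upper, running))
    return points


def more_than_points(classes, freqs):
    """(lower limit, cumulative frequency) pairs for the 'more than' ogive."""
    points, remaining = [], sum(freqs)
    for (lower, _), f in zip(classes, freqs):
        points.append((lower, remaining))
        remaining -= f  # subtract class frequencies from the total one by one
    return points


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [20, 30, 40, 40, 20]
print(less_than_points(classes, freqs))
print(more_than_points(classes, freqs))
```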

            Note

            When nothing is mentioned, then we generally construct the ‘less than type ogive’.

            Determining the median from the ogive

            Ogive can be used to find the median of a frequency distribution.

            To find median follow the procedure given below :

            Step I :  Draw any one of the two types of ogives on the graph paper.

            Step II :  Compute N/2 (N = ∑fi) and mark the corresponding point on the y-axis.

            Step III :  Draw a line parallel to x-axis, from the point marked in step II, meeting the ogive at point A(say).

            Step IV :  Draw perpendicular AB from A on x-axis. The x-coordinate of point B gives the median.

            When both types of ogives have to be drawn on the graph paper, we find the median by the procedure given below.

            Step I :  Draw less than type and more than type ogives on the graph paper.

            Step II :  Mark the point of intersection of the two curves drawn in step I. Let this point be A.

            Step III :  Draw perpendicular AB from A on the x-axis.

            The x-coordinate of point A gives the median.
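The intersection procedure can be imitated numerically. The sketch below assumes the points are joined by straight line segments (the cumulative frequency polygons) rather than by a free-hand curve, so it only approximates what would be read from a hand-drawn ogive. The function name `ogive_intersection` is hypothetical, and the data are those of Ex. 6.

```python
def ogive_intersection(boundaries, freqs):
    """x-coordinate where the 'less than' and 'more than' polygons cross.

    boundaries: class boundaries in ascending order (one more than the number of classes).
    """
    n_total = sum(freqs)
    if n_total <= 0:
        raise ValueError("total frequency must be positive")
    less = [0]
    for f in freqs:
        less.append(less[-1] + f)        # 'less than' c.f. at each boundary
    more = [n_total - c for c in less]   # 'more than' c.f. at each boundary
    diff = [l - m for l, m in zip(less, more)]
    for i in range(len(diff) - 1):
        d0, d1 = diff[i], diff[i + 1]
        if d0 < 0 <= d1:                 # the two polygons cross in this class
            t = -d0 / (d1 - d0)          # linear interpolation within the class
            return boundaries[i] + t * (boundaries[i + 1] - boundaries[i])
    raise ValueError("no crossing found")


boundaries = [0, 10, 20, 30, 40, 50]
freqs = [20, 30, 40, 40, 20]
print(ogive_intersection(boundaries, freqs))
```

At the crossing point both cumulative frequencies equal N/2, which is why the x-coordinate there is the median; under the straight-line assumption it coincides with the value given by the grouped-data median formula.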

 
