
Statistics

Mind Map:


Introduction: Information available in numerical, verbal or graphical form that helps in taking decisions or drawing conclusions is called data. When information is collected with a definite objective, the data obtained is called primary data. Information collected from a source where it had already been recorded, say from registers, is called secondary data. Data as collected from the sources is called raw data or ungrouped data. For convenience, we can rearrange the raw data as grouped data.
   There are three measures of central tendency, known as the Mean, Median and Mode, and we use one measure of variability, called the Range.
Range: For given ungrouped data, the difference between the highest score and the lowest score is called the range of the data.
Range = Highest score – Lowest score
For example, find the range of the data 32, 54, 62, 31, 18, 71, 14, 38, 16, 82.
Here the highest score is 82 and the lowest score is 14, so range = 82 – 14 = 68.
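The same calculation can be sketched in a few lines of Python. This is only an illustration using the data of the example above; the variable names are our own.

```python
# Range of ungrouped data: highest score minus lowest score.
scores = [32, 54, 62, 31, 18, 71, 14, 38, 16, 82]

data_range = max(scores) - min(scores)
print(data_range)  # 68
```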
Mean of Ungrouped Data:
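We have already learnt about the mean of ungrouped data. Let us recall it.
Find the mean of the data 3, 9, 12, 16, 27, 31, 14, 24.
Mean = Sum of the observations / Number of observations = (3 + 9 + 12 + 16 + 27 + 31 + 14 + 24) / 8 = 136 / 8 = 17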
           Now, let x1, x2, x3, . . . , xn be observations with respective frequencies f1, f2, f3, . . . , fn. This means that observation x1 occurs f1 times, x2 occurs f2 times and so on.
          The sum of the values of all the observations = f1x1 + f2x2 + . . . + fnxn, and the number of observations = f1 + f2 + . . . + fn.
          So, Mean = (f1x1 + f2x2 + . . . + fnxn) / (f1 + f2 + . . . + fn) = Σfixi / Σfi
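As a small illustration, the sketch below computes the mean of a frequency distribution in Python by dividing the sum of the products fi × xi by the sum of the frequencies. The values in `xs` and `fs` are hypothetical and are not taken from any example in this article.

```python
# Mean of a frequency distribution: sum(f_i * x_i) / sum(f_i).
# The observations and frequencies below are hypothetical illustration data.
xs = [5, 10, 15, 20]   # observations x_i
fs = [2, 3, 4, 1]      # frequencies f_i

sum_fx = sum(f * x for f, x in zip(fs, xs))  # 10 + 30 + 60 + 20 = 120
sum_f = sum(fs)                              # 10

mean = sum_fx / sum_f
print(mean)  # 12.0
```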


Ex.1: Find the mean of the following distribution


Solution: 


 

Mean of Grouped Data:
   In most real-life situations, the data is so large that, to make a meaningful study, it needs to be condensed into grouped data. In grouped data, we have class intervals. For each class interval, we require a point which serves as the representative of the whole class. It is assumed that the frequency of each class interval is centred around its mid-point. So, the mid-point of each class can be chosen to represent the observations falling in that class; it is called the class mark. Recall that we find the class mark by taking the average of the upper and lower limits of the class.
Class mark = (Upper class limit + Lower class limit) / 2
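A minimal Python sketch of this idea follows: each class is replaced by its class mark, and the mean is then computed as for a frequency distribution. The class intervals and frequencies are hypothetical illustration data, not the table of any example in this article.

```python
# Mean of grouped data: represent each class by its class mark.
# The class intervals and frequencies are hypothetical illustration data.
classes = [(10, 25), (25, 40), (40, 55), (55, 70), (70, 85), (85, 100)]
freqs = [2, 3, 7, 6, 6, 6]

# Class mark = average of the lower and upper limits of the class.
class_marks = [(lower + upper) / 2 for lower, upper in classes]

sum_fx = sum(f * x for f, x in zip(freqs, class_marks))
mean = sum_fx / sum(freqs)

print(class_marks)  # [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
print(mean)         # 62.0
```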


                    
Ex.2: Find the mean of the following data



Solution:


This new method of finding the mean is known as the Direct Method.
   Sometimes, when the numerical values of xi and fi are large, finding the product of xi and fi becomes tedious and time consuming. So, for such situations, let us think of a method of reducing these calculations.
   We can do nothing with the fi’s, but we can change each xi to a smaller number so that our calculations become easy. How do we do this? What about subtracting a fixed number from each of these xi’s? Let us try this method for the following data.
   The first step is to choose one among the xi’s as the assumed mean, and denote it by ‘A’. Also, to further reduce our calculation work, we may take ‘A’ to be that xi which lies in the centre of x1, x2, . . ., xn.
   The second step is to find the deviation of each xi from ‘A’, which we denote by di, i.e., di = xi – A.
   The third step is to find the product of each di with the corresponding fi, and take the sum of all the fidi’s. These calculations are shown in the given example.
   Let x1, x2, . . ., xn be values of a variable X with corresponding frequencies f1, f2, . . . , fn respectively. Taking deviations about an arbitrary point ‘A’, we have di = xi – A, i = 1, 2, 3, . . . , n. Then
Mean = A + (Σfidi / Σfi)


Finding Arithmetic Mean by using the above formula is known as the deviation method or Assumed Mean Method.
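The three steps can be sketched in Python as follows. The class marks, frequencies and the choice A = 47.5 are hypothetical illustration values; the direct-method mean is computed alongside only to check that both approaches agree.

```python
# Assumed mean method: mean = A + sum(f_i * d_i) / sum(f_i), with d_i = x_i - A.
# Class marks and frequencies are hypothetical illustration data.
class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
freqs = [2, 3, 7, 6, 6, 6]

A = 47.5  # step 1: assumed mean, chosen from the middle of the class marks

deviations = [x - A for x in class_marks]                # step 2: d_i = x_i - A
sum_fd = sum(f * d for f, d in zip(freqs, deviations))   # step 3: sum of f_i * d_i

mean_assumed = A + sum_fd / sum(freqs)
mean_direct = sum(f * x for f, x in zip(freqs, class_marks)) / sum(freqs)

print(deviations)    # [-30.0, -15.0, 0.0, 15.0, 30.0, 45.0]
print(sum_fd)        # 435.0
print(mean_assumed)  # 62.0
print(mean_direct)   # 62.0
```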

 

Ex.3: Find the mean of the following data


Solution: 


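Step-Deviation Method: Sometimes, during the application of the short-cut (assumed mean) method for finding the Arithmetic Mean (AM), the deviations di are divisible by a common number h (say). In such a case the arithmetic is reduced to a great extent by taking ui = (xi – A) / h, i = 1, 2, 3, . . . , n, which gives
Mean = A + h × (Σfiui / Σfi)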
Finding AM by using this formula is known as the step-deviation method.
   Let us see how to find Arithmetic Mean by using step-deviation method in the following example.
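Before the worked example, here is a hedged Python sketch of the step-deviation calculation. The class marks, frequencies, A = 47.5 and h = 15 are hypothetical illustration values (h is the common class size, so every deviation xi – A is a multiple of h).

```python
# Step-deviation method: u_i = (x_i - A) / h, mean = A + h * sum(f_i * u_i) / sum(f_i).
# Class marks and frequencies are hypothetical illustration data.
class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
freqs = [2, 3, 7, 6, 6, 6]

A = 47.5  # assumed mean
h = 15    # common factor of all deviations (here, the class size)

us = [(x - A) / h for x in class_marks]
sum_fu = sum(f * u for f, u in zip(freqs, us))

mean_step = A + h * sum_fu / sum(freqs)

print(us)         # [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
print(sum_fu)     # 29.0
print(mean_step)  # 62.0
```

The numbers multiplied by the frequencies are now small integers, which is the whole point of the method when the work is done by hand.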

 

Ex.4: Find the mean of the following data


Solution: 

Note that:
* The step-deviation method will be convenient to apply if all the di’s have a common factor.
* The mean obtained by all the three methods is the same.
* The assumed mean method and step-deviation method are just simplified forms of the direct method.
* The formula Mean = A + h × (Σfiui / Σfi) still holds if A and h are not as given above, but are any non-zero numbers such that ui = (xi – A) / h.
   Even if the class sizes are unequal and the xi’s are numerically large, we can still apply the step-deviation method by taking h to be a suitable divisor of all the di’s.

 

Ex.5: The distribution below shows the number of wickets taken by bowlers in one-day cricket matches. Find the mean number of wickets by choosing a suitable method. What does the mean signify?


Solution:


Mode of ungrouped data:
* A mode is that value among the observations which occurs most frequently.
* For example, find the mode of the data 2, 6, 4, 5, 0, 2, 1, 3, 2, 3.
* First write the data in ascending order: 0, 1, 2, 2, 2, 3, 3, 4, 5, 6. It is clear that 2 is the most frequent observation (it occurs 3 times). Therefore, the mode of the given data is 2.
* Sometimes two or more observations occur the same maximum number of times in the data; then all of those observations are modes of the data.
* For example, find the mode of the data 20, 3, 7, 13, 3, 4, 6, 7, 19, 15, 7, 18, 3.
* In ascending order the data is 3, 3, 3, 4, 6, 7, 7, 7, 13, 15, 18, 19, 20. Here 3 and 7 each occur 3 times, so the modes of the data are 3 and 7.
* Note that if the data has only one mode, it is called unimodal data; if it has two modes, it is called bimodal data; and if it has more than two modes, it is called multimodal data.
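The counting described above can be sketched in Python with `collections.Counter` from the standard library. The helper name `modes` is our own; it returns every value that attains the highest frequency, so it handles unimodal and bimodal data alike.

```python
from collections import Counter

def modes(data):
    """Return all values that occur most frequently, in ascending order."""
    counts = Counter(data)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)

print(modes([2, 6, 4, 5, 0, 2, 1, 3, 2, 3]))                   # [2]
print(modes([20, 3, 7, 13, 3, 4, 6, 7, 19, 15, 7, 18, 3]))     # [3, 7]
```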

 

Mode of Grouped data:
         In a grouped frequency distribution, it is not possible to determine the mode by looking at the frequencies. Here, we can only locate the class with the maximum frequency, called the modal class. The mode is a value inside the modal class, and is given by the formula
Mode = l + ((f1 – f0) / (2f1 – f0 – f2)) × h
Here    l = lower boundary of the modal class
            h = size of the modal class
            f1 = frequency of the modal class
            f0 = frequency of the class preceding the modal class
            f2 = frequency of the class succeeding the modal class.
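A short Python sketch of the usual grouped-mode formula, Mode = l + ((f1 – f0) / (2f1 – f0 – f2)) × h, is given here. The function name `grouped_mode` is hypothetical. The sample call uses h = 2, f1 = 8, f0 = 7 and f2 = 2 as quoted in Ex.6, and assumes l = 3 as the lower boundary of the modal class 3 - 5.

```python
def grouped_mode(l, h, f1, f0, f2):
    """Mode of grouped data from the modal class.

    l  : lower boundary of the modal class
    h  : size of the modal class
    f1 : frequency of the modal class
    f0 : frequency of the class preceding the modal class
    f2 : frequency of the class succeeding the modal class
    """
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        # Happens only when f0 == f1 == f2; the formula is then undefined.
        raise ValueError("formula undefined when 2*f1 - f0 - f2 is zero")
    return l + (f1 - f0) / denominator * h

# l = 3 is an assumption based on the modal class 3 - 5.
mode = grouped_mode(l=3, h=2, f1=8, f0=7, f2=2)
print(round(mode, 3))  # 3.286
```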

 

Ex.6: A survey conducted on 20 households in a locality by a group of students resulted in the following frequency table for the number of family members in a household.


Solution:


Here the maximum class frequency is 8, and the class corresponding to this frequency is 3 - 5. So, the modal class is 3 - 5.
          So, l = 3
 (the average of the lower limit of the modal class and the upper limit of the class preceding the modal class is the lower boundary of the modal class),
h = 2, f1 = 8, f0 = 7 and f2 = 2.
Mode = l + ((f1 – f0) / (2f1 – f0 – f2)) × h = 3 + ((8 – 7) / (2 × 8 – 7 – 2)) × 2 = 3 + 2/7 ≈ 3.286

 

Median of ungrouped data:
* Median gives the value of the middle-most observation in the data. To find the median of ungrouped data, we first arrange the observations in ascending or descending order.
* Suppose the data has ‘n’ observations.
* If ‘n’ is odd, the median is the ((n + 1)/2)th observation, and
* If ‘n’ is even, the median is the average of the (n/2)th and (n/2 + 1)th observations.
* For example, find the median of (i) 2, 5, 3, 7, 9, 11, 7 and (ii) 5, 11, 6, 14, 9, 18, 10, 3.
(i) In ascending order the data is 2, 3, 5, 7, 7, 9, 11. Here the number of observations is 7, an odd number, so the median is the ((n + 1)/2)th observation = ((7 + 1)/2)th observation = 4th observation = 7.
∴ Median of the data is 7.
(ii) In ascending order the data is 3, 5, 6, 9, 10, 11, 14, 18.
Here the number of observations is 8, an even number, so the median is the average of the (n/2)th and (n/2 + 1)th observations, i.e., the average of the (8/2)th and (8/2 + 1)th observations, i.e., the average of the 4th and 5th observations. So, Median = (9 + 10)/2 = 9.5
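The odd and even cases can be sketched in Python as below. The function name `median` is our own (Python's standard `statistics.median` does the same job); note that Python lists are indexed from 0, so the kth observation is `ordered[k - 1]`.

```python
def median(data):
    """Median of ungrouped data: middle value, or average of the two middle values."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1)/2)th observation sits at 0-based index n // 2.
        return ordered[mid]
    # n even: average of the (n/2)th and (n/2 + 1)th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([2, 5, 3, 7, 9, 11, 7]))          # 7
print(median([5, 11, 6, 14, 9, 18, 10, 3]))    # 9.5
```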


 

Ex.7: Find the median of the following data, which is about the marks, out of 50 obtained by 100 students in a test.  


Solution:
Here n = 100, which is even. The median will be the average of the (n/2)th and (n/2 + 1)th observations, i.e., the 50th and 51st observations. To find the position of these middle values, we construct cumulative frequencies. We add another column depicting this information to the frequency table above and name it the cumulative frequency column.

From the above table, we see that
the 50th observation is 28 and the 51st observation is 29.
So, Median = (28 + 29)/2 = 28.5
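Locating an observation by its position through cumulative frequencies can be sketched in Python as follows. The `marks` and `freqs` lists are an illustrative table for 100 students, chosen so that the 50th and 51st observations come out as 28 and 29; it is an assumption and not necessarily the table of this example. The helper `kth_observation` is our own name.

```python
from bisect import bisect_left
from itertools import accumulate

# Illustrative table (an assumption), sorted by marks.
marks = [20, 25, 28, 29, 33, 38, 42, 43]
freqs = [6, 20, 24, 28, 15, 4, 2, 1]

cum_freqs = list(accumulate(freqs))  # cumulative frequency column
n = cum_freqs[-1]                    # total number of observations

def kth_observation(k):
    """Return the k-th smallest observation (k counted from 1).

    It is the first value whose cumulative frequency is at least k.
    """
    return marks[bisect_left(cum_freqs, k)]

if n % 2 == 0:
    median = (kth_observation(n // 2) + kth_observation(n // 2 + 1)) / 2
else:
    median = kth_observation((n + 1) // 2)

print(cum_freqs)                                 # [6, 26, 50, 78, 93, 97, 99, 100]
print(kth_observation(50), kth_observation(51))  # 28 29
print(median)                                    # 28.5
```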


Median of Grouped data:
   In grouped data, we may not be able to find the middle observation by looking at the cumulative frequencies, as the middle observation will be some value in a class interval. It is, therefore, necessary to find the value inside a class that divides the whole distribution into two halves. But which class should this be?
   To find this class, we find the cumulative frequencies of all the classes and n/2. We now locate the class whose cumulative frequency exceeds n/2 for the first time. This is called the median class. After finding the median class, we use the following formula for calculating the median.
Median = l + ((n/2 – cf) / f) × h
Here l = lower boundary of median class
         n = number of observations
         cf = cumulative frequency of class preceding the median class
         f = frequency of median class
         h = class size (size of the median class)
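The procedure can be sketched in Python as below: find the first class whose cumulative frequency exceeds n/2, then apply Median = l + ((n/2 – cf) / f) × h. The frequency table is illustrative only (an assumption, with equal class size 10 and 53 observations) and should not be read as the table of any example in this article.

```python
from itertools import accumulate

# Illustrative table (an assumption): classes 0-10, 10-20, ..., 90-100.
lower_boundaries = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
freqs = [5, 3, 4, 3, 3, 4, 7, 9, 7, 8]
h = 10  # class size

n = sum(freqs)
cum_freqs = list(accumulate(freqs))

# Median class: first class whose cumulative frequency exceeds n/2.
idx = next(i for i, c in enumerate(cum_freqs) if c > n / 2)

l = lower_boundaries[idx]                  # lower boundary of the median class
cf = cum_freqs[idx - 1] if idx > 0 else 0  # cumulative frequency of preceding class
f = freqs[idx]                             # frequency of the median class

median = l + (n / 2 - cf) / f * h

print(n, n / 2)          # 53 26.5
print(l, cf, f)          # 60 22 7
print(round(median, 2))  # 66.43
```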
 

Ex.8: Find the median of the following data 

Solution:


Now 60-70 is the class whose cumulative frequency, 29, is greater than (and nearest to) n/2 = 26.5.
Therefore, 60-70 is the median class.

l = 60; cf = 22; f = 7; h = 10
Median = l + ((n/2 – cf) / f) × h = 60 + ((26.5 – 22) / 7) × 10 = 60 + 45/7 ≈ 66.43

 

Which measure would be best suited for a particular requirement:
   The mean is the most frequently used measure of central tendency because it takes into account all the observations, and lies between the extremes, i.e., the largest and the smallest observations of the entire data.
   In problems where individual observations are not important, especially extreme values, and we wish to find out a typical observation, the median is more appropriate.
   In situations which require establishing the most frequent value or most popular item, the mode is the best choice.

 

Graphical Representation of Cumulative Frequency Distribution:
   A graphical representation helps us in understanding given data at a glance. Let us now represent a cumulative frequency distribution graphically. To draw a cumulative frequency curve (ogive), we must ensure that the class intervals are continuous, because cumulative frequencies are linked with class boundaries, not with class limits.

 

Ex.9: The annual profits earned by 30 shops in a locality give rise to the following distribution


Solution:
   (Note that if ‘more than’ is mentioned in the class intervals, then greater than cumulative frequencies are given. Similarly, if ‘less than’ is mentioned in the class intervals, then less than cumulative frequencies are given.) First we have to calculate the boundaries and cumulative frequencies. To draw a less than cumulative frequency curve (LCFC), the upper boundaries and less than cumulative frequencies are to be calculated. To draw a greater than cumulative frequency curve (GCFC), the lower boundaries and greater than cumulative frequencies are to be calculated.
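Both kinds of cumulative frequency can be derived from the class frequencies, as in the Python sketch below. The boundaries and frequencies are illustrative assumptions for 30 shops, not a reproduction of the table of this example. Less than cumulative frequencies are paired with upper boundaries, and greater than cumulative frequencies with lower boundaries.

```python
from itertools import accumulate

# Illustrative data (an assumption): classes 5-10, 10-15, ..., 35-40.
lower = [5, 10, 15, 20, 25, 30, 35]
upper = [10, 15, 20, 25, 30, 35, 40]
freqs = [2, 12, 2, 4, 3, 4, 3]

total = sum(freqs)

# Less than type: observations below each upper boundary.
less_than_cf = list(accumulate(freqs))

# Greater than type: observations at or above each lower boundary,
# i.e. the total minus everything in the classes before this one.
greater_than_cf = [total - cf + f for cf, f in zip(less_than_cf, freqs)]

lcfc_points = list(zip(upper, less_than_cf))
gcfc_points = list(zip(lower, greater_than_cf))

print(lcfc_points)  # [(10, 2), (15, 14), (20, 16), (25, 20), (30, 23), (35, 27), (40, 30)]
print(gcfc_points)  # [(5, 30), (10, 28), (15, 16), (20, 14), (25, 10), (30, 7), (35, 3)]
```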


Taking boundaries on X - axis
Taking cumulative frequencies on Y - axis
Scale on X - axis 1 cm. = 5 units
on Y - axis 1 cm. = 5 units


Using these values, we plot the points (10, 2), (15, 14), (20, 16), (25, 20), (30, 23), (35, 27), (40, 30) on the same axes as in the last figure to get the less than ogive, as shown in the figure. The abscissa of the point of intersection of the two ogives is nearly 17.5, which is the median. This can also be verified by using the formula. Hence the median profit is Rs. 17.5 lakhs.
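The graphical reading can be checked numerically. At every boundary the less than and greater than cumulative frequencies add up to n, so the two ogives cross at height n/2; it is therefore enough to find where the less than ogive reaches n/2. The Python sketch below does this by linear interpolation between plotted points. It assumes the less than cumulative frequencies 2, 14, 16, 20, 23, 27, 30 at the upper boundaries 10 to 40, and the function name `ogive_x_at` is hypothetical.

```python
# Points of the less than ogive: (upper boundary, cumulative frequency).
# The cumulative frequencies are an assumption stated in the text above.
points = [(10, 2), (15, 14), (20, 16), (25, 20), (30, 23), (35, 27), (40, 30)]

def ogive_x_at(points, target):
    """Return the x at which the piecewise-linear ogive reaches `target`."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y1 > y0 and y0 <= target <= y1:
            # Linear interpolation inside this segment.
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target is outside the range of the ogive")

n = points[-1][1]                     # total number of observations
median = ogive_x_at(points, n / 2)

print(n / 2)   # 15.0
print(median)  # 17.5
```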

 

Dr. T.S.V.S. Suryanarayana Murthy

Posted Date : 07-11-2020
