Saturday, September 1, 2012

Graphs, Frequency Distributions & Histograms For Displaying Data Distributions

Distribution
The distribution of a variable describes what values the variable takes and how often it takes each of them.
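As a small illustration of this definition, the sketch below counts how often each value occurs in a sample. The blood group data and the helper name `distribution` are made up for the example; they are not from any real data set.

```python
from collections import Counter

# Hypothetical sample: blood groups of ten people (illustrative data only).
blood_groups = ["A", "B", "O", "O", "A", "AB", "O", "B", "O", "A"]

def distribution(values):
    """Map each distinct value to the number of times it occurs."""
    return dict(Counter(values))

dist = distribution(blood_groups)

# Print each value next to how often the variable takes it.
for value, count in sorted(dist.items()):
    print(value, count)
```

The resulting value-to-count mapping is exactly the "what values, and how often" information that the graphs below display.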

Bar Graph

A bar graph shows the amount of data that belongs to each category as proportionally sized rectangular bars.
Data from categorical variables can be summarized using bar graphs.

The following Bar Graph shows the Per Capita Net State Domestic Product at Current Prices (in Rupees) for 2010-11:

Data Source: Press Information Bureau, GOI, Per Capita Net State Domestic Product


Frequency Distribution

Wolfram MathWorld describes frequency distribution as "The tabulation of raw data obtained by dividing it into classes of some size and computing the number of data elements (or their fraction out of the total) falling within each pair of class boundaries."

Using tally marks is a convenient and less error-prone way to make frequency distribution tables.
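The tally approach can be sketched in code as follows. This is only a minimal illustration: the marks data, the class boundaries, and the helper names `tally_marks` and `tally_table` are assumptions made for the example. Each class is treated as [low, high), except the last one, which also includes its upper boundary.

```python
def tally_marks(count):
    """Render a count as strokes grouped in fives, e.g. 7 -> '||||| ||'."""
    fives, rest = divmod(count, 5)
    groups = ["|||||"] * fives
    if rest:
        groups.append("|" * rest)
    return " ".join(groups)

def tally_table(data, classes):
    """Count the data points falling in each class [low, high).

    The last class also includes its upper boundary.
    """
    counts = [0] * len(classes)
    for x in data:
        for i, (low, high) in enumerate(classes):
            is_last = i == len(classes) - 1
            if low <= x < high or (is_last and x == high):
                counts[i] += 1
                break
    return counts

# Hypothetical marks of twelve students (illustrative data only).
marks = [12, 25, 37, 8, 19, 33, 21, 40, 5, 28, 16, 22]
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]

counts = tally_table(marks, classes)
for (low, high), count in zip(classes, counts):
    print(f"{low:>2}-{high:<2}  {tally_marks(count):<10} {count}")
```

Grouping the strokes in fives mirrors how tallies are usually written by hand, which makes the totals easy to read off.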


Histogram
Data from quantitative variables are most commonly summarized using histograms. 

How To Draw Histograms

a. What is the number of observations (N)?
b. What is the range of the data? Range, R = H - L, where H = highest value and L = least value.
c. Use Sturges' rule (k = 1 + 3.322 log10 N) to find the number of classes, rounding k to a whole number.
d. Find the class width, cw = R/k.
e. Make a frequency distribution of the data using the classes [L, L+cw], [L+cw, L+2cw], ... (use tally marks to count frequencies).
f. Draw the histogram.
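Steps a to f can be sketched in code as below. This is a rough illustration rather than a definitive implementation: the scores are invented, the function name `histogram_table` is hypothetical, and two choices are assumptions of the sketch: k from Sturges' rule is rounded up, and a value lying on a shared boundary is counted in the upper class (the last class includes its upper boundary).

```python
import math
from bisect import bisect_right

def histogram_table(data):
    """Return (edges, counts) for classes chosen with Sturges' rule."""
    n = len(data)                              # step a: number of observations
    low, high = min(data), max(data)
    r = high - low                             # step b: range R = H - L
    k = math.ceil(1 + 3.322 * math.log10(n))   # step c: rounding up is an assumption
    cw = r / k                                 # step d: class width
    edges = [low + i * cw for i in range(k + 1)]
    edges[-1] = high                           # guard against floating-point drift

    counts = [0] * k                           # step e: frequency distribution
    for x in data:
        i = bisect_right(edges, x) - 1         # class whose lower edge is <= x
        counts[min(i, k - 1)] += 1             # the highest value goes in the last class
    return edges, counts

# Hypothetical scores of twenty students (illustrative data only).
scores = [10, 14, 18, 22, 25, 27, 31, 33, 35, 38,
          41, 44, 46, 49, 52, 55, 58, 61, 66, 70]

edges, counts = histogram_table(scores)

# Step f: a text histogram, one row of stars per class.
for lo, hi, count in zip(edges, edges[1:], counts):
    print(f"{lo:5.1f} - {hi:5.1f}  {'*' * count}")
```

For these 20 values Sturges' rule gives 5.32, which rounds up to 6 classes, and the range of 60 then gives a class width of 10.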

Notes:
1. To avoid a fractional class width, we can also use the formula
cw = (R+w)/k, where w is the least count of the data (the smallest increment in which the values are recorded). If that doesn't solve the problem, we may have to round off the class width.
2. To avoid any data point falling on a class boundary, we can define the classes as
[L-w/2, L-w/2+cw], [L-w/2+cw, L-w/2+2cw], ...
Here again, w is the least count of the data.
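The two notes can be combined in a short sketch. The data, the function name `adjusted_edges`, and the rounding up of k are assumptions made for illustration. With cw = (R+w)/k and the first class starting at L - w/2, the k classes span exactly from L - w/2 to H + w/2, since k × cw = R + w.

```python
import math

def adjusted_edges(data, w):
    """Class boundaries starting at L - w/2 with width (R + w) / k."""
    n = len(data)
    low, high = min(data), max(data)
    k = math.ceil(1 + 3.322 * math.log10(n))  # Sturges' rule, rounded up (assumption)
    cw = (high - low + w) / k                 # note 1: cw = (R + w) / k
    start = low - w / 2                       # note 2: shift boundaries by half a least count
    return [start + i * cw for i in range(k + 1)]

# Hypothetical whole-number scores, so the least count is w = 1.
scores = [12, 15, 17, 20, 23, 26, 30, 34, 38, 41]

edges = adjusted_edges(scores, w=1)

# No score can coincide with a boundary, so each belongs to exactly one class.
on_boundary = [x for x in scores if x in edges]
counts = [sum(1 for x in scores if lo <= x < hi)
          for lo, hi in zip(edges, edges[1:])]

print(edges)
print(counts)
```

Here R = 29 and k = 5, so R/k would be 5.8, while (R+w)/k gives a whole-number width of 6.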

PS: Improvements/Corrections/Comments are welcome.
