HOME WORK WEB PAGE FOR CLASS MATH 251 & MATE 251 & MATE 345


Chapter 3 Descriptive Statistics

 

3.3 Measures of Relative Location and Detecting Outliers

           

We have described several measures of location and variability for data. The mean is the most widely used measure of location, whereas the standard deviation and variance are the most widely used measures of variability. Using only the mean and standard deviation, we can also learn about the relative location of items in a data set.

 

Z-Scores

            z_i = (x_i - \bar{x}) / s        where

   z_i = the z-score for x_i

   \bar{x} = the sample mean

   s = the sample standard deviation

The z-score is often called the standardized value. It can be interpreted as the number of standard deviations x_i is from the mean. For example, z = 1.2 would indicate that x_i is 1.2 standard deviations greater than the sample mean. Similarly, z = -0.5 would indicate that x_i is 0.5 standard deviations less than the sample mean.
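
As a rough illustration of the z-score calculation, the following Python sketch standardizes a small sample. The data values and the function name z_scores are made up for this example and do not come from the course files; the standard deviation is the sample version, with n - 1 in the denominator.

```python
from statistics import mean, stdev

def z_scores(data):
    """Return the z-score (x_i - mean) / s for each value in data."""
    x_bar = mean(data)   # sample mean
    s = stdev(data)      # sample standard deviation (divides by n - 1)
    return [(x - x_bar) / s for x in data]

# Hypothetical sample, invented for illustration only
sample = [46, 54, 42, 46, 32]
scores = z_scores(sample)

for x, z in zip(sample, scores):
    print(f"x = {x:3d}   z = {z:6.2f}")
```

For this sample the mean is 44 and the standard deviation is 8, so the value 54 has a z-score of 1.25 and the value 32 has a z-score of -1.50.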

 

Chebyshev’s Theorem

 

At least (1 - 1/z^2) of the data values must be within z standard deviations of the mean, where z is any value greater than 1.

 

            Some of the implications of the theorem, with z=2, 3, and 4 follow.

 

·         At least .75, or 75%, of the data values must be within z=2 standard deviations of the mean.

·         At least .89, or 89%, of the data values must be within z=3 standard deviations of the mean.

·         At least .94, or 94%, of the data values must be within z=4 standard deviations of the mean.
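
The figures in this list can be checked with a short Python sketch that evaluates the Chebyshev bound 1 - 1/z^2. The function name chebyshev_bound is only an illustrative choice.

```python
def chebyshev_bound(z):
    """Minimum proportion of data within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2

for z in (2, 3, 4):
    print(f"z = {z}: at least {chebyshev_bound(z):.2%} of the data")
```

The unrounded bounds are 75.00%, 88.89%, and 93.75%. The theorem also applies to non-integer values of z, for example z = 1.5.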

 

Empirical Rule

 

For data having a bell-shaped distribution:

·        Approximately 68% of the data values will be within one standard deviation of the mean.

·        Approximately 95% of the data values will be within two standard deviations of the mean.

·        Almost all of the data values will be within three standard deviations of the mean.
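
These percentages match what the normal distribution predicts. The sketch below assumes that "bell-shaped" is idealized as exactly normal, which real data only approximate, and computes the proportion within k standard deviations using the error function. The function name normal_within is an illustrative choice.

```python
from math import erf, sqrt

def normal_within(k):
    """Proportion of a normal distribution within k standard deviations of its mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_within(k):.4f}")
```

The computed proportions are about 0.6827, 0.9545, and 0.9973, which round to the 68%, 95%, and "almost all" figures of the empirical rule.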

 

Detecting Outliers

 

Standardized values (z-scores) can be used to help identify outliers. Recall that the empirical rule allows us to conclude that for data with a bell-shaped distribution, almost all the data values will be within three standard deviations of the mean. Hence, in using z-scores to identify outliers, we recommend treating any data value with a z-score less than -3 or greater than +3 as an outlier.
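
A minimal Python sketch of this outlier check follows. The data set, the function name find_outliers, and the threshold parameter are hypothetical and chosen only to illustrate the rule; they are not taken from the course data files.

```python
from statistics import mean, stdev

def find_outliers(data, threshold=3.0):
    """Return the values whose z-score is below -threshold or above +threshold."""
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    return [x for x in data if abs((x - x_bar) / s) > threshold]

# Hypothetical data, invented for illustration: 19 typical values and one extreme value
values = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11,
          12, 10, 11, 13, 12, 11, 12, 10, 13, 50]

print(find_outliers(values))
```

Here the value 50 has a z-score of about 4.2 and is the only value flagged. Note that in a sample of size n no z-score can exceed (n - 1)/sqrt(n) in absolute value, so with 10 or fewer observations this rule cannot flag anything.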

                                                                                                                                                    

Home Work

 

1.       Consider a sample with a mean of 30 and a standard deviation of 5. Use Chebyshev’s theorem to determine the proportion, or percentage, of the data within each of the following ranges.

a. 20 to 40  b. 15 to 45        c. 22 to 38        d. 18 to 42        e. 12 to 48

 

2.       Suppose the data have a bell-shaped distribution with a mean of 30 and a standard deviation of 5. Use the empirical rule to determine the proportion, or percentage, of the data within each of the following ranges.

a. 20 to 40  b. 15 to 45        c. 25 to 35

 

Application Home Work (DO IT ON EXCEL)

 

1.       Wageweb conducts surveys of salary data and presents summaries on its website. Using salary data as of January 1, 2000, Wageweb reported that salaries of benefits managers ranged from $50,935 to $79,577 (wageweb.com, April 12, 2000). Assume the file Wageweb.xls is a sample of annual salaries for 30 benefits managers (data are in thousands of dollars).

 

a.       Compute the mean and standard deviation of the sample data

b.       Use Chebyshev’s theorem to determine the percentage of benefits managers with an annual salary between $55,000 and $71,000.

c.       Develop a histogram for the sample data. Does it appear reasonable to assume that the distribution of annual salary can be approximated by a bell-shaped distribution?

d.       Assume the distribution is bell-shaped. Use the empirical rule to determine the percentage of benefits managers with an annual salary between $55,000 and $71,000. Compare your answer with the value in part b.

e.       Do the sample data contain any outliers?

 

 

2.       The file Speakers.xls is a sample of 20 speaker systems and the ratings posted on January 2, 1998.

a.       Compute the mean and the median

b.       Compute the first and third quartiles

c.       Compute the standard deviation

d.       What are the z-scores associated with the Allison One and the Omni Audio SA 12.3?

e.       Do the data contain any outliers? Explain.