HOME WORK WEB PAGE FOR CLASS MATH 251 & MATE 251 & MATE 345
Chapter 3
Descriptive Statistics
3.3 Measures of Relative Location and Detecting Outliers
We have described several measures of location and variability for data. The mean is the most
widely used measure of location, whereas the standard deviation and variance
are the most widely used measures of variability. Using only the mean and standard
deviation, we can also learn about the relative location of items in a data set.
Z-Scores
$z_i = \frac{x_i - \bar{x}}{s}$
where
$z_i$ = the z-score for $x_i$
$\bar{x}$ = the sample mean
$s$ = the sample standard deviation
The z-score is often called the standardized value. The standardized value or
z-score, $z_i$, can be interpreted as the number of standard deviations $x_i$
is from the mean $\bar{x}$. For example, $z_i = 1.2$ would indicate that $x_i$
is 1.2 standard deviations greater than the sample mean.
Similarly, $z_i = -0.5$ would indicate that $x_i$
is 0.5 standard deviations less than the sample mean.
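The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `z_scores` and the sample values are made up for the example, and the standard library's `statistics.stdev` is used because it computes the sample standard deviation (dividing by n - 1).

```python
from statistics import mean, stdev


def z_scores(data):
    """Return the z-score of each value: (x_i - sample mean) / sample std dev."""
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    return [(x - x_bar) / s for x in data]


# Hypothetical sample, chosen only for illustration (mean 44, s = 8).
sample = [46, 54, 42, 46, 32]
for x, z in zip(sample, z_scores(sample)):
    print(f"x = {x:>3}  z = {z:+.2f}")
```

For this made-up sample, the value 54 has a z-score of 1.25, so it lies 1.25 standard deviations above the mean, while 32 has a z-score of -1.5.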
Chebyshev’s Theorem
At least $(1 - 1/z^2)$ of the data values must be within z standard deviations of
the mean, where z is any value greater than 1.
Some of the implications of the theorem, with z = 2, 3, and 4, follow.
· At least .75, or 75%, of the data values must be within z = 2 standard deviations of the mean.
· At least .89, or 89%, of the data values must be within z = 3 standard deviations of the mean.
· At least .94, or 94%, of the data values must be within z = 4 standard deviations of the mean.
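The bound in Chebyshev’s theorem is simple arithmetic, so a short Python sketch may help when checking values by hand. The function names `chebyshev_bound` and `bound_for_range` are hypothetical helpers written for this illustration, and the range example uses made-up numbers. The second helper assumes the range is centered on the mean, which is the case the theorem covers directly.

```python
def chebyshev_bound(z):
    """Minimum proportion of data within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem applies only for z > 1")
    return 1 - 1 / z**2


def bound_for_range(low, high, x_bar, s):
    """Chebyshev bound for a range centered on the mean (hypothetical helper)."""
    z_low = (x_bar - low) / s
    z_high = (high - x_bar) / s
    if abs(z_low - z_high) > 1e-9:
        raise ValueError("range must be symmetric about the mean")
    return chebyshev_bound(z_high)


# Reproduce the three implications listed above.
for z in (2, 3, 4):
    print(f"z = {z}: at least {chebyshev_bound(z):.2f}")

# Made-up numbers: mean 50, standard deviation 10, range 35 to 65 (z = 1.5).
print(f"35 to 65: at least {bound_for_range(35, 65, 50, 10):.4f}")
```

The loop gives .75, .89, and .94 for z = 2, 3, and 4. To apply the theorem to a range, first convert the endpoints to a number of standard deviations from the mean, as the second helper does.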
Empirical Rule
For data having a bell-shaped distribution:
· Approximately 68% of the data values will be within one standard deviation of the mean.
· Approximately 95% of the data values will be within two standard deviations of the mean.
· Almost all of the data values will be within three standard deviations of the mean.
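The percentages in the empirical rule can be checked against the normal distribution, which is the usual model for bell-shaped data. The sketch below assumes an exactly normal distribution (real data will only approximate it) and uses `NormalDist` from Python's standard `statistics` module. The helper name `proportion_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist()


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {proportion_within(k):.4f}")
```

Under this assumption the proportions are about 0.6827, 0.9545, and 0.9973, which round to the 68%, 95%, and "almost all" stated in the rule.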
Detecting Outliers
Standardized values (z-scores) can be used to help identify outliers. Recall that the
empirical rule allows us to conclude that, for data with a bell-shaped
distribution, almost all the data values will be within three standard
deviations of the mean. Hence, in using z-scores to identify outliers, we
recommend treating any data value with a z-score less than -3 or greater than +3 as
an outlier.
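The recommendation above can be sketched as a small Python function. The name `find_outliers`, the default threshold argument, and the sample values are assumptions made for this illustration. The sample is invented and is not taken from any of the course data files.

```python
from statistics import mean, stdev


def find_outliers(data, threshold=3.0):
    """Return the values whose z-score is below -threshold or above +threshold."""
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    return [x for x in data if abs((x - x_bar) / s) > threshold]


# Hypothetical sample of 20 values with one unusually large observation.
sample = [12, 14, 15, 13, 14, 16, 15, 13, 14, 15,
          14, 13, 16, 15, 14, 13, 15, 14, 14, 40]
print(find_outliers(sample))
```

In this made-up sample only the value 40 is flagged, since its z-score is a little above 4. A flagged value should be reviewed rather than deleted automatically, because it may be either a recording error or a legitimate unusual observation.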
Home Work
1. Consider a sample with a mean of 30 and a standard deviation of 5. Use Chebyshev’s theorem to
determine the proportion, or percentage, of the data within each of the
following ranges.
a. 20 to 40
b. 15 to 45
c. 22 to 38
d. 18 to 42
e. 12 to 48
2. Consider data that have a bell-shaped distribution with a mean of 30 and a standard deviation of 5. Use the Empirical
Rule to determine the proportion, or percentage, of the data within each of the
following ranges.
a. 20 to 40
b. 15 to 45
c. 25 to 35
Application Home Work (DO IT ON EXCEL)
1. Wageweb conducts surveys of
salary data and presents summaries on its Web site. Using salary data as of
January 1, 2000, Wageweb reported that salaries of benefits managers ranged
from $50,935 to $79,577 (wageweb.com, April 12, 2000). Assume the file Wageweb.xls is a sample of annual salaries for
30 benefits managers (data are in thousands of dollars).
a. Compute the mean and standard deviation of the sample data.
b. Use Chebyshev’s theorem to determine the percentage of benefits managers with an annual salary between
$55,000 and $71,000.
c. Develop a histogram for the sample data. Does it appear reasonable to assume that the distribution of
annual salary can be approximated by a bell-shaped distribution?
d. Assume the distribution is bell-shaped. Use the Empirical Rule to determine the percentage of benefits managers with an
annual salary between $55,000 and $71,000. Compare your answer with the value
in part b.
e. Do the sample data contain any outliers?
2. The file Speakers.xls is a
sample of 20 speaker systems and the ratings posted on January 2, 1998.
a. Compute the mean and the median.
b. Compute the first and second quartiles.
c. Compute the standard deviation.
d. What are the z-scores associated with the Allison One and the Omni Audio SA 12.3?
e. Do the data contain any outliers? Explain.