Solutions to Chapter 4 Practice Problems

1.

a. Without using SAS, provide a one-sided permutation p-value.

The mean in group 1 is 2.25. The mean in group 2 is approximately 112 (1567/14 = 111.93). The difference between the means is therefore approximately 112 - 2.25 = 109.75.

We need to find the total number of ways of dividing the 18 observed numbers into a group of 4 and a group of 14 such that the mean of the group of 14 exceeds the mean of the group of 4 by at least the observed difference (about 109.75). An equivalent (and much simpler) plan is to count the ways of picking, out of the 18 observed numbers, 4 numbers whose sum is 1 + 1 + 2 + 5 = 9 or less. The two are equivalent because the total of all 18 numbers is fixed: the smaller the sum of the 4 chosen numbers, the smaller their mean and the larger the mean of the remaining 14, so the difference in means is at least as large as the observed one exactly when the sum is 9 or less.

Below are the different choices for the four numbers, along with the number of ways each choice could be picked from the 18 observed numbers. For example, there are 4 x 3 = 12 ways to get 1,1,1,2 because

i) there are four ways to choose three 1s from the four 1s in the eighteen observed data values, and

ii) for each of the choices in (i), there are three ways to choose one 2 from the three 2s in the eighteen observed data values.

Four Numbers     Number of Ways
1,1,1,1          (4 nCr 4) = 1
1,1,1,2          (4 nCr 3)(3 nCr 1) = 12
1,1,1,3          (4 nCr 3)(1 nCr 1) = 4
1,1,1,4          (4 nCr 3)(1 nCr 1) = 4
1,1,1,5          (4 nCr 3)(1 nCr 1) = 4
1,1,2,2          (4 nCr 2)(3 nCr 2) = 18
1,1,2,3          (4 nCr 2)(3 nCr 1)(1 nCr 1) = 18
1,1,2,4          (4 nCr 2)(3 nCr 1)(1 nCr 1) = 18
1,1,2,5          (4 nCr 2)(3 nCr 1)(1 nCr 1) = 18
1,1,3,4          (4 nCr 2)(1 nCr 1)(1 nCr 1) = 6
1,2,2,2          (4 nCr 1)(3 nCr 3) = 4
1,2,2,3          (4 nCr 1)(3 nCr 2)(1 nCr 1) = 12
1,2,2,4          (4 nCr 1)(3 nCr 2)(1 nCr 1) = 12
2,2,2,3          (3 nCr 3)(1 nCr 1) = 1

Total Number of Ways = 132

The total number of ways to choose 4 data values from the 18 observed values is (n1 + n2) nCr n1 = 18 nCr 4 = 3060.

Thus our one-sided p-value is 132/3060 = 0.0431.
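The count above can be double-checked by brute force. The following is a minimal sketch, not part of the original solution: it assumes the 18 values listed in part (b) and that group 1 is 1, 1, 2, 5, and it enumerates every way of choosing 4 of the 18 positions. The variable names are ours. It counts the extreme splits two ways (by the sum criterion and by the exact difference in means) to confirm that the two criteria agree.

```python
from fractions import Fraction
from itertools import combinations

# The 18 observed values (as listed in part b); group 1 is assumed to be 1, 1, 2, 5.
values = [1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 7, 15, 32, 41, 77, 107, 299, 976]
group1 = [1, 1, 2, 5]
n1 = len(group1)
n2 = len(values) - n1
total = sum(values)


def mean_difference(sum1):
    """Exact (mean of the other n2 values) minus (mean of the n1 chosen values)."""
    return Fraction(total - sum1, n2) - Fraction(sum1, n1)


observed = mean_difference(sum(group1))

n_splits = 0   # all ways of choosing 4 of the 18 positions
n_by_sum = 0   # splits whose chosen sum is 9 or less
n_by_diff = 0  # splits whose difference in means is at least the observed one
for idx in combinations(range(len(values)), n1):
    s = sum(values[i] for i in idx)
    n_splits += 1
    n_by_sum += s <= sum(group1)
    n_by_diff += mean_difference(s) >= observed

p_value = n_by_sum / n_splits
print(n_by_sum, n_by_diff, n_splits, round(p_value, 4))
```

Choosing positions rather than values is what makes tied values count separately, which matches the nCr products in the table.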

b. Provide an approximate one-sided p-value for the rank sum test by computing a Z-statistic and comparing its value to the standard normal distribution.

First, we list all 18 values and rank them. (Group 1 consists of the values 1, 1, 2, and 5.)

Value   1    1    1    1    2    2    2    3    4    5    7    15   32   41   77   107  299  976
Rank    2.5  2.5  2.5  2.5  6    6    6    8    9    10   11   12   13   14   15   16   17   18

Note that for the ties we use average ranks:

For the set of 1’s : (1 + 2 + 3 + 4) / 4 = 2.5

For the set of 2’s : (5 + 6 + 7) / 3 = 6

T = 2.5 + 2.5 + 6 + 10 = 21

Mean(T) = n1(n1 + n2 + 1) / 2 = 4(19)/2 = 38

S_R = 5.300 is the sample standard deviation of all 18 ranks: 2.5, 2.5, 2.5, 2.5, 6, 6, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18.

SD(T) = S_R * sqrt[n1*n2/(n1 + n2)] = 5.300 * sqrt[4*14/(4 + 14)] = 9.348

Z = (21 – 38) /  9.348 = -1.819

Our one-sided p-value is between 0.025 and 0.05.  (If you use the standard normal table, you get 0.0344.)
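For readers who want to reproduce the arithmetic, here is a small sketch using only the Python standard library. It is our illustration rather than part of the original solution; the helper `average_ranks` is a hypothetical name, and group 1 is assumed to be the values 1, 1, 2, 5.

```python
from math import sqrt
from statistics import NormalDist, stdev

values = [1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 7, 15, 32, 41, 77, 107, 299, 976]
group1 = [1, 1, 2, 5]  # assumed membership of group 1


def average_ranks(data):
    """Map each distinct value to the average of the ranks it occupies."""
    positions = {}
    for rank, v in enumerate(sorted(data), start=1):
        positions.setdefault(v, []).append(rank)
    return {v: sum(r) / len(r) for v, r in positions.items()}


rank_of = average_ranks(values)
ranks = [rank_of[v] for v in values]
n1 = len(group1)
n2 = len(values) - n1

T = sum(rank_of[v] for v in group1)       # rank sum for group 1
mean_T = n1 * (n1 + n2 + 1) / 2           # Mean(T)
s_r = stdev(ranks)                        # sample SD of all 18 ranks
sd_T = s_r * sqrt(n1 * n2 / (n1 + n2))    # SD(T)
z = (T - mean_T) / sd_T
p_one_sided = NormalDist().cdf(z)         # lower tail, since Z is negative

print(T, mean_T, round(s_r, 3), round(sd_T, 3), round(z, 3), round(p_one_sided, 4))
```

The exact normal tail area differs slightly from the table value because the table lookup rounds Z to two decimals.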

2.

a.

Leaf   Prep. 1   Prep. 2   Diff (Prep. 1 - Prep. 2)   Rank of |Diff|
1      2         12        -10                        5
2      0         1         -1                         1
3      7         3         4                          2
4      9         2         7                          3
5      13        5         8                          4
6      15        4         11                         6
7      14        2         12                         7
8      22        7         15                         8
9      20        4         16                         9
10     27        10        17                         10
11     32        14        18                         11
12     21        2         19                         12
13     44        20        24                         13
14     47        20        27                         14
15     51        20        31                         15
16     47        12        35                         16
17     67        22        45                         17
18     72        24        48                         18
19     69        15        54                         19
20     71        16        55                         20

S = 2+3+4+6+7+8+9+10+11+12+13+14+15+16+17+18+19+20 = 204
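The table and the value of S can be rebuilt from the raw counts with a few lines of Python. This is only a sketch for checking the hand computation; the list names `prep1` and `prep2` are ours, and the simple ranking below relies on there being no tied or zero differences in these data.

```python
# Counts for the 20 leaves, copied from the table above.
prep1 = [2, 0, 7, 9, 13, 15, 14, 22, 20, 27, 32, 21, 44, 47, 51, 47, 67, 72, 69, 71]
prep2 = [12, 1, 3, 2, 5, 4, 2, 7, 4, 10, 14, 2, 20, 20, 20, 12, 22, 24, 15, 16]

diffs = [a - b for a, b in zip(prep1, prep2)]
abs_diffs = [abs(d) for d in diffs]

# Plain ranking is valid only without ties or zeros, so check that first.
assert len(set(abs_diffs)) == len(abs_diffs) and 0 not in abs_diffs

# Rank the absolute differences from smallest (rank 1) to largest (rank 20).
order = sorted(range(len(diffs)), key=lambda i: abs_diffs[i])
ranks = [0] * len(diffs)
for rank, i in enumerate(order, start=1):
    ranks[i] = rank

# S is the sum of the ranks attached to positive differences.
S = sum(r for d, r in zip(diffs, ranks) if d > 0)
print(S)
```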

b.

Mean(S) = n(n+1)/4 = 20(21)/4 = 105

SD(S) = sqrt[n(n + 1)(2n + 1)/24] = sqrt[20(21)(41)/24] = 26.786

Note: this formula works only because there are no ties. In general, use SD(S) = sqrt[(sum of the squares of all the ranks)/4].

Z = (204 – 105) / 26.786 = 3.696
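As a quick check, the sketch below computes SD(S) both from the no-ties formula and from the sum of the squared ranks, and then forms Z. It assumes n = 20 and S = 204 from part (a); the variable names are ours.

```python
from math import sqrt

n = 20    # number of leaves
S = 204   # signed-rank statistic from part (a)

mean_S = n * (n + 1) / 4

# No-ties shortcut.
sd_closed = sqrt(n * (n + 1) * (2 * n + 1) / 24)

# General form: square root of (sum of squared ranks) / 4.
# With no ties the ranks are simply 1, 2, ..., n.
sd_general = sqrt(sum(r * r for r in range(1, n + 1)) / 4)

z = (S - mean_S) / sd_closed
print(mean_S, round(sd_closed, 3), round(sd_general, 3), round(z, 3))
```

The two standard deviations agree here because the sum of the squares of 1 through n equals n(n + 1)(2n + 1)/6; with tied ranks only the second form would remain valid.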

c.

3.696 is off the chart, so we can safely say that the two-sided p-value is less than or equal to 0.001.
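If a normal table is not at hand, the tail area can be computed directly. This short sketch (our addition, using `statistics.NormalDist` from the Python standard library) takes Z = 3.696 from part (b) as given.

```python
from statistics import NormalDist

z = 3.696  # Z-statistic from part (b)

# Two-sided p-value: twice the upper-tail area beyond |z|.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
print(p_two_sided)
```

The result is well below 0.001, consistent with the statement above.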

d.

The sum of the ranks associated with negative differences is 1 + 5 = 6.

We want to count the ways of assigning signs to the observed ranks, including the observed assignment itself, so that the sum of the negative ranks is less than or equal to 6. (Because the ranks sum to 210, this is equivalent to finding all the ways of assigning + and - signs to the ranks that give S >= 204, the value observed in the original data.)

RANKS
         11111111112
12345678901234567890    sum of negative ranks
++++++++++++++++++++            0
-+++++++++++++++++++            1
+-++++++++++++++++++            2
++-+++++++++++++++++            3
+++-++++++++++++++++            4
++++-+++++++++++++++            5
+++++-++++++++++++++            6
--++++++++++++++++++            3
-+-+++++++++++++++++            4
-++-++++++++++++++++            5
-+++-+++++++++++++++            6
+--+++++++++++++++++            5
+-+-++++++++++++++++            6
---+++++++++++++++++            6

No other configurations give 6 or less.

Number of sign assignments with a negative rank sum of 6 or less = 14

Total number of possible sign assignments = 2^20 = 1,048,576

One-sided p-value = 14/2^20 = 0.00001335

Two-sided p-value = 2(0.00001335) = 0.0000267
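The count of 14 can be confirmed without listing the configurations by hand. The sketch below is our illustration, not the method used above: it counts the subsets of the ranks 1 through 20 whose sum is 6 or less with a small dynamic-programming table instead of looping over all 2^20 sign assignments. Each subset corresponds to the set of ranks given a minus sign.

```python
n = 20                  # number of ranks (no ties, so the ranks are 1..20)
observed_neg_sum = 6    # observed sum of the negative ranks

# ways[s] = number of subsets of the ranks processed so far whose sum equals s,
# tracked only for s = 0..observed_neg_sum.
ways = [1] + [0] * observed_neg_sum
for rank in range(1, n + 1):
    # Go downward so that each rank is used at most once per subset.
    for s in range(observed_neg_sum, rank - 1, -1):
        ways[s] += ways[s - rank]

count = sum(ways)               # subsets with sum <= 6
p_one_sided = count / 2 ** n
p_two_sided = 2 * p_one_sided
print(ways, count, p_one_sided, p_two_sided)
```

The entries of `ways` give the number of configurations for each negative rank sum from 0 to 6, which can be compared row by row with the list above.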

e.

Sign Test

Z = (k - n/2) / sqrt(n/4), where k is the number of positive differences and n is the number of leaves.

k = 18

n = 20

Z = (18 - 10) / sqrt(5) = 3.578

This is off the chart, so we can safely say that the two-sided p-value is less than 0.001.
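The sign test arithmetic can be sketched in a few lines as well. This is an illustrative check under the same normal approximation used above (no continuity correction); the differences are copied from the table in part (a), and the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

# Differences (Prep. 1 minus Prep. 2) for the 20 leaves, from part (a).
diffs = [-10, -1, 4, 7, 8, 11, 12, 15, 16, 17,
         18, 19, 24, 27, 31, 35, 45, 48, 54, 55]

n = len(diffs)
k = sum(d > 0 for d in diffs)   # number of positive differences

z = (k - n / 2) / sqrt(n / 4)
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
print(k, n, round(z, 3), p_two_sided)
```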