Given the following boxplot where m is the median value, what statement could be made

Show all work when appropriate. You may type your answers onto the test, or complete it by hand and submit a scanned copy.

 

1. Given the following boxplot where m is the median value, what statement could be made about the distribution of the data?

Explain your answer.

 

2. The Colorado State Legislature wants to estimate the length of time it takes a resident of Colorado to earn a bachelor’s degree from a state college or university. A random sample was taken of 265 recent in-state graduates.

a) Identify the variable.

b) Is the variable quantitative or qualitative?

c) What is the implied population?

3. For the information in parts (a) through (g) below, list the highest
level of measurement as ratio, interval, ordinal, or nominal, and
explain your choice.

 

A student advising file contains the following information:

 

(a) Name of student

(b) Student I.D. number

(c) Cumulative grade point average

(d) Dates of awards (scholarships, dean’s list, etc.)

(e) Declared major

(f) A number code representing class standing:
1 = freshman, 2 = sophomore, 3 = junior,
4 = senior, 5 = graduate student

(g) Entrance exam rating for competency in English:
excellent, satisfactory, unsatisfactory

 

 

 

4. Exam Scores: For 108 randomly selected college students, this exam score frequency distribution was obtained.

 

 

Class limits   Class boundaries   Frequency (f)   Midpoint (Xm)   f*Xm   Xm^2    f*Xm^2
90–98          89.5–98.5          6               94                     8836
99–107         98.5–107.5         22              103                    10609
108–116        107.5–116.5        43              112                    12544
117–125        116.5–125.5        28              121                    14641
126–134        125.5–134.5        9               130                    16900
Total

Find by using the correct formulas:

Be sure to show all work.

a) Mean

b) Modal class

c) Variance

d) Standard deviation

e) Construct a histogram

f) Discuss the shape of the distribution
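For parts (a) through (d), the grouped-data formulas can be sketched in Python as follows. This is only a sketch under two assumptions: each class is represented by its midpoint, and the sample variance (divisor n - 1) is the one wanted. The variable names are illustrative.

```python
import math

# (class midpoint, frequency) pairs taken from the frequency table
classes = [(94, 6), (103, 22), (112, 43), (121, 28), (130, 9)]

n = sum(f for _, f in classes)                    # total frequency
sum_fx = sum(f * xm for xm, f in classes)         # sum of f * Xm
sum_fx2 = sum(f * xm ** 2 for xm, f in classes)   # sum of f * Xm^2

mean = sum_fx / n

# Shortcut formula for the sample variance of grouped data (assumes divisor n - 1)
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)

# Modal class: the class with the largest frequency, identified by its midpoint
modal_midpoint = max(classes, key=lambda c: c[1])[0]

print(n, mean, round(variance, 2), round(std_dev, 2), modal_midpoint)
```

If the population variance is intended instead, the divisor `n - 1` would become `n`.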

 

5. Stories in the World’s Tallest Buildings: The number of stories in each of the world’s 30 tallest buildings is listed below.

 

88 88 110 88 80 69 102 78 70 55

79 85 80 100 60 90 77 55 75 55

54 60 75 64 105 56 71 70 65 72

 

a) Construct a stem-and-leaf plot

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

b) Find the 5-number summary

 

 

 

c) Construct a box-and-whiskers-plot

d) Check for outliers

e) Discuss the shape of the distribution
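Parts (b) and (d) can be checked with a short Python sketch. Textbooks differ on how quartiles are defined; this sketch assumes the "median of each half" method and the usual 1.5 × IQR fences, so results from other software may differ slightly.

```python
stories = [88, 88, 110, 88, 80, 69, 102, 78, 70, 55,
           79, 85, 80, 100, 60, 90, 77, 55, 75, 55,
           54, 60, 75, 64, 105, 56, 71, 70, 65, 72]


def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


data = sorted(stories)
half = len(data) // 2
lower = data[:half]                # values below the median position
upper = data[len(data) - half:]    # values above the median position

# (minimum, Q1, median, Q3, maximum)
five_number = (data[0], median(lower), median(data), median(upper), data[-1])

q1, q3 = five_number[1], five_number[3]
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(five_number, iqr, outliers)
```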

 

  1. The data are clearly not symmetric: a few very large dosages are expected to stretch the upper tail, so the data are somewhat positively skewed. For skewed data the median is the best measure of location, so the median is the best measure of position here.

  2. The top quartile is the top 25% of the data. There are 40 mice in total, so the top quartile contains 40 × 25% = 10 mice. Since 40% of them survived, 10 × 40% = 4 mice survived.

  3. A percentile tells us what percent of the mice fall in that category, so in this example it pinpoints the information we are looking for.

  4. Quartiles are similar to percentiles. Percentiles divide the whole group into 100 equal parts and quartiles divide it into 4 equal parts, so the quartiles correspond to the 25th, 50th, and 75th percentiles.

  5. The standard score is a relative measure. Standardizing gives enough information to compare two different scores.

Many mutual funds compare their performance with that of a benchmark, an index of the returns on all securities of the kind that the fund buys. The Vanguard International Growth Fund, for example, takes as its benchmark the Morgan Stanley Europe, Australasia, Far East (EAFE) index of overseas stock market performance. Here are the percent returns for the funds and for the EAFE from 1982 (the first full year of the fund’s existence) to 2000:

 

Year   Fund     EAFE      Year   Fund     EAFE
1982   5.27     -0.86     1992   -5.79    -11.85
1983   43.08    24.61     1993   44.74    32.94
1984   -1.02    7.86      1994   0.76     8.06
1985   56.94    56.72     1995   14.89    11.55
1986   56.71    69.94     1996   14.65    6.36
1987   12.48    24.93     1997   4.12     2.06
1988   11.61    28.59     1998   16.93    20.33
1989   24.76    10.8      1999   26.34    27.3
1990   -12.05   -23.2     2000   -8.6     -13.96
1991   4.74     12.5

 

Make a scatterplot suitable for predicting fund returns from EAFE returns. Is there a clear straight-line pattern? How strong is this pattern? (Give a numerical measure.) Are there any extreme outliers from the straight-line pattern?
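The numerical measure of strength asked for here is usually the correlation coefficient r. A minimal Python sketch of that calculation is shown below, assuming Pearson's r is the intended measure; the function name `pearson_r` is illustrative. The scatterplot itself would need a plotting library and is not shown.

```python
import math

# Percent returns for 1982-2000, listed in year order
fund = [5.27, 43.08, -1.02, 56.94, 56.71, 12.48, 11.61, 24.76, -12.05, 4.74,
        -5.79, 44.74, 0.76, 14.89, 14.65, 4.12, 16.93, 26.34, -8.60]
eafe = [-0.86, 24.61, 7.86, 56.72, 69.94, 24.93, 28.59, 10.80, -23.20, 12.50,
        -11.85, 32.94, 8.06, 11.55, 6.36, 2.06, 20.33, 27.30, -13.96]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# EAFE is the explanatory variable, fund return is the response
r = pearson_r(eafe, fund)
print(round(r, 3))
```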

 

 

 

Identify the sampling techniques used, and discuss potential sources of bias (if any). Explain. Using random digit dialing, researchers call 1400 people and ask what obstacles keep them from voting.

What type of sampling is used?

A. Simple random sampling is used, since each number has an equal chance of being dialed, so all samples of 1400 phone numbers have an equal chance of being selected.

B. Convenience sampling is used, since the 1400 phone numbers that are easiest to reach are selected.

C. Cluster sampling is used, since the phone numbers are divided into groups, several groups are selected, and each number in those groups is called.

D. Systematic sampling is used, since phone numbers are selected from a list using a fixed interval between phone numbers.

 

What potential sources of bias are present, if any? Select all that apply.

A. Individuals may not be available when the researchers are calling. Those individuals that are available may not be representative of the population.

B. The sample only consists of members of the population that are easy to get. These members may not be representative of the population.

C. Telephone sampling only includes people who have telephones. People who own telephones may be older or wealthier on average, and may not be representative of the entire population.

D. Individuals may refuse to participate in the sample. This may make the sample less representative of the population.

E. There are no potential sources of bias.

 

The colors of candies such as M&M’s are carefully chosen to match consumer preferences. The color of an M&M drawn at random from a bag has a probability distribution determined by the proportions of colors among all M&M’s of that type.

(a)   Here is the distribution for plain M&M’s:

Color         Brown   Red   Yellow   Green   Orange   Blue

Probability   0.3     0.2   0.2      0.1     0.1      ?

What must be the probability of drawing a blue candy?

(b)   The probabilities for peanut M&M’s are a bit different. Here they are:

Color         Brown   Red   Yellow   Green   Orange   Blue

Probability   0.2     0.2   0.2      0.1     0.1      ?

What is the probability that a peanut M&M chosen at random is blue?

(c)    What is the probability that a plain M&M is any of red, yellow, or orange? What is the probability that a peanut M&M has one of these colors?
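All three parts rest on two rules: the probabilities of all outcomes sum to 1, and the probabilities of disjoint outcomes add. A small Python sketch of both rules follows; exact fractions are used to avoid floating-point rounding, and the helper name `missing_probability` is illustrative.

```python
from fractions import Fraction

# Known probabilities from the two tables (blue is the unknown)
plain = {"brown": Fraction(3, 10), "red": Fraction(2, 10), "yellow": Fraction(2, 10),
         "green": Fraction(1, 10), "orange": Fraction(1, 10)}
peanut = {"brown": Fraction(2, 10), "red": Fraction(2, 10), "yellow": Fraction(2, 10),
          "green": Fraction(1, 10), "orange": Fraction(1, 10)}


def missing_probability(known):
    """Probability left over once the listed outcomes are accounted for."""
    return 1 - sum(known.values())


plain_blue = missing_probability(plain)
peanut_blue = missing_probability(peanut)

# Disjoint outcomes: P(red or yellow or orange) is the sum of the three
colors = ("red", "yellow", "orange")
plain_ryo = sum(plain[c] for c in colors)
peanut_ryo = sum(peanut[c] for c in colors)

print(float(plain_blue), float(peanut_blue), float(plain_ryo), float(peanut_ryo))
```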

 

A clinical psychologist is interested in comparing the effectiveness of short term relaxation and cognitive-behavioral therapy in treating mild depression. An experiment is conducted in which 15 patients with mild depression are randomly selected and assigned 5 each to a relaxation therapy group, a cognitive/behavioral therapy group, and an attention placebo group. Therapy is administered until the patient is judged no longer depressed or until 10 treatment sessions have elapsed. The following data is obtained. Scores are the number of sessions for each patient.

Cognitive/Behavioral therapy: 5, 6, 8, 4, 7
Relaxation Therapy: 6, 8, 10, 9, 7
Attention-Placebo Therapy Group: 8, 10, 9, 10, 9

Refer to Exhibit 15-2. F_obt = ____.
a. 10.21
b. 8.44
c. 8.20
d. 6.88

Refer to Exhibit 15-2. Using α = 0.05, F_crit = ____.
a. 3.98
b. 19.41
c. 6.93
d. 3.88

Refer to Exhibit 15-2. Using α = 0.05, what do you conclude?
a. reject H0; there is no difference among the treatments
b. reject H0; at least one of the treatments differs from at least one of the others
c. retain H0; we cannot conclude that there is a difference among the treatments
d. accept H0; we cannot conclude that there is a difference among the treatments

Refer to Exhibit 15-2. Estimate the size of the effect: ω̂² = ____.
a. 0.5226
b. 0.4824
c. 0.4393
d. 0.4658

The one-way ANOVA partitions the total variability into ____.
a. SSW and SSB
b. s_W² and s_B²
c. SSW and SST
d. SSB and s_B²
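The one-way ANOVA quantities for the therapy data can be sketched in Python as below. The sketch assumes the common omega-squared estimate (SSB - (k - 1)·MSW) / (SST + MSW); the critical F value is not computed because it requires an F table or a statistics library. Variable names are illustrative.

```python
groups = {
    "cognitive_behavioral": [5, 6, 8, 4, 7],
    "relaxation": [6, 8, 10, 9, 7],
    "attention_placebo": [8, 10, 9, 10, 9],
}

all_scores = [x for g in groups.values() for x in g]
n_total = len(all_scores)
k = len(groups)
grand_mean = sum(all_scores) / n_total

# Between-groups and within-groups sums of squares
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g)
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

df_between, df_within = k - 1, n_total - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_obt = ms_between / ms_within

# Effect size (assumed formula for estimated omega squared)
omega_sq = (ss_between - df_between * ms_within) / (ss_total + ms_within)

print(round(f_obt, 2), round(omega_sq, 4))
```

The check `ss_between + ss_within == ss_total` (up to rounding) illustrates how the total variability is partitioned.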

Independent random samples from normal populations produced the results shown in the table. Complete parts a through d.

Sample 1: 3.2, 2.2, 2.7, 1.4, 2.1

Sample 2: 2.9, 3.4, 3.4, 3.1

The Minitab output is given below,

Two-Sample T-Test and CI: Sample 1, Sample 2

 

Two-sample T for Sample 1 vs Sample 2

 

           N   Mean    StDev   SE Mean
Sample 1   5   2.320   0.676   0.30
Sample 2   4   3.200   0.245   0.12

 

 

Difference = μ (Sample 1) – μ (Sample 2)

Estimate for difference: -0.880

95% CI for difference: (-1.730, -0.030)

T-Test of difference = 0 (vs ≠): T-Value = -2.45 P-Value = 0.044 DF = 7

Both use Pooled StDev = 0.5356

 

 

 


a.) Calculate the pooled estimate of σ².

s_p² = (round to four decimal places as needed)

b.) Do the data provide sufficient evidence to indicate that μ2 > μ1? Test using α = 0.05.

Yes ( )
No ( )
Please select one.

c.) Find a 95% confidence interval for (μ1 – μ2).

Confidence interval is
(round to two decimal places as needed)

d.) Which of the two inferential procedures, the test of hypothesis in part b or the confidence interval in part c, provides more information about (μ1 – μ2)?

Select one:
A – The confidence interval in part c provides more information about μ1 – μ2.

B – The test of hypothesis in part b provides more information.
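The Minitab figures can be reproduced by hand with the pooled-variance formulas. A minimal Python sketch follows, assuming equal population variances (as the pooled procedure does); the value 2.365 is the standard t-table entry for 7 degrees of freedom and a two-sided 95% interval, typed in rather than computed.

```python
import math
import statistics

sample1 = [3.2, 2.2, 2.7, 1.4, 2.1]
sample2 = [2.9, 3.4, 3.4, 3.1]

n1, n2 = len(sample1), len(sample2)
mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
var1, var2 = statistics.variance(sample1), statistics.variance(sample2)

# Pooled estimate of the common variance
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df

# Standard error of the difference and the t statistic
std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / std_err

# 95% confidence interval for mu1 - mu2 (t-table value for df = 7, assumed)
t_crit = 2.365
diff = mean1 - mean2
ci = (diff - t_crit * std_err, diff + t_crit * std_err)

print(round(pooled_var, 4), round(t_stat, 2), tuple(round(b, 2) for b in ci))
```

Note that the Minitab P-value of 0.044 is for the two-sided alternative; a one-sided test of μ2 > μ1 would use half of it.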

Notice: There are five (5) questions.
A survey was conducted to evaluate whether a Super Bowl advertisement changed consumer attitudes about your company’s product.
A survey of 1000 people included 80% men.
When asked about the impact of the advertising, 70% of the sample said they were more likely to try it after seeing the Super Bowl ad.
150 of the people who said the ad made no impact on their decision were men.

1. Complete the following contingency table:

2. Given that a person is female, what is the probability that the Super Bowl advertisement made her more likely to try the product (show your work)?


3. Given that the advertising made a person more likely to try the product, what is the probability that the person is a man (show your work)?


4. A reporter sees this information and writes, "Women are less influenced by Super Bowl advertising than men." Is this a true statement? Why or why not?


5. Is being a woman independent of the impact of Super Bowl advertising on a person's decision to try a product (show your work)?
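The contingency table and the conditional probabilities can be worked out from the three given facts. The Python sketch below assumes every respondent falls into exactly one of two responses ("more likely" or "no impact"), which the problem implies but does not state outright.

```python
total = 1000
men = total * 80 // 100             # 80% of the sample are men
women = total - men

more_likely = total * 70 // 100     # 70% said "more likely"
no_impact = total - more_likely     # assumes only two possible responses

men_no_impact = 150
women_no_impact = no_impact - men_no_impact
men_more_likely = men - men_no_impact
women_more_likely = women - women_no_impact

# Conditional probabilities for questions 2 and 3
p_more_given_woman = women_more_likely / women
p_man_given_more = men_more_likely / more_likely

# Independence check for question 5: is P(more | woman) equal to P(more)?
p_more = more_likely / total
independent = abs(p_more_given_woman - p_more) < 1e-12

print(men_more_likely, women_more_likely, men_no_impact, women_no_impact)
print(p_more_given_woman, round(p_man_given_more, 4), independent)
```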

 

1- A strong linear relationship (r = 0.97) exists between the two variables x and y in the table. The equation of the least squares line is ŷ = 15.75 – 0.55x. For what values of x should we use this equation to make predictions?

 

x 5 7 8 10 11 12

y 5.5 8 8 9 10 11

 

A) Any positive value of x

B) Values of x less than or equal to 12

C) Values of x less than or equal to 5

D) Values of x between 5 and 12 inclusive

 

2- A survey of ages of children at a skate park produced the following results, summarized in the frequency table:

 

 

 

Age Frequency

10 2

11 4

12 6

14 8

20 5

How many children were in the skate park? 

 

What is the median age of children in the skate park? 

 

 

What is the modal (mode) age of the children in the skate park? 

 

 

What is the range value of the ages of children in the skate park? 

 

 

If a birthday party of 5 children who were 10 years old came into the park, which of the following statistics would change? Type yes, or no.

 

median ?

 

 

mode ?

 

 

range ?

 

What percent of children in the skate park were less than 12 years of age? %
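One way to check these answers is to expand the frequency table into a list of ages and summarize it. The Python sketch below does that before and after the five 10-year-olds arrive; the helper names `expand` and `summarize` are illustrative.

```python
import statistics

freq = {10: 2, 11: 4, 12: 6, 14: 8, 20: 5}


def expand(table):
    """Turn an {age: count} table into a sorted list of individual ages."""
    return [age for age, count in sorted(table.items()) for _ in range(count)]


def summarize(table):
    ages = expand(table)
    return {
        "n": len(ages),
        "median": statistics.median(ages),
        "mode": statistics.mode(ages),
        "range": max(ages) - min(ages),
    }


before = summarize(freq)

# Five more 10-year-olds enter the park
after_table = dict(freq)
after_table[10] += 5
after = summarize(after_table)

# Percent of the original group younger than 12
pct_under_12 = 100 * sum(c for age, c in freq.items() if age < 12) / sum(freq.values())

print(before, after, pct_under_12)
```

Comparing `before` and `after` key by key shows which statistics change.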

 

 

3- In two statistics classes, the same final exam was given and yielded the following results:

 

10:00am class: x-bar = 72, s = 10

 

11:00am class: x-bar = 67, s = 6

 

John, in the 10:00am class, scored 62 and Paul, in the 11:00am class, also scored 62.

 

Calculate John’s z-score, round to 3 decimal places: (enter as 0.xxx)

 

 

Calculate Paul’s z-score, round to 3 decimal places: (enter as 0.xxx)

 

 

Did John or Paul have a better relative standing in his respective class? 
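The z-score is (score - class mean) / class standard deviation. A short Python sketch of the comparison, with an illustrative helper name `z_score`:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


john_z = z_score(62, 72, 10)   # 10:00am class
paul_z = z_score(62, 67, 6)    # 11:00am class

# The larger (less negative) z-score is the better relative standing
better = "Paul" if paul_z > john_z else "John"

print(round(john_z, 3), round(paul_z, 3), better)
```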

 

·         One thousand people are enrolled in a 10-year cohort study. At the start of the study, 100 have diagnosed CVD. Over the course of the study, 80 people who were free of CVD at baseline develop CVD.

 

1. What is the cumulative incidence of CVD over 10 years?

 

Cumulative Incidence 10 Years =

 

2. What is prevalence of CVD at baseline?

 

Prevalence Baseline =

 

3. What is the prevalence of CVD at 10 years?

 

Prevalence 10 Years =
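The three measures differ mainly in their denominators. A minimal Python sketch follows; for the 10-year prevalence it assumes that all 1,000 participants are still in the study and that nobody with CVD has died or recovered, which the problem does not state.

```python
enrolled = 1000
cvd_at_baseline = 100
new_cases = 80

# Only people free of CVD at baseline are at risk of becoming new cases
at_risk = enrolled - cvd_at_baseline
cumulative_incidence = new_cases / at_risk

prevalence_baseline = cvd_at_baseline / enrolled

# Assumes no deaths, recoveries, or loss to follow-up over the 10 years
prevalence_10yr = (cvd_at_baseline + new_cases) / enrolled

print(round(cumulative_incidence, 4), prevalence_baseline, prevalence_10yr)
```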

 

 

·         A study is designed to investigate whether there is a difference in response to various treatments in patients with rheumatoid arthritis. The outcome is a patient's self-reported effect of treatment. The data are shown below. Are symptoms independent of treatment? Conduct a chi-square test at a 5% level of significance.

 

              Symptoms worsened   No effect   Symptoms improved   Total
Treatment 1   22                  14          14                  50
Treatment 2   14                  15          21                  50
Treatment 3   9                   12          29                  50

1. df=

2. Critical value

3. Computed statistic

Based on comparing the computed statistic to the critical value, which of the following is true?

  A. There is significant evidence, α = 0.05, to show that treatment and response are not independent.
  B. There is not significant evidence, α = 0.05, to show that treatment and response are not independent.
  C. There is significant evidence, α = 0.05, to show that treatment and response are independent.
  D. B and C
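The chi-square statistic for the treatment table can be sketched in Python as below. Expected counts are row total × column total / grand total; the critical value 9.488 is the standard chi-square table entry for 4 degrees of freedom at the 5% level, typed in rather than computed.

```python
# Rows: treatments 1-3; columns: worsened, no effect, improved
observed = [
    [22, 14, 14],
    [14, 15, 21],
    [9, 12, 29],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

critical_value = 9.488  # chi-square table, df = 4, alpha = 0.05 (assumed lookup)
reject_independence = chi_sq > critical_value

print(df, round(chi_sq, 3), reject_independence)
```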

 

Consider the following hypothesis test.
H0: π ≥ 0.55
Ha: π < 0.55
Compute the test statistic and the p-value for the following three cases.
14. n = 300, p̅ = 0.51, α = 0.05
a. p-value = 0.0823. Conclude that the population proportion is less than 0.55.
b. p-value = 0.0823. Conclude that the population proportion is not less than 0.55.
c. p-value = 0.0411. Conclude that the population proportion is not less than 0.55.
d. p-value = 0.0411. Conclude that the population proportion is less than 0.55.

15. n = 300, p̅ = 0.51, α = 0.10
a. p-value = 0.0411. Conclude that the population proportion is not less than 0.55.
b. p-value = 0.0411. Conclude that the population proportion is less than 0.55.
c. p-value = 0.0823. Conclude that the population proportion is not less than 0.55.
d. p-value = 0.0823. Conclude that the population proportion is less than 0.55.

16. n = 900, ∑x = 468
a. p-value = 0.0703. Reject H0 at α = 0.10, but do not reject at α = 0.05.
b. p-value = 0.0703. Reject H0 at α = 0.05, but do not reject at α = 0.10.
c. p-value = 0.0351. Reject H0 at α = 0.05, but do not reject at α = 0.01.
d. p-value = 0.0351. Reject H0 at α = 0.10, but do not reject at α = 0.05.
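The test statistic in each case is z = (p̅ - 0.55) / sqrt(0.55 × 0.45 / n), and because the alternative is "less than", the p-value is the lower-tail normal probability. A minimal Python sketch follows, with illustrative function names. Table-based answers round z to two decimals first, so they can differ from these values in the third or fourth decimal place.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def lower_tail_test(p_hat, n, pi0=0.55):
    """z statistic and lower-tail p-value for a one-sample proportion test."""
    std_err = math.sqrt(pi0 * (1 - pi0) / n)
    z = (p_hat - pi0) / std_err
    return z, normal_cdf(z)


# Cases 14 and 15 share the same data and differ only in alpha
z_300, p_300 = lower_tail_test(0.51, 300)

# Case 16: the sample proportion is 468 / 900
z_900, p_900 = lower_tail_test(468 / 900, 900)

print(round(z_300, 2), round(p_300, 4))
print(round(z_900, 2), round(p_900, 4))
```

The decision in each case comes from comparing the p-value with the stated α: reject H0 when the p-value is at most α.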