# PROBABILITY AND STATISTICS Final Exam

*This is an open-book take-home exam. Good luck!*

**1.** Let $X_i$ be the life length of an item. Consider $X_1, X_2, \ldots, X_n$ to be independently and identically distributed, each with normal distribution $N(\mu, \sigma^2)$. Assume that $\sigma^2 = 16$, but that $\mu$ is unknown. Suppose 100 tests yield an average life of $\bar{X} = 501.2$ hours.

a) Construct a 95% confidence interval for the reliability of the item for a service time of $t$ hours given by $R(t; \mu) = P(X > t)$.

b) Compute numerical values for a) if *t* = 500 hours.
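One way to check a hand-derived answer to Problem 1 numerically is sketched below. It assumes the usual known-variance z-interval for the mean and relies on the reliability $P(X > t)$ being increasing in the mean, so the interval endpoints for the mean map directly to endpoints for the reliability. The variable names are illustrative only.

```python
from statistics import NormalDist

sigma = 4.0      # known standard deviation, since the variance is 16
n = 100          # number of tests
xbar = 501.2     # observed average life in hours
t = 500.0        # service time in hours
level = 0.95     # confidence level

# z-interval for the unknown mean (assumes sigma is known exactly).
z = NormalDist().inv_cdf(1 - (1 - level) / 2)
half_width = z * sigma / n ** 0.5
mu_lo, mu_hi = xbar - half_width, xbar + half_width

def reliability(t, mu, sigma):
    """P(X > t) for X ~ N(mu, sigma^2)."""
    return 1.0 - NormalDist(mu, sigma).cdf(t)

# P(X > t) increases with mu, so the endpoints carry over in order.
r_lo = reliability(t, mu_lo, sigma)
r_hi = reliability(t, mu_hi, sigma)

print(f"mean interval:        ({mu_lo:.3f}, {mu_hi:.3f})")
print(f"reliability interval: ({r_lo:.4f}, {r_hi:.4f})")
```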

**2.** For a random sample of size $n$ from $f(x \mid \theta) = \theta L^{\theta} x^{-(1+\theta)}$ for $x > L$, where $L$ is known and $\theta > 0$,

a) Find the maximum likelihood estimator of $\theta$ and express it as a function of $g = \left(\prod_{i=1}^{n} x_i\right)^{1/n}$, the geometric mean of the observations.

b) Find the set of admissible rejection regions in terms of $g$ for a likelihood ratio test of $H_0\colon \theta = 5$ versus $H_1\colon \theta = 2$.

**3.** For a normal data-generating process with $\mu$ and $\sigma$ not known but the coefficient of variation $c = \sigma/\mu$ known, find the maximum likelihood estimates of $\mu$ and $\sigma^2$ if $c = 0.25$ and the data are: 16, 27, 24, 21, 23, 12, 21, 18, 17, 23. Compare these estimates with estimates that would be obtained if no information were available concerning $c$.
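The comparison asked for in Problem 3 can be checked numerically with a sketch like the one below. It assumes the mean is positive, so that the standard deviation equals $c$ times the mean. Under that assumption, setting the derivative of the log-likelihood to zero reduces to the quadratic $c^2 \mu^2 + \bar{x}\mu - \overline{x^2} = 0$, whose positive root is used here. This reduction is a derivation made for the sketch and should be verified by hand.

```python
import math

data = [16, 27, 24, 21, 23, 12, 21, 18, 17, 23]
c = 0.25                                 # known coefficient of variation
n = len(data)
xbar = sum(data) / n                     # sample mean
m2 = sum(x * x for x in data) / n        # mean of squares

# Unrestricted maximum likelihood estimates (no information about c).
mu_free = xbar
var_free = m2 - xbar ** 2

# Restricted estimates with sigma = c * mu (assumes mu > 0):
# positive root of c^2 mu^2 + xbar mu - m2 = 0.
mu_cv = (-xbar + math.sqrt(xbar ** 2 + 4 * c ** 2 * m2)) / (2 * c ** 2)
var_cv = (c * mu_cv) ** 2

def log_likelihood(mu, sigma):
    """Normal log-likelihood of the data at (mu, sigma)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )

print(f"unrestricted: mean = {mu_free:.3f}, variance = {var_free:.3f}")
print(f"restricted:   mean = {mu_cv:.3f}, variance = {var_cv:.3f}")
```

Evaluating `log_likelihood(mu, c * mu)` at values slightly above and below `mu_cv` is a quick sanity check that the root is a maximum along the restricted curve.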

**4.** In a survey, some of the questions concern sensitive issues (e.g., income, drug use, sexual experiences). As a result, some respondents do not answer the questions truthfully. Denote the proportion of the members of a particular population that had incomes over \$100,000 last year by $p$. A random sample of $n$ members of this population is taken, and each person in the sample is asked “Was your income over \$100,000 last year?” If a person really had an income over \$100,000, the probability that she will give a truthful answer to this question is $1 - \lambda_1$. If a person’s income was *not* over \$100,000, the probability that she will give a truthful answer is $1 - \lambda_2$. From past experience, $\lambda_1$ and $\lambda_2$ are known, with $0 < \lambda_1 < 0.5$, $0 < \lambda_2 < 0.5$.

a) For a sample of size one, find the likelihood function if the answer is “yes” and find the likelihood function if the answer is “no.”

b) For a random sample of size *n*, find the likelihood function and sufficient statistics.

c) Find the maximum likelihood estimator for *p*.

d) Assume that $\lambda_1 = 0.1$, $\lambda_2 = 0$, and there is one “yes” answer in a random sample of size 10. What is your best estimate of $p$ and why?

e) Consider the same scenario as in (d), but assume that $\lambda_1$ is unknown ($0 < \lambda_1 < 1$). In this case, what would be your best estimate of $p$ and why?

**5.** Let $X_1, X_2, \ldots, X_n$ be the times in months until failure of $n$ similar pieces of equipment. If the equipment is subject to wear, a model often used is the one where $X_1, X_2, \ldots, X_n$ (i.i.d.) is a sample from a Weibull distribution with density [formula missing], $x_i > 0$.

Here $c$ is a known positive constant and $\lambda > 0$ is the (scale) parameter of interest.

a) Show that [statistic missing] is an optimal test statistic for testing $H_0\colon 1/\lambda < 1/\lambda_0$ versus $H_1\colon 1/\lambda > 1/\lambda_0$, i.e., show that for a UMP test, the rejection and acceptance regions are defined in terms of this statistic.

b) If random variable $X$ has the Weibull distribution specified above, find the distribution of the random variable [formula missing].

**6.** A journal editor says: “If we only publish papers with results that are statistically significant at the $\alpha = 0.05$ level, at most 5% of our papers will have erroneous results.” Denote by $p$ the proportion of researchers with true $H_0$ and false $H_1$. Suppose that each researcher performs one test, sends the paper to the journal, and the paper is accepted if the results of the test are significant at the $\alpha = 0.05$ level.

a) If in a given year the journal publishes $n$ papers, find the distribution of the number of papers with erroneous results that are published in this year. Assume that all the tests in all papers have the same $\beta$, the probability of a type II error.

b) What is this distribution if $p = 1$, i.e., if all researchers submitting papers this year had true $H_0$ and false $H_1$?

c) Overall, comment on the above statement of a journal editor.

**7.** Suppose that a single observation *X* is to be drawn from an unknown distribution *P*, and that the following simple hypotheses are to be tested:

$H_0$: $P$ is a uniform distribution on the interval $[0, 1]$,

$H_1$: $P$ is a standard normal distribution.

Determine the most powerful test of size 0.01, and calculate the power of the test when $H_1$ is true.
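A candidate answer to Problem 7 can be sanity-checked with the short sketch below. It assumes the standard Neyman-Pearson reasoning: outside $[0, 1]$ the uniform density is zero, so such observations always lead to rejection, and inside $[0, 1]$ the ratio of the normal density to the uniform density decreases in $x$, so the remaining rejection mass goes to the smallest values. The function name `rejects` and the variable `cutoff` are illustrative.

```python
from statistics import NormalDist

size = 0.01
std_normal = NormalDist()

# Under the uniform hypothesis, P(X <= cutoff) = cutoff for 0 <= cutoff <= 1,
# so a cutoff equal to the size gives a test of exactly that size.
cutoff = size

def rejects(x):
    """Reject the uniform hypothesis for x outside [0, 1] or x <= cutoff."""
    return x < 0.0 or x > 1.0 or x <= cutoff

# Power: probability of the rejection region under the standard normal.
power = std_normal.cdf(cutoff) + (1.0 - std_normal.cdf(1.0))

print(f"power = {power:.4f}")
```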

**8.** An unethical experimenter desires to test the following hypotheses:

$H_0\colon \theta = \theta_0$,

$H_1\colon \theta \neq \theta_0$.

She draws a random sample $X_1, X_2, \ldots, X_n$ from a distribution with the pdf $f(x \mid \theta)$ and carries out a test of size $\alpha$. If this test does not reject $H_0$, she discards the sample, draws a new independent random sample of $n$ observations, and repeats the test based on the new sample. She continues drawing new independent samples in this way until she obtains a sample for which $H_0$ is rejected.

a) What is the overall size of this testing procedure?

b) If $H_0$ is true, what is the distribution of the number of samples that the experimenter will have to draw until she rejects $H_0$? In particular, what is the expected number of samples for $\alpha = 0.05$?
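The repeated-testing procedure in Problem 8 is easy to simulate. The sketch below assumes that the null hypothesis is true, that each test has exact size 0.05, and that the samples are independent, so each round rejects with probability 0.05 regardless of earlier rounds. The helper `samples_until_rejection` is hypothetical and stands in for the whole draw-and-test step.

```python
import random

def samples_until_rejection(alpha, rng):
    """Count rounds until the first rejection, each occurring with probability alpha."""
    count = 1
    while rng.random() >= alpha:   # this round does not reject; draw another sample
        count += 1
    return count

alpha = 0.05
trials = 20000
rng = random.Random(0)             # fixed seed so the run is reproducible

draws = [samples_until_rejection(alpha, rng) for _ in range(trials)]
mean_draws = sum(draws) / trials

print(f"every run ended in a rejection: {len(draws) == trials}")
print(f"average number of samples drawn: {mean_draws:.2f}")
```

Comparing the simulated average with the mean of the distribution derived by hand gives a quick check on part b).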

**9.** Consider the following situation. There are $N$ job applicants, and, with probability $\pi_i$, $n_i$ of them ($i = 1, 2, \ldots, M$; $0 < n_i < N$) are invited for an interview. All $\pi_i$ and $n_i$ are known to all job applicants, and if $n_i$ applicants are invited, then each of $N$ applicants has the same chance $n_i/N$ to be invited.

a) Given that a job applicant is invited for an interview, what are her expectations about the total number of applicants invited for an interview? 1) Find the corresponding probability distribution – i.e., the posterior distribution (conditional on an applicant being invited for an interview) for the number of applicants invited for an interview. 2) For this distribution, find the expected number of invited applicants.

b) Assume that if $n_i$ applicants are invited, each of them has an equal ($1/n_i$) chance of getting a job. Before the applicant is invited, what are her chances of getting a job? After the applicant is invited, what are her chances of getting a job? What are the chances to be invited? Do these three numbers agree with each other?

c) Repeat questions a) and b) for a special case $M = 2$, $\pi_1 = \pi_2 = 0.5$, $n_1 = 1$, $n_2 = 100$, $N = 1000$, i.e., out of 1000 applicants, either 1 or 100 are invited for an interview. Do the answers make sense?
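The special case in part c) can be worked through with exact fractions, as in the sketch below. It applies Bayes' rule under the stated assumption that, given $n_i$ invitations, each applicant is invited with chance $n_i/N$ and each invitee gets the job with chance $1/n_i$. The function name `interview_summary` and the dictionary keys are illustrative.

```python
from fractions import Fraction

def interview_summary(N, probs, sizes):
    """Posterior over the number invited, given that one applicant was invited."""
    # Joint probability of scenario i and this applicant being invited.
    joint = [p * Fraction(n, N) for p, n in zip(probs, sizes)]
    p_invited = sum(joint)
    posterior = [j / p_invited for j in joint]
    expected_invited = sum(q * n for q, n in zip(posterior, sizes))
    # Chance of the job after being invited, and before any invitation.
    p_job_given_invited = sum(q * Fraction(1, n) for q, n in zip(posterior, sizes))
    p_job_prior = sum(j * Fraction(1, n) for j, n in zip(joint, sizes))
    return {
        "p_invited": p_invited,
        "posterior": posterior,
        "expected_invited": expected_invited,
        "p_job_given_invited": p_job_given_invited,
        "p_job_prior": p_job_prior,
    }

result = interview_summary(
    N=1000,
    probs=[Fraction(1, 2), Fraction(1, 2)],
    sizes=[1, 100],
)
for name, value in result.items():
    print(name, value)
```

Multiplying `p_invited` by `p_job_given_invited` and comparing with `p_job_prior` is one way to see whether the three numbers in part b) agree.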