Sampling: Design and Analysis, 2nd Edition Solution Manual


Chapter 1

Introduction

1.1 Target population: Unclear, but presumed to be readers of Parade magazine.
Sampling frame: Persons who know about the telephone survey.
Sampling unit = observation unit: One call. (Although it would also be correct to consider the sampling unit to be a person. The survey is so badly done that it is difficult to tell what the units are.)

As noted in Section 1.3, samples that consist only of volunteers are suspect. This is especially true of surveys in which respondents must pay to participate, as here: persons willing to pay 75 cents a call are likely to have strong opinions about the legalization of marijuana, and it is impossible to say whether pro- or anti-legalization adherents are more likely to call. This survey is utterly worthless for measuring public opinion because of its call-in format. Other potential biases, such as requiring a touch-tone telephone, the sensitive subject matter, or the ambiguity of the wording (what does “as legal as alcoholic beverages” mean?) probably make little difference because the call-in structure destroys all credibility for the survey by itself.

1.2 Target population: All mutual funds.
Sampling frame: Mutual funds listed in newspaper.
Sampling unit = observation unit: One listing.

As funds are listed alphabetically by company, there is no reason to believe there will be any selection bias from the sampling frame. There may be undercoverage, however, if smaller or new funds are not listed in the newspaper.

1.3 Target population: Not specified, but a target population of interest would be persons who have read the book.
Sampling frame: Persons who visit the website.
Sampling unit = observation unit: One review.

The reviews are contributed by volunteers. They cannot be taken as representative of readers’ opinions. Indeed, there have been instances where authors of competing books have written negative reviews of a book, although amazon.com tries to curb such practices.

1.4 Target population: Persons eligible for jury duty in Maricopa County.
Sampling frame: County residents who are registered voters or licensed drivers over 18.
Sampling unit = observation unit: One resident.

Selection bias occurs largely because of undercoverage and nonresponse. Eligible jurors may not appear in the sampling frame because they are not registered to vote and they do not possess an Arizona driver’s license. Addresses on either list may not be up to date. In addition, jurors fail to appear or are excused; this is nonresponse.

A similar question for class discussion is whether there was selection bias in selecting which young men in the U.S. were to be drafted and sent to Vietnam.

1.5 Target population: All homeless persons in study area.
Sampling frame: Clinics participating in the Health Care for the Homeless project.
Sampling unit: Unclear. Depending on assumptions made about the survey design, one could say either a clinic or a homeless person is the sampling unit.
Observation unit: Person.

Selection bias may be a serious problem for this survey. Even though the demographics for HCH patients are claimed to match those of the homeless population (but do we know they match?) and the clinics are readily accessible, the patients differ in two critical ways from non-patients: (1) they needed medical treatment, and (2) they went to a clinic to get medical treatment. One does not know the likely direction of selection bias, but there is no reason to believe that the same percentages of patients and non-patients are mentally ill.

1.6 Target population: Female readers of Prevention magazine.
Sampling frame: Women who see the survey in a copy of the magazine.
Sampling unit = observation unit: One woman.

This is a mail-in survey of volunteers, and we cannot trust any statistics from it.

1.7 Target population: All cows in region.
Sampling frame: List of all farms in region.
Sampling unit: One farm.
Observation unit: One cow.

There is no reason to anticipate selection bias in this survey. The design is a single-stage cluster sample, discussed in Chapter 5.

1.8 Target population: Licensed boarding homes for the elderly in Washington state.
Sampling frame: List of 184 licensed homes.
Sampling unit = observation unit: One home.

Nonresponse is the obvious problem here, with only 43 of 184 administrators or food service managers responding. It may be that the respondents are the larger homes, or that their menus have better nutrition. The problem with nonresponse, though, is that we can only conjecture the direction of the nonresponse bias.

1.13 Target population: All attendees of the 2005 JSM.
Sampling population: E-mail addresses provided by the attendees of the 2005 JSM.
Sampling unit: One e-mail address.

It is stated that the small sample of conference registrants was selected randomly. This is good, since the ASA can control the quality better and follow up on nonrespondents. It also means, since the sample is selected, that persons with strong opinions cannot flood the survey. But nonresponse is a potential problem: response is not mandatory, and it might be feared that only attendees with strong opinions or a strong sense of loyalty to the ASA will respond to the survey.

1.14 Target population: All professors of education.
Sampling population: List of education professors.
Sampling unit: One professor.

Information about how the sample was selected was not given in the publication, but let’s assume it was a random sample. Obviously, nonresponse is a huge problem with this survey. Of the 5324 professors selected to be in the sample, only 900 were interviewed. Professors who travel during summer could of course not be contacted; also, summer is the worst time of year to try to interview professors for a survey.

1.15 Target population: All adults.
Sampling population: Friends and relatives of American Cancer Society volunteers.
Sampling unit: One person.

Here’s what I wrote about the survey elsewhere:

“Although the sample contained Americans of diverse ages and backgrounds, and the sample may have provided valuable information for exploring factors associated with development of cancer, its validity for investigating the relationship between amount of sleep and mortality is questionable. The questions about amount of sleep and insomnia were not the focus of the original study, and the survey was not designed to obtain accurate responses to those questions. The design did not allow researchers to assess whether the sample was representative of the target population of all Americans. Because of the shortcomings in the survey design, it is impossible to know whether the conclusions in Kripke et al. (2002) about sleep and mortality are valid or not.” (pp. 97–98)

Lohr, S. (2008). “Coverage and sampling,” chapter 6 of International Handbook of Survey Methodology, ed. E. deLeeuw, J. Hox, D. Dillman. New York: Erlbaum, 97–112.

1.25 Students will have many different opinions on this issue. Of historical interest is this excerpt of a letter written by James Madison to Thomas Jefferson on February 14, 1790:

    A Bill for taking a census has passed the House of Representatives, and is with the Senate. It contained a schedule for ascertaining the component classes of the Society, a kind of information extremely requisite to the Legislator, and much wanted for the science of Political Economy. A repetition of it every ten years would hereafter afford a most curious and instructive assemblage of facts. It was thrown out by the Senate as a waste of trouble and supplying materials for idle people to make a book. Judge by this little experiment of the reception likely to be given to so great an idea as that explained in your letter of September.

Chapter 2

Simple Probability Samples

2.1 (a) $\bar{y}_U = \dfrac{98 + 102 + 154 + 133 + 190 + 175}{6} = 142$.

(b) For each plan, we first find the sampling distribution of $\bar{y}$.

Plan 1:

Sample number   $P(S)$   $\bar{y}_S$
1               1/8      147.33
2               1/8      142.33
3               1/8      140.33
4               1/8      135.33
5               1/8      148.67
6               1/8      143.67
7               1/8      141.67
8               1/8      136.67

(i) $E[\bar{y}] = \frac{1}{8}(147.33) + \frac{1}{8}(142.33) + \cdots + \frac{1}{8}(136.67) = 142$.
(ii) $V[\bar{y}] = \frac{1}{8}(147.33 - 142)^2 + \frac{1}{8}(142.33 - 142)^2 + \cdots + \frac{1}{8}(136.67 - 142)^2 = 18.94$.
(iii) $\text{Bias}[\bar{y}] = E[\bar{y}] - \bar{y}_U = 142 - 142 = 0$.
(iv) Since $\text{Bias}[\bar{y}] = 0$, $\text{MSE}[\bar{y}] = V[\bar{y}] = 18.94$.

Plan 2:

Sample number   $P(S)$   $\bar{y}_S$
1               1/4      135.33
2               1/2      143.67
3               1/4      147.33

(i) $E[\bar{y}] = \frac{1}{4}(135.33) + \frac{1}{2}(143.67) + \frac{1}{4}(147.33) = 142.5$.

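(ii) $V[\bar{y}] = \frac{1}{4}(135.33 - 142.5)^2 + \frac{1}{2}(143.67 - 142.5)^2 + \frac{1}{4}(147.33 - 142.5)^2 = 12.84 + 0.68 + 5.84 = 19.36$.
(iii) $\text{Bias}[\bar{y}] = E[\bar{y}] - \bar{y}_U = 142.5 - 142 = 0.5$.
(iv) $\text{MSE}[\bar{y}] = V[\bar{y}] + (\text{Bias}[\bar{y}])^2 = 19.61$.

(c) Clearly, Plan 1 is better. It has smaller variance and is unbiased as well.

2.2 (a) Unit 1 appears in samples 1 and 3, so $\pi_1 = P(S_1) + P(S_3) = \frac{1}{8} + \frac{1}{8} = \frac{1}{4}$. Similarly,

$\pi_2 = \frac{1}{4} + \frac{3}{8} = \frac{5}{8}$
$\pi_3 = \frac{1}{8} + \frac{1}{4} = \frac{3}{8}$
$\pi_4 = \frac{1}{8} + \frac{3}{8} + \frac{1}{8} = \frac{5}{8}$
$\pi_5 = \frac{1}{8} + \frac{1}{8} = \frac{1}{4}$
$\pi_6 = \frac{1}{8} + \frac{1}{8} + \frac{3}{8} = \frac{5}{8}$
$\pi_7 = \frac{1}{4} + \frac{1}{8} = \frac{3}{8}$
$\pi_8 = \frac{1}{4} + \frac{1}{8} + \frac{3}{8} + \frac{1}{8} = \frac{7}{8}$.

Note that $\sum_{i=1}^{8} \pi_i = 4 = n$.

(b)

Sample, $S$     $P(S)$   $\hat{t}$
{1, 3, 5, 6}    1/8      38
{2, 3, 7, 8}    1/4      42
{1, 4, 6, 8}    1/8      40
{2, 4, 6, 8}    3/8      42
{4, 5, 7, 8}    1/8      52

Thus the sampling distribution of $\hat{t}$ is:

$k$    $P(\hat{t} = k)$
38     1/8
40     1/8
42     5/8
52     1/8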
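The bookkeeping in 2.2 can be checked mechanically. The following is a minimal Python sketch, not part of the original solution: it takes the five samples, their probabilities, and the reported values of $\hat{t}$ from the table, and recomputes the inclusion probabilities and the sampling distribution of $\hat{t}$. The function names `inclusion_probabilities` and `distribution_of_estimate` are illustrative assumptions.

```python
from fractions import Fraction

# Design from the solution to 2.2: (sample S, P(S), reported t-hat for S).
design = [
    ({1, 3, 5, 6}, Fraction(1, 8), 38),
    ({2, 3, 7, 8}, Fraction(1, 4), 42),
    ({1, 4, 6, 8}, Fraction(1, 8), 40),
    ({2, 4, 6, 8}, Fraction(3, 8), 42),
    ({4, 5, 7, 8}, Fraction(1, 8), 52),
]


def inclusion_probabilities(design, units):
    """pi_i is the sum of P(S) over every sample S that contains unit i."""
    return {i: sum(p for s, p, _ in design if i in s) for i in units}


def distribution_of_estimate(design):
    """Add up P(S) over samples that share the same value of t-hat."""
    dist = {}
    for _, p, t_hat in design:
        dist[t_hat] = dist.get(t_hat, Fraction(0)) + p
    return dist


pi = inclusion_probabilities(design, range(1, 9))
t_dist = distribution_of_estimate(design)

for unit, prob in pi.items():
    print(f"pi_{unit} = {prob}")
print("sum of pi_i =", sum(pi.values()))
print("distribution of t-hat:", t_dist)
```

Exact fractions are used so that the check $\sum_i \pi_i = n$ holds without rounding error.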

2.3 No, because thick books have a higher inclusion probability than thin books.

2.4 (a) A total of $\binom{8}{3} = 56$ samples are possible, each with probability of selection $\frac{1}{56}$. The R function samplist below will (inefficiently!) generate each of the 56 samples. To find the sampling distribution of $\bar{y}$, I used the commands

    # Return a matrix whose rows are all possible samples of size sampsize
    samplist <- function(popn, sampsize) {
      popvals <- 1:length(popn)
      temp <- comblist(popvals, sampsize)
      matrix(popn[t(temp)], nrow = nrow(temp), byrow = T)
    }

    # Recursively list all combinations of sampsize elements of popvals
    comblist <- function(popvals, sampsize) {
      popsize <- length(popvals)
      if(sampsize > popsize)
        stop("sample size cannot exceed population size")
      nvals <- popsize - sampsize + 1
      nrows <- prod((popsize - sampsize + 1):popsize)/prod(1:sampsize)
      ncols <- sampsize
      yy <- matrix(nrow = nrows, ncol = ncols)
      if(sampsize == 1) {
        yy <- popvals
      }
      else {
        nvals <- popsize - sampsize + 1
        nrows <- prod(nvals:popsize)/prod(1:sampsize)
        ncols <- sampsize
        yy <- matrix(nrow = nrows, ncol = ncols)
        rep1 <- rep(1, nvals)
        if(nvals > 1) {
          for(i in 2:nvals)
            rep1[i] <- (rep1[i - 1] * (sampsize + i - 2))/(i - 1)
        }
        rep1 <- rev(rep1)
        yy[, 1] <- rep(popvals[1:nvals], rep1)
        for(i in 1:nvals) {
          yy[yy[, 1] == popvals[i], 2:ncols] <- Recall(popvals[(i + 1):popsize], sampsize - 1)
        }
      }
      yy
    }

    temp1 <- samplist(c(1,2,4,4,7,7,7,8), 3)
    temp2 <- apply(temp1, 1, mean)
    table(temp2)
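For readers without R, the same enumeration can be sketched in Python with the standard library. This is an illustrative alternative, not the author's code: `srs_mean_distribution` is a hypothetical helper name, and it relies on `itertools.combinations` rather than a hand-written recursion. Units are enumerated by position so that the repeated values 4 and 7 are treated as distinct population units.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def srs_mean_distribution(population, n):
    """Sampling distribution of the sample mean under SRS without replacement.

    Samples are built from unit positions, so equal y-values on different
    units still count as different samples.
    """
    samples = list(combinations(range(len(population)), n))
    counts = Counter(Fraction(sum(population[i] for i in s), n) for s in samples)
    return {mean: Fraction(c, len(samples)) for mean, c in counts.items()}


population = [1, 2, 4, 4, 7, 7, 7, 8]
dist = srs_mean_distribution(population, 3)

# Mean and variance of the sampling distribution, as exact fractions.
expected = sum(k * p for k, p in dist.items())
variance = sum(p * (k - expected) ** 2 for k, p in dist.items())

for k in sorted(dist):
    print(f"P(ybar = {float(k):.3f}) = {dist[k]}")
print("E[ybar] =", expected, " V[ybar] =", float(variance))
```

Because the probabilities are kept as fractions, the variance comes out exactly as 10/7, which rounds to the 1.429 quoted in the solution.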

The following, then, is the sampling distribution of $\bar{y}$.

$k$      $P(\bar{y} = k)$
2 1/3    2/56
3        1/56
3 1/3    4/56
3 2/3    1/56
4        6/56
4 1/3    8/56
4 2/3    2/56
5        6/56
5 1/3    7/56
5 2/3    3/56
6        6/56
6 1/3    6/56
7        1/56
7 1/3    3/56

Using the sampling distribution,

$E[\bar{y}] = \frac{2}{56}\left(2\tfrac{1}{3}\right) + \cdots + \frac{3}{56}\left(7\tfrac{1}{3}\right) = 5$.

The variance of $\bar{y}$ for an SRS without replacement of size 3 is

$V[\bar{y}] = \frac{2}{56}\left(2\tfrac{1}{3} - 5\right)^2 + \cdots + \frac{3}{56}\left(7\tfrac{1}{3} - 5\right)^2 = 1.429$.

Of course, this variance could have been more easily calculated using the formula in (2.7):

$V[\bar{y}] = \left(1 - \frac{n}{N}\right)\frac{S^2}{n} = \left(1 - \frac{3}{8}\right)\frac{6.8571429}{3} = 1.429$.

(b) A total of $8^3 = 512$ samples are possible when sampling with replacement. Fortunately, we need not list all of these to find the sampling distribution of $\bar{y}$. Let $X_i$ be the value of the $i$th unit drawn. Since sampling is done with replacement, $X_1$, $X_2$, and $X_3$ are independent; $X_i$ ($i = 1, 2, 3$) has distribution

$k$    $P(X_i = k)$
1      1/8
2      1/8
4      2/8
7      3/8
8      1/8

Using the independence, then, we have the following probability distribution for $\bar{X}$, which serves as the sampling distribution of $\bar{y}$.

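$k$      $P(\bar{y} = k)$      $k$      $P(\bar{y} = k)$
1        1/512                 4 2/3    12/512
1 1/3    3/512                 5        63/512
1 2/3    3/512                 5 1/3    57/512
2        7/512                 5 2/3    21/512
2 1/3    12/512                6        57/512
2 2/3    6/512                 6 1/3    36/512
3        21/512                6 2/3    6/512
3 1/3    33/512                7        27/512
3 2/3    15/512                7 1/3    27/512
4        47/512                7 2/3    9/512
4 1/3    48/512                8        1/512

The with-replacement variance of $\bar{y}$ is

$V_{wr}[\bar{y}] = \frac{1}{512}(1 - 5)^2 + \cdots + \frac{1}{512}(8 - 5)^2 = 2$.

Or, using the formula with population variance (see Exercise 2.28),

$V_{wr}[\bar{y}] = \frac{1}{n}\sum_{i=1}^{N}\frac{(y_i - \bar{y}_U)^2}{N} = \frac{6}{3} = 2$.

2.5 (a) The sampling weight is $100/30 = 3.3333$.

(b) $\hat{t} = \sum_{i \in S} w_i y_i = 823.33$.

(c) $\hat{V}(\hat{t}) = N^2\left(1 - \frac{n}{N}\right)\frac{s_y^2}{n} = 100^2\left(1 - \frac{30}{100}\right)\frac{15.9781609}{30} = 3728.238$, so

$\text{SE}(\hat{t}) = \sqrt{3728.238} = 61.0593$

and a 95% CI for $t$ is

$823.33 \pm (2.045230)(61.0593) = 823.33 \pm 124.8803 = [698.45, 948.21]$.

The fpc is $(1 - 30/100) = .7$, so it reduces the width of the CI.

2.6 (a)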

The data are quite skewed because 28 faculty have no publications.

(b) $\bar{y} = 1.78$; $s = 2.682$; $\text{SE}[\bar{y}] = \dfrac{2.682}{\sqrt{50}}\sqrt{1 - \dfrac{50}{807}} = 0.367$.

(c) No; a sample of size 50 is probably not large enough for $\bar{y}$ to be normally distributed, because of the skewness of the original data.

The sample skewness of the data is (from SAS) 1.593. This can be calculated by hand, finding

$\frac{1}{n}\sum_{i \in S}(y_i - \bar{y})^3 = 28.9247040$

so that the skewness is $28.9247040/(2.682^3) = 1.499314$. Note this estimate differs from SAS PROC UNIVARIATE since SAS adjusts for df using the formula

skewness $= \dfrac{n}{(n-1)(n-2)}\sum_{i \in S}(y_i - \bar{y})^3 / s^3$.

Whichever estimate is used, however, formula (2.23) says we need a minimum of

$28 + 25(1.5)^2 = 84$

observations to use the central limit theorem.

(d) $\hat{p} = 28/50 = 0.56$.

$\text{SE}(\hat{p}) = \sqrt{\dfrac{(0.56)(0.44)}{49}\left(1 - \dfrac{50}{807}\right)} = 0.0687$.

A 95% confidence interval is

$0.56 \pm 1.96(0.0687) = [0.425, 0.695]$.
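The arithmetic in parts (b) and (c) of 2.6 can be reproduced from the summary statistics alone. The sketch below is an illustration under that assumption: the raw publication counts are not available here, so the values of $n$, $N$, $s$, and the mean cubed deviation are copied from the solution, and the variable names are made up for the example.

```python
import math

# Summary statistics quoted in the solution to 2.6 (not recomputed from raw data).
n, N = 50, 807        # sample size and population size
s = 2.682             # sample standard deviation
m3 = 28.9247040       # (1/n) * sum of (y_i - ybar)^3

# (b) Standard error of ybar with the finite population correction.
se_mean = s / math.sqrt(n) * math.sqrt(1 - n / N)

# (c) Skewness computed "by hand", and the df-adjusted version that the
# solution attributes to SAS PROC UNIVARIATE.
skew_simple = m3 / s**3
skew_adjusted = n / ((n - 1) * (n - 2)) * (n * m3) / s**3

# Rule of thumb from formula (2.23): minimum n = 28 + 25 * skewness^2,
# evaluated at skewness 1.5 as in the solution.
n_min = 28 + 25 * 1.5**2

print(f"SE[ybar]          = {se_mean:.3f}")
print(f"skewness (simple) = {skew_simple:.4f}")
print(f"skewness (df adj) = {skew_adjusted:.4f}")
print(f"minimum n         = {n_min:.2f}")
```

Both skewness estimates are close to 1.5, which is why the choice between them does not change the conclusion that 50 observations are too few.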

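2.7 (a) A 95% confidence interval for the proportion of entries from the South is

$\dfrac{175}{1000} \pm 1.96\sqrt{\dfrac{\frac{175}{1000}\left(1 - \frac{175}{1000}\right)}{1000}} = [.151, .199]$.

(b) As 0.309 is not in the confidence interval, there is evidence that the percentages differ.

2.8 Answers will vary.

2.9 If $n_0 \le N$, then, with $n = \dfrac{n_0}{1 + n_0/N}$ and $n_0 = \left(\dfrac{z_{\alpha/2} S}{e}\right)^2$,

$z_{\alpha/2}\sqrt{1 - \dfrac{n}{N}}\,\dfrac{S}{\sqrt{n}} = z_{\alpha/2}\sqrt{1 - \dfrac{n_0}{N\left(1 + \frac{n_0}{N}\right)}}\;\dfrac{S}{\sqrt{n_0}}\sqrt{1 + \dfrac{n_0}{N}} = z_{\alpha/2}\sqrt{1 + \dfrac{n_0}{N} - \dfrac{n_0}{N}}\;\dfrac{S}{\sqrt{n_0}} = \dfrac{z_{\alpha/2} S}{z_{\alpha/2} S / e} = e$.

2.10 Design 3 gives the most precision because its sample size is largest, even though it is a small fraction of the population. Here are the variances of $\bar{y}$ for the three samples:

Sample Number   $V(\bar{y})$
1               $(1 - 400/4000)\,S^2/400 = 0.00225\,S^2$
2               $(1 - 30/300)\,S^2/30 = 0.03\,S^2$
3               $(1 - 3000/300{,}000{,}000)\,S^2/3000 = 0.00033333\,S^2$

2.11 (a)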

[Histogram: frequency versus age (months)]

The histogram appears skewed with tail on the right. With a mildly skewed distribution, though, a sample of size 240 is large enough that the sample mean should be normally distributed.

(b) $\bar{y} = 12.07917$; $s^2 = 3.705003$; $\text{SE}[\bar{y}] = \sqrt{s^2/n} = 0.12425$.

(Since we do not know the population size, we ignore the fpc, at the risk of a slightly-too-large standard error.)

A 95% confidence interval is

$12.08 \pm 1.96(0.12425) = [11.84, 12.32]$.

(c) $n = \dfrac{(1.96)^2(3.705)}{(0.5)^2} = 57$.

2.12 (a) Using (2.17) and choosing the maximum possible value of $(0.5)^2$ for $S^2$,

$n_0 = \dfrac{(1.96)^2 S^2}{e^2} = \dfrac{(1.96)^2(0.5)^2}{(0.1)^2} = 96.04$.

Then

$n = \dfrac{n_0}{1 + n_0/N} = \dfrac{96.04}{1 + 96.04/580} = 82.4$.

(b) Since sampling is with replacement, no fpc is used. An approximate 95% confidence interval for the proportion of children not overdue for vaccination is

$\dfrac{27}{120} \pm 1.96\sqrt{\dfrac{\frac{27}{120}\left(1 - \frac{27}{120}\right)}{120}} = [0.15, 0.30]$.
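The sample-size calculation in 2.12(a) and the interval in 2.12(b) can be sketched as two small Python functions. This is a hedged illustration, not code from the manual: `srs_sample_size` and `proportion_ci_wr` are hypothetical names, and the inputs (z = 1.96, S² = 0.25, e = 0.1, N = 580, 27 of 120) are the values stated in the solution.

```python
import math


def srs_sample_size(z, s2, e, N):
    """Return (n0, n): n0 = z^2 * S^2 / e^2 ignores the fpc,
    and n = n0 / (1 + n0/N) adjusts it for a population of size N."""
    n0 = z**2 * s2 / e**2
    n = n0 / (1 + n0 / N)
    return n0, n


def proportion_ci_wr(successes, n, z=1.96):
    """Approximate CI for a proportion when sampling with replacement,
    using p(1 - p)/n as the variance (no fpc), as in 2.12(b)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


n0, n = srs_sample_size(z=1.96, s2=0.5**2, e=0.1, N=580)
low, high = proportion_ci_wr(27, 120)

print(f"n0 = {n0:.2f}, n = {n:.1f}")
print(f"95% CI: [{low:.2f}, {high:.2f}]")
```

In practice the adjusted size would be rounded up to the next whole unit, so 82.4 corresponds to a sample of 83.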

2.13 (a) We have $\hat{p} = .2$ and

$\hat{V}(\hat{p}) = \left(1 - \dfrac{745}{2700}\right)\dfrac{(.2)(.8)}{744} = 0.0001557149$,

so an approximate 95% CI is

$0.2 \pm 1.96\sqrt{0.0001557149} = [.176, .224]$.

(b) The above analysis is valid only if the respondents are a random sample of the selected sample. If respondents differ from the nonrespondents (for example, if the nonrespondents are more likely to have been bullied), then the entire CI may be biased.

2.14 Here is SAS output:

    The SURVEYMEANS Procedure

    Data Summary
    Number of Observations    150
    Sum of Weights            864

    Class Level Information
    Class Variable   Levels   Values
    sex              2        f m

    Statistics
    Variable  Level  Mean       Std Error of Mean  95% CL for Mean
    ----------------------------------------------------------------
    sex       f      0.306667   0.034353           0.23878522  0.37454811
              m      0.693333   0.034353           0.62545189  0.76121478

    Statistics
    Variable  Level  Sum          Std Dev     95% CL for Sum
    ----------------------------------------------------------------
    sex       f      264.960000   29.680756   206.310434  323.609566
              m      599.040000   29.680756   540.390434  657.689566

2.15 (a) $\bar{y} = 301{,}953.7$, $s^2 = 118{,}907{,}450{,}529$.
CI: $301953.7 \pm 1.96\sqrt{\dfrac{s^2}{300}\left(1 - \dfrac{300}{3078}\right)}$, or [264883, 339025].

(b) $\bar{y} = 599.06$, $s^2 = 161795.4$.
CI: [556, 642]

(c) $\bar{y} = 56.593$, $s^2 = 5292.73$.
CI: [48.8, 64.4]

(d) $\bar{y} = 46.823$, $s^2 = 4398.199$.
CI: [39.7, 54.0]

2.16 (a) The data appear skewed with tail on right.

(b) $\bar{y} = 5309.8$, $s^2 = 3{,}274{,}784$, $\text{SE}[\bar{y}] = 164.5$.

Here is SAS code for problems 2.16 and 2.17:

    filename golfsrs 'C:\golfsrs.csv';
    options ls=78 nodate nocenter;

    data golfsrs;
      infile golfsrs delimiter="," dsd firstobs=2;
      /* The dsd option allows SAS to read the missing values between
         successive delimiters */
      sampwt = 14938/120;
      input RN state $ holes type $ yearblt wkday18 wkday9 wkend18
            wkend9 backtee rating par cart18 cart9 caddy $ pro $;

    /* Make sure the data were read in correctly */
    proc print data=golfsrs;
    run;

    proc univariate data=golfsrs;
      var wkday9 backtee;
      histogram wkday9 / endpoints = 0 to 110 by 10;
      histogram backtee / endpoints = 0 to 8000 by 500;
    run;

    proc surveymeans data=golfsrs total=14938;
      weight sampwt;
      var wkday9 backtee;
    run;

2.17 (a) The data appear skewed with tail on left.

(b) $\bar{y} = 5309.8$, $s^2 = 3{,}274{,}784$, $\text{SE}[\bar{y}] = 164.5$.

2.18 $\hat{p} = 85/120 = 0.708$.

95% CI: $\dfrac{85}{120} \pm 1.96\sqrt{\dfrac{\frac{85}{120}\left(1 - \frac{85}{120}\right)}{119}\left(1 - \dfrac{120}{14938}\right)} = .708 \pm .081$, or [0.627, 0.790].
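As a cross-check on the interval in 2.18, the calculation can be written out in Python. This is a rough sketch under the assumptions stated in the solution (an SRS of n = 120 from N = 14938, with 85 sampled units having the attribute); `proportion_ci_srs` is a hypothetical helper, not a function from SAS or any survey library.

```python
import math


def proportion_ci_srs(successes, n, N, z=1.96):
    """Estimate a proportion from an SRS without replacement.

    Uses the variance estimate p(1 - p)/(n - 1) multiplied by the
    finite population correction (1 - n/N), and returns the estimate
    together with the half-width of the approximate CI.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / (n - 1) * (1 - n / N))
    return p, half_width


n, N = 120, 14938
sampling_weight = N / n          # matches sampwt = 14938/120 in the SAS code
p_hat, half_width = proportion_ci_srs(85, n, N)

print(f"sampling weight = {sampling_weight:.4f}")
print(f"p-hat = {p_hat:.3f} +/- {half_width:.3f}")
print(f"95% CI: [{p_hat - half_width:.3f}, {p_hat + half_width:.3f}]")
```

With a sampling fraction under 1%, the fpc is about 0.992, so it changes the half-width only in the fourth decimal place.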