Solution Manual for Statistical Methods for the Social Sciences, 5th Edition

Solution Manual for Statistical Methods for the Social Sciences, 5th Edition makes tackling textbook exercises a breeze, with clear and concise answers to every problem.

Madison Taylor
Contributor

SOLUTIONS MANUAL
JAMES LAPP

STATISTICAL METHODS FOR THE SOCIAL SCIENCES
FIFTH EDITION

Alan Agresti
University of Florida

CONTENTS

Chapter 1: Introduction
Chapter 2: Sampling and Measurement
Chapter 3: Descriptive Statistics
Chapter 4: Probability Distributions
Chapter 5: Statistical Inference: Estimation
Chapter 6: Statistical Inference: Significance Tests
Chapter 7: Comparison of Two Groups
Chapter 8: Analyzing Association Between Categorical Variables
Chapter 9: Linear Regression and Correlation
Chapter 10: Introduction to Multivariate Relationships
Chapter 11: Multiple Regression and Correlation
Chapter 12: Regression with Categorical Predictors: Analysis of Variance Methods
Chapter 13: Multiple Regression with Quantitative and Categorical Predictors
Chapter 14: Model Building with Multiple Regression
Chapter 15: Logistic Regression: Modeling Categorical Responses
Chapter 16: An Introduction to Advanced Methodology

Chapter 1: Introduction

1.1.
(a) An individual Prius (automobile).
(b) All Prius automobiles used in the EPA tests.
(c) All Prius automobiles that are or may be manufactured.

1.2.
(a) The population is all 7.3 million voters. The sample is the 1824 voters surveyed.
(b) A statistic is the 60.5% who voted for Brown in the exit poll sample of size 1824; a parameter is the 60.0% who actually voted for Brown.

1.3.
(a) All students at the University of Wisconsin.
(b) A statistic, since it is calculated only for the 100 sampled students.

1.4. The values are statistics, since they are based on the 1028 adults in the sample.

1.5.
(a) All adult Americans.
(b) The proportion of all adult Americans who would answer definitely or probably true.
(c) The sample proportion 0.523 estimates the population proportion.
(d) No, it is a prediction of the population value but will not equal it exactly, because the sample is only a very small subset of the population.

1.6.
(a) The most common response was 2 hours per day.
(b) This is a descriptive statistic because it describes the results of a sample.

1.7.
(a) A total of 85.7% said “yes, definitely” or “yes, probably.”
(b) In 1998, a total of 85.8% said “yes, definitely” or “yes, probably.”
(c) A total of 74.4% said “yes, definitely” or “yes, probably.” The percentages of yes responses were higher for HEAVEN than for HELL.

1.8.
(a) Statistics, since they are based on a sample of 60,000 households rather than all households.
(b) Inferential, predicting for a population using sample information.

1.9. The correct answer is (a).

1.10.
Race      Age  Sentence  Felony?  Prior Arrests  Prior Convictions
white     19   2         no       2              1
black     23   1         no       0              0
white     38   10        yes      8              3
Hispanic  20   2         no       1              1
white     41   5         yes      5              4

1.11.
(a) There are 60 rows in the data.
(b) Answers will vary.

1.12. Answers will vary.

1.13. Answers will vary.

1.14.
(a) A statistic is a numerical summary of the sample data, while a parameter is a numerical summary of the population. For example, consider an exit poll of voters on election day. The proportion voting for a particular candidate is a statistic. Once all of the votes have been counted, the proportion of voters who voted for that candidate would be known (and is the parameter).
(b) Description deals with describing the available data (sample or population), whereas inference deals with making predictions about a population using information in the sample. For example, consider a sample of voters on election day. One could use descriptive statistics to describe the voters in terms of gender, race, party, etc., and inferential statistics to predict the winner of the election.

1.15. If you have a census, you do not need to use the information from a sample to describe the population, since you have information from the population as a whole.

1.16.
(a) The descriptive part of this example is that the average age in the sample is 24.1 years.
(b) The inferential part of this example is that the sociologist estimates the average age of brides at marriage for the population to be between 23.5 and 24.7 years.
(c) The population of interest is women in New England in the early eighteenth century.

1.17.
(a) A statistic is the 78% of the sample of subjects interviewed in the UK who said yes.
(b) A parameter is the true percent of the 50 million adults in the UK who would say yes.
(c) A descriptive analysis is that the percentage of yes responses in the survey varied from 56% (in Denmark) to 95% (in Cyprus).
(d) An inferential analysis is that the percentage of adults in the UK who would say yes falls between 75% and 81%.

1.18. Answers will vary.

1.19. Answers will vary.

Chapter 2: Sampling and Measurement

2.1.
(a) Discrete variables take a finite set of values (or possibly all nonnegative integers), and we can enumerate them all. Continuous variables take an infinite continuum of values.
(b) Categorical variables have a scale that is a set of categories; for quantitative variables, the measurement scale has numerical values that represent different magnitudes of the variable.
(c) Nominal variables have a scale of unordered categories, whereas ordinal variables have a scale of ordered categories. The distinctions among types of variables are important in determining the appropriate descriptive and inferential procedures for a statistical analysis.

2.2.
(a) quantitative
(b) categorical
(c) categorical
(d) quantitative
(e) categorical
(f) quantitative
(g) categorical
(h) quantitative
(i) categorical

2.3.
(a) ordinal
(b) nominal
(c) interval
(d) nominal
(e) nominal
(f) ordinal
(g) interval
(h) ordinal
(i) nominal
(j) interval
(k) ordinal

2.4.
(a) nominal
(b) nominal
(c) ordinal
(d) interval
(e) interval
(f) interval
(g) ordinal
(h) interval
(i) nominal
(j) interval

2.5.
(a) interval
(b) ordinal
(c) nominal

2.6.
(a) state of residence
(b) number of siblings
(c) social class (high, medium, low)
(d) student status (full time, part time)
(e) number of cars owned
(f) time (in minutes) needed to complete an exam
(g) number of siblings

2.7.
(a) Ordinal, since there is a sense of order to the categories.
(b) Discrete.
(c) These values are statistics, because they apply to a sample of size 1962, not the entire population.

2.8. Ordinal.

2.9. The correct responses are (b), (c), (d), (e), and (f).

2.10. The correct responses are (a), (c), (e), and (f).

2.11. Answers will vary.

2.12. Number names 00001 to 52000. Answers will vary 6907.

2.13.
(a) observational study
(b) experiment
(c) observational study
(d) experiment

2.14.
(a) Experimental study, since the researchers are assigning subjects to treatments.
(b) An observational study could look at those who grew up in nonsmoking or smoking environments and examine incidence of lung cancer.

2.15.
(a) Sample-to-sample variability causes the results to vary.
(b) The sampling error for the Gallup poll is -2.1% for Obama and 2.8% for Romney.

2.16.
(a) This is a volunteer sample because viewers chose whether to call in.
(b) The mail-in questionnaire is a volunteer sample because readers chose whether to respond.

2.17. The first question is confusing in its wording. The second question has clearer wording.

2.18.
(a) The skip number is k = 52,000/5 = 10,400. Randomly select one of the first 10,400 names and then skip 10,400 names to get each of the next names. For example, if the first name picked is 01536, the other four names are 01536 + 10400 = 11936, 11936 + 10400 = 22336, 22336 + 10400 = 32736, and 32736 + 10400 = 43136.
(b) We could treat the pages as clusters. We would select a random sample of pages, and then sample every name on the pages selected. Its advantage is that it is much easier to select the sample than it is with random sampling. A disadvantage is as follows: Suppose there are 100 “Martinez” listings in the directory, all falling on the same page. Then with cluster sampling, either all or none of the Martinez families would end up in the sample. If they are all sampled, certain traits which they might have in common (perhaps, e.g., religious affiliation) might be over-represented in the sample.

2.19. Draw a systematic sample from the student directory, using skip number k = 5000/100 = 50.

2.20.
(a) This is not a simple random sample, since the sample will necessarily have 25 blacks and 25 whites. A simple random sample may or may not have exactly 25 blacks and 25 whites.
(b) This is stratified random sampling. You ensure that neither blacks nor whites are over-sampled.

2.21.
(a) The clusters.
(b) The subjects within every stratum.
(c) The main difference is that a stratified random sample uses every stratum, and we want to compare the strata. By contrast, we have a sample of clusters, and not all clusters are represented; the goal is not to compare the clusters but to use them to obtain a sample.

2.22.
(a) Categorical are GE, VE, AB, PI, PA, RE, LD, AA; quantitative are AG, HI, CO, DH, DR, NE, TV, SP, AH.
(b) Nominal are GE, VE, AB, PA, LD, AA; ordinal are PI and RE; interval are AG, HI, CO, DH, DR, NE, TV, SP, AH.

2.23. Answers will vary.

2.24.
(a) Draw a systematic sample from the student directory, using skip number k = N/100, where N = number of students on the campus.
(b) High school GPA on a 4-point scale, treated as quantitative, interval, continuous; math and verbal SAT on a 200 to 800 scale, treated as quantitative, interval, continuous; whether work to support study (yes, no), treated as categorical, nominal, discrete; time spent studying in average day, on scale (none, less than 2 hours, 2–4 hours, more than 4 hours), treated as quantitative, ordinal, discrete.

2.25. This is nonprobability sampling; certain segments may be over- or under-represented, depending on where the interviewer stands, time of day, etc. Quota sampling fails to incorporate randomization into the selection method.

2.26. Responses can be highly dependent on nonsampling errors such as question wording.
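The skip-number procedure in 2.18(a) and 2.19 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `systematic_sample` and its parameters are hypothetical, and names are represented by their 1-based position in the directory.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return 1-based list positions chosen by systematic sampling.

    The skip number is k = population_size // sample_size. A random start
    is drawn from the first k positions unless one is supplied.
    (Function and parameter names are illustrative, not from the text.)
    """
    k = population_size // sample_size  # skip number
    if start is None:
        start = random.Random(seed).randint(1, k)  # one of the first k names
    return [start + i * k for i in range(sample_size)]


# Exercise 2.18(a): 52,000 names, sample of 5, first name picked is 01536.
positions = systematic_sample(52_000, 5, start=1536)
print([f"{p:05d}" for p in positions])
```

With `start=1536` the positions reproduce the five names listed in 2.18(a); omitting `start` draws the first name at random from positions 1 through k.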

2.27.
(a) This is a volunteer sample, so results are unreliable; e.g., there is no way of judging how close 93% is to the actual population who believe that benefits should be reduced.
(b) This is a volunteer sample; perhaps an organization opposing gun control laws has encouraged members to send letters, resulting in a distorted picture for the congresswoman. The results are completely unreliable as a guide to views of the overall population. She should take a probability sample of her constituents to get a less biased reaction to the issue.
(c) The physical science majors who take the course might tend to be different from the entire population of physical science majors (perhaps more liberal minded on sexual attitudes, for example). Thus, it would be better to take random samples of students of the two majors from the population of all social science majors and all physical science majors at the college.
(d) There would probably be a tendency for students within a given class to be more similar than students in the school as a whole. For example, if the chosen first period class consists of college-bound seniors, the members of the class will probably tend to be less opposed to the test than would be a class of lower achievement students planning to terminate their studies with high school. The design could be improved by taking a simple random sample of students, or a larger random sample of classes with a random sample of students then being selected from each of those classes (a two-stage random sample).

2.28. A systematic sample with a skip number of 7 (or a multiple of 7) would be problematic, since the sampled editions would all be from the same day of the week (e.g., Friday). The day of the week may be related to the percentage of newspaper space devoted to news about entertainment.

2.29. Because of skipping names, two subjects listed next to each other on the list cannot both be in the sample, so not all samples are equally likely.

2.30. If we do not take a disproportional stratified random sample, we might not have enough Native Americans in our sample to compare their views to those of other Americans.

2.31. If a subject is in one of the clusters that is not chosen, then this subject can never be in the sample. Not all samples are equally likely.

2.32. Answers will vary.

2.33. The nursing homes can be regarded as clusters. A systematic random sample is taken of the clusters, and then a simple random sample is taken of residents from within the selected clusters.

2.34. The best answer is (b).

2.35. The best answer is (c).

2.36. The best answer is (c).

2.37. The best answer is (a).

2.38. False; this is a convenience sample.

2.39. False; this is a voluntary response sample.

2.40. An annual income of $40,000 is twice the annual income of $20,000. However, 70 degrees Fahrenheit is not twice as hot as 35 degrees Fahrenheit. (Note that income has a meaningful zero and temperature does not.) IQ is not a ratio-scale variable.

Chapter 3: Descriptive Statistics

3.1.
(a)
Place of Birth    Number (Millions)  Relative Frequency
Europe            4.5                4.5/37.6 = 12.0%
Asia              10.1               10.1/37.6 = 26.9%
Caribbean         3.6                3.6/37.6 = 9.6%
Central America   14.4               14.4/37.6 = 38.3%
South America     2.4                2.4/37.6 = 6.4%
Other             2.6                2.6/37.6 = 6.9%
Total             37.6
(b) [Bar graph: number (millions) by place of birth]
(c) “Place of birth” is categorical.
(d) The mode is Central America.

3.2.
(a)
Religion      Number (Billions)  Relative Frequency
Christianity  2.2                2.2/5.3 = 41.5%
Islam         1.6                1.6/5.3 = 30.2%
Hinduism      1.0                1.0/5.3 = 18.9%
Buddhism      0.5                0.5/5.3 = 9.4%
Total         5.3
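The relative frequencies in the table for 3.2(a) are each count divided by the total. A minimal Python sketch of that arithmetic follows; the helper name `relative_frequencies` is an assumption made for illustration, and the counts are the four values from the table.

```python
def relative_frequencies(counts):
    """Map each category to its percentage of the total, to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts (in billions) from the table in 3.2(a).
religions = {"Christianity": 2.2, "Islam": 1.6, "Hinduism": 1.0, "Buddhism": 0.5}

for name, pct in relative_frequencies(religions).items():
    print(f"{name}: {pct}%")

# The mode of a categorical variable is the category with the largest count.
print("mode:", max(religions, key=religions.get))
```

The same helper applies unchanged to the place-of-birth counts in 3.1(a).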

3.2. (continued)
(b) [Bar graph: number (billions) by religion]
(c) The mean or median cannot be calculated for these data since they are categorical. The mode of these four religions is Christianity. Christianity is also the mode of all religions.

3.3.
(a) There are 33 students. The minimum score is 65, and the maximum score is 98.
(b) [Histogram: frequency by midterm score]

3.4.
(a)
Persons per Household  Number (Millions)  Relative Frequency
1                      30.1               30.1/110.4 = 27.3%
2                      37.1               37.1/110.4 = 33.6%
3                      17.8               17.8/110.4 = 16.1%
4                      15.0               15.0/110.4 = 13.6%
5 or more              10.4               10.4/110.4 = 9.4%
Total                  110.4

3.4. (continued)
(b) The distribution is right skewed.
[Bar graph: number (millions) by persons per household]
(c) The median household size is 2 persons, and the mode is also 2 persons.

3.5.
(a)
Murder Rate  Frequency  Relative Frequency
1+           7          14%
2+           11         22%
3+           8          16%
4+           7          14%
5+           9          18%
6+           6          12%
7+           1          2%
8+           0          0%
9+           0          0%
10+          1          2%
Total        50
(b) [Histogram: frequency by murder rate (per 100,000 population)]
The distribution appears to be somewhat skewed right, with an outlier at 10.8.

3.5. (continued)
(c)
Stem (1)  Leaves (0.1)
1         4567778
2         00122344899
3         13348899
4         2356678
5         001444568
6         012445
7         2
8
9
10        8
The stem-and-leaf plot looks like the histogram turned on its side.

3.6.
(a) GDP is rounded to the nearest thousand.
Stem (10,000s)  Leaves (1,000s)
1               9
2               66
3               2456678
4               0033333444
5               34
6               5
7
8
9               1
(b) [Histogram: frequency by rounded GDP (thousands)]
(c) The outlier in each plot is Luxembourg.

3.7.
(a) The mean is 129.5 and the median is 102.
(b) The mean would be higher because of the skew to the right and the extreme outlier (716 for the U.S.).
(c) Without the United States, the mean is 104 and the median is 98. There is a greater effect on the mean.

3.8.
(a) The mean is (0.4 + 2.2 + 6.2 + 1.7 + 1.8 + 0.9 + 12.2 + 17.6)/8 = 5.375 metric tons per person. The median is 2.0 metric tons per person.
(b) The mean is now (0.4 + 2.2 + 6.2 + 1.7 + 1.8 + 0.9 + 12.2 + 17.6 + 40.3)/9 = 9.26 metric tons per person. The median is 2.2 metric tons per person. Qatar had a greater impact on the mean.

3.9.
(a) The response “not far enough” is the mode.
(b) The median is “not far enough.” We cannot calculate the mean with these data since they are categorical.

3.10.
(a) For the data from the previous 2, the mean is 16.6 days, the median is 12 days, and the standard deviation is 13.9. For the data from 25 years ago, the mean was 27.6 days, the median was 24 days, and the standard deviation was 12.4. The mean has decreased by 11 days, and the median has decreased by 12 days since 25 years ago. The variability in length of stay was slightly lower 25 years ago.
(b) Of the 11 observations, the median is 13 days. We cannot calculate the mean, but substituting 40 for the censored observation gives a mean of 18.7 days.

3.11.
(a)
TV Hours  Frequency  Relative Frequency (%)
0         120        7.2
1         310        18.6
2         437        26.2
3         293        17.6
4         227        13.6
5         113        6.8
6         75         4.5
7+        94         5.6
Total     1669       100.1
(b) The distribution is unimodal and right skewed.
(c) The median is the 835th data value, which is 2.
(d) The mean is larger than 2 because the data are skewed right by a few high values.

3.12.
[Back-to-back stem-and-leaf plot: Eastern Europe (left leaves) and Middle East (right leaves); stem unit 10, leaf unit 1]
Eastern Europe: mean = 77.7, standard deviation = 5.4, minimum = 67.0, Q1 = 75.0, median = 79.0, Q3 = 81.0, maximum = 87.0
Middle East: mean = 37.5, standard deviation = 20.0, minimum = 4.0, Q1 = 29.0, median = 38.5, Q3 = 42.0, maximum = 81.0
Female economic activity seems greater, on average, in Eastern Europe than in the Middle East. Except for standard deviation, all of the descriptive statistics in Eastern Europe exceed the descriptive statistics in the Middle East. There appear to be more women in the labor force (per 100 men) in Eastern Europe than in the Middle East.
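The effect of a single extreme value on the mean and the median, as in 3.8, can be checked with Python's standard `statistics` module. This is a sketch: the list holds the eight values that appear in the solution to 3.8(a), 40.3 is the Qatar value added in part (b), and the variable names are assumptions chosen for illustration.

```python
import statistics

# Per-person values (metric tons) as listed in the solution to 3.8(a).
emissions = [0.4, 2.2, 6.2, 1.7, 1.8, 0.9, 12.2, 17.6]
qatar = 40.3  # additional observation from 3.8(b)

mean_a = statistics.mean(emissions)
median_a = statistics.median(emissions)

with_qatar = emissions + [qatar]
mean_b = statistics.mean(with_qatar)
median_b = statistics.median(with_qatar)

print(f"(a) mean = {mean_a:.3f}, median = {median_a:.1f}")
print(f"(b) mean = {mean_b:.2f}, median = {median_b:.1f}")

# The mean moves far more than the median when the outlier is added.
print(f"change in mean = {mean_b - mean_a:.2f}, change in median = {median_b - median_a:.1f}")
```

The median shifts by only 0.2 while the mean rises by nearly 4, which is the sense in which the median is resistant to outliers.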

3.13. Since the mean is much greater than the median, the distribution of 2010 household income in Canada is most likely skewed to the right.

3.14.
(a) The median is “2 or 3 times a month.” The mode is “not at all.” The data are centered around the respondents having sex about 2 or 3 times a month in the past 12 months. The most frequent answer to the question is “not at all.”
(b) The sample mean is 3.9, which means that, on average, the respondents had sex about 4 times a month in the past 12 months.

3.15.
(a) The mode is “never.” The median is “once a week.”
(b) The mean is 2.4 times per week, which is lower than the 5.4 times a week in 1972.

3.16.
(a) For each gender, the distribution of earnings is skewed to the right, since each mean is greater than its respective median.
(b) The overall mean income is ($31,968 × 110 + $50,779 × 109)/(110 + 109) ≈ $41,330.

3.17.
(a) The response variable is median family income, and the explanatory variable is race.
(b) We cannot find the median income for the combined groups since we do not know how many families are in each group.
(c) We would need to know how many families were in each group.

3.18.
(a) The distribution is skewed to the right.
(b) The Empirical Rule only applies to bell-shaped distributions, so it does not apply here.
(c) The median is 0. If the 500 observations were to shift from 0 to 6, the median would remain zero, since half of the data values fall below 0 and half fall above 0. This illustrates the resistance of the median to skewness and extreme values.

3.19.
(a) Median: $11.19; mean: $11.31; range: $8.36; standard deviation: $3.59.
(b) Median: $9.85; mean: $9.17; range: $14.99; standard deviation: $5.70. The median is resistant to outliers, but the mean, range, and standard deviation are highly impacted by outliers.

3.20.
(a) Mean: 30; standard deviation: 9.0.
(b) Minimum: 13; lower quartile: 25.5; median: 31; upper quartile: 36; maximum: 42.

3.21.
(a) The standard deviation would decrease since the United States is farthest from the mean.
(b) The standard deviation would increase since Australia is at the mean.

3.22.
(a) The life expectancies in Africa vary more than the life expectancies in Western Europe, because the life expectancies for the African countries are more spread out than those for the Western European countries. Therefore, the standard deviation will be larger for African nations.
(b) The standard deviation is 0.96 for the Western European nations and 4.22 for the African nations, which is larger.

3.23.
(a) (i) $51,000 to $71,000; (ii) $41,000 to $81,000; (iii) $31,000 to $91,000.
(b) A salary of $100,000 would be unusual because it is 3.9 standard deviations above the mean.

3.24.
(a) Approximately 68% of the values are contained in the interval 32 to 38 days; approximately 95% of the values are contained in the interval 29 to 41 days; all or nearly all of the values are contained in the interval 26 to 44 days.
(b) (i) The mean would decrease if the observation for the U.S. was included. (ii) The standard deviation would increase if the observation for the U.S. was included.
(c) The U.S. observation is 5.3 standard deviations below the mean.

3.25.
(a) 88.8% of the observations fall within one standard deviation of the mean.
(b) The Empirical Rule is not appropriate for this variable, since the data are highly skewed to the right.
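Two of the calculations on this page, the combined mean in 3.16(b) and the Empirical Rule intervals and z-score in 3.23, can be sketched in Python as follows. The function names are hypothetical. The mean of $61,000 and standard deviation of $10,000 used for 3.23 are not stated in the solution; they are assumptions inferred from the intervals given in 3.23(a).

```python
def combined_mean(means, sizes):
    """Weighted mean of group means, weighting each group by its size."""
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)


def empirical_rule_intervals(mean, sd):
    """Intervals within 1, 2, and 3 standard deviations of the mean."""
    return [(mean - k * sd, mean + k * sd) for k in (1, 2, 3)]


def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd


# 3.16(b): group means $31,968 and $50,779 with weights 110 and 109.
print(f"overall mean = {combined_mean([31_968, 50_779], [110, 109]):,.2f}")

# 3.23: mean and sd are ASSUMED (inferred from the intervals in part (a)).
mean_salary, sd_salary = 61_000, 10_000
print(empirical_rule_intervals(mean_salary, sd_salary))
print(f"z for $100,000 = {z_score(100_000, mean_salary, sd_salary):.1f}")
```

Weighting by group size matters here only slightly because the two groups are nearly equal in size; with unequal groups the combined mean is pulled toward the larger group's mean.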

3.26. 10 is realistic; -20 is impossible since the standard deviation cannot be negative; 0 implies that every student scored 76 on the exam, which is highly improbable; 50 is too large (it is half of the possible range of scores).

3.27.
(a) The most realistic value is 0.4, because the range is 5 times the length of this value.
(b) The value of -10.0 is impossible since the standard deviation cannot be negative.

3.28. The correct answer is (iv), since 0 would be approximately 2 standard deviations below the mean.

3.29.
(a) Since the range is 43.5 standard deviations above the mean, the distribution is most likely skewed to the right.
(b) The distribution probably has outliers (take the maximum usage, for example).

3.30. The distribution is most likely skewed to the right, since the minimum commute time (0 hours) is approximately 1.5 standard deviations below the mean.

3.31.
(a) The range is $31,100, which is the difference between the mean salary for secondary school teachers in New York (highest mean) and in South Dakota (lowest mean).
(b) The interquartile range is $11,600 and represents the spread of the mean salaries for the middle 50% of the states.

3.32.
(a) [Box plot: mean salary]
(b) The box plot suggests that the data are skewed to the right.
(c) 7000 is the most plausible standard deviation, since the range of the data is about 4 standard deviations. The values 100 and 1000 are too small for the spread that we see, and 25,000 is slightly under the value for the range.

3.33. The mean, standard deviation, maximum, and range all decrease, because the observation for D.C. was a high outlier. Note that these statistics are not resistant to outliers. On the other hand, the median, Q3, Q1, the interquartile range, and the mode remain the same, as these are all resistant to outliers. The minimum remains the same since D.C. was a high outlier and not a low outlier.

3.34.
(a) The Empirical Rule does not apply to this distribution because the standard deviation is large relative to the mean, suggesting a right-skewed distribution.
(b) The five-number summary confirms that the distribution is skewed to the right, since the distance between Q3 and the median is larger than the distance between the median and Q1 and the maximum is so large.
(c) IQR = Q3 - Q1 = 173,875 - 91,875 = 82,000. Low outliers would be observations less than Q1 - 1.5(IQR) = 91,875 - 1.5(82,000) = -31,125. There are no values that are low outliers. High outliers would be observations greater than Q3 + 1.5(IQR) = 173,875 + 1.5(82,000) = 296,875. At least the maximum is a high outlier.
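The 1.5 × IQR criterion used in 3.34(c) can be written as a short Python sketch. The function name `outlier_fences` is an assumption for illustration; the quartiles are the values quoted in the solution.

```python
def outlier_fences(q1, q3, multiplier=1.5):
    """Return (lower, upper) fences for the 1.5 x IQR outlier criterion."""
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


# Quartiles from 3.34(c).
q1, q3 = 91_875, 173_875
lower, upper = outlier_fences(q1, q3)

print(f"IQR = {q3 - q1:,}")
print(f"low outliers:  below {lower:,.0f}")
print(f"high outliers: above {upper:,.0f}")
```

Because the lower fence is negative, no observation of a nonnegative variable can be flagged as a low outlier, which matches the conclusion in the solution.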