Interaction (statistics): Difference between revisions

Line 1:
{{Short description|Statistical term}}
[[File:GSS sealevel interaction.png|thumb|Interaction effect of education and ideology on concern about sea level rise]]In [[statistics]], an '''interaction''' may arise when considering the relationship among three or more variables, and describes a situation in which the effect of one causal variable on an outcome depends on the state of a second causal variable (that is, when effects of the two causes are not [[additive map|additive]]).<ref name=Dodge>{{cite book | last=Dodge | first=Y. | year=2003 | title=''The Oxford Dictionary of Statistical Terms'' | publisher=Oxford University Press | isbn=978-0-19-920613-1 | url-access=registration | url=https://archive.org/details/oxforddictionary0000unse }}</ref><ref>{{cite journal | doi=10.2307/1403235 | last=Cox | first=D.R. | year=1984 | title=Interaction | journal=International Statistical Review | volume=52 | pages=1&ndash;25 | jstor=1403235 | issue=1 }}</ref> Although commonly thought of in terms of causal relationships, the concept of an interaction can also describe non-causal associations (then also called [[Moderation (statistics)|''moderation'']] or ''effect modification''). Interactions are often considered in the context of [[regression analysis|regression analyses]] or [[factorial experiments]].
 
The presence of interactions can have important implications for the interpretation of statistical models. If two variables of interest interact, the relationship between each of the interacting variables and a third "dependent variable" depends on the value of the other interacting variable. In practice, this makes it more difficult to predict the consequences of changing the value of a variable, particularly if the variables it interacts with are hard to measure or difficult to control.
 
The notion of "interaction" is closely related to that of [[Moderation (statistics)|moderation]], which is common in social and health science research: an interaction between an explanatory variable and an environmental variable suggests that the effect of the explanatory variable has been moderated or modified by the environmental variable.<ref name=Dodge />
 
==Introduction==
 
An '''interaction variable''' or '''interaction feature''' is a variable constructed from an original set of variables to try to represent either all of the interaction present or some part of it. In exploratory statistical analyses it is common to use products of the original variables as the basis for testing whether interaction is present, with the possibility of substituting other, more realistic interaction variables at a later stage. When there are more than two explanatory variables, several interaction variables are constructed, with pairwise products representing pairwise interactions and higher-order products representing higher-order interactions.
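
As a minimal sketch of the product construction described above, the following Python snippet appends pairwise and higher-order product columns to a small data set. The function name <code>add_interaction_features</code>, the column names, and the data values are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def add_interaction_features(rows, names, order=2):
    """Append product terms of 2..order variables to each row.

    rows  : list of lists of numbers, one inner list per observation
    names : list of column names, same length as each row
    order : highest interaction order to build (2 = pairwise only)
    """
    new_names = list(names)
    index_sets = []
    for k in range(2, order + 1):
        for idx in combinations(range(len(names)), k):
            index_sets.append(idx)
            new_names.append("*".join(names[i] for i in idx))

    new_rows = []
    for row in rows:
        extended = list(row)
        for idx in index_sets:
            product = 1.0
            for i in idx:
                product *= row[i]
            extended.append(product)
        new_rows.append(extended)
    return new_names, new_rows


# Hypothetical single observation with three explanatory variables.
names, rows = add_interaction_features([[2.0, 3.0, 5.0]], ["x1", "x2", "x3"], order=3)
print(names)
print(rows[0])
```

With three variables and <code>order=3</code>, this yields three pairwise products and one three-way product. Whether such product terms are adequate representations of the interaction is a modeling question that the sketch does not address.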
 
Line 23 ⟶ 22:
 
==In modeling==
 
===In ANOVA===
 
A simple setting in which interactions can arise is a [[factorial experiment|two-factor experiment]] analyzed using [[Analysis of Variance]] (ANOVA). Suppose we have two binary factors ''A'' and ''B''. For example, these factors might indicate whether each of two treatments was administered to a patient, with the treatments applied either singly or in combination. We can then consider the average treatment response (e.g. the symptom levels following treatment) for each patient, as a function of the treatment combination that was administered. The following table shows one possible situation:
 
Line 126 ⟶ 123:
 
===Unit treatment additivity===
In its simplest form, the assumption of unit treatment additivity states that the observed response ''y''<sub>''ij''</sub> from experimental unit ''i'' when receiving treatment ''j'' can be written as the sum ''y''<sub>''ij''</sub>&nbsp;=&nbsp;''y''<sub>''i''</sub>&nbsp;+&nbsp;''t''<sub>''j''</sub>.<ref name="Kempthorne (1979)">{{cite book |author-link=Oscar Kempthorne |last=Kempthorne |first=Oscar |year=1979 |title=The Design and Analysis of Experiments |edition=Corrected reprint of (1952) Wiley |publisher=Robert E. Krieger |isbn=978-0-88275-105-4 }}</ref><ref name=Cox1958_2>{{cite book |author-link=David R. Cox |last=Cox |first=David R. |year=1958 |title=Planning of experiments |publisher=Wiley |isbn=0-471-57429-5 |at=Chapter 2 }}</ref><ref>{{cite book
|author=Hinkelmann, Klaus and [[Oscar Kempthorne|Kempthorne, Oscar]]
|year=2008
Line 263 ⟶ 259:
where the interaction term <math>(x_1\times x_2)</math> could be formed explicitly by multiplying two (or more) variables, or implicitly using factorial notation in modern statistical packages such as [[Stata]]. The components ''x''<sub>1</sub> and ''x''<sub>2</sub> might be measurements or {0,1} [[dummy variable (statistics)|dummy variable]]s in any combination. Interactions involving a dummy variable multiplied by a measurement variable are termed ''slope dummy variables'',<ref>Hamilton, L.C. 1992. ''Regression with Graphics: A Second Course in Applied Statistics''. Pacific Grove, CA: Brooks/Cole. {{ISBN|978-0534159009}}</ref> because they estimate and test the difference in slopes between groups 0 and 1.
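
The slope-dummy interpretation can be illustrated with a small sketch. It assumes a model of the form ''y'' = ''c'' + ''a''&thinsp;''x''<sub>1</sub> + ''b''&thinsp;''x''<sub>2</sub> + ''d''&thinsp;(''x''<sub>1</sub>&times;''x''<sub>2</sub>), where ''x''<sub>2</sub> is a {0,1} dummy; the coefficient names other than ''a'' and the noiseless data are assumptions made for illustration. For such a fully interacted model, the least-squares estimates coincide with fitting a separate line within each group, so the interaction coefficient is the difference between the two group slopes.

```python
def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical noiseless data generated from y = 1 + 2*x1 + 3*x2 + 4*(x1*x2),
# where x2 is a {0,1} dummy variable identifying the group.
x1_values = [0.0, 1.0, 2.0, 3.0]
data = [(x1, x2, 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2)
        for x2 in (0, 1) for x1 in x1_values]


def fit_group(group):
    """Fit a separate line to the observations whose dummy equals `group`."""
    xs = [x1 for x1, x2, _ in data if x2 == group]
    ys = [y for _, x2, y in data if x2 == group]
    return ols_line(xs, ys)


intercept0, slope0 = fit_group(0)
intercept1, slope1 = fit_group(1)

c = intercept0               # intercept for group 0
a = slope0                   # slope of x1 when x2 == 0
b = intercept1 - intercept0  # shift in intercept for group 1
d = slope1 - slope0          # slope-dummy (interaction) coefficient

print("a =", a, "b =", b, "c =", c, "d =", d)
```

With real, noisy data the same identities hold for the point estimates, but testing whether ''d'' differs from zero requires standard errors, which this sketch does not compute.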
 
When measurement variables are employed in interactions, it is often desirable to work with centered versions, where the variable's mean (or some other reasonably central value) is set to zero. Centering can make the main effects in interaction models more interpretable, as it reduces the [[multicollinearity]] between the interaction term and the main effects.<ref>{{Cite journal|last1=Iacobucci|first1=Dawn|last2=Schneider|first2=Matthew J.|last3=Popovich|first3=Deidre L.|last4=Bakamitsos|first4=Georgios A.|date=2016|title=Mean centering helps alleviate "micro" but not "macro" multicollinearity|url=http://link.springer.com/10.3758/s13428-015-0624-x|journal=Behavior Research Methods|language=en|volume=48|issue=4|pages=1308–1317|doi=10.3758/s13428-015-0624-x|pmid=26148824 |issn=1554-3528|doi-access=free}}</ref> The coefficient ''a'' in the equation above, for example, represents the effect of ''x''<sub>1</sub> when ''x''<sub>2</sub> equals zero.
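
The effect of centering on the correlation between a main-effect variable and the product term can be checked numerically. The sketch below uses small made-up values for ''x''<sub>1</sub> and ''x''<sub>2</sub> (an assumption for illustration, not data from any cited study) and compares the Pearson correlation between ''x''<sub>1</sub> and ''x''<sub>1</sub>&times;''x''<sub>2</sub> before and after mean-centering both variables.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def center(xs):
    """Subtract the mean so that the centered variable has mean zero."""
    mean_x = sum(xs) / len(xs)
    return [x - mean_x for x in xs]


# Hypothetical measurements with non-zero means.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]

# Correlation between x1 and the raw product term.
raw_product = [u * v for u, v in zip(x1, x2)]
r_raw = pearson(x1, raw_product)

# Correlation between centered x1 and the product of the centered variables.
x1_c, x2_c = center(x1), center(x2)
centered_product = [u * v for u, v in zip(x1_c, x2_c)]
r_centered = pearson(x1_c, centered_product)

print("raw:", round(r_raw, 3), "centered:", round(r_centered, 3))
```

For these particular values the correlation with the product term is markedly smaller after centering. Centering changes what the main-effect coefficients mean (the effect of one variable at the mean of the other, rather than at zero), but it does not change the fitted interaction coefficient or the model's predictions.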
 
[[File:Tea party interaction.png|thumb|Interaction of education and political party affecting beliefs about climate change]]Regression approaches to interaction modeling are very general because they can accommodate additional predictors, and many alternative specifications or estimation strategies beyond [[ordinary least squares]]. [[Robust regression|Robust]], [[Quantile regression|quantile]], and mixed-effects ([[Multilevel model|multilevel]]) models are among the possibilities, as is [[generalized linear model]]ing encompassing a wide range of categorical, ordered, counted or otherwise limited dependent variables. The graph depicts an education*politics interaction, from a probability-weighted [[logit regression]] analysis of survey data.<ref>{{cite journal | last1 = Hamilton | first1 = L.C. | last2 = Saito | first2 = K. | year = 2015 | title = A four-party view of U.S. environmental concern | journal = Environmental Politics | volume = 24 | issue = 2| pages = 212–227 | doi = 10.1080/09644016.2014.976485 | bibcode = 2015EnvPo..24..212H | s2cid = 154762226 }}</ref>
 
==Interaction plots==
 
Interaction plots, also called [[Moderation (statistics)#Two continuous independent variables|simple-slope plots]], show possible interactions among variables.
 
===Example: Interaction of species and air temperature and their effect on body temperature===
Line 340 ⟶ 336:
}}</ref>
*''Interaction'' between genetic risk factors for [[Diabetes mellitus type 2|type 2 diabetes]] and diet (specifically, a "western" dietary pattern). The western dietary pattern was shown to increase diabetes risk for subjects with a high "genetic risk score", but not for other subjects.<ref>{{Cite journal | author = Lu, Q. | year = 2009 | title = Genetic predisposition, Western dietary pattern, and the risk of type 2 diabetes in men | journal = Am J Clin Nutr | volume = 89 | pages = 1453&ndash;1458 | doi = 10.3945/ajcn.2008.27249 | pmid = 19279076 | display-authors = 1 | issue = 5 | author2 = <Please add first missing authors to populate metadata.>| pmc = 2676999 }}</ref>
*''Interaction'' between education and political orientation, affecting general-public perceptions about climate change. For example, US surveys often find that acceptance of the reality of [[anthropogenic climate change]] rises with education among moderate or liberal survey respondents, but declines with education among the most conservative.<ref>{{cite journal | last1 = Hamilton | first1 = L.C. | year = 2011 | title = Education, politics and opinions about climate change: Evidence for interaction effects | url = https://scholars.unh.edu/cgi/viewcontent.cgi?article=1388&context=soc_facpub| journal = [[Climatic Change (journal)|Climatic Change]] | volume = 104 | issue = 2| pages = 231–242 | doi = 10.1007/s10584-010-9957-8 | bibcode = 2011ClCh..104..231H | s2cid = 16481640 }}</ref><ref>{{cite journal |last=McCright |first=A. M. |year=2011 |title=Political orientation moderates Americans' beliefs and concern about climate change |journal=[[Climatic Change (journal)|Climatic Change]] |doi=10.1007/s10584-010-9946-y |volume=104 |issue=2 |pages=243–253 |bibcode=2011ClCh..104..243M |s2cid=152795205 }}</ref> Similar interactions have been observed to affect some non-climate science or environmental perceptions,<ref>{{Cite journal | doi=10.1080/09644016.2014.976485|title = A four-party view of US environmental concern| journal=Environmental Politics| volume=24| issue=2| pages=212–227|year = 2015|last1 = Hamilton|first1 = Lawrence C.| last2=Saito| first2=Kei| bibcode=2015EnvPo..24..212H |s2cid = 154762226}}</ref> and to operate with science literacy or other knowledge indicators in place of education.<ref>{{cite journal | last1 = Kahan | first1 = D.M. | last2 = Jenkins-Smith | first2 = H. | last3 = Braman | first3 = D. 
| year = 2011 | title = Cultural cognition of scientific consensus | url = https://scholarship.law.gwu.edu/cgi/viewcontent.cgi?article=1269&context=faculty_publications| journal = Journal of Risk Research | volume = 14 | issue = 2| pages = 147–174 | doi = 10.1080/13669877.2010.511246 | hdl = 10.1080/13669877.2010.511246 | s2cid = 216092368 | hdl-access = free }}</ref><ref>{{cite journal | last1 = Hamilton | first1 = L.C. | last2 = Cutler | first2 = M.J. | last3 = Schaefer | first3 = A. | year = 2012 | title = Public knowledge and concern about polar-region warming | journal = [[Polar Geography]] | volume = 35 | issue = 2| pages = 155–168 | doi = 10.1080/1088937X.2012.684155 | bibcode = 2012PolGe..35..155H | s2cid = 12437794 }}</ref>
 
== See also ==
Line 348 ⟶ 344:
* [[Linear model]]
* [[Main effect]]
* [[Interaction]]
* [[Tukey's test of additivity]]
 
Line 356 ⟶ 351:
==Further reading==
*[[David R. Cox|Cox, David R.]] and Reid, Nancy M. (2000) ''The theory of design of experiments'', Chapman & Hall/CRC. {{ISBN|1-58488-195-X}}
*{{Cite journal | doi = 10.1086/226678 | last1 = Southwood | first1 = K.E. | year = 1978 | title = Substantive Theory and Statistical Interaction: Five Models | journal = [[The American Journal of Sociology]] | volume = 83 | issue = 5| pages = 1154–1203 | s2cid = 143521842 }}
*{{Cite journal | doi = 10.1093/pan/mpi014 | last1 = Brambor| first1 = T. | last2 = Clark | first2 = W. R. | year = 2006 | title = Understanding Interaction Models: Improving Empirical Analyses | journal = Political Analysis | volume = 14 | issue = 1 | pages = 63–82 }}
*{{Cite journal | doi = 10.3758/BRM.41.3.924 | last1 = Hayes | first1 = A. F. | last2 = Matthes | first2 = J. | year = 2009 | title = Computational procedures for probing interactions in OLS and logistic regression: SPSS and SAS implementations | journal = Behavior Research Methods | volume = 41 | issue = 3| pages = 924–936 | pmid = 19587209 | doi-access = free }}