Tuesday, March 12, 2013

What effect size would you expect?

In an earlier post, I considered the logical implications of one common view of replications: If you believe that any effect in the same direction as the original result counts as a successful replication, then you must also believe that any two-tailed hypothesis is non-scientific because it cannot be falsified. In this post, I further consider what effect sizes we should expect if the null hypothesis is true, and why that matters for how we interpret the results of a replication attempt.

In the original thought experiment, a study found a significant effect with an effect size of d = 0.5, but a larger replication found an effect size of d = 0.1. Some commenters argued that we should treat the second study as a replication because it produced an effect in the same direction. But, as I noted in my earlier post, even if the null hypothesis were true, an effect would fall in the same direction as the original 50% of the time, so by that standard we would "replicate" half the time by chance (Uri Simonsohn made the same point in a recent talk). Let's flesh that out a bit, because the problem is even worse than that.

Even when the null hypothesis is true, we should not expect to find d = 0. Let's assume for a moment that the null hypothesis is true—you have two populations whose means and standard deviations actually are identical to infinite precision. Next, let's assume that you sample 15 people at random from each population, for a total sample of 30 participants, and you compare the means of those two samples. Remember, there is no difference between the populations, so the null hypothesis is true. (Note, I'm asking about the absolute value of the effect size—how big an effect would you expect to find, ignoring the direction of the effect?) Before reading further, if you had to guess, how big an effect should you expect to find?

Answer: the median effect size in this case is approximately d = 0.25. If the null hypothesis of no difference is actually true, you'll find an effect larger in magnitude than d = 0.25 fifty percent of the time! In fact, you would expect to find an effect size bigger than d = 0.43 more than 25% of the time. In other words, you'll find a lot of spurious medium-size effects.

Now, 15 subjects/group is not an unusual study size in psychology, but it is a little on the small side. What if we had 20/group? The median effect size then would be d = 0.21, and 25% of the time you'd expect to find an effect d > 0.36. Again, with a typical psychology study sample size, you should expect to find some sizable effects even when the null hypothesis is true. What if we had a large sample size, say 100 subjects/group for a total of 200 subjects? When the null hypothesis of no-difference is true, the median effect size would be d = 0.096 and more than 25% of the time the effect size would exceed d = 0.16.

Here is a graph illustrating the median effect size (with 25% and 75% quartiles in red) as a function of sample size when there is no difference between the populations.

Effect size as a function of sample size when the null hypothesis is true

In all cases, both groups are drawn from a standard normal distribution with a mean of 0 and standard deviation of 1, so the null hypothesis of no difference is true. (The values in the graph could be derived analytically, but I was playing with simulation code, so I just did it that way.) Note that small sample sizes tend to produce bigger and more variable effect size estimates than do large samples.
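For readers who want to check these numbers themselves, here is a small simulation sketch in Python (I didn't share my original code, so the function name and structure here are my own). It also verifies the quantiles analytically: under the null, d = t·sqrt(2/n), so quantiles of |d| follow directly from the t distribution.

```python
import numpy as np
from scipy import stats

def null_effect_size_quantiles(n_per_group, n_sims=20000, seed=1):
    """Quantiles of |d| when both groups are drawn from the same N(0, 1)."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n_sims, n_per_group))
    b = rng.standard_normal((n_sims, n_per_group))
    # Cohen's d with the pooled standard deviation
    pooled_sd = np.sqrt((a.var(axis=1, ddof=1) + b.var(axis=1, ddof=1)) / 2)
    d = np.abs(a.mean(axis=1) - b.mean(axis=1)) / pooled_sd
    return np.percentile(d, [25, 50, 75])

for n in (15, 20, 100):
    q25, q50, q75 = null_effect_size_quantiles(n)
    # Analytic check: the median of |d| is the 75th percentile of t(2n - 2)
    # scaled by sqrt(2/n), because |T| exceeds that value half the time.
    analytic_median = stats.t.ppf(0.75, 2 * n - 2) * np.sqrt(2 / n)
    print(f"n = {n:3d}/group: median |d| = {q50:.2f} "
          f"(analytic {analytic_median:.2f}), 75th percentile = {q75:.2f}")
```

Running this reproduces the values in the text: median |d| of roughly 0.25, 0.21, and 0.096 for 15, 20, and 100 per group.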

What does this mean? First, and not surprisingly, you need a big sample to reliably find a small effect. Second, if you have a small sample, you're far more likely to find a spuriously large effect. Typical psychology studies lack the power to detect tiny effects, and even with fairly large samples (by psychology standards), you will find a non-zero effect size when the null hypothesis is true. Even with a large sample, you should expect effect sizes around d = 0.1 when there is no real difference. So, for practical purposes in a typical psychology study, an effect size between -0.1 and +0.1 is indiscriminable from an effect size of zero, and we probably should treat it as if it were zero. At most, we should treat it as only suggestive evidence of an effect, and not as confirmation.

To conclude: If you care only about the sign of a replication attempt, when there actually is no effect at all, you will mistakenly conclude that a replication supported the original finding 50% of the time. For that reason, I think it's necessary to consider both the sign and size of a replication effect when evaluating if it supports the conclusions of an original result. (Of course, an even better approach might estimate a confidence interval around replication effect size to determine whether it includes the original effect size. The bigger the sample size, the smaller the confidence interval. Bayesian estimation of the effect size would be better as well.)
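To illustrate the confidence-interval approach, here is a sketch using the standard large-sample approximation for the standard error of Cohen's d. The sample sizes below are hypothetical, chosen only for illustration, and the function name is my own:

```python
import math

def ci_for_d(d, n1, n2, z=1.96):
    """Approximate 95% CI for Cohen's d (large-sample normal approximation)."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se

# A hypothetical replication finding d = 0.1 with 1,000 subjects per group:
lo, hi = ci_for_d(0.1, 1000, 1000)
print(f"95% CI for the replication: [{lo:.2f}, {hi:.2f}]")
# The interval excludes 0 (the effect is "significant") but also excludes
# an original d = 0.5, so the replication would not support the original.
```

With a large enough sample, the interval is tight enough to distinguish a real-but-tiny effect from a medium-sized one, which is exactly the comparison a sign-only criterion throws away.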

Saturday, March 9, 2013

Further thoughts on what counts as a replication

Last month I posted a replication thought experiment that I hoped would provoke an interesting discussion of what counts as a replication. It did. Today I want to flesh out why I think one of the most common interpretations doesn't work. In short, if you follow it to its logical conclusion, any two-tailed hypothesis is inherently unscientific!

I wasn't surprised that opinions were mixed about the crucial case: 
An original underpowered study produces a significant effect (p = .049) with an effect size of d = .50. A replication attempt uses much greater power (10x the original sample size) and significantly rejects the null (p < .05) with a much smaller effect size in the same direction (d = .10).
Commenters fell into three camps:

  1. The new result replicates the original because the effect was in the same direction.
  2. The new result partially replicates the original because it was in the same direction, but it is not the same effect because it is meaningfully smaller. 
  3. The new result does not replicate the original because the new result is meaningfully smaller, and therefore it does not have the same theoretical/practical meaning as the original. 

Although there is no objectively right interpretation of this result, I do think the first interpretation has some theoretical ramifications that even its proponents might not like. When coupled with the logic of null hypothesis significance testing, a not-so-appealing conclusion falls out by necessity: Any conclusion based on a two-tailed hypothesis test is unfalsifiable, and therefore not scientific!

Here's the logic:

  • A two-tailed hypothesis predicts that two groups will differ, but it does not predict the direction of difference, and either direction would be interesting. Two-tailed significance tests are common in psychology.
  • The null-hypothesis of no-difference is never true. Two groups may produce the same mean, but only with a limited level of measurement precision. With infinite precision, the groups will differ. That is a property of any continuous distribution: The probability of any exact value on that distribution is 0 (this is a matter of math, not methods). This issue is one reason that many people object to null-hypothesis significance testing, but we don't need to consider that debate here.
  • Given that the null is never true when measured with infinite precision, the measured effect will always fall on one side or the other of the null. And, with a large enough sample, that difference will be statistically significant.
And here's the problem:
  • For a two-tailed hypothesis, any significant deviation from zero counts as a replication.
  • With enough power, the measured effect will always differ from zero.
  • Therefore, no result can falsify the hypothesis.
  • A hypothesis that cannot be falsified is non-scientific.
  • Therefore, two-tailed hypotheses and their accompanying tests are not scientific.
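The "with enough power" premise can be made concrete. Below is a sketch (function names are my own; the formula is the standard power calculation based on the noncentral t distribution) showing that even a trivially small true effect of d = 0.02 is detected almost certainly once the sample gets large enough:

```python
import numpy as np
from scipy import stats

def power_two_tailed_t(d, n_per_group, alpha=0.05):
    """Power of a two-tailed, two-sample t-test when the true effect is d."""
    df = 2 * n_per_group - 2
    ncp = d * np.sqrt(n_per_group / 2)      # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # probability the test statistic lands beyond either critical value
    return stats.nct.sf(t_crit, df, ncp) + stats.nct.cdf(-t_crit, df, ncp)

for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,}/group: power = {power_two_tailed_t(0.02, n):.3f}")
```

At n = 100 per group the test rejects at roughly the false-positive rate, but by a million per group rejection is nearly guaranteed. So a nonzero "true" effect, however tiny, will eventually produce a significant result in some direction.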
In some cases, that conclusion seems reasonable. For example, proponents of ESP will accept performance that is significantly better or worse than chance as support for the existence of ESP. But, in other cases, two-tailed hypotheses seem reasonable and scientifically grounded. Consequently, we must challenge one of the premises.

Let's assume for now that we accept the logic of null hypothesis significance testing. The best approach, in my view, would be to differentiate between tiny effect sizes and more sizable ones. If we are willing to say that effects near zero are functionally equivalent to no effect and different from large effects, then we can avoid the logical conclusion that two-tailed tests are inherently unscientific. 

But once we make that assumption, it applies to one-tailed hypotheses as well, effectively ruling out interpretation #1 from our thought experiment. We have to treat near-zero effect sizes as failures to replicate large effect sizes, even if they fall on the same side of zero and are significantly different from zero.

Another reason to make that assumption is that, even when the null hypothesis is true, effects will fall in the same direction as the original effect 50% of the time by chance. If any effect in the same direction counts as a replication, we would therefore "replicate" an original false positive half the time. That seems problematic as well.

If I've made an error in my reasoning or you see a way to salvage the idea that an infinitely small effect in the same direction as an original effect counts as replicating that effect, I would love to hear about it in the comments.

Tuesday, March 5, 2013

APS press release about replication reports

Here is the press release from the Association for Psychological Science about the new Registered Replication Reports at Perspectives on Psychological Science. Note: I've replaced names with G+ profile links and added hyperlinks.

Leading Psychological Science Journal Launches Initiative on Research Replication

Reproducing the results of research studies is a vital part of the scientific process. Yet for a number of reasons, replication research, as it is commonly known, is rarely published. Now, a leading journal is adopting a novel way to promote and publish well-designed replications of psychological studies.

Perspectives on Psychological Science, published by the Association for Psychological Science, is launching an initiative aimed at encouraging multi-center replication studies. One of the innovative features of this initiative is a new type of article in which replication study designs are peer-reviewed before data collection.

The new approach is designed to give researchers more incentive to pursue replications, which involve repeating a study using the same methods as the original experiment, but with different subjects. Scientists traditionally have garnered far more credit for publishing novel results than for verifying earlier published findings.

The goal of the new Perspectives initiative is to help make replication a valued part of daily scientific practice.

According to +Bobbie Spellman, Professor of Psychology at the University of Virginia and Editor of Perspectives, “Some research findings are so important that we should publish high quality replications of them regardless of the outcome. When multiple laboratories coordinate with original study designers to do multiple replications, we can learn about the robustness, generalizability, and effect sizes of noteworthy research.”

Perspectives plans to begin publishing collections of replications of original studies conducted independently by multiple labs. Each participating lab will follow a shared, vetted, pre-registered, and publicly available protocol. Each collection of replications will be compiled into a single article (a “registered replication report”), and all researchers contributing replications will be listed as authors. In addition to providing input on the replication protocol, the author of the original article that was the focus of the collected replications will be encouraged to submit a short commentary discussing the final report. +Daniel Simons, Professor of Psychology at University of Illinois at Urbana-Champaign, and +Alex Holcombe, Associate Professor of Psychology at University of Sydney, will serve as editors for these replication projects.

Published reports will be available without a subscription to the journal. And reports in the journal will link to more extensive information and data from each replicating lab on the Open Science Framework (OSF), http://openscienceframework.org/, a website that helps scientists store their research materials, collaborate with others, and share findings publicly. OSF is a signature project of the Center for Open Science (http://centerforopenscience.org/), a new non-profit opening this month in Charlottesville, Virginia. Founded by +Brian Nosek, Associate Professor of Psychology at the University of Virginia (UVA) and +Jeffrey Spies, a graduate student in Quantitative Psychology at UVA, COS aims to develop innovative practices and offer grants to scientists and journals to encourage replications of important research.

“Two core values of science are openness and reproducibility,” says +Brian Nosek. “The new initiative in Perspectives is an important step toward aligning scientific practices with these values. The Center for Open Science will provide support to scientific journals like Perspectives to improve how science is conducted and communicated. This includes infrastructure support for documenting, archiving, and sharing research materials and data, methods for registering research designs and analysis plans, and material support for conducting replications.”

This unique approach to publishing replications is part of broader efforts in psychological science to improve scientific practices. Psychological science is leading the way with initiatives that may have applications in other disciplines.

For more information, please visit

Perspectives on Psychological Science is ranked among the top 10 general psychology journals for impact by the Institute for Scientific Information. It publishes an eclectic mix of thought-provoking articles on the latest important advances in psychology. Please contact Scott Sleek at 202-293-9300 or ssleek@psychologicalscience.org for more information.

Registered replication reports at APS journal, Perspectives

I'm excited to announce a brand new initiative at the APS journal, Perspectives on Psychological Science. The journal will be publishing a new type of article: the registered replication report. +Alex Holcombe and I will be acting as editors for these reports. Below is the mission statement explaining the goals and approach. And, here is a link to the Perspectives website with more information about the reports, the submission process, etc. Send us your proposals!

Mission Statement

Replicability is a cornerstone of science. Yet replication studies rarely appear in psychology journals. The new Registered Replication Reports article type in Perspectives on Psychological Science fortifies the foundation of psychological science by publishing collections of replications based on a shared and vetted protocol. It is motivated by the following principles:
• Psychological science should emphasize findings that are robust, replicable, and generalizable.
• Direct replications are necessary to estimate the true size of an effect.
• Well-designed replication studies should be published regardless of the size of the effect or statistical significance of the result.

Traditional psychology journals emphasize theoretical and empirical novelty rather than reproducibility. When journals consider a replication attempt, the process can be an uphill battle for authors. Given the challenges associated with publishing replication attempts, researchers have little incentive to conduct such studies in the first place. Yet, only with multiple replication attempts can we adequately estimate the true size of an effect.

A central goal of publishing Registered Replication Reports is to encourage replication studies by modifying the typical submission and review process. Authors submit a detailed description of the method and analysis plan. The submitted plan is then sent to the author(s) of the replicated study for review. Because the proposal review occurs before data collection, reviewers have an incentive to make sure that the planned replication conforms to the methods of the original study. Consequently, the review process is more constructive than combative. Once the replication plan is accepted, it is posted publicly, and other laboratories can follow the same protocol in conducting their own replications of the original result. Those additional replication proposals are vetted by the editors to make sure they conform to the approved protocol.

The results of the replication attempts are then published together in Perspectives on Psychological Science as a Registered Replication Report. Crucially, the results of the replication attempts are published regardless of the outcome, and the protocol is pre-determined and registered in advance. The conclusion of a Registered Replication Report should avoid categorizing each result as a success or failure to replicate. Instead, it should focus on the cumulative estimate of the effect size. Together with the separate results of each replication attempt, the journal will publish a figure illustrating the measured effects from each study and a meta-analytic effect size estimate. The details of the protocol, including any stimuli or code provided by the original authors or replicating laboratories as well as data from each study, will be available on the Open Science Framework (OSF) website and will be linked from the published report and the APS website for further inspection and analysis by other researchers. Once all the replication attempts have been collected into a final report, the author(s) of the original article will be invited to submit a short, peer-reviewed commentary on the collection of replication attempts.

This publication model provides many broader benefits to psychological science:
  1. Because the registered replication attempts are published regardless of outcome, researchers have an incentive to replicate classic findings before beginning a new line of research extending those findings. 
  2. Subtleties of methodology that rarely appear in method sections of traditional journals will emerge from the constructive review process because original authors will have an incentive to make them known (i.e., helping to make sure the replications are designed properly).
  3. Multiple labs can attempt direct replications of the same finding, and all such replication attempts will be interlinked, providing a cumulative estimate of the true size of the effect.
  4. The emphasis on estimating effect sizes rather than on the dichotomous characterization of a replication attempt as a success or failure based on statistical significance could lead to greater awareness of the shortcomings of traditional null-hypothesis significance testing.
  5. Authors and journalists will have a source for vetted, robust findings, and a stable estimate of the effect size for controversial findings.
  6. Researchers may hesitate to publish a surprising result from a small-sample study without first verifying that result with an adequately powered design.

Sunday, March 3, 2013

Which priming claims conflict with research on subliminal perception?

I've received several interesting responses to my post on the history of disputed claims of subliminal persuasion. Perhaps the most interesting comment, from a theoretical perspective, was the idea that goal priming does not depend on the stimuli being perceived without awareness. According to this view, priming researchers are not actually interested in testing subliminal perception or persuasion. Rather, they are focused on the more traditional idea from social and cognitive psychology that we lack insights into the reasons for our actions; our behavior is influenced by primes that may or may not be processed without awareness, but we are unaware of the influence of those primes. By that view, presenting the primes subliminally is a means to an end, not an end in itself. As John Bargh wrote in 1992: "subliminality of stimulus presentation, therefore, is important not because of the subliminality per se but because one cannot be aware of the influence of a subliminally presented stimulus."

As I noted in my earlier post, claims of goal priming that do not argue for implicit perception are not controversial from a subliminal perception perspective, and implicit perception folks would not necessarily be skeptical of those. For example, in their studies in which holding a warm or cold drink influences ratings of personality warmth, Williams & Bargh do not claim that the stimulus itself is implicit. It would be crazy to do so—after all, subjects were asked by the experimenter to hold the drink, so they presumably are aware of the drink and whether or not it is warm. Those studies are focused on a more traditional social psychology question: To what extent are we aware of the reasons or mechanisms underlying our judgments, and do those subtle mechanisms have big effects on behavior? That point is not controversial from an implicit perception perspective since there is no claim of implicit perception, but the studies are provocative for other reasons (see below).

If all of the social goal priming research were focused on awareness of influence rather than on the subliminal nature of the stimuli themselves, it would not have inspired as much skepticism from those interested in subliminal perception. After all, the idea that we have mistaken intuitions about the workings of our minds is well established in both social and cognitive psychology.

But, many studies in the social priming literature explicitly claim that the prime stimuli themselves fall outside of awareness, and they use the subliminal nature of the primes to argue that the influence of those primes must occur outside of awareness as well. For example, consider these quotes from Bargh et al.'s seminal 1996 study of age priming (emphasis added):

"this behavior is unmediated by conscious perceptual or judgmental processes" 
"by the mere presence of environmental features, we mean that the activation of the behavioral tendency and response must be shown to be preconscious; that is, not dependent on the person's current conscious intentions."
"Social behavior is like any other psychological reaction to a social situation, capable of occurring in the absence of any conscious involvement or intervention."  
In their discussion, Bargh et al. argue that their study was different from the Vicary "eat popcorn" study/hoax because the behavioral goals were more relevant, accessible, and not in conflict with other goals (like staying in the theater). In other words, the age-priming study worked because it primed more accessible or actionable goals. Although the implicit nature of the prime is not the core issue for this study, and the result doesn't depend on it, the paper does imply that you don't need to perceive a stimulus consciously for it to have an influence. If conscious access to the prime itself is truly irrelevant, why bother trying to hide it in any way? Why measure awareness of it later?

Other more recent studies in the goal priming literature have made far stronger claims that the primes themselves are subliminal. For example, Hassin et al's PNAS paper claimed:

"We report a series of experiments that show that subliminal exposure to one's national flag influences political attitudes, intentions, and decisions, both in laboratory settings and in “real-life” behavior."
If the subliminal nature of the stimulus isn't central to the claims of goal priming, why claim that it was a subliminal exposure? Why bother presenting it rapidly and masking it? This one falls squarely into the much-disputed implicit persuasion literature, and it lacked adequate controls for awareness of the stimulus: it relied on post-experiment questioning to claim that the primes fell outside of awareness, a technique well-known to be inadequate to rule out awareness of the prime. This lack of proper control for awareness, coupled with claims of implicit persuasion, justify the sort of skepticism I wrote about in my previous post. More broadly, claims of subliminal influence like this one are not unusual—they seem to be accepted and grouped together with other goal priming findings in the literature, which leads me to the conclusion that at least some of the goal priming literature is assuming that the primes themselves fall outside of awareness. 

Another theoretical reason for skepticism

Although claims of subliminal persuasion are perhaps the largest reason for skepticism among those who study implicit perception, there are other theoretically motivated reasons to be skeptical of some recent claims that primes change behavior. Again, the skepticism arises from a difference in the theoretical perspective of those studying goal priming and those studying other forms of priming. This reason has to do with the purported breadth, power, and persistence of the primes.

Cognitive psychologists tend to think of priming effects in terms of spreading activation: Primes have a larger effect on closely associated representations and weaker effects on more remotely associated representations. Moreover, the activation diminishes rapidly as the associations become more remote, leading to almost no activation with relatively few steps between the prime and the target. Cognitive psychologists see the links between primes, goals, and behavior as remote, and if they are remote, the influence of a prime should be weak rather than strong. In contrast, goal priming advocates argue for a fairly direct link between primes and goals, with activation of goals directly influencing behavior even when people are not aware that their goals have been triggered by the prime. 

Why might cognitive psychologists question the close link between primes, goals, and behavior? Take what is perhaps the strongest and closest form of semantic priming (i.e., priming of meaning): A prime word leads to faster decisions about a closely related word. For example, seeing the word "doctor" leads to faster processing of the word "nurse." Even that closely related prime requires some spreading activation: Activation from "doctor" spreads to closely associated concepts like "nurse," but "doctor" produces less priming of "nurse" than it does of "doctor" itself, because identity priming doesn't require as much spreading activation.

The implicit perception literature shows that it is exceptionally difficult to find evidence for semantic priming of closely related words when people are truly unaware of the prime. That was the focus of my last post. Yet, even with full awareness, a prime does not have a huge influence on semantically related words. In a meta-analysis of semantic priming, closely related semantic associates primed judgments about a related word (e.g., is it a word or not) with an effect size of about r = .21 (Lucas, 2000). Contrast that modest priming (with awareness) for closely related words to the much larger effects of goal priming from words to goals to behavior. For example, the reported effect size for the priming effect of warm coffee on personality judgments was approximately r = .30. It seems implausible to those who study semantic priming that holding a warm cup of coffee would have a bigger effect on personality judgments or pro-social behaviors than seeing the word doctor would have on judgments about the word nurse.
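To compare the two effect sizes on a common scale, we can use the standard conversion between r and Cohen's d, which for equal group sizes is d = 2r/sqrt(1 - r^2). A quick sketch (function names are my own):

```python
import math

def d_from_r(r):
    """Convert a correlation effect size r to Cohen's d (equal-n assumption)."""
    return 2 * r / math.sqrt(1 - r**2)

def r_from_d(d):
    """Convert Cohen's d back to r."""
    return d / math.sqrt(d**2 + 4)

# Semantic priming meta-analysis (Lucas, 2000) vs. the warm-coffee effect:
print(f"r = .21  ->  d = {d_from_r(0.21):.2f}")   # about 0.43
print(f"r = .30  ->  d = {d_from_r(0.30):.2f}")   # about 0.63
```

On the d scale, the reported goal-priming effect would be roughly half again as large as priming between close semantic associates, which is the comparison driving the skepticism here.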

Reconciling the size of social priming effects with the apparently smaller size of explicit semantic priming requires one of three possibilities:

  • The chain of associations linking physical warmth to personality ratings is more direct than that between doctor and nurse. That would require a rethinking of the structure of representations.
  • The mechanisms guiding behaviors in goal priming are different from and more powerful than those underlying other forms of priming.
  • The social priming effects are not as large (or the semantic priming results not as small) as the published reports suggest. 
The first two possibilities would be strong theoretical claims that would require a rethinking of decades of priming research and research on the nature of semantic representations. Although they might be true, such strong claims merit skepticism (not dismissal, but skepticism). The third possibility merits direct replication with large samples, preferably by multiple labs, in order to better estimate the true size of the effects. If the goal priming effects prove to be smaller than those for priming of close semantic associates, this second theoretical reason for skepticism would be negated (or at least weakened substantially). Given that the third possibility is the easiest to test and would potentially shore up the interesting claims of goal priming, it seems like the right way to go.

Friday, March 1, 2013

Skepticism about subliminal priming

A piece written by Wolfgang Stroebe and Miles Hewstone in defense of social priming research appeared yesterday as the cover story of THE (Times Higher Education). The piece highlights how the fallout from the Stapel fraud case in social priming has tainted the reputations of others who are not suspect. That undoubtedly is true—fraud cases have fallout well beyond the fraudster, and that fallout can be unfair and unjustified. Unfortunately, in making its case, the piece unfairly impugns the motives and agenda of at least some critics of those findings. Yes, some are motivated by the belief that the effects themselves are false positives, but many have a more theoretical and historical perspective that leads to skepticism about the interpretation of these findings rather than of the findings themselves. 

The commentary describes what the authors feel is a concerted attack against social psychology by a small cadre of clueless cognitive psychologists:
"Having used priming exclusively to test hypotheses about associative memory, cognitive psychologists could not believe either that priming could have such a pervasive influence on behaviour or that people were not aware of this influence."
Next, the authors attribute the motivation for skepticism to resentment that cognitive psychology effects garner less media coverage:
"Whereas the press often reported the findings of cognitive social psychologists, reporters were less interested in the work of their non-social colleagues."
I find both the characterization of cognitive psychologists and the attribution of motives misleading and unhelpful to the discussion. Yes, many of those who are skeptical of social priming claims are cognitive psychologists. But the reasons for their skepticism have nothing to do with jealousy or any interdisciplinary vendetta. Rather, at least for many of the critics, they are based on a long history of battles over subliminal perception and the effects of non-conscious processing. The present social priming kerfuffle is just the latest episode in over a century of disputed claims about subliminal influences.

What I find both disturbing and remarkable about the THE piece is that its authors appear unaware of how this history motivates the skepticism about more recent claims. A critical quote from the THE piece makes my concern clear:
But things changed dramatically towards the end of the 20th century with the rediscovery (the notion had already struck Freud and the behaviourists) that people can be influenced by stimuli in their environment without being aware of it.
When, exactly, was this "knowledge" lost and in need of rediscovery? The debate over the power of subliminal perception has been an enduring one: it started before Freud's influence (with experimental claims dating to the mid-to-late 1800s) and continued unabated throughout the 20th century. The reason for skepticism comes from that history. If there is one consistent, recurring pattern in that literature, it is that new claims of powerful subliminal or implicit influences on behavior are later shown to have occurred with awareness, to have been subject to uncontrolled demand characteristics, or to be unreplicable.
  • Have you heard that subliminally flashing "eat popcorn" or "drink coke" leads you to buy more concessions at the movies? That one was an admitted hoax perpetrated by advertising executive James Vicary in the 1950s. Even after it was debunked over the decade that followed, it was featured prominently in Wilson Bryan Key's 1973 bestseller, Subliminal Seduction, and it remains a common myth. (See Snopes' coverage; Chris Chabris and I detail this case in our book, The Invisible Gorilla, as well.)
  • Remember the claims that subliminal self-help tapes could help you lose weight or stop smoking? Those were repeatedly debunked throughout the 1980s. (e.g., this paper by Merikle. This paper suggests that many such tapes didn't even have messages embedded.)
  • How about subliminal messages planted in rock music albums leading people to violence or suicide? Not so much, although claims of subliminal persuasion led to a lot of lawsuits against musicians (wikipedia covers that one).

Those are public examples of questionable claims about subliminal influences, but what about more rigorous methods used to study implicit perception in the lab? When you look at the history of implicit perception effects in domains ranging from dichotic listening (1950s - 1960s) to masked priming (1970s - present), there is one consistent pattern: The more rigorous the measurement of awareness, the smaller the effects. In perhaps the most influential review paper on this topic, Daniel Holender's 1986 Behavioral and Brain Sciences article analyzed 30+ years of claims of semantic priming without awareness and found that almost no studies adequately excluded awareness. He identified a set of criteria necessary to objectively rule out awareness in order to document a truly implicit effect. Not everyone agrees with his criteria, but in the perception world, even the hard-core proponents of semantic priming without awareness acknowledge that the effects typically are small and short-lived. They often try to find qualitative differences between explicit and implicit priming in order to further rule out awareness (e.g., see work by Merikle, Reingold, Snodgrass, and others).

When people from the implicit perception world see claims of huge effects of subliminal primes on behavior, they are skeptical because of this history. When the measures used to rule out awareness consist of post-experiment questioning (an inadequate approach because it does not rule out awareness at the time of presentation), they are skeptical that the effect is truly implicit. When the studies make no attempt to compare the size of an implicit effect to the same effect with varying degrees of awareness, they are skeptical that it is truly implicit. When studies do not use signal detection methods to test whether people were sensitive to the presence of a prime, they are skeptical that it is truly implicit.
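To make the signal-detection point concrete, here is a minimal sketch (my illustration, not from any specific study) of the standard sensitivity index d′ applied to an awareness check in which participants judge, trial by trial, whether a masked prime was present. The hit and false-alarm rates below are hypothetical numbers chosen for illustration.

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    """Sensitivity index: z(hit rate) - z(false-alarm rate).
    A d' near 0 means participants cannot detect the prime,
    which is what a claim of truly implicit processing requires."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical awareness-check data from a prime-detection task:
near_chance = d_prime(0.52, 0.48)  # detection barely above guessing
detectable = d_prime(0.90, 0.10)   # prime is clearly visible
```

The point of such a measure is that it indexes sensitivity at the time of presentation, unlike post-experiment questioning, which can miss fleeting awareness.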

The irony is that the history of social psychology is filled with robust effects of unrecognized influences on decisions and behavior, cognitive dissonance chief among them. Social psychology has documented the many ways in which we are unaware of the reasons for our beliefs, attitudes, and actions. Nisbett and Wilson's classic critique of think-aloud protocols highlights that dissociation between the factors that influence us and our awareness that those factors are influencing us. But, claims of implicit influences on behavior are different than claims that we can't intuit the mechanisms of mind.

What makes many of the social priming claims interesting, and the reason they garner attention, is that they purportedly occur entirely without awareness. If I told subjects to envision themselves as elderly and then they walked more slowly, that would be interesting, but it wouldn't imply a powerful unconscious influence—it would just be a cool example of induced method acting. Most cognitive psychologists I know reacted to that social priming result by assuming the effects occurred with awareness, making the findings interesting but not as provocative or groundbreaking. They were more traditional social effects in which people were unaware of how their experiences influenced their behavior even if they were aware of the existence of those influences. For some social priming effects, subjects are admittedly aware of the prime but not of its influence. From the perspective of implicit perception, those aren't as controversial.

When the THE authors claim that these influences are entirely outside of awareness, that triggers the same skepticism that arose in light of earlier subliminal perception and persuasion debates. To my knowledge, no study in the social priming literature has applied the criteria identified by Holender to demonstrate that the primes truly were outside of awareness. I haven't seen any papers in that literature discuss the issue of what counts as implicit in light of these debates in the implicit perception world. For example, a target article by Huang and Bargh in Behavioral and Brain Sciences that is now open for peer commentary doesn't cite Holender's paper or any of the other critiques of implicit perception claims.

The latest trend, both in that paper (in a footnote) and in a new paper by Hassin in the current issue of Perspectives on Psychological Science, seems to be to claim that priming effects can be implicit without being implicit. That is, Hassin claims that you can assert unconscious influences without actually showing that the stimuli were processed outside of awareness. In effect, these papers suggest that subjects' inability to attribute their behavior to the prime is all you need in order to claim that the prime subliminally influenced behavior. It's a clever reframing, but it isn't what many of these papers have claimed.

There is a crucial difference between stating that people misattribute the reasons for their behaviors and claiming that the actual reasons are themselves implicit. The first point is uncontroversial. The second is unsupported. To my knowledge, no study in the social priming literature, however you want to define social priming, has met the criteria Holender and others have set out for documenting subliminal persuasion. When the claim is that stimuli that fall outside of awareness influence behavior, the claim is one of subliminal persuasion. And, such claims depend on demonstrating that the primes themselves fall entirely outside of awareness.

My point is that, at least for most of the people in cognitive psychology with interests in implicit perception, the source of skepticism has nothing to do with a vendetta against social psychologists or jealousy about media coverage. Rather, these new findings intruded on a long-standing debate within the literature on implicit perception, one that has seen repeated claims of strong implicit effects on behavior, only to have those claims shot down. Strong claims of powerful implicit influences are nothing new in cognitive (or social) psychology. And the past two decades have seen significant refinements in how we must measure awareness in order to make claims about implicit perception. Social priming research either must adopt those more rigorous controls for awareness or it must avoid claims that these influences are truly implicit. The findings might still be interesting even if the stimuli are not processed implicitly. But demonstrating that an influence is truly implicit has proven much more challenging than you might think.

An aside: It is less clear what to make of one-off failures to replicate some of these effects (as opposed to tests of whether the effects occur outside of awareness). Within cognitive psychology, we have seen examples of failures to replicate claims of implicit perception as well (e.g., Marcel's priming work from the 1980s). Perhaps the replication studies were conducted poorly or lacked adequate manipulation checks. Perhaps they are false negatives or lacked power to find the documented effect. Perhaps the original was a false positive. Without repeated replication using a common, accepted protocol, it's hard to determine what the true size of these effects might be. Individual failures to replicate can tar a finding and a researcher unfairly. That is one reason I have been pushing for multiple independent replications of important findings, all conducted using a shared and vetted protocol. That will provide a better measure of the actual size of an effect.
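As a quick check on why single small studies can't pin down true effect sizes, recall the thought experiment from my earlier post: even when the null hypothesis is exactly true, two samples of 15 drawn from the same population yield a median |d| of roughly 0.25. A short simulation (my sketch; the sample size and repetition count are illustrative choices) shows this:

```python
import random
import statistics

def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_sd = (((na - 1) * statistics.variance(a) +
                  (nb - 1) * statistics.variance(b)) / (na + nb - 2)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

random.seed(1)
n = 15          # per group, as in the thought experiment
reps = 5000
effects = []
for _ in range(reps):
    g1 = [random.gauss(0, 1) for _ in range(n)]
    g2 = [random.gauss(0, 1) for _ in range(n)]  # identical population: null is true
    effects.append(abs(cohens_d(g1, g2)))

median_d = statistics.median(effects)  # lands near 0.25 despite a true d of 0
```

Half of these null-effect samples exceed a "small-to-medium" effect size, which is why accumulating many replications under a shared protocol, rather than trusting any single estimate, is the only way to measure an effect's true size.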

*update* In case you're interested in reading more about the conditions required to claim that a stimulus was processed implicitly, I have co-authored a couple of review pieces. You can get them here and here.

*update 2* - I have written a follow-up post (Link) in which I explore the nature of goal priming: What, exactly, must be implicit in these claims, and are there other theoretical reasons for skepticism?