When researching a study like this, how can a journalist better identify whether later studies have failed to replicate its results, or whether later studies are producing what seems like contradictory evidence?
That's an excellent question. Identifying questionable findings is particularly hard for people who aren't experts in the field themselves. Without being immersed in the controversies, it's tough to grok the state of the field. In some cases, the key is knowing who to ask. In others, it's having a trained BS detector. In still others, it's knowing which questions to ask and when to believe the answers.
One thing a journalist can always do is ask an interviewee if there are any people in the field who disagree with their claims. For any new or controversial claim, some other expert likely will disagree. After all, if a finding makes a theoretical contribution, it presumably adds insight to a debate or controversy of some sort. That doesn't mean that the people who disagree are right or even that it's a real debate (e.g., there are crackpots who disagree with everything, including some who believe the Earth is flat). But if the original author claims that there is no disagreement (without giving any qualifications), they're likely not being entirely forthright, and your skepticism warning system should go to orange status. Scientists should not feel threatened by that question.
Assuming there is a debate, journalists should contact those on the other side—perhaps they have concrete reasons to doubt the finding. Good reporting identifies the scope and limits of a new finding and doesn't just regurgitate the claims reported in a press release or noted by the original authors. Ideally, the authors themselves should identify the scope of their finding and the limits on generalizability, but press releases almost never do. If you do interview others in the field, ask them explicitly if there are reasons someone might be skeptical about the finding or reasons to think generalization to society at large might be unmerited (you can always ask those questions of the original authors as well).
David's question raises a bigger issue: the field has not been good about publishing replication attempts, especially replication failures. That makes it harder for those outside the field (who aren't privy to water-cooler knowledge) to know what's solid and what isn't. Sites like psychfiledrawer.org are trying to compile published failures to replicate for public access (they also track unpublished replication attempts). Failed replications are like mice: if you see one or two, there likely are more that you haven't seen hidden away in the corners of academia. PsychFileDrawer is trying to shine some light on those dusty cabinets.
Another way to check for published replications is to look at who has cited the original paper and look for citing papers that include a replication (successful or not). If you find the original article on scholar.google.com, you can click on the citation count and it will show you a list of all the papers that have cited it since it was first published. (You can also search for the term "replicate" and the names of the original authors.) That will catch some published replication attempts. But, in many cases, there will not have been any published attempts to replicate. Journalists should look both for evidence of successful, direct replication and for evidence of failures to replicate. Journalists can also ask the interviewee if other labs have independently replicated their work (and follow up to make sure). If none have, don't assume that one finding changes the world. Just don't count on the original authors to volunteer failures to replicate on their own.