The ship-buoy bias correction excuse (HadSST3 and ERSSTv4)

In 2011, the Hadley Centre of the UK Met Office replaced its global sea surface temperature series, HadSST2, with a new one, HadSST3. The upgrade was allegedly necessitated by, among other things, the ‘discovery’ that the recorded temperature evolution of the global ocean surface had, since about 1979 and for one particular reason, followed a path that trended artificially low. The official global sea surface temperature data as compiled simply showed too little overall warming since 1978-79, that is, during the satellite era.

This is funny, because the Hadley Centre’s own official global SST dataset, HadSST2, already showed an overall warming since the late 70s that was much larger than the other official datasets out there, like ERSST, Reynolds OI and HadISST:

Animation 1.

Note that the new ERSSTv4 series is also included in Anim.1 (red curve), and that it distinctly supports the group of ‘others’: The light blue HadSST2 curve all of a sudden makes a giant upward leap of nearly 0.1K at the 1997-1998 transition (light green vertical line). There is hardly any divergence to be observed between it and the others, however, either before or after this point (save that from ERSSTv4 post 2005; more on that later …).

Now, where did this ‘extra warming’ in the HadSST2 dataset relative to the other datasets come from? Bob Tisdale found out more than eight years ago. Back in December 2008 he wrote:

I’ve noted this [upward step relative to other SST datasets] in comments on numerous blogs over the past year, but have chosen not to offer an explanation for what appears to be the reason for the difference.


The Hadley Centre changed data sources for SST at 1998. […] This quote is from the Met Office’s Hadley Centre about the HADSST2 data set:
[Content changed since the time Tisdale linked to it.]

“Brief description of the data
The SST data are taken from the International Comprehensive Ocean-Atmosphere Data Set, ICOADS, from 1850 to 1997 and from the NCEP-GTS from 1998 to the present.”

The following ICOADS link will bring you to the following explanation:
[Content changed since the time Tisdale linked to it.]

“The total period of record is currently 1784-May 2007 (Release 2.4), such that the observations and products are drawn from two separate archives […]. ICOADS is supplemented by NCEP Real-time data (1991-date; limited products, NOT FULLY CONSISTENT WITH ICOADS).” (Emphasis added.)

The change of data set also helps explain why HADCRUT3 Global, Northern Hemisphere, and Southern Hemisphere data sets consistently run high since the 1997/98 El Niño when compared to other land and sea surface temperature data sets.

So why was this obvious 1997/98 upward step in the HadSST2 dataset, a clear error of calibration when stitching together SST data from two independent (and “not fully consistent”) sources, never corrected? Why was it never even mentioned or noted as a ‘thing’?

Who knows? But one can’t help but wonder how the people at the UKMO would’ve reacted (and how fast) if the error instead happened to go the other way, creating a sudden artificial downward step across the seam.

For now, simply bear this peculiar case of ‘inattention’ in mind. And let’s go back to find the reasoning behind the HadSST satellite era upgrade (HadSST2 → HadSST3), as per the central paper on this issue: Kennedy et al., 2011. What did it claim as the cause of the long-term cooling bias in the recorded global SSTa since ~1979?

It has been noted a number of times (e.g. Emery et al. [2001]) that ships are biased warm relative to drifting buoys and that this relative bias has not changed significantly over the period 1989-2006 (Reynolds et al. [2010]). As the numbers of drifting buoys has increased over time and the number of ships has decreased, there is likely to be an artificial reduction of the trend in global average temperatures. Therefore, the difference between ships and drifters needs to be factored into the bias calculation.

A database of nearly coincident ship and buoy observations for the period 1998-2007 was created in which ship-buoy pairs were selected that lay within 50km of one another and on the same day. To avoid complications from diurnal heating, only observations taken close to local dawn were used. The average differences were calculated for each ocean basin, and for the globe. The average difference between ship and drifting buoy observations in the period 1998-2007 was 0.12°C, with ships being warmer than drifting buoys.

Figure 1. (Fig.2 from Kennedy et al., 2011.)
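To make the pairing rule in the Kennedy et al. quote concrete, here is a minimal Python sketch of the matching criterion (same day, within 50 km). It is not Kennedy et al.'s code: the record layout `(day, lat, lon, sst)`, the function names and the haversine great-circle distance are assumptions made for illustration, the sample observations are invented, and the local-dawn filter is left out for brevity.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points (degrees)."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def mean_ship_minus_buoy(ships, buoys, max_km=50.0):
    """Mean SST difference (ship - buoy) over all same-day pairs within max_km.

    ships, buoys: lists of hypothetical records (day, lat, lon, sst).
    Returns None if no pair satisfies the criteria.
    """
    diffs = [
        s[3] - b[3]
        for s in ships
        for b in buoys
        if s[0] == b[0] and haversine_km(s[1], s[2], b[1], b[2]) <= max_km
    ]
    return sum(diffs) / len(diffs) if diffs else None


# Invented sample observations, NOT real data.
ships = [("2000-01-01", 0.0, 0.0, 20.12)]
buoys = [
    ("2000-01-01", 0.0, 0.3, 20.00),  # about 33 km away, same day: paired
    ("2000-01-01", 0.0, 5.0, 19.00),  # same day but far too distant
    ("2000-01-02", 0.0, 0.0, 18.00),  # close but on a different day
]
print(mean_ship_minus_buoy(ships, buoys))
```

Only the first buoy qualifies here, so the function returns the single ship-minus-buoy difference for that pair.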

The argument, then, apparently goes like this: The SSTs reported by drifting buoys are on average 0.12 degrees cooler than those reported by ships. The buoys’ share of the total (combined) number of SST measurements globally increased from 0% in 1978/79 to almost 70% at the end of 2006 (Fig.1 above), and to around 90% today (Huang et al., 2015 (linked below); Fig.12 below). Not correcting for this change will therefore lead to an underestimation of the overall rise in global SST (and also, it is claimed, in SSTa) of about 0.08 degrees from 1979 to 2006, and of 0.1-0.11 degrees from 1979 till today.
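The arithmetic behind those two numbers is easy to check. The sketch below assumes the blended global mean is a simple share-weighted average of ship and buoy readings, so that the cold bias relative to an all-ship record is just the offset times the buoy share; that weighting assumption and the function name are illustrative, while the 0.12-degree offset and the 0%, 70% and 90% shares are the figures quoted above.

```python
SHIP_MINUS_BUOY = 0.12  # deg C, ships warmer than drifting buoys (figure quoted above)


def spurious_cooling(buoy_share_start, buoy_share_end, offset=SHIP_MINUS_BUOY):
    """Artificial cooling (deg C) of an uncorrected ship+buoy mean.

    Assumes a share-weighted average: bias = offset * buoy_share, so the
    spurious change is the offset times the change in buoy share.
    """
    return offset * (buoy_share_end - buoy_share_start)


print(round(spurious_cooling(0.0, 0.70), 3))  # 1979 to end of 2006 -> 0.084
print(round(spurious_cooling(0.0, 0.90), 3))  # 1979 to "today"     -> 0.108
```

Under that assumption the numbers land where the argument says they should: roughly 0.08 degrees by 2006 and 0.1-0.11 degrees by today.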

Well, the HadSST2 dataset specifically does NOT contain a correction for this alleged ship-buoy cooling bias, while the HadSST3 specifically DOES. Apparently. Allegedly. This is, after all, one of the stated reasons why the 2011 HadSST update was deemed necessary in the first place.

Which naturally leads us to the obvious question: What did they end up doing, going from v2 to v3? What happened? What do we see? For all intents and purposes … nothing:

Figure 2.

Where’s the upward adjustment? Correcting for the ship-buoy cooling bias. Distinguishing HadSST3 from HadSST2 … It very much appears not to be there at all.

Or is it? Somehow hidden inside that spurious upward shift in the HadSST2 series between 1997 and 1998? Which just happens to be of the same magnitude as the upward adjustment from about 1979 to about 2013 ‘required’ by Kennedy’s ‘discovered’ cooling bias …

Did they simply leave the artificial warming in HadSST2 untouched, even though it stems from a clear calibration error that quite frankly should have been corrected, and justify keeping it in place on the grounds that a new-found (and apparently ‘needed’) upward adjustment from some completely different and totally unrelated cause would cover for it anyway?

Here’s what the above comparison would look like if that spurious HadSST2 1997-98 warming step had in fact been appropriately amended:

Figure 3.
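An amendment of this kind presumably amounts to nothing more than subtracting a constant offset from the series from the seam onward. A minimal sketch, assuming monthly `(date, anomaly)` pairs and a step of 0.1 K (the text above says ‘nearly 0.1K’); the function name is hypothetical and the sample values are invented placeholders, not HadSST2 data:

```python
from datetime import date


def remove_step(series, break_date, step):
    """Return a copy of (date, value) pairs with `step` subtracted
    from every value dated on or after `break_date`."""
    return [(d, v - step if d >= break_date else v) for d, v in series]


# Invented placeholder anomalies around the 1997-98 seam, NOT real data.
raw = [
    (date(1997, 11, 1), 0.30),
    (date(1997, 12, 1), 0.32),
    (date(1998, 1, 1), 0.45),
    (date(1998, 2, 1), 0.47),
]
adjusted = remove_step(raw, break_date=date(1998, 1, 1), step=0.1)
for d, v in adjusted:
    print(d.isoformat(), round(v, 2))
```

Values before the seam are left exactly as they were; only the post-seam segment is shifted down as a block.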

This, however, never happened. Had they now found a way to avoid making this correction altogether, a neat way of circumventing the whole (and what would’ve been a rather embarrassing) issue?

Maybe so. Now, though, they have a new and different problem:

Either they will have to explain Fig.2 above: why is there no apparent upward ship-buoy bias adjustment? Or they will have to explain Fig.3: why does more or less the entire upward ship-buoy bias adjustment arrive in ONE sudden step alone?

People should’ve asked the UKMO these two simple questions. And forced them to provide an answer. Their peers should’ve asked them. Other (third-party) ‘experts’ in the field.

Yet no one did. And no one has.

How come …?

Now, let’s keep the ‘old’ (and appropriately down-adjusted) HadSST2 series from Fig.3 and compare it with the other global SST datasets. First, the HadISST1+OIv2 series, the one used by GISS before they switched to ERSSTv3b:

Figure 4.

By most measures, a very good match indeed. How about the ERSSTv3b, the ‘old’ ERSST series?

Figure 5.

Again the overall match is impressive. Finally, HadSST2 compared with the new ERSSTv4 series, which allegedly carries the very same ship-buoy bias correction as does HadSST3 (Huang et al., 2015); more on that later …:

Figure 6.

Now, isn’t this remarkable? Why would the ERSSTv4 dataset agree much better with the low-tracing (down-adjusted) and non-bias-corrected version of the HadSST series (v2, above) than with the high-tracing, bias-corrected one (v3, below), especially and most notably along the crucial section between 1974 and 2006?

Figure 7.

Here’s what Huang et al. have to say about this issue:

Ship-buoy SST adjustment [Section 5c.]

In addition to the ship SST bias adjustment, the drifting and moored buoy SSTs in ERSST.v4 are adjusted toward ship SSTs, which was not done in ERSST.v3b. Since 1980 the global marine observations have gone from a mix of roughly 10% buoys and 90% ship-based measurements to 90% buoys and 10% ship measurements (Kennedy et al. 2011). Several papers have highlighted, using a variety of methods, differences in the random biases, and a systematic difference between ship-based and buoy-based measurements, with buoy observations systematically cooler than ship observations (Reynolds et al. 2002, 2010; Kent et al. 2010; among others). Here the adjustment is determined by 1) calculating the collocated ship-buoy SST difference over the global ocean from 1982 to 2012, 2) calculating the global areal weighted average of ship-buoy SST difference, 3) applying a 12-month running filter to the global averaged ship-buoy SST difference, and 4) evaluating the mean difference and its STD of ship-buoy SSTs based on the data from 1990 to 2012 (the data are noisy before 1990 due to sparse buoy observations). The mean difference of ship-buoy data between 1990 and 2012 is 0.12°C with a STD of 0.04°C (all rounded to hundredths in precision). The mean difference of 0.12°C is at the lower end of published values of 0.12° to 0.18°C (e.g., Reynolds et al. 2002, 2010; Kent et al. 2010). Although buoy SSTs are generally more homogeneous than ship SSTs, they are adjusted here because otherwise it would be necessary to adjust ship SSTs before 1980 when there were no or very few buoys. As expected, the global averaged SSTA trends between 1901 and 2012 (refer to Table 2) are the same whether buoy SSTs are adjusted to ship SSTs or the reverse. However, the global mean SST is 0.06°C warmer after 1980 in ERSST.v4 because of the buoy adjustments (not shown) and there are therefore impacts on the long-term trends compared to applying no adjustment to account for the change in observational platforms.

(Emphasis added.)

So, very much the same reasoning behind the claimed need for an upward adjustment post 1979 as in Kennedy et al., 2011.
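Steps 3 and 4 of the procedure quoted from Huang et al. (a 12-month running filter, then the mean and STD of the filtered difference) can be sketched in a few lines of Python. This is an illustration only: the filter here is a plain boxcar over each full 12-month window, the STD is the sample standard deviation, and the input is a synthetic series. None of those choices is confirmed by the paper.

```python
from statistics import mean, stdev


def running_mean(values, window=12):
    """Boxcar average over every full window of `window` consecutive values."""
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def summarize_difference(monthly_diff, window=12):
    """Mean and sample STD of the running-mean-filtered ship-buoy difference."""
    smoothed = running_mean(monthly_diff, window)
    return mean(smoothed), stdev(smoothed)


# Synthetic monthly ship-minus-buoy differences (deg C), NOT real data:
# a constant 0.12 offset plus an alternating +/-0.05 wiggle.
synthetic = [0.12 + 0.05 * (-1) ** i for i in range(36)]
m, s = summarize_difference(synthetic)
print(round(m, 2), round(s, 2))
```

With this synthetic input the wiggle cancels inside every 12-month window, so the filtered series is flat at the underlying offset. Real collocated data would of course leave a non-zero spread.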

We’ll get back to that 0.06 degrees of extra warming after 1980 in the ERSSTv4 dataset in a bit.

But first, what’s perhaps even more bizarre when comparing HadSST3 and ERSSTv4 is this: The ERSSTv4 dataset is apparently, once again according to Huang et al., validated against (tuned to fit) the HadNMAT2 series, a Hadley Centre product of night marine air temps. So why does ERSSTv4 agree much better with the HadNMAT2 data (Fig.8) than does the Hadley Centre’s own HadSST3 series (Fig.9)?

Figure 8.

Figure 9. !!!

And why does a non-ship/buoy bias corrected (satellite based) SST dataset like NOAA (Reynolds) OIv2 (Fig.4 above) appear to agree better than either of them?

Figure 10.

Could it be that the claim of ‘necessity’ of the so-called ship-buoy bias adjustment is ultimately unfounded? An interesting hypothesis that nevertheless proves invalid in the face of real-world observations?

There are simply too many strange things going on here …

The story behind the ERSSTv4 dataset is almost weirder than the HadSST3 one. If there were indeed a need to adjust the averaged global SSTa up over time from ~1979 because of a temperature bias between ship and buoy measurements, then how come, if we subtract the unadjusted ERSSTv3b from the adjusted ERSSTv4 series, we get the following difference?

Figure 11.

What on earth is going on between 1979 and 2006!? Here’s how the buoy share increased over that time frame:

Figure 12. Adapted from Kennedy et al., 2011.

Quite steadily, from a mere 2.5 % in 1982 to a full two thirds (67 %) at the end of 2006.

So how come the overall adjustment, going from ERSSTv3b to v4, is DOWN from 1979 (or 1982) to 2006? Shouldn’t it go distinctly UP? Why wait all the way until ~2004 before even moving in an upward direction? And why is the overall adjustment since 1979 only going positive beyond 2009 …!?

To reiterate: What in the world is going on here!?

And also, why that huge upward adjustment between 1976 and 1978/79, just before the earliest serious deployment of the temperature measuring buoys …!?

This is where we’re getting to the crux of the matter.

Let’s compare the new ERSSTv4 series, not with the old 3b version (as in Fig.11), but rather with the already introduced HadISST1+OIv2 series (Fig.4):

Animation 2.

Remember that statement from the Huang paper (quote above): “(…) the global mean SST is 0.06°C warmer after 1980 in ERSST.v4 because of the buoy adjustments (…)”

This stated ‘extra warming’, allegedly amounting to +0.06K since 1980, is obviously meant to be relative to the unadjusted ERSSTv3b series. However, when looking at Fig.11 above, it would seem that the extra warming in ERSSTv4 compared to ERSSTv3b is a mere 0.04 degrees. And that’s all the way down to the end of 2015. Just see for yourself.

If you take a look at the GIF animation above, though, you will immediately discover the following: The two curves follow each other to near perfection, first from late 1976 to early 2006, and then again from mid 2006 till today. Near complete agreement all the way since The Great Pacific Climate Shift, except that ONE sudden shift occurring in the first half of 2006. The ERSSTv4 curve abruptly lifts about 0.05-0.06 degrees relative to the OIv2 curve. All in one step. Nothing before. Nothing after.
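The size of a step like this can be estimated by differencing the two series and comparing the mean difference on either side of the break. A minimal sketch, assuming two equally long, aligned monthly anomaly lists and a known break position; the function name is hypothetical and the numbers are invented to show the mechanics, they are not ERSSTv4 or OIv2 values:

```python
from statistics import mean


def step_size(series_a, series_b, break_index):
    """Mean of (a - b) from break_index onward minus mean of (a - b) before it."""
    diff = [a - b for a, b in zip(series_a, series_b)]
    return mean(diff[break_index:]) - mean(diff[:break_index])


# Invented anomalies, NOT real data: series_a equals series_b
# except for a constant +0.06 lift from index 3 onward.
series_b = [0.10, 0.20, 0.15, 0.30, 0.25, 0.20]
series_a = [0.10, 0.20, 0.15, 0.36, 0.31, 0.26]
print(round(step_size(series_a, series_b, break_index=3), 3))  # -> 0.06
```

If the two curves really agree everywhere except across one break, moving `break_index` away from the true break only dilutes the estimate; it does not create a step where there is none.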

So how do we explain this sudden upward shift? And how does it relate to that claimed +0.06K worth of extra warming from 1980 in the ERSSTv4 dataset due to the stated implementation of the ship-buoy bias adjustment?

How come it’s apparently expressed relative to the OIv2 dataset rather than to the ERSSTv3b dataset? And why is it all seemingly contained within that ONE sudden step alone?

And, maybe most importantly, is the reality of such a step justified by other relevant datasets?

The answer to the final question is simple: Not at all. Quite the contrary:

HadNMAT2 and HadSST3 both specifically don’t support it. Satellite datasets (including ARC/AATSR, TMI, AMSR-E and AVHRR (OIv2 SSTa); UAH, RSS and NOAA/STAR (TLT, TMT, TTT); CERES (OLR at ToA)) don’t support it. ARGO doesn’t support it. (See the Addendum below.) Which means it’s an obvious methodological artefact, and nothing else.

So why? Why is it there?

One has to ask: Was it put there specifically to … “bust ‘The Pause’”? Its particular location smack dab in the middle of it seems just a bit too perfect and too convenient to be a mere coincidence …

Or am I being a tad too conspiratorially inclined here? 😉


Some plots to visualise how we can definitely tell that the ERSSTv4 adjustment apparently “busting ‘The Pause’” (1997/1998 – 2013/2014/2015) has no basis in the REAL world; it is fully and only a construction.

First, ARGO.

Some context. Here’s NOAA’s global mean upper ocean temp anomaly (0-100m) vs. the satellite-based (AVHRR) NOAA Reynolds OIv2 SSTa (from Nov’81) dataset, extended back to Jan’70 by the use of the HadISST1.1 dataset:

Animation 3. There’s an obvious step change occurring in late 2003 in the (otherwise pretty tight) correlation between the two datasets plotted here (what happens before ~1975, I don’t know, but that’s another story anyway).

It was just during the transition from 2003 to 2004 that the ARGO network finally reached a level of such maturity that its overall coverage could be said to have become practically global, and at the same time when the weight of ocean temp measurements making up the global mean record decidedly shifted from XBTs to ARGO buoys:

Figure 13.

For all intents and purposes, 2003-2004 (the orange vertical line in Fig.13 above) marks the actual start of the “ARGO era”. From this point on, NOAA’s ocean temp record basically coincides with that of the ARGO network (and with the step change in Anim.3 above):

Figure 14. NOAA mean ocean temperature, 0-100m (black), overlaying pure ARGO surface temps, 0m (gray); baseline: 2005-2014; 60N-60S.

It’s time, then, to bring in the ERSSTv4 dataset. Remember the 2006 step change between it and the NOAA Reynolds OIv2 dataset in Anim.2 above. That’s the “Pause-busting” adjustment right there. But how does it hold up against ARGO?

Animation 3. The red vertical line marks the conspicuous break between the OIv2 and the ERSSTv4 datasets (see Anim.2). They’re close to equal both before and after, the entire difference from 2003 to 2016 (all the way from 1976/77 to 2017, in fact) contained within that one sudden step alone. And as you can see, there is no justification for the ERSSTv4 “Pause-busting” (upward) adjustment to be found in ARGO, which simply tracks OIv2.

What about the satellites? We’ve already compared the ERSSTv4 with the OIv2, which is based on readings from the AVHRR instruments (broadband IR radiometers, like the ERBE and CERES instruments). OIv2 SSTa distinctly disagrees, just like ARGO, with the sudden 2006 +0.06K step in the ERSSTv4 dataset (Animations 2 and 3 above). The ARC (“ATSR Reprocessing for Climate”) project utilises measurements from another set of IR radiometers, the ATSR and AATSR instruments. Here’s how the combined ARC dataset (red) lines up with HadSST3 (black) and OIv2 (yellow) from 1997 to 2010:

Figure 15. As you can see, there is no visible step up in the red curve (ARC) relative to the yellow and black ones within the green ellipse (2004-2008).

The ARC dataset (ATSR, AATSR) in fact agrees reassuringly well with the AVHRR SSTa, at least post 1996-97, both based on IR satellite readings. Let’s see, then, how the IR datasets stack up with similar data based on microwave-reading satellite instruments like TMI and AMSR (ARC is actually validated against AMSR). We’ll let OIv2 (AVHRR) represent the IR SSTa datasets. Here’s TMI:

Figure 16. 38°N–38°S; red curve: TMI; yellow curve: OIv2; black curve: ERSSTv4.


Figure 17. Global (~60°N–60°S); blue curve: AMSR; yellow curve: OIv2; red curve: ERSSTv4.

As is evident from the two figures above, the AVHRR IR satellite readings (OIv2) agree almost perfectly with both TMI and AMSR (both microwave satellite readings), while the ERSSTv4 has a distinct (and now more and more familiar) lift relative to both TMI and AMSR occurring around 2006-2007.

We’ve already touched on the next issue, the conspicuous discrepancy between two SST datasets that are supposedly adjusted according to the very same ‘discovered’ ship-buoy bias, apparently ‘necessitating’ an upward tilt of the overall SSTa trend from 1979-80. Strange, then, how the two teams in question (behind the HadSST3 and the ERSSTv4, respectively) seem to have chosen completely different routes towards achieving it (Fig.7). While the main goal of the folks at the Hadley Centre (Kennedy et al.) appears to be finding a plausible-sounding excuse for NOT correcting that obvious 1997-1998 spurious upward shift in the ‘old’ HadSST2 dataset, the main goal of NOAA (Huang et al.) rather appears to be “busting ‘The Pause’”. This becomes all too evident when comparing the two directly (Fig.7).

Below, ERSSTv4 (red) is superimposed on HadSST3 (green) from 1997 to 2016. Watch what happens inside the golden circle covering the 2005-2009 period:

Figure 18. Note how in the ERSSTv4 dataset the 1998 peak is made distinctly lower than the 2009-2010 peak. We see this same conspicuous divergence when comparing ERSSTv4 with the Hadley Centre’s HadNMAT2, a dataset that Huang et al. plainly stated as an important source of validation for their own SSTa series. So what’s going on inside the green rectangle below?

Figure 19.

We can go on. Let’s check with the troposphere-gauging satellite products from providers such as RSS, UAH and NOAA/STAR, all based on readings from microwave-sounding instruments. Here’s the oceanic part of the new RSSv4.0 TTT dataset (aquamarine) compared with OIv2, AVHRR (black) and ERSSTv4 (red). Same obvious divergence spotted, occurring in … you guessed it! 2006:

Figure 20.

The amount of total precipitable water in the atmospheric column (TPW) is to a very high degree a function of sea surface temperatures. Does the global TPW data, then, somehow corroborate that 2006 ERSSTv4 upward shift? Not at all. It rather very much confirms the OIv2 version of the story (just like everything else …):

Figure 21.

Furthermore, RSS, UAH and NOAA/STAR all agree that there is absolutely no tropospheric step up occurring in 2006-2008 relative to OIv2:

Figure 22. Note that the three TMT datasets here are all ocean+land, not ocean only like the RSSv4.0 TTT curve in Fig.20. The real divergence, though, still happens only in late 2013 and into 2014, with the firm establishment of “The Blob” phenomenon in the NE Pacific, one that is naturally much less pronounced in Fig.20.

Finally, even the ToA radiation flux data (CERES OLR) fully agrees with OIv2 (as already seen, strongly corroborated by the troposphere-gauging satellite datasets) that there is absolutely no step up when going from the 2005-2007 to the 2012-2013 segments, on either side of the 2007-2012 La Niña-El Niño-La Niña sequence:

Figure 23.

2 comments on “The ship-buoy bias correction excuse (HadSST3 and ERSSTv4)”

  1. oz4caster says:

    Interesting comparisons that leave plenty of room for doubt about accuracy. Did you see the WUWT post on Zeke Hausfather’s analysis that supposedly vindicates ERSSTv4 and do you have any thoughts on Zeke’s analysis?

    • okulaer says:

      Thanks 🙂

      Of course I’ve seen his “analysis”. That’s basically what prompted me to go through the data myself.

      Hausfather is a staunch, policy-driven greenie (he basically admits this himself). He has an agenda. Hence, I have absolutely zero trust in any of his “analyses”. He’s simply not a neutral party in this. This “study” in particular comes off very much as a snow job and nothing else.

      He has clearly constructed his “datasets” himself. For a specific purpose: To “vindicate” NOAA’s upward SST adjustments, “busting ‘The Pause’”. His data, however, bears no resemblance to anything else, notably not to all the official data from various sources that I’ve looked at (importantly, including ARGO buoy and satellite series (IR and MW)), which all point in the exact same direction: There is NO upward shift in the mean temp post 1998/2000.

      One funny detail is that Hausfather’s own “buoy data” agrees well with the OIv2 dataset across the ERSSTv4 2006 upward step. Hausfather rather accomplishes the “Pause bust” by creating a spurious +0.06K (sound familiar?) step change (not seen anywhere else) in ~2002 (light green vertical line):

      It’s all sooo predictable …
