The case of the disappearing Statistics Canada data

The problem of missing and incomplete data at Statistics Canada goes far deeper than that botched labour report

(Shutterstock)

The Great Statistics Canada 200-Jobs Mystery is generating loads of headlines, as it should. The botched labour report for July, which initially and erroneously claimed Canada produced just 200 jobs that month, has once again sparked questions about the quality of Canada’s statistical data. (Revised figures are due Friday.)

But this is far from the only thing troubling regular StatsCan users. I made the following chart to illustrate one of the great frustrations that journalists, economists and academics have with StatsCan. One minute, the agency, tasked with measuring the tick-tock of the economy and society, tracks seemingly vital data (such as detailed breakdowns of public sector employment and wages by all levels of government, or the total value of government transfer payments to persons by province and type of transfer); the next, *poof*, they’re terminated.

Good thing StatsCan still tracks the square footage of fungi production.

[Chart: Of mushrooms and government employment]

Perhaps, buried somewhere deep in CANSIM—the agency’s socioeconomic database—there exist new data series that pick up where the above leave off, but I can’t find them.

Last year, Stephen Gordon railed against StatsCan’s attention deficit disorder, and its habit of arbitrarily terminating long-standing series and replacing them with new data that are not easily comparable.

For what appears to be no reason whatsoever, StatsCan has taken a data table that went back to 1991 and split it up into two tables that span 1991-2001 and 2001-present. Even worse, the older data have been tossed into the vast and rapidly expanding swamp of terminated data tables that threatens to swallow the entire CANSIM site. A few months ago, someone looking for SEPH wage data would get the whole series. Now, you’ll get data going back to 2001 and have to already know (StatsCan won’t tell you) that there are older data hidden behind the “Beware of the Leopard” sign.

Statistics Canada must be the only statistical agency in the world where the average length of a data series gets shorter with the passage of time. Its habit of killing off time series, replacing them with new, “improved” definitions and not revising the old numbers is a continual source of frustration to Canadian macroeconomists.

Others are keeping tabs on the vanishing data. The Canadian Social Research Newsletter for March 2 referred to the cuts as the CANSIM Crash Diet and tallied some of the terminations:

For the category “Aboriginal peoples” : 4 tables terminated out of a total of 7
For the category “Children and youth” : 89 tables terminated out of a total of 130
For the category “Families, households and housing” : 67 tables terminated out of a total of 112
For the category “Government” : 62 tables terminated out of a total of 141
For the category “Income, pensions, spending and wealth” : 41 tables terminated out of a total of 167
For the category “Seniors” : 13 tables terminated out of a total of 30

As far as Statistics Canada’s troubles go, this will never get the same level of attention as the mystery of the 200 jobs. But, as it relates to the long-term reliability of Canadian data, it’s just as serious.





  1. Well we knew this would eventually happen when Harp first started playing silly buggers with the census.

  2. Troubles lapping, competent, safe, stable, mumble, mumble ..

  3. The CONServatives knew that they had to destroy StatsCan’s credibility before it blew their plans to publish whatever they wanted Canadians to believe… After all, you can’t have someone discounting the CONServatives’ obscure truthiness by researching & publishing the actual numbers.

  4. As anyone who’s ever taken a basic stats course knows, the larger the sample size, the more reliable the data. It’s all about confidence (remember “confidence levels”?), and Harper’s government has stripped that away, starting with the census fiasco, and now this. The census, for one, is the most fundamental exercise in providing all manner of researchers with the data they need for planning at all levels. It’s all part of Harp’s anti-science campaign, I think. Scientists I know in the Federal public service don’t even want to call themselves that, for fear it will draw attention to themselves, thereby placing their employment in jeopardy (half-joking only, but you see what I mean…)

  5. “Square footage of mushroom production Data series ongoing”
    …my my, and all this time Harp’s been trying to scandalize JT and POT ?!

    The truth is always buried in the facts and numbers, Harp just wants to make sure that his and his CON’s, are “really” buried deep in the bowels of confusion, so as to never arouse “Public Interest”, or “Suspicions”.
    -oops?
    ;)

  6. You forgot to mention the data that has been lost to researchers of all types when the long form census was cancelled. This will be tragic to long-term serious and accurate planning.

  7. The Canadian Library Association urges the government to return Statistics Canada to its status as one of the world’s most respected National Statistical agencies by restoring its funding and the long-form census. The CLA urges the government to provide Statistics Canada with the support it needs to collect, analyze, and publish data that has proven, longstanding value for decision-makers, communities, and Canadians alike.
    http://www.cla.ca/AM/Template.cfm?Section=Home&TEMPLATE=/CM/ContentDisplay.cfm&CONTENTID=15669
