So dreary, lonely, desolate and dark. And cold. A bitter, stark landscape where no one goes, and my poor data subsists on cabbage water and heels of bread. The Gulag Biomedico. Nasty place, this.
We all know about Supplemental Siberia, and we know it's a problem, but we just don't know what to do about it. (By `we', I mean those of us publishing in the fields of biomedical research. If you are reading this and you haven't or don't plan to someday do such publishing, and you are reading this for fun, it might be a good time to re-evaluate your meaning of `fun'.) And for those of you who have not yet had an opportunity to publish and experience this desolate demesne where data goes to die, here's the thing. Once upon a time, long, long ago – when papers were actually on paper and `download' meant taking a volume from a shelf (as in, `Please, could you help me and download that volume 72 of the Journal of Lots of Molecules, I can't reach it and you're such a tall and strong human'. Hey, I'm an insectivore, remember? I had to kiss up) – we didn't actually show all of our data. There was only so much space that could be devoted to figures and text, and often we would make a statement and would say `data not shown', suggesting that we had done it and you should just trust us. But everyone knows that what this phrase really meant was one of two things: (a) data not showable, or (b) a nice result that we're saving for another paper (see supplementary material Fig. S1).
And then came the revolution, and we reckoned it was a good thing (and it really was). Our papers started coming out online, with ever-better search engines and opportunities to show things you couldn't show on paper, like cool movies. And we didn't have to spend hours and hours in sterile libraries poring over lists of subject indices, tomes of citation reports and weekly compendia of tables of contents (see supplementary material Fig. S2). We really did this, and before copy machines, we even had to actually read and take notes. Yes, I'm that old.
Then, after about ten years or so (maybe less), somebody had a very practical idea. Since our papers were coming out online, we didn't actually have to have `data not shown' (which I'll call DNS to save time, because as we'll see, it's all about time). If it is indeed data showable, it would be useful to actually show it, as a supplementary figure. That way, any interested reader with access to a computer and the ubiquitous World Wide Web could have a look. That's a good thing, right?
Before I go on, I can't resist sharing something that the brilliant actor, novelist, comedian, poet, raconteur and all-round brilliant person Stephen Fry noted in a podcast (yes, he even does podcasts – I'm so jealous of really brilliant people). Saying `World Wide Web' takes fewer syllables than saying `www' (see supplementary material Fig. S3), yet we persist in reading out the latter lest our colleagues think they should actually type `World Wide Web' to access a URL. I can't even remember what URL stands for, because saying `URL' saves time, and it's really all about time. He suggests saying `wuh wuh wuh'. Because, yes, time. But I digress. (But not really, as we'll see.)
So we introduced supplementary figures (and tables) to replace our DNS. Now, if we actually write `data not shown' (however timesaving `DNS' might be, only we enlightened few know what that means), someone, most likely a reviewer, will tell us that it is absolutely essential that the reader should see it, that we must have it available for inspection, thus eliminating all of our conclusions that are based on data not showable. That's a good thing.
But, as most of us have learned, we had opened Pandora's Box (see supplementary material Fig. S4). Now, because any amount of data can be associated with a paper, we may find it imperative to provide anything in a paper that has been deemed `necessary'. Reviewers and editors are empowered to ask for almost anything they can imagine. Gedanken experiments that are dreamed up on the fly condemn researchers to weeks of additional work that produces incremental bits of information to be relegated to Supplemental Siberia. If we refuse, we will very likely risk finding our paper degraded to an ever more `specialized' journal. We dread the response: `Thank you for resubmitting your manuscript. In light of the disappointing assessment by reviewers, we suggest you consider submitting your now extensively revised paper, which we must have once thought sort of interesting, to the Journal of Experiments that Time Forgot (J. Exp. Time Forg.)'. So supplement we must.
So the ether fills with quasi-accessible but unsearchable tidbits of experimental findings. I've looked for record numbers of supplementary figures, and I've found one small paper with 56 supplementary figures, but I'm sure that I've not hit the limit yet (see supplementary material Fig. S5). (Really, you want to look at this one.)
Is this really the best use of our efforts, resources and, yes, time? What can we do? Hey, this is the Mole you're talking to (see supplementary material Fig. S6) – of course I have a suggestion. Stick around, we ain't out of the wilderness yet.
Supplementary information
Fig. S1. How it was.
Fig. S3. Actual data.