First draft of just-published Value all Research Products

The copyright transfer agreement (arg) I signed for the Comment in Nature included restrictions on where I may post a copy of the article:

Although ownership of all Rights in the Contribution is transferred to NPG, NPG hereby grants to the Authors a licence [...]
c) To post a copy of the Contribution as accepted for publication after peer review (in Word or Tex format) on the Authors’ own web site, or the Authors’ institutional repository, or the Authors’ funding body’s archive, six months after publication of the printed or online edition of the Journal, provided that they also link to the Journal article on NPG’s web site (eg through the DOI).

The article is available for free for a week or two on Nature’s site, and I’ll post the text here as soon as I can, six months from now.

In the meantime, as per contract lingo above, I may post the first draft that I sent the Nature editors.  So here is the first draft, for the benefit of those who are looking for a free version in the first half of 2013, and for anyone who cares to compare the first draft to the final draft :)   [Hint: there were MANY rounds of editing.  more on that in next post.... ]

NSF policy welcomes alt-products, increases need for altmetrics

(or perhaps NSF welcomes bragging about software, datasets in proposals)

Research datasets and software no longer have to masquerade as research papers to get respect.  Thanks to an imminent policy change at the NSF, non-traditional research products will soon be considered first-class scholarly products in their own right, and worth bragging about.  This policy change will prove a key incentive to produce and disseminate alternative products, and have far-reaching consequences in how we assess research impact.

Starting January 14th, the NSF will begin to ask Principal Investigators to list their research Products rather than Publications in the Biosketch section of funding proposals.  Datasets and software are explicitly mentioned as acceptable products in the new policy, on par with research articles.

The policy update reflects a general increase in attention to alternative forms of scholarly communication.  Policies, repositories, tools, and best practices are emerging to support an anticipated increase in dataset publication, spurred, in part, by now-required NSF data management plans.  Tools for literate programming, reproducible research, and workflow documentation continue to improve, highlighting the need for shared software.  Open peer review, online lab notebooks, post-publication discussion — as it gets easier to “publish” a wide variety of material online it becomes easy to recognize the breadth of our intellectual contributions.

I believe in the long run this policy change from Publications to Products will do much more than just reward an investigator who has authored a popular statistics package.  It is going to change the game, because it is going to change how we assess research impact.

The change starts by welcoming alternative products.  The new policy welcomes datasets, software, and other research output types in the same breath as publications: “Acceptable products must be citable and accessible including but not limited to publications, data sets, software, patents, and copyrights. Unacceptable products are unpublished documents not yet submitted for publication, invited lectures, and additional lists of products.”  In contrast, previous versions of the Biosketch instructions allowed fewer types of acceptable products (“Patents, copyrights and software systems”) and considered their inclusion to be a “substitution” for the main task of listing research paper publications.

The next step will become apparent when we consider what peer reviewers will want to know when they see these alternative products in a Biosketch.  What is this research product?  Is it any good?  What is the size and type of its contribution?  We often assess the quality and impact of a traditional research paper based on the reputation of the journal that published it.  In fact the UK Engineering and Physical Sciences Research Council makes this clear in its fellowship application instructions: “You should include a paragraph at the beginning of your publication list to indicate … Which journals and conferences are highly rated in your field, highlighting where they occur in your own list.”

Including alternative products will change this: it necessitates a move away from assessment based on journal title and impact factor ranking.  Data and software can’t be evaluated with a journal impact factor.  Repositories seldom select entries based on anticipated impact, they don’t have an impact factor, and we surely don’t want to calculate one and so propagate the poor practice of judging the impact of an item by the impact of its container.  For alternative products, item-level metrics are going to be key evidence for convincing grant reviewers that a product has made a difference.  The appropriate metrics will be more than just citations in research articles: because alternative products often make an impact in ways that aren’t fully captured by established attribution mechanisms, alternative metrics (altmetrics) will be useful to get a full picture of how research products have influenced conversation, thought, and behaviour.

The ball will bounce further.  Once altmetrics and item-level metrics become expected evidence to help assess the impact of alternative products, the use of item-level altmetrics will bounce back to empower innovations in the publication of traditional research articles.  Starting a new or innovative journal is risky: many authors are hesitant to publish their best work somewhere unusual, somewhere without a sky-high impact factor.  When research is evaluated based on its individual post-publication reception, innovative journals become attractive, perhaps competitively more attractive than staid, established, run-of-the-mill alternatives.  Reward for innovative journals will result in more innovations in publishing.  Heady stuff!

A few large leaps are needed to realize this future, of course.  First, this one policy change hardly represents a consistent message across the NSF.  Accomplishment-Based Renewals are still based on “six reprints of publications”, with no mention of alternative products.  Even in the Grant Proposal Guide, the same document that houses the new Products policy, the instructions for the References Cited section are written as if only research articles would be cited in a grant proposal.  What about preliminary data on figshare, or supporting software on RunMyCode, or a BioStar Q&A solution, or a patent, or a blog post, or, for that matter, an insightful tweet?  If we think these products are potentially valuable, the NSF should welcome and encourage their citation anywhere it might be relevant.

The second hurdle is that a policy welcoming the recognition of alternative products is not yet common outside the NSF.  A brief investigation suggests that many other funders, including the NIH, HHMI, Sloan, and UK MRC, still explicitly ask for a list of research papers rather than products.  A few, like the Wellcome Trust and UK BBSRC, just seem to ask broadly for a CV, leaving the decision about its contents to the investigator.  This could be good, but because investigators are not used to considering alternative products to be first-class citizens, explicit welcoming is important to drive change.

The third challenge between us and a new future brings us to an exciting area under active development.  When products without journal title touchpoints start appearing in Biosketches, how will reviewers know if they should be impressed?  Reviewers can (and should!) investigate each research product itself and evaluate it with their own domain expertise.  But what if an object is in an area outside their expertise?  They need a way to tap into the opinion of experts in that domain.  Furthermore, beyond the intrinsic quality of the work, how will reviewers know if the Intellectual Merit has indeed been impactful on scholarship and the world, and thus should lend credence to the proposal under consideration?

Many data and software repositories keep track of citations and download statistics.  Some repositories, like ICPSR, go a step further and provide anonymous demographic breakdowns of usage to help us move beyond “more is better” to an understanding of the flavour of the attention.  This context will become richer as more types of engagement are added:  is the dataset being bookmarked for future use?  Who is cloning and building on the open software code?  Are blog posts being written about the contribution?  Who is writing them and what do they say?

Tools are available today to collect and display this evidence of impact.  Thomson Reuters’ Data Citation Index aggregates citations to datasets that have been identified by data repositories. identifies blog posts, tweets, and mainstream media attention for datasets with a DOI or handle: try it out using their bookmarklet.  The nonprofit organization ImpactStory tracks the impact of datasets, software, and other products, including blog and Twitter commentary, download statistics, and attribution in the full text of articles: give it a try.  I’m a cofounder of ImpactStory: we as scientists need to go beyond writing editorials on evaluation and actually start building the next generation of scholarly communication infrastructure.  We need to create business models for infrastructure that support open dissemination of actionable, accessible, and auditable metrics for research and reuse.

Finally, the shift in practice toward valuing broad impact will be more rapid and smooth if funders and institutions explicitly welcome broad evidence of impact.  Principal investigators should be tasked with making the case that their research has been impactful.  Most funders, including the NSF, do not currently ask for evidence of impact.  This may be changing: the NIH issued an RFI earlier this year on Biosketch changes that would include documenting significance.  In the meantime, the lack of an explicit welcome hasn’t stopped cutting-edge investigators from augmenting their free-form CVs and annual reviews to mention that their work has been “highly accessed” or received an F1000 review.  This, along with next-generation evidence with context, should be explicitly welcomed.

Despite these hurdles, the future is not far away.  You and I can start now.  Create research products, publish them in their natural form without shoehorning everything to look like an article, make citation information clear, track impact, and highlight diverse contributions when we brag about our research.  We’re on our way to a more useful and nimble scholarly communication system.