Code strengthening - modularising, APIs and wrapping

Attendees

  • Pascal Ekin – The University of Manchester
  • Mike Jackson – The University of Edinburgh (scribe)
  • Tobias Schiebeck – The University of Manchester
  • Claire Sloggett – Intersect
  • Aaron Turner – The University of York

Overview

This session discussed the lack of uptake and exploitation of research code, and how this situation might be improved.

Reasons for lack of uptake

Researchers don’t productise code. Productising code isn't regarded as research, and code isn't citable as a research artefact. Code may be rapidly developed, run, discarded and rewritten. However, even basics such as API documentation and comments are neglected.

A lack of comments and documentation can be a blocker, both to productisation (what does the code do?) and to reuse (what does it do? how can I use it?).

Researchers don't publish code, often because they view it as just a proof of concept. They might be embarrassed by its perceived poor quality, or be under the impression that publishing is not worth the hassle as no-one will be interested.

All code decays, but research code seems to decay faster because it often isn’t stable in the first place, especially as it's often layered on other unstable code, e.g. e-science middleware.

Research councils don’t fund productisation/hardening of code.

There is a lack of tools to enable researchers to reuse components, e.g. to drag-and-drop e-science components, glue these together and so focus on the business logic.

Combat decay

One way to combat decay is via virtualisation. This allows anyone to have access to the particular operating system and tools that are needed to run the code. However, there can be a subsequent reluctance to risk changing a black box that works and so possible gains in efficiency or power might be missed.

An alternative is to expose research code as services or provide another layer of abstraction. This can buffer code users from changes in middleware and also provide a layer to validate inputs and improve robustness. But another layer is another source of bugs and can decay. Also, who hosts the services? Who maintains them?
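As a rough illustration of the extra layer described above, the following Python sketch wraps a piece of research code behind a small, stable entry point that validates its inputs first. Everything in it is hypothetical and was not discussed in the session: `legacy_simulate` stands in for unmodified research code, and `SimulationRequest`, `run_simulation` and the chosen limits are assumptions made for the example.

```python
import math
from dataclasses import dataclass


def legacy_simulate(temperature, steps):
    """Stand-in for unmodified research code (hypothetical)."""
    # The original code does no checking of its own.
    return [temperature * math.exp(-0.1 * i) for i in range(steps)]


@dataclass(frozen=True)
class SimulationRequest:
    """Validated inputs for the wrapped code (illustrative limits only)."""

    temperature_kelvin: float
    steps: int

    def __post_init__(self):
        t = self.temperature_kelvin
        if not isinstance(t, (int, float)) or not math.isfinite(t) or t < 0:
            raise ValueError("temperature_kelvin must be a finite number >= 0")
        if not isinstance(self.steps, int) or isinstance(self.steps, bool):
            raise ValueError("steps must be an integer")
        if not 1 <= self.steps <= 10_000:
            raise ValueError("steps must be between 1 and 10000")


def run_simulation(request):
    """Stable entry point: callers depend on this, not on the legacy code."""
    return legacy_simulate(request.temperature_kelvin, request.steps)


if __name__ == "__main__":
    print(run_simulation(SimulationRequest(temperature_kelvin=300.0, steps=3)))
```

If the underlying code or middleware changes, only `run_simulation` needs to be updated. The wrapper is itself more code to document and maintain, which is the trade-off noted above.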

Encourage improved coding

It would be useful to improve the documentation of research code, particularly comments, APIs, what the code does and how to use it.

  • Very short cheap courses could be provided by universities.
  • Online material could be provided.

These should cite the benefits to researchers, which include:

  • Productisation by third parties
  • Maintainability
  • Reusability

But who pays for such courses or material delivery and who ensures that researchers continue to apply what has been learned?

Software consultants

Another option is to create a pool of trained software developers available to all researchers in a university. Software development expertise would then be available in-house and on tap to ingest, review, harden and support research code.

Who would fund such a pool? Would a university pay for it out of its own funds or via portions of research grants?

What if there aren’t enough projects to continue to fund the pool? Does a university cut down the team and lose expertise?

Conclusions

One proposal is to encourage amongst researchers, universities and research councils a view that code is a recognised research artefact, and to allow code to be cited in papers.

A paper describing an experiment and its results is a valid research output, so why not code? The code allows the results to be reproduced and the implementation (and so the "experimental method") to be validated.

Research councils could insist that code is published in a repository. The repository would evaluate the code against a minimum set of criteria for understanding what the code does, how it does it and how to use it.
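Part of such an evaluation could perhaps be automated. The sketch below, which assumes the submitted code is Python, lists the module and any public functions or classes that have no docstring. The function name `undocumented_items` and the sample source are invented for illustration. Checking that a docstring is present is only a crude proxy for the criteria above and would not replace a human review.

```python
import ast


def undocumented_items(source):
    """Return sorted names of public items in `source` lacking a docstring.

    The module itself is reported as "<module>". Names starting with an
    underscore are treated as private and skipped.
    """
    tree = ast.parse(source)
    missing = []
    if ast.get_docstring(tree) is None:
        missing.append("<module>")
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name.startswith("_"):
                continue
            if ast.get_docstring(node) is None:
                missing.append(node.name)
    return sorted(missing)


# Hypothetical submission: `fit` says what it does and how, `plot` says nothing.
SAMPLE = '''"""Fit a decay curve to sensor readings."""

def fit(readings):
    """Return the decay constant estimated from readings.

    Uses a least-squares fit on the log of each reading.
    """
    return 0.0

def plot(readings):
    pass
'''

if __name__ == "__main__":
    print(undocumented_items(SAMPLE))
```

For the sample above, the check reports only `plot` as undocumented.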

How would such a fundamental shift be enabled?

A short proposal/recommendation/advisory to research councils

This would outline the problem - a lack of exploitation, reuse, or productisation of valuable research code - and the reasons for this - a lack of documentation and a lack of publication. This could be backed up by evidence/experience from actual researchers.

It would summarise why this state of affairs is of concern.

It would suggest:

  • The promotion of code as a citable research output, as recognised and acceptable as a paper to both funders and researchers.
  • A repository for research code, to provide the basis for the citation and ensure that code remains accessible beyond the lifetime of each project. Access control can be used so that sensitive research outputs can remain strictly controlled.
  • A review process for the repository that can address the question of whether submitted code is maintainable – is there enough documentation on how to use it, what it does and how it does it? This would be based upon the guidelines described in the next point.
  • Provision of code comment/documentation guidelines which can be applied as code is produced, are low cost in terms of time and are easy to understand. These should cite existing literature/examples and also literature on the benefits of following commenting/documentation guidelines in code development.

It would also provide a statement of possible hurdles (to show we recognise it won’t be easy).

Further work

Write the advisory.

This page (revision-11) was last changed on 08-May-2009 14:47 by SimonHettrick.

© The University of Southampton on behalf of OMII-UK. All Rights Reserved.