[Medin_dacwg] FW: Cost to archive data

Mowat, Mary R. mmow at bgs.ac.uk
Mon Jul 20 14:09:42 BST 2015


Some comments from Garry

From: Baker, Garry R.
Sent: 20 July 2015 12:49
To: Mowat, Mary R.; Glaves, Helen M.; Henni, Paul H.
Cc: Shelley, Claire E.; Harrison, Matthew
Subject: RE: Cost to archive data

Mary, Paul, Helen

This is an interesting discussion, and one that the NERC data community has had several times in recent years about the funding of its data centres, specifically regarding the NERC-funded 'Discovery Science' Projects and 'Strategic Research Programmes'.

In an ideal world, funds would be made available for the long-term data management and ingestion of all data created by projects and programmes on a case-by-case basis (with each being investigated and costed, itself a costly exercise).  The source of this money would be established within the funding call when the project/programme was defined and initiated.  In the case of the larger NERC programmes, individual costings are ultimately used, with an initial assessment (at the programme planning or call phase) that between 3 and 5% of the programme funds will be needed to undertake this work.  There are, however, not many of these funded each year by NERC.

Our initial evidence in the geoscience disciplines is that approximately 3-3.5% of the grant/programme is needed to undertake the long-term data management of the scientific data.  In the specific case of the smaller 'Discovery Science' projects, which are more prevalent because more of them are funded by NERC each year, this is not a feasible solution, so a top-slice funding model is applied at NERC level instead.  This is presently the equivalent of 1.66% of the Discovery Science grant funding pot, which is distributed between the five NERC data centres.  We will review that 1.66% figure and the underpinning funding model in 18-20 months' time to ascertain whether it is working.  It is also built on a set of assumptions: (1) that only a proportion of the projects produce data, which cannot be said of the larger programmes, which always produce scientific data; (2) that large-ticket items need to be considered and skew the model; and (3) that an underpinning data centre function/infrastructure is provided and funded by someone.

Given the complexity and range of data under consideration in this specific case, I feel they are more akin to 'Strategic Research' Programmes, so the 3 to 5% cost estimates are more appropriate.  How this is funded (the source of the money) is a separate discussion from the understanding and agreement that this work needs to be done in order to undertake professional long-term data management of this scientific data.  The challenge here is separating these two discussions and seeking agreements and solutions to each.

I have read Peter's comments on an 'absolute range' being more useful, with an acceptance that anything in the initial stages would be funded by a MEDIN DAC, but I would counter (in the spirit of constructive dialogue) that if we were to receive 50 smaller 'budget surveys' in any one year, that too would break the data centre model, if we are expected to find the money/resources/infrastructure for them, just as much as fewer, larger geotechnical surveys would.

Of specific note is that the NERC-funded Discovery Science and Strategic Research Programmes have funding models, but they too rely upon the underlying data centres and their infrastructure/staff/resources, which do not (Note *1: caveat), and therefore the underlying data centre is provided by a co-located NERC Research Centre using the diminishing 'National Capability' funding it receives.  This is the very same funding it also uses to run the Research Centre and undertake its science.  I expect we are in danger of repeating "the programmes cannot consider that cost burden" while the research centres/data centres say "our staffing/infrastructure cannot accommodate that burden", leading to some future discussions on a funding source, while we all hopefully agree that professional long-term data management needs to be done in accordance with the now widespread adoption of 'Data Policies' in organisations and government.

Kind regards,

Garry

(Copied to Matt and Claire  - so they are kept in the loop on these discussions)

From: Mowat, Mary R.
Sent: 20 July 2015 10:24
To: Baker, Garry R.; Glaves, Helen M.
Cc: Henni, Paul H.
Subject: FW: Cost to archive data

Hi Garry, Helen

Do you have any comments/advice on this, and can you get back to me in the next couple of days?

The 3-5% figure was suggested by Robin McCandliss, as it appears in the NERC guideline on data management costs, along with the question of whether it could be used as a general ballpark figure when estimating costs for archiving in the MEDIN DACs. However, the comments from Pete (Crown Estate) suggest this figure may not always be appropriate for these industry (renewables) surveys.

Do you have any similar experience from other projects?

Cheers
Mary

From: medin_dacwg-bounces at mailman.nerc-liv.ac.uk On Behalf Of Edmonds, Peter
Sent: 20 July 2015 09:55
To: Postlethwaite, Clare; medin_dacwg at mailman.nerc-liv.ac.uk
Subject: Re: [Medin_dacwg] Cost to archive data

Hi Clare,

I'm not sure that this will be all that helpful to be honest. For a £10,000 budget survey 3-5% (£300-500) is not that material, a reasonable number, and could probably be found. However, for a £9m geotechnical survey, 3-5% (£270k - £450k) is certainly material, and most importantly way out! I think that an absolute cost range may be more helpful. Say costs of up to £x would/could be covered by the DAC, but above that there is a sliding scale depending on  the factors you outline in the document up to a ceiling of £y. I would have thought that £500 - £5,000 was reasonable. I certainly can't imagine wanting to spend more than that on data archival.

I also imagine that a suite of examples will be helpful. What might a typical, well-structured hydrographic survey cost to archive, versus the maximum that could be imagined for a large and poorly structured set of data from a survey? Etc.

As it is I can see industry baulking at the thought of an extra 5% on their already huge costs (we estimate around £30m at risk spend on pre-consent surveying for a typical size offshore wind farm).
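
The arithmetic behind the figures above can be sketched in a few lines of Python. This is only an illustration, not an agreed MEDIN or DAC costing method: the function names are made up for this sketch, and the "capped" model simply clamps a percentage estimate between the £500 and £5,000 figures mentioned above, which is one simplistic reading of the "£x to £y" sliding scale rather than anything defined in this thread.

```python
def percentage_cost_range(survey_cost, low_pct=3.0, high_pct=5.0):
    """Return the (low, high) archive cost as a percentage band of the survey cost."""
    return (survey_cost * low_pct / 100.0, survey_cost * high_pct / 100.0)


def capped_cost(survey_cost, pct=3.0, floor=500.0, ceiling=5000.0):
    """Hypothetical absolute-range model: a percentage estimate clamped to [floor, ceiling].

    The floor and ceiling defaults are the 500 and 5,000 pound figures from the
    email; treating them as hard limits is an assumption made for illustration.
    """
    return min(max(survey_cost * pct / 100.0, floor), ceiling)


# Survey costs quoted in the thread: budget survey, geotechnical survey,
# and typical pre-consent spend for an offshore wind farm.
for cost in (10_000, 9_000_000, 30_000_000):
    low, high = percentage_cost_range(cost)
    print(f"GBP {cost:,}: 3-5% = GBP {low:,.0f} to {high:,.0f}; "
          f"capped = GBP {capped_cost(cost):,.0f}")
```

Under these assumptions the percentage model scales without limit (for example £270k to £450k on a £9m survey), whereas the clamped model never exceeds the ceiling, which is the contrast being drawn above.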

Many thanks,

Pete

________________________________

Peter Edmonds
Spatial Data Manager

16 New Burlington Place, London, W1S 2HX
Tel: +44 (0) 20 7851 5349 | Mob: +44 (0) 7702 719 919
www.thecrownestate.co.uk | http://www.twitter.com/thecrownestate
Please think - do you need to print this email?

________________________________

LEGAL DISCLAIMER - IMPORTANT NOTICE

The information in this message, including any attachments, is intended solely for the use of the person to whom it is addressed. It may be confidential and subject to legal professional privilege and it should not be disclosed to or used by anyone else. If you receive this message in error please let the sender know straight away.
We cannot accept liability resulting from email transmission.
The Crown Estate's head office is at 16 New Burlington Place London W1S 2HX


From: medin_dacwg-bounces at mailman.nerc-liv.ac.uk On Behalf Of Postlethwaite, Clare
Sent: Monday, July 20, 2015 9:27 AM
To: medin_dacwg at mailman.nerc-liv.ac.uk
Subject: [Medin_dacwg] Cost to archive data

Dear MEDIN DAC Working Group,

One of the actions on me at last week's meeting was to draft a paragraph on the cost to archive data at DACs. Can you give it a read to check you are happy with the message and that the 3-5% estimate holds for the types of data you deal with? Just to remind you, this was requested by DEFRA and by the PSEG "access to industry data" project, and is also to be a FAQ on the MEDIN website. So I'm looking for comments on the text itself and whether it is appropriate to be used for those purposes.

All comments to me please by 27th July.
Best wishes,
Clare



  ________________________________
This message (and any attachments) is for the recipient only. NERC is subject to the Freedom of Information Act 2000 and the contents of this email and any reply you make may be disclosed by NERC unless it is exempt from release under the Act. Any material supplied to NERC may be stored in an electronic records management system.
  ________________________________