Data flows and Policy for APS-operated Beamlines

by Brian McMahon » Sat Oct 20, 2012 12:03 pm

Data flows and policy for APS-operated beamlines, with some overview information for the Collaborative Access Teams (CATs)

This information was communicated by Denny Mills ([email protected]), Associate Director at APS, to John R Helliwell on Oct 19th 2012.

Scientific Data and Beamline Critical Files

At present, it is estimated that the APS beamlines together generate over 100 TB of data per day. With APS typically running 5000 hrs/yr (about 200 days), that amounts to 15-20 Petabytes per year. For comparison, the LHC expects to collect about 15 Petabytes/year. (http://public.web.cern.ch/public/en/LHC ... ng-en.html)
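As a rough check of the arithmetic above, here is a minimal Python sketch. The 100 TB/day and 5000 hrs/yr figures come from the post; the decimal conversion (1 PB = 1000 TB) and the function name are assumptions made only for this illustration.

```python
# Figures quoted in the post
TB_PER_DAY = 100        # estimated data rate across all APS beamlines
HOURS_PER_YEAR = 5000   # typical APS operating hours per year

# Assumption for this sketch: decimal units, 1 PB = 1000 TB
TB_PER_PB = 1000


def annual_volume_pb(tb_per_day, hours_per_year):
    """Return the yearly data volume in PB for a given daily rate."""
    operating_days = hours_per_year / 24
    return tb_per_day * operating_days / TB_PER_PB


operating_days = HOURS_PER_YEAR / 24
print(f"{operating_days:.0f} operating days per year")                 # 208
print(f"{annual_volume_pb(TB_PER_DAY, HOURS_PER_YEAR):.1f} PB/year")   # 20.8
```

At 100 TB/day this lands at the top of the quoted 15-20 Petabyte range; a daily average nearer 75 TB would give the lower end.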

As of 2012, about half of the 64 operating beamlines at the APS are run centrally by the X-ray Science Division (XSD). For those XSD-operated beamlines, scientific data that is transferred to and stored on the APS central file servers is automatically backed up at least once per day. User scientific data that is stored on a local beamline computer, or on the visiting users' own computers or storage media, will not be backed up by the APS IT Department unless requested by XSD beamline staff. Scientific data that is backed up and stored on the central APS file server is retained for three months.

CATs: 5 examples of data retention practice

  • we keep data for 3 weeks. That allows users to get home and verify that the data transfer to home was successful;
  • we currently archive data for 12 months (3 months spinning, 9 months on tape only);
  • Officially, only until we have confirmed that they have it all, which usually means one run's worth. Unofficially, data often would be available for up to two years based on our current storage capacity;
  • we keep data for 2-3 months;
  • we keep data for 6 months.
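To compare these retention periods side by side, here is a minimal Python sketch that turns them into an earliest-deletion date. The policy labels, the day counts used for "months" (30 days each, 365 for 12 months), and the choice of the lower bound for "2-3 months" are all assumptions made for illustration; the third example is left out because "one run's worth" has no fixed length in the post.

```python
from datetime import date, timedelta

# Approximate retention periods in days (assumed conversions, see above).
# The labels are hypothetical; the post does not name the CATs.
RETENTION_DAYS = {
    "XSD central file server": 90,   # "three months"
    "CAT example 1": 21,             # "3 weeks"
    "CAT example 2": 365,            # "12 months" (spinning disk plus tape)
    "CAT example 4": 60,             # lower bound of "2-3 months"
    "CAT example 5": 180,            # "6 months"
}


def earliest_deletion(collected, policy):
    """Return the first date on which data may no longer be available."""
    return collected + timedelta(days=RETENTION_DAYS[policy])


collected = date(2012, 10, 19)
for policy in RETENTION_DAYS:
    print(f"{policy}: {earliest_deletion(collected, policy)}")
```

Under the shortest of these policies, data collected on 19 October 2012 could be gone by 9 November 2012, so users should not rely on the facility copy as a long-term archive.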
