Data flows and Policy for APS-operated Beamlines

This forum allows IUCr Commissions, subject experts and invited consultants to provide input to the IUCr Working Group on Diffraction Data Deposition.


Post by Brian McMahon » Sat Oct 20, 2012 12:03 pm

Data flows and policy for APS-operated beamlines, with some overview information for the Collaborative Access Teams (CATs)

This information was communicated by Denny Mills (dmm@aps.anl.gov), Associate Director at APS, to John R Helliwell on Oct 19th 2012.

Scientific Data and Beamline Critical Files

At present, it is estimated that the APS beamlines together generate over 100 TB/day. With the APS typically running 5000 hrs/yr (about 200 days), that amounts to 15-20 petabytes per year. For comparison, the LHC expects to collect about 15 petabytes/year (http://public.web.cern.ch/public/en/LHC ... ng-en.html).
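The arithmetic behind the yearly estimate can be checked with a short sketch. It assumes decimal units (1 PB = 1000 TB) and takes the quoted "over 100 TB/day" as a flat 100 TB/day; the function names are illustrative only and are not taken from any APS tooling.

```python
# Rough check of the yearly data volume quoted in the post.
HOURS_PER_YEAR = 5000   # typical APS operating hours per year (from the post)
TB_PER_DAY = 100        # quoted as "over 100 TB/day"; treated here as exactly 100
TB_PER_PB = 1000        # assumption: decimal units, 1 PB = 1000 TB


def operating_days(hours_per_year: float) -> float:
    """Convert operating hours per year into equivalent 24-hour days."""
    return hours_per_year / 24


def annual_volume_pb(tb_per_day: float, days: float) -> float:
    """Yearly data volume in petabytes for a given daily rate and day count."""
    return tb_per_day * days / TB_PER_PB


days = operating_days(HOURS_PER_YEAR)
print(f"{days:.0f} operating days")                                   # 208
print(f"{annual_volume_pb(TB_PER_DAY, days):.1f} PB/year")            # 20.8
print(f"{annual_volume_pb(TB_PER_DAY, 200):.1f} PB/year (200 days)")  # 20.0
```

Under these assumptions the result is roughly 20 PB/year, at the top of the quoted 15-20 PB range; the lower end of that range would correspond to an average daily rate somewhat below 100 TB.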

As of 2012, about half of the 64 operating beamlines at the APS are run centrally by the X-ray Science Division (XSD). For those XSD-operated beamlines, scientific data that is transferred to and stored on the APS central file servers is automatically backed up at least once per day. User scientific data that is stored on a local beamline computer, or on the visiting users' own computers or storage media, will not be backed up by the APS IT Department unless requested by XSD beamline staff. Scientific data that is backed up or stored on the central APS file servers is retained for three months.

CATs: 5 examples of data retention practice

  • we keep data for 3 weeks. That allows users to get home and verify that the data transfer to home was successful;
  • we currently archive data for 12 months (3 months spinning, 9 months on tape only);
  • officially, we keep data only until we have confirmed that the users have it all, which usually means one run's worth; unofficially, data would often be available for up to two years, based on our current storage capacity;
  • we keep data for 2-3 months;
  • we keep data for 6 months.
