This is a public forum that invites community input on strategies and desirable practices in providing open and long-term access to diffraction data sets.

Slides from the talk have also been posted at viewtopic.php?f=21&t=371

Access to raw diffraction data; current practice in article linking to raw diffraction data

J. R. Helliwell[1], B. McMahon[2]

[1] Chemistry, University of Manchester, Manchester, UK
[2] R&D, IUCr, Chester, UK

The IUCr global Diffraction Data Deposition Working Group has for over four years examined the issues and prospects for linking raw diffraction data sets to publications. Considerable headway has been made in the last year, which we will summarise with examples. Recently, a number of problems with structures of proteins, their ligands, nucleic acids, carbohydrates, bound metals, etc. have been identified and discussed in several publications (e.g. [1]). Raw diffraction images may be useful for improving PDB depositions, for teaching, and as training sets for methods developers. There are also recalcitrant structures for which ‘crowd-efforts’ to solve them might succeed. Details of the benefits of access to raw diffraction data are given in the October 2014 issue of Acta Crystallographica D. To support storage of and access to such data sets, a digital object identifier (DOI) is registered for each raw data set. Examples of data archives are the NIH-funded Big Data to Knowledge (BD2K) initiative, led by Wladek Minor at the University of Virginia, http://www.proteindiffraction.org/ (USA); Zenodo, http://zenodo.org (Europe); Store.Synchrotron, https://store.synchrotron.org.au/public_data/ (Australia); and the Structural Biology Data Grid, https://data.sbgrid.org/ (USA). The BD2K initiative holds ~2900 indexable and searchable diffraction experiments. A major development is that IUCr Journals (IUCrJ, J. Appl. Cryst., Acta Crystallographica D, F) have started linking their publications with primary data sets in repositories. One example is a set of approximately twenty crystal structure studies on the binding of anti-cancer compounds (platins) to histidine in a protein [Tanley et al. (2012), Acta Cryst. F68, 1300-1306] and the associated raw data [doi:10.15127/1.215887, https://www.escholar.manchester.ac.uk/u ... scw:215887]; for these there is Gold Open Access to the publications, the PDB files and the raw data.

We thank the University of Manchester Library staff responsible for data deposition in their eScholar repository.

[1] Wladek Minor, Zbigniew Dauter, John R. Helliwell, Mariusz Jaskolski and Alexander Wlodawer (2016). ‘Safeguarding structural data repositories against bad apples.’ Structure, 24, 216-220.
