This is the second full Workshop of the IUCr Diffraction Data
Deposition Working Group (DDDWG). It follows a very successful
meeting in Bergen in 2012 (programme and presentations are available
online). It is
also a natural successor to the Crystallographic Information
and Data Management Symposium at Warwick University in 2013,
amplifying and building on many of the topics discussed there.

The Bergen Workshop surveyed the potential benefits of routine
deposition of diffraction images, and explored some of the practical
and cost implications of such a strategy. This led to a number of
special articles published in Acta Crystallographica Section D
that provided a detailed analysis of many of the issues involved.

A meeting of the Working Group at the IUCr Congress in Montreal in
August 2014 concluded that there were promising movements towards
widespread deposition of raw (otherwise known as `primary') data, but
that there were still a number of limiting factors. (1) Since there is
no obvious single institution which will archive all crystallographic
raw data, the initial strategy should be the encouragement of
voluntary deposition in locations most convenient for authors
(e.g. synchrotron and other instrument facilities,
university and institutional repositories, domain repositories such as
the Australian Synchrotron.Store). (2) Search and discovery functions
across diverse locations would depend on common metadata identifying
and describing data sets. The obvious candidate for an identifier is
the Digital Object Identifier (DOI), because of the existing machinery
to register and share DOI information. (3) Because molecular/atomic
structural studies increasingly rely on a range of technologies and
techniques, it would be desirable to harmonise metadata descriptions
across as many such technologies as possible. Studying the
`arrangement of atoms' in its most general sense - by spectroscopy
and microscopy as well as diffraction - has long been recognized
as fitting within the remit of the IUCr.

While `metadata' enters the discussion in the context of building
distributed systems for search and discovery, identification and retrieval
of data sets, it rapidly becomes apparent that there is much more to
metadata than that. `Metadata' is variously defined, but the general
sense is that it is the information that is needed to make sense of
data, to allow its reuse, validation and critical analysis. Yet such
`information' is itself data - data that collectively open doors to
further avenues of study, and even new scientific insight. Standard
uncertainties on atomic positions modify the weights that should be
given to structural models collected in databases, and so subtly
affect our understanding of chemical bonding or biological function
(e.g. in knowledge-based research using the Cambridge
Structural Database or Protein Data Bank). The diffuse scattering
(i.e. the raw intensities ignored in models based solely on Bragg
peaks) can now be reanalysed to provide insights into correlated
disorder. Comparison of structural models derived from X-ray
crystallography or from NMR can deepen understanding of protein
structure and dynamics. Analysis of raw diffraction intensities from
different experiments can yield examples of systematic bias (or, in
extreme examples, dishonest practice).

Overall, the richer the metadata available to the scientist, the
greater the potential for new discoveries. Crystallography is
exceptional in the richness and granularity of metadata descriptors
already available, mostly in diffraction-based research, and largely
owing to the data dictionaries developed within the Crystallographic
Information Framework (CIF), as so clearly shown in the Warwick
Symposium. (That said, the achievements of other research communities
in making available their data - such as the astronomers - should
also be recognized. Our enthusiastic participation in organisations
such as the International Council for Science (ICSU) and its Committee
on Data (CODATA) is vital, both to represent crystallography, and to
learn of best practice from other research communities.)

This two-day Workshop will survey the many uses already being made of
crystallographic metadata, especially where associated with raw data
capture, analysis and reuse. We will identify areas where better
metadata descriptors are required, and we shall begin to look at the
challenges of defining new metadata, especially in studies which
do not have the clean, well-defined parameters of classical
single-crystal or powder diffraction experiments. Some of the biggest
challenges being faced are at the centralised synchrotron (and X-ray
laser) and neutron facilities, where colossal quantities of
diffraction, spectroscopy and especially microscopy raw data are being
generated, and also in the databases which must organise and protect
access to the fruits of all our researches in perpetuity.

We look forward to your active participation. We are grateful to our
sponsors, who have made possible the web streaming and video recording
of proceedings, so that we can reach a wider audience and provide a
permanent record of the content of these two days. We shall enjoy
the warm-hearted hospitality of our Croatian hosts in this beautiful
location, and we are indebted to them for their energetic and
efficient logistical preparations. We welcome you to Rovinj, and to
this latest IUCr DDDWG Workshop.

John Helliwell
Brian McMahon
Video recordings of all the presentations and discussion at the DDDWG Metadata Workshop in Rovinj are now available on the web at ... j-workshop
