Scope of the API - features

Forum for CIF developers to define an application programming interface for CIF software.

Moderators: Brian McMahon, jcbollinger

jcbollinger
Posts: 57
Joined: Tue Dec 20, 2011 2:41 pm

Scope of the API - features

Post by jcbollinger » Tue Dec 20, 2011 11:24 pm

The first question that occurred to me when I was presented with the idea of a standard CIF API was that of scope. What actions and data structures, generally speaking, will the API provide to clients? Here are some of the things that it might provide:

  • Data structures representing CIFs and their components
  • Functions for building, manipulating, and examining in-memory CIF data
  • Functions for reading and writing CIF files
  • Data structures representing CIF dictionaries and their components
  • Functions implementing the CIF dictionary merging protocol
  • Functions or options for validating CIF
Does anyone have other classes of features that we should consider including?

Does anyone want to omit any of the feature groups above, or characterize them differently? In particular, does this API need to address validation? If yes, then does the initial version of the API need to do so, or could that work be deferred to a later version of the API or to a companion API?

jamesrhester
Posts: 39
Joined: Mon Sep 19, 2011 8:21 am

Re: Scope of the API - features

Post by jamesrhester » Fri Dec 23, 2011 12:18 am

I think it would be productive to identify a core set of features to start with, and once that is decided to tackle the less widely used features. My list of core features in terms of actions would be:

(1) Open, read, write, close a CIF file
(2) Read, write a key-value pair
(3) Open/Create a loop structure
(4) Read/Write packets from a loop structure
(5) Add/remove columns from a loop structure

These features roughly match the first three points on John's list.
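For illustration only, actions (2) to (5) above could be sketched in Python roughly as follows. The names `DataBlock`, `Loop`, and all of their methods are hypothetical, invented for this sketch and not proposed anywhere in the thread. File I/O (1) is left out because it needs a real CIF parser, and all values are kept as plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Loop:
    """Hypothetical loop structure: named columns plus a list of packets (rows)."""
    columns: List[str]
    packets: List[Dict[str, str]] = field(default_factory=list)

    def add_packet(self, values: Dict[str, str]) -> None:
        # (4) write one packet; it must supply exactly the loop's columns
        if set(values) != set(self.columns):
            raise ValueError("packet must supply exactly the loop's columns")
        self.packets.append(dict(values))

    def add_column(self, name: str, default: str = "?") -> None:
        # (5) add a column; existing packets get the CIF "unknown" value '?'
        if name in self.columns:
            raise ValueError(f"duplicate column {name}")
        self.columns.append(name)
        for packet in self.packets:
            packet[name] = default

    def remove_column(self, name: str) -> None:
        # (5) remove a column from the header and from every packet
        self.columns.remove(name)
        for packet in self.packets:
            del packet[name]


@dataclass
class DataBlock:
    """Hypothetical data block holding key-value items and loops."""
    name: str
    items: Dict[str, str] = field(default_factory=dict)
    loops: List[Loop] = field(default_factory=list)

    def set_item(self, key: str, value: str) -> None:
        # (2) write a key-value pair
        self.items[key] = value

    def get_item(self, key: str) -> str:
        # (2) read a key-value pair; raises KeyError if the key is absent
        return self.items[key]

    def create_loop(self, columns: List[str]) -> Loop:
        # (3) create an empty loop with the given columns
        loop = Loop(list(columns))
        self.loops.append(loop)
        return loop
```

A caller would create a `DataBlock`, set items on it, call `create_loop` with the column names, and then add packets and columns through the returned `Loop`.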

yayahjb
Posts: 18
Joined: Sun Sep 11, 2011 9:54 pm

Re: Scope of the API - features

Post by yayahjb » Fri Dec 23, 2011 3:48 am

Validation is difficult to add later if it has not been provided for in the initial design. If the concern is efficiency, I would suggest designing on the basis of a validating parser and then providing the option of a bypass of the validation for efficiency.
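One way to picture a validating parser with a bypass is a parse function that takes an optional validation hook. The sketch below is a toy under stated assumptions: `parse_items`, `make_name_validator`, and the hook signature are all hypothetical, and the "parser" only understands lines of the form `_name value`, not real CIF syntax.

```python
from typing import Callable, Dict, Optional, Set

# Hypothetical hook type: called with (data name, raw value) for each item;
# it raises ValueError when the item fails validation.
Validator = Callable[[str, str], None]


def parse_items(text: str, validator: Optional[Validator] = None) -> Dict[str, str]:
    """Toy parser for lines of the form '_name value' (not full CIF syntax)."""
    items: Dict[str, str] = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].startswith("_"):
            raise SyntaxError(f"line {line_no}: expected '_name value'")
        name, value = parts
        if validator is not None:
            validator(name, value)  # validation runs as part of the parse
        items[name] = value
    return items


def make_name_validator(known_names: Set[str]) -> Validator:
    """Build a hook that rejects data names missing from a (toy) dictionary."""
    def validate(name: str, value: str) -> None:
        if name not in known_names:
            raise ValueError(f"{name} is not defined in the dictionary")
    return validate
```

Calling `parse_items(text)` with no hook is the bypass; passing `validator=make_name_validator(...)` turns validation on without changing the rest of the call.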

rjgildea
Posts: 3
Joined: Fri Dec 23, 2011 6:34 pm

Re: Scope of the API - features

Post by rjgildea » Fri Dec 23, 2011 6:49 pm

yayahjb wrote: Validation is difficult to add later if it has not been provided for in the initial design. If the concern is efficiency, I would suggest designing on the basis of a validating parser and then providing the option of a bypass of the validation for efficiency.


It is not clear to me why the validation step should in any way be involved in the parsing step as they are two completely distinct steps. In my opinion, parsing is solely for the purpose of syntax checking and populating the internal data structure representation of the CIF format, whilst validation of the content is functionality that is performed upon that data structure independently of parsing. In my experience it is mostly parsing and building of some internal data structure that is required by an application, with dictionary-based validation used far less frequently.
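The separation described here, with validation as an independent pass over an already-parsed structure, might look something like the sketch below. Everything in it is an assumption made for illustration: the `validate` function, the plain-dict representation of parsed items, and the toy dictionary mapping data names to type codes `numb` and `char` (modelled loosely on DDL1 type codes).

```python
import re
from typing import Dict, List

# Number with an optional standard uncertainty in parentheses, e.g. 5.431(2).
_NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?(\(\d+\))?")


def validate(items: Dict[str, str], types: Dict[str, str]) -> List[str]:
    """Check already-parsed items against a toy dictionary; return problem messages."""
    problems: List[str] = []
    for name, value in items.items():
        expected = types.get(name)
        if expected is None:
            problems.append(f"{name}: not defined in the dictionary")
        elif value in ("?", "."):
            continue  # 'unknown' and 'inapplicable' are accepted for any type
        elif expected == "numb" and not _NUMBER.fullmatch(value):
            problems.append(f"{name}: expected a number, got {value!r}")
    return problems
```

Because `validate` only sees the parsed items, an application that never needs dictionary checks simply never calls it.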

yayahjb
Posts: 18
Joined: Sun Sep 11, 2011 9:54 pm

Re: Scope of the API - features

Post by yayahjb » Fri Dec 23, 2011 8:31 pm

Certainly many CIFs can be parsed successfully without recourse to a dictionary. However, there are also CIFs for which parsing without a dictionary can be difficult (e.g. due to confusion between strings and numbers). If we are trying to design a common API to be used by a wide range of applications on a wide range of CIFs, it makes sense to provide the necessary hooks for dictionary use when needed as well as the ability to use the same API without reference to a dictionary when desired. Designing a common reference API for use by the entire community is a different task from designing APIs to serve particular subsets of the community. In the end it may not be possible to provide a single API that satisfies all needs, but I suggest that it is worth considering the possibility of doing so.

rjgildea
Posts: 3
Joined: Fri Dec 23, 2011 6:34 pm

Re: Scope of the API - features

Post by rjgildea » Fri Dec 23, 2011 8:49 pm

yayahjb wrote: However, there are also CIFs for which parsing without a dictionary can be difficult (e.g. due to confusion between strings and numbers).


Do you have an example where this is the case? I have yet to see a CIF that can't be parsed using only the formal definition of the syntax. Interpreting the content is another matter; however, I rarely find that programmatic recourse to a dictionary is necessary even for that.

yayahjb
Posts: 18
Joined: Sun Sep 11, 2011 9:54 pm

Re: Scope of the API - features

Post by yayahjb » Fri Dec 23, 2011 10:22 pm

The most difficult cases to handle without a dictionary that I encounter are unquoted strings of digits with leading zeros and embedded pluses or hyphens. These could be intended as numbers, as serial numbers in a bibliographic context, or as symmetry operations. Having the dictionary type specified greatly reduces possible confusion in parsing them. However, the point is not whether you or I have particularly troubling cases, but whether the design of the API will allow the API to support a reasonably wide range of application developers, some of whom may be new to CIF and may rely heavily on the API to help them avoid mistakes, and some of whom may be old hands with very clean data and very limited need for support from the API. I believe a good API should support both.
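To make the ambiguity concrete, here is a small sketch of how the same unquoted text can come out differently depending on whether a dictionary type is available. The `interpret` function and the type codes `numb` and `char` are assumptions for illustration, and the no-dictionary branch is just one plausible guessing rule, not how any particular CIF library behaves.

```python
import re
from typing import Optional, Union

# Plain decimal or exponent notation, without a standard uncertainty.
_NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def interpret(raw: str, dict_type: Optional[str] = None) -> Union[str, float]:
    """Interpret an unquoted value, using a dictionary type when one is given."""
    if dict_type == "char":
        return raw  # the dictionary says text: keep it verbatim
    if dict_type == "numb":
        return float(raw)  # the dictionary says number: raises ValueError if it is not
    # No dictionary: guess from the look of the value alone.
    return float(raw) if _NUMBER.fullmatch(raw) else raw


print(interpret("0012"))          # guessed as a number; the leading zeros are lost
print(interpret("0012", "char"))  # kept as the text '0012'
```

With no dictionary type, a serial number such as `0012` silently becomes `12.0`; with the type supplied, the caller gets back exactly what was written.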

jcbollinger
Posts: 57
Joined: Tue Dec 20, 2011 2:41 pm

Re: Scope of the API - features

Post by jcbollinger » Tue Jan 03, 2012 6:59 pm

yayahjb wrote: Validation is difficult to add later if it has not been provided for in the initial design. If the concern is efficiency, I would suggest designing on the basis of a validating parser and then providing the option of a bypass of the validation for efficiency.

That's a fair consideration, but I think it impacts the API implementation more than the design. Am I mistaken? Before we expend a great deal of energy on what appears to me to be a procedural question, I would like to have an idea of the stakes.

What design-level difficulties might arise from addressing validation after designing at least a rough version of some of the other, more universal API features?

Or else, how might an API design that addresses validation be distinguished from one that addresses only James's list of core features? What makes those distinctions difficult to add after the fact?

jcbollinger
Posts: 57
Joined: Tue Dec 20, 2011 2:41 pm

Re: Scope of the API - features

Post by jcbollinger » Tue Jan 03, 2012 7:16 pm

jcbollinger wrote: That's a fair consideration, but I think it impacts the API implementation more than the design.

I realized immediately after posting that there are two separate questions here:
  • I offered the possibility that perhaps validation could be addressed via a companion API, which clearly has implementation considerations.
  • James, as I read him, merely suggested that we focus first on the design of the core features, from which he excludes validation.
I think we can follow James's suggestion while reserving judgement on the other. Indeed, that approach may put us in a better position to decide, later, how validation would best be incorporated. Are there any objections?

jcbollinger
Posts: 57
Joined: Tue Dec 20, 2011 2:41 pm

Re: Scope of the API - features

Post by jcbollinger » Mon Jan 09, 2012 3:15 pm

jcbollinger wrote: Are there any objections?

I take most of a week of silence as the absence of objections. I will shortly open one or more new topics dedicated to requirements for the "core" features, and to the extent we can reasonably do so, we will defer discussion of validation details.
