Scope of the API - features

Forum for CIF developers to define an application programming interface for CIF software.

Moderators: Brian McMahon, jcbollinger

yayahjb
Posts: 18
Joined: Sun Sep 11, 2011 9:54 pm

Re: Scope of the API - features

Post by yayahjb » Mon Jan 09, 2012 3:47 pm

As previously noted, I think failing to make allowances for validation in the initial design will greatly increase the difficulty of incorporating it later, while designing to include validation from the start costs very little and, when properly done, can easily be turned off when efficiency or other considerations demand it.

jcbollinger
Posts: 57
Joined: Tue Dec 20, 2011 2:41 pm

Re: Scope of the API - features

Post by jcbollinger » Mon Jan 09, 2012 7:04 pm

yayahjb wrote:As previously noted, I think failing to make allowances for validation in the initial design will greatly increase the difficulty of incorporating it later, while designing to include validation from the start costs very little and, when properly done, can easily be turned off when efficiency or other considerations demand it.

Indeed you did say so previously, but you did not respond to my request for elaboration. In particular:
jcbollinger wrote:What design-level difficulties might arise from addressing validation after designing at least a rough version of some of the other, more universal API features?

Or else, how might an API design that addresses validation be distinguished from one that addressed only James's list of core features? What makes those distinctions difficult to add after the fact?

Lest there be any confusion, by a "design" I mean roughly function and data type form and behavior specifications, including function prototypes or an equivalent, but excluding implementation code. If we would indeed be risking later difficulties by holding off on validation considerations then I surely want at minimum to understand the risk. I'm not seeing it, however, so please enlighten me.

yayahjb
Posts: 18
Joined: Sun Sep 11, 2011 9:54 pm

Re: Scope of the API - features

Post by yayahjb » Mon Jan 09, 2012 7:43 pm

jcbollinger wrote:
What design-level difficulties might arise from addressing validation after designing at least a rough version of some of the other, more universal API features?

Or else, how might an API design that addresses validation be distinguished from one that addressed only James's list of core features? What makes those distinctions difficult to add after the fact?

Having made the mistake of doing a non-validating CIF API and then adding validation to it -- I would say the biggest issues start in the design of the lexer and parser, which need to be designed to allow for recovery after an error rather than a hard abort, in order to facilitate reporting of multiple meaningful errors in each pass rather than driving the user nuts with one error at a time. This relates to the need to design in a simple and effective error reporting/logging mechanism. Those are actually fairly general software engineering issues for any language. For CIF in particular, the most important design feature is to provide dictionary support. This has a strong impact on the design of the API because the dictionary format is somewhat different from the data CIF format. Less critical for the small molecule community is the need to cope sensibly with mixed DDL1/DDL2 data -- even if you intend to treat such mixing as an error, the API works better for users if you plan the hooks to tell the user what they did, rather than producing cryptic aborts.
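
To make the error-recovery point concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the input is a toy subset of CIF with one `_tag value` pair per line, and the names `ParseError`, `ParseResult` and `parse_tag_lines` are hypothetical, not part of any existing CIF library. The parser records each problem and resynchronizes at the next line instead of aborting, so one pass reports every error it finds.

```python
from dataclasses import dataclass, field


@dataclass
class ParseError:
    line: int       # 1-based line number where the problem was found
    message: str


@dataclass
class ParseResult:
    items: dict = field(default_factory=dict)    # tag -> value for lines that parsed
    errors: list = field(default_factory=list)   # every problem found in this pass


def parse_tag_lines(text):
    """Parse a toy subset of CIF: one '_tag value' pair per line.

    On a bad line, record the error and resynchronize at the next line
    instead of aborting, so a single pass reports every problem.
    """
    result = ParseResult()
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no data
        parts = line.split(None, 1)
        if not parts[0].startswith("_"):
            result.errors.append(
                ParseError(lineno, f"expected a data name, got {parts[0]!r}"))
            continue  # recovery point: skip to the next line
        if len(parts) < 2:
            result.errors.append(
                ParseError(lineno, f"data name {parts[0]!r} has no value"))
            continue
        result.items[parts[0]] = parts[1]
    return result


sample = "_cell_length_a 5.4\ncell_length_b 6.1\n_cell_length_c\n_cell_angle_alpha 90\n"
outcome = parse_tag_lines(sample)
for err in outcome.errors:
    print(f"line {err.line}: {err.message}")
```

Both bad lines are reported in the same pass, and the two good lines are still parsed. A real CIF lexer would need a less trivial resynchronization rule (for example, skipping to the next data name or block header), but the shape of the result object is the part that touches the public interface.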

There is more, but I think you get the point -- a design goes better if you plan ahead.

jcbollinger
Posts: 57
Joined: Tue Dec 20, 2011 2:41 pm

Re: Scope of the API - features

Post by jcbollinger » Mon Jan 09, 2012 9:40 pm

Thank you, Herbert, I appreciate your insight.
yayahjb wrote:Having made the mistake of doing a non-validating CIF API and then adding validation to it -- I would say the biggest issues start in the design of the lexer and parser, which need to be designed to allow for recovery after an error rather than a hard abort, in order to facilitate reporting of multiple meaningful errors in each pass rather than driving the user nuts with one error at a time. This relates to the need to design in a simple and effective error reporting/logging mechanism. Those are actually fairly general software engineering issues for any language.

That's an excellent point, and one that I'm all too prone to overlook despite having run into it before, both in CIF and in a general context. Truly, validation considerations magnify the importance of those issues.

It still seems like we're not quite connecting, however, for it is my fervent hope that the group never has need to discuss the details of a CIF lexer in this forum. The design specifications I am trying to reach, at least initially, will be primarily for the functions and data structures that programs using the API will touch -- that is, the public interface. Logging and error recovery are relevant there, to be sure, but even if we did not address them before considering validation, I don't see there being enough specifications altogether to make additions and changes an onerous task at the point where I would like to take that up.
yayahjb wrote:For CIF in particular, the most important design feature is to provide dictionary support. This has a strong impact on the design of the API because the dictionary format is somewhat different from the data CIF format. Less critical for the small molecule community is the need to cope sensibly with mixed DDL1/DDL2 data -- even if you intend to treat such mixing as an error, the API works better for users if you plan the hooks to tell the user what they did, rather than producing cryptic aborts.

I am very much hoping that we can come up with a design that localizes the differences between dictionary formats to a smallish number of routines for each DDL, but that's beside the point at the moment. Dictionaries and DDLs are purely validation considerations, and I can't see how the public interface for any "core" function would need to differ much with which DDLs were supported by the validation subsystem.
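
As a rough illustration of what localizing the DDL differences could look like, here is a sketch in Python. Everything in it is an assumption made for the example: the `Validator` interface, the `Ddl1Validator` and `Ddl2Validator` classes and the `validate_names` function are hypothetical names, and the checks are toys that only look at the form of a data name (flat names such as `_cell_length_a` for DDL1, `_category.item` names such as `_cell.length_a` for DDL2).

```python
from typing import Protocol


class Validator(Protocol):
    """Hypothetical interface: the only thing core code knows about validation."""

    def check(self, name: str) -> list:
        """Return a list of problems with a data name (empty if none)."""
        ...


class Ddl1Validator:
    """Toy DDL1-style check: names are flat, e.g. '_cell_length_a'."""

    def __init__(self, known_names):
        self.known_names = set(known_names)

    def check(self, name):
        if name in self.known_names:
            return []
        return [f"{name}: not defined in the DDL1 dictionary"]


class Ddl2Validator:
    """Toy DDL2-style check: names are '_category.item', e.g. '_cell.length_a'."""

    def __init__(self, items_by_category):
        self.items_by_category = items_by_category

    def check(self, name):
        category, dot, item = name.lstrip("_").partition(".")
        if not dot:
            return [f"{name}: expected the form _category.item"]
        if item not in self.items_by_category.get(category, ()):
            return [f"{name}: not defined in the DDL2 dictionary"]
        return []


def validate_names(names, validator: Validator):
    """DDL-agnostic caller: collects problems without knowing which DDL is used."""
    problems = []
    for name in names:
        problems.extend(validator.check(name))
    return problems


ddl1 = Ddl1Validator(["_cell_length_a"])
ddl2 = Ddl2Validator({"cell": {"length_a"}})
ddl1_problems = validate_names(["_cell_length_a", "_cell.length_a"], ddl1)
ddl2_problems = validate_names(["_cell_length_a", "_cell.length_a"], ddl2)
print(ddl1_problems)
print(ddl2_problems)
```

In this sketch `validate_names` never changes when a DDL is added; only a new class with a `check` method is needed. A name in the wrong style also comes back as a readable message rather than an abort.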
yayahjb wrote:There is more, but I think you get the point -- a design goes better if you plan ahead.

Oh, I certainly get the point. Design is planning ahead, and that's just what I'm trying to do. I spent several years as a professional software architect for a profitable consultancy, and more moonlighting as a freelance designer, so I am no stranger to the concept. I am also, however, trying to make us as productive as possible by keeping our attention as focused at any given time as it is feasible to do. Moreover, I know from repeated, sometimes bitter experience that incremental design stands a far better chance of overall success than does trying to capture an entire system all at one go.

I hope that the limited scope of what I'm trying to do before attending to validation will allay your concerns. If not, then I'm sure I can trust you to continue to make those concerns known to the group.

jamesrhester
Posts: 39
Joined: Mon Sep 19, 2011 8:21 am

Re: Scope of the API - features

Post by jamesrhester » Tue Jan 17, 2012 11:34 pm

yayahjb wrote:As previously noted, I think failing to make allowances for validation in the initial design will greatly increase the difficulty of incorporating it later, while designing to include validation from the start costs very little and, when properly done, can easily be turned off when efficiency or other considerations demand it.


I agree with Herbert insofar as I think it is worth keeping in mind the question "Would this change if we were validating?" as we move forward. I disagree with Herbert insofar as I think the answer is almost always "No" in a well-designed system.
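
One way to picture the "No" case is a core accessor whose signature is the same with or without validation. The Python sketch below is an illustration only: `DataBlock`, `set_value`, `get_value` and `require_numeric` are hypothetical names, and the numeric check is a toy that ignores CIF details such as standard uncertainties in parentheses.

```python
class DataBlock:
    """Hypothetical core container; the validator is optional and off by default."""

    def __init__(self, validator=None):
        self._items = {}
        self._validator = validator   # None means "not validating"
        self.problems = []            # filled only when a validator is installed

    def set_value(self, name, value):
        # Same signature and same storage behavior whether or not we validate.
        if self._validator is not None:
            self.problems.extend(self._validator(name, value))
        self._items[name] = value

    def get_value(self, name):
        return self._items[name]


def require_numeric(name, value):
    """Toy validator: report values that do not parse as a number."""
    try:
        float(value)
    except ValueError:
        return [f"{name}: {value!r} is not numeric"]
    return []


plain = DataBlock()
checked = DataBlock(validator=require_numeric)
for block in (plain, checked):
    block.set_value("_cell_length_a", "5.4")
    block.set_value("_cell_length_b", "six")

print(plain.problems)
print(checked.problems)
```

Callers of `set_value` and `get_value` are written the same way in both cases; validation only adds entries to `problems`. Whether a real design should store a value that fails validation is a separate decision that this sketch does not try to settle.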
