Scope of the API - features


Re: Scope of the API - features

by jamesrhester » Tue Jan 17, 2012 11:34 pm

yayahjb wrote:As previously noted, I think failing to make allowances for validation in the initial design will greatly increase the difficulty in incorporating it later, while designing to include validation from the start costs very little and, when properly done, can easily be turned off when efficiency or other considerations demand it.


I agree with Herbert insofar as I think it is worth keeping in mind the question "Would this change if we were validating?" as we move forward. I disagree with Herbert insofar as I think the answer is almost always "No" in a well-designed system.

Re: Scope of the API - features

by jcbollinger » Mon Jan 09, 2012 9:40 pm

Thank you, Herbert, I appreciate your insight.
yayahjb wrote:Having made the mistake of doing a non-validating CIF API and then adding validation to it -- I would say the biggest issues start in the design of the lexer and parser, which need to be designed to allow for recovery after an error rather than a hard abort, in order to facilitate reporting multiple meaningful errors in each pass rather than driving the user nuts with one error at a time. This relates to the need to design in a simple and effective error reporting/logging mechanism. Those are actually fairly general software engineering issues for any language.

That's an excellent point, and one that I'm all too prone to overlook despite having run into it before, both in CIF and in a general context. Truly, validation considerations magnify the importance of those issues.

It still seems like we're not quite connecting, however, for it is my fervent hope that the group never has need to discuss the details of a CIF lexer in this forum. The design specifications I am trying to reach, at least initially, will be primarily for the functions and data structures that programs using the API will touch -- that is, the public interface. Logging and error recovery are relevant there, to be sure, but even if we did not address them before considering validation, I don't see there being enough specifications altogether to make additions and changes an onerous task at the point where I would like to take that up.
yayahjb wrote:For CIF in particular, the most important design feature is to provide dictionary support. This has a strong impact on the design of the API because the dictionary format is somewhat different from the data CIF format. Less critical for the small molecule community is the need to cope sensibly with mixed DDL1/DDL2 data -- even if you intend to treat such mixing as an error, the API works better for users if you plan the hooks to tell the user what they did, rather than producing cryptic aborts.

I am very much hoping that we can come up with a design that localizes the differences between dictionary formats to a smallish number of routines for each DDL, but that's beside the point at the moment. Dictionaries and DDLs are purely validation considerations, and I can't see how the public interface for any "core" function would need to differ much with which DDLs were supported by the validation subsystem.
yayahjb wrote:There is more, but I think you get the point -- a design goes better if you plan ahead.

Oh, I certainly get the point. Design is planning ahead, and that's just what I'm trying to do. I spent several years as a professional software architect for a profitable consultancy, and more moonlighting as a freelance designer, so I am no stranger to the concept. I am also, however, trying to make us as productive as possible by keeping our attention as focused at any given time as it is feasible to do. Moreover, I know from repeated, sometimes bitter experience that incremental design stands a far better chance of overall success than does trying to capture an entire system all at one go.

I hope that the limited scope of what I'm trying to do before attending to validation will allay your concerns. If not, then I'm sure I can trust you to continue to make those concerns known to the group.

Re: Scope of the API - features

by yayahjb » Mon Jan 09, 2012 7:43 pm

jcbollinger wrote:
What design-level difficulties might arise from addressing validation after designing at least a rough version of some of the other, more universal API features?

Or else, how might an API design that addresses validation be distinguished from one that addressed only James's list of core features? What makes those distinctions difficult to add after the fact?

Having made the mistake of doing a non-validating CIF API and then adding validation to it -- I would say the biggest issues start in the design of the lexer and parser, which need to be designed to allow for recovery after an error rather than a hard abort, in order to facilitate reporting multiple meaningful errors in each pass rather than driving the user nuts with one error at a time. This relates to the need to design in a simple and effective error reporting/logging mechanism. Those are actually fairly general software engineering issues for any language. For CIF in particular, the most important design feature is to provide dictionary support. This has a strong impact on the design of the API because the dictionary format is somewhat different from the data CIF format. Less critical for the small molecule community is the need to cope sensibly with mixed DDL1/DDL2 data -- even if you intend to treat such mixing as an error, the API works better for users if you plan the hooks to tell the user what they did, rather than producing cryptic aborts.

There is more, but I think you get the point -- a design goes better if you plan ahead.
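
To illustrate the error-recovery point, here is a minimal Python sketch of a parser that records each problem and keeps going instead of aborting on the first one. It is only an illustration under stated assumptions: the input is a toy "one data name and one value per line" format rather than real CIF syntax, and `ParseResult` and `parse_items` are hypothetical names, not part of any existing CIF API.

```python
from dataclasses import dataclass, field


@dataclass
class ParseResult:
    """Parsed items plus every error found in the same pass (hypothetical type)."""
    items: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)  # (line number, message) tuples


def parse_items(text):
    """Parse toy '_name value' lines, recording errors and continuing."""
    result = ParseResult()
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment
        parts = stripped.split(None, 1)
        if not parts[0].startswith("_"):
            result.errors.append((lineno, f"expected a data name, found {parts[0]!r}"))
            continue  # recover: skip this line and keep parsing
        if len(parts) < 2:
            result.errors.append((lineno, f"data name {parts[0]} has no value"))
            continue
        name, value = parts
        if name in result.items:
            result.errors.append((lineno, f"duplicate data name {name}"))
            continue
        result.items[name] = value
    return result
```

A caller gets all of the problems from one pass, for example `for lineno, message in parse_items(text).errors: print(lineno, message)`, and can still use whatever items parsed cleanly.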

Re: Scope of the API - features

by jcbollinger » Mon Jan 09, 2012 7:04 pm

yayahjb wrote:As previously noted, I think failing to make allowances for validation in the initial design will greatly increase the difficulty in incorporating it later, while designing to include validation from the start costs very little and, when properly done, can easily be turned off when efficiency or other considerations demand it.

Indeed you did say so previously, but you did not respond to my request for elaboration. In particular:
jcbollinger wrote:What design-level difficulties might arise from addressing validation after designing at least a rough version of some of the other, more universal API features?

Or else, how might an API design that addresses validation be distinguished from one that addressed only James's list of core features? What makes those distinctions difficult to add after the fact?

Lest there be any confusion, by a "design" I mean roughly function and data type form and behavior specifications, including function prototypes or an equivalent, but excluding implementation code. If we would indeed be risking later difficulties by holding off on validation considerations then I surely want at minimum to understand the risk. I'm not seeing it, however, so please enlighten me.

Re: Scope of the API - features

by yayahjb » Mon Jan 09, 2012 3:47 pm

As previously noted, I think failing to make allowances for validation in the initial design will greatly increase the difficulty in incorporating it later, while designing to include validation from the start costs very little and, when properly done, can easily be turned off when efficiency or other considerations demand it.
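
One way to picture validation that is designed in from the start but can be switched off is an optional hook on the loading routine. The Python sketch below is a guess at the general shape, not a proposal for the actual API: `load_items`, `known_names_validator`, and the `validator` parameter are all hypothetical names, and the "dictionary" is reduced to a plain set of known data names.

```python
def load_items(pairs, validator=None):
    """Store (name, value) pairs, consulting an optional validation hook.

    Passing validator=None bypasses validation entirely.
    """
    items = {}
    problems = []
    for name, value in pairs:
        if validator is not None:
            message = validator(name, value)
            if message:
                problems.append(message)  # report, but still keep the item
        items[name] = value
    return items, problems


def known_names_validator(known):
    """Build a hook that flags data names missing from a set of known names."""
    def check(name, value):
        if name not in known:
            return f"unknown data name {name}"
        return None
    return check
```

With the hook omitted, `load_items(pairs)` does no validation work at all; with `validator=known_names_validator({"_a"})` the same call also returns a list of problems.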

Re: Scope of the API - features

by jcbollinger » Mon Jan 09, 2012 3:15 pm

jcbollinger wrote:Are there any objections?

I take most of a week of silence as the absence of objections. I will shortly open one or more new topics dedicated to requirements for the "core" features, and to the extent we can reasonably do so, we will defer discussion of validation details.

Re: Scope of the API - features

by jcbollinger » Tue Jan 03, 2012 7:16 pm

jcbollinger wrote:That's a fair consideration, but I think it impacts the API implementation more than the design.

I realized immediately after I wrote that that there are two separate questions here:
  • I offered the possibility that perhaps validation could be addressed via a companion API, which clearly has implementation considerations.
  • James, as I read him, merely suggested that we focus first on the design of the core features, from which he excludes validation.
I think we can follow James's suggestion while reserving judgement on the other. Indeed, that approach may put us in a better position to decide, later, how validation would best be incorporated. Are there any objections?

Re: Scope of the API - features

by jcbollinger » Tue Jan 03, 2012 6:59 pm

yayahjb wrote:Validation is difficult to add later if it has not been provided for in the initial design. If the concern is efficiency, I would suggest designing on the basis of a validating parser and then providing the option of a bypass of the validation for efficiency.

That's a fair consideration, but I think it impacts the API implementation more than the design. Am I mistaken? Before we expend a great deal of energy on what appears to me to be a procedural question, I would like to have an idea of the stakes.

What design-level difficulties might arise from addressing validation after designing at least a rough version of some of the other, more universal API features?

Or else, how might an API design that addresses validation be distinguished from one that addressed only James's list of core features? What makes those distinctions difficult to add after the fact?

Re: Scope of the API - features

by yayahjb » Fri Dec 23, 2011 10:22 pm

The most difficult cases to handle without a dictionary that I encounter are unquoted strings of digits with leading zeros and embedded pluses or hyphens. These could be intended as numbers, as serial numbers in a bibliographic context, or as symmetry operations. Having the dictionary type specified greatly reduces possible confusion in parsing them. However, the point is not whether you or I have particularly troubling cases, but whether the design of the API will allow the API to support a reasonably wide range of application developers, some of whom may be new to CIF and may rely heavily on the API to help them avoid mistakes, and some of whom may be old hands with very clean data and very limited need for support from the API. I believe a good API should support both.
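
To make the leading-zero ambiguity concrete, here is a small Python sketch. Nothing in it is taken from an existing CIF library: `interpret`, its `declared_type` argument, and the labels "numb" and "char" are hypothetical stand-ins for a dictionary lookup, and the regular expression only approximates CIF number syntax.

```python
import re

# Rough approximation of a CIF-style number with an optional standard
# uncertainty in parentheses, e.g. "1.234(5)". Not the formal grammar.
NUMBER_RE = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?(\(\d+\))?")


def interpret(raw, declared_type=None):
    """Interpret an unquoted value, optionally guided by a declared type.

    declared_type is "numb", "char", or None when no dictionary is available.
    """
    if declared_type == "char":
        return raw  # the dictionary says text: keep every character
    looks_numeric = NUMBER_RE.fullmatch(raw) is not None
    if declared_type == "numb":
        if not looks_numeric:
            raise ValueError(f"{raw!r} is declared numeric but does not parse as a number")
        return float(raw.split("(")[0])  # drop the uncertainty, keep the value
    # No dictionary: syntax alone decides, so "00123" silently becomes a number
    if looks_numeric:
        return float(raw.split("(")[0])
    return raw
```

Without a declared type, `interpret("00123")` yields a number and the leading zeros are lost; with `declared_type="char"` the same input survives as the string "00123", which is what a serial number would need.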

Re: Scope of the API - features

by rjgildea » Fri Dec 23, 2011 8:49 pm

yayahjb wrote:However, there are also CIFs for which parsing without a dictionary can be difficult (e.g. due to confusion between strings and numbers).


Do you have an example where this is the case? I have yet to see a CIF that can't be parsed using only the formal definition of the syntax. Interpreting the content is another matter; however, I rarely find that programmatic recourse to a dictionary is necessary even for that.
