Any and all comments, whether nitpicking or general, are welcome and indeed necessary. I will edit this post as the comments come in and try to flag each edit in this top comment. If it all gets too confusing I'll start a GitHub wiki page and we can all pile in.
I expect that all of the below can be easily implemented on top of SQLite; indeed, many of the functions below are trivial SQL calls. I already have a complete (in terms of the requirements) CIF implementation in Python using SQLite for the data structure, which I'll be releasing to the community as soon as I get licensing sorted out with my workplace.
File-level operations
Code:
ciftype * open(FILE * filename)
Create a ciffile structure based on the contents of filename. The returned pointer is to be used for all interactions with the library. Note that we are hereby taking responsibility for memory management of the ciffile structure. The ciftype structure is opaque.
Code:
ciftype * create()
Create a new ciffile structure
Code:
void write(ciftype * ciffile, FILE * filename)
Write the contents of ciffile to filename. Note I have taken the road of allowing the OS to provide the output stream, to allow maximum flexibility.
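To make the "opaque structure" idea concrete, here is a minimal sketch of how the file-level calls could be declared and backed. Everything in it is an assumption for illustration only: the cif_ prefix (plain open and write would clash with the POSIX functions of those names), the cif_close call, and the one-field struct body that stands in for the real storage.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* --- what a public header might expose (all names hypothetical) --- */
typedef struct cif_file ciftype;      /* opaque: callers never see the layout */

ciftype *cif_create(void);
int      cif_write(const ciftype *cif, FILE *out);
void     cif_close(ciftype *cif);

/* --- private to the library --- */
struct cif_file {
    int nblocks;                      /* placeholder for the real storage */
};

ciftype *cif_create(void)
{
    /* calloc zero-fills, so a new file starts with no blocks */
    return calloc(1, sizeof(ciftype));
}

int cif_write(const ciftype *cif, FILE *out)
{
    assert(out != NULL);              /* a NULL stream is a caller bug */
    if (cif == NULL)
        return -1;
    /* Stand-in body: a real writer would serialise every block here. */
    return fprintf(out, "# CIF with %d block(s)\n", cif->nblocks) < 0 ? -1 : 0;
}

void cif_close(ciftype *cif)
{
    free(cif);                        /* free(NULL) is a no-op */
}
```

Because callers only ever hold a ciftype pointer, the library is free to swap the struct body (for example for an SQLite handle) without breaking them.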
Block-level operations
Code:
int create_block(ciftype * ciffile, char * blockname, int parent)
Create a new datablock with name blockname as a child of parent. If parent is 0, the block is a top-level datablock. Otherwise it is a save frame. The returned integer is a unique identifier for this datablock.
Code:
int delete_block(ciftype * ciffile, int blockid)
Remove the block identified by blockid from the file identified by ciffile.
Code:
int get_block_id(ciftype * ciffile, char * blockname, int parent)
Return the unique block id for the given blockname with given parent. Note that blocknames need only be unique within their enclosing block.
Code:
char * get_block_name(ciftype * ciffile, int blockid)
Get the blockname for blockid.
Code:
int count_blocks(ciftype * ciffile, int blockid)
Return the number of child blocks in the block identified by blockid. If blockid = 0, gives the number of datablocks in the ciffile.
Code:
int * get_blocks(ciftype * ciffile,int blockid)
Return an array of all blockids that are direct children of blockid. If blockid = 0, gives an array of datablocks, otherwise it will represent the save frames.
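The parent/child bookkeeping behind the block-level calls can be sketched as a flat table of rows, each carrying its parent's id. This is only an illustration under stated assumptions: the table_ names, the fixed-size array and the 0-on-failure convention are all made up here, and plain strcmp is a simplification (CIF names compare case-insensitively).

```c
#include <assert.h>
#include <string.h>

#define MAX_BLOCKS 64

/* One row per block; id 0 is reserved to mean "the file itself". */
struct block_row {
    int  id;
    int  parent;                      /* 0 for a top-level datablock */
    char name[76];
};

struct block_table {
    struct block_row rows[MAX_BLOCKS];
    int n;
};

/* Return the id of blockname directly under parent, or 0 if there is none. */
int table_get_block_id(const struct block_table *t, const char *blockname, int parent)
{
    for (int i = 0; i < t->n; i++)
        if (t->rows[i].parent == parent && strcmp(t->rows[i].name, blockname) == 0)
            return t->rows[i].id;
    return 0;
}

/* Add a block and return its new id, or 0 on failure (table full, name too
 * long, or name already used within the same parent). */
int table_create_block(struct block_table *t, const char *blockname, int parent)
{
    assert(t != NULL && blockname != NULL);
    if (t->n == MAX_BLOCKS || strlen(blockname) >= sizeof t->rows[0].name)
        return 0;
    if (table_get_block_id(t, blockname, parent) != 0)
        return 0;
    struct block_row *row = &t->rows[t->n];
    row->id = t->n + 1;               /* ids start at 1 so that 0 stays free */
    row->parent = parent;
    strcpy(row->name, blockname);     /* length was checked above */
    t->n++;
    return row->id;
}

/* Count the direct children of blockid (0 counts top-level datablocks). */
int table_count_blocks(const struct block_table *t, int blockid)
{
    int count = 0;
    for (int i = 0; i < t->n; i++)
        if (t->rows[i].parent == blockid)
            count++;
    return count;
}
```

Note how the same name can appear under two different parents, which is exactly the "unique within their enclosing block" rule.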
Data item query operations
Code:
int find_name(ciftype * ciffile, int blockid, char * dataname)
Return KVPAIR if the dataname occurs as a key-value pair, INLOOP if the dataname occurs in a loop, and 0 if absent.
Code:
bool has_name(ciftype *ciffile, int blockid, char * dataname)
Return true if the dataname occurs anywhere within the given block. Note that this and the previous call do not search the contents of nested save frames.
Code:
char * get_item_as_string(ciftype * ciffile, int blockid, char * dataname)
Return the string representation of an item's value. We undertake not to destroy memory for this string while ciffile remains open and this dataname is defined.
Code:
double * get_item_as_float(ciftype * ciffile, int blockid, char * dataname)
Return the representation of the item as a pair of real numbers, if possible. Position zero is the number itself, and position one is the esd. If the esd is missing, position one will be negative. If the value has no numerical representation, NaN is returned in position zero.
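As a worked example of the value/esd pair, here is a rough sketch of turning CIF text such as 1.234(5) into the two doubles described above. The function name parse_cif_number and its 1/0 return code are my own assumptions, and strtod is more permissive than CIF syntax (it also accepts things like "inf" or hex floats), so treat this as an outline rather than a conforming parser.

```c
#include <assert.h>
#include <ctype.h>
#include <math.h>      /* NAN */
#include <stdlib.h>
#include <string.h>

/*
 * Fill out[0] with the value and out[1] with the esd of text such as
 * "1.234(5)". out[1] is -1 when no esd is given; out[0] is NaN when the
 * text is not a number. Returns 1 on success, 0 otherwise.
 */
int parse_cif_number(const char *text, double out[2])
{
    assert(text != NULL && out != NULL);

    char *end;
    double value = strtod(text, &end);

    out[0] = NAN;
    out[1] = -1.0;
    if (end == text)
        return 0;                       /* nothing numeric at the start */

    if (*end == '\0') {                 /* plain number, no esd */
        out[0] = value;
        return 1;
    }
    if (*end != '(' || !isdigit((unsigned char)end[1]))
        return 0;

    char *close;
    long digits = strtol(end + 1, &close, 10);
    if (close[0] != ')' || close[1] != '\0')
        return 0;

    /* The esd digits line up with the last digits of the mantissa, so
     * scale them by 10^(exponent - number of decimal places). */
    int shift = 0;
    const char *dot = memchr(text, '.', (size_t)(end - text));
    const char *p = text;
    while (p < end && *p != 'e' && *p != 'E')
        p++;                            /* p stops at the exponent, if any */
    if (dot != NULL)
        shift -= (int)(p - dot - 1);
    if (p < end)
        shift += atoi(p + 1);

    double esd = (double)digits;
    for (; shift < 0; shift++) esd /= 10.0;
    for (; shift > 0; shift--) esd *= 10.0;

    out[0] = value;
    out[1] = esd;
    return 1;
}
```

With this sketch, "1.234(5)" gives a value of 1.234 with an esd of about 0.005, and "42" gives 42 with a negative esd.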
Code:
datatype * get_item(ciftype * ciffile, int blockid, char *dataname)
Return all information about the item in the datatype structure. This is notionally a 4-entry structure containing the string representation, two numbers for the numerical representation, and an int to tag the type as UNKNOWN, NULL, or NUMB/CHAR. See the end of the post for methods of accessing this structure.
Loop operations
Code:
int count_loops(ciftype * ciffile, int blockid,bool include_kvpairs)
Return the number of loops in the block identified by blockid. If include_kvpairs is true, the one-row loop containing all key-value pairs counts as a separate loop. Otherwise, it is ignored.
Code:
ciflooptype * get_loops(ciftype * ciffile, int blockid)
Return an array of pointers to the loops in the block, with length as given by count_loops().
Code:
ciflooptype * get_loop_by_dataname(ciftype * ciffile, int blockid, char * dataname)
Return pointer to loop containing dataname.
Code:
int get_loop_length(ciflooptype * loop)
Return the number of packets in the loop. For use in conjunction with the following functions.
Code:
datatype** get_loop_item(ciftype * ciffile, int blockid, char *dataname)
double** get_loop_item_as_float(ciftype * ciffile, int blockid, char * dataname)
char** get_loop_item_as_char(ciftype * ciffile, int blockid, char * dataname)
Return the contents of a looped dataname as an array of values.
Code:
cifpacket * start_loop_iteration(ciflooptype * cifloop)
Get a pointer to a loop packet, where the loop is identified by the opaque loop pointer; the packet pointer can be used to iterate over packets (see below). The contents of the packet will be destroyed once it is used in one of the calls below, so contents should be copied before getting the next packet.
Code:
cifpacket * get_next_packet(cifpacket * packetptr)
Get the next packet from the nominated loop. If no more packets are available, a null pointer is returned. See below for handling routines.
Code:
cifpacket * get_matching_packets(ciftype * ciffile, int blockid, cifpacket * conditions)
Get the packets whose values match those in the conditions packet. The first match is returned, and get_next_packet() will return further packets where more than one exists. If no packets match, a null pointer is returned.
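The iteration calls above are meant to be used as a simple cursor loop. The sketch below shows that calling pattern against toy stand-ins for the opaque types. The struct layouts, the single cursor stored inside the loop, the const-qualified get_char_value and the count_matching helper are all assumptions for illustration, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the opaque types; the real layouts are unspecified. */
typedef struct cifloop ciflooptype;

typedef struct cifpacket {
    ciflooptype *loop;
    int row;                   /* index of the packet the cursor is on */
} cifpacket;

struct cifloop {
    const char **names;        /* ncols datanames */
    const char **cells;        /* nrows * ncols values, stored row by row */
    int ncols, nrows;
    cifpacket cursor;          /* one reusable packet per loop */
};

/* Point the loop's cursor at the first packet; NULL if the loop is empty. */
cifpacket *start_loop_iteration(ciflooptype *cifloop)
{
    assert(cifloop != NULL);
    if (cifloop->nrows == 0)
        return NULL;
    cifloop->cursor.loop = cifloop;
    cifloop->cursor.row = 0;
    return &cifloop->cursor;
}

/* Advance the cursor; NULL once the last packet has been visited. */
cifpacket *get_next_packet(cifpacket *packetptr)
{
    if (packetptr->row + 1 >= packetptr->loop->nrows)
        return NULL;
    packetptr->row++;          /* same storage reused: old contents are gone */
    return packetptr;
}

/* Look up one value in the current packet; NULL if the name is not a column. */
const char *get_char_value(const cifpacket *cp, const char *dataname)
{
    const ciflooptype *loop = cp->loop;
    for (int c = 0; c < loop->ncols; c++)
        if (strcmp(loop->names[c], dataname) == 0)
            return loop->cells[cp->row * loop->ncols + c];
    return NULL;
}

/* Typical caller: visit every packet and count those with a given value. */
int count_matching(ciflooptype *loop, const char *dataname, const char *wanted)
{
    int hits = 0;
    for (cifpacket *p = start_loop_iteration(loop); p != NULL; p = get_next_packet(p)) {
        const char *v = get_char_value(p, dataname);
        if (v != NULL && strcmp(v, wanted) == 0)
            hits++;
    }
    return hits;
}
```

Reusing one cursor per loop is what makes the "copy before getting the next packet" rule necessary: the pointer stays the same while the row it refers to changes.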
Data value construction routines
Code:
set_string_item(ciftype * ciffile, int blockid, char * dataname, char * value)
set_numb_item(ciftype * ciffile, int blockid, char * dataname, double value, double esd)
set_data_item(ciftype * ciffile, int blockid, char * dataname, datatype * value)
Set dataname to a value.
Code:
ciflooptype * create_loop(ciftype * ciffile, int blockid, char ** datanames)
Create an empty loop containing datanames and return a handle for the loop. It is not an error to pass a null pointer for datanames, in which case an empty loop is created suitable for adding columns.
Code:
add_column(ciflooptype * cifloop, char * dataname)
Add an empty column to the loop. If other datanames are already present, the column will be initialised with values of UNKNOWN.
Code:
add_packet(ciflooptype * cifloop, cifpacket * packet)
Add a packet to the loop. The contents of packet are copied.
Code:
cifpacket * get_packet_template(ciflooptype * cifloop)
Get a template for adding packets to this loop. See below for packet handling routines.
Code:
add_numb_column(ciflooptype * cifloop, char * dataname, double * values, int len)
add_char_column(ciflooptype * cifloop, char * dataname, char ** values, int len)
add_data_column(ciflooptype * cifloop, char * dataname, datatype ** values, int len)
Add a column of data. If the loop is not empty, the length of the data should correspond to the length of the data already in the loop, otherwise an error will be raised.
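The length rule for adding columns is easy to show with a numeric-only toy loop. In this sketch the numloop type, the numloop_add_column name and the 0/-1 return convention are assumptions of mine; the proposal does not say how the error is to be reported.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char   **names;     /* one dataname per column */
    double **columns;   /* columns[c][r] is row r of column c */
    int      ncols;
    int      nrows;     /* shared by every column */
} numloop;

/* Copy values in as a new column. Returns 0 on success, -1 on a length
 * mismatch or allocation failure (the loop is left usable either way). */
int numloop_add_column(numloop *loop, const char *dataname,
                       const double *values, int len)
{
    assert(loop != NULL && dataname != NULL && values != NULL && len >= 0);
    if (loop->ncols > 0 && len != loop->nrows)
        return -1;      /* would leave the loop ragged */

    size_t nbytes = (size_t)len * sizeof(double);
    size_t slots  = (size_t)loop->ncols + 1;
    char   *name  = malloc(strlen(dataname) + 1);
    double *col   = malloc(nbytes ? nbytes : 1);

    /* Grow both pointer arrays; keep whichever realloc succeeded. */
    char **names = realloc(loop->names, slots * sizeof(char *));
    if (names != NULL)
        loop->names = names;
    double **columns = realloc(loop->columns, slots * sizeof(double *));
    if (columns != NULL)
        loop->columns = columns;

    if (name == NULL || col == NULL || names == NULL || columns == NULL) {
        free(name);
        free(col);
        return -1;
    }
    strcpy(name, dataname);
    memcpy(col, values, nbytes);
    loop->names[loop->ncols] = name;
    loop->columns[loop->ncols] = col;
    loop->ncols++;
    loop->nrows = len;  /* the first column fixes the row count */
    return 0;
}
```

Starting from a zero-initialised numloop, a first column of three values succeeds, a second column of two values is rejected, and a second column of three values is accepted.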
Code:
delete_column(ciflooptype * cifloop, char * dataname)
Remove the column from the loop. It is not an error to remove the last column or a non-existent column.
Code:
delete_dataname(ciftype * ciffile, int blockid, char * dataname)
Remove a dataname from the datablock. It is not an error to remove a non-existent dataname.
Code:
delete_packets(cifpacket * packet)
Delete all packets matching packet.
Packet handling routines
The cifpacket structure is attached to a specific loop and created using the loop as an argument (see above).
Code:
char * get_char_value(cifpacket * cp, char * dataname)
double * get_numb_value(cifpacket * cp, char * dataname)
datatype * get_data_value(cifpacket *cp, char *dataname)
set_char_value(cifpacket * cp, char * dataname, char * value)
set_numb_value(cifpacket * cp, char * dataname, double value, double esd)
set_data_value(cifpacket *cp, char *dataname, datatype * value)
Get or set values in a packet.
Code:
int count_columns(cifpacket *cp)
Return the number of columns in the packet. For use with the following routine.
Code:
char ** get_column_names(cifpacket *cp)
Return the datanames in this loop.
Datatype handling routines
The datatype structure is the most general way of representing CIF values, as it can represent all CIF types, in particular NULL and UNKNOWN. Applications would generally use direct setting of columns as CHAR or NUMB for efficiency unless there are some UNKNOWN or NULL values in the columns.
Code:
datatype * create_datavalue()
void set_numb_value(datatype * data, double value, double esd)
void set_char_value(datatype * data, char * value)
void set_unknown(datatype * data)
void set_null(datatype * data)
double * get_numb_value(datatype * data)
char * get_char_value(datatype * data)
bool is_unknown(datatype * data)
bool is_null(datatype * data)
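Get and set values in the datatype. This is clearly extensible to CIF2.0 compound structures as well. Since C has no function overloading, the get/set names here would need to differ from the packet routines of the same name in an actual header.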
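For discussion, here is one way the notional 4-entry datatype could look as a tagged struct. This is a sketch only: the field names, the cif_kind enum, the data_ prefix on the routines (used to sidestep the name clash with the packet routines) and data_free are all my own assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* In CIF text, UNKNOWN is written '?' and NULL (inapplicable) is '.'. */
enum cif_kind { CIF_UNKNOWN, CIF_NULL, CIF_NUMB, CIF_CHAR };

typedef struct {
    enum cif_kind kind;    /* tag saying which of the fields below is valid */
    char  *text;           /* owned copy of the string form, or NULL */
    double value;
    double esd;            /* negative when absent */
} datatype;

/* New values start out as UNKNOWN with no esd. */
datatype *create_datavalue(void)
{
    datatype *d = malloc(sizeof *d);
    if (d != NULL) {
        d->kind = CIF_UNKNOWN;
        d->text = NULL;
        d->value = 0.0;
        d->esd = -1.0;
    }
    return d;
}

static void clear_text(datatype *d)
{
    free(d->text);
    d->text = NULL;
}

void data_set_numb(datatype *d, double value, double esd)
{
    assert(d != NULL);
    clear_text(d);
    d->kind = CIF_NUMB;
    d->value = value;
    d->esd = esd;
}

/* Stores a private copy of s; returns false if memory runs out. */
bool data_set_char(datatype *d, const char *s)
{
    assert(d != NULL && s != NULL);
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return false;      /* d is left unchanged */
    strcpy(copy, s);
    clear_text(d);
    d->text = copy;
    d->kind = CIF_CHAR;
    return true;
}

void data_set_unknown(datatype *d) { clear_text(d); d->kind = CIF_UNKNOWN; }
void data_set_null(datatype *d)    { clear_text(d); d->kind = CIF_NULL; }
bool data_is_unknown(const datatype *d) { return d->kind == CIF_UNKNOWN; }
bool data_is_null(const datatype *d)    { return d->kind == CIF_NULL; }

void data_free(datatype *d)
{
    if (d != NULL) {
        free(d->text);
        free(d);
    }
}
```

Adding a CIF2.0 list or table would then be a matter of adding a tag and a pointer field, which is the sense in which the structure is extensible.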