TIP #129: NEW FORMAT CODES FOR THE [BINARY] COMMAND ===================================================== Version: $Revision: 1.5 $ Author: Arjen Markus Torsten Reincke State: Final Type: Project Tcl-Version: 8.5 Vote: Done Created: Friday, 14 March 2003 URL: https://tip.tcl-lang.org129.html Post-History: ------------------------------------------------------------------------- ABSTRACT ========== This TIP proposes to add a set of new format codes to the *binary* command to enhance its ability to deal with especially non-native floating-point data. The assumption is that current limitations are due to the distinction between little-endian and big-endian storage of such data. INTRODUCTION ============== The current *binary* command can manipulate little-endian and big-endian /integer/ data, but only /native/ floating-point data. This means that binary data from other computer systems that use a different representation of floating-point data can not be directly handled. The lack of format codes to handle "native" integer data means that one has to distinguish the current platform's byte ordering to be platform-independent, whenever the *binary* command is used. Most current computer systems use either little-endian or big-endian byte order and the so-called IEEE representation for the exponent and mantissa. So, the main variation to deal with is the endian-ness. Some popular file formats, like ESRI's ArcView shape files, use both types of byte order. It is difficult (though not impossible) to handle these files with the current set of format codes. It should be noted that there /is/ more variety among floating-point representation than just the byte order. This TIP will not solve this more general problem. PROPOSED FORMAT CODES ======================= Format codes should be available to catch the two main varieties of byte ordering. There should, both for reasons of symmetry and for practical purposes, also be a full set to deal with "native" data. For integer types there are no codes to deal with native ordering. So: * *t* (tiny) for short integers, using native ordering. * *n* (normal) for ordinary integers, using native ordering. * *m* (mirror of "w") for wide integers, using native ordering. The floating-point types will be handled via: * *r*/*R* (real) for single-precision reals. * *q*/*Q* (mirror of "d") for double-precision reals. where the lower-case is associated with little-endian order and the upper-case with big-endian order. IMPLEMENTATION NOTES ====================== The implementation for the integer types is simple: * The new format codes are synonyms for the current ones, but different ones for each endian-ness. The implementation for the floating-point types is somewhat more complicated, this involves adding byte swapping, if the ordering of the platform does not correspond to that of the format code. COPYRIGHT =========== This document is placed in the public domain. ------------------------------------------------------------------------- TIP AutoGenerator - written by Donal K. Fellows