Speeding up software prototype development with code re-use

One of the most common software development tasks for me is the creation of a prototype. Prototypes can be seen as a kind of discardable software, the kind with a short life cycle and a heavily restricted set of design goals.

I tend to create a prototype whenever I’ve recognized some part of the software as mission-critical and I’m unsure whether I’m able to implement it. Sometimes a prototype is needed as a proof of concept to convince someone else. In any case, prototypes are used as tools to answer a research question.

Qualities of prototypes

Research questions typically arise in an unpredictable fashion, and often the answers to those questions make or break the project, or at least part of it. Unsolvable problems need to be circumvented before expending serious development effort on them. These use cases demand that the prototype be developed quickly, produce its result, and then be discarded.

The prototype can be created as a fork of an existing project. Continuity of the project need not be addressed. Code quality is not important, nor is future compatibility with other developments. A prototype does not usually need to be user-friendly. It’s entirely possible that the sole user of the prototype is the programmer, and no user interface needs to be created.

Abandoning most of the rules for good programming allows the developer to take a shortcut to the result. The untidy code will rapidly deteriorate and become unreadable, and its details will be quickly forgotten. The re-use value of such code is exceedingly low.

Requirements for rapid prototype generation

Prototypes need to rely heavily on existing code to shorten the development cycle. Rapid development requires that the programmer be adept with the many libraries, concepts, and tools that the prototype will utilize. As a programmer can’t be adept at everything, the prototyping tools have to be devoted to an appropriately sized problem area.

Rapid prototype development is enabled by easily usable, high-level access to parts of software which have been developed earlier. One could say that a good prototype starts from a nearly finished program which is generic enough to be malleable into many kinds of software in the problem domain, but specific enough to support only the required high-level operations.

The contents of the generic prototype program are found experimentally. If some kind of software functionality has to be implemented twice, it might be better to implement it in the prototype root the second time and to write a high-level access to that functionality. This way the root prototype from which the other prototypes are derived grows in functionality, enabling shorter development cycles for future prototypes.

For example, one party provided data in the form of CSV files, sometimes hundreds of megabytes in size. Standard libraries were pretty slow in reading these files, and as the program was run pretty often, the delay became a limiting factor in productivity.

The first piece of code had easily usable functionality to read files, and sloppily written functionality to read that particular format of CSV. The second time a prototype was required for a similar purpose, the file reading functions could be moved to the permanent part of the software, and a general-purpose CSV reader could be written.

The third time a similar need occurred, and I noticed the CSV files supplied at the start of the project never changed during the project. I then wrote a routine to dump an array of CSV row objects to disk and read them from there, cutting object initialization time by a huge amount. The objects had densely packed data which was initialized at the speed of the hard drive. As the hard drive was cached, the initialization of half a gigabyte of objects could be handled in about a second.

The object data dumping/reading procedure isn’t generic yet; it’s specific to the implementation I required of it the last time. If I were once again creating a prototype which relied on the same kind of CSV files, I would make the dump/read procedure generic and easy to use. This way I would be re-using old code and making more and more of the code re-usable in the future, further cutting down prototype development time.
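The dump-and-reload idea described above can be sketched roughly as follows. This is not the original routine, only a minimal illustration in Python: the function name `load_rows_cached`, the all-numeric columns, the absence of a header row, and the flat row-major layout are all assumptions made for the example. Because the source files never changed during the project, the sketch does not check whether the cache is stale.

```python
import csv
import os
from array import array


def load_rows_cached(csv_path, cache_path, n_cols):
    """Return every CSV value as one flat, row-major array of doubles.

    Hypothetical helper: the first call parses the CSV and dumps the packed
    values to cache_path; later calls read the binary dump directly.
    """
    data = array("d")

    # Fast path: the binary dump exists, so read it back in a single call.
    if os.path.exists(cache_path):
        n_values = os.path.getsize(cache_path) // data.itemsize
        with open(cache_path, "rb") as f:
            data.fromfile(f, n_values)
        return data

    # Slow path: parse the text once (assumes numeric cells, no header row).
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            if len(row) != n_cols:
                raise ValueError(f"expected {n_cols} columns, got {len(row)}")
            data.extend(float(value) for value in row)

    # Dump the densely packed values for the next run.
    with open(cache_path, "wb") as f:
        data.tofile(f)
    return data
```

Row `i`, column `j` is then found at index `i * n_cols + j`. The dump uses the machine's native byte order, which is fine for a cache that is written and read on the same computer.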

Improvements to the prototype root can be delayed until there’s enough time. The prototype development cycle then has two stages, one where the developer quickly re-uses old code and writes new disposable code, and the other where one develops the re-usable codebase. This enables fast reaction time to development needs for most, if not all parts of the development cycle.

Code re-use examples

I once had to write a tool to convert a BMP texture image to the DDS format used by graphics cards. The final tool for that purpose had functions to convert from BMP to DDS, and functions to convert from DDS to BMP.

The second time an image converter was required, one of the supported file formats had to be BMP. I didn’t foresee the need for image formats with more than 32 bits per pixel, so I specified an internal packed pixel format which stored each pixel as a 32-bit word and included a small header for each image. I then modified the converter to perform all conversions through that intermediary format. Each new format then needed only one reader and one writer instead of a converter for every pair of formats, so adding formats became easier than ever.
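A conversion hub of this kind might look like the sketch below. It is only an illustration: the 0xAARRGGBB channel order, the class name `PackedImage`, and the two toy formats `rgb24` and `bgra32` are assumptions standing in for real BMP and DDS codecs, which would be far longer.

```python
from array import array
from dataclasses import dataclass


@dataclass
class PackedImage:
    """Intermediary format: a small header plus one 32-bit word per pixel."""
    width: int
    height: int
    pixels: array  # typecode "I", assumed layout 0xAARRGGBB, row-major


def decode_rgb24(width, height, raw):
    """Toy source format: tightly packed R, G, B bytes."""
    pixels = array("I")
    for i in range(0, width * height * 3, 3):
        r, g, b = raw[i], raw[i + 1], raw[i + 2]
        pixels.append(0xFF000000 | (r << 16) | (g << 8) | b)  # opaque alpha
    return PackedImage(width, height, pixels)


def encode_bgra32(image):
    """Toy target format: B, G, R, A bytes per pixel."""
    out = bytearray()
    for word in image.pixels:
        out += bytes((word & 0xFF, (word >> 8) & 0xFF,
                      (word >> 16) & 0xFF, (word >> 24) & 0xFF))
    return bytes(out)


# Supporting a new format means adding one entry to each table.
DECODERS = {"rgb24": decode_rgb24}
ENCODERS = {"bgra32": encode_bgra32}


def convert(src, dst, width, height, raw):
    """Every conversion goes source -> PackedImage -> target."""
    return ENCODERS[dst](DECODERS[src](width, height, raw))
```

For instance, `convert("rgb24", "bgra32", 2, 1, bytes([255, 0, 0, 0, 0, 255]))` turns a red and a blue pixel into their BGRA byte order with full alpha.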

Another task required estimating a bivariate probability distribution from a sample. So I wrote a quadtree routine to add points to a plane and to find the points inside a range of values. I then learned of a superior form of function estimation which required finding the nearest samples in any direction. I abandoned the quadtree routine as useless and wrote a routine for finding nearest neighbors.

The second time I needed a nearest neighbor routine I realized the NN-methods have many applications. The old routine was quite slow and specific to that application, so I recognized the need for a re-usable, well-performing NN routine. The time didn’t allow for a perfect implementation, but I got the usability right, and although there were performance issues, my library of re-usable routines grew by one more implementation.
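The trade-off described here, a convenient interface first and speed later, can be illustrated with a brute-force sketch. The class name `NearestNeighbors` and its methods are hypothetical, and the linear scan is deliberately naive; it is not the routine from my library. The point is that callers depend only on `add` and `nearest`, so the internals could later be swapped for a faster spatial structure without touching any prototype that uses it.

```python
import heapq
import math


class NearestNeighbors:
    """Brute-force k-nearest-neighbor lookup behind a small, stable interface."""

    def __init__(self):
        self._points = []
        self._items = []

    def add(self, point, item=None):
        """Store a point (any dimension) with an optional payload."""
        self._points.append(tuple(point))
        self._items.append(item)

    def nearest(self, query, k=1):
        """Return up to k (point, item) pairs, closest first.

        Scans every stored point; Euclidean distance via math.dist
        (Python 3.8 or newer).
        """
        ranked = heapq.nsmallest(
            k,
            range(len(self._points)),
            key=lambda i: math.dist(query, self._points[i]),
        )
        return [(self._points[i], self._items[i]) for i in ranked]
```

With the points (0, 0), (1, 1) and (5, 5) stored under the labels "a", "b" and "c", `nearest((0.9, 0.9), k=2)` returns the entries for "b" and then "a".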

Current code re-use goals

My current code for handling data puts the data in one type of data structure, like the nearest neighbor format or the spatial hash format. There isn’t functionality to update the other structures when one structure changes. Writing the code to update the various structures is starting to take some time, so I’m imagining a common data structure format where items may belong to several data structures and all are updated at once when one item is updated.
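One possible shape for such a common format is sketched below. Nothing here exists yet in my codebase: `ItemStore`, `SpatialHashIndex` and `SortedXIndex` are hypothetical names, and the only contract assumed is that every index offers `insert` and `remove`. The store owns the item positions and forwards each change to all registered indexes.

```python
import bisect
from collections import defaultdict


class SpatialHashIndex:
    """Buckets item ids by 2D grid cell."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)

    def _key(self, pos):
        return (int(pos[0] // self.cell_size), int(pos[1] // self.cell_size))

    def insert(self, item_id, pos):
        self.cells[self._key(pos)].add(item_id)

    def remove(self, item_id, pos):
        key = self._key(pos)
        self.cells[key].discard(item_id)
        if not self.cells[key]:
            del self.cells[key]  # drop empty buckets

    def query_cell(self, pos):
        return set(self.cells.get(self._key(pos), ()))


class SortedXIndex:
    """Keeps (x, item_id) pairs sorted; ids must be mutually comparable."""

    def __init__(self):
        self.entries = []

    def insert(self, item_id, pos):
        bisect.insort(self.entries, (pos[0], item_id))

    def remove(self, item_id, pos):
        self.entries.remove((pos[0], item_id))

    def ids_in_range(self, lo, hi):
        start = bisect.bisect_left(self.entries, (lo,))
        found = []
        for x, item_id in self.entries[start:]:
            if x > hi:
                break
            found.append(item_id)
        return found


class ItemStore:
    """Single owner of item positions; keeps every index in step."""

    def __init__(self, *indexes):
        self.positions = {}
        self.indexes = list(indexes)

    def put(self, item_id, pos):
        """Insert or move an item and update all registered indexes."""
        old = self.positions.get(item_id)
        for index in self.indexes:
            if old is not None:
                index.remove(item_id, old)
            index.insert(item_id, pos)
        self.positions[item_id] = pos
```

Moving an item with `store.put("p", new_position)` then updates the grid and the sorted list together, at the cost of one extra indirection per update.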

It seems sensible to rewrite the routines to follow this format, as the tasks of writing conversions and updates between the formats have become burdensome. This rewrite can be scheduled for idle times when there’s no rush for a specific goal in the immediate future. The performance of the data structure library might suffer a bit, but the gains in development productivity will make up for it.

I’m calling an end to data visualization with Java. The performance without DirectX is just too low for real-time animation of particle physics and other animation, and I’m already using DirectX in another language, C++. The future data visualization library will be written in C++ and it will use DirectX11 for rendering.

It will be unnecessary to work with the whole functionality provided in the DirectX11 API, so I’ll write routines which initialize the window classes and other data structures to the common settings I wish to use in all my data visualization prototypes. Other low-level functionality will be hidden behind an abstraction layer as well. If the need arises, the low-level details can always be manipulated later, since they will remain in the same codebase.

Summary

Huge chunks of code can be easily re-used when the abstraction is taken to the extreme. The use of the abstraction will be rewarded when that abstraction can be re-used without touching anything that lies below the abstraction layer. The use of the abstraction layer will prove to be an impediment to productivity when the code below that layer has to be refactored and reworked.

In the repeated rapid prototyping model there’s the underlying generic prototype which holds most of the tools the prototype developer will use in an accessible format. The generic prototype is developed carefully with ease of use, performance and code quality in mind.

When a rapid prototype is needed, the generic prototype project, the main branch, is forked. The expected lifetime of that branch is short, so maintainability of that code need not be considered. The programmer can take advantage of the usable, well-performing functionality of the base code without having to maintain the same code quality in the fork. Disposable, one-time-use programs can be quickly created.

New directions for development of the codebase will be revealed when creating the disposable prototypes. Code segments and functions which might have further use may be integrated into the main branch if the code quality is good enough for recycling. Future needs may also be predicted, and new abstractions and functionality written in the main branch.

Codebase development and maintenance can be performed separately from the prototype development. One working version of the codebase can be maintained at all times, and rapid prototypes can be derived from it. The prototype development time and the reaction time are shortened by keeping the main branch base code in good condition and treating the prototypes as disposable forks.
