The great founder of modern mathematics – Muhammad ibn Musa al-Khwarizmi – would surely have been delighted with spreadsheets.  It was he who first defined the idea of an ‘algorithm’ (an adaptation of his Latin name Algorithmi), but he never formalised his work into the x + y = z notation with which we are familiar today.  Instead he referred to the ‘root’ ‘square’ ‘thing’.   Spreadsheets have cells which do not heed to have names, and are mostly referenced by their position(e.g A1), so the calculations in a spreadsheet can represent the algorithm in it’s purest form.  The problem with spreadsheets is that the only way of applying the algorithm to more than one set of data is to copy it many times, then, copy the whole sheet

Without this flaw, the spreadsheet should be the ideal way of expressing complex algorithms, and it should thus be at the heart of any methodology that requires more than basic data manipulation.   But it is not used for work requiring a lot of data, entirely because the preferred method of storing data in computers is the relational database, and spreadsheets cannot interface simply with relational databases.  Spreadsheet data is inherently ‘messy’ while relational databases are ‘clean’ insofar as they impose a structure on data that is convenient for computers.

But what if we throw away the relational database and devise a different way of storing data that is more convenient for spreadsheets – and for those using them?  Then we can work out a new programming paradigm:

  • Source data:  Messy data sits in spreadsheets without formulae – high volume data still sits in tables.
  • Processing: : Individual spreadsheets are created to express the algorithm  – data is loaded from the sources , processed and saved away to be available to report spreadsheets
  • Output:  processed data is loaded into workbooks.

Neither the source sheets nor the output sheets need have formulae, though they will probably have a small number of formulae.  Tracking data through the system is not done by relating cells to cells – which, after all, is meaningless – but by tracing named ‘values’ through the system where those values have an ‘audit’ attached which shows their ‘provenance’

Here’s an example.

I want to create a project management system.   I have a number of projects to complete; each project is composed of ‘activities’ which may or may not overlap; each activity requires ‘resources’ – personnel or machinery.  My job is to ensure that no project gets held up because a resource is not available.  The problem is to define the start points of each project and the start point, resources required and duration of each activity within the project.

The preferred methodology for solving this problem would be to accumulate all the source information in a workbook, then be able to press a button to process the information and receive as output a workbook highlighting any potential conflicts.  I should then tweak the source data by moving start points, or assigning resources differently, then reprocess the data to see if the conflicts were resolved.  With iterations, I could find the best solution.

With our methodology, that’s exactly what you can do!


Comments are closed.