This is the first post in a series exploring programming practice in cognitive modelling. While I have over a decade of experience in cognitive modelling, I am in no way an expert and have no formal computer science training. I am not an authority on this; I am feeling my way. Comments welcome. This post may well be updated over time.
Typically, there is a lot of excitement and enthusiasm at the start of any project, and this translates into wanting to get the code written as quickly as possible. This coding stage can be fun and involve rapid iteration of ideas. This exploratory phase is great, and a lot of progress is made quite quickly. But without a little bit of planning, technical debt can accumulate. This fun term refers to how code can become gnarly, complicated, unclear, and error prone, which turns the initially fun coding process into a horrible experience. The aim of this series of posts is to explore some lessons learnt, specifically in cognitive modelling. Perhaps I can consolidate my experience and become a better coder, and if anyone else benefits from this, then great.
In an ideal world
Firstly, what we want is to be able to rapidly develop and iterate code. Often we will not have a concretely defined model that we wish to code up – most of the time we are developing models and ideas in conjunction with coding. Perhaps this is bad practice, but I’ve found that working simultaneously on the idea and implementation levels is difficult to avoid.
We don’t want to spend a long time writing code to implement a cognitive model which then turns out to be silly. There is no problem having ideas that do not pan out, but spending a lot of time on them is not so good. We will also want to test out new ideas, variations of a model, parameter estimation procedures, or data plotting methods. Visualisation of data, parameter estimates and model predictions is particularly important, because it helps guide the rapid development of our cognitive model.
What gets in the way?
Problem 1 – Mess & confusion
As my thinking and code develop, I find that the contents of my code folder become horrific. It fills up with functions, often with entirely indecipherable names. If I take a few days’ break from the code then I often have no idea what I did. It takes a long time to reconstruct what I was thinking, which files are old or current, and what magical combination of functions I need to call in order to make things work.
Problem 2 – Inefficiency and time wasting
The argument of technical debt is that if you dive straight in and just get coding, then your approach will suck and you will accumulate technical debt. This debt increases over time and drags you down, resulting in sadness and the eventual need to rewrite your code from scratch.
It took me years and years to even notice that I had a particular pattern of programming that was incredibly inefficient. I used to embed data visualisation code amongst the code that did the computation. I think this started as a way of double checking that my code was working correctly: if the results look as I expect while the code executes, then I have probably avoided mistakes. While checking results in this way is not necessarily a bad idea, having the plotting code amongst the computation code is a silly one. It means that in order to see the effect of a minor change to how data or model fits appear, I have to re-run the entire code, which can often involve waiting for time-consuming computations to complete.
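One way out of this pattern, sketched here in Python purely for illustration, is to have the slow computation save its results to disk and to keep the visualisation in a separate function that only reads those saved results. Everything in the sketch is hypothetical: fit_model is a stand-in for a time-consuming model fit, describe is a stand-in for real plotting code, and the file name is arbitrary. It is a minimal sketch of the idea, not a recommendation of any particular tool.

```python
import json
import tempfile
from pathlib import Path


def fit_model(data):
    """Hypothetical stand-in for a slow computation such as parameter estimation."""
    return {"mean": sum(data) / len(data), "min": min(data), "max": max(data)}


def compute_and_save(data, results_path):
    """Run the slow step once and store its output on disk."""
    results = fit_model(data)
    results_path.write_text(json.dumps(results))
    return results


def load_results(results_path):
    """Read previously saved results without recomputing anything."""
    return json.loads(results_path.read_text())


def describe(results, decimals=2):
    """Hypothetical stand-in for plotting: only presentation choices live here."""
    return ", ".join(f"{k}={v:.{decimals}f}" for k, v in sorted(results.items()))


if __name__ == "__main__":
    path = Path(tempfile.gettempdir()) / "model_results.json"

    # Slow step: run once.
    compute_and_save([0.2, 0.4, 0.9], path)

    # Fast step: tweak the presentation as often as needed, no recomputation.
    print(describe(load_results(path)))
    print(describe(load_results(path), decimals=1))
```

With this split, changing how the results are displayed (here, the number of decimals) only re-runs the fast step. In practice describe would be replaced by whatever plotting code the project uses.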
There are many ways to code inefficiently, and I may well add to this list over time.
Next up
This post has introduced a few problems and frustrations that I’ve encountered. Future posts will explore bite-sized solutions and recommendations.
Hi, I am looking forward to the future parts – curious about where you are heading. I have three things I try to keep for better organisation (and I usually lose them all on the way): 1) fixed template for a project (what to put where), 2) unit tests, 3) documentation
What really helps me, however, is to create a clean script reproducing the final results for a paper (and a new one for each iteration of review process).
Hi. I’ll start with some basic yet important issues in organisation, working with GitHub and Dropbox. Then I’ll move on to some more interesting topics, like setting up templates with which to approach modelling, and after that to using an object oriented approach. I agree that a script to reproduce the full process is important. Posts will happen as and when my teaching/admin/research allow 🙂