Why you should consider Pair / Mob programming (as a Data Scientist)

By Daniel Durling (Senior Data Scientist) 

At Crimson Macaw we follow an agile working style (see our previous blog post for more info). The principles of agile were written for programmers working to deliver software. Before the Agile Manifesto was written, many people had already tried to address the problems of traditional “waterfall” development, with approaches including Scrum and Extreme Programming.

Extreme Programming introduced me to the concept of pair programming. This is when two programmers work together on a single problem, crucially with one keyboard and one monitor. This allows both parties involved to learn from each other whilst working on a real problem. It also fosters important soft skills around communication and asking questions. Every decision made is reviewed by someone else, giving the code produced an instantaneous code review. I have found this leads to fewer bugs in production code.

Working in this way is very intense for both parties, but I have found this can yield impressive results (as it should do, being twice as expensive in terms of people on a single piece of work).

Mob programming, or mobbing, takes this concept from the pair and builds it up to the team level. The benefits (and potential issues) of pair programming are magnified the more people are involved in working on a single problem.

Data Science

Data Science problems can certainly benefit from being tackled in a pair / mob programming way. I have come to realise that Data Science can mean different things to different people, but if we consider this model of what Data Science includes:

[Image: data science lifecycle]

I would say at any of these stages you can benefit from working through a problem with another human being. Of course, if a human is not around, you can always try working alongside a rubber duck but I digress… 

I have found that the further towards the left-hand side you are in the above image, the more benefit there is to working with more than one person. In a more linear project, having the whole team making decisions and (crucially) understanding those decisions allows individuals to work effectively on their own as you move along the data science journey. Therefore, if it works for your team, work as a team to get the data in and tidy, then split into pairs for the understanding virtuous circle, and finally, as individuals, communicate the results / issues to the rest of the team.

In modern hybrid working environments, pair programming helps us build a sense of camaraderie when we are not all in the same room all that often. Here at Crimson Macaw, we are advocates of hybrid working, allowing for office, home and remote-based working. Pairing / mobbing makes us communicate with our colleagues, allowing us to maintain a connection as though we are all in the same room.

How we pair / mob within the Crimson Macaw Data Science chapter 

The ways in which we work are always being reviewed to see if we can improve them, but currently when we pair or mob we mean: 

  • One person holding the keyboard (they are in control of typing) 
  • Another person leading the conversation and instructing the typist
  • Other members of the Data Science chapter who are in attendance have the role of supporter; it is their role to support the other more active roles 

This pattern then holds for 25 minutes (this timing is inspired by the Pomodoro Technique). After 25 minutes we take stock and see if we are happy to end the session now (and return to pursuing alternative work) or if we would benefit from another session. If another session is needed, we take a 5-minute break and then go again. When returning we make sure that the roles (keyboardist, conversationalist, supporter) are all changed.
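For anyone who would like to plan the rotation ahead of time, here is a minimal sketch in Python of the pattern described above. It is only an illustration, not a tool we use: the function name `rotate_roles`, the example names and the simple "everyone moves up one seat" rule are all assumptions. With more than three people, some supporters will remain supporters for a round, so you may prefer a different rule.

```python
from collections import deque

WORK_MINUTES = 25   # length of one session, as described above
BREAK_MINUTES = 5   # break between sessions


def rotate_roles(members, sessions):
    """Yield (start_minute, keyboardist, conversationalist, supporters) per session.

    Assumed rule (hypothetical): after each session everyone moves up one
    seat, so the conversationalist becomes the next keyboardist.
    """
    if len(members) < 2:
        raise ValueError("pairing needs at least two people")
    queue = deque(members)
    for session in range(sessions):
        start_minute = session * (WORK_MINUTES + BREAK_MINUTES)
        keyboardist, conversationalist, *supporters = queue
        yield start_minute, keyboardist, conversationalist, supporters
        queue.rotate(-1)  # move the first person to the back of the queue


# Example with three made-up team members and three sessions.
schedule = list(rotate_roles(["Asha", "Ben", "Caro"], sessions=3))
for start, keyboardist, conversationalist, supporters in schedule:
    print(f"+{start:>3} min | keyboard: {keyboardist} | "
          f"leading: {conversationalist} | supporting: {', '.join(supporters) or 'nobody'}")
```

With two names the same function simply swaps the keyboardist and the conversationalist each session, which matches how a pair would alternate.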

Benefits of the pair / mob approach 

  • Flattens structure: By involving all our data scientists (no matter their seniority) and by having different people hold the keyboard and lead the ideas, we remove the idea of deference to the most senior person involved.
  • Pride comes before a fall: Working with someone else forces you to quickly state if you do not know how to do something and need some help (be it from the keyboardist or from the conversationalist). This can be a humbling experience, but that is no bad thing, especially when it leads to better outcomes. 

Concerns about the pair / mob approach 

Expense: Assigning more than one person to work on a ticket can seem as if you have increased the cost of the work. I would argue that the benefit of working this way outweighs this; however, your mileage may vary.

Experience: It is important for senior members of the team to lead in this space. If they do not take part, or if they take part and demand work is done the way they want with no discussion, you will not reap the benefits of pair / mob programming. I would counter this though by saying this is an issue with any way of working, because at its heart this is a personnel issue, not a working practices issue.

Conclusion  

I wanted to write this post after speaking to fellow Data Scientists who were not familiar with these software development techniques.

Why not give this way of working a try? If you do, please reach out and let me know how you get on here.