6 Reasons Why I Think Agile Data Science Does Not Work

Originally published here

For a short video of this article go here

And why I would go find another company to work for

Data is the new oil, and agile methodology is the best way to extract this valuable resource? In my humble opinion, agile methodology for data science projects works in theory but not in reality. The agile approach has been popularized by the success of agile software development, and it relies on short-term deliverables (sprints) that allow teams to show progress frequently and adapt quickly. However, when we need to do research or explore data for unknown insights, the agile methodology does not work because we can neither predefine nor schedule these activities with certainty.

1. There is no clear start or end to the project

When we talk about agile methodology, it’s difficult to pin down what exactly agile is. Some aspects of agile work well for data science projects, but some do not. The main problem with agile project management in data science is the lack of a clear start and end point. Usually, there isn’t even an idea of what the final product should look like at the beginning of an agile project. As we iterate through agile sprints, we produce working pieces of data science code. This is great, but it does not always lead to the finished product that was desired at the start. Perhaps defining the final deliverable for customers up front would be a good practice to adopt, but that is very difficult to do within the agile methodology.

While I agree that agile can work well for small projects, it can lead to major problems with large data science projects that have an indeterminate scope. The waterfall methodology, on the other hand, is more suitable for these types of business intelligence or data science projects because, unlike agile, it has defined start and end points.

The agile methodology is a great way to manage smaller projects and data science teams that do not have the resources for large-scale software development. However, if an organization has big ambitions in terms of implementing agile methodologies for their data science project management, they should be prepared for some challenges along the way.

2. The process of iterating through a backlog and prioritizing work can be difficult

Agile data science projects require a lot of flexibility. The agile data science process often includes a backlog that teams work through to prioritize what needs to be done next. Picking the right items from this list can be difficult and requires some experimentation as you learn more about your problem domain and potential solutions. I acknowledge that there is a variety of projects, ranging from very well-defined application development projects to data-driven research projects. Here I am talking mainly about projects with unclear outcomes and unclear paths to get there. The projects I have been working on often require me to do a lot of research (i.e. getting a clear understanding of the business problems, exploring potential data sources, and reviewing previous related work done internally or by others in the industry). This means that getting to the final outcome is uncertain and usually carries some amount of risk, let alone getting a consensus on the meaning of “done”. Thus, iterating through a backlog and reading through never-ending user stories toward an unclear endpoint does not make a lot of sense.

3. It’s hard to estimate how long it will take to complete a task

Building on the previous point, when using the agile methodology for data science projects, it is notoriously difficult to estimate how long a task will take to complete. In my experience, data science projects represent problems that haven’t been solved before. That means there’s no way to know how long a task will take until you’ve tried a few different approaches and compared the results. Considering the uncertainty around the solution and the pathway to get there, the agile methodology might not be the best approach to the difficult problems that we data scientists face every day.

4. A lot of time is spent on meetings, which means less time for working on actual tasks

I know I am bugging the heck out of my wife when she has to listen to my daily standup over MS Teams (both of us are working from home during this lockdown in Sydney, Australia). There are simply too many meetings. Daily standups, end-of-sprint showcases, reflections, sprint planning, and frequent catch-ups with stakeholders to gather their ever-changing requirements and project priorities are all part of agile methodology for data science projects. It is a time-consuming process that can sometimes discourage us from getting on with our actual tasks. I think agile methodology is not the best way to go for data science projects, especially considering the level of concentration we need to sort through multiple datasets from multiple sources without a clear pathway to the final desired outcome.

5. The lack of documentation may make it harder for new team members to get up-to-speed with what has been done so far in the project

Many of us don’t work alone; we work in a data science team. Or maybe you work with a data engineer and data analysts who are not in your team. As much as I have learned about the benefits of agile methodology and the “proper” way to implement the agile process, in practice I have witnessed how it actually gets implemented. Too many times, I have seen agile teams not following through with documentation. I have seen agile teams dumping everything into email, JIRA (an issue tracking system), and Confluence (documentation management software) without proper tagging and naming conventions.

I know it sounds obvious, but we should have a way of sharing what we learn from these projects so that the next person isn’t blindsided by unfamiliar code. It also helps us stay accountable if something goes wrong, and it lets others pick up where you left off more easily.

6. I don’t like agile methodology because it doesn’t allow me to explore my creativity at work as much as other methodologies do

I have to admit that I get a lot of leeway when it comes to my work. The autonomy I have is a blessing, but it can also be a curse because I am free to experiment as much as I want. In agile methodology, the client puts together a list of deliverables that they expect from me and my team at the end of each sprint. The problem with this for data science projects is that there are so many variables involved in these kinds of projects that I do not know what the deliverables might be until we get there. This limits my exploration and creativity at work because agile methodology is so linear.

I am used to working in a creative bubble where I can experiment with ideas without too much oversight, which agile methodology does not allow for in data science projects. No matter what type of data science project I am working on, a large part of it will be open to creative exploration, and agile methodology leaves no room for this.

I do believe that experimentation means testing ideas and finding the ways in which they do not work. Agile methodology, by contrast, seems to assume that the problem has already been solved and doesn’t need exploring. This is not the case for most of my work, so agile methodologies do not fit well with it.

This blog post discusses how agile methodologies do not always work in reality. In my opinion, data science requires a lot more creativity than other types of projects. Agile doesn’t give me as much freedom and wiggle room to explore and test as other project management methodologies do. That’s why I would likely call my recruiter back if I were forced to do Agile Data Science.

I understand that your experience and the organizational culture that you work in have a huge impact on this topic. Perhaps you are a data scientist, a data science team lead, a product owner, a scrum master, or an executive. I would love to hear your thoughts in the comments below. Please share your experience. I would especially love to hear about your positive experiences!