Despite the rosy predictions often made by consultancy firms¹, there are still major hurdles to making money in data science and AI. It takes significant effort to convince prominent stakeholders in a conventional industry sector of the need for AI: many established sectors already have a sound customer base, are financially stable, and do not want to explore new technology unless they have to. Data science entrepreneurs who want to leverage AI and machine learning therefore face considerable resistance.
In this blog, I will discuss a methodology that fuses Design Thinking (DT) and System Engineering (SE) into an effective flow: it incorporates agility into data science-oriented projects and fosters disruptive ideas, all while tightly managing the development phase. By gathering pain points early on and inviting feedback on possible prototypes, this framework can help increase your next project's chance of success.
The Four Phases of Design Thinking
The typical Design Thinking cycle involves four major phases: Empathize, Ideate & Define, Prototype, and Test.
Figure 1: A DT flow for the domain of data science and AI
1. Empathize

The first step is to empathize with the end users and endeavor to understand their pain points. For example, in one project that I was running with some of my students, we were exploring how to make shared taxi services in Cape Town safer. After a few rounds of informal discussions with daily commuters and taxi drivers, one pain point we discovered was that some taxis speed on empty roads in not-so-safe suburbs to avoid taxi-jacking. One solution we explored was a simple interactive mobile application that suggests a route with a higher number of vehicles to the driver (so as to avoid empty routes). During this phase, in addition to exploring the pain points, we also had to empathize with possible constraints on the end users. For example, in some countries a mobile application may not work due to a lack of inexpensive mobile data.
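The route-suggestion idea above can be reduced to a very small decision rule. The sketch below is purely illustrative (the route names and vehicle counts are hypothetical, not data from the project): given candidate routes and recent traffic counts, it simply prefers the busiest route so the driver avoids empty roads.

```python
def suggest_route(routes):
    """routes: list of (name, recent_vehicle_count) tuples.

    Returns the name of the route with the most observed traffic.
    """
    if not routes:
        raise ValueError("no candidate routes")
    # Prefer the busiest route: empty roads were the safety risk.
    return max(routes, key=lambda r: r[1])[0]

# Hypothetical candidates with recent vehicle counts.
candidates = [("N2 highway", 120), ("Back road", 4), ("Main Rd", 75)]
print(suggest_route(candidates))  # "N2 highway"
```

A real system would of course weigh more than raw counts (time of day, detour length), but even this toy rule is enough to prototype the interaction with drivers.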
2. Ideate & Define
In the next stage, we need to ideate as a team, unpack the interview results, and try to converge on defining a set of initial needs. From a data science perspective, this also means that we need to decide on what data to use and how it should be collected.
Now, it’s easy to assume that we all know what the meaning of “data” is, but there is no standard definition. I have personally found the DIKW pyramid helpful, as it suggests that initial “data” is used to generate “information,” which in turn generates “knowledge,” which ultimately gives rise to “wisdom.”² Once we brainstorm what we need in order to ameliorate the end users’ pain points, we can think of what knowledge and information might help us gather that wisdom. Finally, we consider what data might lead us to the information we require.
Note that for “information” we use Fisher information as a quantitative measure, whereas for “knowledge” we use clustering metrics. Such quantitative measures enable us to exploit the data to its best capacity and to understand any limitations that may exist.
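To make these two measures concrete, here is a hedged sketch with toy numbers (all values are illustrative, not from the projects discussed). For a Gaussian with known variance, the Fisher information of the mean is n/σ², and a crude clustering score can compare between-cluster separation to within-cluster spread:

```python
def fisher_information_gaussian_mean(n_samples, sigma):
    # For N(mu, sigma^2) with known sigma, the Fisher information of mu
    # is n / sigma^2: more samples or less noise pins the parameter down.
    return n_samples / sigma**2

def separation_score(cluster_a, cluster_b):
    # Toy, silhouette-style score for two 1-D clusters:
    # distance between cluster means divided by average within-cluster range.
    mean_a = sum(cluster_a) / len(cluster_a)
    mean_b = sum(cluster_b) / len(cluster_b)
    spread = ((max(cluster_a) - min(cluster_a))
              + (max(cluster_b) - min(cluster_b))) / 2
    return abs(mean_a - mean_b) / spread

print(fisher_information_gaussian_mean(100, 2.0))   # 25.0
print(separation_score([1, 2, 3], [10, 11, 12]))    # 4.5
```

In practice one would use a proper clustering metric (e.g. a silhouette coefficient), but the point stands: both "information" and "knowledge" can be quantified, which tells us early whether the data can support the decisions we want to make.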
3. Prototype

It is very important to keep a prototype simple and functional. In this step, we must go back to the end users for feedback and gather some real, limited data. In another one of my projects, we used machine learning on sensor data to track the immediate environment. The main hypothesis was that even if there was not enough data to fully characterize the events, the system could at least distinguish major events. With this key hypothesis in hand, we did not have to spend time and energy building a fully functional system. Rather, we could hack together a minimum viable prototype, gather some real data, and check whether the hypothesis could be validated at all.
Figure 2a shows the hacked system we used to gather some initial data. We then used Principal Component Analysis to view the clustering characteristics (as shown in Figure 2b). The result was encouraging, and we continued to improve it until we had a better prototype for detecting motion behind a wall.
For a deeper appreciation of the topic, you can read the Prototyping chapter of The Innovator's Method: Bringing the Lean Start-up into Your Organization.
Figure 2a: Hacked hardware to test the main hypothesis in one of my projects; Figure 2b: PCA used for initial clustering analysis
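The PCA step can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the sensor readings here are synthetic, with two hypothetical event classes, and we simply project onto the top two principal components to eyeball whether the classes separate at all.

```python
import numpy as np

def pca_2d(X):
    """X: (n_samples, n_features) array. Returns the 2-D PCA projection."""
    Xc = X - X.mean(axis=0)                      # center each feature
    # Eigen-decompose the feature covariance; eigh returns ascending eigenvalues.
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    top2 = vecs[:, np.argsort(vals)[::-1][:2]]   # two largest components
    return Xc @ top2

rng = np.random.default_rng(0)
# Synthetic stand-ins for two event classes from the hacked sensors.
event_a = rng.normal(0.0, 0.5, size=(50, 6))     # e.g. "no motion"
event_b = rng.normal(3.0, 0.5, size=(50, 6))     # e.g. "motion present"
proj = pca_2d(np.vstack([event_a, event_b]))     # plot proj to inspect clusters
```

If the classes form visibly distinct clusters in `proj`, the hypothesis that the data can distinguish major events survives its first test, and a fuller system is worth building.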
4. Test

The last step is field testing: taking our prototype to the end users and gathering data. Here, we can use visualization techniques like PCA to gauge the success of the prototype solution.
This cycle can be repeated until we have a fixed set of informed user requirements grounded with a deep understanding of customer pain points and data collection limitations.
System Engineering Process
Once the iterative Design Thinking phase is over, we know (at least tentatively) the right thing to design. At this point, we can move to the more conventional system engineering process, which enables us to design that thing correctly, or in the best way possible. There are multiple well-established standards for system engineering, e.g. IEEE 26702-2007.³
Figure 3 gives a rough flow of the steps that can be taken in this phase. The first step is to analyze the user requirements captured in the Design Thinking phase and extract the technical requirements. Quantitative figures of merit are preferred in the technical specifications, as they make it easier to list the acceptance test procedures (ATPs) for the final performance of the design. All of this can be linked in a standard traceability matrix.⁴
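In its simplest form, a traceability matrix is just a mapping from each user requirement to its technical specification and its ATP. The sketch below is a hypothetical example (the requirements, specs, and ATPs are invented for illustration), with a small check that flags any requirement left untraced:

```python
# Hypothetical traceability matrix: user requirement -> spec and ATP.
trace = {
    "UR-1: driver avoids empty roads": {
        "spec": "route suggestion latency < 2 s",
        "atp": "measure end-to-end latency over 100 requests",
    },
    "UR-2: usable without mobile data": {
        "spec": "core features available offline",
        "atp": "run feature checklist in airplane mode",
    },
}

def untraced(matrix):
    """Return requirements missing a spec or an ATP."""
    return [req for req, links in matrix.items()
            if not links.get("spec") or not links.get("atp")]

print(untraced(trace))  # [] -> every requirement is covered by a spec and ATP
```

Because every spec carries a quantitative figure of merit, each ATP becomes a pass/fail measurement rather than a judgment call, which is exactly what makes the final acceptance testing tractable.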
The final implementation or development can follow any of the standard processes. We prefer the V-model, as it focuses on the interactions between sub-modules and on the validation plans for each of those sub-modules.⁵
Figure 3: System Engineering process
The process described above is in no way linear. Its biggest strength is that it can be iterated at any stage. Because the process is well documented, such agile spiraling does not affect the integrity of the project. Fusing Design Thinking with System Engineering is one of the best ways to foster disruptive ideas through a high-agility process and deliver reliable data science solutions.
1. McKinsey & Company, The Real-World Potential and Limitations of Artificial Intelligence, https://www.mckinsey.com/featured-insights/artificial-intelligence/the-real-world-potential-and-limitations-of-artificial-intelligence
2. A DIKW Paradigm to Cognitive Engineering, https://arxiv.org/abs/1702.07168
3. IEEE 26702-2007 - ISO/IEC Standard for Systems Engineering - Application and Management of the Systems Engineering Process, https://standards.ieee.org/standard/26702-2007.html
4. Wikipedia Traceability Matrix, https://en.wikipedia.org/wiki/Traceability_matrix
5. Wikipedia V-model (Software Development), https://en.wikipedia.org/wiki/V-Model_(software_development)