When OpenAI released its new large language model, GPT-4, last week, I could not resist and signed up for ChatGPT Plus at $20 per month. I wanted to determine whether ChatGPT 4 is superior to its predecessor, which has left a good impression in recent months; see my previous articles on Scrum, Agile, and ChatGPT.
I decided to run three comparisons, using the identical prompt to trigger answers from the new GPT-4 and previous GPT-3.5 models. Read on and learn what happened. It was not a foregone conclusion.
I ran the session on ChatGPT on Wednesday, March 15, from 5:40 am CET to 6:30 am CET on the models GPT-3.5 and GPT-4 after upgrading to ChatGPT Plus.
I did not apply Grammarly to ChatGPT’s answers.
I used the following initial prompt to prime ChatGPT: “Are you using the same training data as GPT-3.0 or 3.5?”
ChatGPT-4.0:
As a GPT-4 model, I am an updated version of the previous GPT models. While I share some similarities with GPT-3 and GPT-3.5, there are differences in the training data used for each model.
The primary difference is the time period of the data. GPT-3 and GPT-3.5 were trained on data available up until around 2020 and 2021, respectively. In my case, as GPT-4, my training data extends until September 2021. This allows me to have a more up-to-date understanding of the world, including newer events and developments.
Additionally, improvements in data curation, preprocessing, and filtering techniques have likely been made between versions, leading to differences in the quality and breadth of the training data. While the core data sources might be similar, such as books, articles, and websites, the data sets themselves would not be identical across these models.
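Now that we have a better understanding of the model’s background, let’s run three experiments to figure out which model is superior. Each is an A/B test that uses the identical prompt to trigger answers from the new GPT-4 and the previous GPT-3.5 model: comparing the Scrum Guide 2017 with the Scrum Guide 2020, designing a Retrospective after a failed Sprint, and creating and maintaining a Definition of Done.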
Learn more about the new GPT-4 model from OpenAI’s GPT-4 Developer Livestream of March 14, 2023, or the OpenAI GPT-4 Technical Report.
Prompt: I want you to act as a Scrum Master. Tell me: what is the difference between the Scrum Guide 2017 and the Scrum Guide 2020?
ChatGPT-3.5:
As a Scrum Master, I'd be happy to answer your question about the differences between the Scrum Guide 2017 and the Scrum Guide 2020.
The Scrum Guide is the official guide to Scrum, and it provides a framework for implementing and improving Scrum practices. The latest version of the Scrum Guide was released in November 2020, and it contains several updates from the previous version released in 2017. Here are some of the key differences between the two versions:
Overall, the changes in the Scrum Guide 2020 are intended to make the framework more accessible and adaptable to a wider range of contexts while still maintaining the core principles and values of Scrum.
Comment: These answers are either irrelevant or hallucinations. Scrum Master and Product Owner are now “equal members of the Scrum”? No mention of the commitments, namely the Product Goal? Highlighting the importance of empirical process control? That has been at the core of Scrum from the beginning. The answer is embarrassing.
ChatGPT-4.0:
As a language model, I can provide you with the key differences between the Scrum Guide 2017 and the Scrum Guide 2020. Here are some of the most notable changes:
Less prescriptive and more flexible: The Scrum Guide 2020 is shorter and more focused on the Scrum framework's core concepts. It has reduced the prescriptive nature of some practices, allowing teams more flexibility in adopting Scrum according to their specific context.
Introduction of the Product Goal: The 2020 version introduces the concept of a Product Goal, which is a long-term objective for the product. It provides guidance and direction for the team, and all the Sprint Goals should align with and contribute to the Product Goal.
Simplified team structure: The Scrum Guide 2020 eliminates the distinction between development team members and other roles. It now refers to the whole team as the "Scrum Team," consisting of the Scrum Master, Product Owner, and Developers, all of whom share accountability for delivering a potentially releasable increment each sprint.
A greater focus on commitment: The Scrum Guide 2020 emphasizes the importance of commitment for each artifact. The Product Backlog has the Product Goal, the Sprint Backlog has the Sprint Goal, and the Increment has the Definition of Done. These commitments help provide clarity and focus for the Scrum Team.
Self-managing instead of self-organizing: The 2020 version refers to Scrum Teams as self-managing, emphasizing their autonomy in making decisions regarding how to achieve the Sprint Goal, and how to create the product Increment. This change highlights the team's responsibility for organizing and managing their own work, rather than relying on external direction.
Removal of the Sprint Retrospective's three-question format: The 2020 Scrum Guide no longer suggests the specific three-question format ("What went well? What can be improved? What will we commit to improve in the next Sprint?") for the Sprint Retrospective. Instead, it emphasizes the need to plan improvements and foster a continuous improvement mindset.
Updated the Definition of Done: The Scrum Guide 2020 clarifies that the Definition of Done is a formal description of the state of the Increment when it meets the quality measures required for the product. It also emphasizes that the Definition of Done is used to assess when work is complete on the product Increment.
These are some of the main differences between the Scrum Guide 2017 and the Scrum Guide 2020. The 2020 version aims to provide a more straightforward, flexible, and focused framework for Scrum practitioners, emphasizing the importance of team collaboration, commitment, and continuous improvement.
Comment: We have a clear winner! GPT-4’s answer is better and more substantial than the previous model’s.
Score: ChatGPT-4 1-0 ChatGPT-3.5
Prompt: I want you to act as an experienced Scrum Master.
Your Scrum team had a problematic Sprint and missed accomplishing the Sprint Goal. Some team members had unexpected food poisoning. The procurement department failed to renew credit card information for an essential service your team uses, and that service was no longer available. Additionally, the Scrum team ran into unexpected technical debt hidden in the legacy part of the code base. The tension in the team is high; arguments have become more frequent.
As a Scrum Master, I want you to use your knowledge of Scrum, iterative and incremental product development, non-violent communication, collaboration, team building, Scrum values, psychology, and Liberating Structures to design a Retrospective that helps the Scrum team to overcome recent setbacks and improve the relationships among each other.
ChatGPT-3.5:
I'm sorry to hear that your Scrum team had a challenging Sprint. As an experienced Scrum Master, I'm here to help you navigate this difficult situation and design a Retrospective that will help the team overcome recent setbacks and improve relationships among team members.
First, it's essential to acknowledge that setbacks and challenges are a natural part of the Scrum process. Our goal is not to eliminate problems but to learn from them and use them as opportunities for growth and improvement.
To start the Retrospective, we need to create a safe and respectful space where team members can express themselves honestly and openly. One Liberating Structure that could help us achieve this is the “1-2-4-All” structure, which involves individuals reflecting silently, sharing with a partner, joining a group of four to discuss their thoughts, and then sharing the group's ideas with the entire team.
We'll start by asking the team to reflect on the past Sprint and identify what went well and what didn't go as planned. We'll then focus on the challenges that arose during the Sprint, including the unexpected technical debt, credit card issues, and team members getting sick.
Once we have identified the challenges, we'll use the Liberating Structure “TRIZ” to explore potential solutions. TRIZ stands for “Theory of Inventive Problem Solving,” and it's a structured approach to generating creative solutions to complex problems. We'll ask the team to identify the root cause of each problem and then use TRIZ to generate new ideas for addressing those issues.
Next, we'll use the Liberating Structure “Celebrity Interview” to encourage empathy and understanding among team members. Each team member will take on the role of a "celebrity" who embodies certain values or characteristics, and the team will ask them questions about how they would handle the recent setbacks.
Throughout the Retrospective, we'll emphasize the importance of non-violent communication and collaboration, and we'll remind the team of the Scrum values of commitment, courage, focus, openness, and respect.
By the end of the Retrospective, our goal is to have identified concrete steps that the team can take to address the challenges that arose during the previous Sprint and improve their relationships with each other. We'll create an action plan that outlines these steps and assign owners to each task to ensure accountability and follow-through.
Comment: Sounds good initially, doesn’t it? Too bad the suggestion does not pass the test. Of course, you could use 1-2-4-All to start the Retrospective. However, if every team member should get their airtime, a structure like “Conversation Café” would be better suited. Instead of TRIZ, I would use “Nine Whys” or “What, So What, Now What?” Finally, the “Celebrity Interview” is useless for the intended purpose. If empathy is your main concern, “Heard, Seen, Respected” would be a better choice.
ChatGPT-4.0:
As an experienced Scrum Master, I would design a Retrospective that not only addresses the setbacks but also focuses on rebuilding trust, improving communication, and fostering a collaborative environment within the team. The Retrospective will follow these five steps:
Follow up with the team in the upcoming Sprints to ensure that the action items are being implemented and to gauge the effectiveness of the improvements. Encourage the team to maintain open communication and support each other as they work towards achieving their goals.
Follow-up Prompt: Is it necessary to run the Retrospective in this strict, prescriptive manner?
ChatGPT-4.0:
While the five-step structure I provided is a popular and effective way to conduct Retrospectives, it is not mandatory to follow it in a strict, prescriptive manner. As a Scrum Master, you should be flexible and adapt the format based on your team's needs, preferences, and specific situation.
It is essential to maintain the core principles of a Retrospective, which include creating a safe and open environment for discussion, reflecting on the past Sprint, generating insights, identifying improvements, and fostering a sense of collaboration and continuous improvement within the team.
You can experiment with different techniques and activities to achieve these objectives, and it is common for experienced Scrum Masters to tailor Retrospectives to suit their team's dynamics. Some alternative approaches include:
The key is to be receptive to your team's feedback and continuously improve the Retrospective process to ensure it remains valuable and engaging for everyone involved.
Comment: The first suggestion was merely a standard application of the Esther Derby and Diana Larsen model, ignoring the additional information on the situation from the prompt. If we also consider the answer to the follow-up prompt, the GPT-4-based performance gets a bit better, although it is mediocre at best. However, such a performance is sometimes sufficient when the competitor’s answer is even worse.
Score: ChatGPT-4 2-0 ChatGPT-3.5
Prompt: I want you to act as an experienced Scrum Master.
The Definition of Done is essential for a Scrum team’s success. It defines the quality standard any Increment is expected to meet by stakeholders and governance bodies. The quality level does not have to be perfect or top-of-the-line in all situations. However, it is critical that once an Increment is “done,” it can be delivered to the customers without any legal, ethical, or financial repercussions.
I want you to use your knowledge of Scrum, iterative and incremental product development, collaboration, product management, stakeholder collaboration, quality management, innovation, and business agility in general to suggest a process on how a new Scrum team can create an initial Definition of Done and then maintain it accordingly.
ChatGPT-3.5:
Sure, I'd be happy to help you with that.
Creating an Initial Definition of Done:
Maintaining the Definition of Done:
Overall, creating and maintaining a Definition of Done is critical to the success of a Scrum team. By following these steps, the team can ensure that they're delivering quality Increments that meet stakeholder expectations and can be released without any legal, ethical, or financial repercussions.
Comment: Wow, who would have thought ChatGPT-3.5 could deliver at such a high level? The approach is collaborative and inclusive, technical debt is part of the consideration, and the Definition of Done should be simple and regularly inspected and adapted, with feedback from stakeholders being as valuable as data from tracking suitable metrics.
ChatGPT-4.0:
As an experienced Scrum Master, I recommend the following process for a new Scrum team to create an initial Definition of Done (DoD) and maintain it:
By following this process, a new Scrum team can create an initial Definition of Done and maintain it to ensure consistent quality and alignment with stakeholder expectations throughout the product development lifecycle.
Comment: Gee, this is unexpected. It sounds like a PMI brochure.
Follow-up Prompt: Your suggestion contains too much project management thinking and ignores the importance of collaboration and teamwork in a Scrum team. Partly, it is plain wrong: A Scrum Master does not create the initial Definition of Done. Also, they do not enforce compliance; that is the task of the Developers. Recreate the answer based on Scrum’s first principles of self-management and servant leadership.
ChatGPT-4.0:
I apologize for any confusion in my previous response. As a Scrum Master, your role is to facilitate and guide the team through the process of creating and maintaining the Definition of Done (DoD), while emphasizing the importance of collaboration, teamwork, and self-management. Here's an updated approach based on Scrum's first principles:
By following this approach, you can enable the Scrum team to create and maintain a Definition of Done that is rooted in collaboration, self-management, and continuous improvement, while staying true to the principles of servant leadership.
Comment: ChatGPT-4.0 talks a lot about collaboration, self-management, and continuous improvement. However, it seems that the Scrum Master is still in charge of pushing the Developers in the right direction from the Scrum Master’s perspective. Honestly, I like ChatGPT-3.5’s answer better. PS: The Product Owner is also a member of the Scrum team.
Score: ChatGPT-4 2-1 ChatGPT-3.5
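If you would like to repeat this kind of comparison with your own prompts, the procedure boils down to two steps: send the identical prompt to both models, then tally the winners that a human reviewer picked. The following is only a minimal sketch of that idea in Python. The names `run_ab_test` and `scoreline` are made up for illustration, and the two "models" are stand-in functions that do not call any real service; the session described in this article was run by hand in the ChatGPT interface.

```python
from collections import Counter
from typing import Callable, Dict, List

# A "model" here is any function that maps a prompt to an answer.
# This is an assumption for illustration, not a real client library.
Model = Callable[[str], str]


def run_ab_test(prompt: str, models: Dict[str, Model]) -> Dict[str, str]:
    """Send the identical prompt to every model and collect the answers."""
    return {name: ask(prompt) for name, ask in models.items()}


def scoreline(winners: List[str], a: str, b: str) -> str:
    """Tally the winner of each experiment into a score line."""
    wins = Counter(winners)  # a model without wins counts as 0
    return f"Score: {a} {wins[a]}-{wins[b]} {b}"


# Winners of the three experiments, as judged in the comments above.
WINNERS = ["ChatGPT-4", "ChatGPT-4", "ChatGPT-3.5"]

if __name__ == "__main__":
    # Hypothetical stand-ins that only echo the prompt.
    stubs = {
        "ChatGPT-4": lambda p: f"[GPT-4 answer to: {p}]",
        "ChatGPT-3.5": lambda p: f"[GPT-3.5 answer to: {p}]",
    }
    answers = run_ab_test("I want you to act as a Scrum Master.", stubs)
    for name, answer in answers.items():
        print(name, "->", answer)
    print(scoreline(WINNERS, "ChatGPT-4", "ChatGPT-3.5"))
```

Note that the judging itself stays manual: deciding which answer is better requires Scrum knowledge, and the code merely keeps the bookkeeping honest.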
ChatGPT based on the GPT-4 model has advantages over the “old” GPT-3.5 model. OpenAI promised better reasoning and conciseness and delivered. However, judging by three small everyday experiments, the new model’s advantage is not as conclusive as expected. So, let’s wait for GPT-5. Nevertheless, I will continue my subscription.
What is your opinion as a Scrum Practitioner: Is ChatGPT-4.0 worth spending $20 per month compared to ChatGPT-3.5? Please share it with us in the comments.
Club Scrum: What Are You Doing all Day, ChatGPT — as a Scrum Master?
ChatGPT Prompts for Scrum Masters, Product Owners, and Developers
Agile Transformation: ChatGPT or McBoston?
A ChatGPT Job Interview for a Scrum Master Position
Agility and Scrum According to OpenAI’s ChatGPT — Be Surprised!
What Is ChatGPT Doing … and Why Does It Work?