Pilot to Copilot: Where is my typo - Sun, Oct 8, 2023
After my first experience with the GitHub CoPilot, I wanted to go beyond simple ad hoc code generation. My idea was (and still is) to use the CoPilot to implement a feature with as few additional human actions as possible. The process would look like this:
- Write a BDD style feature file
- Use the CoPilot to generate the Python code for the feature
- Generate the step definitions for testing the feature
In the ideal case, only minor adjustments on my side (like putting code into the correct files) would be needed.
The Setup
I wrote a very simple Gherkin scenario for a Python function that could add and subtract numbers. The scenario, which can be found here, looked like this:
Feature: A simple calculator
  Scenario Outline: Basic calculation
    Given I create a calculator instance
    When I call the method calculate with the paramneters <number_a>, <number_b> and the <sign>
    Then Then the method should return be <result>
    Examples:
      | number_a | number_b | sign | result |
      | 1        | 1        | +    | 2      |
      | 1        | 1        | -    | 0      |
So really no rocket science. The full source code of this example can be found here.
Using the new chat function to generate code
Using the new chat feature, I selected the scenario and prompted the CoPilot to generate a Python class for it. The prompt looked like this:
Creata a python class that fulfills this specification
So much for being precise when prompting: the generated code did not only handle addition and subtraction but also multiplication and division. So I got a little more specific in the next prompt:
Creata a python class that only fulfills this specification
Note the word only in the prompt. The generated code, which can be found here, seemed much better this time:
class Calculator:
    def calculate(self, number_a, number_b, sign):
        if sign == '+':
            return number_a + number_b
        elif sign == '-':
            return number_a - number_b
        else:
            raise ValueError("Invalid sign")
For the time being I was happy with the result and put the code in a file called calculator.py.
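A quick way to check the generated class by hand, before any step definitions exist, is to call it directly. The following is only a minimal sketch: it repeats the class from above so that it runs on its own, and the calls at the bottom are illustrative additions that mirror the two example rows of the scenario.

```python
class Calculator:
    def calculate(self, number_a, number_b, sign):
        if sign == '+':
            return number_a + number_b
        elif sign == '-':
            return number_a - number_b
        else:
            raise ValueError("Invalid sign")


calculator = Calculator()

# The two rows from the Examples table of the scenario.
print(calculator.calculate(1, 1, '+'))  # 2
print(calculator.calculate(1, 1, '-'))  # 0

# Anything other than '+' or '-' is rejected, as requested by "only".
try:
    calculator.calculate(1, 1, '*')
except ValueError as error:
    print(error)  # Invalid sign
```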
Next I needed step definitions for the scenario.
Generating step definitions
So back to the chat: I highlighted the scenario again and started a code chat with the following prompt:
Create python step implemenations for the scenario using python's behave framework and the class Calculator
This time it seemed like the CoPilot’s code was spot on. The generated code, which can be found here, looked like this:
from behave import given, when, then
from calculator import Calculator

@given('I create a calculator instance')
def step_impl(context):
    context.calculator = Calculator()

@when('I call the method calculate with the parameters {number_a:d}, {number_b:d} and the {sign}')
def step_impl(context, number_a, number_b, sign):
    context.result = context.calculator.calculate(number_a, number_b, sign)

@then('the method should return {result:d}')
def step_impl(context, result):
    assert context.result == result
After putting the code into the file steps/steps.py, I was ready to run the scenario.
Running the scenario
I used the behave framework to run the scenario but the run failed with the following error message:
2 steps passed, 0 failed, 0 skipped, 4 undefined
Took 0m0.000s
You can implement step definitions for undefined steps with these snippets:
@when(u'I call the method calculate with the paramneters 1, 1 and the +')
def step_impl(context):
    raise NotImplementedError(u'STEP: When I call the method calculate with the paramneters 1, 1 and the +')

@then(u'Then the method should return be 0')
def step_impl(context):
    raise NotImplementedError(u'STEP: Then Then the method should return be 0')
The expressions in the step implementations generated by the CoPilot obviously did not match the Gherkin specification. So what was the problem?
Automatic spell checking?
At first I thought the problem was related to the parameter types in the expression, so I tried changing them from integer (:d) to string. But if you look closely at the generated code, you will see that the CoPilot generated the following expression:
@when('I call the method calculate with the parameters {number_a:d}, {number_b:d} and the {sign}')
while the actual Gherkin sentence in the scenario was:
When I call the method calculate with the paramneters <number_a>, <number_b> and the <sign>
Notice the typo in the word paramneters? The CoPilot generated the word parameters instead of paramneters, a typo I made right at the start when writing the scenario.
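Why a single misspelled word leaves a step undefined can be illustrated with a small stand-in for the matching. behave's default matcher is built on the parse module; the sketch below does not use it and instead translates the pattern into a regular expression with the standard library, so the helper pattern_to_regex is a hypothetical simplification and not behave's actual implementation (among other things, it leaves the captured numbers as strings).

```python
import re


def pattern_to_regex(pattern):
    """Translate a behave-style pattern into a regex (simplified stand-in)."""
    # Split into literal text and {name} / {name:type} fields.
    parts = re.split(r"(\{\w+(?::\w+)?\})", pattern)
    regex = ""
    for part in parts:
        field = re.fullmatch(r"\{(\w+)(?::(\w+))?\}", part)
        if field is None:
            regex += re.escape(part)  # literal text must match character for character
        elif field.group(2) == "d":
            regex += r"(?P<%s>-?\d+)" % field.group(1)  # integer field
        else:
            regex += r"(?P<%s>.+?)" % field.group(1)  # untyped field
    return re.compile(regex)


generated = ("I call the method calculate with the parameters "
             "{number_a:d}, {number_b:d} and the {sign}")
step_with_typo = "I call the method calculate with the paramneters 1, 1 and the +"
step_fixed = "I call the method calculate with the parameters 1, 1 and the +"

matcher = pattern_to_regex(generated)
print(matcher.fullmatch(step_with_typo))          # None, so the step stays undefined
print(matcher.fullmatch(step_fixed).groupdict())  # {'number_a': '1', 'number_b': '1', 'sign': '+'}
```

The literal parts of the pattern are compared exactly, so the "corrected" spelling in the generated expression can never match the misspelled step text.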
This kind of auto-correction was also the cause of the second error message. The CoPilot generated the following expression:
@then('the method should return {result:d}')
Again, the Gherkin statement in the scenario looked different:
Then Then the method should return be <result>
Notice the duplicated word Then and the stray be, both of which the CoPilot removed. After fixing the typo and removing the superfluous words, the scenario ran fine.
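Spotting such silent corrections by eye is tedious, so a word-level comparison between the step text and the generated expression can help. The following is a minimal sketch using difflib from the standard library; the helper word_diff and the normalisation of {name:d} fields to <name> placeholders are illustrative assumptions, not something behave or the CoPilot provides.

```python
import difflib
import re


def word_diff(step_text, pattern):
    """List the words that differ between a Gherkin step and a step pattern."""
    # Rewrite {name} / {name:type} fields as <name> so placeholders compare equal.
    normalised = re.sub(r"\{(\w+)(?::\w+)?\}", r"<\1>", pattern)
    step_words = step_text.split()
    pattern_words = normalised.split()
    matcher = difflib.SequenceMatcher(a=step_words, b=pattern_words, autojunk=False)
    return [
        (tag, step_words[a1:a2], pattern_words[b1:b2])
        for tag, a1, a2, b1, b2 in matcher.get_opcodes()
        if tag != "equal"
    ]


when_diff = word_diff(
    "I call the method calculate with the paramneters <number_a>, <number_b> and the <sign>",
    "I call the method calculate with the parameters {number_a:d}, {number_b:d} and the {sign}",
)
then_diff = word_diff(
    "Then the method should return be <result>",
    "the method should return {result:d}",
)

print(when_diff)  # [('replace', ['paramneters'], ['parameters'])]
print(then_diff)  # [('delete', ['Then'], []), ('delete', ['be'], [])]
```

The step texts passed in are the ones behave reported as undefined, that is, without the leading Gherkin keyword. The second result lists both the duplicated Then and the stray be that appear in the scenario and in the behave output.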
Conclusion
Well, the most obvious conclusion is that I make far too many typos ;-)
But seriously, I think the CoPilot did a good job of generating the code. This example also shows, however, that the CoPilot does not really understand the context or meaning of the code it generates; it just treats it as another language. That would explain why it quietly corrected my typos instead of reproducing the step text exactly as behave needs it.
This again emphasizes that the code generated by the CoPilot needs to be carefully reviewed before using it.