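Copilot Chat can analyze test failures and help identify their likely causes.
Example scenario: Tests pass locally but fail in CI
Consider a scenario where a test passes on your local machine but sometimes fails in CI. Copilot Chat can help you identify what is causing the failure.
In this example, the code under test defines a simple order service (order.py), and a corresponding test checks whether an order was created today (test_order_service.py).
Example prompt
The prompt below gives Copilot the relevant code and test files (referenced with #file:) and includes the relevant excerpt of the CI failure output.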
Please take a look at this CI failure message. The test passes locally, but intermittently fails in CI. Can you help me figure out if this looks like a code bug, environment issue, or a flaky test?
Failure:
___ TestOrderService.test_order_created_today ___
> assert order["created_date"] == date.today()
E AssertionError: assert datetime.date(2024, 1, 15) == datetime.date(2024, 1, 16)
test_order_service.py:45: AssertionError
#file:order.py
#file:test_order_service.py
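Example response
Note
The following response is an example. Copilot Chat responses are non-deterministic, so you may get a different response from the one shown here.
Copilot Chat notices that the two dates differ by exactly one day and concludes that this is likely a time zone or time boundary issue.
The local machine and the CI runner may use different time zone settings, or may derive today from different clocks (UTC versus local time), so when the test runs near midnight, date.today() can return a different date in each environment.
Copilot Chat suggests treating this failure as test flakiness caused by environment and time assumptions rather than a logic bug, and fixing it by making the calculation of today consistent across environments.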
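As a rough illustration of that suggestion, the sketch below first shows how a single instant can fall on two different calendar dates, then shows one possible way to make today consistent: compute it in UTC and let the test inject a fixed date. The names create_order, utc_today, and the today parameter are hypothetical; the contents of order.py are not shown in this article, so treat this as a sketch under those assumptions rather than the actual code.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Callable


def utc_today() -> date:
    """Return the current calendar date in UTC, regardless of machine settings."""
    return datetime.now(timezone.utc).date()


# Hypothetical stand-in for order.py: the clock is a parameter, so every
# environment (and every test) agrees on how "today" is computed.
def create_order(item: str, today: Callable[[], date] = utc_today) -> dict:
    return {"item": item, "created_date": today()}


# Why the failure can happen: one instant, two calendar dates.
instant = datetime(2024, 1, 15, 23, 30, tzinfo=timezone.utc)
utc_date = instant.date()
# The same instant seen from a UTC+02:00 zone is already the next day.
local_date = instant.astimezone(timezone(timedelta(hours=2))).date()


def test_order_created_today() -> None:
    # The test pins the date instead of calling date.today() a second time.
    fixed = date(2024, 1, 15)
    order = create_order("book", today=lambda: fixed)
    assert order["created_date"] == fixed
```

Injecting the clock is only one option. Setting the same time zone in both environments, or using a time-freezing test helper, would address the same assumption.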
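Example scenario 2: Intermittent test failures
Consider a scenario where a test sometimes passes and sometimes fails on the same machine. Copilot Chat can compare logs from a passing run and a failing run to help identify the cause.
In this example, the code under test in order_service.py uses a background job to update the order status asynchronously, and the test in test_order_service.py asserts that the final status is "processed".
Example prompt
The prompt below gives Copilot the failure message, log excerpts from one passing run and one failing run, and the relevant code files (referenced with #file:).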
This test passes sometimes and fails sometimes. Can you compare the logs and help me figure out why?
Failure message:
> assert order.status == "processed"
E AssertionError: assert "pending" == "processed"
test_order_service.py:62: AssertionError
Logs from a passing run:
[DEBUG] Created order #1234
[DEBUG] Background job started for order #1234
[DEBUG] Background job completed (52ms)
[DEBUG] Checking order status
[DEBUG] Order #1234 status: processed
Logs from the failing run:
[DEBUG] Created order #1234
[DEBUG] Background job started for order #1234
[DEBUG] Checking order status
[DEBUG] Order #1234 status: pending
#file:order_service.py
#file:test_order_service.py
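Example response
Note
The following response is an example. Copilot Chat responses are non-deterministic, so you may get a different response from the one shown here.
Copilot Chat compares the two logs and finds that in the passing run the background job completed before the status check, while in the failing run the job was still running when the status check happened. Copilot Chat points out that this is a race condition, because the test does not wait for the background job to complete.
Copilot Chat suggests adding a mechanism that ensures the job completes before the assertion, such as running the job synchronously, waiting for completion (for example, through a callback), or polling until the status changes.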
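Two of those options, waiting for completion and polling, might look something like the sketch below. The OrderService class, its wait_for method, and the wait_until helper are hypothetical stand-ins, since the contents of order_service.py are not shown in this article. The background job is simulated with a thread that sleeps briefly.

```python
import threading
import time
from typing import Callable, Dict


class OrderService:
    """Hypothetical stand-in for order_service.py, using a thread as the background job."""

    def __init__(self) -> None:
        self.status: Dict[int, str] = {}
        self._jobs: Dict[int, threading.Thread] = {}

    def create_order(self, order_id: int) -> None:
        self.status[order_id] = "pending"
        job = threading.Thread(target=self._process, args=(order_id,))
        self._jobs[order_id] = job
        job.start()

    def _process(self, order_id: int) -> None:
        time.sleep(0.05)  # simulated work
        self.status[order_id] = "processed"

    def wait_for(self, order_id: int, timeout: float = 2.0) -> bool:
        """Block until the job finishes; return False if it is still running after timeout."""
        job = self._jobs[order_id]
        job.join(timeout)
        return not job.is_alive()


def wait_until(predicate: Callable[[], bool], timeout: float = 2.0,
               interval: float = 0.01) -> bool:
    """Poll predicate until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()


def test_status_after_waiting_for_job() -> None:
    service = OrderService()
    service.create_order(1234)
    assert service.wait_for(1234)  # wait for completion before asserting
    assert service.status[1234] == "processed"


def test_status_with_polling() -> None:
    service = OrderService()
    service.create_order(1234)
    assert wait_until(lambda: service.status[1234] == "processed")
```

Waiting on the job directly is usually the more deterministic choice when the test can reach the job handle. Polling with a timeout is a fallback for cases where only the observable status is available.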