Temporarily allow CI failures for iOS #142365
I know the convo over Discord was specifically about Android, but I've been seeing this for iOS as well, so I've included it.
It looks like the Android issue was fixed by #142289, as there have been no Android failures on the main and 3.x branches since that was merged. So I don't think that needs to be an allowed failure now. On the same branches, iOS has failed about 5 times in the last 2 days because of missing simulator images on the runners. @freakboy3742: did you have any possible solution to this?
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
The only solution I'm aware of at this point is to switch back to the macOS-14 runner; AFAIK the macOS-14 image isn't subject to the problem that is causing issues for macOS-15 and macOS-26 runners. The downside is that tests would be run on iOS 17, which is a little old at this point - that isn't ideal.

Is anyone aware of a way to get notifications when these failures occur? Or even search recent history for failures? I'm trying to get a sense of how frequent these issues are, if only to report to GitHub as part of the litany of macOS-15 image issues they've been working on.

The thing that is confusing is that we don't appear to see this mode of failure with BeeWare tests (or cibuildwheel tests either, to the best of my knowledge)...
Based on a manual search of failures of the Tests task on main, there have been 4 iOS test failures since Dec 29 (the last 100 CI runs), excluding 2 runs that were mass failures across multiple other platforms. 3 of those failures were in the last 24 hours. They're all failing for the same reason: a disk image that is meant to have multiple simulators installed is reporting that no simulators are installed. I do wonder if the increased frequency might be related to the rollout of a new macOS-15 image version. Either way, I've posted an update on the GitHub Actions issue tracking the problem.
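For anyone who wants to detect this failure mode on a runner directly, here is a minimal sketch, not anything taken from the CPython workflow. It assumes the JSON printed by `xcrun simctl list devices --json` has a top-level `devices` mapping from runtime identifier to a list of device entries, each with an `isAvailable` flag. The function name and the sample payloads are hypothetical.

```python
import json


def count_available_simulators(simctl_json: str) -> int:
    """Count the simulator devices flagged as available.

    Assumes the payload shape of `xcrun simctl list devices --json`:
    {"devices": {<runtime id>: [{"name": ..., "isAvailable": ...}, ...]}}
    """
    data = json.loads(simctl_json)
    return sum(
        1
        for devices in data.get("devices", {}).values()
        for device in devices
        if device.get("isAvailable", False)
    )


# Hypothetical sample payloads; on a real runner the string would come
# from running `xcrun simctl list devices --json`.
HEALTHY = json.dumps(
    {
        "devices": {
            "com.apple.CoreSimulator.SimRuntime.iOS-18-0": [
                {"name": "iPhone 16", "isAvailable": True},
                {"name": "iPhone SE (3rd generation)", "isAvailable": True},
            ]
        }
    }
)
BROKEN = json.dumps({"devices": {}})

print(count_available_simulators(HEALTHY))  # 2
print(count_available_simulators(BROKEN))   # 0
```

A zero count early in the job would at least make the cause obvious in the log, rather than surfacing later as a test failure.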
I'm pretty sure we've had more than 100 runs in ~11 months :)
Here's a script that wraps the

#!/usr/bin/env python3
"""Check recent CPython build.yml workflow runs for iOS build failures.

uv run check-ios-failures.py
"""

# /// script
# requires-python = ">=3.10"
# dependencies = ["prettytable", "rich"]
# ///

import argparse
import datetime as dt
import json
import shlex
import subprocess

from prettytable import PrettyTable, TableStyle
from rich.progress import track


def osc8_link(url: str, text: str) -> str:
    return f"\033]8;;{url}\033\\{text}\033]8;;\033\\"


def run_gh(cmd: str) -> str:
    result = subprocess.run(
        ["gh", *shlex.split(cmd)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def get_recent_runs(repo: str, days_back: float) -> list[dict]:
    cutoff = dt.datetime.now(dt.timezone.utc) - dt.timedelta(days=days_back)
    cutoff_str = cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")
    limit = int(150 * days_back)
    output = run_gh(
        f"run list --repo {repo} --workflow build.yml "
        f"--limit {limit} --json databaseId,conclusion,createdAt,displayTitle"
    )
    runs = json.loads(output)
    if len(runs) >= limit:
        print(f"Warning: fetched {limit} runs, results may be incomplete")
    return [r for r in runs if r["createdAt"] >= cutoff_str]


def get_job_failures(repo: str, run_id: int) -> tuple[bool, bool]:
    """Return (ios_failed, other_failed)."""
    output = run_gh(f"run view {run_id} --repo {repo} --json jobs")
    data = json.loads(output)
    ios_failed = False
    other_failed = False
    for job in data.get("jobs", []):
        name = job.get("name", "")
        if name == "All required checks pass":
            continue
        if job.get("conclusion") == "failure":
            if "ios" in name.lower():
                ios_failed = True
            else:
                other_failed = True
    return ios_failed, other_failed


class Formatter(
    argparse.ArgumentDefaultsHelpFormatter,
    argparse.RawDescriptionHelpFormatter,
):
    pass


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Check recent CPython build.yml runs for iOS build failures",
        formatter_class=Formatter,
    )
    parser.add_argument(
        "-d", "--days", type=float, default=2, help="number of days to look back"
    )
    parser.add_argument(
        "-r", "--repo", default="python/cpython", help="GitHub repository"
    )
    parser.add_argument(
        "-m", "--markdown", action="store_true", help="output tables in markdown format"
    )
    args = parser.parse_args()
    style = TableStyle.MARKDOWN if args.markdown else TableStyle.SINGLE_BORDER

    print(f"Fetching build.yml runs from the last {args.days} days...")
    runs = get_recent_runs(args.repo, args.days)
    total = len(runs)
    failures = [r for r in runs if r["conclusion"] == "failure"]
    num_failures = len(failures)
    print(f"\nTotal runs: {total}")
    print(f"Failures: {num_failures}")

    ios_failure_runs = []
    for run in track(failures, description="Checking failed runs..."):
        run_id = run["databaseId"]
        ios_failed, other_failed = get_job_failures(args.repo, run_id)
        if ios_failed:
            full_title = run["displayTitle"]
            title = full_title[:30] + "…" if len(full_title) > 30 else full_title
            ios_failure_runs.append((run_id, run["createdAt"], title, other_failed))

    ios_only = sum(1 for *_, other in ios_failure_runs if not other)
    ios_plus_other = sum(1 for *_, other in ios_failure_runs if other)

    # Summary table
    table = PrettyTable()
    table.set_style(style)
    table.field_names = ["Metric", "Count"]
    table.align["Metric"] = "l"
    table.align["Count"] = "r"
    table.add_row(["Total runs", total])
    table.add_row(["Failed runs", num_failures])
    table.add_row(["iOS + other failures", ios_plus_other])
    table.add_row(["iOS only failures", ios_only])
    print(f"\nSUMMARY (last {args.days} days)")
    print(table)

    if ios_failure_runs:
        table = PrettyTable()
        table.set_style(style)
        table.field_names = ["Run ID", "Title", "Created", "Other failures"]
        table.align["Title"] = "l"
        table.align["Other failures"] = "l"
        for run_id, created_at, title, other_failed in ios_failure_runs:
            url = f"https://github.com/{args.repo}/actions/runs/{run_id}"
            link = f"[{run_id}]({url})" if args.markdown else osc8_link(url, str(run_id))
            table.add_row([link, title, created_at, "yes" if other_failed else "iOS only"])
        print("\nRuns with iOS failures:")
        print(table)


if __name__ == "__main__":
    main()

Here's 7 days' results:
Runs with iOS failures:
Both Android and iOS tests are quite flaky right now. Based on the convo in Discord, it seems like allowing failures is the best option.
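To make concrete what "allowing failures" means for the required check, here is a rough sketch of the gating idea. It is not the actual workflow logic: the function name, job names, and results below are hypothetical, and it treats any conclusion other than `success` as a failure, which is a simplification.

```python
def required_checks_pass(results: dict[str, str], allowed_failures: set[str]) -> bool:
    """Return True if every job succeeded, ignoring jobs that may fail.

    `results` maps a job name to its conclusion string; jobs listed in
    `allowed_failures` never block the overall result.
    """
    return all(
        conclusion == "success" or job in allowed_failures
        for job, conclusion in results.items()
    )


# Hypothetical job results for a run where only the iOS job failed.
results = {"ubuntu": "success", "android": "success", "ios": "failure"}

print(required_checks_pass(results, allowed_failures=set()))    # False
print(required_checks_pass(results, allowed_failures={"ios"}))  # True
```

The trade-off is the one discussed above: while a job is in the allowed set, a genuine regression in that job no longer blocks a merge, so the entry should be removed once the runner issue is resolved.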