Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build(airbyte-ci): Test python versions matrix #40731

Merged

Conversation

inossidabile
Copy link
Contributor

@inossidabile inossidabile commented Jul 4, 2024

What

airbyte-ci test command was made to pick up the python version from the package pyproject.toml. Allowed running several versions at once.

How

[tool.airbyte_ci]
python_versions = ["3.11.5", "3.10.12"]

Review guide

🤷

User Impact

None

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

@inossidabile inossidabile requested a review from a team as a code owner July 4, 2024 09:07
Copy link

vercel bot commented Jul 4, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Jul 9, 2024 9:12am

@octavia-squidington-iii octavia-squidington-iii added CDK Connector Development Kit community labels Jul 4, 2024
@inossidabile inossidabile force-pushed the test-python-versions-matrix branch 2 times, most recently from 32c1a2a to b562774 Compare July 4, 2024 09:09
@inossidabile
Copy link
Contributor Author

inossidabile commented Jul 4, 2024

@natikgadzhi 🕺

@inossidabile inossidabile changed the title Test python versions matrix build(airbyte-ci): Test python versions matrix Jul 4, 2024
Copy link
Contributor

@natikgadzhi natikgadzhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! <3

I like this. Our CI checks if you changed airbyte-ci itself and refuses to run tests on PRs from forks if you did, because security.

I'll have to copy your PR over and spin up the CI, and test locally.

@alafanechere, let's pair on this tomorrow for a bit and see if we can get this in easily. Perhaps running tests on CDK on 3.11 could be a low hanging fruit, and then it would make sense to upgrade everything!


from pydantic import BaseModel, Field, validator


DEFAULT_PYTHON_VERSION = "3.10.12"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ironically, while the default here is / was 3.10, connectors run 3.9 today. Nothing for you to change, just our mistake.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They still do. Connector testing is a bit on the side compared to so-called component testing. That's different commands on airbyte-ci and they bootstrap dagger differently.. From what I figured, it's probably cause you have prebuilt Docker to inherit for components that they run into. I wonder if it's only used for testing or if it's a part of distribution and deployment. If it's test-only, we can probably pack it on the fly. If it's not, we need to make tagged versions of it to supply all demand anyway. So it's looking good.

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the initiative!
It mostly LGTM 😄
I'm requesting change as I'd like to minimize the awareness of airbyte-ci about python versions. I'd like the AirbyteCiPackageConfiguration to be the single interface.
I'll 👍 if you agree to:

  • Remove the default python version
  • Enforce a python version to be declared in pyproject.toml (and update internal packages to set it)
  • Remove the new option which allows overriding pyproject.toml

@@ -113,7 +113,7 @@ def get_poetry_base_container(dagger_client: dagger.Client) -> dagger.Container:
poetry_cache_path = "/root/.cache/poetry"
return (
dagger_client.container()
.from_("python:3.10.12")
.from_(f"python:{python_version}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather create an explicit mapping between python versions and images in case we want to use other images/tags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the idea, will pick up

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking a lot about this one and this is what I ended up with: https://github.com/inossidabile/airbyte/pull/1/files#diff-b86e6d2f27d70812919004f8b786a5bb0ea5a7a28ec99c5ffd8773d9eb424eecR44. Can't get over internal "overengineered" alarm. Also, what do you think about using it to map minor versions to patch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to go ahead and push the change. Let's go through it in this PR directly :).

@inossidabile
Copy link
Contributor Author

@alafanechere I really like the idea of inferring the matrix from classic dependency syntax btw. That makes perfect sense. We pick up the lowest major version and take the latest release of every major release in the allowed range. Duplication is never cool :)

@inossidabile
Copy link
Contributor Author

inossidabile commented Jul 8, 2024

The more I understand how airbyte-ci is invoked in Github Actions, the more I understand your point about not having --python-version. I dropped it because using it on CI isn't a short reach.

However, I still think that forking jobs into a matrix is a good idea but it has to be done this way:

  • Workflow should put in the specific parameters it wants to run with (e.g. --modified) to a special new argument. Let's call it --plan which would return a list of arguments for every package and its python versions it'd run. For instance: airbyte-ci test --modified --plan
["-p airbyte-cdk/python --python-version=3.10", "-p airbyte-cdk/python --python-version=3.11", "airbyte-ci test -p airbyte-ci/connectors/pipelines --python-version=3.10"]

If I understood you well, that won't create redundancy in the Github Actions definition and will still be controlled entirely from pyproject.toml just like we agreed it should be. And the main benefit of doing that is distinct test runs for each version and package which are way more manageable, transparent, and give you a lot of flexibility in protection rules, etc. Let's look into it a separate task/PR if you don't mind.

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @inossidabile for all these changes!
It LGTM but I need to test it locally before 👍 : CI on this branch is using the airbyte-ci latest binary, it's not built from your branch source (it's for security on our community CI) .

About this:

I still think that forking jobs into a matrix is a good idea but it has to be done this way:

I don't think it's a bad idea but I continue to believe that more granular status check can be emitted from airbyte-ci directly and not from GHA. In the same manner we currently emit per package status check we could increase the granularity per python version. I prefer keeping things in airbyte-ci as we try to centralize the control flow as much as possible.

@@ -98,6 +98,7 @@ check-ci = {sequence = ["build", "lint", "unit-test-with-cov"], help = "Build th
pre-push = {sequence = ["build", "check-local"], help = "Run all build and check tasks."}

[tool.airbyte_ci]
python_versions = ["3.11.5", "3.10.12"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
python_versions = ["3.11.5", "3.10.12"]
python_versions = ["3.11.5", "3.10.12", "3.9"]

@@ -20,6 +20,7 @@ class AirbyteCiPackageConfiguration(BaseModel):
False,
description="Flag indicating the mount of the host docker socket to the container running the poe task, useful when the package under test is using dagger",
)
python_versions: List[str] = Field([], description="List of unique python versions to run the poe tasks on")
Copy link
Contributor

@alafanechere alafanechere Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
python_versions: List[str] = Field([], description="List of unique python versions to run the poe tasks on")
python_versions: List[str] = Field(description="List of unique python versions to run the poe tasks on")

This will make the python_versions field required and raise a ValidationError. I think its better to have validation at this level instead of checking if python_versions is set in the pipeline.py and raise a click.UsageError

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops :). Sorry, I didn't familiarize myself with this part enough. Thanks for the advice.

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this locally and it works as expected 👍

@alafanechere alafanechere merged commit 1c4ff2a into airbytehq:master Jul 9, 2024
25 of 27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CDK Connector Development Kit community team/tooling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants