# Contributing guide
All contributions, bug reports, bug fixes, documentation improvements, and enhancements are welcome.
All contributors and maintainers of this project should abide by the Contributor Code of Conduct.
Learn more about contributors' roles on the Roles page.
This document describes how to contribute to the Great Expectations Airflow Provider, covering:
- Overview of how to contribute
- How to set up the local development environment
- Running tests
- Authoring the documentation
## Overview of how to contribute
To contribute to the Great Expectations Airflow Provider project:
- Create a GitHub Issue describing a bug, enhancement, or feature request.
- Fork the repository.
- In your fork, open a branch off of the `main` branch.
- Create a Pull Request into the `main` branch of the Provider repo from your forked feature branch.
- Link your issue to the Pull Request.
- After you complete development on your feature branch, request a review. A maintainer will merge your PR after all reviewers approve it.
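The branching part of this workflow might look like the following sketch. The branch name `my-feature` is a placeholder; substitute a name that describes your change.

```shell
# Make sure your local main is up to date with your fork's remote
git checkout main
git pull origin main

# Create a feature branch off main ("my-feature" is a placeholder name)
git checkout -b my-feature
```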
## Set up a local development environment
Setting up a local development environment involves fulfilling requirements, getting a copy of the repository, and setting up a virtual environment.
### Requirements
- Git
- Python version 3.10 to 3.13
- Great Expectations version 1.7.0+
- Apache Airflow® version 2.1.0+
### Get a copy of the repository
- Fork the Provider repository.
- Clone your fork.
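A minimal sketch of the clone step, assuming the repository is named `airflow-provider-great-expectations` (check your fork's page on GitHub for the exact URL; `<your-username>` is a placeholder):

```shell
# Clone your fork; the repo name and "<your-username>" are placeholders —
# copy the actual URL from your fork's GitHub page
git clone https://github.com/<your-username>/airflow-provider-great-expectations.git
cd airflow-provider-great-expectations
```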
### Set up a virtual environment
You can use any virtual environment tool. The following example uses the venv tool included in the Python standard library.
- Create the virtual environment.
- Activate the virtual environment.
- Install the package and testing dependencies.
## Run tests
Test with pytest:
- Install `pytest` as a dependency.
- Run the following command, which provides concise output when all tests pass and the minimum necessary details when they don't.

The `no:warnings` flag filters out deprecation messages that Airflow may issue.
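A sketch of the invocation, using pytest's plugin-disable syntax (`-p no:warnings`) to suppress the deprecation warnings mentioned above:

```shell
# Run the full test suite, suppressing warning output via pytest's
# plugin-disable flag
pytest -p no:warnings
```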
### Unit tests vs integration tests
Tests are organized into two categories:
- Unit tests (`tests/unit/`): Run without external dependencies and don't require credentials.
- Integration tests (`tests/integration/`): Require GX Cloud credentials and optionally external backends (e.g., Postgres, Spark).
To run only unit tests:
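Based on the directory layout described above, running only the unit tests would look like:

```shell
# Unit tests need no credentials or external services
pytest tests/unit
```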
To run integration tests locally, set the required environment variables:
```shell
export GX_CLOUD_ORGANIZATION_ID="your-org-id"
export GX_CLOUD_WORKSPACE_ID="your-workspace-id"
export GX_CLOUD_ACCESS_TOKEN="your-access-token"

pytest -m integration tests/integration
```
For details on the test organization used in CI (and how to get access), see CI_SECRETS.md.
### Running Postgres tests
Postgres tests use a separate marker and require a running Postgres instance plus the `postgresql` optional dependency:
```shell
pip install -e '.[postgresql,tests]'

export POSTGRES_USER=postgres
export POSTGRES_PASSWORD=postgres
export POSTGRES_DB=postgres
export POSTGRES_PORT=5432

pytest -m postgres tests/integration
```
### Running Spark tests
Spark tests run in separate CI jobs and use different markers. To run them locally, install the `spark` optional dependency and start Spark via Docker Compose:
```shell
pip install -e '.[spark,tests]'
docker compose -f docker/spark/docker-compose.yml up -d

pytest -m spark_integration tests/integration
pytest -m spark_connect_integration tests/integration
```
## Write docs
We use Markdown to author Great Expectations Airflow Provider documentation. We use hatch to build and release the docs.
- Update Markdown files in the `docs/` folder.
- Build and serve the documentation locally to preview your changes.
- Open an issue and PR for your changes.
- Once approved, release the documentation with the current project version and set it to the latest.
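Since the docs are built with hatch, previewing them locally would typically go through `hatch run`. The environment and script names below (`docs`, `build`, `serve`) are assumptions; check the `[tool.hatch]` tables in `pyproject.toml` for the names this project actually defines.

```shell
# Hypothetical hatch invocations — environment and script names ("docs",
# "build", "serve") are placeholders defined in pyproject.toml
hatch run docs:build
hatch run docs:serve
```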