28 Feb 2023

Jaypore CI

Meditation

A long time back, I used to create git projects for the small ideas I had and push them to github. Eventually I used Travis CI for the automated testing of my code. Then came a time when I discovered that gitlab offered unlimited private repositories! I moved a lot of my projects there and was delighted by the easy-to-use CI that they offered. One yaml file and things would get tested quickly.

Eventually CI became a pain point. I found myself copy pasting common configs from one project to another. It seemed that a cookie cutter tool was needed. As my projects grew in number and in individual size, I started needing more complex behavior from the CI system itself. I wanted to build docker images, publish them, build binaries, publish those, combine binaries / test reports / changelogs and build documentation out of them. Config variables and deployment keys had to be managed / rolled / deprecated. Certain jobs only had to be run on specific branches, for specific commit messages, when certain files changed, when certain jobs failed and so on.

As my projects started being used by others, other issues arose. I had to keep on sharing/removing access for CI as well as the projects, worry about secrets being leaked, and help people debug code. A lot of the time the bugs were found in the CI system, so they had to learn the CI system's way of debugging things. Projects were scattered across github / gitlab / bitbucket and consequently CI started being scattered across github Actions / gitlab CI / Jenkins / Drone CI / Travis CI and many others. This was obviously a reflection of how people work in teams. Every team had a different thing going on and sometimes within the same git repo you needed to use different CI systems since different parts were handled by different teams. When releases were done, stakeholders sometimes had to be emailed. At other times, jobs had to be run when people requested them. Pull requests (PRs) had to be updated with CI reports and so we had to integrate them with github/gitlab/bitbucket, then issues had to be re-labelled / marked as done. Sometimes based on a PR's labels we had to cherry pick to other branches and so on.

Commonly people's answer to this was to live inside one of the gardens. Either pick github or gitlab and stick to whatever they provide. The nature of my projects meant that I was not the one dictating where the projects lived.

Premature clarity

Eventually it got to a point where I just stopped using fancy stuff and did everything in git as much as possible, and those practices gave rise to the Jaypore CI project.

I started using gitlab CI exclusively, even if I was on another host like github I would mirror to a self hosted gitlab in the worst case and run CI there.
Most of my gitlab CI config was simply declaring what image to use and what shell script to run. The actual code lived in a folder called cicd, in scripts like cicd/run_tests.sh.
I stopped using gitlab CI for storing CI secrets. All env vars / secrets were stored on vault. This was the only pain point in my work since vault requires some level of maintenance and nobody else in the teams I worked with took up this load.
I gave up on things like Kubernetes. I was not solving things at scale. My clients / projects needed to work at max 100 requests per second. All my deployments became a single docker-compose.yml file that used profiles to separate prod / dev / staging containers.
I had a dead simple auto-deploy system. Poll gitlab / github / gitea at a regular interval and docker compose up if the latest tag had changed. The auto-deploy script itself was part of the repo making projects fully self contained.
I had a garden of servers, not a farm. Each server had a home directory that was controlled using git itself.
Onboarding people was so much easier. They had to learn two things. Git and docker. That's it.
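
The auto-deploy item in the list above can be sketched roughly like this. It is a minimal sketch under assumptions, not the original script: it assumes the repo is already cloned on the server and that the compose file defines a `prod` profile. The function names (`latest_tag`, `should_deploy`, `poll_once`) and the 60 second interval are made up for illustration.

```python
import subprocess
import time


def latest_tag(tag_listing):
    """Pick the first tag from `git tag --sort=-creatordate` output (newest first)."""
    tags = [t.strip() for t in tag_listing.splitlines() if t.strip()]
    return tags[0] if tags else None


def should_deploy(deployed, latest):
    """Deploy only when a tag exists and it differs from what is running."""
    return latest is not None and latest != deployed


def poll_once(repo_dir, deployed, profile="prod"):
    """Fetch tags and bring the stack up if the latest tag has changed."""
    subprocess.run(["git", "fetch", "--tags"], cwd=repo_dir, check=True)
    listing = subprocess.run(
        ["git", "tag", "--sort=-creatordate"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    tag = latest_tag(listing)
    if not should_deploy(deployed, tag):
        return deployed
    subprocess.run(["git", "checkout", tag], cwd=repo_dir, check=True)
    subprocess.run(
        ["docker", "compose", "--profile", profile, "up", "-d"],
        cwd=repo_dir, check=True,
    )
    return tag


def poll_forever(repo_dir=".", interval=60):
    """Hypothetical entry point: poll at a fixed interval (assumed 60 seconds)."""
    deployed = None
    while True:
        deployed = poll_once(repo_dir, deployed)
        time.sleep(interval)
```

Keeping the tag comparison in small pure functions means the decision logic can be checked without touching git or docker.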

Some of these practices were enjoyable, some brought immense productivity to junior devs at a small cost, some became footguns, some were outright questionable but since the load of maintenance was ultimately on me I was happy. Eventually I thought of putting all of this ritual into a template of sorts, or a package, and eventually decided on this blog post.

This was also the time I started to prefer self hosting things. It started with gitlab changing their pricing for one of our clients. When they looked at self hosting I realized that it does not make sense to shell out so much money just to have a single repo on that instance. I started to self host my code on a linode using gitea.

A lack of CI systems that could compete with gitea in terms of weightlessness drove me first to agola CI, then drone, eventually leading me to write my own.

Choosing the name itself took some time. Simple CI? No, CI should be powerful and minimal, not necessarily simple. Writing yaml is simple, but debugging it is horrific. No CI? Well, we're not exactly against CI itself are we? Power CI? It is powerful, but I don't want people to mistakenly think that it's something in the family of Power BI/PowerShell. Cross CI? It does work across gitlab/github/gitea but that's not the main point of it. It could work anywhere! The final name I decided on was JayporeCI. I live in the city of Jaipur. It's an ancient city, powerful enough to repel invaders for centuries. The people live a simple life, a happy life. It has adapted to modern times well enough to change its name from Jaypore to Jaipur. It has all the charm of monster cities like Delhi but none of the size. Yes; small, powerful, and very flexible. That's what our CI system was. Jaypore CI.

Jaypore CI: growing slowly with needs

The first cut had a very simple flow.

Use git hooks to run a bash script cicd/run_tests.sh.
Bash script had docker run commands in it to run tests in parallel and it waited for all containers to finish before exiting.
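
The original was a bash script, but the same start-everything-then-wait flow can be sketched in Python. This is only an illustration under assumptions: the job names, images, and the `/app` mount point are hypothetical, and `docker_cmd` / `run_all` are names invented for this sketch.

```python
import os
import subprocess


def docker_cmd(name, image, command, src=".", workdir="/app"):
    """Build the `docker run` argv for one job, mounting the repo into the container."""
    return [
        "docker", "run", "--rm",
        "--name", name,
        "-v", f"{os.path.abspath(src)}:{workdir}",
        "-w", workdir,
        image, "sh", "-c", command,
    ]


def run_all(jobs, start=subprocess.Popen):
    """Start every job without waiting, then wait for all of them to finish.

    `jobs` maps a job name to an argv list. Returns {job name: exit code}.
    """
    procs = {name: start(cmd) for name, cmd in jobs.items()}
    return {name: proc.wait() for name, proc in procs.items()}


def example_jobs():
    """Hypothetical jobs; the images and commands are placeholders."""
    return {
        "lint": docker_cmd("ci-lint", "python:3.11", "pip install ruff && ruff check ."),
        "tests": docker_cmd("ci-tests", "python:3.11", "pip install pytest && pytest"),
    }
```

A caller would run `run_all(example_jobs())` and exit non-zero if any exit code is non-zero, which is what makes a git hook abort.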

This was amazing! Nothing ever broke down and I could use it wherever I wanted without having to integrate CI with my git provider. The script itself was in git so I could simply clone on another machine and use it there as well without any extra config/setup effort. However I missed some of the things that years of CI usage had drilled into me. I did not want to merge my PRs unless CI passed. I wanted a nice graph, a way to see job dependencies, which ones took too long, and which ones ran suspiciously fast.

JayporeCI has matured to some extent now. Over time and repeated usage in different projects it has acquired a list of features that enable very powerful workflows for small teams / individual developers. Each of these features was added as a direct requirement for some project / team structure / automation need.

Local and offline first

Jobs are run via docker on the dev's laptop. Since there's no time between git push and the runner picking up the job to execute it, jobs trigger very quickly.
Since it's not a shared vCPU, jobs themselves run very fast.
Debugging things is incredibly easy! You can simply docker exec into the container OR you can look at the CI logs and re-run a certain job since you have the exact docker run command available that ran the job.
We use git hooks so we can trigger jobs based on pre-commit or pre-push as per our needs.
You can use Dozzle to monitor the jobs you're running on your machine.
Since everything is local, it can run without internet!
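
Wiring a script into a git hook is just a matter of dropping an executable file into .git/hooks. Here is a minimal sketch, assuming the CI entry point is the cicd/run_tests.sh script mentioned earlier; `install_hook` and `HOOK_BODY` are names made up for this example and are not part of Jaypore CI.

```python
import stat
from pathlib import Path

# The hook delegates to the script kept in the repo.
# A non-zero exit code from the script aborts the commit or push.
HOOK_BODY = """#!/bin/sh
exec "$(git rev-parse --show-toplevel)/cicd/run_tests.sh"
"""


def install_hook(repo_dir, hook="pre-push"):
    """Write an executable hook script to <repo_dir>/.git/hooks/<hook>."""
    path = Path(repo_dir) / ".git" / "hooks" / hook
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(HOOK_BODY)
    # Add the executable bits on top of whatever mode the file already has.
    path.chmod(path.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path
```

Calling `install_hook(".", "pre-commit")` instead would run the same script before every commit rather than before every push.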

Config in python

The job config is python code written in a declarative style.
Since it's a general purpose programming language, we can do complex calculations for selecting / de-selecting jobs to be run / where to run them / how to run them and so on. For example we can choose to run jobs:
- If other jobs pass / fail.
- What the branch / tag name is.
- Lines / words present in the commit message. A common pattern is to use with:release,lint in the commit message to trigger sections of the CI manually.
- What the current time is. Sometimes we want to send CI failure reports to people via mail only when they are in the office.
- Who is running the job. For example only certain people on the team can run release/publish jobs.
- Which files have changed as compared to another branch (main/trunk/develop). For repos containing multiple projects / languages we can select which linters to run based on which files have changed.
- Run different levels of regressions / fuzz testing based on if it is a feature / bug / release branch.
- Run some jobs on local / some on GPU machines.
The CI config itself can be tested for syntax / semantic / configuration issues.
You can pip install and import things in the CI config, making it easy to share common jobs across the organization / personal projects.
If you need extra information in reports, or need to integrate with slack / github / gitea / gitlab / email / telegram, it's very straightforward since it's just python. Either import another library that does it or write your own.
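
To give a feel for what "config as code" buys you, here is a small sketch of job selection written as plain Python. It deliberately does not use the Jaypore CI API: `requested_sections`, `select_jobs`, and all the job names are hypothetical, and the rules are just examples of the kinds of conditions listed above.

```python
import fnmatch


def requested_sections(commit_message):
    """Return the sections named by a `with:a,b` token in the commit message."""
    sections = set()
    for word in commit_message.split():
        if word.startswith("with:"):
            sections.update(s for s in word[len("with:"):].split(",") if s)
    return sections


def select_jobs(branch, commit_message, changed_files):
    """Decide which (hypothetical) jobs to run from plain facts about the repo."""
    jobs = ["unit-tests"]  # always run
    wanted = requested_sections(commit_message)
    # Lint when asked for explicitly, or when any python file changed.
    if "lint" in wanted or any(fnmatch.fnmatch(f, "*.py") for f in changed_files):
        jobs.append("lint-python")
    # Releases only happen from main, and only on request.
    if branch == "main" and "release" in wanted:
        jobs.append("release")
    # Heavier regression testing on release branches.
    if branch.startswith("release/"):
        jobs.append("full-regression")
    return jobs
```

Because this is an ordinary function, the selection rules can be unit tested like any other code, which is what makes it possible to test the CI config itself.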

Secrets are in SOPS

We create a secrets/ folder and every developer has their own <name>.enc file. A separate prod.enc or staging.enc file is there for other environments as needed.
Since secrets are in git, we get the following without any extra effort:
- Secret versioning
- Secure secret deployment to CI machines
- Rotating environment secrets
- Multi environment secrets
- Multiple person access for secrets
- Simple grep can tell you if GITEA_TOKEN is used in a project or not.
Secrets can be edited in a normal text editor. I use vim, you can use whatever you want.
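
Loading such a file at job time could look roughly like the sketch below. It assumes the .enc files hold dotenv-style KEY=value lines and that the sops CLI is on the PATH; the file format is an assumption on my part, and the helper names (`secrets_file`, `sops_decrypt_cmd`, `parse_dotenv`, `load_secrets`) are invented for illustration.

```python
import subprocess
from pathlib import Path


def secrets_file(name, root="secrets"):
    """Path of the encrypted file for a developer or environment, e.g. secrets/prod.enc."""
    return Path(root) / f"{name}.enc"


def sops_decrypt_cmd(path):
    """argv that asks sops to decrypt a file, treating it as dotenv (assumed format)."""
    return [
        "sops", "--decrypt",
        "--input-type", "dotenv",
        "--output-type", "dotenv",
        str(path),
    ]


def parse_dotenv(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env


def load_secrets(name, run=subprocess.run):
    """Decrypt secrets/<name>.enc with sops and return its contents as a dict."""
    result = run(
        sops_decrypt_cmd(secrets_file(name)),
        check=True, capture_output=True, text=True,
    )
    return parse_dotenv(result.stdout)
```

The decrypted values only ever live in memory here, so nothing unencrypted needs to be written to disk or committed.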

Sharing logs / reports

Eventually we need to share things with other people. A common pattern is to git push to a remote and have them sync from there.
If you are debugging together, they can simply SSH into your laptop and connect to your docker socket to see what's going on in the CI.
If you want to share historical jobs you can share the container logs.
There is the concept of a Remote in JayporeCI. These can be gitea, github, gitlab, email, text, or even the git repository you are using.
Reports are posted to a Remote.
- If this is gitea or github, it will open a PR and put the report in the description of the PR.
- If it is email it will send an email thread with updates containing the latest report.
- If it is git itself, it will add the reports to the git repository refs and you can git push / git pull your CI logs / reports like any normal ref. This works similarly to git-bug.
- If you have a custom remote you like, for example discord / telegram / slack, you can pip install it or write your own.
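
To make the idea of a custom remote concrete, here is a toy sketch. The interface is hypothetical: the real Remote class in Jaypore CI may look quite different, and `publish`, `MemoryRemote`, `render_report`, and `publish_all` are names made up for this example.

```python
from typing import Protocol


class Remote(Protocol):
    """Hypothetical interface: anything that can receive a report for a commit."""

    def publish(self, sha: str, report: str) -> None: ...


class MemoryRemote:
    """Keeps the latest report per commit in a dict; stands in for gitea, email, etc."""

    def __init__(self):
        self.reports = {}

    def publish(self, sha, report):
        self.reports[sha] = report


def render_report(sha, statuses):
    """Turn {job name: exit code} into a small plain-text report."""
    lines = [f"CI report for {sha}"]
    for job, code in sorted(statuses.items()):
        lines.append(f"  {'PASS' if code == 0 else 'FAIL'}  {job}")
    return "\n".join(lines)


def publish_all(remotes, sha, statuses):
    """Render the report once and send it to every configured remote."""
    report = render_report(sha, statuses)
    for remote in remotes:
        remote.publish(sha, report)
    return report
```

A slack or telegram remote would be another small class with the same `publish` method, which is why sharing one as a pip package is practical.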