Optimistic Coding Craftsmanship

Hey there! My name is Shai Geva.


I’ve been writing code for over 20 years, and I remain optimistic about our ability to create things that work and aren’t too complex to maintain.


I’ve always liked the craftsmanship itself in software development, and I think these crazy times (certainly the craziest I’ve ever seen) open a sort of “new chapter” in our craft - we don’t know where it’s going, but there are a lot of exciting new things to learn and to create.


I write about AI, code, tests and other stuff.


Look around, and ping me on social to tell me if you found something useful!

Should You Use TOON?

A new data format for LLMs called TOON (Token-Oriented Object Notation) has been getting some crazy attention in the last couple of weeks. First - I think it’s a very cool concept, and it looks like there’s been an impressive engineering effort in a very short time. Having said that - I think it’s worth trying to understand the use cases where TOON is actually the best option. ...

November 18, 2025 · Shai Geva
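To make the teaser concrete: TOON’s core idea is that uniform arrays of objects can be encoded tabularly - field names stated once, then one row per record - instead of repeating every key in every object as JSON does. The sketch below is my own toy approximation of that idea for illustration only, not the official TOON encoder (the function name `toonish` and the `items` label are made up).

```python
import json

def toonish(records: list[dict]) -> str:
    """Encode a uniform list of dicts in a TOON-like tabular form:
    one header naming the fields, then one comma-separated row per record.
    Illustrative only - not the official TOON encoder."""
    fields = list(records[0].keys())
    header = f"items[{len(records)}]{{{','.join(fields)}}}:"
    rows = ["  " + ",".join(str(r[f]) for f in fields) for r in records]
    return "\n".join([header, *rows])

users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "viewer"},
]

as_json = json.dumps(users)
as_toonish = toonish(users)
# The tabular form states each key once, not once per record,
# so it is shorter - and the gap grows with the number of records.
print(len(as_json), len(as_toonish))
```

Whether that token saving is worth a non-standard format is exactly the “when is TOON actually the best option” question the post digs into.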

Moving past vibe-coding? AI-First Design Patterns and Frameworks (blog post series)

(First post in a series.) Like many others, I’ve spent a lot of time thinking about AI and software development. I belong to the camp that believes AI is a total paradigm shift - it’ll redefine the ecosystem and what it means to create software, and it’ll be the deepest change we have seen to date. My own “flavor” of thinking about this is to try, from an engineering / implementation perspective, to understand what that change could look like. This blog series will dig into it and share my thoughts and conclusions from experimentation. I’ll talk about both: ...

June 20, 2025 · Shai Geva

AI-first design patterns - CRUD backend fundamentals

This post is part of a series about creating production-grade maintainable AI-first projects, using AI-first design patterns and frameworks. One of the best ways to help our AI agent work on our code is to set it up with a solid feedback loop so it can “plan -> do -> verify” on its own. We define the design, and the agent follows our conventions to create code and tests that enable the feedback loop. (That’s the premise of this post; earlier posts in the series go deeper on the rationale.) ...

November 4, 2025 · Shai Geva

What an AI Feedback Loop Looks Like

This post is part of a series about creating production-grade maintainable AI-first projects, using AI-first design patterns and frameworks. In the previous post I mentioned that an internal AI feedback loop will be central to all our AI-first design patterns. But “AI feedback loop” might mean different things to different people - so this lightweight post focuses on a concrete example. We’ll implement a small (but realistic) project. The project is set up so the agent has an internal feedback loop - it has instructions that tell it to use a loop, and it has a clear way to create effective tests and run validations (the tests it creates, type-checking, linting). We’ll see how it makes mistakes, finds them and self-heals. ...

October 4, 2025 · Shai Geva
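The “verify” half of that loop can be sketched as a single function the agent reruns after every change. This is a minimal sketch under my own assumptions - the post doesn’t prescribe specific tools, so pytest, mypy and ruff here are just common examples of “tests, type-checking, linting”:

```python
import subprocess

# Hypothetical check list for the agent's "verify" step; the actual
# tools are project-specific (pytest/mypy/ruff are common examples).
CHECKS = [
    ["pytest", "-q"],          # the tests the agent created
    ["mypy", "src"],           # type-checking
    ["ruff", "check", "src"],  # linting
]

def verify() -> list[str]:
    """Run every check and collect the output of the ones that failed.
    An agent can feed these failures back into its next plan -> do step,
    looping until this function returns an empty list."""
    failures = []
    for cmd in CHECKS:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(f"{' '.join(cmd)} failed:\n{result.stdout}{result.stderr}")
    return failures
```

The design point is that the agent never has to decide whether its change is good - the checks decide, which is what lets it self-heal.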

10 Ways To Shoot Yourself In The Foot With Tests

This is a series of posts, following a talk I gave (twice - at PyCon US 2023 and PyCon IL 2024) about testing best (and not-so-best) practices. The talk shares 10 practices that I’ve had bad experiences with, along with ways of avoiding them. The main objective of the post series is to help you write tests that have a better ROI. I’ll discuss different practices - different ways we can work. These practices affect us by changing the properties of our tests: ...

May 16, 2025 · Shai Geva

Footgun #10 - Wrong Priorities

This mini-post is part of a series about good testing practices, which I also presented at a couple of conferences. Here it is at PyCon US 2023. In this blog post series, we saw a bunch of different practices and talked about how they affect us by changing the properties of our tests: Strength - how well they catch bugs; Maintainability - how easy it is to work with them; Performance - how fast they run. The bug funnel is all about performance. Testing a cohesive whole (implementation vs behavior) is about maintainability and strength. ...

May 16, 2025 · Shai Geva

Footgun #9 - Slow Tests

This mini-post is part of a series about good testing practices, which I also presented at a couple of conferences. Here it is at PyCon US 2023. Slow tests are not fun. In this post, I’ll talk about two ways in which they are not fun: the bottleneck and the time bomb, and the feedback loop and the bug funnel. The bottleneck is where the tests take so long to run that we have a long queue of tasks waiting to be merged to the main branch. (This assumes we’re merging tasks to the main branch one-by-one, and only after the tests pass. Other branching models have similar issues, but this is the simplest to explain.) ...

May 16, 2025 · Shai Geva

Footgun #8 - Test Doubles Everywhere

This mini-post is part of a series about good testing practices, which I also presented at a couple of conferences. Here it is at PyCon US 2023. Sometimes, in a test, we swap a part of the system - a dependency - with an alternative implementation. These are called test doubles: things like stubs, mocks and fakes. A few of the central reasons for doing this are: Performance - if the real thing is too slow to run a lot of tests, we switch it with a fast test double; Control - it might be difficult or impossible to set up the real thing in a certain state or make it behave in a certain way. Maybe it’s non-deterministic, maybe it has side effects that are not acceptable in tests. Test doubles, on the other hand, are under our full control and won’t create side effects we don’t want. The problem with test doubles: test doubles can be useful, but they are a re-implementation. They know the implementation details of the thing they’re replacing. Different types of test doubles do it differently, but this is what they do. ...

May 16, 2025 · Shai Geva
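A minimal sketch of what the teaser describes - a “fake”, the kind of test double that re-implements a dependency in memory to gain speed and control. The email-sender scenario and all names here are hypothetical, chosen just to illustrate the trade-off:

```python
# Hypothetical dependency: the real implementation is slow and has side
# effects, which is exactly why a test would want to replace it.
class RealEmailSender:
    def send(self, to: str, body: str) -> None:
        raise RuntimeError("talks to an SMTP server - slow, real side effects")

class FakeEmailSender:
    """An in-memory re-implementation of the sender's interface.
    It must mirror the real thing's contract - that coupling to
    implementation details is the cost the post warns about."""
    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))

def notify_user(sender, user_email: str) -> None:
    # Code under test - it only depends on the sender's interface,
    # so either implementation can be injected.
    sender.send(user_email, "Your report is ready")

# In a test, we inject the fake and inspect what it recorded:
fake = FakeEmailSender()
notify_user(fake, "alice@example.com")
assert fake.sent == [("alice@example.com", "Your report is ready")]
```

The fake gives the test full control and no side effects - at the price of maintaining a second implementation that must stay faithful to the real one.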

Footgun #7 - Improper Test Scope

This mini-post is part of a series about good testing practices, which I also presented at a couple of conferences. Here it is at PyCon US 2023. The root cause of many testing problems is improper test scope, i.e. tests whose boundaries aren’t appropriate. Test a cohesive whole - a complete story: my approach here is that a test should verify a cohesive whole, a “complete story”. It can be a large story, like an e2e test, or a small story that’s part of a bigger one, like a custom sorting function that something else uses. As long as it’s something self-contained - something whole - it might be worth testing. ...

May 16, 2025 · Shai Geva

Footgun #6 - Testing Too Many Things

This mini-post is part of a series about good testing practices, which I also presented at a couple of conferences. Here it is at PyCon US 2023. Just like with product code, if we put too many things in the same place we get a mess. ...

May 16, 2025 · Shai Geva