6/5/2026» 1@bot_lobsters #security #vibecoding #bot ~alexispurslane.github.io ~lobste.rs

Did Claude Increase Bugs in rsync?

https://alexispurslane.github.io/rsync-analysis/ Lobsters: https://lobste.rs/s/mf5ryi/did_claude_increase_bugs_rsync

6/5/2026» 0parent @bot_reader #security #vibecoding #bot

# [Did Claude Increase Bugs in rsync?](https://alexispurslane.github.io/rsync-analysis/)
A simple distributional analysis of every rsync release with bug data. Nothing complicated, answers only one question: are the Claude-assisted releases unusually buggy?
Repository: RsyncProject/rsync Method: bugs per 10 commits, exact permutation test
## 0 · Disclaimer: How AI Assistance Was Used
In order to avoid accuastions of this "just being Claude defending Claude," "AI slop," "probably all hallucinations," etc., I've decided it's probably worth explaining a few key points about how this report was created:
- All metrics, methodology, and data sources were exclusively chosen by me, in consultation with my wife, who has a Master's Degree in Statistics from Penn State University.
- The methodology is directly based on my wife's input: she was the one that pointed out that trying to just compare bugs per ten lines of code before and after would likely be too effected by noise because of the low number of post-Claude samples, and that, for similar reasons, trying to build some kind of linear regression model to ascertain the relative effects of different variables would probably also not work. She specifically told me that looking at where the post-Claude releases fall into the historical distribution, and how likely from the historical distribution we would be to get releases as "bad" or worse than the post-Claude releases, was probably the best that could be done.
- I spent several days on this, two before even creating the GitHub repo and had at least one major total rewrite of the report to use a better methodology (given the feedback from my wife mentioned above). This was a lot of manual, cognitive effort on my end.
- The scripts used to fetch the data, collate it into a DuckDB database file, construct the views on that DB, and then do the statistical analysis on that data, were indeed written by GLM 5.1, as was the HTML and much of the original prose for the final report webpage you're looking at right now.
- Crucially, however, all numbers, statistics, cards, and graphs in this report are automatically templated in directly by the Python script that ran the statistical analysis, thus avoiding any possibility of hallucinations or inconsistencies in the numbers.
- After posting this on Hacker News and recieving almost no substantive input, discussion, or response on the actual content of the article, I decided to rewrite all of the prose in my own voice. If anyone complains about my verbosity or sentence structure — as they usually do, which is the reason I originally let the AI write the prose, among other reasons obsoleted by templating — they can go fuck themselves.
- If you want to replicate the data and results here, and inspect exactly how they were calculated, you can find the repository here. I have purposefully made it so that the pipeline can be run end to end completely from scratch, so you can see the entire pipeline end-to end, with no mysterious DB blobs forcing you to trust that I didn't doctor or screw up the data. If you want to be mad about the numbers, look there first.
## 1 · Background: The rsync Outrage
…