10 years of personal finances in plain text files
30 points by siddhantgoel
I've read two posts about Beancount in as many days and they fundamentally disagree on the better import format between PDF and CSV :'D
Super interesting though! I personally bought into the Actual Budget universe for the ease of use factor but having to write TypeScript for importers hasn't been particularly appealing, forcing me to do bookkeeping per-transaction and by hand. I might give Beancount an honest shot soon.
I've read two posts about Beancount in as many days and they fundamentally disagree on the better import format between PDF and CSV :'D
The good thing is that Beancount doesn't care whether you extract from PDF or CSV. What matters more is that the data that goes into your .beancount file is accurate. Pick whichever format you prefer, as long as you can be confident about the data's correctness and it is easy to parse.
Generally banks (at least the good ones) also include balance values on specific dates in export statements, which you can throw into a balance directive to be extra cautious about correctness, e.g. to make sure that CSV exports didn't break. I use this quite liberally in my own importers. It's a neat little way to ensure that numbers over a time frame like a decade do add up.
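To illustrate the balance-directive idea, here is a minimal sketch of how an importer might turn a statement's closing balance into a Beancount balance directive. The function name, account name, and amounts are made up for illustration and are not taken from my actual importers. The one thing to watch is that Beancount checks a balance assertion at the start of the given day, so a closing balance for a date is asserted on the following day.

```python
from datetime import date, timedelta
from decimal import Decimal


def balance_directive(account: str, statement_date: date,
                      closing_balance: Decimal, currency: str) -> str:
    """Render a Beancount balance directive for a statement's closing balance.

    Hypothetical helper: account and currency naming are assumptions.
    """
    # Beancount evaluates balance assertions at the beginning of the day,
    # so the closing balance of day D is asserted on D + 1.
    assert_on = statement_date + timedelta(days=1)
    return f"{assert_on.isoformat()} balance {account} {closing_balance:.2f} {currency}"


print(balance_directive("Assets:Bank:Checking", date(2025, 12, 31),
                        Decimal("1523.40"), "EUR"))
# prints: 2026-01-01 balance Assets:Bank:Checking 1523.40 EUR
```

If the imported transactions ever stop adding up to that number, Beancount reports an error on that line instead of silently accepting the drift.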
Anecdata: I've been importing CSV files for the past 10 years and haven't really run into major issues. Only once did one of my banks change its CSV export format considerably, so that was a somewhat painful transition, but that's what regression tests are for.
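A rough sketch of what such a regression test could look like, assuming a hypothetical three-column export (Date, Description, Amount). The column names, sample row, and parse_statement function are invented for illustration. The parser refuses any header it doesn't recognize, and the test pins a known-good sample to its expected output.

```python
import csv
import io
from decimal import Decimal

# Hypothetical header for an imaginary bank's CSV export.
EXPECTED_HEADER = ["Date", "Description", "Amount"]


def parse_statement(text: str) -> list[tuple[str, str, Decimal]]:
    """Parse a CSV export, failing loudly if the bank changed the layout."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    if header != EXPECTED_HEADER:
        raise ValueError(f"unexpected CSV header: {header}")
    return [(day, desc, Decimal(amount)) for day, desc, amount in reader]


# A saved sample of a known-good export acts as the regression fixture.
SAMPLE = "Date,Description,Amount\n2025-01-03,Groceries,-42.10\n"


def test_known_export_still_parses():
    assert parse_statement(SAMPLE) == [
        ("2025-01-03", "Groceries", Decimal("-42.10"))
    ]
```

A test runner such as pytest would pick up the test function. The header check is what turns a silent format change into an immediate failure.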
Thank you for linking Actual Budget! I've been working on and off on personal budgeting software for quite a while (current incarnation, but it's still quite rough). Actual is astoundingly similar to what I wanted to build. I think I'm going to continue with mine for the experience (I'm learning a lot about Zig and modern WASM), but I'll probably just end up using Actual when I'm done.
Phew. And I thought I had too many transactions…
$ hledger -f 2025.journal stats
Main file : /home/rpaulo/ledger/2025/2025.journal
Included files :
Transactions span : 2025-01-01 to 2026-01-01 (365 days)
Last transaction : 2025-12-31 (2 days ago)
Transactions : 2037 (5.6 per day)
Transactions last 30 days: 164 (5.5 per day)
Transactions last 7 days : 5 (0.7 per day)
Payees/descriptions : 1258
Accounts : 65 (depth 3)
Commodities : 1 ($)
Market prices : 0 ()
Run time (throughput) : 0.25s (8245 txns/s)
I really enjoy this type of data introspection that is difficult when you don’t have it all condensed in one particular location, but it does take up some amount of time. I estimate it takes me about 5h per year just to download, import and categorize transactions. I only have a couple of real accounts (not the 65 virtual ledger accounts above).
Prompted by the two stories on the topic, I jotted down some quick notes on how I use ledger. They are not intended to be a complete introduction, but they likely apply to more software than just ledger.
I like this project. And what a coincidence, I just discovered it today before seeing this post. Someone asked for advice on how to create an agentic workflow for personal finances using fancy frameworks, like LangGraph. He had in mind that he needed tons of agents doing different things.
Using a coding harness such as codex-cli, I showed him how it could answer queries such as "What are the total cash balances?" against the examples directory from the Beancount tutorial.
A simpler observation-action loop is all that is needed. I recommended just forking an existing coding agent, then tweaking and configuring it for the use case. With the correct hardening, I showed him how you could even have it run on a schedule and send email reports based on certain events.