Why Is This Site Built With C
36 points by mysticmode
My blog is backed by a blogging engine in C which I’ve been using for 25 years, so yes, I suspect the author’s program can easily last 10 years. And adding a blog entry to my blog (which I can do via email, a web form with POST or a custom script using PUT) is fast not because it’s C, but because each entry is stored as HTML (my site predates Markdown by several years).
The rest of my site, however, is in XML and I use xsltproc to regenerate the site. I was curious about XSLT in the early 2000s and haven’t bothered to change it. I add a page (which may involve editing several XML files due to how it’s structured), run xsltproc, then use rsync to upload the changes to the server. It’s … fast enough for me, but it’s not instant.
Wow, I no longer use it, but I also created a static site generator using XML and xsltproc. Glad to (finally) know I’m not the only one! Kudos for sticking with your system for so long; I have shown no such restraint.
I think this is quite common – XSLT is a perfect tool for such tasks. Years ago, I created XML Web generátor based on a set of XSLT templates and Ant. Later I also created several smaller websites just with xsltproc and Make.
It always appears to me that people claim they want to write blog posts but end up writing a blog engine. They start wanting to write prose but end up writing code. I get it. Writing code can feel more satisfying at first. But then, you still haven’t gotten any word out. And what’s worse, once you do want to write prose, you eventually run into problems with your code. or your template. or the css colors. or, actually, dark mode would be fun!
Which is why my newest blog is Jekyll with the minimal theme on GitHub Pages and that’s it. (A blog post is gonna come up on Monday actually. Just waiting for the final review)
The author of the article you are responding to has a respectable blogging rate with a good range of subjects https://marcelofern.com/posts/index.html
I don’t think that’s true at least for me! I definitely write more now that I can just write, instead of futzing with ruby/bundler/Jekyll/asciidoctor.
Perhaps the extra determinant is the kind of blog engine? Some people want their blog to use fancy technology, some people want to have fewer deps.
It always appears to me that people claim they want to write blog posts but end up writing a blog engine….I get it. Writing code can feel more satisfying at first. But then, you still haven’t gotten any word out.
Tangentially, I always like to keep the hobby quadrants in mind.
I try to be in the first quadrant. But I can’t deny that I have fun with the other three too.
Oddly enough, I feel that I might have completely misinterpreted your post. On my first reading, I thought that you were saying that you also avoided writing blog posts. Expanding on my interpretation of your four quadrants, I saw them as
Thus, anyone trying to be in the first quadrant would not write blog posts. However, I’m now thinking that you might have meant
I will clarify it further. The thing here is writing. The kit is the blog engine. Now you can:
I don’t think there’s much evidence to suggest that the people who write their own SSG don’t often write many posts because of issues with their SSG. Often times, even a shoddy DIY SSG is faster to get working the way you want than an off-the-shelf one. The problem is just so simple that if necessary you can just start from scratch and get back up and running in an hour or two. For me, writing a good article takes a lot longer than that. In fact, I’ve lost more writing time to off-the-shelf SSGs that couldn’t be made to do what I wanted, and to figuring out how to get them to let me do it.
I’ve written a number of SSGs. In that time I’ve accumulated an increasing number of half-finished blog posts. At this point the number of published posts is the same as the number of finished SSGs. I just don’t have a knack for finishing blog posts. That seems unrelated to the effectiveness of my SSG. At this point it seems to be the most flexible SSG I can find (71 lines of Python for the SSG core, which depends on Jinja2, and 195 lines of Python for configuration, which depends on marko and pygments).
I’ve certainly seen other people with similar problems to mine. I still think that removing all the obstacles of “will this look the way I want if I do finish it” has gotten me to write a lot more, even if I don’t often finish anything.
I want to write blog posts, but I can’t say that I have the patience for finishing them :)
I started to contribute patches to the SSG engine I chose and eventually (more than ten years later) became a co-maintainer of it. I can’t decide whether that counts as an instance of what you are describing, or not.
Frankly I think part of the problem is that most SSGs are just not good enough yet. In particular, they seem to have a very narrow opinion about how the site they are generating should be structured.
My own writing has slowed down in recent years for external reasons. I did notice I found it much easier to write about a non tech subject (music in my case) than something closer to work.
My impression of SSGs is that the more complicated they become, the more you just want to describe how everything fits together directly in code. And the more you end up writing code directly, the less valuable an SSG becomes, because the simple work (converting from one format to another) is typically very simple, but the complicated stuff is very site-specific.
I’d like to see a kind of SSGlib-type tool, that provides all the tools you need to build your own SSG, with the assumption that each site is its own SSG. But I suspect you then run into the other issue of SSGs, which is that if you’re going to be programming, you want to program in the language you’re most comfortable with, which for OP is going to be C, but for someone else might be Python, and another person Typescript. So unless your SSGlib comes with a DSL that nobody’s comfortable with, it’ll only ever target a subset of people who want an SSG.
Personally, I’m using Hugo for now, and accepting its limitations as opportunities to be more creative in how I solve problems! 😅
This is also what I realised after writing a few SSGs. This is my current/last SSG:
SSGlib: https://the-tk.com/cgit/ssg/
Example ssg written using it: https://paste.rs/awAka.py (This is just the generation script. There’s also the contents and template files but these are just like any other blog. You can fill in the details in your mind.)
Maybe I should write a post about it……………. I’ll think about it in a few years :)
Metalsmith might be the JavaScript version of what you’re describing—basically an SSG framework/library/construction kit. I used and liked it for a while, but once you use a few plugins it pulls in a staggering number of dependencies.
I’m kind of one of those people or at least there is a passing semblance. I find that I am more interested in communication technology than wanting to say anything specific, myself. On the (double) flip side, for the things I could have written about, I am more interested in doing the thing than talking about it. It’s not very conducive to keeping an online presence, and at this point I’m doubting whether I want to, to begin with, with everything published openly seemingly being destined to be scraped and regurgitated.
I’m always tempted by this but I’ve been using Jekyll for 11 years now and have written over 1300 posts, it’s given me zero hassle over the years.
There should be little to none dependencies for generating the website.
I totally understand this sentiment. After failing to keep up with dependencies of different blogging attempts of mine in Hugo, Sphinx, and Jekyll, I wrote my own blog engine in C++. I wrote my own subset-of-Markdown parser, my own syntax highlighters, and set up file monitoring so I can edit/view updates while writing by hosting locally with python’s built-in http server.
Mine’s not the best, but it’s not a magical black box and I’ve learned a lot about Markdown, parsing, CSS and modern HTML making it.
I looked for a better alternative and found md4c, which is a parser written in C with no dependencies other than the standard C library. It also has only one header file and one source file, making it easy to embed it straight into any C project.
The official library of CommonMark (cmark) is a bit more elaborate but some may prefer it. It also comes with an official CLI tool, if someone wants an alternative to pandoc that only reads Markdown.
That worked fine for about 20 to 30 markdown files. After that, the process of converting files to html started to deteriorate in speed. Pandoc is written in Haskell, and it is not known for being fast at parsing large volumes of files.
We may leave the usual “blame it on Haskell” fare aside for a moment. Pandoc does a lot. Including support for entirely custom formats and Lua hooks for modifying the AST. That flexibility invariably comes at a cost. And that cost is not even that high:
A quick test with a ~16K Markdown file:
$ i=0
$ time while [ $i -le 80 ]; do echo $i; pandoc -f CommonMark -t html test.md -o /tmp/test.html; i=$(($i+1)); done
real 0m2.948s
user 0m1.258s
sys 0m1.240s
You can also offset that cost with caching the output. Arguably, should. Not going to say that wanting cold builds to be faster is not a valid goal, but I do think that saying that pandoc is slow is not exactly true. It’s easy to be fast if you aren’t doing much, and when one finally finds themselves in need of greater flexibility, it’s suddenly back to pandoc or similar.
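As a rough illustration of the caching idea above, here is a minimal sketch that keys rendered output on a hash of the source text, so unchanged input is never converted twice. The names `render_cached` and `cache_dir` are hypothetical, and `render` stands in for whatever converter is in use (for example a function that shells out to pandoc); none of this is taken from the article or from pandoc itself.

```python
import hashlib
from pathlib import Path
from typing import Callable


def render_cached(source_text: str, cache_dir: Path,
                  render: Callable[[str], str]) -> str:
    """Return rendered HTML, reusing a cached copy when the input is unchanged.

    `render` is an assumed stand-in for the real (slow) converter.
    """
    # The cache key is the SHA-256 of the source, so any edit produces a new key.
    key = hashlib.sha256(source_text.encode("utf-8")).hexdigest()
    cached = cache_dir / (key + ".html")
    if cached.exists():
        return cached.read_text(encoding="utf-8")

    # Cache miss: pay the conversion cost once and store the result.
    html = render(source_text)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_text(html, encoding="utf-8")
    return html
```

Hashing the content rather than comparing timestamps means a cold checkout with fresh mtimes can still hit the cache, at the price of reading every source file on each build.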
There is nothing inherently bad with Hugo. It is decently fast (written in Go) and it is easy to get going for a simple website.
I wouldn’t be so generous. ;)
One problem I have with Hugo is that new releases do break old themes, and it uses 0.x.y versioning after a decade of having a large crowd of real-world users — I haven’t seen it warn about incompatible changes either through semantic versioning or release notes. That problem is only offset by the fact that it’s a static executable and one can never update it — I know many people actually never update it.
Almost 15 years ago (jesus) I decided to redo my personal site + blog in Go, as a way to learn Go’s http library. I brought in a Markdown library and ended up with a complete webserver which, when you open a blog post, goes and opens a file on the disk and reads out the Markdown and parses it and serves it to you. It originally ran on Plan 9 on a laptop in my bedroom, but then I migrated it to a VPS with half a gig of RAM and that stood up to multiple front-page-of-HN events. Always served directly by my code, never run behind Cloudflare or anything like that.
It’s fast, it’s customized for precisely my use, and it’s required essentially no changes since 2011 except to roll a go.mod file at some point (and add the occasional new feature I want).
The first version of my website was also something built on Go. I ended up adding SQLite, but it worked for the better part of a decade, and I only really moved it because I got a bug to try to learn Elixir and Phoenix.
This reminds me of an anecdote.
I worked in the post production world for many years starting back when everything was analog. People would use tape to identify cables. Over and over again I told them to please not use tape because when I’d take off the tape, it’d leave gooey glue that was really hard to clean up.
“No - look”, they’d say, as they peeled tape off of a cable. “That’s not true. No glue.”
I’d then explain that in a year or two, long after they’re on their next project, I’d be stuck removing the tape and dealing with the gooey mess left behind.
I feel like the kind of people who’d recommend Ruby on Rails, for instance, would totally be putting tape on cables, comfortably ignoring what someone else will need to deal with years down the line.
I feel like the kind of people who’d recommend Ruby on Rails, for instance, would totally be putting tape on cables, comfortably ignoring what someone else will need to deal with years down the line.
This is incredibly condescending for absolutely no reason. Kind of strange writing that on a site that is Ruby on Rails.
Perhaps I’ve touched a nerve?
I used Ruby on Rails as an example because I’ve dealt with trying to support it. Need I say more? All the people who think it’s simple somehow aren’t around when it’s time to update things. I stopped supporting it because I was tired of dealing with the aftermath of updating for security reasons, then discovering that things broke.
It’s perfectly fine for certain things, but when the systems administrator isn’t the same person as the site developer, it’s not fine. When there’s a security issue, I update. If the Ruby on Rails people had their way, I’d never update. If you run your own site and your own servers, then you can balance updating and security.
From the sound of it, you’ve had a bad experience working someplace where you were held responsible for maintaining something that you shouldn’t have been responsible for (keeping a web application up-to-date should be the responsibility of the team that builds it). But that’s the fault of a dysfunctional workplace, not the fault of web applications as a whole.
Plus, imagine if someone insisted the company isn’t allowed to have a web site at all because you’d have to maintain a whole server and deal with updating it in response to security issues and keeping the configuration up-to-date. Insisting that web applications are inherently bad because they have maintenance requirements is hard to differentiate from that position.
I feel like the kind of people who’d recommend Ruby on Rails, for instance, would totally be putting tape on cables, comfortably ignoring what someone else will need to deal with years down the line.
The author apparently started out writing their own dynamic blogging app in a web framework, and then decided to switch to a static site generator. Both of those are valid choices for a personal blog, depending on the author’s personal tastes and wants. I’m not sure there’s a deeper lesson to take away from it. De gustibus non est disputandum.
But to flip your anecdote back on you: imagine you work at a place where the company website frequently needs to be updated, by any of several non-software-developer people. Would it make sense to force them all to learn Markdown and to write plaintext files and to commit them to a repository and to either learn how to run a static generator and deploy the results or to always have to wait for a “tech person” to come along and run the generator for them/deploy the output? Or would it make sense to give them a simple content management system with a web interface they can use to edit the content?
Or more simply: think about situations where your own apparent dislike of certain technologies would make you the person who “puts tape on the cables” and generates unnecessary extra work for others.
I wrote what I wrote because, more often than not, I’m at the removing-gooey-glue end of the equation.
You’re trying to say that me preferring simplicity is akin to putting tape on the cable for others. You might not be wrong for a whole family of examples, but if I’m the one who has to make sure the source to web tools work in ten years, then I’m still worried about the tape and might suggest label printer tape over editor’s tape.
But as a systems administrator I think about how things will be maintained in ten years. Most people don’t.
I recently made a website by programming my own “static site generator” (if you can call it that) in a similar manner to what was presented in this article, but I used Python, pandoc, and lxml (a Python wrapper over libxml2) instead of C and md4c. I think my solution probably satisfies requirements (1), (3), and (4). Probably not (2), since it takes around five times longer, and that’s only for processing a single file (!), with time reporting 0.54s. I mostly made it for fun rather than out of practical needs. I did consider Hugo or Jekyll (which I use for my personal site), but I was too lazy to learn how to adapt it for the needs of the project.
An alternative for saving time with recompilation was to update my script so that only new markdown files or changed ones are marked for recompilation.
That’s what make was made for. It uses timestamp metadata to determine what’s outdated, a simple but reliable approach. Despite being commonly used for compiling program binaries from source code, it can be used to build any product files from a set of source files.
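For anyone who would rather keep the check inside a script, the timestamp rule make applies can be sketched in a few lines. This is only an illustration under assumed conventions: the helper names are hypothetical, and it assumes each `name.md` in a source directory maps to `name.html` in an output directory.

```python
from pathlib import Path


def is_stale(source: Path, target: Path) -> bool:
    """True when the target is missing or older than its source."""
    if not target.exists():
        return True
    return source.stat().st_mtime > target.stat().st_mtime


def stale_pages(src_dir: Path, out_dir: Path) -> list[Path]:
    """List the Markdown sources whose HTML output needs regenerating."""
    return sorted(
        md for md in src_dir.glob("*.md")
        if is_stale(md, out_dir / (md.stem + ".html"))
    )
```

A build script would then convert only the files returned by `stale_pages`. Unlike a real Makefile, this sketch does not track extra prerequisites such as a shared template, so a template change would still require a full rebuild.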
More over, the whole Pandoc ecosystem requires a lot of of dependencies. 227 dependencies and over 400MB of installed size to be exact
You can compile pandoc statically. The resulting binary is ~150 MB but it shouldn’t ever break as you upgrade the rest of your system. See here.
i started a hugo site years ago and as i’ve updated things break. it’s very frustrating that tools can’t remain backwards compatible.
I’ve yet to look for alternatives or build my own but i imagine those days are coming because hugo is in fact growing to be big and complicated.
Depending on your use cases, I’ve recently switched to Mendoza for ssg stuff, and have been liking it quite a lot so far.
It does take a little bit more up front investment if the existing templates don’t do what you want. I’ve been building out large portions of https://junglecoder.com with it (which has been updated to be more than just a blog). If you want to deeply customize it, you’ll want to learn some Janet as well.
soupault is another option I’ve used elsenet, though I can’t speak to its stability over time.
I feel like I lost so much time trying to get all of the various static site generators to work. While I do have to futz with my site generator, and a lot of the design is inspired by what I experienced with Jekyll (front matter in the markdown file is great), it’s mostly been straightforward and me knowing how my thing works, rather than dealing with plugins or whatever.
The trickiest stuff has been the “partially render this, then use the result to do something and fill in the blanks” kind of thing, but fortunately that was annoying enough that I’ll not forget about it.
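To make the front matter idea concrete, here is a minimal sketch of splitting a Jekyll-style header from the Markdown body. It is an assumption-laden illustration, not the commenter's code: the function name is hypothetical, and it only understands flat `key: value` lines between two `---` delimiters, whereas Jekyll's real front matter is full YAML.

```python
def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split '---'-delimited front matter from the body of a Markdown file.

    Returns (metadata, body). Text without a complete front matter block
    is returned unchanged with empty metadata.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text

    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter: everything after it is the body.
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()

    # No closing delimiter was found, so treat the whole text as body.
    return {}, text
```

The metadata dictionary can then feed a template (title, date), while the body goes to the Markdown converter.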
I’m always a bit disappointed that blogging should be done in a language throwing mostly all html out of the window and NIH-ing an idiosyncratic dialect. (Markdown in this case)
Isn’t there one blog engine that takes laconic, redundancy-free html and amends and expands it into the modern idioms expected today? https://codeberg.org/mro/pagerake/src/branch/ma/prake.ml#L10-L41 addresses some such aspects but only some so far.
When blogging was a big thing most authors used engines like MovableType (god I’m old) and WordPress, and the interface was some textarea with buttons for bold, italic, links etc.
Markdown just moved that inline, a bit like WordStar, and enabled a faster workflow for keyboard users. And stock MD allows seamless mix-in of HTML elements.
(The OG Markdown Perl module was designed for easy inclusion into MovableType and Blosxom)
I’ve hand-authored HTML since 1994, Markdown was a revelation and a liberation. I much prefer to edit prose sprinkled with the occasional asterisk or underscore compared to a thicket of angle brackets.
lists, quotes, verbatim, footnotes, nested - terrible in markdown et al.
Terrible in what way? In my experience they are fine and much less painful than HTML – well, except for footnotes, which are not in original Markdown but are a common extension, and are nasty in HTML.
(I’m not a fan of footnotes on the web: the reading experience is almost as bad as endnotes in books. Much better, I think, is to frame the note as an aside immediately after the paragraph.)
sourcecode and lists in blockquotes in lists always bites me. In markdown - every single time.
In html - never.
(footnotes and endnotes are in my world not for reading but for rabbitholing and research. I have my own take on them however, see e.g. https://seppo.mro.name/en/about/#ref:0:fn:ap_delete)
I guess. I’m blogging here, not writing a treatise.
lucky you
Not really. I’m using the right tool for the job.
If I were writing a treatise, I’d use LaTeX.
If I had to author HTML to include features favored by the typesetters of bloated American textbooks, I’d probably use RestructuredText or AsciiDoc. Or mixin HTML in Markdown, like the format is designed to handle.
soupault would probably be worth checking out if you want that
bet that! Funny https://opam.ocaml.org/packages/soupault/ uses the same OCaml DOM https://opam.ocaml.org/packages/lambdasoup/ as my tool above but I totally forgot about soupault. However I’m looking for something more make-integrateable. More of an editor plugin/helper/filter, not taking control over the complete site. Can’t phrase it better. Thanks for the reminder!
I found this interesting because while I share the author’s desire to eliminate as many dependencies as possible, there’s a huge dependency in the middle of this solution: the PC and compiler to code in C.
Facing a similar problem ten-ish years ago when I wanted to create a dependency-free blogging system that was also modular, I landed on client-side Markdown translation. Make a web page frame, include 2-3 js libraries statically as part of the page, then put whatever content you want into markdown inside a local json file.
No dependencies, no libraries (As far as update issues), no compiling, no webserver. Done.
Fascinating to me how much baggage we carry along as developers even when we’re trying to get rid of it. Now imagine devs who aren’t interested in this topic.
I don’t have it 100% done yet, but I’ve been working on a no-build set of JS frontend libraries that lean into what modern browsers can do.
I haven’t switched any of my sites over to it yet, and I don’t know that it’ll be strictly better than something like LemonadeJS (though I expect it to end up smaller than Alpine), but it’s been a very interesting exercise
The blog post assumes things that aren’t true, like the idea that things written in Go and C will be faster than things written in Python or Haskell. I use pelican, which the post mentions by name and states isn’t fast enough to be useful, which is untrue.
I use a SSG that uses Perl but the real heavy lifting (Markdown rendering) is done by calling out to the C CommonMark library. Renders almost 11k (short) entries in under 10s.