Hexinverter ¬ D language Userspace

Project link: Dcore on git.antpantheon.com

The other week I started a project to rewrite coreutils in the D programming language, thanks to some D fans in the Glimpse Project (which may or may not be pending a rewrite in D, depending on how things go).

So rather than do a lot of boring exercises on Exercism (or whatever), I decided that it'd be interesting to implement a userland in D. Coreutils is a significant GNU project in the Linux realm, and possibly a sticking point for people trying to make non-GNU Linux distributions. I understand this perspective but have no particular opinion on it; I just thought it'd be kind of awesome to reimplement some of the tools in D. It's a discrete set of things, so the project is self-limiting. D also interoperates with C trivially, while simplifying a lot of things that aren't as simple in C, such as text manipulation.

As part of this project I've also decided to more closely match the FreeBSD functionality from the manpages. I decided that I shouldn't look at any source code, just documentation, and implement based on testing and that documentation. This has been very interesting in terms of discovering the differences and commonalities (my base assumption is that the FreeBSD implementations are somewhat closer to spec, and I have done a little spec reading to understand some of the tools better). The Wikipedia article List of Unix Commands has also been extremely useful, and the documentation there has been a great boon.

Well, of course, it's also a really interesting archaeology expedition. There are several utilities that make me go "why is this even included in POSIX™?!" There are things with weird features, or areas that could be improved - and in my infinite humility I have decided that I can take this project on. And of course inevitable scope creep leads to the idea that we could do a whole userspace in D - and that's bloody interesting, innit. 😉

Interesting finds so far:

  • tsort - This is a utility that does a topological sort on a set of input tokens, which are read as pairs of nodes forming the edges of a directed graph. It then prints the nodes in an order that satisfies every "this before that" dependency, provided the graph has no cycles in it. This was fun! Also I didn't expect this to exist. I implemented this early because it was interesting to me.
  • ptx - This creates something called a "permuted index" of a text input. A permuted index seems kind of awesome and I wish it were used in more books: it shows the surrounding context for every word in the text, sorted by word, so you can look up any word in a body of work and find every sentence it's used in. I haven't started implementing this yet, but I often think, "why is this in coreutils at all?".
  • cksum - Implementations are weird and the algorithm is weird and I haven't quite been able to match the output of the Linux one even following various implementations online. More work is needed.
  • m4 - A macro processing programming language. Another mandatory utility that will be quite a challenge to implement! I haven't started yet. This, along with things like awk, bc, expr, and sed, is one of the microlanguages included by the standard, and they will all be interesting to implement.
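
To make the tsort idea above concrete, here is a minimal sketch of a topological sort over token pairs. It is written in Python for brevity and is not the Dcore code; the function name `tsort` and the choice of Kahn's algorithm are assumptions made for illustration. It also simply raises an error on a cycle, which is a simplification rather than a description of how any real tsort reports cycles.

```python
from collections import defaultdict, deque


def tsort(tokens):
    """Order nodes so that, for each input pair (a, b), a comes before b.

    `tokens` is a flat list read two at a time. A pair (a, a) only
    declares that node a exists. (Hypothetical helper, illustration only.)
    """
    if len(tokens) % 2 != 0:
        raise ValueError("input contains an odd number of tokens")

    successors = defaultdict(list)  # node -> nodes that must come after it
    indegree = {}                   # node -> count of predecessors not yet printed

    for a, b in zip(tokens[0::2], tokens[1::2]):
        indegree.setdefault(a, 0)
        indegree.setdefault(b, 0)
        if a != b:
            successors[a].append(b)
            indegree[b] += 1

    # Kahn's algorithm: repeatedly emit a node with no remaining predecessors.
    ready = deque(node for node, count in indegree.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Any node never emitted is part of (or downstream of) a cycle.
    if len(order) != len(indegree):
        raise ValueError("input contains a cycle")
    return order
```

Calling `tsort("a b b c a c".split())` returns `['a', 'b', 'c']`. When several orders are valid, which one comes out depends on input order, so different implementations can legitimately disagree.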

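For the cksum item above, this is a sketch of the checksum as I read the POSIX description: a CRC-32 with generator polynomial 0x04C11DB7, bits processed most significant first, starting from zero, with the input length appended as extra bytes (least significant byte first) and the result complemented at the end. It is in Python rather than D, the names `posix_cksum` and `_crc_update` are made up for illustration, and the bit-at-a-time loop favours clarity over speed. Two common ways to end up with a mismatch are using the reflected zlib-style CRC-32 or leaving out the trailing length bytes.

```python
POLY = 0x04C11DB7  # CRC-32 generator polynomial from the POSIX cksum description


def _crc_update(crc, byte):
    """Feed one byte into the CRC, most significant bit first (no reflection)."""
    crc ^= byte << 24
    for _ in range(8):
        if crc & 0x80000000:
            crc = ((crc << 1) ^ POLY) & 0xFFFFFFFF
        else:
            crc = (crc << 1) & 0xFFFFFFFF
    return crc


def posix_cksum(data: bytes) -> int:
    """Return the cksum-style CRC of `data` (hypothetical helper, illustration only)."""
    crc = 0
    for byte in data:
        crc = _crc_update(crc, byte)

    # Append the length, least significant byte first, using only as many
    # bytes as the length needs (none at all for empty input).
    length = len(data)
    while length:
        crc = _crc_update(crc, length & 0xFF)
        length >>= 8

    return crc ^ 0xFFFFFFFF  # final one's complement
```

With this sketch, `posix_cksum(b"")` is 4294967295 and `posix_cksum(b"123456789")` is 930766865, which makes for two quick sanity checks against `cksum < /dev/null` and `printf 123456789 | cksum`.
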
I've also started writing a shell, but instead of implementing the standard shell scripting language (even though, now that I'm into it, it shouldn't be too hard), I've decided to spin my own shell scripting language that pares the special cases down to an easy-to-understand minimal set. We'll see how that goes!

Anyways, this promises to be a long, long term project for me, but it has been fascinating to learn more about this thing I've used for 30ish years at this point.

One major impression I get is that there is a lot of historical cruft, and a lot of unquestioned historical decisions, even in the "newer" implementations of coreutils and similar - something that, with hindsight, we could maybe make more user-oriented! I'm looking forward to improving these little tools I use all of the time (right now, I'm pretty proud of the extensions I made to the cut utility!).