Distro Thoughts #1

So, after some consideration, I’ve decided to resurrect my previous efforts at building a Linux distribution. Mostly because I’d like to tinker with a lightweight Linux that’s easily customizable. Something that really goes “back to basics”.

My first experiments were attempting to bootstrap a Clang/Musl build variant. Ugh!

My initial environment was Ubuntu 18.04 – with a modern C++ toolchain. I thought it’d be easy to populate a chroot environment, especially with no cross compiling required. The LLVM code looks generally pretty clean – big, it does a lot, but clean. The build system, though? See my previous comments on build system messes. The ‘one repo’ approach assumes a rather complete environment and does not bootstrap well to a new rootfs. I thought a hacky compile-from-source build would be neat – but it does not seem doable.

Too much time spent on this today, time to go outside and enjoy the sun.

Exploding Complexity

I’ve been thinking a lot lately about embedded Linux and Yocto and what constitutes an “embedded” system. My first PC was an 8086 clone with two 5 1/4″ floppy drives and no hard drive. I had a small collection of disks I’d trade out for my projects.

I remember running Linux on an old 386 with 4MB of RAM. I browsed the web with a 486 sporting 8MB of RAM. In more recent memory, I ran a few machines with 512MB–1GB of RAM that managed to browse the web and accomplish most of what I do on a day-to-day basis – editing code and writing.

What exactly are programs doing today that they sit eating gobs of RAM? The smallest program running on my desktop right now is the new Windows 10 terminal application – and THAT still takes up 10+MB of RAM. The argument is “they do so much more”, but I don’t buy it. Software has jumped the shark.

I stumbled upon the people at suckless.org recently. While I don’t share their love of weird tiled user interfaces – I do find myself looking over their lists of software thinking: why can’t Linux run halfway decently on Windows XP-era hardware? It’s been a long, long time since I heard about restoring older PCs to service by switching operating systems.

Professionally, one of the larger code bases I deal with is still in C – and I find myself yearning for C++ as I stare at piles and piles of copy-pasted, poorly maintained crap. That said, as I look over the software landscape, I’m not sure it’s language choice that matters. Old C code generally has corners of repeated if(error) checks, switch-statement / function-pointer based polymorphism, and the occasional macro-based non-template (or worse, magic with untyped pointers). We can use C++ to good effect to reduce / eliminate / clarify these situations. But, instead of enabling cleaner code, we create brand new realms of further complexity with undebuggable template metaprograms and spiraling exception handling. At least in C it’s obvious if you didn’t check a malloc return – C++ happily throws std::bad_alloc and you’re off to the complexity farm. Bring in the even newer, happier cohorts of Perl, Java, and now Python, Javascript, and extended Java / C#. Adding extra gas to the fire is web package management with nuget, pip, and npm.

All of this is to “manage” the complexity of software. And yet, at the end of the day, computers today are still keyboard/mouse or, even simpler, a touchscreen. The actual complexity – high-bandwidth modems, extended security concerns, increased screen resolution – could easily be dealt with in our old C/Pascal-land primitives OR is already being dealt with there.

As I sit staring at the build screen of Yocto taking hours upon hours to finish compilation, I’m starting to form a new and strong opinion on software complexity:

  1. Design failures of any given abstraction geometrically increase the complexity of successive layers failing to address the design failure.
  2. The ability to complicate software scales linearly with the productivity gain of an abstraction.

All hope is not lost though – developers can choose to manage complexity and ease its impact with the same tools that allow that complexity to exist in the first place. Unfortunately, I don’t think this is happening. Indeed, the combination of opinions 1 & 2 – and the current state of code quality (which seems to be an industry-wide pandemic) – indicates a final concluding observation:

Applying improved productivity technology to legacy software conceals underlying design flaws of the system.

I feel Linux in general is reaching an odd ‘critical mass’, where the ancient discussions of system flaws and design needs are catching up to us. Meanwhile, I’m starting to see the light of long-term colleagues who hate C++. Going from C to C++ now seems like digging with a shovel versus a Bobcat. Sadly, I don’t think C++ generally brings the right level of discipline when considering the mess.

C++ Has Me In a Funk

After spending the past year or so developing large Python applications, I’ve returned to the C++ fold for my day-to-day work. For a long time, I developed primarily in C++ – enough that the compiler became my tool of choice for simple automation tasks over the more logical scripting platforms. Large applications in a scripting language? Lunacy. And now, returning to C++ feels painful. Time to spend some effort figuring out why. I’m hoping to perhaps uncover some improved style for my C++ implementations with this introspection.

Build Systems All Suck

Being interpreted, Python doesn’t require any build system. Drop in a .py file and it’s off to the races. Drop __init__.py into a directory for a package. Simple. Python distribution utilities generally amount to a list of required dependencies readily served up from a PyPI repository. In C++ land, we’ve got CMake, Waf, Autotools, plain Makefiles, MSBuild, and any number of other alternatives. Pick any one – and quickly you’ll be staring at the monitor wishing you’d chosen differently when some IDE, library, or maintenance task gets in the way. Unfortunately, I’m not sure there’s anything to be done here outside of inventing yet another new standard.

Dynamic Typing vs Generics, Virtual Methods, and Templates

Dynamic typing with static hinting / analysis just feels more natural than a world of static types with no real introspection. Concepts may help bridge the gap here – I find myself mostly treating all my Python functions as accepting concept inputs anyway. Further, since all the classes are dynamically bound, there’s no need for a central module to be aware of a plugin module’s class type or to define an interface for it. Perhaps a solution here would be a combination of CRTP and C++20 concepts – or maybe helper methods? Some experimentation is needed. SFINAE substitutions may be able to aid with attribute lookup using named class members.

Coroutines and Async

C++20 is hopefully going to help here – but I’ve yet to see a good example of (or try out a compiler supporting) the new coroutines spec. Further, C++20 support here seems embryonic and tailored HIGHLY toward library authors. The more I’ve worked in Python, the more I find myself utilizing generators, and lately async generators, and in some cases full-on async coroutines. Database search operations, network operations – so much easier to handle with the ‘await’ keyword, versus traversing seas of callbacks.

File and Module Namespaces

This one has me pondering adopting a new scheme of defining per-file namespaces in C++ and utilizing ‘using’ in top-level ‘module’ headers. More thought is needed – but my initial feeling is there’s a certain niceness to each Python module polluting only its own namespace. In C++, a single “using” statement can derail a whole include-train traversal. Worse, you’ve got to worry about any third party throwing their crap in as well.

Missing REPL in the Debugger

In data analysis, there’s no end to the power of a well-populated workspace. The primary benefit I find in Matlab is simply the data visualization toolset and the ability to have a workspace that you’re actively manipulating and saving small chunks of to use later. My Python development often sees one window left open with a REPL, where I’m continuously trying new code segments and verifying operation.

List Comprehensions

This is right up there with generators and coroutines. The ability to build a list directly instead of appending to one is a joy – especially since it can be done internally with an iterator, allowing filtering and mapping without large intermediary data structures. Will also need to check this out with C++20.

I’ll reserve some space to whine more later – but I think this covers at least the top-level points from a language standpoint. I haven’t talked about the elephant in the room that is PyPI and the current rather sad C++ ecosystem there. That said, with my current project, so much of the code was developed in house that using an off-the-shelf framework like Qt and following up with entirely custom development would be roughly comparable.

Whatever your language of choice, diving into another land and trying to bring back some ideas with you is a powerful way to improve your skill set. Good hunting.

On COVID Research…

Over the past day, I’ve seen multiple articles (three, four, five?) from different people posted on COVID. Generally, posted with an agenda of either orange-man-bad or lizard-people-left-wing-liberal-conspiracy. I’m not gonna get into the specifics of these debates. I try VERY hard to keep my life positive and mostly non-political. Personally, I’m trying to wear a mask out in public, keep hand sanitizer in the car, and am actively avoiding any/all social situations. The biggest risk I’m taking these days is occasional to-go food and potentially getting a beard trim from my barber after the 18th – which I’m only considering because I know how fastidious he is.

Instead, I’d like to talk about basic “scientific” literacy. Long-running crises like COVID move too slowly for the media to cover on a 24-hour basis. Research takes months or years if done right. Building a simulation or model isn’t like an episode of CSI or Numb3rs. If you see an article talking about “new study says” or “study claims” or generally anything “study” related – take it with a grain of salt. A number of researchers are rushing to publish “findings” right now, and the news will happily take those that match THEIR viewpoint, dress them up, and push them as some sort of huge revelation. They aren’t. Don’t confuse the media’s search for content with in-the-field development.

Coming from the standpoint of someone who’s actively developed forecasting models and lately done lots of cloud processing with spatial data: models are hard. They are also limited. I’ve seen multiple recent posts on conflicting models saying conflicting things, from preprint papers waiting to be peer reviewed and published. Without thorough review of the collected data, methodology, and analysis, these “reports” are near worthless. In data science, epidemiology has had something of a gnarly reputation as a weak science: poor documentation of method, lack of reproducible results, significant examples of p-hacking, poorly documented data sources, “studies” consisting of largely anecdotal data or retrospectives with significant bias, etc… Once we get out of today’s weeds, I’m hopeful we’ll see better development with increased interest and more realization of the importance of the field. But – if I’m objective – at this point, the “models” have been mediocre at best.

TL;DR – Please, please, PLEASE, stop shouting at each other over SCIENTIST SAYS THING. We all have our own risk tolerance and concern over relatives or the economy. Chill, sit down, and don’t be a douche. Realize that personal space is now 6′ for a lot of us. You’re gonna see people wearing masks – they may even be Orange-Man voters. If someone isn’t wearing a mask – maybe they have a respiratory issue or some reason they can’t / aren’t. If you favor the shutdown or the reopen – realize there are some REALLY good points on both sides and that it’s OK to disagree.

C++ Needs a New GUI Framework

The landscape of C++ GUI development is painful – native Windows gets third-tier support from Microsoft, and Android actively discourages native APIs. Linux is better with Qt and GTK, but GTK on Windows is rough. My go-to choice for years has been Qt.

Lately though, it seems Trolltech Nokia Digia The Qt Company has an active dislike of its users. I’ve brought up the idea of Qt at my day job, but the word is they won’t cut a deal amenable to requirements. So we mush on. There’s lots of homebrew garbage out there – especially if you start looking at widget sets on top of Unity. Hey, why not yet another CSS browser?

In the end – maybe it’s just that the demand for native code isn’t there. Web front-ends are all the rage, and Electron apps can do wonders. Why not take a gig of RAM for a text editor and chat client – RAM is cheap these days. Still, there’s something hugely missing in development work when you start looking at the interface between C++ and whatever Javascript engine du jour you’ll be running on.

Qt is almost there for so much. Unless you want to make money or distribute an app with a GPLv3-incompatible license or environment. Given Microsoft’s amazing collection of freeware tools, you might expect a license for commercial development to be reasonable. You’d be wrong. The keepers of Qt licensing want $5k+ per developer. The community shouted. They offered a ‘small business’ package for anyone with less than $100k revenue. The community shouted again. Now, they’ve upped that to $250k. Just don’t look at the fine print if you want to distribute embedded works.

What would make a nice GUI library?

  • Some sort of DOM / Canvas model that is intuitive and easily interacts with C++
  • Scripting support with C++ tie-in
  • Stable API that plays well with “standard” C++

Hit those buttons, don’t charge me an arm and a leg, preferably be open source – GPL + commercial would be OK by me – and we’ll talk. Maybe it’s time for a Motif comeback; I miss you, X11 days.

Dev Rule #1: Delivery Isn’t a Cure-All

How many programmers view themselves as craftspeople? From those I know, the vast majority.

When dates are on the line and customers are shouting – many programmers with that mindset will hunker down, work extended hours, and chant the mantra – “Once we ship, everything will be better.” This is a lie.

Best Case Scenario: You hit whatever date the boss or customer wants and deliver exactly what they want. In years of development, I’ve yet to see a single deliverable that qualified as exactly what someone wanted – even with extensive documentation and discussion beforehand. Delivery is compromise. At best, the verified features and functionality of the software meet the standard set. People always want more; it’s why no software is ever “done”. Even if the end product is exactly what they wanted, the bitter taste of angry meetings and phone calls will flavor your delivery like an open container of ice cream left in a freezer full of rotten meat and onions. It may still be ice cream, but it tastes like garbage.

Worst Case Scenario: Continual miscommunications force further and further schedule slips and angry calls until the product is cancelled and new “leadership” or (more likely) a new development team is brought in to “Fix Things”. This is, almost universally, the end result of using a poor outsourcing team or contractor. At some point, a manager is gonna realize that doubling down on the steaming pile isn’t gonna work. Don’t be in this situation; it won’t end well.

Delivery won’t necessarily hurt a bad situation, but it’s also not guaranteed to help. Continual communication and clear expectations keep bad situations at bay. Some leadership may simply be toxic – but the vast majority of people in the world simply want to get things done and not look stupid in the process. Delivery only fixes the first part of that, and it can make the second part substantially worse.

An amazing craftsperson doesn’t just make an amazing product – they sell the world that it is amazing. Quality whispers – it doesn’t shout.

Dev Rule #0: All Rules Have Exceptions.

Why is this Rule #0 and not #1?

First, it’s not so much a rule as a disclaimer. You’ll always find an exception to the rule – even this one. To quote a pure cinema classic, “not so much rules as guidelines”. I know enough about nerd mad typing from having done it so many times. I’m not going to argue with paragraphs of text justifying violation of one of my personal rules. Chances are, I’ve violated that rule more often and with more gusto than you anyhow.

Second, we’re programmers – indices start at 0. Unless you’re one of those Matlab folk. But really, Matlab folk should be isolated on an island away from civilized folk anyway. Bonus Unlisted Rule: Don’t take programming advice from any programmer whose primary language is Matlab.

Old Man Js 1: Too Many Tools

As I dive more into web programming in an effort to become stronger at the front-end, I figure I’ll drop some notes for any other enterprising embedded / server programmers wanting to join in.

Plodding along on the internet, I’m rapidly discovering that the choice of libraries exposes one to an endless array of different methods of building / compiling your web app. PHP seems much more straightforward in comparison. The first and most confusing element to me was ‘nodejs’ itself.

My backend is all Python, so what’s with requiring this NodeJS Javascript web server? It’s not a web server, it’s a scripting environment. Well, that makes a bit more sense.

Ok, but why do I need a Javascript environment to use these toolkits? Well, the utilities to compile JS are written for that environment.

Wait, I thought JS was interpreted by the browser? True, but you want something to maintain all the dependencies and automate things like minification and creating map files.

Five minutes into reading basic tutorials on several different frameworks, I’ve already had to discover new terminology for nodejs / npm. And, at this point, I haven’t even started down the alphabet soup of different environments:

  • Yarn vs. npm vs. Bower – OK, we’ve got multiple competing package managers here to get going… And each has its own quirks. Maybe the best answer is to stick with npm since it came with the environment? Crud, looks like these tutorials use yarn.
  • Gulp vs. Grunt – OK, so now we start to discover that inside this JS environment are apparently new environments for running tasks… OK, not too much of a problem.
  • Webpack vs. Browserify – Well, these are what I installed this node thing for anyway, aren’t they? What am I getting here?

Annoyingly, each JS developer has their own ‘special sauce’ combination of components that yields something for the back-end developer. The larger the application (and the more 3rd-party utilities one brings in), the more likely it seems one will need to go ‘off script’ from the recommended configurations provided. That doesn’t even begin to raise the sheer number of potential library combinations that may (or may not) be tested.

I’m trying to like this Javascript thing, but it’s really reminding me of the DLL-hell days in Windows.

Dev Rules: Personal Philosophy of a Rogue Software Engineer

Not long ago, a fellow software engineer popped his head into my office to reveal some new daily horror worthy of posting to TheDailyWtf. As usually happens in such situations, my brain ejected a small stream of profanity before I gave in to an uncontrollable urge to shake my fist and point out the voluminous reasons this particular example indicated the responsible party should be tossed off the roof of our building. As my face returned to its normal shade of programmer day-glow white, my fellow laughed and said I should write down my personal development philosophy.

So here goes. Friend – if you are out there – I suspect you will find this good bathroom reading. And if printed, perhaps useful as well. Just use soft paper.

For everyone else, ignore these posts. They will not make you happier or more productive. I am no Mel, and I definitely don’t qualify as a Real Programmer. For #@$* sake, this is a WordPress site complete with crappy PHP stolen from a WP index. Chances are half the server traffic here is Russian command-and-control botnet commands forcing the latest DoS attack against some GOP website. Worse yet, I wrote this with a WYSIWYG editor. Not random SQL queries.

What I’m saying is, don’t take me seriously.

More likely than not, these posts are all written under the power of various prescription drugs in a vain attempt to achieve some sort of sleep while dealing with chronic illness. If nothing else, such curses give you more free time. Bad spelling, grammer, and made-up Texanish words be ahead. You’ve been warned.

Social Networks are Hard

I’ve decided to start writing a bit about various theories I have on social networks, Facebook, Twitter, and the blogosphere. I fear that attempting to start an “open source” social network, or to join one, is a cause doomed to failure. But I’m not sure why. Back when I started blogging, I was amazed to find a network of real-world people brought together over blogging. The years haven’t been kind to blogging; Facebook and Twitter have slowly pulled users into their clutches.

My early days online were during the time of AOL disks and TV news hours advising against meeting people you talked to online. Meeting an open source contributor or two was as far as I dared advance. Certainly no online dating. I enjoyed reading BBS articles on graphics programming and tinkering with MS-DOS games and utilities. My access was limited and supervised as I was in middle school, and Linux wasn’t happening due to my PC being an old 286.

It took a few years, but I finally managed to scavenge a 486 from trash parts and with the help of NetZero (and a little sneaking around my parents), scored a net connection. Geocities gave me my first web home, and I started my first blog. I wrote posts in a text file and published by running ‘make install’. Staticgen before it was cool.

Key things I remember liking about ‘social networking’ in the days of Geocities and later MySpace:

  • Webrings formed small networks of people with similar interests and cool information.
  • Newsgroups provided amazing access to experts and connections with similar interests.
  • E-Mail was used for more than verifying accounts.

Things I remember sucking:

  • Connecting with real-world friends was generally e-mail only
  • Newsgroups were full of self-styled experts and many weren’t kind to n00bs.
  • Technical barriers were significantly higher than before.
  • Slowwwwwwwww.

I recently made the effort to join Mastodon. I don’t see it replacing Facebook or Twitter for anyone I know. I had hoped to maybe find a viable alternative to Voxer – not so much there either. Indeed, I don’t see much of use for me as an English speaker except maybe meeting some interesting folk. But mostly I’m seeing a bunch of young left wingers, and I don’t have much patience for the college crew today. Indeed, I’d venture a significant number of people I’ve interacted with weren’t even born when I first got to drive on the internet. Not much more to say, because someone else already wrote my experience with the problems clearly highlighted.

I don’t care for where Facebook is going these days – and I’d love to see the “next” big thing be something that fulfills some of the early promises of the internet. To me, that means choice of service provider and the ability to contact people from other providers.

I find the underlying technology interesting, and there’s a lot of awesome research potential and algorithmic work possible in this space. I thought I’d try to cover my exploration here on this blog. It’s as good a place as any for these brain dumps, and maybe some Zuckerberg character somewhere can use it to help build something. If you’re that person, cut me in after you make the money, please.