Episode 5 – Ethics of Software Programmers

Today I discuss the ethical implications of software development. We’ll take a look at some of my personal experiences and two new Netflix documentaries: The Social Dilemma and Challenger: The Final Flight.

Episode 4 – Fall 2020 Programming Developments

A look at major new developments in the coding world for C/C++ – plus some brief talk on Amazon Bottlerocket and Microsoft’s Hyper-V advancements.

Episode 3 – To College or not to College?

For new and aspiring developers – in this episode I answer the #1 question asked by future coders: should I go to college? The pros and cons of a college degree and the university system.

Contents:

  • The Cold Truth of Job Seeking
  • Beware of For-Profit Systems
  • Selecting a Degree
  • The Actual Meaning of a Computer Science Degree

Episode 2 – The Decline of Usenet

We return to the early ’80s and the ’90s to discuss the rise of consumer access to Usenet, the service’s subsequent decline, and its ascent to the ranks of the undead.

Contents:

  • Show introduction
  • Merging of networks
  • SPAM, SPAM, more SPAM
  • A brief analysis of the “Decline”

Resources:

Sources:

Credits:

  • Music by Audionautix.com – Creative Commons licensed music and commercial options available.

Episode 1 – The Creation of Usenet

Today, we go back in time to the late ’70s and observe the events that led to the creation of newsgroups and Usenet. Since this podcast covers historic events, factual corrections will be attached to the original post and/or episode as required.

Contents:

  • Introduction
  • The A News Reader
  • The B News Reader
  • Creation of UUNET
  • The C News Reader and NNTP
  • Lessons Learned

Sources in essay form will be included as a follow-up to the next episode.

Introducing the Podcast

After repeated encouragement from friends and coworkers, I’ve decided to take a dive into the world of podcasting. We’ll see whether this is something I can keep going – but for now, I introduce the first episode of my podcast for this site.

Show contents:

  • Introduction
  • Inspiration
    • The suckless.org software project
    • Rebuilding fun in open source development
    • Venting to the world
  • Future content
    • Distribution and software ecosystem coverage
    • Computer science topics, experiments, and interesting papers
    • New developments in programming (C/C++) and technology

Distro Thoughts #1

So, after some consideration, I’ve decided to resurrect my previous efforts at building a Linux distribution – mostly because I’d like to tinker with a lightweight Linux that’s easily customizable, something that really goes “back to basics”.

My first experiments were attempting to bootstrap a Clang/Musl build variant. Ugh!

My initial environment was Ubuntu 18.04 – with a modern C++ toolchain. I thought it’d be easy to populate a chroot environment, especially with no cross-compiling required. The LLVM code looks generally pretty clean – big, it does a lot, but clean. The build system, though? See my previous comments on build system messes. The ‘one repo’ approach assumes a rather complete environment and does not bootstrap well to a new rootfs. I thought a hacky compile-from-source build would be neat – but it does not seem doable.

Too much time spent on this today, time to go outside and enjoy the sun.

Exploding Complexity

Lately, I’ve been thinking a lot about embedded Linux, Yocto, and what constitutes an “embedded” system. My first PC was an 8086 clone with two 5¼-inch floppy drives and no hard drive. I had a small collection of disks I’d trade out for my projects.

I remember running Linux on an old 386 with 4 MB of RAM. I browsed the web with a 486 sporting 8 MB of RAM. In more recent memory, I ran a few machines with 512 MB to 1 GB of RAM that managed to browse the web and accomplish most of what I do on a day-to-day basis – editing code and writing.

What exactly are programs doing today that they sit eating gobs of RAM? The smallest program running on my desktop right now is the new Windows 10 terminal application – and THAT still takes up 10+ MB of RAM. The argument is “they do so much more”, but I don’t buy it. Software has jumped the shark.

I stumbled upon the people at suckless.org recently. While I don’t share their love of weird tiled user interfaces, I do find myself looking over their lists of software and thinking: why can’t Linux run halfway decently on Windows XP–era hardware? It’s been a long, long time since I heard about restoring older PCs to service by switching operating systems.

Professionally, one of the larger code bases I deal with is still in C – and I find myself yearning for C++ as I stare at piles and piles of copy-pasted, poorly maintained crap. That said, as I look over the software landscape, I’m not sure it’s language choice that matters. Old C code generally has corners of repeated if(error) checks, switch-statement / function-pointer polymorphism, and the occasional macro-based non-template (or worse, magic with untyped pointers). We can use C++ to good effect to reduce, eliminate, or clarify these situations. But instead of enabling cleaner code, we create brand new realms of further complexity with undebuggable template metaprograms and spiraling exception handling. At least in C it’s obvious if you didn’t check a malloc return – C++ happily throws std::bad_alloc and you’re off to the complexity farm. Bring in the even newer, happier cohorts of Perl, Java, and now Python, JavaScript, and extended Java / C#. Adding extra gas to the fire is web-style package management with NuGet, pip, and npm.
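To put a toy shape on the kind of corner I mean (the codec and all names here are made up for illustration, not taken from any real code base), the hand-wired function-pointer table and the interface it usually wants to be look roughly like this:

    #include <cstdio>

    // C-style "polymorphism": a hand-wired function-pointer table,
    // with the if(error) check repeated at every call site.
    struct codec_ops {
        int (*encode)(const char* in, char* out, int len);
    };

    int rle_encode(const char* in, char* out, int len) {
        if (!in || !out || len <= 0) return -1;  // the ever-present error check
        out[0] = in[0];                          // stand-in for the real work
        return 1;
    }

    // The C++ cleanup: the same contract stated as an interface the
    // compiler enforces, instead of a struct of pointers maintained by hand.
    class Codec {
    public:
        virtual ~Codec() = default;
        virtual int encode(const char* in, char* out, int len) = 0;
    };

    class RleCodec : public Codec {
    public:
        int encode(const char* in, char* out, int len) override {
            return rle_encode(in, out, len);
        }
    };

    int main() {
        const codec_ops c_style{ &rle_encode };
        char out[8];
        if (c_style.encode("abc", out, sizeof out) < 0) return 1;

        RleCodec cpp_style;
        return cpp_style.encode("abc", out, sizeof out) < 0 ? 1 : 0;
    }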

All of this is to “manage” the complexity of software. And yet, at the end of the day, computers today are still driven by keyboard and mouse – or, even simpler, a touchscreen. The actual new complexity – high-bandwidth modems, expanded security concerns, increased screen resolution – could easily be dealt with using our old C/Pascal-land primitives, OR is already being dealt with that way.

As I sit staring at the build screen of Yocto taking hours upon hours to finish compilation, I’m starting to form a new and strong opinion on software complexity:

  1. Design failures of any given abstraction geometrically increase the complexity of successive layers failing to address the design failure.
  2. The ability to complicate software scales linearly with the productivity gain of an abstraction.

All hope is not lost, though – developers can choose to manage complexity and ease its impact with the same tools that allow that complexity to exist in the first place. Unfortunately, I don’t think this is happening. Indeed, the combination of opinions 1 & 2 – and the current state of code quality (which seems to be an industry-wide pandemic) – indicates a final concluding observation:

Applying improved productivity technology to legacy software conceals the underlying design flaws of the system.

I feel Linux in general is reaching an odd ‘critical mass’, where the ancient discussions of system flaws and design needs are catching up to us. Meanwhile, I’m starting to see the point of my long-term colleagues who hate C++. Going from C to C++ now seems like digging with a shovel versus a Bobcat. Sadly, I don’t think the C++ world generally brings the level of discipline the mess demands.

C++ Has Me In a Funk

After spending the past year or so developing large Python applications, I’ve returned to the C++ fold for my day-to-day work. For a long time, I developed primarily in C++ – enough that the compiler became my tool of choice for simple automation tasks over the more logical scripting platforms. Large applications in a scripting language? Lunacy. And now, returning to C++ feels painful. Time to figure out why. I’m hoping this introspection uncovers some improved style for my C++ implementations.

Build Systems All Suck

Being interpreted, Python doesn’t require any build system. Drop in a .py file and it’s off to the races; drop an __init__.py into a directory for a package. Simple. Python distribution utilities generally amount to a list of required dependencies readily served up from a PyPI repository. In C++ land, we’ve got CMake, Waf, Autotools, plain Makefiles, MSBuild, and any number of other alternatives. Pick any one, and soon you’ll be staring at the monitor wishing you’d chosen differently when some IDE, library, or maintenance task gets in the way. Unfortunately, I’m not sure there’s anything to be done here short of inventing yet another new standard.

Dynamic Typing vs Generics, Virtual Methods, and Templates

Dynamic typing with static hinting / analysis just feels more natural than a world of static types with no real introspection. Concepts may help bridge the gap here, as I find myself mostly treating all my Python functions as accepting concept-like inputs. Further, since all the classes are dynamically bound, there’s no need for a central module to be aware of a plugin module’s class type or to define an interface for it. Perhaps a solution here would be a combination of CRTP and C++20 concepts – or maybe helper methods? Some experimentation is needed. SFINAE substitutions may be able to aid with attribute lookup using named class members.
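As a rough sketch of the concepts angle (the Named concept and the plugin types are hypothetical, purely for illustration), the concept simply writes down the duck-typed contract a Python function would just assume:

    #include <concepts>
    #include <iostream>
    #include <string>

    // "Anything with a name() I can treat as a string" -- the duck-typed
    // contract, stated as a C++20 concept.
    template <typename T>
    concept Named = requires(const T& t) {
        { t.name() } -> std::convertible_to<std::string>;
    };

    // No central interface class, and the core code never needs to know
    // the plugin's concrete type ahead of time.
    struct AudioPlugin { std::string name() const { return "audio"; } };
    struct VideoPlugin { std::string name() const { return "video"; } };

    void announce(const Named auto& plugin) {
        std::cout << "loaded: " << plugin.name() << '\n';
    }

    int main() {
        announce(AudioPlugin{});
        announce(VideoPlugin{});
    }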

Coroutines and Async

C++20 is hopefully going to help here – but I’ve yet to see a good example of (or try out a compiler supporting) the new coroutines spec. Further, C++20 support here seems embryonic and tailored HIGHLY toward library authors. The more I’ve worked in Python, the more I find myself utilizing generators, lately async generators, and in some cases full-on async coroutines. Database searches, network operations – so much easier with the ‘await’ keyword than traversing seas of callbacks.
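To show just how library-author-flavored it is, here’s a minimal generator sketch as I understand C++20 (the Generator type is entirely my own scaffolding – there is no std::generator until C++23): everything above the counter() function is boilerplate you have to write before co_yield is usable at all.

    #include <coroutine>
    #include <iostream>
    #include <optional>
    #include <utility>

    // Hand-rolled promise/handle plumbing -- the "library author" half.
    template <typename T>
    struct Generator {
        struct promise_type {
            std::optional<T> value;

            Generator get_return_object() {
                return Generator{std::coroutine_handle<promise_type>::from_promise(*this)};
            }
            std::suspend_always initial_suspend() noexcept { return {}; }
            std::suspend_always final_suspend() noexcept { return {}; }
            std::suspend_always yield_value(T v) { value = std::move(v); return {}; }
            void return_void() {}
            void unhandled_exception() { std::terminate(); }
        };

        explicit Generator(std::coroutine_handle<promise_type> h) : handle(h) {}
        Generator(Generator&& other) noexcept : handle(std::exchange(other.handle, {})) {}
        Generator(const Generator&) = delete;
        ~Generator() { if (handle) handle.destroy(); }

        // Resume the coroutine; nullopt means it ran off the end.
        std::optional<T> next() {
            handle.resume();
            return handle.done() ? std::nullopt : handle.promise().value;
        }

        std::coroutine_handle<promise_type> handle;
    };

    // The part that finally reads like a Python generator.
    Generator<int> counter(int limit) {
        for (int i = 0; i < limit; ++i)
            co_yield i;
    }

    int main() {
        auto gen = counter(3);
        while (auto v = gen.next())
            std::cout << *v << '\n';   // prints 0, 1, 2
    }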

File and Module Namespaces

This one has me pondering a new scheme of defining per-file namespaces in C++ and employing using-declarations in top-level ‘module’ directories. More thought is needed – but my initial feeling is that there’s a certain niceness to each Python module polluting only its own namespace. In C++, a single “using” statement can derail a whole include-train traversal. Worse, you’ve got to worry about any third party throwing their crap in as well.
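A rough sketch of how that scheme might look (the proj/widget names and file layout are hypothetical, and both “files” are collapsed into one translation unit so it compiles standalone):

    #include <iostream>
    #include <string>

    // widget.hpp -- each file owns exactly one namespace, the way a Python
    // module only pollutes itself.
    namespace proj::widget {
        struct Widget { std::string name; };
        inline Widget make_widget(std::string name) { return Widget{std::move(name)}; }
    }

    // proj.hpp -- the top-level "module" header re-exports just the names it
    // wants, instead of a blanket using-directive leaking into every includer.
    namespace proj {
        using widget::Widget;
        using widget::make_widget;
    }

    int main() {
        proj::Widget w = proj::make_widget("gadget");
        std::cout << w.name << '\n';
    }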

Missing REPL in the Debugger

In data analysis, there’s no end to the power of a well-populated workspace. The primary benefit I find in Matlab is simply the data-visualization toolset and the ability to have a workspace you’re actively manipulating – and saving small chunks of to use later. My Python development often sees one window left open with a REPL, where I’m continuously trying new code segments and verifying operation.

List Comprehensions

This is right up there with generators and coroutines: the ability to build a list directly instead of appending to one, especially since it can be done internally with an iterator, allowing filtering and mapping without large intermediary data structures. I’ll also need to check this out against C++20.
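For reference, a small sketch of how C++20 ranges appear to map onto the same idea – lazy composition with filter/transform, materializing only at the end (std::ranges::to doesn’t arrive until C++23, so views::common plus the iterator-pair constructor bridges the gap here):

    #include <iostream>
    #include <ranges>
    #include <vector>

    int main() {
        // Python: squares = [x * x for x in range(10) if x % 2 == 0]
        // The C++20 views compose lazily, like a generator expression...
        auto even_squares = std::views::iota(0, 10)
                          | std::views::filter([](int x) { return x % 2 == 0; })
                          | std::views::transform([](int x) { return x * x; });

        // ...and only materialize when asked for a container.
        auto common = even_squares | std::views::common;
        std::vector<int> squares(common.begin(), common.end());

        for (int s : squares)
            std::cout << s << ' ';   // 0 4 16 36 64
        std::cout << '\n';
    }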

I’ll reserve some space to whine more later – but I think this covers at least the top-level points from a language standpoint. I haven’t talked about the elephant in the room that is PyPI and the current, rather sad C++ ecosystem by comparison. That said, with my current project, so much of the code was developed in-house that using an off-the-shelf framework like Qt and following up with entirely custom development would be roughly comparable.

Whatever your language of choice, diving into another land and trying to bring back some ideas with you is a powerful way to improve your skill set. Good hunting.

On COVID Research…

Over the past day, I’ve seen multiple articles (three, four, five?) from different people posted on COVID – generally posted with an agenda of either orange-man-bad or lizard-people-left-wing-liberal-conspiracy. I’m not gonna get into the specifics of these debates. I try VERY hard to keep my life positive and mostly non-political. Personally, I’m trying to wear a mask out in public, keep hand sanitizer in the car, and actively avoid any/all social situations. The biggest risk I’m taking these days is occasional to-go food and potentially getting a beard trim from my barber after the 18th – which I’m only considering because I know how fastidious he is.

Instead, I’d like to talk about basic “scientific” literacy. Long-running crises like COVID move too slowly for the media to cover on a 24-hour basis. Research takes months or years if done right. Building a simulation or model isn’t like an episode of CSI or Numb3rs. If you see an article talking about “new study says” or “study claims” or generally anything “study” related – take it with a grain of salt. A number of researchers are rushing to publish “findings” right now, and the news will happily take those that match THEIR viewpoint, dress them up, and push them as some sort of huge revelation. They aren’t. Don’t confuse the media’s search for content with in-the-field development.

Coming from the standpoint of someone who’s actively developed forecasting models and lately done a lot of cloud processing with spatial data: models are hard. They are also limited. I’ve seen multiple recent posts on conflicting models saying conflicting things, drawn from preprint papers waiting to be peer reviewed and published. Without thorough review of the collected data, methodology, and analysis, these “reports” are near worthless. In data science, epidemiology has had something of a gnarly reputation as a weak science: poor documentation of method, lack of reproducible results, significant examples of p-hacking, poorly documented data sources, “studies” consisting of largely anecdotal or retrospective data with significant bias, etc. Once we get out of today’s weeds, I’m hopeful that we’ll see better development, with increased interest and more realization of the importance of the field. But – if I’m objective – at this point, the “models” have been mediocre at best.

TL;DR – Please, please, PLEASE stop shouting at each other over SCIENTIST SAYS THING. We all have our own risk tolerance and concern over relatives or the economy. Chill, sit down, and don’t be a douche. Realize that personal space is now 6′ for a lot of us. You’re gonna see people wearing masks – they may even be Orange-Man voters. If someone isn’t wearing a mask, maybe they have a respiratory issue or some reason they can’t / aren’t. Whether you favor the shutdown or the reopening, realize there are some REALLY good points on both sides and that it’s OK to disagree.