Research Comments 0x01: Modelling Text

I’ve found my growing collection of “Note.txt” files floating around in random directories to be an unbearable way of keeping notes. I’ve debated moving those into some sort of Wiki form – but at present, I’ve decided to move documents to this blog in the hopes that others might find them useful. I’ll be labelling these posts as “Research Comments” and numbering them in publish order for my own reference. These documents are not intended to be academic or authoritative in nature – they are research notes collected with links to other documents.

As I type into the new fancy “block based” WordPress editor, I’m reminded of the complexity of HTML vs the simplicity of a plain text block control. WordPress enables fancier things in blocks, but for a long time (and for many purposes) simple text blocks were all we had.

I’m not so sure that the box layout model of CSS is really the “best” thing when it comes to supporting simple text layout – and it definitely was not intended (originally) to compete with the advanced manual typesetting that might be done by a publisher. Still, today’s graphic artists are forced to utilize a system developed by 90s era nerds without much concern for the needs of the typesetting industry.

For forum software, this presents a unique set of challenges: we want the ability to include a substantial number of tags and, at the same time, limit the feature set to preserve the overall representation of the page. Early social media giant “MySpace” provided minimal filtering, allowing teenagers to customize their pages to extreme levels – at times completely replacing significant elements of the MySpace UI. Unfortunately, not much is left of the ‘old web’ to point to, but it does make for some fun discussions on current forums.

We can group approaches utilized by social networks into several major camps. Facebook and Twitter provide “text only” modelling with the addition of metadata to allow a degree of enhancement (e.g. link detection, post backgrounds in FB, ‘moods’, and attached images). This model allows a substantially simplified user interface (no need to deal with formatting shortcuts) – but at the cost of the user’s ability to represent more complicated text with inline links. The second camp, utilized by major web forums, accepts text in some other markup language with limited functionality (e.g. BBCode). Translating this syntax into HTML removes the need to detect the errors and extended cases that may be problematic when utilizing HTML directly. Finally, the rare case is a system that allows direct HTML / CSS with (hopefully!) filtering.
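To make the second camp a bit more concrete, here is a minimal sketch of translating a BBCode-like syntax into HTML. It is not how any particular forum package does it: the tag list, the `_RULES` table and the `bbcode_to_html` name are my own assumptions for illustration. The idea is to escape everything first, then re-introduce only a small whitelist of tags.

```python
import html
import re

# Hypothetical whitelist: each supported tag maps to a fixed HTML template.
# Anything not listed here is never turned into markup.
_RULES = [
    (re.compile(r"\[b\](.*?)\[/b\]", re.S | re.I), r"<strong>\1</strong>"),
    (re.compile(r"\[i\](.*?)\[/i\]", re.S | re.I), r"<em>\1</em>"),
    # Only http(s) links are accepted, so "javascript:" URLs never match.
    (re.compile(r"\[url=(https?://[^\]\s]+)\](.*?)\[/url\]", re.S | re.I),
     r'<a href="\1">\2</a>'),
]


def bbcode_to_html(text: str) -> str:
    """Escape all user input, then expand the whitelisted tags."""
    # Escaping first means raw <, >, & and quotes can never reach the page.
    result = html.escape(text, quote=True)
    for pattern, template in _RULES:
        result = pattern.sub(template, result)
    return result


if __name__ == "__main__":
    print(bbcode_to_html("[b]Hello[/b] <script>alert(1)</script>"))
```

Because the user never writes HTML directly, unknown or malformed tags simply stay on the page as literal text instead of needing to be detected and filtered.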

One of the more interesting ideas I’ve seen is the utilization of TeX under the hood to allow better document markup potential than HTML. When working on complex documents with extensive sourcing, the tooling TeX provides is invaluable. The resultant TeX documents also tend to “look” fairly decent – so long as one is careful to utilize modern features and fonts when creating finished documents.

For word processor / document based formats, several major techniques appear to exist. Confluence utilizes a subset of HTML intended to allow better “flow” of document text. This subset includes additions for elements such as images and tables. The table model itself allows nesting, but does not allow control of text flow around the table itself (the table is treated as a breaking paragraph). Historically, Word has hidden the underlying makeup of documents from users as much as possible, but WordPerfect provides detailed commentary as to the internal “codes” used for document markup. Adobe FrameMaker text bears a degree of similarity to HTML, but is likely better compared to TeX in operation.

After Hours and Season 2 Preview (Ep 14)

In this episode I’ll talk a bit about plans for the next season of Zombie Coder, and review some of the lessons learned over the past several months. Look for more episodes to resume after the new year!


CS Topics: Welcome to Hash World (Ep. 13)

In this episode, I conclude a series on Merkle Trees – or, the key technology and ideas behind distributed systems. I hope this episode sparks your imagination about the potential applications of the distributed web.
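For readers who prefer code to audio, here is a small sketch of how a Merkle root can be computed. It is not taken from the episode: the function names are my own, SHA-256 is an assumed choice of hash, and duplicating the last node on odd-sized levels is just one of several conventions in use.

```python
import hashlib
from typing import Sequence


def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Hash each leaf, then hash pairs upward until one node remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed convention: duplicate the last node on odd-sized levels.
            level.append(level[-1])
        # Each parent is the hash of its two children concatenated.
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    blocks = [b"block-0", b"block-1", b"block-2"]
    print(merkle_root(blocks).hex())
```

Changing any single leaf changes the root, which is what lets two peers compare large data sets by exchanging one short digest.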


Avoiding Death March Projects

What is a death march project? Recognizing death march projects is easier than you might expect, and avoiding them simply means setting your own boundaries.

And remember:

Don’t work for Sh*tBags!

CS Topics: Rand and Xor Magic

In this week’s episode, we dive a bit deeper into the underlying theory behind how random numbers work and the importance of the XOR operator when creating hashes.

This episode does depend on pre-existing knowledge of basic boolean operators. If you aren’t familiar with them, check out this page.
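One way to see why XOR is a good fit for mixing bits is a quick experiment. The sketch below is my own illustration rather than anything from the episode, and the helper names are made up: it counts how many distinct outputs each operator can produce over all byte values with a fixed key, and shows that XOR can be undone by applying it a second time.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings together."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def distinct_outputs(op, key: int) -> int:
    """Count distinct results of op(value, key) over every byte value."""
    return len({op(value, key) for value in range(256)})


if __name__ == "__main__":
    key = 0b10101010
    # XOR maps the 256 inputs onto 256 different outputs: nothing is lost.
    print("xor:", distinct_outputs(lambda v, k: v ^ k, key))
    # AND and OR collapse many inputs onto the same output.
    print("and:", distinct_outputs(lambda v, k: v & k, key))
    print("or: ", distinct_outputs(lambda v, k: v | k, key))

    message = b"zombie"
    pad = b"\x13\x37\xc0\xde\x42\x99"
    # Applying the same pad twice returns the original message.
    print(xor_bytes(xor_bytes(message, pad), pad))
```

AND and OR throw information away, while XOR only rearranges it. That property is why it shows up so often when combining values inside hashes and random number generators.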


Nix News: Embedded Linux Update

Today I’ll cover my major takeaways from a couple of embedded Linux conferences that I (virtually) attended.


  • Let’s be positive
  • All about embedded Linux
  • Be prepared: Containers are coming! (Someone please make docker-lite for embedded!)
  • RISC-V is up and coming real soon now
  • Learn VSCode if you want a common IDE experience
  • Diversity thoughts – there’s a person behind the email
  • Open Source maintainers ask for our patience
  • Check out Virtual Conferences in 2020


Special Thanks to:

  • – Rockin’ zombie track

Hope for the Camel’s Second Hump

Today I offer some advice for CS and Engineering students currently staring down the barrel of failing midterm grades.

CS Topics: Hash Functions

Today we cover one of the primary building blocks for blockchain – Hash Functions.
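As a quick taste of what a hash function does, here is a short sketch using Python's standard `hashlib` module. The choice of SHA-256 and the helper names `digest` and `differing_bits` are my own assumptions for illustration, not something taken from the episode.

```python
import hashlib


def digest(message: str) -> bytes:
    """Return the 32-byte SHA-256 digest of a text message."""
    return hashlib.sha256(message.encode("utf-8")).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


if __name__ == "__main__":
    first = digest("zombie")
    second = digest("zombif")  # one character changed
    print(first.hex())
    print(second.hex())
    # The output length is fixed no matter how long the input is.
    print(len(first) * 8, "bits per digest")
    # A tiny change in the input flips a large share of the output bits.
    print(differing_bits(first, second), "bits differ")
```

The same input always yields the same digest, the digest size never changes, and similar inputs produce unrelated-looking outputs. Those are the properties blockchains lean on.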


Episode 7 – XLS, XLSX What is Difference?

Today I cover a small “oops” from the UK health center, and extend some thoughts on its relationship to “Real Programming”.