BaseML is for writers. As an ultra-light version of CommonMark and John Gruber's original Markdown, it diverges from Markdown's “overriding design goal ... to make [the source] as readable as possible” in favor of making it as fast as possible to learn, write, organize, parse, render, and transfer.
Inspired by modern writing, social media, and self-publishing services, BaseML is ideal for note taking, writing articles, blogging, and building conversational user interfaces. In fact, BaseML, rendered or raw, can usually be directly cut and pasted into another editor with little or no change.
Gruber emphasized readability of Markdown for understandable reasons. But open just about any Markdown file today and you will see things like soft-wrapping lines, use of # for headers, and other least-difficulty-to-write decisions by those who write it. Markdown has evolved into the syntax of knowledge source. Knowledge coders (aka writers) like a clean syntax that is, above all, quick to write, easy to understand, and fast to parse. This allows their code to be rendered and published in the most places.
Unfortunately most Markdown writers are also coders. As one would expect, they have taken the pure simplicity of readable Markdown and bastardized it with extensions into virtually unreadable source code. Gruber would probably shake his head at this (which might explain his lack of involvement in CommonMark). Some files today have been so overly extended and adapted that they no longer resemble anything close to Markdown.
BaseML seeks to strike a happy medium by giving up some source readability for writing efficiencies and streaming without throwing readability completely out the window.
Perhaps the biggest reason to consider BaseML is the notion of transitive conversion which means going back and forth between rendered and raw BaseML.
Any rendered BaseML, on GitHub for example, can be copied and pasted directly into blogging sites without a problem.
The same is true in reverse. Most blog content that is cut and pasted from rendered HTML can be easily converted to BaseML for use elsewhere.
The applications of this are many. Most importantly this allows the core content to be maintained as a GitHub repository and simply copied by dragging and dropping. This is common when a writer wants to maintain his or her own content but enjoy the benefits of posting to a specific blogging site or community.
A Modern Markdown for an Internet of Things
BaseML is great when you don't want to bloat your application with a full CommonMark parsing and rendering engine, for example when developing anything that does not need a full web pane to render, as with Qt or GTK+. This opens up many possibilities for things like
- light, open reader apps,
- modern writing standards,
- an essential web of signed, light-weight content, or even
- a decentralized knowledge net with no web dependency at all.
One Best Way
BaseML gets out of the way and lets you create. This seems to be one large motivator for the original Markdown, which was created by a writer and podcaster. BaseML, however, sacrifices redundancy for simplicity. There is only one way to do something in BaseML, which reduces the cognitive overhead and allows you to focus on your content and message. Since the simplifications are based on well-established best practices, you need not worry.
💬 This constraint is exactly the reason behind the success of Medium.
Down, Not Up
Modern web development favors a JAMstack approach to make best use of content delivery networks and offline-first design.
M is for "Markup" but could equally mean "Markdown" since most web content starts out written in it. Because BaseML is the fastest way to write Markdown it is therefore the quickest way to create JAMstack sites and applications. In fact, with as little as a single README.md file and VuePress you instantly have a progressive web app that is ready for JAMstack deployment.
Inline Parsing, Streamable
Unlike every other Markdown flavor, which all require at least two passes through the entire document to parse properly, BaseML elements have no interdependency on one another (such as with reference-based links). This means parsing can be streamed and processed reliably with parse-event-driven handler callbacks (like SAX) to produce a stream of nodes that can be rendered or piped immediately. This opens possibilities heretofore impossible with existing Markdown formats.
Streamed parsing requires a fraction of the memory required by other parsers. The parser only needs enough memory to hold the currently open nodes, meaning parsers can be implemented on very small devices, even potentially streamed on a single-row LED display.
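The callback style described above can be sketched roughly as follows. This is only an illustration, not the BaseML reference parser: the event names (`heading`, `list_item`, `paragraph`), the helper names `stream_parse` and `collect_events`, and the tiny syntax subset (`# ` for a heading, `- ` for a list item, a blank line to end a paragraph) are all assumptions made for the example.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical handler signature: (event name, text). Not taken from the spec.
Handler = Callable[[str, str], None]


def stream_parse(lines: Iterable[str], emit: Handler) -> None:
    """Parse one line at a time, emitting events as soon as they are known.

    Assumed subset (for illustration only): '# ' starts a heading,
    '- ' starts a list item, a blank line closes the open paragraph.
    """
    paragraph: List[str] = []  # the only state kept: the currently open paragraph

    def close_paragraph() -> None:
        if paragraph:
            emit("paragraph", " ".join(paragraph))
            paragraph.clear()

    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("# "):
            close_paragraph()
            emit("heading", line[2:])
        elif line.startswith("- "):
            close_paragraph()
            emit("list_item", line[2:])
        elif line.strip() == "":
            close_paragraph()
        else:
            paragraph.append(line.strip())
    close_paragraph()  # flush whatever is still open at end of stream


def collect_events(text: str) -> List[Tuple[str, str]]:
    """Convenience wrapper that gathers events into a list for inspection."""
    events: List[Tuple[str, str]] = []
    stream_parse(text.splitlines(), lambda kind, body: events.append((kind, body)))
    return events
```

Because `stream_parse` accepts any iterable of lines, the same function could be fed from a file, a socket wrapper, or a generator, and the handler could render or forward each node immediately instead of collecting it.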
Other parsers can be added using a render pipeline model (such as to LaTeX/MathJax, web components, VSCode extensions, or HTML rendering).
When combined with BaseML's front matter, several BaseML documents can be reliably streamed over a single open network connection.
The BaseML specification is to be submitted to the IETF and other standards bodies. While Leonard's IETF submissions clarify the dilemma with Markdown, no version of Markdown has been submitted to date. This is not surprising given the monumental amount of work just to arrive at the CommonMark consensus. The time is right for a light-weight minimal version of Markdown to be submitted. The ability to stream BaseML makes it particularly well-suited for an IETF standard as more Internet content is streamed over open network connections. No other existing Markdown version allows this.
Minimal blogging services with constraints on formatting have successfully demonstrated that readers and writers of all types prefer a consistent, predictable format to allow the focus to be on the content, the writing. The cognitive overhead to learn and process a different presentation unnecessarily distracts writers from their writing. This makes BaseML particularly useful when paired with frameworks like VuePress for documentation and educational content and distribution.
No Regular Expression Parser Dependency
Regular expressions are great tools, but they are horrible when built into a production parser. The size of the regex library alone puts most parsers that would depend on it out of reach for many small devices with parsers written in C. Therefore, even though the entire BaseML specification can be represented as a single complex regular expression, parsers should never use regular expressions for anything other than experimentation.
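As a rough illustration of scanning without a regex engine, the sketch below finds inline links using only plain index searches. It assumes the familiar Markdown `[text](address)` form, which the BaseML specification may restrict or change, and the function name `find_links` is hypothetical. It is written in Python for brevity; the same index-based loop carries over to C with `strchr`-style searches.

```python
from typing import List, Tuple


def find_links(line: str) -> List[Tuple[str, str]]:
    """Return (text, address) pairs found in one line, using no regex.

    Assumption: links look like [text](address). Nested brackets and
    escapes are deliberately not handled in this sketch.
    """
    links: List[Tuple[str, str]] = []
    i = 0
    while i < len(line):
        start = line.find("[", i)
        if start == -1:
            break
        close = line.find("]", start + 1)
        if close == -1:
            break
        if line[close + 1:close + 2] != "(":
            i = start + 1  # a bracket pair that is not a link; keep scanning
            continue
        end = line.find(")", close + 2)
        if end == -1:
            break
        links.append((line[start + 1:close], line[close + 2:end]))
        i = end + 1
    return links
```

Note that this sketch would also report an image written as `![alt](src)` as a link; a real parser would check the preceding character.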
Suggested Semantic Macrostructure
Semantic macrostructure is the idea that the structure of the content itself provides meaning and levels of importance. No MIME headers, properties, JSON, hashtags or extra keywords and categories are needed. The content itself provides this. When additional meta information is needed, front matter can be used as well.
💬 Semantic macrostructure cannot be gamed by those looking to manipulate an SEO ranking without being completely obvious to anyone reading the content.
Here is a suggested relevance ranking for any BaseML document. The primary and secondary data can be read and ranked before any other parsing even takes place.
- summary (first paragraph)
- image alt text
- list content
- subdocs (blockquotes)
- fenced content
- link addresses
Links can be easily identified and cataloged as they are parsed as well, providing a collection of all outbound links that can be checked for validity or extracted into an index.
This set structure allows creators of BaseML search engines to easily give users the ability to set their own search priorities without any guesswork about how things are ranked.
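One way a search tool might apply such a ranking, including user-set priorities, is sketched below. The default order simply mirrors the suggested list above; the node-type labels (`summary`, `image_alt`, and so on), the `(kind, text)` node shape, and the function name `rank_nodes` are assumptions made for the example, not part of the specification.

```python
from typing import List, Sequence, Tuple

# Hypothetical labels, in the same order as the suggested ranking above.
DEFAULT_ORDER = ["summary", "image_alt", "list", "subdoc", "fenced", "link_address"]

Node = Tuple[str, str]  # (kind, text); an assumed shape for parsed nodes


def rank_nodes(nodes: Sequence[Node], order: Sequence[str] = DEFAULT_ORDER) -> List[Node]:
    """Sort nodes by relevance.

    Kinds missing from `order` sort last; nodes of equal rank keep
    their document order because sorted() is stable.
    """
    position = {kind: index for index, kind in enumerate(order)}
    return sorted(nodes, key=lambda node: position.get(node[0], len(order)))
```

A user who cares most about outbound links could pass their own order, for example `rank_nodes(nodes, ["link_address", "summary"])`, and every other node type would fall to the end in document order.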