Forth ‘n Chips

Toronto Skyline Wide 2014


In the frenetic, almost frantic world of modern software development, it’s easy to run right past good, solid practices. Newer is better, almost by definition. Often the worst place to be, at least career-wise, is defender of the past. A.I. is a good example. Symbolic systems are seen as stone-age, and the headlong rush into generative systems and Large Language Models is a siren song. However, a moment’s pause may be in order.

Modern methods may indeed be the correct course, but we didn’t get here by chance or luck. Something came before. It takes much more than labeling things to fully grok them. Consideration from first principles can re-calibrate one’s perspective. This was a touchstone of scientists like Richard Feynman and Michael Faraday. Feynman advised that being able to create something, or at least explain it in simple terms, showed real understanding well beyond just knowing names and definitions. Faraday once said to students:

Do not refer to your toy-books, and say that you have seen that before.
Answer me rather, if I ask you, have you understood it before?

The best way to learn how to bake a cake is to actually bake a cake. Perhaps even several of them. Unfortunately, a lot of modern programming languages and frameworks lean heavily on upfront doctrine and formality. We’re asked to ‘trust the experts’ and spend months or years learning from ‘toy-books’ (sometimes written and promoted by those very same experts). That’s a very big risk, with a very long-deferred payoff. Why not first spend a small fraction of that time investing in some first principles thinking?



This brings me to a brief discussion of the Forth language. It’s been many years since I wrote any commercial code in Forth. But it’s been only a few minutes since I thought about a problem from first principles using Forth. What makes it unique, beyond all its quirkiness, Reverse Polish Notation (RPN), stack architecture, concatenative programming, extreme simplicity, etc., etc., is its syntonic nature. Forth enables one to think fully computationally while remaining fully human. That’s one heck of a neat trick. An hour of interactive thinking/exploration/coding in Forth can often produce the kernel of a solution, even to complex problems. Or even serendipitous treasures. No cutting & pasting, mysterious black boxes, or hyper-abstraction is needed. No team is needed. In fact, Isaac Asimov argued in a 1959 essay that isolation is crucial to deep thinking. The team will still be there in an hour, ready to work and interested in anything you can contribute.

 

Picture Permission of Raspberry Pi Foundation


In more extended sessions, I call this mode of thinking “Forth ‘n Chips”. Working with integrated circuits, gates, and even transistors can be quite liberating. Getting away from your routine can open your mind. Some writers enjoy using pen and paper occasionally. It doesn’t mean they’re laptop-hating Luddites. Some hockey players need to leave the video room and go for a fun skate to clear their head and regain their muscle memory. It doesn’t mean they’re unthinking brutes. It means they know how to re-calibrate.

There’s a natural, almost biological feel to Forth. Concatenation is a more reptilian way of thinking than deductive reasoning. Perhaps counterintuitively, subjective thinking can lead to stronger objective thinking. It’s a way to look a bit more carefully at what you’re rushing past.

Walking through a computer history museum, or browsing old magazines and manuals, can give one perspective on the path that led to this frenetic future. Remember that the Homo sapiens who evolved on the grasslands to find food, avoid predators, and raise offspring had basically the same brain as we do today. Fashion comes and goes, but first principles remain.

The Abacus

Huge Abacus at Guohua Abacus Museum


Counting is one of the most powerful human capabilities. In ancient times, common objects such as pebbles were used as abstract symbols to represent possessions to be tallied and traded. Simple arithmetic soon followed, greatly augmenting the ability of merchants and planners to manipulate large inventories and operations. Manipulating pebbles on lined or grooved ‘counting tables’ was improved upon by a more robust and easy-to-use device – the abacus. The name ‘abacus’ comes from Latin, which in turn used the Greek word for ‘table’ or ‘tablet’. The abacus also had the advantage of being portable and usable cradled in one arm, a harbinger of the pocket calculator.

Variants included Roman, Chinese, Japanese, Russian, etc. Most commonly, they worked in base 10 with upper beads representing fives and lower beads representing ones. Vertical rods represented powers of 10, increasing right to left. Manual operation proceeded from left to right, with knowing the complement of a number being the only tricky part (e.g. the complement of 7 is 10 − 7 = 3). The basic four operations (+ − × ÷) were mechanical procedures that were fairly easy to learn. One did not have to be formally educated to learn to use an abacus, as opposed to pencil and paper systems. This was computation for the masses.
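The bead layout and the complement trick described above can be sketched in a few lines of Python. This is only an illustration: the function names `to_rods`, `complement`, and `add_digit_on_rod` are made up for this sketch, and it assumes the common layout in which each rod shows a digit as one possible ‘five’ bead plus up to four ‘one’ beads.

```python
def to_rods(n):
    """Return bead settings for a non-negative integer, leftmost rod first.

    Each rod is a (fives, ones) pair, so that digit = 5 * fives + ones.
    """
    rods = []
    for ch in str(n):
        digit = int(ch)
        rods.append((digit // 5, digit % 5))
    return rods


def complement(digit):
    """Ten's complement of a single digit from 1 to 9, e.g. 7 -> 3."""
    return 10 - digit


def add_digit_on_rod(current, digit):
    """Add one digit to a rod showing `current`; return (new_digit, carry).

    When the rod cannot hold the sum, remove the complement instead and
    carry one to the rod on the left: adding 7 becomes "minus 3, plus 10".
    """
    if current + digit <= 9:
        return current + digit, 0
    return current - complement(digit), 1
```

For example, `add_digit_on_rod(8, 7)` leaves 5 on the rod and carries 1 to the next rod, which is 15.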

The abacus is composed of beads and rods, grouped together into several ‘stacks’. The stack is the central object in concatenative programming languages, such as Forth. These are very well-suited for teaching and learning computational thinking.

The abacus was one of the most successful inventions in history. In fact, it’s still in use today, mostly in small Asian shops. It is a universal and aesthetic symbol of our ancient love of counting.

Syntonic Scaling

“There are three things that we need to focus on as a growing organization: scalability, scalability, and scalability”. If you’re like me, you’ve heard that line at least once. The pace of modern social and technological change is staggering. Most of us struggle just to maintain balance and not be swept away by the tsunami. This is clearly visible in the nurture and guidance of a commercial startup, a new initiative in government, or really any group effort at expansion and/or adaptation. Perhaps the biggest challenge is to grow smoothly without exploding or imploding — put simply, to scale.

The trick is that we’re not honey bees. Brute force scaling is a very lossy process. Forced increase in scale can directly precipitate a reduction in scope. Perspective is lost, diversity is lost, opportunity is lost, horizons shrink, silos are erected. Not good. Our technology is vastly more complex and complicated. A small error that might safely be ‘rounded off’ in a bee colony might bring the entire system crashing down in a human-scale system. Fault tolerance and error correction must be ubiquitous and automatic.

everything fails all the time
– Werner Vogels

There are two main modes of strategic thinking: Deductive and Inductive. Deductive thought employs logic and rationality in order to understand or even predict trends and events. Inductive thought observes the emergence of the complex from the simple to do the same. Both can draw on a mix of mathematical/statistical/historical/computational analysis, although the specific mix is often quite different for the two modes. Both have the aim of making better, more informed decisions. Somewhere in between the two, like the overlapping area of a Venn diagram, is the concept of syntonicity. This is the ability to at least temporarily move one’s mindset to another place/time/perspective. It requires imagination, which is definitely, though perhaps not exclusively, a human capability.

My own area is mostly computational analysis. The vast seas of data available today long ago swamped human capabilities and now require the mechanical tools of automation. The timeline is roughly: counting to writing to gears and rotors (e.g. Antikythera mechanism, Pascal calculator) to Jacquard machine (punch cards) to digital computers (vacuum tubes to transistors to Large Scale Integration to distributed computation) to quantum computers and beyond. Along the way, formal concepts were developed such as algorithms, objects, feedback (e.g. cybernetics), and artificial intelligence. Many tools and languages have been developed over recent decades (then mostly out-grown), with ages and fashion passing by like scenes from an old Time Machine movie. Those of us who have enough years, enough curiosity, and enough patience, have remained engaged in the movie over the long haul. Simultaneously futurists and dinosaurs, I guess. The red plastic case of my trusty childhood pocket radio proudly boasted of its “3 Transistor” innards. Like most others, I now carry a smartphone that has a billion times that many. That’s modern life: we must scale by orders of magnitude between cradle and grave.

How can the human mind grapple with this much scaling? We evolved to find food and avoid predators in grasslands, not to hop among sub-atomic particles, swim through protoplasm, or wander intergalactic space-time. How can we explore and comprehend reality all the way from quantum mechanics to femtosecond biomolecular reactions to bacteriophages to cellular biology to physiology to populations to geology to astronomy to cosmology?
Whither scope?

Things on a very small scale behave like nothing that you have any direct experience about. They do not behave like waves, they do not behave like particles, they do not behave like clouds, or billiard balls, or weights on springs, or like anything that you have ever seen.
– Richard P. Feynman

Most programming languages are quite horrible at scaling. Scaling down is nigh-on impossible because languages have evolved to be ever bigger. Even those that claim to be lean and elegant usually require vast libraries and modules to do anything useful. Scaling up is normally accomplished by bolting on new data types, functions, and capabilities. Examples include objects, functional programming, and exotic techniques such as machine learning, thus making the language ever bigger and often more opaque. Their standardization and doctrine come at a heavy price. Much architecture and machinery is present and required ‘right out of the box’. Methodology is either implicit or actually enforced through templates and frameworks. This provides tremendous leverage when working at the human scale of activity. It squelches innovation though, when one moves too far down or up in scale.


One of the computer languages I learned and used early on was Forth. Although I have discarded it several times over the last nearly five (!) decades, I keep coming back to it. It is a very natural, almost biological language. I have also found it to be a very syntonic one. A crude definition of syntonicity is ‘fitting in’ or ‘when in Rome, do as the Romans do’. This is the key to scaling the applicability of human thought.

At its heart, Forth is incredibly tiny. It’s essentially a method for calling subroutines simply by naming them. It has a simple model for storage management and sharing: the stack. A stack is one of the oldest computational structures, perhaps going back to ancient times (the abacus, for example). Old as it is, it is brilliantly elegant. It combines elemental simplicity with tremendous functionality, a key to high scalability. The entire interpreter and compiler can be implemented in several hundred bytes. Perhaps most importantly, it can be learned, remembered, and used without a bookshelf full of manuals and references. Scaling up is unlimited and quite self-consistent; one basically bootstraps the Forth system like a growing embryo, not like a Rube Goldberg machine. Using this process, Forth can actually become fairly large and capable (see Gforth). Note that scaling Forth and the underlying scale of the environment are orthogonal. The real power and utility of Forth comes from its simplicity. For example, with today’s many-core CPUs, it is possible to implement many separate, even heterogeneous, Forth engines in one computer, fully independent yet still communicating. Try that with a behemoth language or even a hefty virtual machine.
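To make “calling subroutines simply by naming them” concrete, here is a toy Forth-like interpreter sketched in Python. It is an assumption-laden illustration, not how a real Forth is built: the name `make_forth`, the three built-in words, and the handling of ‘: … ;’ are all simplifications chosen for this sketch, and words are case-sensitive.

```python
def make_forth():
    """Build a toy Forth-like interpreter: one data stack, one dictionary."""
    stack = []
    words = {
        "+":   lambda: stack.append(stack.pop() + stack.pop()),
        "*":   lambda: stack.append(stack.pop() * stack.pop()),
        "DUP": lambda: stack.append(stack[-1]),
    }

    def run(source):
        tokens = source.split()
        i = 0
        while i < len(tokens):
            tok = tokens[i]
            if tok == ":":
                # Colon definition ": NAME body ;" adds a new word.
                end = tokens.index(";", i)
                name, body = tokens[i + 1], tokens[i + 2:end]
                words[name] = lambda body=body: run(" ".join(body))
                i = end
            elif tok in words:
                words[tok]()             # call a subroutine by naming it
            else:
                stack.append(int(tok))   # anything else must be a number
            i += 1
        return stack

    return run


run = make_forth()
run(": SQUARE DUP * ;")            # bootstrap a new word from old ones
run(": CUBE DUP SQUARE * ;")       # and another word on top of that
print(run("3 CUBE"))               # prints [27]
```

Each new definition is built only from words that already exist, which is the embryo-like bootstrapping described above in miniature.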

Thusly, and personally, armed with the smallest possible computational toolkit, the freedom to think is restored. Researcher-programmer meetings can be cancelled. The horse can be put back in front of the cart. One can focus on grokking (grasping syntonically) the environment, physics, and inhabitants of the new scale (and thus horizons broaden again).

Of course, I’m not advocating Forth to be used for things like massive data manipulation, replacing tools like SQL, NoSQL, and beyond. Concurrency, seamless replication, automated inferencing, and vast interoperability are somewhat beyond Forth’s capability (though not entirely, surprisingly). Such tasks usually apply to teamwork. Elementary Forth is not a team language. It’s more suited to individual thought and exploration. Isaac Asimov once mused about the benefits of isolation, at least in early stages. Again, we’re not honey bees.

Learning Forth is best done by using it – it’s tiny and simple to start with. If you’re more the reading type, one of the first, and best, books on Forth is Starting FORTH (1981) by Leo Brodie.

Foundations

Building a solid foundation in the early years of a child’s life will not only help him or her reach their full potential but will also result in better societies as a whole.
– Novak Djokovic

I’m not a starry-eyed Isaac Asimov fanboy. He had his warts. But his life mattered in the big scheme of things. I like Asimovians. To me, an Asimovian is a skeptical optimist with deep scientific, historical, and sociological erudition. Some are novices, some are students, some are teachers, and some are leaders. Others are Forrest Gump types who just stumbled into a few of the right rooms, or who read “Foundation” because it was only $1 in their rural school’s bookmobile 🙂

I clearly remember the day. It was blue-sky late spring in rural Ontario, just before the end of the school year. I stood near the front of the bookmobile, on its teetering floor, with “Foundation” in one hand, and “Foundation and Empire”, “Second Foundation”, and $2 (birthday money) in the other. Each had a sticker price of $1. “Foundation” was the thinnest, which seemed unfair, as I had no choice on #1. The arithmetic was heart-breaking. As I struggled to choose between #2 and #3, their covers alternately calling to me, the book lady said, “Those are buy two get one free.” I quickly plunked down my $2 (tax was either included or exempt, I don’t remember which) and ran from the bookmobile with the trilogy like a thief in the night.

I first read those books sitting in a tree on our farm. Tales of a course for mankind stretching into the almost unimaginably distant future. A future where science, rationalism, and humanism hold dominion. The galaxy in decline, yet enlightenment rekindled. A tiny spark of hope that grows into a vast, new, near utopia.

I read them several times over my youth and young adulthood. Hidden behind textbooks, in waiting rooms, while camping, wherever. Like Psychohistory itself, it wasn’t just a story, it was a guide, a ‘Plan’. Several attempts have been made at bringing the Foundation Trilogy to the screen, both large and small. They failed simply because it’s too big a story to be captured on film. It only truly lives in the imagination of the reader. Perhaps someone with enough time, money, and vision will succeed some day. I hope they don’t damage it.

Asimov died in 1992, the year I lived in Vancouver. He was on my mind as I had my first inkling of Geopense. Around 2000, still inspired by Asimov, I created an AI company with one main product. I avoided non-Asimov sequels to the story, and was slightly disappointed even with those penned by Asimov himself. They seemed a bit rushed and contrived. The only later book I liked and would recommend was the final one in the series, Foundation’s Triumph by David Brin. It had Hari Seldon as the central character amidst an epic search for that elusive utopia.

Over the years, I’ve often wondered if the book lady had lied about the price. I like to think she had. Asimovians are a resourceful lot. That’s one of the many reasons why they’ll win in the end, and a brighter future for humanity will dawn.

The picoXpert Story

picoXpert was one of the first (if not THE first) handheld artificial intelligence (AI) tools ever. It provided for the capture of human expert knowledge and later access to that knowledge by general users. It was a simplistic, yet portable implementation of an Expert System Shell. Here is the brief story of how it came to be.

When I was about 10, my grandfather (an accomplished machinist in his day) gave me his slide rule. It was a professional grade, handheld device that quickly performed basic calculations using several etched numeric scales with a central slider. I was immediately captivated by its near-magical power.

In high school, I received an early 4-function pocket calculator as a gift. Such devices were often called ‘electronic slide rules’. It was heavy, slow, and sucked battery power voraciously. I spent many long hours mesmerized by its operation. I scraped my pennies together to try to keep up with ever newer and more capable calculators, finally obtaining an early programmable model in 1976. Handheld machines that ‘think’ were now my obsession.

I read and watched many science fiction stories, and the ones that most fired my imagination were those that involved some sort of portable computation device.

By 1980, I was building and programming personal computers. These were assembled on an open board, using either soldering or wire wrap to surround an 8-bit microprocessor with support components. I always sought those chips with orthogonality in memory and register architecture. They offered the most promise for the unfettered fields on which contemporary AI languages roamed. I liked the COSMAC 1802 for this reason. It had 5,000 transistors; modern processors have several billion. The biggest, baddest, orthogonal processor was the 16- or 32-bit Motorola 68000, but it was too new and expensive, so I used its little brother, the 6809, which was an 8-bit chip that looked similar to a 68000 to the programmer.

I spent much of the 1980s canoeing in Muskoka and Northern Ontario, with a Tandy Model 100 notebook, a primitive solar charger, and paperback editions of Asimov’s “Foundation” trilogy onboard (I read them five times). Foundations.

By the mid-1990s, Jeff Hawkins had created the Palm™ handheld computer. The processor he chose was a tiny, cheap version of the 68000 called the ‘DragonBall’. I don’t know which I found more compelling – this little wonder or the fact that it was designed by a neuroscientist. I finally had in my hand a device with the speed, memory, and portability to fulfill my AI dreams.

The 1990s saw the death of Isaac Asimov (one of my greatest heroes), but also saw me finally gaining enough software skills to implement a few Palm designs. These were mainly created in Forth and Prolog. The Sojourner rover, delivered by the Mars Pathfinder mission in 1997, was based on the same 80C85 microprocessor found in the Tandy Model 100 that I had used over a decade earlier. This fact warmed my heart.

In 2001, I formed Picodoc Corporation, and released picoXpert.

 

Here are: the original brochure, an Expert Systems Primer, and a few slides.

It met with initial enthusiasm from a few, as in this review:

Handheld Computing Mobility
Jan/Feb 2003 p. 51
picoXpert Problem-solving in the palm of your hand
by David Haskin

However, the time for handheld AI had not yet come. After a couple of years of trying to penetrate the market, I moved on to other endeavours. These included more advanced AI such as Neural Networks and Agent-Based Models. In 2011, I wrote Future Psychohistory to explore Asimov’s greatest idea in the context of modern computation.

Picodoc Corporation still exists, although it has been dormant for many years. It’s encouraging to see the current explosion of interest in AI, especially the burgeoning Canadian AI scene. For those like me, who have been working away in near anonymity for decades, it’s a time of great excitement and hope. Today, I’m mainly into computational citizen science, and advanced technologies, such as blockchain, that might be applied to it.

Forth: A Syntonic Language

Educators sometimes hold up an ideal of knowledge as having the kind of coherence defined by formal logic. But these ideals bear little resemblance to the way in which most people experience themselves. The subjective experience of knowledge is more similar to the chaos and controversy of competing agents than to the certitude and orderliness of p’s implying q’s. The discrepancy between our experience of ourselves and our idealizations of knowledge has an effect: It intimidates us, it lessens the sense of our own competence, and it leads us into counterproductive strategies for learning and thinking.¹
– Seymour Papert

People are sentient and we know that we are sentient. We are also social creatures. The human mind is aware of ‘agency’, in both itself and others. Since the beginning of recorded history, and probably long before, we have even ascribed agency to objects and events in the natural world. Once we attain consciousness, it becomes powerful and efficient to reuse this mind machinery to understand the world around us in familiar terms. Mirror neurons in the human brain enable us to read or infer intention in others. They do not merely form patterns of thought – they reflect them.

Syntonic is a word sometimes found in music theory, but in psychology, syntonicity means having a strong emotional connection with the environment. It is understanding something through identification with it. It is achieved by putting oneself “in another’s shoes”, and it is key to human learning. One form of this is imitation, and children do this all the time, with everything from trees and animals to teapots and airplanes. Another form is metaphor, where an established understanding is substituted for a new and unfamiliar thing. See Ancient Metaphors for examples from the IT world.

When we mature, many of us discover computer programming and would like to use the same built-in learning method that allowed us to develop ‘common sense’. However, most programming languages and their textbooks are so abstract, formal, and doctrinal, that it’s nigh-on impossible to get very far this way. As a result, computational thinking recedes off into the ivory tower, and the world is much worse off because of it.

This type of learning is well suited to therapeutic applications due to its natural, informal flow. Starting with the ‘hard wired’ facilities we all have, ‘bootstrapping’ (discussed below) is a very individual and non-doctrinal process.

An interesting and more in-depth exploration of syntonicity in programming languages was written by Stuart Watt.²

I have found Forth to be a very syntonic language.

Syntax and structure are the first face that we see of a new language. Some are fairly natural (e.g. BASIC), most are verbose and intricate, and some are at the extreme end of mathematical formality (e.g. Haskell). Forth is positively Spartan by comparison, with the majority of keywords being three letters or fewer, and the ‘: … ;’ pair providing most of the structure and context. Some long-time users of Forth do not even like its syntax and quickly override keywords (a trivial task in Forth), inventing their own to craft the language to be more in tune with their way of thinking. Ironically, this makes them love Forth more, not less. It is an extremely simple language. At its core, Forth is just a mechanism to reduce the cost of calling a subroutine to merely referring to it by name. But this is not what makes Forth syntonic.

“syntonicity is not directly a property of the syntax, or even the semantics or pragmatics of a language, but a property of the stories that we tell about it” (Watt p.5)

Forth is a stack-based, threaded, concatenative language with a tightly coupled interpreter and compiler. To explain these terms, and thus ‘tell a story of Forth’, we’ll use a metaphor:
Forth – a hiker (named ‘Arkady’ for convenience)
Stack – her backpack
Thread – the trail
Interpreter – her mind (neurology)
Compiler – her logbook

Notice that there is no clear distinction between traveler (Forth) and journey (program). Arkady’s journey is of more interest to her than is her destination. She is open to distraction, side-trips, and unexpected eventualities. Even mistakes are sometimes learning opportunities. This is the way of Forth – less focus on design and protocol, more on exploration, discovery, and emergence.

Life is not a problem to be solved, but a reality to be experienced.
– Soren Kierkegaard

While most languages are applicative (functions are applied to data), Forth is concatenative, which is a much more natural and emergent process. There are also few safeguards in Forth; the entire environment (machine) is wide open. This freedom is anathema to authorities of language design, which might partly explain why Forth-like languages are not more widespread. It’s much safer to clone an application from an existing template and/or framework than to invent a new one. Of course, the result is a copy, not an original. Incidentally, in much research funding, the outcome must almost be predicted along with careful metrics and timetables (ask anyone who has prepared a grant application). This squelches creativity and serendipity. It also puts a premium on positive results over negative (and perhaps greatly informative) ones. In fairness, just so we don’t make the mistake of assuming that everyone agrees that doctrine is a bad thing, there has been a large and successful effort to come up with a more standardized Forth: ANS Forth.

Her backpack (the stack) is a readily available, last in, first out (LIFO) pile of items that she adds or removes as required. As a good hiker, Arkady is always aware of what’s in her backpack. She keeps often-used items (tools, map, compass) near the top. Longer term necessities (food, tent) are towards the bottom.

She walks along the trail (thread), step-over-step. Some steps are short and obvious, like avoiding big rocks and snakes. Some are more complex and subtle, such as crossing streams and staying downwind of bears. Many are actually composites of smaller steps. Some require more items from the backpack than others. At each step, she has the same backpack, although the exact contents may vary. She may sometimes just take the top few items, drop the pack, and head off down an interesting side trail. This is how a Forth program is built up. New ‘words’ are written to accomplish tasks. These words are then threaded together as higher level words invoke a series of them. Eventually, a single, highest level word is the starting point for the whole program, which is just a chain of subroutine calls.
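The idea of words threaded together under a single top-level word can be sketched in Python. This is a simplified model under stated assumptions: real Forth systems thread addresses rather than names, and the words `SQUARE`, `SUM-OF-SQUARES`, and `MAIN` are hypothetical examples invented for this sketch.

```python
stack = []


def _swap():
    b, a = stack.pop(), stack.pop()
    stack.extend([b, a])


# Primitive words do the actual work on the shared stack (the backpack).
PRIMITIVES = {
    "DUP":  lambda: stack.append(stack[-1]),
    "SWAP": _swap,
    "*":    lambda: stack.append(stack.pop() * stack.pop()),
    "+":    lambda: stack.append(stack.pop() + stack.pop()),
}

# Higher-level words are just threads: sequences of other words (or numbers).
THREADS = {
    "SQUARE":         ["DUP", "*"],
    "SUM-OF-SQUARES": ["SQUARE", "SWAP", "SQUARE", "+"],
    "MAIN":           [3, 4, "SUM-OF-SQUARES"],
}


def execute(word):
    """Run one word: a number, a primitive, or a thread of other words."""
    if isinstance(word, int):
        stack.append(word)
    elif word in PRIMITIVES:
        PRIMITIVES[word]()
    else:
        for step in THREADS[word]:
            execute(step)


execute("MAIN")   # the single top-level word starts the whole program
print(stack)      # prints [25]
```

Every step sees the same stack, and the whole program unfolds as a chain of calls starting from `MAIN`.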

Using all of her neurology, from senses to brain to common sense, logic, and rationality, she ‘interprets’ her world. She learns new facts and skills as she walks, gradually ‘bootstrapping’ a deeper understanding of her environment and more powerful and efficient use of resources (food, energy, time). To learn if and how something works, she prods it and observes the results. If she finds an unfamiliar mineral, she could bring whatever tools she is carrying to bear. Or, she could invent an entirely new tool. She could craft a crude microscope from the magnifying lenses and rolled up map in her backpack. If she has time, a mass spectrometer may be suitable. In Forth, quick tests and trials are inexpensive and easy. There is no odious edit-compile-run loop to get in the way. For example, to find what OVER does, put a few numbers on the stack, type OVER, and examine the stack to see the result. Stumbling around with her nose in manuals and books would perhaps cause Arkady to miss a fossil or patch of berries. It is also a good way for her to become lunch for a mountain lion.
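The OVER experiment is easy to mimic. The Python model below follows the standard stack effect of OVER ( a b -- a b a ), with a plain list standing in for the Forth stack; the function name `over` is just for this sketch.

```python
def over(stack):
    """Model of Forth's OVER ( a b -- a b a ): copy the second item to the top."""
    stack.append(stack[-2])
    return stack


stack = [1, 2, 3]   # as if we had typed: 1 2 3
over(stack)         # as if we had typed: OVER
print(stack)        # prints [1, 2, 3, 2]
```

In an actual Forth the same probe would be something like `1 2 3 OVER .S`, although the exact way the stack is displayed varies between systems.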

A word about bootstrapping. Sometimes, when a skill is applied to produce a result, new knowledge (or a new tool) is gained in the process. This new knowledge opens the door to skill-set improvement or refinement, which enables hitherto impractical or unseen possibilities. In turn, even more new knowledge is obtainable, and the process repeats. The result is flint knife to bow and arrow to alchemy to chemistry to electronics to spaceflight. Bootstrapping is the essence of Forth. Sadly, multitasking, object-orientation, dissociation and data-hiding, and that ultimate chestnut, ‘leveraging’, are all much more in vogue today. Multitasking in human thinking is greatly overrated. Deeper, syntonic thinking enables more creativity and innovation. If Forth were more widely used and understood, you might well be reading this on Mars right now. If it had been invented earlier, you might be reading this on Triton.

From time to time, Arkady considers new observations, knowledge, thoughts, and ‘steps’ to be well-established and important enough to jot down in her logbook. This log serves as a permanent record of all she has learned on her journey. It is, in fact, a purified, corrected, ‘frozen concentrate’ of her trail.

Arkady eventually arrives at the end of her journey. It may have been cut short for various reasons or maybe her destination is different (maybe even better) than she originally intended. This is a function of her curiosity as much as weather or circumstance. In any case, she still has her logbook, for her or others to use as a future guide (be they human or machine).

Some would say that it’s wrong to argue for Forth going forward. After all, it’s an old language, and was originally created for control of machinery in an era of sparse computational resources. As programs and systems grow ever-more complex, capable, and intelligent, such a language has outlived its usefulness. However, I must disagree. The future is not just about frameworks, big data, and augmented reality. It will be at least as important to build new, ad hoc, ‘micro’ systems rapidly and locally in a time of dizzying acceleration of technology. Creating, testing, and problem solving starting from first principles will always be valuable.

Understanding the world requires thinking in new languages. Forth is to computer science what math is to physics. Computational thinking, syntonicity, scalability, learning-by-doing*, and good old human common sense are not headed for the dust bin of history any time soon. Even basic game narratives are best written in simple languages. Here are some further thoughts on learning.


* I am, however, perhaps not a very good constructionist, try as I might. I once tried (unsuccessfully) to acquire a minor Logo implementation. I believe in learning by doing something else! A wide, diverse search is often the most efficient (just ask the ants).
(1) Papert, S. (1980). Mindstorms: Children, Computers, and Powerful Ideas. New York: Basic Books.
(2) Watt, S. (1998). Syntonicity and the psychology of programming. Psychology of Programming Interest Group, 10th Annual Workshop.

Concatenative Biology

There are several ways to categorize programming languages. One is to distinguish between applicative and concatenative evaluation. Most languages are applicative – functions are applied to data. In contrast, a concatenative language moves a single store of data or knowledge along a ‘pipeline’ with a sequence of functions each operating on the store in turn. The output of one function is the input of the next function, in a ‘threaded’ fashion. This sequence of functions is the program. Forth is an example of a concatenative language, with the stack serving as the data store that is passed along the thread of functions (‘words’). “Forth is a simple, natural computer language” – Charles Moore, inventor of Forth.
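The pipeline picture above has a neat consequence: joining two programs end to end gives the same result as running one and then the other. Here is a small Python sketch of that property. The word functions and the `run` helper are hypothetical names for this illustration, and each ‘word’ is modelled as a function from stack to stack.

```python
from functools import reduce


def word_double(stack):
    """Replace the top item with twice its value."""
    return stack[:-1] + [stack[-1] * 2]


def word_increment(stack):
    """Replace the top item with its value plus one."""
    return stack[:-1] + [stack[-1] + 1]


def run(program, stack):
    """Pass the stack through each word in turn, like a pipeline."""
    return reduce(lambda s, word: word(s), program, stack)


p = [word_double, word_increment]
q = [word_double]

# Concatenating programs is the same as running them back to back.
assert run(p + q, [5]) == run(q, run(p, [5]))
print(run(p + q, [5]))   # prints [22]
```

No word needs to know what came before it or what comes after; it only sees the stack it is handed.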

One of the great advantages of concatenative languages is the opportunity for extreme simplicity. Since each function really only needs to know about its own input, machinery, and output, there is a greatly reduced need for overall architecture. The big picture, or state, of the entire program is neither guiding nor informing each step. As long as a function can read, compute, and write, it can be an entity unto itself, with little compliance or doctrine to worry about. In fact, in Forth, beyond the stack and the threaded execution model, there’s precious little doctrine anyway! Program structure is a simple sequence, with new words appended to the list (concatenated). The task of the programmer is just to get each word right, then move on to the next.
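
One consequence of this simplicity is easy to check for yourself: gluing two programs together end to end gives the same result as running one and then the other. Here is a minimal Python sketch of that property, under the assumption that a program is just a list of stack functions. The names `dup`, `mul`, `run`, `square`, and `fourth` are hypothetical stand-ins, not anything from a real Forth system.

```python
# Concatenation as composition: a sketch with assumed, illustrative names.

def dup(stack):
    """Duplicate the top item of the stack."""
    return stack + [stack[-1]]

def mul(stack):
    """Replace the top two items with their product."""
    *rest, a, b = stack
    return rest + [a * b]

def run(program, stack):
    """Apply each word in turn to the stack."""
    for word in program:
        stack = word(stack)
    return stack

square = [dup, mul]        # roughly Forth's  : SQUARE DUP * ;
fourth = square + square   # appending one program to another

step_by_step = run(square, run(square, [3]))  # square, then square again
all_at_once = run(fourth, [3])                # the concatenated program

print(step_by_step, all_at_once)  # [81] [81]
```

Neither `square` nor `fourth` needed to know anything about the other. Each word only cares about what it finds on the stack and what it leaves there.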

In nature, the concatenative approach is the only game in town. Small genetic changes occur due to several causes, random mutation being one of them. Each change is then put through the survivability sieve of natural selection, with large changes accumulating over large time scales (evolution). (Evolution is active across the entire spectrum of abstraction levels. Hierarchies emerge naturally, not through doctrine or design.) Concatenation is the way by which these small changes are accumulated. Much of the epic and winding road of human evolution is recorded in our DNA, which is billions of letters long.

This process can be seen happening right now in molecular biology. Consider the ribosome. This is the little machine inside a cell that reads a piece of RNA (a chain of nucleotides) and translates it into a protein (a chain of amino acids). There is no Master Control Program assigning names, delegating work, and applying functions. There is only a concatenative program, built up over the ages by evolution. So, basic life uses a fairly powerful and successful form of computation: DNA for storage, RNA for code, ribosome for computing, protein for application.
(and natural selection for testing) 🙂
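
To make the ribosome analogy a little more concrete, here is a deliberately tiny Python sketch of translation. It makes several simplifying assumptions: the RNA string is taken to begin at the start codon and to be read in frame, only eight of the 64 standard codons are included, and the names `CODON_TABLE` and `translate` are mine. Real translation involves far more machinery than a table lookup.

```python
# A toy 'ribosome': read RNA three letters at a time and emit amino acids.
# The table is a small, partial excerpt of the standard genetic code.
CODON_TABLE = {
    "AUG": "Met",   # methionine, also the usual start codon
    "UUU": "Phe",   # phenylalanine
    "GGC": "Gly",   # glycine
    "GCU": "Ala",   # alanine
    "AAA": "Lys",   # lysine
    "UAA": None,    # stop codon
    "UAG": None,    # stop codon
    "UGA": None,    # stop codon
}

def translate(mrna):
    """Translate an in-frame RNA string into a list of amino acid names.

    Raises KeyError for any codon missing from the partial table above.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino is None:          # a stop codon ends the chain
            break
        protein.append(amino)
    return protein

protein = translate("AUGUUUGGCAAAUAA")
print(protein)  # ['Met', 'Phe', 'Gly', 'Lys']
```

The loop has no overall plan for the protein. It reads the next codon, appends the next amino acid, and moves on, one small step concatenated onto the last.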

We flatter ourselves when we talk of our ‘invention’ of levers, gears, factories, and computers. Nature had all that stuff and much more long before we ever came down from the trees. Math, engineering, and science are great not because of their products, but rather because they enable 3-pound hominid brains to explore nature and ponder the possibilities.

Forth – not quite dead

In my previous post, Thinking in Forth, I said that Forth is a dead language. I did this to make two points. The first is that I don’t have to revive or advocate a dead language, hurray, because that’s boring. The second point is that even a dead language still has value, like Latin does. However, just because I consider it to be a dead language doesn’t mean that everyone else does too.

 

 

Forth is an excellent choice for education for many reasons:

  • it’s small, maybe 25 required core words
  • most of these words are one, two, or three letters long, so familiarity with English is less of a prerequisite for learning Forth
  • it’s simple with no built-in abstractions – source code maps directly to assembly code (a Forth word is just a list of calls to previously defined words)
  • a Forth program is just a linked list of Forth words
  • it has little syntax to learn
  • there is only one data type in basic Forth – integers (called ‘cells’)
  • useful programs can be written early in the learning process
  • it’s very ‘bootstrappable’ – one learns by concatenation, not starkly delineated ‘levels’ or ‘tiers’ (tears?)
  • paradigms such as object-oriented and functional programming are consistent with Forth (and in fact many of them have often already been very well implemented and documented)
  • concatenative programming is much more pure and natural (see Concatenative Biology)
  • syntonic programming is easier to learn and use (see Forth: A Syntonic Language)
  • Forth is useful for building models and simulations
  • open and non-proprietary versions of Forth are freely available for a wide range of platforms.
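
Two of the points above, that a word is just a list of calls to previously defined words and that the words hang together as a linked list, can be modelled in a short Python sketch. This is an assumption-laden toy: the class names `Entry` and `Forth` and the methods `define`, `colon`, `find`, and `interpret` are hypothetical, and a real Forth stores compiled addresses in memory rather than Python objects.

```python
# A toy Forth dictionary: a linked list of entries, newest first.
# All class and method names are illustrative assumptions.

class Entry:
    def __init__(self, name, body, link):
        self.name = name
        self.body = body   # a Python function (primitive) or a list of entries
        self.link = link   # the previously defined entry, or None

class Forth:
    def __init__(self):
        self.latest = None
        self.stack = []
        # A few primitives, each working directly on the integer stack.
        self.define("dup", lambda s: s.append(s[-1]))
        self.define("*", lambda s: s.append(s.pop() * s.pop()))
        self.define("+", lambda s: s.append(s.pop() + s.pop()))

    def define(self, name, body):
        """Link a new entry onto the front of the dictionary."""
        self.latest = Entry(name, body, self.latest)

    def find(self, name):
        """Walk the linked list from the newest word back to the oldest."""
        entry = self.latest
        while entry is not None:
            if entry.name == name:
                return entry
            entry = entry.link
        return None

    def colon(self, name, words):
        """Define a new word as a list of already defined words."""
        body = [self.find(w) for w in words]
        if None in body:
            raise KeyError("undefined word in definition of " + name)
        self.define(name, body)

    def execute(self, entry):
        if callable(entry.body):
            entry.body(self.stack)
        else:
            for inner in entry.body:
                self.execute(inner)

    def interpret(self, text):
        """Run each token: a known word is executed, anything else is an integer."""
        for token in text.split():
            entry = self.find(token)
            if entry is None:
                self.stack.append(int(token))
            else:
                self.execute(entry)
        return self.stack

forth = Forth()
forth.colon("square", ["dup", "*"])          # roughly  : SQUARE DUP * ;
forth.colon("cube", ["dup", "square", "*"])  # roughly  : CUBE DUP SQUARE * ;
result = forth.interpret("3 cube 1 +")
print(result)  # [28]
```

Because `colon` resolves each name when the word is defined, `cube` keeps pointing at the `square` that existed at that moment, which is my attempt to mimic how new definitions build only on what came before.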

There are several efforts at delivering basic, classic Forth to beginning programmers. One that I particularly like is 4E4th, which runs on a tiny, inexpensive prototyping card.
The Raspberry Pi can easily run the bigger, more capable Gforth, and the Python-based Pygmy Forth.

Several commercial enterprises continue to promote Forth. Some can be found towards the end of the excellent and extensive Forth page on Wikipedia.

The creator of Forth, Chuck Moore, is a tireless, brilliant, and dynamic force. He has never stopped developing cutting-edge technology and innovation based on Forth. For a look at some of his latest work, here’s a talk he gave in September 2013 on the GA144 chip. It has 144 cores, with energy measured in picojoules per instruction and power in nanowatts per core, and is of course programmed in Forth. That’s hardly dead technology.

Forth has been described as a ‘timeless’ language. It is so fundamental, so elementary, so non-doctrinal, and so general purpose, that it may live on forever, even if just as an idea. It’s very Platonically intellectual in this sense. Arguably, Forth provides the most primitive computational platform possible. It’s as machine-like as assembly language, yet not nearly as machine-specific. A stand-alone Forth routine is often smaller than the equivalent code in assembler! (For others who would make the same claim, remember that library calls are cheating). It can be suitable for teaching and practicing computational thinking. I have found it to be the perfect tool for tinkering with computation from first principles.

One caveat: Forth is a difficult mindset to forget or ‘un-learn’ – it can be a one-way door.

“If you are squeamish, don’t prod the beach rubble.”
– Sappho

So, while I do not aspire to revive Forth or advocate it generally, I may have been in error in describing it as dead. I come to praise Forth, not to bury it.

Thinking in Forth

I consider Forth to be a dead language, at least in the same sense that Latin is a dead language – i.e. nobody uses it in daily life anymore. I now only use it and write about it in the context of computational thinking. Forth may be a dead language, but that doesn’t preclude it from being vibrantly important. If you don’t like irony, you’re reading the wrong blog.

Forth was developed for over a decade by Charles (Chuck) Moore before being written down formally in 1970. Moore was a radio telescope guy at the start. He needed an expressive, helpful computer language for his own work, much as Newton needed calculus. The story is best told in the book: “FORTH – The Early Years”.

What makes Forth important is just that – it was created to serve as a computational assistant. Using it is very conversational and immersive. One doesn’t think for an hour, consult authority and standards, edit for an hour, then compile, then test, then go for lunch (or go home), etc. There is a non-stop dialectic, a friendly and thoughtful argument, with you handling the imagination and it handling the bits and bytes.
And a program emerges.

I started using Forth in the late 1970s. Using it, I learned binary and hexadecimal numbers and arithmetic, register manipulation (the main task of a CPU), memory manipulation (the main task of a computer), and many concepts and paradigms of programming. These included imperative (states, flow control), functional (stateless, mathematical), object-oriented (encapsulation, re-use), factoring, concatenative, declarative (mostly through a Prolog written in Forth), machine control, etc., etc. I read several books and many articles, but the real learning happened by doing. What’s key here is the fact that I’ve never been a great programmer (very good, but never great). Anyone can gain tremendous benefit from spending even a few hours learning Forth. It enables one to invent an ad hoc language to solve an ad hoc problem. It’s not magic and otherworldly like Excalibur. It’s more like an early hominid’s stone tool – it teaches one to think, especially to think about the next tool.

My Forth days are long in the past. Today, I write mainly in Haskell, Assembler, Prolog, and English. For database work, I use SQL, key-value stores, and JSON. Whenever I think about any problem, I think computationally. I’ve spent almost four decades thinking in Forth.

I intend to write many posts and maybe even a few articles on Forth. It will be a mainstay of this blog.