Programming Language Evolution


The history of programming languages is rife with evolution. Existing languages constantly evolve, and new languages are created to address emerging needs. Sometimes there are radical, revolutionary breakthroughs that bring a complete paradigm shift, but more often there are just gradual improvements and refinements. The latter is the topic of this story. In any given era, the practice of programming usually runs ahead of the capabilities that programming languages provide; language designers recognize this and catch up to meet the demand. Let us look at some examples.

In early languages, you had to write a lot of repetitive code just to do a simple loop. The loop was such a common programming pattern that it was adopted even by primitive higher-level languages in the era predating structured programming. So there was a time when you still had GOTOs in your programming language, but a significant fraction of the code had structured loops:

10 LET N=10
20 FOR I=1 TO N
30 PRINT "Hello, World!"
40 NEXT I
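To make the contrast concrete, here is a minimal sketch in C of the same loop written twice: once with nothing but conditional jumps, and once with a structured loop. The function names greet_with_goto and greet_with_for are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical example: prints the greeting n times using only jumps,
   the way a loop has to be spelled when the language has no loop construct.
   Returns the number of lines printed. */
int greet_with_goto(int n) {
    int i = 1;
    int printed = 0;
loop:
    if (i > n) goto done;      /* exit test, written by hand */
    puts("Hello, World!");
    printed++;
    i++;                       /* step, written by hand */
    goto loop;                 /* jump back, written by hand */
done:
    return printed;
}

/* The same loop with the counter, the bound, and the step stated in one place. */
int greet_with_for(int n) {
    int printed = 0;
    for (int i = 1; i <= n; i++) {
        puts("Hello, World!");
        printed++;
    }
    return printed;
}
```

Both functions do the same work; the structured version simply lets the language carry the bookkeeping that the first version repeats at every loop.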

As we know, the subsequent generation of languages not only added structured IF statements but also made the structure explicit in the source and ended up abolishing GOTOs completely. This kind of evolution can be seen in other areas, too.

Let’s take a brief look at OOP. The object-oriented style of programming does not need an object-oriented language. Even nowadays you can find software written in C where methods are just a convention of writing functions whose first parameter is a pointer to the receiver:

void Point_move(Point* self, int dx, int dy) { ... }

Virtual methods are routinely implemented in pure C, too, by explicitly keeping a virtual method table with pointers to functions somewhere in the object’s structure.
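A minimal sketch of that convention might look as follows. The Shape, Rect, and Square types and all function names are hypothetical, chosen only to illustrate the pattern; real codebases differ in how they lay out and name the table.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Shape Shape;

/* Table of "virtual methods": one function pointer per overridable operation. */
typedef struct {
    int (*area)(const Shape* self);
} ShapeVTable;

/* Every object starts with a pointer to the table of its concrete type. */
struct Shape {
    const ShapeVTable* vtable;
};

typedef struct {
    Shape base;        /* must be the first member so a Rect can be viewed as a Shape */
    int width, height;
} Rect;

typedef struct {
    Shape base;
    int side;
} Square;

static int Rect_area(const Shape* self) {
    const Rect* r = (const Rect*)self;
    return r->width * r->height;
}

static int Square_area(const Shape* self) {
    const Square* s = (const Square*)self;
    return s->side * s->side;
}

static const ShapeVTable RECT_VTABLE = { Rect_area };
static const ShapeVTable SQUARE_VTABLE = { Square_area };

/* "Constructors" wire each object to the table of its type. */
Rect Rect_make(int width, int height) {
    Rect r = { { &RECT_VTABLE }, width, height };
    return r;
}

Square Square_make(int side) {
    Square s = { { &SQUARE_VTABLE }, side };
    return s;
}

/* "Virtual call": the caller does not know the concrete type. */
int Shape_area(const Shape* self) {
    assert(self != NULL);
    return self->vtable->area(self);
}
```

Calling Shape_area(&r.base) on a Rect made with Rect_make(3, 4) dispatches through the table to Rect_area. Everything a class-based language does implicitly (the hidden table pointer, the cast to the concrete type, the indirect call) is spelled out by hand here.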

However, the rising popularity of object-oriented programming back in the day fueled the growth of languages that incorporated these patterns natively, with classes and methods as we know them today.

As hardware became faster, working with objects by reference became so popular in large-scale software systems that even having to explicitly write Point* or some other syntactic notation to designate a reference to an object came to be perceived as boilerplate: a syntactic incantation that you always have to write when working with business objects. So it was abolished, paving the way for the modern approach in which every object instance is implicitly accessed by reference, as in the top application-programming languages (Python, JS, Java).

This is also a good example of how a language feature that is ubiquitous today might have a very long history but was not universally adopted by mainstream languages until the right time.

The first generations of object-oriented languages were created in the 20th century, well before the massive software projects (in terms of lines of code and 3rd-party libraries used) that we see today. These languages were quickly adopted at a scale hardly envisioned by their creators.

The original idea that you can simply declare all the methods you’ll ever need on a class together with its declaration quickly crumbled, leading to hundreds of StringUtil, FileUtil, and other “utility classes” in a typical project, which were merely collections of “extension methods”. They were essentially written in an old-school pre-OOP style, with the convention of passing the receiver of the call as the first parameter, as we saw in the C example. History goes in circles! So it is no surprise that the generation of languages designed in the 21st century for large-scale development typically incorporates this pattern via some form of first-class extension concept.

I could go on with more examples: the tedious malloc/free dance that was replaced by various automated memory-management schemes, unwieldy callback-based programming that gave rise to language support for coroutines, and the repeated boilerplate of collection-manipulating operations that led to support for standard higher-order functions like filter and map.
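The last of these is easy to illustrate even in C. The sketch below first shows the kind of hand-written loop that gets repeated for every selection rule, and then a hand-rolled filter that takes the rule as a function pointer. The names copy_evens, filter_ints, IntPredicate, and is_even are hypothetical; C's standard library has no such filter function.

```c
#include <stddef.h>

/* The boilerplate: the selection rule and the copying loop are tangled together,
   so every new rule means another near-identical function. */
size_t copy_evens(const int* src, size_t n, int* dst) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i] % 2 == 0) {
            dst[count++] = src[i];
        }
    }
    return count;
}

/* The pattern factored out: the loop is written once and the rule is a parameter. */
typedef int (*IntPredicate)(int value);

/* Copies the elements of src accepted by keep into dst, preserving order.
   dst must have room for n elements. Returns how many were copied. */
size_t filter_ints(const int* src, size_t n, int* dst, IntPredicate keep) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (keep(src[i])) {
            dst[count++] = src[i];
        }
    }
    return count;
}

int is_even(int value) {
    return value % 2 == 0;
}
```

Languages with built-in higher-order collection functions and lambdas remove even this remaining ceremony: no separately named predicate, no manually sized destination buffer.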

I hope that you already see a common motif here. As programming evolves, various ubiquitous patterns of code emerge. These patterns are boring, error-prone, and no fun at all to work with. Eventually, they get recognized by programming language designers and are incorporated into languages as new features, raising the overall level of abstraction and improving developer productivity and job satisfaction. This story repeats with each generation of languages, as newer patterns build on top of features that were themselves just programming patterns a generation ago.

What does it mean for a programming language to be modern and to stay modern? It means accepting change, this inevitable cycle of evolution, and evolving with it. When software developers consistently write, or have to automatically generate, some repeated pattern of code, it sends a strong signal that something is not right and that some language feature is missing.

Take a look at one more example. As the OOP design style scaled, decoupling the internal storage of data from its externally visible representation became the norm, giving rise to the convention of getXxx (getter) and setXxx (setter) methods to achieve this encapsulation. It has been going on for 20+ years all over enterprise codebases, with an extensive set of tools helping to generate this boilerplate. So nowadays, simply by looking at whether a language has built-in support for this property pattern, we can see whether it is keeping up with the needs of modern large-scale software development.
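Written by hand, in the same C style as the earlier examples, the getter/setter convention looks roughly like the sketch below. The Temperature type, its internal unit, and the function names are all assumptions made up for this illustration.

```c
/* Hypothetical example: the value is stored in one unit and exposed in another,
   so callers never depend on the storage format. */
typedef struct {
    long centikelvin;  /* internal storage: hundredths of a kelvin */
} Temperature;

/* Getter: exposes the value in hundredths of a degree Celsius. */
long Temperature_getCelsiusCenti(const Temperature* self) {
    return self->centikelvin - 27315;
}

/* Setter: accepts hundredths of a degree Celsius and converts to the stored unit. */
void Temperature_setCelsiusCenti(Temperature* self, long value) {
    self->centikelvin = value + 27315;
}
```

Two functions per logical property, multiplied across every field of every business object, is exactly the kind of mechanical repetition that built-in property support removes.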

Let’s get back to the example we started with: the transition from programming with GOTOs to modern structured programming. The real breakthrough came not just from adding structured loops and conditionals to languages, but from the complete removal of the GOTO statement.

A case in point is the notorious “billion-dollar mistake” of null references. Adding language support for better null-handling is only half a solution: it merely helps developers write the null-avoidance boilerplate. The real breakthrough is to completely forbid storing nulls in arbitrary reference types.

This removal of obsolete features is often the hardest hurdle, one that prevents languages from adapting to the ever-changing world and leads to their ultimate downfall under the weight of the features they add. Of course, it also depends on the inertia of the language and the size of its ecosystem. The more widespread the language is, the more deadweight it can afford to support without crumbling. But ultimately, the general law of evolution, in a formulation often attributed to Charles Darwin, holds for programming languages, too:

It is not the most intellectual of the species that survives; it is not the strongest that survives; but the species that survives is the one that is able best to adapt and adjust to the changing environment in which it finds itself.


Project Lead for the Kotlin Programming Language @JetBrains