How big should a programming language be?

  • What is important is not really size but consistency and "self-containment": whether the language constructs and semantics (including the standard library) are semantically and logically consistent with one another.

    If that's not the case, it gets much harder to get your head around how it all works: you cannot apply one simple mental model to everything you see, and you have to juggle different ways of thinking even for basic tasks. And this inconsistency propagates even further, and harder, into other libraries.

  • This is related to my main issue with Swift.

    It started out as a very usable, mid-sized language, with many tooling issues. Now it is a kitchen-sink language with almost every imaginable feature (and more coming!) that still has tooling issues. It's no longer as pleasant to read unless you keep up on the latest additions. You can't easily read code that uses the myriad of new features without a long weekend learning what every property wrapper in this particular program means and how to use the new feature of the month.

    In my opinion, while it was still a mid-sized language, they should have fixed all of the tooling issues and then, very incrementally over a long period of time, added new features. Instead it's more of a "move fast and break things" culture, which is not what I want in the evolution of a programming language.

  • Golang is really the best example of a minimal language (yes, yes, I know: late to generics and all that). I think it works because they have spent huge amounts of effort to create a well-designed and fairly comprehensive standard library. The best example of this is io.Copy: you give it a reader and a writer, and it does an optimal copy in almost all cases.

    A minimal language that doesn't have resources like Golang to put in the stdlib is just not gonna work.

  • I am a team lead in a C++ shop. In every interview, I ask: tell me about C++’s std::move().

    Not a single interviewee has even had a good guess in 2 years of open job reqs.

    It really has me reconsidering. I’m a language nerd through and through. I love C++. But I also have to be honest and say there’s such a thing as “the average amount of c++ the team knows”.

    To me, the biggest threat to C++ isn’t rust, but rather the speed at which the language is changing. It’s already past the tipping point.

    In my domain, embedded programming, I’m keeping an eye on zig with fingers crossed.

  • The core language should be small enough so that a programmer can remember all its features. This makes for efficient and quick programming. The rest should be in standard or community libraries. In my book over the last few years only Go and Lua meet the criteria.

  • To me, Objective-C 1.0 was the perfect size (before Apple added properties and stuff in 2007). As long as you avoided the weirdest undefined corners of C, it felt like you could understand every construct available and its performance implications without needing to read the docs or ask Google.

    Maybe it’s just the rose-colored memories of a first love.

  • What is very, very tricky is that the bigger the language gets, the harder it is to do composability. And "composability" is far bigger and broader than the normal understanding of the term.

    For example, consider Rust, which is built around an idiom of composability (i.e. traits).

    It's nice, until you get to async, custom allocators + runtimes, const, macros, type reflection, code generation, etc...

    Each of these is something users want in order to make code more composable and flexible. But each is also a big thing: you could build a WHOLE LANGUAGE around just one of them!

    ---

    Lisp, Forth and similar languages make "syntax composability" easy - it doesn't solve all the rest, but it is probably the bare minimum. This is the MAJOR "flaw" of Algol-like languages.

    For example, I wish Rust allowed:

        struct Person { name: String }
        struct User: +Person { pwd: String } // not inheritance: a copy of the fields!

        let u = User { .. };
        let p = Person::copy_from(u);
        for f in u.iter_fields() { } // walk over name: String, pwd: String

    With Lisp-, relational-, or array-like paradigms this stuff is easy. But Algol-like languages are too "nominal" and lack ways to manipulate their constructs without resorting to macros, which in most cases are a band-aid.

  • Haskell was a small, clean language, but with the innumerable language extensions it has become large.

    Bash is a small clean language that has maintained its clean core

    Golang is the archetype of a small, clean, effective language that has intentionally tried not to become C++, and it shows: the amount of wrangling needed to add generics to Golang is a testament to their focus on a small language, albeit an imperative one.
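
    For context, the generics that finally landed in Go 1.18 added type parameters while keeping the surface syntax small. A minimal sketch of the kind of function that motivated all that wrangling:

```go
package main

import "fmt"

// Map could not be written type-safely before Go 1.18 without
// falling back to interface{} and runtime casts.
func Map[T, U any](xs []T, f func(T) U) []U {
	out := make([]U, 0, len(xs))
	for _, x := range xs {
		out = append(out, f(x))
	}
	return out
}

func main() {
	fmt.Println(Map([]int{1, 2, 3}, func(x int) int { return x * 2 })) // [2 4 6]
}
```

    Type inference means call sites rarely spell out the type parameters, which is part of how the feature stays small on the page.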

  • In the design of Starlark (https://github.com/bazelbuild/starlark), I often had to push back against new feature requests to keep the language simple. I explicitly listed simplicity as a design goal.

    Of course, the scope of the language is not the same as general purpose languages, but there's always pressure from the users to add more things. I also think many people underestimate the cost of adding new features: it's not just about adding the code in every compiler/interpreter, specifying every edge-case in a spec, updating all the tooling for the language and writing tutorials; it's also a cost on everyone who will have to read any of the code.

  • C is about right. Common Lisp is too big and Scheme is too small.

    The language itself should be small, but its standard library should be extensive, and largely written in the language itself. If the language's own implementors don't want to program in it, neither do you.

  • I've always figured you can ignore the features of a language that you don't like, but a drawback of a giant language is that it's hard for beginners to read code, or to make sense of tutorials that all use different ways of doing something.

    You can always put extra features into a library, and identify certain libraries as "standard." So it seems reasonable to demand that actual language changes are limited to those where the readability benefit outweighs the need for everybody to learn the new feature in order to read code.

    As mentioned in other posts, a huge sprawling language makes it hard to gauge whether someone is proficient in a language when applying for a job. I don't really know how important that should be.

    In my case, I got a low score on a Python knowledge test because I've been programming in Python for a decade but haven't kept up with the latest features. Not that I'm looking for a job, but I was just curious, and it was part of figuring out what kind of training I should get.

  • This Guy Steele talk is related to this topic: https://www.youtube.com/watch?v=lw6TaiXzHAE

    A good watch for the fine people in this thread (and please ignore explicit or implied linkages to Java--it's full of the broad concepts, not narrowly JVM stuff)

  • There is something very appealing about learning a small(ish) language. However, a small language does not always mean simple. And it means code can be difficult to unravel.

    Niklaus Wirth, the creator of Pascal, Modula-2 and Oberon, believed strongly that teaching programming is most effective when students can grasp the entirety of a small language.

    From a 2018 interview with Wirth on the topic:

    > "In order to do a good experience, you have to have a clean language with a good structure concentrating on the essential concepts and not being snowed in."

    > "That is my primary objection to the commercial languages – they’re simply too huge, and they’re so huge that nobody can understand them in their entirety. And it’s not necessary that they’re so huge."

    > I know a lot less about languages like C# and the like, but I wouldn't be surprised if they've gone, or are going, through something similar.

    100% true, it is. More keywords in each release. More ways to do things.

    E.g. now we use the "record" and "with" keywords, and operators such as "a!.b" or "a?.b".

    And we set projects up with "implicit usings" and "file-scoped namespaces", which do the same thing with less indenting.

    While I like all of these, the downside is yes, it's an ever-expanding language.

  • I’m still not clear on what it means to be big. Is the API big? Do the lines in the standard library make it big? The available syntax options?

  • I don't see a problem with growing languages as long as the designers are doing their job.

    There's a difference between adding features randomly and putting effort into making a feature obvious, transparent, and well integrated into the ecosystem and the overall UX.

    UX is something that language and API designers must put more effort into.

    Start creating APIs with the assumption that people have no access to documentation; it makes everybody's life easier. Just take a look at the .NET BCL: it's very well designed. They even wrote a book about framework/library design based on their decades of experience.[0]

    Also "small language" is relative.

    If you started doing X-language development a decade ago, then that version of the language is "small" in your perception. For somebody who started 5 years ago, it was the version from 5 years ago that was small.

    The point is that at any given time a language has some features which aren't popular, because they serve specific cases that somebody requires.

    Very often those features are required by the makers of base class libraries, so they can create a good API, get good performance/safety, or the like.

    To me it seems like programming languages and their ecosystems lack a deprecation process that actually removes stuff.

    [0] - Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .Net Libraries

  • The guy is not distinguishing clearly between language and library, which undermines his point.

  • I'm facing this dilemma now with the Dogma metalanguage [1]. On the one hand, I want it to be small so that it can be picked up quickly and mastered easily (very important since its primary use case is documentation). On the other hand, it needs enough power to actually be useful in the real world.

    In the end I opted to avoid most conveniences since they can easily be built using a few common macros such as `u16(v) = ordered(uint(16,v))` and the like, except for cases where it would be just silly not to (like strings as syntactic sugar for concatenated codepoints, and regex-style repetition chars ? * + instead of {0|1} {0~} {1~}).

    But even then the spec is over 2000 lines long (admittedly the examples take up a fair bit of room, though).

    [1] https://github.com/kstenerud/dogma/blob/master/v1/dogma_v1.m...

  • Surprised to see no mention of Elixir so far. Its designers have explicitly focused on "a compact and consistent core". And indeed I found it very pleasant and rewarding to learn for that reason. Everything fits together nicely.

    From https://elixir-lang.org/development.html

    > Since v1.0, the language development has become focused to provide a compact and consistent core. The Elixir team focuses on language features that:
    >
    > 1. are necessary for developing the language itself
    > 2. bring important concepts/features to the community in a way its effect can only be maximized or leveraged by making it part of the language

    In other words, as I've heard it put, the language is mostly not growing anymore. It probably helps that it has strong metaprogramming facilities.

  • I'd argue pretty small.

    I primarily work in a language out of the APL family.

    When I first started it, the entire reference page of every function/command/flag fit on a double sided printout in regular sized font.

    Past a certain point you just end up with overlapping versions of the same thing maintained for backward compatibility or slight differences of opinion & behaviour.

    I feel the same way about framework ecosystems. Some languages almost have too many, to the point that they are constantly being introduced, changed in ways that make old code non-portable, and then sunsetted. It's hard to keep internal corporate software running when if you choose something mainstream it's less than 5 years from EOL, but if you choose something emerging it may never go mainstream. A lot of it seems like reinventing the wheel.

  • I haven't paid attention in many years, but at least for the first few years (after it grew from a data format to a scripting language), each release of Lua was more powerful, faster, and smaller in executable size, than its predecessor.

  • There are two different ways to go about this question:

    - asking about the scope of a programming language is asking about the intended programming situations one wants to cover.

    - what is a good balance between convenience (syntactic sugar, "features" and abstractions) and asking the programmers to write out things explicitly.

    - more subtly: who is using the language, what concepts can be expected to be known to the programmers.

    These are of course connected: a particular type of recurring programming situation may lead one to think of patterns and a need to abstract them, whereas to someone who intentionally constrains the scope to a specific area, it can be easier to keep the number and kind of source or "programmer intent" abstractions low.

    Often these discussions happen in the context of a general purpose language but there is little consideration that some programming situations are better served by DSLs.

    Also, a decision on "MIT" vs. "New Jersey" (worse-is-better) style affects the whole thing, as the notion of correctness is at some level always one of programming situations (a higher-level discussion on using the API the right way, checking error conditions, etc.).

    This is a long-winded way of saying that, in the end, the starting point matters. C++ can be considered a failure of language design if you consider it from the readability/cognitive-load angle, but its design choices seem to consistently be about permitting all possible choices. This is why Stroustrup can claim things like "you can write memory-safe C++" and not even be wrong. Java can be considered a failure in concision, but tool support means that programmers do not pay the cost of writing everything out explicitly.

    Haskell and anything in the academic tradition will emphasize abstractions and demand more and more knowledge from users whereas industry tradition will be biased towards low (or lower) cognitive overhead, but if one is willing to specialize, there is an infinite number of nuances of possible points in the design space. We will see more programming languages and come to accept that there will be many that we use at the same time.

    So rather than asking how big, maybe we should wonder how to understand the involved trade-offs.

  • Glossed over in the brief analysis of Haskell is that the added features are:

    1) not in the language spec, just options in the de facto-standard compiler;
    2) often shunned by industrial users (see "Simple Haskell");
    3) there, among other reasons, to support experimentation and research, which is an important use case for the language;
    4) opt-in and largely unnecessary in practice - in my code I generally only use a couple of well-known extensions.

  • If a language ignores common problems we all face, it’s too small.

    If a competent professional can’t become fluent in a year, it’s too big. But I have pretty low regard for people who complain when it takes longer than a weekend. This is part of what we’re paid to do, and the investment in your toolbox will pay off for the rest of your career. I still use ideas that came from languages I can’t officially use right now.

  • I started Python with 2.4 or 2.5. It’s not big; it’s been many years since I was actually surprised by something.

    Evolution, or incrementing version numbers, doesn’t necessarily equal increased size.

    For example, Python now has async functions. But it effectively had them in Python 2.5 as well, when generators gained coroutine support (see Twisted's inlineCallbacks decorator) - you’d just use “yield” rather than “await”.

  • > How Big Should a Programming Language Be?

    As big as it needs to be and no bigger.

    A programming language is almost always built for a purpose - to solve a problem or fit into a certain domain. We've all seen the needless complexities in programming languages that accrete new features, year after year, just to have new features.

  • This makes me think of how the R language has been "slow" to make (some) changes, and many people now use packages from the tidyverse. R written with the tidyverse can be practically unrecognizable to people writing base R (and vice versa).

  • If you feel the need for a new feature in a language, you should ask why the language does not make it easy to build that feature in the language itself.

  • PHP always had everything you needed built into the language. Coming from a C background and moving to a JavaScript lifestyle, PHP always offered you everything you needed out of the box, and extensions for that little bit more.

  • I've started pondering what a language would look like that was optimized for LLMs, where token length is the most important budget.

    Maybe a language with 10,000 keywords would actually be better in LLM world?

  • I don't know, but there has to be a reason Gamemaker Language is so easy to pick up and play with. Imagine if C let you get away with murder and still compiled.

  • There's a pre-supposition in the community that bigger languages are typically worse, but I'm not sure that's true. There are also multiple ways to measure language "size" and most arguments I've seen conflate language features and standard library (which can often have overlap).

    As a counter-example, I'd raise the Wolfram Language (and maybe Matlab, but I haven't used that) as an example of a truly vast language – in terms of "standard library". On the other hand, almost all criticism of Go has historically revolved around it being too small.

  • Hard to do the wrong thing, easy to do the right thing.

  • Somewhere between the size of C and C++ probably.

  • "big enough to reach the ground" ?

  • As big as C or only a tiny bit bigger.