ANTLR4: So awesome it makes Java look good

I have been trying to write my own programming language off and on for about 5 years now. That doesn’t mean I have spent 5 years on a single project. Only that I have been toying with the idea and educating myself for that amount of time.

Writing your own language is kind of the Holy Grail of computer science. Well, writing a language that is actually useful.

Over the years, my experience in Ruby, PHP, C/C++, Python and even LISP has informed my idea of what a good language would look like. Note I didn’t say perfect language. There is no such thing. I have come to believe that all languages are good in some way or another, and bad in other ways. Emotionally I like Ruby, and I like C. The elegant syntax of Ruby makes you happy to program, while the Do-Anything attitude of C makes it the ultimate language.

I have been looking at Java (due to its cross-platform nature) and have to say at first I hated it. Now I hate it less. The number one reason I hate it less is ANTLR. I love the code generated by ANTLR because it looks like code that a human being would actually write. It’s very dense, but it’s alive. It’s hackish and clever.

I have been following along with the book The Definitive ANTLR 4 Reference. This is a first for me. Normally I, quite wrongly, just jump in and get it done. This time I am going to RTFM.

In reading the text, I began to really see that my language may one day be a possibility. I started to think: What would a good language for me look like? What kinds of features would it have?

What is essential in a language will depend on its purpose. PHP was for the web, and until recently, had all the great features needed to pump out web pages. Lately it’s gone down the tubes, due mostly to Rails Envy. Which it shouldn’t have.

Ruby was intended to be a replacement for Perl. It was a great start in that direction until DHH came along. Like a reverse Pretty Woman, Ruby went from High Class, to webscort as it was more or less whored by the Rails Community into pointlessness. A language designed for the Command Line all dolled up for the Web. Fish out of water.

Everyone wants to design a Web Language these days, but not me. I want to return to the command line, to text processing. To Data processing. Where did Ruby go right, and where did it go wrong? Where did LISP go right and wrong?

In my opinion, LISP is the most perfect language ever developed. Its rules are so simple, so elegant, but with them you can build amazing things. If you could just understand the god-damned code. Which you can’t. Where LISP went wrong is the parentheses. Too many of them, and all cobbled together at the end of the line.

Where did LISP go right? Functional. Uniform. Consistent.

Where did Ruby go wrong? That’s harder to say. Ruby did so much right that the only thing I can really think of that is wrong with it is the syntax, which, though awesomely elegant, is just a little too verbose. Getting derailed by the RIA frenzy was another problem.

Whaddya mean the syntax?

All those begin, ends are a bit irritating. Nil is not 0? Irritating. def initialize. Irritating. Alias Method and no goto? Irritating. Aliasing Methods causes more bugs and headaches than goto ever did. No standard for/foreach loops? Irritating. @ and @@, irritating.

One thing that they all do wrong: They don’t know enough about your world. That’s why so many frameworks need to be built. But the language is the framework.

Making a programming language has historically been like giving someone access to a forest and telling them to build a house. Bring your own chainsaw too, btw.

Here is my list for what makes a language good:

  1. Easy is better than Hard.
  2. Simple is better than Complex.
  3. Complex is better than Convoluted.
  4. Convoluted is better than Impossible.
  5. Languages should solve more problems than they create.
  6. Write less, do more.
  7. Don’t Repeat Yourself, and don’t Make the Programmer Repeat themselves either!
  8. Know thyself, in all things, and Know Thy Users. ( Reflection is essential, Know the algorithms users need ).
  9. An Arbitrary Limit or Rule is a Design Flaw.
  10. A User should be able to change anything.
  11. Don’t be paradogmatic. ( A paradogma is a paradigm made into a dogma.  OOP isn’t good for everything. )
  12. Always follow the path of least surprise for the programmer.
  13. Represent the real, not the ideal.
  14. Source Code is Documentation.
  15. Don’t be afraid of {, [, or (.

Practically speaking what this all means is: When thinking about the most common tasks, these should be easy. Here’s an example. How many times do you loop over the Keys/Values of a Hash and do something with both? A lot.

PHP

Ruby

Peridot

In the language I am designing, why not look at it as the most common task for the data type present? The with keyword is aware of the datatype of its argument and chooses the most common task. With a Hash, that is looping over each Key and Value. Notice we don’t need to explicitly print a string. If you have a string, guess what, you either assign it or print it. This is the most common case, so it’s implicit. However, nothing stops you from being explicit.
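To make the idea of a datatype-aware keyword a bit more concrete, here is a rough sketch in Python rather than Peridot. It assumes the "most common task" can be picked by dispatching on the argument's type. The name with_common_task and the choice to return lines instead of printing them are assumptions made for illustration only, not part of the Peridot design.

```python
from functools import singledispatch


@singledispatch
def with_common_task(value):
    """Fallback for unknown types: treat the value as one printable line."""
    return [str(value)]


@with_common_task.register
def _(value: dict):
    # For a hash, the most common task is visiting every key/value pair.
    return [f"{key}: {val}" for key, val in value.items()]


@with_common_task.register
def _(value: str):
    # For a string, the most common task is simply emitting it.
    return [value]


if __name__ == "__main__":
    for line in with_common_task({"a": 1, "b": 2}):
        print(line)
```

The caller never says "loop" or "print"; the type of the argument decides. Being explicit is still possible by writing the loop by hand.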

The with keyword is meant to be smart, it’s a shortcut to saying: You know what I mean right? Sometimes that is not what you want, so nothing stops you from writing:

One thing I really love about Ruby is the almost natural languageness of things. Let’s say you only wanted the keys?

In this case, of is the same as $array.keys(), or keys(array). Here ‘.’ is syntactic sugar, same as in C++. Object.method() is the same as method(Object), i.e. the Object is the first argument. The ‘.’ is something people have come to expect, and they think that is what makes a language object oriented. Here we are just talking about aliases.
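Python happens to expose the same equivalence, so it makes a handy illustration of the dot as sugar. In this small sketch the Table class and its data are hypothetical. The point is only that calling the method through the object and calling the function with the object as its first argument give the same result.

```python
class Table:
    """A tiny wrapper around a dict, used only to demonstrate call styles."""

    def __init__(self, data):
        self.data = data

    def keys(self):
        # 'self' is just the object passed as the first argument.
        return list(self.data)


table = Table({"a": 1, "b": 2})

dotted = table.keys()        # object.method()
plain = Table.keys(table)    # method(object)

if __name__ == "__main__":
    print(dotted == plain)
```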

In the above example, $key is made for you automatically because the language itself is aware of common name schemes. If $array.keys() returns a collection of values, then each one of those values is most likely a key without the s. Inflection built in.
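As a sketch of what built-in inflection might look like, the Python function below derives a loop variable name from a plural collection name. The function name and the two naive English rules are assumptions for illustration. A real implementation would need a far larger rule set and a list of exceptions.

```python
def loop_variable_name(collection_name: str) -> str:
    """Guess a singular loop variable name from a plural collection name."""
    name = collection_name.lstrip("$")
    if name.endswith("ies"):
        # entries -> entry
        return name[:-3] + "y"
    if name.endswith("s") and not name.endswith("ss"):
        # keys -> key, but leave words like "class" alone
        return name[:-1]
    # Nothing recognisably plural: reuse the name as-is.
    return name


if __name__ == "__main__":
    for plural in ("keys", "$values", "entries", "class"):
        print(plural, "->", loop_variable_name(plural))
```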

The other thing that Peridot would like to have is being Data Format Aware. Things like XML, YAML, JSON, CSV etc are universal, it should know about them automatically. You should never have to write a CSV parser.

All CSVs are rows and columns. I have never encountered a situation where this isn’t the case. That doesn’t mean it doesn’t exist, but unless you are working with a special case, the language should figure out that if you are opening a CSV file, you aren’t doing it for your health; you are intending to read it into rows and columns.
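For comparison, this is roughly the least ceremony Python's standard csv module allows when the goal is just "give me rows and columns". The sample data and the read_rows helper are made up for the example. A format-aware language would presumably skip even this much.

```python
import csv
import io


def read_rows(text: str) -> list:
    """Parse CSV text into a list of rows, each a dict keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical sample data; the first line is the header row.
sample = "name,language\nada,ruby\nbob,php\n"
rows = read_rows(sample)

if __name__ == "__main__":
    for row in rows:
        print(row["name"], row["language"])
```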

When you use a programming language, you soon learn that comments are your friend when you are reading code, but shouldn’t they be your friend when running it too? What if the comments were actually part of the language, instead of completely ignored?
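Python's docstrings are one existing, limited version of this idea: the text is kept at runtime and can be read by the program itself. The sketch below is not Peridot's design, and the area and describe functions are hypothetical. It only shows that comment-like text can survive into the running program.

```python
def area(width, height):
    """Return the area of a rectangle given its width and height."""
    return width * height


def describe(func):
    """Build a one-line description from a function's name and docstring."""
    return f"{func.__name__}: {func.__doc__}"


if __name__ == "__main__":
    print(describe(area))
```

Ordinary # comments in Python are still thrown away, so this covers only part of what the question above is asking for.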

Or, when calling a function:

My goal of course is not to make a fast language, but to make one that is easy for common computing tasks, specifically for the command line. If it runs at a good speed, I will be happy.

About Jason

I am a 31 year old programmer living in the south of France. I currently work actively in the fields of Ruby/PHP/Javascript and server/website administration.