A long, long time ago, before there was Facebook or Twitter, and when Google was simply a search engine, I had a revelation. I realized that programming languages were simply a set of tokens. Of course, this is not so mindblowing now. But to a young 17-year old who had just started college, this was an amazing revelation. Being a nerd and all, I decided that I wanted to make my own programming language. For some reason I wanted to make it look really ugly and difficult to read. Had I known about Perl or other esoteric languages, I probably wouldn’t have decided to write it. Who am I kidding, I probably would have anyway and also would have felt a little less weird. Anyway, the very first incarnation of my language, bAdkOde, was written in Java. At this point, I had never heard about EBNF’s, recursive-descent parsers, or anything related to language parsing. But I was eager, knew how to write code, and therefore I was dangerous. My first implementation had no recursion whatsoever. I parsed the text iteratively. I had this horribly complex function that tried to validate and parse a printable string-of-text (for example, print “This is variable {a}\n”). I even had interpolation in my language (I thought it was novel, but I eventually found out that Perl already had it) and so that made the parsing even harder. My language was also typed and you could even create variables (I maintained a list of declared variables – I eventually found out that this was called a “Symbol Table”). It also supported basic mathematical operations. There was no way to do flow control though, and so the language didn’t have any if-statements or looping constructs. I had plans to add all of this in but never really got around to it.

Eventually, I took a class at ASU (CSE 240) that taught me about parsing languages. Using this information I rewrote bAdkOde using recursive-descent parsers. Then something happened and I lost interest. bAdkOde languished and was forgotten for a time. Then, I was suddenly interested in writing a language again because I was going through a phase where I read a lot about esoteric programming-languages. I thought I’d write my own. At this time, I think I was also taking an assembly class (possibly CSE 421) and so the eventual design of the language was definitely influenced by a few assembly concepts. This time, I decided to write the language in Perl since it is suited to parsing text. In retrospect, I probably should have written the language using a recursive-descent parser (even if it had to be in Perl) instead of my current implementation.

First, I wrote up a quick spec of the features of the language. It was going to be an extremely minimal language with only a few operators. It would have a stack, memory, and two registers. Memory and stack are unlimited (theoretically, but it depends on how much physical memory you have – the language enforces no constraint), but the actual code you write doesn’t reside in the memory that you’re able to use from the code. After I wrote the specs, I wrote an EBNF:

stmt		::= b-stmt | u-stmt | while-stmt ;
b-stmt		::= binary-op, (number|mem-reg), mem-reg ;
u-stmt		::= unary-op, mem-reg ;
while-stmt	::= "{", logic-op, mem-reg, {stmt}, "}" ;
binary-op	::= > | + | - ;
unary-op	::= stack-op | ' | " | ? ;
mem-reg		::= mem | reg ;
logic-op	::= = | ! | + | - ;
stack-op	::= push | pull ;
push		::= ")", (number|mem-reg) ;
pull		::= "(", mem-reg  ;
mem			::= "[", reg ;
reg			::= a | b ;
number		::= digit, {digit} ;
digit		::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 ;

Note: That’s right. The EBNF shows that you can’t use negative numbers as operands. This wasn’t a design decision… I think I just forgot about it. It just means that makes things a little more fun

From the very first definition in the EBNF, you can see that the language has three types of statements: binary-operator statement, unary-operator statement, and a while statement. What this means is that there are statements with an operator that requires two operands, statements with operators that require only on operand, and a while (looping) statement.