Jim's Depository

this code is not yet written
 

In the 80s I wrote an image processing language with extensible syntax using Earley’s Algorithm. It worked well enough, but it required expert knowledge to understand what was happening when syntax from disparate modules interacted.

I’ve recently learned of Parsing Expression Grammars (PEGs) and Packrat Parsing, and I think they could be well suited to an extensible language. Packrat parsing is a linear time algorithm which can easily take unfactored BNF style productions as its language definition. It is a memory hog (a reasonably sized module might take 50M to parse), but I think the simplicity of syntax specification will more than outweigh the memory footprint. (That image processing language ran on a VAX-11/750 with 8M of RAM shared with 10 people. This is a technique which is only now becoming feasible with larger memory sizes.)
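
A minimal sketch of the memoization at the heart of packrat parsing, written in Python for brevity and not taken from the prototype. The toy arithmetic grammar and every name in it (`PackratParser`, `apply`, `parse`) are assumptions made for illustration. Each rule is evaluated at most once per input position and the result is stored in a table keyed by (rule, position); that table is where both the linear time and the memory appetite come from.

```python
# Toy PEG, written as unfactored ordered choices (first match wins):
#   expr   <- term '+' expr / term
#   term   <- factor '*' term / factor
#   factor <- '(' expr ')' / number
#   number <- [0-9]+

class PackratParser:
    def __init__(self, text):
        self.text = text
        self.memo = {}  # (rule name, position) -> (value, next position) or None

    def apply(self, rule, pos):
        """Evaluate `rule` at `pos`, reusing the memoized result if there is one."""
        key = (rule, pos)
        if key not in self.memo:
            self.memo[key] = getattr(self, rule)(pos)
        return self.memo[key]

    def literal(self, ch, pos):
        """Return the position after `ch` if it occurs at `pos`, else None."""
        return pos + 1 if self.text.startswith(ch, pos) else None

    def expr(self, pos):
        # Alternative 1: term '+' expr
        left = self.apply("term", pos)
        if left is not None:
            plus = self.literal("+", left[1])
            if plus is not None:
                right = self.apply("expr", plus)
                if right is not None:
                    return (left[0] + right[0], right[1])
        # Alternative 2: term (served from the memo table, not parsed again)
        return self.apply("term", pos)

    def term(self, pos):
        # Alternative 1: factor '*' term
        left = self.apply("factor", pos)
        if left is not None:
            star = self.literal("*", left[1])
            if star is not None:
                right = self.apply("term", star)
                if right is not None:
                    return (left[0] * right[0], right[1])
        # Alternative 2: factor
        return self.apply("factor", pos)

    def factor(self, pos):
        # Alternative 1: '(' expr ')'
        open_paren = self.literal("(", pos)
        if open_paren is not None:
            inner = self.apply("expr", open_paren)
            if inner is not None:
                close = self.literal(")", inner[1])
                if close is not None:
                    return (inner[0], close)
        # Alternative 2: number
        return self.apply("number", pos)

    def number(self, pos):
        end = pos
        while end < len(self.text) and self.text[end] in "0123456789":
            end += 1
        return (int(self.text[pos:end]), end) if end > pos else None


def parse(text):
    """Parse and evaluate `text`; raise ValueError unless all input is consumed."""
    parser = PackratParser(text)
    result = parser.apply("expr", 0)
    if result is None or result[1] != len(text):
        raise ValueError("parse error")
    return result[0]
```

The grammar is deliberately left unfactored: both alternatives of `expr` begin with `term`, and the second one costs only a table lookup. The table can hold up to one entry per rule per input position, which is the memory cost mentioned above.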

I am currently prototyping a packrat parser for an extensible language. The prototype is in JavaScript and runs inside a browser with fields for the grammar and the input string, which makes interaction trivial.

It is not on a publicly accessible machine. I may put a snapshot of it in a public place if anyone asks.

Here I keep a list of things I think are important for the language:

  • Legibility. I appreciate the simplicity and cleanliness of Lisp with its powerful macro capabilities, but I think it suffers in legibility. A programmer should be able to glance and understand, not look and decode. I think this means an Algol style syntax.
  • Performance. Computers have become fast enough that interpreted scripting languages are adequate for mid-volume web sites and user-intensive applications, but higher performance is needed for large web sites and compute-intensive applications.
  • Extensibility. The language must be able to grow syntax to support new constructs. I dislike proof by example, but the chaos of adding “for each X” functionality to JavaScript should serve as a fresh warning.

There are a host of other requirements that I just expect of a modern language and will not mention here. I think the ones above will serve to establish a direction.

For many years now I have been disappointed in the state of programming languages. I followed the BASIC, Pascal, C, C++ path through the 80s and 90s with side trips to a dozen other languages. Currently I use PHP for web sites, JavaScript for prototyping, and Objective-C for applications, but none of these are satisfactory.

In the late 90s and early 00s I used Dylan for many things. (Algol syntax, Scheme derived, CLOS) I think the biggest problem with Dylan is its rewriting rule based macro system. For a language as simple and powerful as Dylan to be saddled with a macro system that will give you sendmail flashbacks is simply wrong.

I think that perhaps what is needed is a new language that keeps the expressiveness and legibility of Dylan, but uses an equally legible syntax extension system.

So, I am writing one.

In coding femtoblogger I wanted a simple way to avoid SQL injection attacks. I think I’ve settled on one simple rule:

“Never paste any variable into a query string.”

That is much simpler than the “never paste user input into a query string” or the “always call the proper escape function for variables” methodologies. I use the ‘?’ placeholder and bind all variables.

Some things come out in two lines (prepare, execute) instead of one (query), but overall I think the code is more legible without having to read through the concatenation, string delimiting, and escape functions.
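
A minimal sketch of the rule, using Python’s built-in sqlite3 module and an in-memory database as a stand-in. femtoblogger itself is PHP, and the `articles` table and the `find_articles` helper are made up for this example. The query text is a constant, and every variable travels through a ‘?’ placeholder.

```python
import sqlite3

def find_articles(conn, author):
    """Return the titles written by `author`, binding the variable to '?'."""
    # The SQL text never changes; the variable is never pasted into it.
    cur = conn.execute(
        "SELECT title FROM articles WHERE author = ? ORDER BY title",
        (author,),
    )
    return [row[0] for row in cur.fetchall()]

# Hypothetical schema and data, created only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO articles (title, author) VALUES (?, ?)",
    [("Packrat parsing", "jim"), ("Keycap repair", "jim"), ("Hello", "guest")],
)

# A classic injection attempt is treated as an ordinary (non-matching) value.
hostile = "jim' OR '1'='1"
```

Here `find_articles(conn, "jim")` returns the two titles by jim, while `find_articles(conn, hostile)` returns an empty list, because the bound value is compared as a whole string and is never parsed as SQL.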

48 hours in and I’ve crossed the 1000 line mark for combined HTML and PHP. I will soon need to add a ‘next page’ function to the front page as we go past the 10 article cutoff. I think I’ll take that opportunity to shrink the code somewhat.

I’m not happy with the <head> section getting duplicated in all the primary files, but sometimes I want to tweak it and I’m not sure how best to do that. Maybe I can leave some expandable markers in the standard <head>, and when the primary page calls the WriteHead() function it could pass in a dictionary of marker expansions.
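
One way the marker idea might look, sketched in Python with the standard `string.Template` class rather than in PHP. `write_head` stands in for WriteHead(); the assumption that the shared block is the HTML head, the marker names (`title`, `stylesheet`, `extra`) and the default values are all invented for illustration.

```python
from string import Template

# The shared block, written once, with $markers left to be expanded.
HEAD_TEMPLATE = Template(
    "<head>\n"
    "<title>$title</title>\n"
    '<link rel="stylesheet" href="$stylesheet">\n'
    "$extra"
    "</head>\n"
)

# Hypothetical defaults used when a page does not override a marker.
DEFAULT_MARKERS = {"title": "femtoblogger", "stylesheet": "style.css", "extra": ""}

def write_head(overrides=None):
    """Return the standard block with the page's marker expansions applied."""
    markers = {**DEFAULT_MARKERS, **(overrides or {})}
    return HEAD_TEMPLATE.substitute(markers)
```

A page that is happy with the defaults calls `write_head()`; a page that wants a tweak passes only what differs, for example `write_head({"title": "Comments"})`.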

Likewise the DIV structure that makes the left and right columns is replicated in all the primary pages. I’d like to make that go away, but somehow the WritePageFront()…writemystuff…WritePageBack() pattern does not appeal. I dislike having the front code and back code split apart like that. Pasting up the inside as a string is ugly. Passing in a function to write the middle might be the way to go, perhaps as an entry in a dictionary as with WriteHead(). It would be much nicer in a language with continuations.
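
The function-passing option could look roughly like this. The sketch is in Python rather than PHP, and `write_page`, the `emit` callback, and the DIV class names are all hypothetical. The layout code stays in one place and each page supplies only a function that writes its own middle.

```python
def write_page(write_middle, options=None):
    """Return the shared column layout, calling write_middle for the page body."""
    options = options or {}
    parts = []
    parts.append('<div class="left">' + options.get("left", "") + "</div>\n")
    parts.append('<div class="main">\n')
    write_middle(parts.append)  # the page writes only its own content
    parts.append("</div>\n")
    parts.append('<div class="right">' + options.get("right", "") + "</div>\n")
    return "".join(parts)

def article_body(emit):
    """Example middle section for a single-article page."""
    emit("<h2>Packrat parsing</h2>\n")
    emit("<p>...</p>\n")

page = write_page(article_body, {"right": "browser tally"})
```

The front and back of the page are never split across two calls, so a page cannot forget to close what it opened.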

I pulled the boilerplate HTML into NormalPageTop() and NormalPageBottom() functions. It looks gross, but it gets it all together in one spot. Each of these takes a dictionary argument to supply non-standard values for various pieces of the boilerplate.

This gets femtoblogger back to right at 1000 lines of code with the RSS feed added.

Update: and right back over 1000. I changed things around so clicking on an article title takes you to a page with just that article. There is now a little edit pencil on the articles to edit them, like on the comments.

I added comments to femtoblogger. Just in case someone wants to say something. You can comment anonymously, but you will have to pass the captcha. People logged in can comment freely. That isn’t too much of a hardship. You can create yourself an account if you would like.

This is an anonymous comment.
This comment is by jim, and has been edited.

I’m not releasing femtoblogger for a while. I am enjoying the luxury of changing things willy-nilly without worrying about converting deployed databases. I’m not even worrying about sometimes breaking the screens while I change code.

The road map looks like this:

  • Develop in secret until the datastore schema seems relatively stable.
  • Move the svn repository to Google Code and have a quiet release.
  • Maintain.

I added a browser type tally to the right-hand column. I have very little idea why. It is another database query and update for each page load, but it doesn’t have a significant impact on the performance. I’m still loading 280 front pages/second, which is 5 times my available bandwidth. No worries yet.

I’m measuring my page load capacity with “openload”. There is a Debian package and it is trivial to use. I like that in a tool. It can be found over at SourceForge, http://openwebload.sourceforge.net/

The ‘C’ key started to miss and got progressively worse on my PowerBook G4 Aluminum 1.25GHz. First, go read about putting keycaps back on… http://docs.info.apple.com/article.html?artnum=88106 … then it is time to take the bad keycap off.

  1. Depress the key in front of the bad key.
  2. Lift the bad key gently.
  3. Peek in.
  4. Get a knife point in between the white scissor piece and the keycap.
  5. Twist gently to pop the front of the keycap off.
  6. Repeat for the back of the keycap.
  7. Notice the size of the plunger top.
  8. Cut a dot of tape big enough to cover the plunger top, but small enough not to interfere with the scissors.
  9. Stick it to the back of the key.
  10. Reassemble.

With any kind of luck you have just repaired your keyboard and saved $160. I think the plungers wear out a bit and no longer reach the key contacts. That extra bit of tape lets you push it down just a bit further.

Good luck. (I had to try twice. My dot was too big the first time and kept getting caught on the scissors and holding the key down.)

Because I used nanoblogger but found it too large and complicated, and there was already a reference to picoblogger in Google.

When I write a new improved version in a new improved language I will call it attoblogger or perhaps zeptoblogger. Ain’t Wikipedia grand?

I wonder if people will think femtoblogger has something to do with women?
