Simulating Recursive Regex in JavaScript

I've been playing around with the idea of writing lexers in JavaScript; it seems like a language that should be well suited to the task. Unfortunately, Rhino's regex engine isn't the most powerful on Earth.

While researching this I found an interesting link on Jon Aquino's Blog that discusses two features I wish Rhino had: recursive references to captured matches and named references to captured matches. Both are apparently available in PHP and in other languages too.

I tried to simulate something similar in JavaScript, with questionable success. This example works outwards from the innermost parenthesized content to the outermost, evaluating as it goes.

var str = "(10+(6+(1+1)*(3+2)))";

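// Each pass evaluates the innermost parenthesized groups (those containing
// no other parentheses) and replaces them with their value.
while (str.indexOf("(") > -1) {
    print(str);
    str = str.replace(
        /\(([^()]+)\)/g,
        function(match, inner) { return eval(inner); }
    );
}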

print(str);

// (10+(6+(1+1)*(3+2)))
// (10+(6+2*5))
// (10+16)
// 26
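
One thing the loop above doesn't handle is an unmatched "(": the regex never matches, so the loop never ends. Here is a rough sketch of the same idea wrapped in a function that stops as soon as a pass changes nothing. The name reduceParens is my own invention, and eval is kept from the example above, so this is only for input you trust.

```javascript
// Hypothetical helper: reduces the innermost parenthesized groups one pass
// at a time and returns every intermediate string, including the input.
function reduceParens(str) {
    var steps = [str];
    var innermost = /\(([^()]+)\)/g;
    while (str.indexOf("(") > -1) {
        var next = str.replace(innermost, function (match, inner) {
            // eval is carried over from the original example.
            return eval(inner);
        });
        if (next === str) {
            // Nothing was reduced (for example an unmatched "("),
            // so stop instead of looping forever.
            break;
        }
        str = next;
        steps.push(str);
    }
    return steps;
}

var steps = reduceParens("(10+(6+(1+1)*(3+2)))");
// steps: ["(10+(6+(1+1)*(3+2)))", "(10+(6+2*5))", "(10+16)", "26"]
```

The last element of the returned array is the final result, and the earlier elements are the same trace the print calls produce.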

Edit: Is Steven Levithan's XRegExp library a solution?


Resig: BBC Removing Microformat Support

John Resig mentions the recent discussion at the BBC regarding some of the issues with Microformats. I know it wasn't a discussion that was undertaken lightly: there are many developers at The Beeb who are passionate about standards, accessibility and Microformats.

The important point to remember here is that, unlike nearly every commercial company, publicly funded organizations such as the BBC are mandated to make their content as accessible as possible.

So what happens when a screen reader sees an example of a Microformat date like this?

Am I childish because I am looking forward to a big party on
<abbr class="date" title="2008-07-16T13:06:00EST">my birthday</abbr>?

Apparently some will read out that long string of numbers in the title. It's hard to fault them: it is labeled a "title" after all. But it's not being used as a title here, and the phrase isn't even an abbreviation. Hopefully this discussion will inspire a more accessible solution.

Edit: Seems the BBC are looking into the RDFa format while the Microformats people debate what to do about their dates.


Detecting JavaScript Arrays

It can be difficult dealing with JavaScript's duck-typing when you just want to know if a given object is or isn't an array, especially as the typeof operator returns "object" for an array: true, but not very specific.

Douglas Crockford suggests the following as a good (but not perfect) technique to determine if you have an array:

function isArray(value) {
    return value &&
        typeof value === 'object' &&
        typeof value.length === 'number' &&
        typeof value.splice === 'function' &&
        !(value.propertyIsEnumerable('length'));
}
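
For comparison, here is a sketch of two other checks that are commonly used. Array.isArray is built into engines that implement ECMAScript 5 or later, and the Object.prototype.toString fallback covers older ones. The function name isArrayStrict is just a placeholder of mine, not anything from Crockford's book.

```javascript
// Sketch: prefer the built-in Array.isArray where the engine provides it,
// otherwise fall back to checking the object's internal class name.
function isArrayStrict(value) {
    if (typeof Array.isArray === "function") {
        return Array.isArray(value);
    }
    return Object.prototype.toString.call(value) === "[object Array]";
}

var realArray = isArrayStrict([1, 2, 3]);                              // true
var arrayLike = isArrayStrict({ length: 0, splice: function () {} });  // false
var nothing = isArrayStrict(null);                                     // false
```

Unlike the function above, which hands back the falsy value itself (null, 0 and so on) when given one, this version always returns a boolean.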

JavaScript Arrays or Objects?

If you want to keep a series of data together in a collection JavaScript provides two built-in choices: array or object. Douglas Crockford writes in his book JavaScript: The Good Parts:

The rule is simple: when the property names are small sequential integers, you should use an array. Otherwise, use an object.

It's not that simple. In practice this decision is going to be weighted one way or the other based on your usage of the data.

For example, if you have a collection of employee objects where the employee id numbers happen to be small sequential integers, you might decide to go with an array. But what if you find that most of the time you need to extract employees based on their name property?

function getElementsByProperty(property, value, array) {
    var found = [];
    for (var i = 0; i < array.length; i++) {
        if (array[i][property] == value) {
            found.push(array[i]);
        }
    }
    return found;
}

var susan = getElementsByProperty("name", "Susan Smith", employees)[0];

Not pretty and not fast, especially on big arrays. But if the property you are searching on is unique to each element, it becomes a primary key, and you can do something much more elegant by creating employees as an object keyed by that property:

var susan = employees["Susan Smith"];
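
If the data already lives in an array, a keyed object can be built from it in one pass. This is only a sketch: the sample records and the helper name indexBy are made up for illustration, and it assumes the chosen property really is unique (a duplicate silently overwrites the earlier entry).

```javascript
// Hypothetical sample data; the field names are assumptions.
var employees = [
    { id: 1, name: "Susan Smith" },
    { id: 2, name: "John Jones" }
];

// Build a lookup object keyed by a property that is assumed to be unique.
function indexBy(array, property) {
    var index = {};
    for (var i = 0; i < array.length; i++) {
        index[array[i][property]] = array[i];
    }
    return index;
}

var employeesByName = indexBy(employees, "name");
var susan = employeesByName["Susan Smith"];
// susan.id is 1
```

One caveat: a plain object also answers to inherited names such as "toString", so a hasOwnProperty check is worth adding if the keys come from user input.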

Okay, that looks a lot nicer, so maybe we should try to use objects with primary keys? There are two problems you could have. The first has to do with order: in JavaScript the order of keys in an object is not guaranteed to be the same as the order in which you added them (engines that do define an order still list integer-like keys first, in ascending numeric order), so if the order of your elements is important you can't use an object.

And there's a second problem: what if you frequently need to modify and read the total number of employees? Unlike arrays, objects have no magical length property; instead you'll have to loop over every key in the object, incrementing a counter as you go. We're back to "not pretty and not fast" again.
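
The counting loop looks roughly like this. The helper name countKeys and the sample object are mine, purely for illustration.

```javascript
// Count an object's own enumerable keys by visiting each one.
function countKeys(object) {
    var count = 0;
    for (var key in object) {
        // Skip anything inherited through the prototype chain.
        if (Object.prototype.hasOwnProperty.call(object, key)) {
            count++;
        }
    }
    return count;
}

// Hypothetical sample data.
var employees = {
    "Susan Smith": { id: 1 },
    "John Jones": { id: 2 }
};

var total = countKeys(employees); // 2
```

In engines that support ECMAScript 5, Object.keys(employees).length gives the same number more concisely, but it still has to visit every key to get there.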

So your choices look more like this:

  • If the order of the elements must be predictable: use an array.
  • If you will need a fast, simple way to get the number of elements: use an array.
  • But if you will need a fast, simple way to access elements by a string primary key: use an object.

And if you need some combination, you must resort to one of the not pretty, not fast approaches.
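
One such not-so-pretty compromise is to keep both structures side by side and pay for it in bookkeeping. A minimal sketch, assuming names are unique; EmployeeList and its methods are hypothetical names of my own.

```javascript
// Hypothetical wrapper: an array keeps order and length, an object keeps
// lookups by name fast. Both must be updated together.
function EmployeeList() {
    this.items = [];   // ordered; items.length is the count
    this.byName = {};  // name -> employee, assumes names are unique
}

EmployeeList.prototype.add = function (employee) {
    this.items.push(employee);
    this.byName[employee.name] = employee;
};

EmployeeList.prototype.get = function (name) {
    // The hasOwnProperty check avoids matching inherited names
    // such as "toString".
    return Object.prototype.hasOwnProperty.call(this.byName, name)
        ? this.byName[name]
        : undefined;
};

var list = new EmployeeList();
list.add({ id: 1, name: "Susan Smith" });
list.add({ id: 2, name: "John Jones" });

var count = list.items.length;        // 2
var susan = list.get("Susan Smith");  // the first employee added
```

Adding stays cheap, but removing an employee means finding and splicing the array entry as well as deleting the key, so the "not fast" part has only moved.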

