Over the last week I've been spending a great deal of time looking over the
both because I want to understand both code bases more, but also because I'm interested in Firefox extension
internals in general. While looking through these two code bases I saw a common file which could be optimized.
This file was the convert2RegExp.js file, which looks like it came from Adblock at some point, at least in part.
In fact I saw a number of changes that I could try in order to speed up the function, so I decided to time them all.
Factor I: Check for /\.[^\.ld]*t[^\.td]*l[^\.lt]*d/ in pattern string character loop
The first thing I noticed was that the regular expression that checks the pattern string for a ".tld" string
was being run on every pattern input, this seemed obviously bad to me since there was a loop prior
to the regular expression which runs through the pattern string's characters, so why didn't that loop at least check
for the ".tld" substring first? the check could be as simple as making sure all of the required letters
exist in the string first, or make sure that the ".tld" substring, exactly, is in the pattern string, or
what I found to be the best way was to make sure the regular expression /\.[^\.ld]*t[^\.td]*l[^\.lt]*d/ matched the
pattern string via the character loop through the pattern string. The average cases and best cases
will go much faster despite the fact that the latter case would be
matching some pattern strings that following tld regular expression (that is already in convert2regexp) would not match,
thus just adding work in this case, and making the worst possible case even slower.
Factor II: Cache tldStr and tldRegExp
The convert2RegExp( pattern ) function creates the tldRegExp and tldStr from literals on every execution!
that might be a faster operation I thought, but I expected it to be slower than simply looking up the value of a variable when I timed it, and even though I
haven't figured out how to test the memory consumption yet, I expect creating a large string from a string literal over and over again would increase
the peak memory usage, and thus the garbage collector pause time as well. So, caching the tldStr and tldRegExp seemed like it would be a small win.
Factor III: Use array.push()/array.unshift() and array.join() instead of +=
From what I've read about Firefox 3.6 the += operator in loops is optimized to use a single StringBuffer, but there were a few += outside of the loop,
and I figured some people will probably use versions of Firefox < 3.5 for some time to come still, and using an array there would certainly improve/decrease
past to present, mainly because of the memory issue. Furthermore, the more memory that is used, the more work there is for the garbage collector to deal with,
which means your computer is even slower, and that time is hard to measure. I do know that a new feature to Firefox 3.6 is that the garbage collector frees the memory in a new thread,
which means that older versions of Firefox did not, and that means that the chances of pauses were even greater. For more reading on these changes I am mentioning
in Firefox 3.6 please read this article.
In order to test the factors listed above I knew I needed a number of different versions of the convert2RegExp function, but the other piece I needed
was a collection of pattern strings to test. I made the different versions of convert2RegExp easily, but when it came to making the pattern set(s) to test
I had to do some more thinking.
Test Pattern Sets
I may not have made the best choice, but I decided to use two pattern sets:
- Pattern Set 0: 3/50 patterns use the magic tld expression, 2/50 do not use the tld expression, but match /\.[^\.ld]*t[^\.td]*l[^\.lt]*d/
- Pattern Set 1: 2/50 patterns use the magic tld expression
In retrospect I think testing an even worse case might be advantages, although I don't suspect the tld expression is
used very often. It's hard to say however, I would like to see an audit of userscripts.org, but even with that it'd
hard to know what the average case is for all Greasemonkey users, even when you take the install counts from
userscripts.org into account, although all of that perspective would be nice to have.
I am an avid user of Greasemonkey and I write quite a few userscripts, and I find that in practice that there are
very few times when I would need to userscript to use the magic tld expression, so my feeling is that the ratio is probably closer to
1 tld pattern per 100 or more.
I decided to run my tests of the different factors listed above for FF3.5/FF3.6 on WinXP, OSX, and Ubuntu (only FF3.5). The test pages worth pointing out are:
- Alternate 3: test of only factor I
- Alternate 5: test of only factor II
- Alternate 6: test of factor II + minor change to return a value asap.
- Alternate 11: test of factor I + II, uses array.unshift() and array.join(), and returns a value asap
- Alternate 12: test of factor I + II, uses array.push() and array.join(), and returns a value asap
- Alternate 13: test of factor I + II, and returns a value asap
- Alternate 15: test using array.unshift() and array.join() as only change
Here are the tests I ran and the results I record (the links are to published google docs spreadsheets, all values are in seconds):
AMD 2.21Ghz, DDR2, WindowsXP
Intel Duo 2.0Ghz, DDR3, Mac OSX
Intel Duo 2.0Ghz, DDR3, Windows XP
Intel 2.0Ghz, DDR, Ubuntu
Each test was using a version of convert2RegExp, on a set of 50 patterns, 2500 times.
- In all cases alternate 3 was faster than the original.
- In all cases alternate 5 and 6 were faster than the original.
- In all cases alternate 6 seemed to be slightly faster than 5.
- In all cases alternate 15 is much faster than the original for FF3.6, and slightly slower for FF3.5.
- For FF3.5 alternate 13 > alternate 12 ~= original ~> alternate 11
- For FF3.6 alternate 11 > alternate 13 > alternate 12 > original.
Although for FF3.5 we know that alternate 13 is a bad choice because it uses the += operator which result in O(N^2) characters copied, so we can't use that in my opinion. The final choice has to be between alternate 11 and alternate 12.
This is where I need your help, I have no idea why using array.unshift() is so much faster in Firefox 3.6, and taken that it is, and that the majority of Greasemonkey users will be using the latest version of Firefox, should we use array.unshift() and not array.push()?