Page 1 of 2

EasyList Element Efficiency

Posted: Fri May 07, 2010 9:14 pm
by Michael
I was recently motivated to improve the time it takes to match patterns to items in EasyList after fanboy provided the following comparison of EasyList and his subscription based on his subscription benchmarking tool:

EasyList (24 Apr 2010 17:40 UTC):

Code: Select all

Load Time: 1632ms (Average: 163.2)   
Pattern Total: 17.531 (1.7531)
Fanboy's List (25 Apr 2010):

Code: Select all

Load Time: 2109ms  (Average: 210.9)
Pattern Total: 14.8925 (Average: 1.48925)
As far as I am aware the primary reason that EasyList is slower is due to the element-hiding filters, which are less specific in EasyList. I would therefore like to rewrite the inefficient filters in a more precise manner that is designed with the new style selectors (3.6) in mind. Unless there are any issues I have missed I will make a start immediately.

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 8:16 am
by fanboy
Maybe its not just elements?

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 8:26 am
by Hubird
Looks like the credit for the Pattern matching benchmark has to go to Wladimir as noted in the page title (have to give credit where it is due :-) )

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 8:45 am
by fanboy
I've had the benchmarking tool for a while now, slightly modified from Wladimir's copy. The benchmark rates can vary a bit depending on the computer and browser, probably why I ran the tests a few times.

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 8:58 am
by Hubird
Yes, I just had a play around. I noticed the variance too. I found IE was useless for running it as it kept complaining about script errors (taking too long to respond). Minefield complained about non-responsive scripts too but allows you to easily ignore the warning. I ended up using SRWare Iron which seemed to do the trick.

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 9:09 am
by Michael
I added the blocking filters and the element filters separately to calculate the following statistics:
Blocking Filters

Code: Select all

Time to load filters: 118ms
Average time for pattern matching: 1.7005ms (Loops: 2000)
Element Filters

Code: Select all

Time to load filters: 169ms
Average time for pattern matching: 7.078ms (Loops: 2000)
I think the results are fairly conclusive about where the issue is...

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 9:36 am
by fanboy
Hubird wrote:Yes, I just had a play around. I noticed the variance too. I found IE was useless for running it as it kept complaining about script errors (taking too long to respond). Minefield complained about non-responsive scripts too but allows you to easily ignore the warning. I ended up using SRWare Iron which seemed to do the trick.
For testing I'll generally use chrome or opera, seems using Firefox the numbers vary more. Never used IE though

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 11:45 am
by Erunno
Some more numbers:

only general id (###) filters:

Number of filters: ~465
Time to load filters: 27ms
Average time for pattern matching: 1.40525ms (Loops: 4000)

Number of filters: ~931
Time to load filters: 39ms
Average time for pattern matching: 2.11925ms (Loops: 4000)

Time for pattern matching difference: ~1.5x longer

only general class (##.) filters:

Number of filters: ~508
Time to load filters: 28ms
Average time for pattern matching: 1.279ms (Loops: 4000)

Number of filters: ~1016
Time to load filters: 43ms
Average time for pattern matching: 1.57875ms (Loops: 4000)

Time for pattern matching difference: ~1.23x longer

general hiding filters without slow ones (last rows in general hiding sublist):

Time to load filters: 68ms
Average time for pattern matching: 3.45525ms (Loops: 4000)

all general hiding filters:

Time to load filters: 69ms
Average time for pattern matching: 3.459ms (Loops: 4000)



Judging by the numbers the so-called slow general filters (i.e. those not defined via ### or ##.) seem to not contribute much to the overall matching time. It's the sheer size of our general hiding filter list which contributes to the slowdown. Now, the question is, whether this is significant or not. Is the average time just for matching all filters against 1 element or all addresses defined in the addresses1.txt and adresses2.txt?

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 12:42 pm
by fanboy
The benchmark tool is really mainly useful for filters rather than elements fyi

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 12:52 pm
by Michael
Why would some blocking filters be less efficient than others? I would have assumed that, provided there are no hash conflicts, all filters would take an equal amount of time to process.

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 12:54 pm
by Hubird
fanboy wrote:The benchmark tool is really mainly useful for filters rather than elements fyi
I was under the same impression. I remember there was a different test floating around a while back which seemed to focus more on hiding rules (site does not work any more).

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 12:56 pm
by Erunno
I'm trying to understand how this benchmark works. As far as I can see addresses1.txt and addresses2.txt provide a list of bogus addresses. This addresses are used to test the provided filters against this address list with a re-implementation of ABP's matching algorithm. But since hiding filters work on HTML documents, how are hiding filters tested at all?

EDIT: Does ABP do any matching for the hiding filters at all or does it delegate this work to Gecko completely?

EDIT 2: Didn't Wladimir write a test suite specifically for hiding filters a few months ago?

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 1:31 pm
by fanboy

Re: EasyList Element Efficiency

Posted: Sat May 08, 2010 1:39 pm
by Erunno
Wladimir to the rescue!

Re: EasyList Element Efficiency

Posted: Mon May 17, 2010 10:43 am
by Wladimir Palant
Oh my... I recognize this webpage - but it very, very old. Back when I created it it was a "good enough" approximation of what Adblock Plus did. However, it is only an approximation, it wasn't identical to Adblock Plus implementation even then. By now what Adblock Plus really does it quite different. Note that I removed this page very soon after publishing it - the results usually lead to wrong conclusions.

If you set up a development environment for Adblock Plus - there are a bunch of performance tests, particularly chrome://mochikit/content/tests/performance/filter_fromText.html (filter parsing time), chrome://mochikit/content/tests/performance/matcher_init.html (initialization of internal data structures, blocking and exception rules only) and chrome://mochikit/content/tests/performance/matching.html (actual filter matching performance, blocking and exception rules only). These tests measure "live" Adblock Plus performance, they are running the same code that is executed whenever you use Adblock Plus. Both use a fixed list, I used some version of EasyList for that (no point for variable filter list since this is meant to measure Adblock Plus efficiency). I can add a tweak to make the filters configurable however...

Note that there is no proper way to measure element hiding performance - whatever you thought you measured, it's not it.

Re: EasyList Element Efficiency

Posted: Mon May 17, 2010 11:32 am
by Wladimir Palant
Ok, there we go - the performance tests allow changing the test data now. In case you are interested, here the results from my laptop with the current Minefield nightly:

chrome://mochikit/content/tests/performance/filter_fromText.html
EasyList: 107.7 ms +/- 3.6 ms
EasyList without element hiding: 50.1 ms +/- 1.3 ms
Fanboy's List: 127.1 ms +/- 2.7 ms

chrome://mochikit/content/tests/performance/matcher_init.html
EasyList: 35.7 ms +/- 5.2 ms
Fanboy's List: 46.5 ms +/- 0.5 ms

chrome://mochikit/content/tests/performance/matching.html
EasyList: 57.0 ms +/- 2.8 ms
Fanboy's List: 57.5 ms +/- 2.7 ms
Two filters (testtest and @@testtest): 56.9 ms +/- 2.1 ms

The first two tests are mainly determined by the number of filters. The last one - by the number of "slow" filters. So not much to "optimize" here...

Re: EasyList Element Efficiency

Posted: Mon May 17, 2010 12:14 pm
by fanboy
chrome://mochikit/content/tests/performance/filter_fromText.html
EasyList: 107.7 ms +/- 3.6 ms
EasyList without element hiding: 50.1 ms +/- 1.3 ms
Fanboy's List: 127.1 ms +/- 2.7 ms
without elements for my list also?

Re: EasyList Element Efficiency

Posted: Mon May 17, 2010 12:25 pm
by Wladimir Palant
I only copied the file contents as they are available on the web already - I'm not aware of any variant of your list without element hiding.

Re: EasyList Element Efficiency

Posted: Mon May 17, 2010 12:26 pm
by fanboy
Ah right, just thought you were coping/pasting the lists in

Re: EasyList Element Efficiency

Posted: Mon May 17, 2010 12:31 pm
by fanboy
The results aren't too bad, considering the size of the lists.. but as always, theres room for improvements :)

Code: Select all

153K 2010-05-17 17:40 easylist.txt
179K 2010-05-17 22:53 fanboy-adblocklist-current-expanded.txt

Re: EasyList Element Efficiency

Posted: Mon May 17, 2010 12:39 pm
by Wladimir Palant
Yes, the results aren't bad even though in the real system more things come together. E.g. in the startup sequence with EasyList I see 200ms for parsing patterns.ini - filter_fromText test measures only a part of it. Then there is 100ms in FilterListener.startup() - again, matcher_init test measures only part of it. And there is 300ms for registering element hiding stylesheet - that part isn't measured at all (and cannot be really measured in a meaningful way given that in a running browser it depends on the number of open tabs for example).

You can see the startup sequence in the console if you are running a development build. On Windows you need to run Firefox with a "-console" command line parameter.

Re: EasyList Element Efficiency

Posted: Mon May 17, 2010 12:54 pm
by fanboy
how will these speeds matter when they sandboxing for each tab in firefox.next? (if they're still doing that..)

Re: EasyList Element Efficiency

Posted: Mon May 17, 2010 1:36 pm
by Wladimir Palant
I don't know when they are going to do this - almost certainly after Firefox 4, that change is too scary to be ready by autumn. We will see when it is there of course but I guess that nothing changes fundamentally as long as they only separate chrome and content rather than create a separate process per tab. When they move on to the latter Adblock Plus will most likely be in for large changes...

Re: EasyList Element Efficiency

Posted: Mon May 17, 2010 5:27 pm
by Erunno
I've attended the chat accompanying Mike Beltzner's presentation and one of Mozilla's developers confirmed that Electrolysis won't be ready for Firefox 4.

And many thanks for doing the performance tests for us, Wladimir. This is more than I actually expected when I issued my cry for help :D

Re: EasyList Element Efficiency

Posted: Mon May 17, 2010 6:06 pm
by Michael
I also didn't expect this level of detail after my (e-mailed) "cry for help". Thanks for your all your assistance with this issue, Wladimir.

Re: EasyList Element Efficiency

Posted: Tue May 25, 2010 7:14 am
by fanboy
Wladimir Palant wrote:I only copied the file contents as they are available on the web already - I'm not aware of any variant of your list without element hiding.
Now available with out elements,

http://www.fanboy.co.nz/adblock/fanboy-adblocklist.txt

Re: EasyList Element Efficiency

Posted: Tue May 25, 2010 7:28 am
by Wladimir Palant
Here we go, filter_fromText performance test again:

EasyList: 108.2 ms +/- 2.8 ms
EasyList without element hiding: 50.9 ms +/- 2.0 ms
Fanboy's List: 134.6 ms +/- 4.3 ms
Fanboy's List without element hiding: 62.8 ms +/- 2.2 ms

Note that these results can deviate from what I measured previously: not only is this a newer Firefox nightly build now, my laptop's performance also varies due to CPU stepping.

Re: EasyList Element Efficiency

Posted: Wed Aug 04, 2010 2:07 pm
by Wladimir Palant
I created a new performance test - page_load_overhead.html. This one takes local copies of five popular pages and measures the time required to load them. Normally it takes around 2 seconds on my computer. Adblock Plus adds around 100-150ms to that, mostly due to "recording" of all requests I guess. As to filters - the difference introduced by either EasyList or Fanboy's List (with nothing being blocked) is too small to be measured, in the area of 10ms or below. This means in particular that element hiding rules used in EasyList and Fanboy's List right now are good enough, usually they won't have any noteworthy performance impact (there might be pathological cases however, these five websites are not representative for the entire web).

Re: EasyList Element Efficiency

Posted: Wed Aug 04, 2010 3:25 pm
by Erunno
Are the restrictions still in effect that we should stay well below 30 "slow" general hiding filters, e.g. ##a[href^="http://www.liutilities.com/products/campaigns/adv/"]?

Re: EasyList Element Efficiency

Posted: Wed Aug 04, 2010 3:33 pm
by Wladimir Palant
I didn't set these restrictions because I don't know how the number of such filters influences performance. But given that EasyList performance right now is good this is probably a sane threshold.