Redundancy checker

Posted: Tue Nov 15, 2011 6:50 pm
by Famlam
Hi everyone,

Just because I still had a few minutes of spare time in my schedule and I hate to be bored, I recently wrote a redundant AdBlock Plus rule checker.
I know there already is another redundancy check available, but that one has some limitations, like not being able to check hiding rules and no support for the $domain=* option. This checker does contain that functionality. Although support for hiding rules is limited (try ##a[href^="http://ad."] and ##a[href*="ad."] for example), it's sufficient in most cases. And, something that is very important if you just have a few minutes of spare time, it is faster than the other one in most cases :-) !
For more advanced users, there also is an option to ignore the options of blocking rules (except for the ones that have to be specified in order to have any effect, like $document and $popup), so you can find the redundancy of ||foo.com^/bar. and /bar.$domain=foo.com. But be aware, this option also causes a lot of false positives!

You can check it out here: https://arestwo.org/famlam/redundantRuleChecker.html (mirror http://abp.surge.sh/redundantRuleChecker/)

Kind regards,
Famlam

Re: Redundancy checker

Posted: Wed Nov 16, 2011 2:31 pm
by anonmouse
awesome work !

Re: Redundancy checker

Posted: Mon Dec 05, 2011 10:44 am
by Hubird
Seems to be an issue with your checker

eg:

Code:

-ad-banner. has been made redundant by -banner.$domain=coastvoip.com.au|nationalturk.com|totalscifionline.com|reviewlinux.com
The above statement is one of the many redundant rules reported for Adversity however it is not true.

Re: Redundancy checker

Posted: Mon Dec 05, 2011 12:11 pm
by Famlam
Hi, I think you enabled the option "Do not check if the options for whitelisting and blocking rules match", which strips all things behind $ (document, elemhide, donottrack, popup). I can't reproduce it with the given filters if I do not enable that option.
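
To illustrate why that option produces such a report, here is a minimal sketch in Python. It is not the checker's actual code: `strip_options` and `looks_redundant` are hypothetical helpers, and it assumes plain filters without wildcards, anchors or regular expressions, which behave like substring tests. Once everything after the `$` is dropped, any URL that contains `-ad-banner.` also contains `-banner.`, so the first rule is reported.

```python
def strip_options(rule: str) -> str:
    """Drop the option list (everything from the last '$' on), if any."""
    pattern, sep, _options = rule.rpartition("$")
    return pattern if sep else rule


def looks_redundant(specific: str, general: str) -> bool:
    """Naive check for plain substring filters with options ignored.

    If the general pattern is contained in the specific one, every URL
    matched by the specific pattern is matched by the general one too.
    """
    return strip_options(general) in strip_options(specific)
```

With the options kept, the second rule only applies on the listed domains, so the report is a false positive for every other site.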

Re: Redundancy checker

Posted: Tue Dec 06, 2011 12:40 am
by Hubird
Yes I did enable that option while trying to get the redundancy checker to properly parse $domain switches.

For example, if I add the following 2 rules (without the box ticked)

Code:

||example.com/ads/
/ads/*$domain=example.com
One rule should be marked as redundant, but this is not the case.

Re: Redundancy checker

Posted: Tue Dec 06, 2011 12:50 am
by Famlam
But they aren't redundant:
||example.com/ads/ can also match on foo.com (if used as a third party resource). /ads/*$domain=example.com however, won't ever match on foo.com. Therefore they aren't redundant.
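
The difference can be sketched with two heavily simplified matchers. This is an illustration under my own assumptions, not how Adblock Plus or the checker implements matching: `rule_a`, `rule_b` and `host_matches` are hypothetical names, and the page is represented only by its host name.

```python
from urllib.parse import urlsplit


def host_matches(host: str, domain: str) -> bool:
    """True if host equals domain or is a subdomain of it."""
    return host == domain or host.endswith("." + domain)


def rule_a(request_url: str, page_host: str) -> bool:
    """Simplified ||example.com/ads/ : only the request URL matters."""
    parts = urlsplit(request_url)
    return (host_matches(parts.hostname or "", "example.com")
            and parts.path.startswith("/ads/"))


def rule_b(request_url: str, page_host: str) -> bool:
    """Simplified /ads/*$domain=example.com : the page must be on example.com."""
    return "/ads/" in request_url and host_matches(page_host, "example.com")
```

A request for http://example.com/ads/banner.png embedded on foo.com is matched by `rule_a` only, while a request for http://cdn.other.net/ads/x.js on a page of example.com is matched by `rule_b` only. Each rule blocks something the other does not, so neither makes the other redundant.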

Re: Redundancy checker

Posted: Tue Dec 06, 2011 6:32 am
by Hubird
Famlam wrote:But they aren't redundant:
||example.com/ads/ can also match on foo.com (if used as a third party resource). /ads/*$domain=example.com however, won't ever match on foo.com. Therefore they aren't redundant.
:oops: True... Sorry about the false alarm.

Thanks

Re: Redundancy checker

Posted: Sun Mar 04, 2012 6:30 pm
by Famlam
Just to mention it, since this update is somewhat bigger than any other since the initial 'official' release: the hiding rules check has been improved a lot today :)

The total changelog of today would be:
Hiding rules:
- sequence independent in the last part of the tree selector (###a.b is now redundant with ##.b#a, but ##.b#a > c and ###a.b > c are not (yet)).
- FOO.COM##* and foo.com##* are now redundant
- ###ads and ##[id="ads"] and ##[id] and and are now redundant (except when in :not(...) selector)
- ##.ads and ##[class="ads"] and ##[class] and and are now redundant (except when in :not(...) selector)
- attribute selectors: ##[x=A], ##[x='A'], ##[x="A"], [x] are redundant (except when in :not(...) selector)
- attribute selectors: ##[x] and ##[x(* or ~ or | or ^ or $)="something"] are redundant (except when in :not(...) selector)
- ##div and ##DIV are now redundant (except when in :not(...) selector)
- the attributes in attribute selectors are now redundant (##[border="a"] and ##[BoRdEr="a"]) (except when in :not(...) selector)
- ##* makes, well, everything redundant
- speed improvements (however, sequence independent checking costs more time, so actually it'll be a little bit slower)

Blocking rules:
- $domain=abc,image and $dOmAiN=aBc,ImAgE are redundant
- FIXED: a line with nothing but tabs did make every resource redundant (hopefully no-one followed that suggestion :D)
- FIXED: ||foo.com and ^foo.com were reported as redundant
- small speed improvements
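
As an illustration of the first and the ##div / ##DIV items of the hiding rules list, a sequence-independent comparison could be sketched like this. It is only a sketch under stated assumptions, not the checker's code: `canonical_compound` is a hypothetical helper that handles a single compound selector made of an optional tag name, ids and classes (the part after ##), and rejects anything else.

```python
import re

# One simple part: a tag name, "#id" or ".class".
_PART = re.compile(r"[#.]?[A-Za-z_][\w-]*")


def canonical_compound(selector: str) -> tuple:
    """Canonical form of a compound selector made of a tag, ids and classes.

    Assumption: no attribute selectors, pseudo-classes or combinators.
    """
    parts = _PART.findall(selector)
    if "".join(parts) != selector:
        raise ValueError(f"unsupported selector: {selector!r}")
    tag = ""
    rest = []
    for part in parts:
        if part[0] in "#.":
            rest.append(part)      # ids and classes are left untouched
        else:
            tag = part.lower()     # tag names compare case-insensitively
    return (tag, tuple(sorted(rest)))
```

Two selectors that yield the same tuple, such as `#a.b` and `.b#a`, select the same elements. A selector with a combinator, such as `.b#a > c`, is rejected by this sketch.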

Re: Redundancy checker

Posted: Thu Jan 17, 2013 9:01 am
by Crits
What would be cool to have:
If we only input example.com#@##foobar, the script would tell us that this filter is useless because ###foobar doesn't exist anymore.

It would be interesting for supplementary subscriptions so as to detect obsolete exception hiding rules, if the hiding rule in question has been deleted from EasyList (and it's likely to happen as the deleted hiding rules are the ones that were causing many false-positives).

Re: Redundancy checker

Posted: Thu Jan 17, 2013 11:50 am
by Famlam
True. I'll however not implement this by default, because the tool should also work when either
1. a supplemental list only wants to find its own issues
2. a non-supplemental list may have a couple of exceptions for other filter lists that are commonly used
3. a user has #@# rules in their custom filters and they only check those (although this won't be often, especially as ABP doesn't allow creating hiding exception rules via a wizard)
4. you only check a part of the list
and I don't want to risk that less experienced people who use it remove filters that shouldn't be removed, only because my tool says so. (The tool should find redundancies based upon the syntax, but not based upon the absence of filters.)
I'm however thinking of implementing a new 'Tools' tab in the future, which could contain a few of such features. (Including a request from MonztA recently, to report rules with the same rule but a different domain, like abcdef#@##ads and ghijkl#@##ads)

Re: Redundancy checker

Posted: Thu Jan 31, 2013 12:28 am
by Famlam
Hi everyone,
My supposed-to-be-absence ended 1.5 days earlier than I initially expected, so I've spent a couple of hours implementing the suggestions.
@Crits and @MonztA (the latter via PM): your suggestions have been implemented.
This is the complete changelog, for those who are interested:
- Add several tools:
  • A similar rules finder tool
  • Tools to ignore domains and blocking options (this replaces the checkbox)
  • A tool to use a less strict matching algorithm
  • A tool to find rules which have the same rule, but a different domain (requested by MonztA)
  • A tool to find the rules that make whitelisting rules necessary, which also displays if no rules could be found for a whitelisting rule (requested by Crits)
Note: those tools try their best, but certainly aren't perfect. So don't rely on their results.

- Fixes
  • '//' isn't a regex
  • ' !x' (with whitespace in front of it) is a comment, not a blocking rule
  • '*!x' and similar will no longer trigger 'unnecessary preceeding wildcard found' warnings, since it would become a comment without *
  • '/\|\|x/' and similar regex rules no longer make ||x redundant
- Added/Modified
  • deprecate $donottrack, since it's removed from ABP too
  • ||x will now also be matched if both /x and .x are present
  • A lot of internal code changes certainly worth mentioning, but you wouldn't notice the difference anyway, so I'm too lazy to explain them
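
For readers wondering what the whitelisting-rule tool requested by Crits amounts to, here is a rough sketch for the hiding exception case. It is my own simplification, not the tool's implementation: `unmatched_exceptions` is a hypothetical function that compares selectors as exact strings and ignores domains entirely.

```python
def unmatched_exceptions(rules):
    """Return '#@#' rules whose selector is not hidden by any '##' rule given.

    Assumption: selectors are compared as exact strings and domains are
    ignored; a real check would have to be considerably smarter.
    """
    hidden = set()
    exceptions = []
    for rule in rules:
        if rule.startswith("!"):
            continue                      # comment line
        if "#@#" in rule:
            exceptions.append(rule)
        elif "##" in rule:
            hidden.add(rule.split("##", 1)[1])
    return [r for r in exceptions if r.split("#@#", 1)[1] not in hidden]
```

If only example.com#@##foobar is given as input, it is reported. If ###foobar is part of the input as well, nothing is reported. The result therefore depends on checking the complete set of lists a user has installed.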

Re: Redundancy checker

Posted: Thu Jan 31, 2013 9:17 am
by Crits
Awesome, thanks!

Re: Redundancy checker

Posted: Thu Jan 31, 2013 10:49 am
by MonztA
Thanks!

Re: Redundancy checker

Posted: Sun Jan 19, 2014 10:20 pm
by Famlam
Hi all, I'm glad to announce that I got stuck in a train that didn't continue to its destination a couple of times in a time span of a few weeks and thus found some time to update the code of the redundancy checker such that it is about 30 percent faster now. Enjoy ;)

Re: Redundancy checker

Posted: Thu May 15, 2014 1:40 pm
by gymka
filter list:

Code:

site1.lt##.banner
site2.lt##.banner
actual result: none, no redundancies/optimizations found
expected result: write rule "##.banner" and you'll get same result as with "site1.lt##.banner site2.lt##.banner"

Or am I wrong, and is it faster to have a few lines for the same item?

Re: Redundancy checker

Posted: Thu May 15, 2014 4:45 pm
by Famlam
It's not a redundancy, therefore it's not listed immediately. However, there is a tool to do so. Check "Tools" > "Search for equal rules for which only the domain differs"
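
What such a search boils down to can be sketched as follows. This is an illustration only, not the tool's code; `group_by_selector` is a hypothetical name and the sketch only looks at hiding rules.

```python
from collections import defaultdict


def group_by_selector(rules):
    """Map each selector to the domains of the '##' rules that use it."""
    groups = defaultdict(list)
    for rule in rules:
        if rule.startswith("!") or "#@#" in rule or "##" not in rule:
            continue  # skip comments, exception rules and blocking rules
        domains, selector = rule.split("##", 1)
        groups[selector].extend(d for d in domains.split(",") if d)
    # Keep only selectors that are used for more than one domain.
    return {sel: doms for sel, doms in groups.items() if len(doms) > 1}
```

For the two rules above this returns the selector .banner with the domains site1.lt and site2.lt, which could be merged into site1.lt,site2.lt##.banner. The generic ##.banner is a different rule: it would hide .banner on every site, not only on those two.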

Re: Redundancy checker

Posted: Sun Jul 06, 2014 9:15 am
by ZolPar2
Today I found some unoptimized rules; please fix them:

Code:

@@||jjcast.com^$elemhide and footstream.tv,jjcast.com,leton.tv##div[id^="timer"] : They are redundant for at least domain 'jjcast.com'
@@||imagebam.com/image/$popup! Last modified: 06 Jul 2014 08:20 UTC : Unnecessary whitespace character(s) found

Re: Redundancy checker

Posted: Sun Aug 17, 2014 3:34 am
by vier986
It's so awesome, thanks! :banana: :banana:

Re: Redundancy checker

Posted: Fri Jan 16, 2015 6:29 pm
by monnawynter
So the checker on Adblock's site is inferior to this one? Out of curiosity I checked it and it gave some results like
'||webstats.sapo.pt^' has been made redundant by '.webstats.'
'||c.bigmir.net^' has been made redundant by '||bigmir.net^$third-party'
while this one showed nothing. Are these false positives?

Re: Redundancy checker

Posted: Fri Jan 16, 2015 7:53 pm
by Famlam
monnawynter wrote:So the checker on Adblock's site is inferior to this one? Out of curiosity I checked it and it gave some results like
'||webstats.sapo.pt^' has been made redundant by '.webstats.'
'||c.bigmir.net^' has been made redundant by '||bigmir.net^$third-party'
while this one showed nothing. Are these false positives?
They are false positives indeed:
The first one is not redundant because '.webstats.' will not match 'http://webstats.sapo.pt/' (and the ||webstats.sapo.pt^ one will).
The second one is not redundant because ||c.bigmir.net^ will also work on a first-party basis, while the other filter will only match on third-party domains.
I would advise using this redundancy checker over the one on adblockplus.org, because the one on adblockplus.org has quite some issues.
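
The first case is easy to verify by hand. The sketch below uses two simplified matchers of my own (`matches_plain` and `matches_domain_anchor` are hypothetical names, not Adblock Plus functions): a plain filter is treated as a substring test, and ||domain^ as a test on the host name of the URL.

```python
from urllib.parse import urlsplit


def matches_plain(pattern: str, url: str) -> bool:
    """A plain filter without anchors or wildcards is a substring test."""
    return pattern in url


def matches_domain_anchor(domain: str, url: str) -> bool:
    """Simplified ||domain^ : the URL's host is the domain or a subdomain."""
    host = urlsplit(url).hostname or ""
    return host == domain or host.endswith("." + domain)
```

In http://webstats.sapo.pt/ the host is preceded by a slash rather than a dot, so `matches_plain(".webstats.", ...)` fails while `matches_domain_anchor("webstats.sapo.pt", ...)` succeeds.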

Re: Redundancy checker

Posted: Wed Feb 18, 2015 11:48 pm
by alexz
I just ran a check on EasyList without the rules for adult sites, and the result is:
Finished (after 3125 seconds)! 13 redundant rules found!
discovery.com##.banner-video has been made redundant by @@||discovery.com^$elemhide
mangabird.com#@#.ad468 has been made redundant by @@||mangabird.com^$elemhide
locatetv.com##.adBlock has been made redundant by ##.adBlock
search.yahoo.com###ysch #doc #bd #results #cols #left #main .ads .left-ad has been made redundant by ##.left-ad
kclu.org,notjustok.com##.widget_openxwpwidget has been made redundant by ##.widget_openxwpwidget
vodly.to,vodly.unblocked2.co##a[href^="http://ads.integral-marketing.com/"] has been made redundant by ##a[href^="http://ads.integral-marketing.com/"]
search.yahoo.com###doc #cols #right #east has been made redundant by search.disconnect.me,search.yahoo.com###east
search.yahoo.com###ysch #doc #bd #results #cols #right #east .ads has been made redundant by search.disconnect.me,search.yahoo.com###east
search.yahoo.com###ysch #doc #bd #results #cols #left #main .ads .more-sponsors has been made redundant by yahoo.com##.more-sponsors
search.yahoo.com###ysch #doc #bd #results #cols #left #main .ads .spns has been made redundant by yahoo.com##.spns
search.yahoo.com###ysch #doc #bd #results #cols #left #main .ads has been made redundant by 1sale.com,7billionworld.com,abajournal.com,altavista.com,androidfilehost.com,arcadeprehacks.com,asbarez.com,birdforum.net,coinad.com,cuzoogle.com,cyclingweekly.co.uk,disconnect.me,domainnamenews.com,eco-business.com,energylivenews.com,facemoods.com,fcall.in,flashx.tv,foxbusiness.com,foxnews.com,freetvall.com,friendster.com,fstoppers.com,ftadviser.com,furaffinity.net,gentoo.org,gmanetwork.com,govtrack.us,gramfeed.com,gyazo.com,hispanicbusiness.com,html5test.com,hurricanevanessa.com,i-dressup.com,iheart.com,ilovetypography.com,isearch.whitesmoke.com,itar-tass.com,itproportal.com,kingdomrush.net,laptopmag.com,laweekly.com,lfpress.com,livetvcafe.net,lovemyanime.net,malaysiakini.com,manga-download.org,maps.google.com,marinetraffic.com,mb.com.ph,meaningtattos.tk,mmajunkie.com,movies-online-free.net,mugshots.com,myfitnesspal.com,mypaper.sg,nbcnews.com,news.nom.co,nsfwyoutube.com,nugget.ca,panorama.am,pastie.org,phpbb.com,playboy.com,pocket-lint.com,pokernews.com,previously.tv,radiobroadcaster.org,reason.com,ryanseacrest.com,savevideo.me,sddt.com,searchfunmoods.com,sgcarmart.com,shopbot.ca,sourceforge.net,tcm.com,tech2.com,thecambodiaherald.com,thedailyobserver.ca,thejakartapost.com,thelakewoodscoop.com,themalaysianinsider.com,theobserver.ca,thepeterboroughexaminer.com,theyeshivaworld.com,tiberium-alliances.com,tjpnews.com,today.com,tubeserv.com,turner.com,twogag.com,ultimate-guitar.com,wallpaper.com,washingtonpost.com,wdet.org,wftlsports.com,womanandhome.com,wtvz.net,yahoo.com,youthedesigner.com,yuku.com##.ads
||wikifeet.com/mgid.html has been made redundant by /mgid.html
||pitchfork.com^*/ads.css has been made redundant by /ads.css
There's also 1 warning:
The following error, warning or optimalization was encountered while checking the rules:
@@||discovery.com^$elemhide and discovery.com,freemake.com###top-advertising : They are redundant for at least domain 'discovery.com'

Re: Redundancy checker

Posted: Mon Mar 09, 2015 9:35 pm
by Famlam
For those of you who are interested: I released an update for the redundancy checker. A newly included tool allows one to check whether domains are live, dead or redirected. Unfortunately such checks do not depend on the speed of your computer, but rather on the response times of servers. Therefore it may take a long while to process all domains. Fortunately you can export intermediate results, and by pasting them as if they were a new filter list, you can resume the check at a later point.
At this very moment the tool only works on Chrome. Opera will follow once I receive confirmation that the server can handle the necessary operations. Other browsers are unlikely to follow soon. The cause of this is that web pages themselves are not allowed to perform the necessary checks, so a browser extension is needed to aid in this process (and I'm not familiar with the Firefox APIs).
Example output (for EasyList Dutch) can be viewed here: http://pastebin.com/v1LckS5L . One should take care when interpreting the results:
  1. (resources on) subdomains may exist, even though the main domain is dead. To be sure that no resources exist on a domain, one could Google for site:thedomainyouwantto.check
  2. in the case of blocking rules where the checked domain is the domain between the || and the ^ or / (e.g. xx.xx in ||xx.xx/url$domain=yy.yy) even dead or redirected resources may still consume space on the original website. The only way to check this is by visiting the URL for which the filter was originally added.
Have fun!
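
To make the second point more concrete: a blocking rule can name two kinds of domains, the host right after the || and the entries of its $domain= option. The sketch below shows one way to pull them apart. It is an illustration under my own assumptions, not the tool's code, and `domains_in_blocking_rule` is a hypothetical helper that ignores regex rules.

```python
import re

# Host between a leading "||" (optionally after "@@") and the first
# "/", "^", "$" or "*".
_ANCHORED_HOST = re.compile(r"^(?:@@)?\|\|([^/^$*]+)")


def domains_in_blocking_rule(rule: str):
    """Return (anchored host or None, list of $domain= entries)."""
    m = _ANCHORED_HOST.match(rule)
    host = m.group(1) if m else None
    domains = []
    _, sep, options = rule.rpartition("$")
    if sep:
        for option in options.split(","):
            if option.lower().startswith("domain="):
                # Entries are separated by '|'; a leading '~' negates one.
                domains = [d.lstrip("~") for d in option[7:].split("|") if d]
    return host, domains
```

For ||xx.xx/url$domain=yy.yy this yields xx.xx as the anchored host and yy.yy as the only $domain= entry.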

Re: Redundancy checker

Posted: Tue Mar 10, 2015 5:24 pm
by JordanElliott
pastebin.com/xTcNmupS

Re: Redundancy checker

Posted: Thu Apr 23, 2015 11:11 am
by Lain_13
Hi, the following duplicates are still in the list:

search.yahoo.com###ysch #doc #bd #results #cols #left #main .ads .left-ad has been made redundant by ##.left-ad
search.yahoo.com###cols > #left > #main > ol > li[id^="yui_"] has been made redundant by search.yahoo.com###main > ol li[id^="yui_"]
search.yahoo.com###doc #cols #right #east has been made redundant by search.disconnect.me,search.yahoo.com###east
search.yahoo.com###ysch #doc #bd #results #cols #right #east .ads has been made redundant by search.disconnect.me,search.yahoo.com###east
search.yahoo.com###left > #main > div[id^="yui_"][class] > ul[class] > li[class] has been made redundant by search.yahoo.com###left > #main > div[id^="yui_"]
search.yahoo.com###left > #main > div[id^="yui_"][class]:first-child > div[class]:last-child has been made redundant by search.yahoo.com###left > #main > div[id^="yui_"]
search.yahoo.com###right .first > div[style="background-color:#fafaff;border-color:#FAFAFF;padding:4px 10px 12px;"] has been made redundant by search.yahoo.com###right div[style="background-color:#fafaff;border-color:#FAFAFF;padding:4px 10px 12px;"]
search.yahoo.com###right ol li[id^="yui_"] > .dd > .layoutMiddle has been made redundant by search.yahoo.com###right li[id^="yui_"] .dd > .layoutMiddle
search.yahoo.com###ysch #doc #bd #results #cols #left #main .ads .more-sponsors has been made redundant by yahoo.com##.more-sponsors
search.yahoo.com###ysch #doc #bd #results #cols #left #main .ads .spns has been made redundant by yahoo.com##.spns
search.yahoo.com###ysch #doc #bd #results #cols #left #main .ads has been made redundant by 1sale.com,7billionworld.com,abajournal.com,altavista.com,androidfilehost.com,arcadeprehacks.com,asbarez.com,birdforum.net,coinad.com,cuzoogle.com,cyclingweekly.co.uk,disconnect.me,domainnamenews.com,eco-business.com,energylivenews.com,facemoods.com,fcall.in,flashx.tv,foxbusiness.com,foxnews.com,freetvall.com,friendster.com,fstoppers.com,ftadviser.com,furaffinity.net,gentoo.org,gmanetwork.com,govtrack.us,gramfeed.com,gyazo.com,hispanicbusiness.com,html5test.com,hurricanevanessa.com,i-dressup.com,iheart.com,ilovetypography.com,irennews.org,isearch.whitesmoke.com,itar-tass.com,itproportal.com,kingdomrush.net,laptopmag.com,laweekly.com,lfpress.com,livetvcafe.net,lovemyanime.net,malaysiakini.com,manga-download.org,maps.google.com,marinetraffic.com,mb.com.ph,meaningtattos.tk,mmajunkie.com,movies-online-free.net,mugshots.com,myfitnesspal.com,mypaper.sg,nbcnews.com,news.nom.co,nsfwyoutube.com,nugget.ca,panorama.am,pastie.org,phpbb.com,playboy.com,pocket-lint.com,pokernews.com,previously.tv,radiobroadcaster.org,reason.com,ryanseacrest.com,savevideo.me,sddt.com,searchfunmoods.com,sgcarmart.com,shopbot.ca,sourceforge.net,tcm.com,tech2.com,thecambodiaherald.com,thedailyobserver.ca,thejakartapost.com,thelakewoodscoop.com,themalaysianinsider.com,theobserver.ca,thepeterboroughexaminer.com,theyeshivaworld.com,tiberium-alliances.com,tjpnews.com,today.com,tubeserv.com,turner.com,twogag.com,ultimate-guitar.com,wallpaper.com,washingtonpost.com,wdet.org,wftlsports.com,womanandhome.com,wtvz.net,yahoo.com,youthedesigner.com,yuku.com##.ads
groups.yahoo.com##.yg-mbad-row > * has been made redundant by groups.yahoo.com##.yg-mbad-row


Additionally this filter:
||topbinaryaffiliates.ck-cdn.com^$third-party
Could be replaced with this:
||ck-cdn.com^$third-party

Re: Redundancy checker

Posted: Tue May 19, 2015 9:22 pm
by barbaz
https://arestwo.org/famlam/changelog.html wrote: Fix: whitelist rules that imply $document were made redundant by rules that do not imply $document if no options were present (@@|http:// was made redundant by @@http)
Is the behavior that was fixed now correct behavior in light of https://hg.adblockplus.org/adblockplus/rev/cc3f3887226a?

Re: Redundancy checker

Posted: Wed May 20, 2015 6:29 am
by Famlam
I'll have to check it a bit more carefully, but at first sight it seems you're right. Thanks for telling me! I'll fix it later this week!

Re: Redundancy checker

Posted: Thu Jul 30, 2015 2:52 am
by beast
My Firefox runs slowly, so I just want URL filters, not element hiding rules.

Could you add a new option to the "Redundancy checker" that cuts away the element hiding rules?

Re: Redundancy checker

Posted: Thu Jul 30, 2015 2:18 pm
by Famlam
That's not the purpose of the redundancy checker ;).
A much faster method is to install a text editor that allows you to mark lines containing a specific character ("#" in this case), then remove all marked lines (Example: Notepad++).
For just plain EasyList, there is a filter list which does not contain hiding filters: https://easylist-downloads.adblockplus. ... emhide.txt
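
For those without such an editor, the same idea takes a few lines of Python. This is a crude sketch of the approach described above, with a hypothetical function name: it drops every line that contains a "#", which also removes any comment or blocking rule that happens to contain that character.

```python
def drop_hiding_rules(lines):
    """Keep only the lines that contain no '#' character."""
    return [line for line in lines if "#" not in line]
```

Applied to ||example.com/ads/, example.com##.banner and example.com#@#.ad468, only the first line remains.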

Re: Redundancy checker

Posted: Sat Dec 12, 2015 2:09 pm
by OnlyHereForTheBeer
459 redundancies in the current Fanboy’s Ultimate list.
459 redundant rules found!

Re: Redundancy checker

Posted: Sat Dec 12, 2015 2:18 pm
by monnawynter
The 12-12-15 Fanboy’s Ultimate list contains 459 redundancies.
Finished (after 1227 seconds)! 459 redundant rules found!
[Adblock Plus 2.0]
! Checksum: 6ABuG+WEF7Dozj8VngKjQA
! Title: Fanboy+Easylist-Merged Ultimate List
! Version: 201512121200
! Last modified: 12 Dec 2015 12:00 UTC