(Adware filters) invalidated filters - bug

General information, announcements and questions about the EasyList subscriptions.
Locked
barbaz
Postaholic
Posts: 204
Joined: Mon Sep 15, 2014 12:55 am

Post by barbaz »

Code:

||*-a.akamaihd.net/a24/gsd.html?$third-party
||*-a.akamaihd.net/a24/scripts/gsd.js?$third-party

The leading domain anchors in those filters are unnecessary and are causing uBlock Origin to reject the filters. Please fix, thanks :-)
smed79
Liste AR/FR Author
Posts: 15839
Joined: Sun Jan 17, 2010 4:00 am
Location: EasyList Forum

Post by smed79 »

WarFame
Postaholic
Posts: 713
Joined: Fri Nov 27, 2015 1:35 pm

Post by WarFame »

gorhill
uBlock Origin Author
Posts: 230
Joined: Mon Aug 18, 2014 3:17 pm

Post by gorhill »

WarFame wrote: https://github.com/gorhill/uBlock/issues/1669#issuecomment-224822448
Yes, the problem was in uBO, not EasyList. Fixed with 72fdce64f0cd9ee763032e9bb340658086ffd987.
barbaz
Postaholic
Posts: 204
Joined: Mon Sep 15, 2014 12:55 am

Post by barbaz »

Thanks gorhill for the fast fix.

But I'm confused about the filter syntax now. So are these two filters not actually equivalent? How is the wildcard in the first different from the implied wildcard on the left of the second?

Code:

||*www.example.com/foofoofoo
www.example.com/foofoofoo

From reading https://adblockplus.org/filters, I can't tell when wildcards are "special" and when they're just wildcards.
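
One way to compare the two filters above is to translate each into a regular expression and try both on sample URLs. The sketch below is a simplified, assumed translation: the helper `filter_to_regex` and the regex chosen for `||` are illustrative and are not taken from Adblock Plus or uBlock Origin. It shows one way the syntax can be read, not how any blocker implements it.

```python
import re


def filter_to_regex(filter_text: str) -> re.Pattern:
    """Translate a blocking filter into a regex (simplified, assumed reading)."""
    anchored = filter_text.startswith("||")
    body = filter_text[2:] if anchored else filter_text
    # Escape every special character, then turn the escaped '*' back into
    # "match any run of characters".
    pattern = re.escape(body).replace(r"\*", ".*")
    if anchored:
        # Assumed meaning of '||': a scheme, then optionally some subdomain
        # labels ending in a dot, before the rest of the filter.
        pattern = r"^[\w\-]+:/+(?:[^/]+\.)?" + pattern
    return re.compile(pattern)


if __name__ == "__main__":
    plain = filter_to_regex("www.example.com/foofoofoo")
    anchored = filter_to_regex("||*www.example.com/foofoofoo")
    for url in ("http://www.example.com/foofoofoo",
                "http://notwww.example.com/foofoofoo",
                "http://example.org/"):
        # search() leaves both sides unanchored, i.e. implied wildcards.
        print(url, bool(plain.search(url)), bool(anchored.search(url)))
```

Under this reading the two filters agree on these sample URLs. That says nothing about how a particular blocker stores or evaluates them, which is where the difference discussed in this thread comes from.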
gorhill
uBlock Origin Author
Posts: 230
Joined: Mon Aug 18, 2014 3:17 pm

Post by gorhill »

barbaz wrote: How is the wildcard in the first different from the implied wildcard on the left of the second?
https://github.com/gorhill/uBlock/issue ... -164430835
barbaz
Postaholic
Posts: 204
Joined: Mon Sep 15, 2014 12:55 am

Post by barbaz »

OK, so in uBlock Origin, because the filter has no explicit left wildcard and its leftmost character is in

Code:

[0-9A-Za-z]

the character to its left in the URL (if any) would have to match this regex:

Code:

[^0-9A-Za-z]

in order for the filter to match. (Seems like I've got a lot of filter reworking to do...)
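
The rule described above can be sketched as a small matcher and compared with a plain substring search. This is only an illustration of the behavior as stated in this post; the function names `substring_match` and `left_boundary_match` are hypothetical and the code is not uBlock Origin's implementation.

```python
import re

# Same character class as in the post above.
ALNUM = re.compile(r"[0-9A-Za-z]")


def substring_match(url: str, pattern: str) -> bool:
    """Implied wildcards on both sides: a plain substring search."""
    return pattern in url


def left_boundary_match(url: str, pattern: str) -> bool:
    """Match only where the pattern starts on a word boundary in the URL."""
    if not pattern or not ALNUM.fullmatch(pattern[0]):
        # The boundary rule only applies when the pattern starts with [0-9A-Za-z].
        return pattern in url
    start = url.find(pattern)
    while start != -1:
        # Accept if there is no character to the left, or it is not alphanumeric.
        if start == 0 or not ALNUM.fullmatch(url[start - 1]):
            return True
        start = url.find(pattern, start + 1)
    return False


if __name__ == "__main__":
    pattern = "www.example.com/foofoofoo"
    for url in ("http://www.example.com/foofoofoo",
                "http://notwww.example.com/foofoofoo"):
        print(url, substring_match(url, pattern), left_boundary_match(url, pattern))
```

The second URL is the interesting case: the substring search finds the pattern, while the boundary-aware version rejects it because the character before `www` is a letter.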

What if the leading wildcard is explicit?

Code:

||*www.example.com/foofoofoo
*www.example.com/foofoofoo
gorhill
uBlock Origin Author
Posts: 230
Joined: Mon Aug 18, 2014 3:17 pm

Post by gorhill »

barbaz wrote: What if the leading wildcard is explicit?

Code:

||*www.example.com/foofoofoo
*www.example.com/foofoofoo

Then `www` won't be used as the token to store the filter; another token will be picked. It's just how uBO stores its filters internally.
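
To make the idea of "a token to store the filter" concrete, here is a hypothetical sketch of a token picker. The thread does not show uBO's actual selection logic, so the rule used here (skip any alphanumeric run that touches a `*`, because the wildcard could extend it) and the name `pick_token` are assumptions made purely for illustration.

```python
import re
from typing import Optional

# A token is a run of alphanumeric characters (assumed definition).
TOKEN = re.compile(r"[0-9A-Za-z]+")


def pick_token(filter_text: str) -> Optional[str]:
    """Return the first alphanumeric run that does not touch a '*' wildcard."""
    body = filter_text[2:] if filter_text.startswith("||") else filter_text
    for match in TOKEN.finditer(body):
        before = body[match.start() - 1] if match.start() > 0 else ""
        after = body[match.end()] if match.end() < len(body) else ""
        # Assumption: a run next to '*' may be only part of a longer word in
        # the URL, so it cannot serve as a lookup key.
        if before == "*" or after == "*":
            continue
        return match.group()
    return None


if __name__ == "__main__":
    for f in ("www.example.com/foofoofoo",
              "*www.example.com/foofoofoo",
              "||*www.example.com/foofoofoo"):
        print(f, "->", pick_token(f))
```

In this sketch the filter without the wildcard is keyed on `www`, while both wildcard variants fall through to `example`, which mirrors the statement that another token gets picked.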

Why the "seems like I've got a lot of filter reworking to do..."? Among the tens of thousands of filters in EasyList/EasyPrivacy, there is not a single one anchored with `||` which is not on a word boundary at the `||` position (nor in RU AdList, nor in EasyList Germany). I don't get why the worry. The case here merely exposed a *code path* issue in uBO for the filters in the OP, not a profound difference in the result set of applying hostname-anchored filters in general.
barbaz
Postaholic
Posts: 204
Joined: Mon Sep 15, 2014 12:55 am

Post by barbaz »

gorhill wrote: Then `www` won't be used as a token to store the filter, another token will be picked. It's just how uBO stores its filters internally.
I mean, are the filters equivalent in terms of what they block? lewisje seems to think not.
gorhill wrote: Why the "seems like I've got a lot of filter reworking to do..."?
It's about my custom lists, and it's not specifically about the domain-anchor case. First off, uBlock Origin's use of word boundaries means my lists can surely get significantly shorter and more effective. There were times I added 3 or 4 filters where, if I had known about this behavior, only one (better) filter would have been needed.

I also had a number of custom filters that depended on the implied wildcarding behavior of ABP. I've already gone through and added wildcards to the ones that I don't think I intended to match only at word boundaries.