Name: Yuyang
Subject: Questions about efficiency of filter list
Hey EasyList authors,
I'm Yuyang (from China). I'm a student at PolyU, mainly focusing on web performance and privacy. Recently I've been digging into ad blockers, and I found that most popular ad blockers use EasyList for filtering ads, so I'm curious about how EasyList works. However, after a lot of searching on Google and GitHub, I could find only a little information about it and still have no answers to my questions, so I want to reach out to you directly. Below are my questions:
1. How do you determine which rules should be included?
2. More and more rules are being added to the list; are any rules ever removed?
3. How do you evaluate the quality of rules? For example, do you apply EasyList to a sample of 10,000 websites at regular intervals and check the coverage and how many of the rules really work?
I'm really looking forward to your reply, many thanks!
-
- Contact Bot
- Posts: 0
- Joined: Fri Mar 12, 2021 1:18 am
1. How do you determine which rules should be included?
- Whether the filter is actually needed or not
- Knowledge of existing filters
- History of previously removed filters
- Whether the suggested filter can be improved or optimised
2. More and more rules are being added to the list; are any rules ever removed?
3. How do you evaluate the quality of rules? For example, do you apply EasyList to a sample of 10,000 websites at regular intervals and check the coverage and how many of the rules really work?
-
- Site Member
- Posts: 23
- Joined: Thu May 09, 2019 5:15 pm
A huge part of ad blocking performance is determined by how the ad blocking software applies the filters. For example, uBlock Origin (which grew out of "HTTP Switchboard") made its debut by dramatically reducing memory consumption in comparison to Adblock Plus. See https://github.com/gorhill/httpswitchbo ... onsumption
You might be interested in this 2020 ad blocker performance benchmark:
https://www.debugbear.com/blog/2020-chr ... d-blockers
@fanboy annoyance encountered after scrolling down.
-
- New Member
- Posts: 8
- Joined: Sun Oct 17, 2021 7:36 am
Has anyone tried to automate this process? Now I see many rules for undelegated domains and hosts that are down or have no web server running. It is quite easy to write a script that will find such rules and remove them. I can help with this work if you are interested.
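To make the idea concrete, here is a minimal sketch of the first half of such a script: pulling hostnames out of filter rules so they can be checked. It assumes only plain `||hostname` network rules matter and ignores everything else (comments, cosmetic rules, IP-based rules). The names `rule_hostname`, `rules_by_hostname` and `rules_for_dead_hosts` are made up for illustration, and `is_dead` is a placeholder for whatever liveness check is chosen. This is not how EasyList's own tooling works, just one possible shape.

```python
import re
from collections import defaultdict

# Assumption: only rules of the form "||hostname..." are of interest.
# The hostname must end in an alphabetic TLD, so IP-based rules are skipped.
_HOST_RULE = re.compile(r"^\|\|([a-z0-9.-]+\.[a-z]{2,})(?:[\^/$:]|$)", re.IGNORECASE)


def rule_hostname(rule):
    """Return the lowercased hostname a rule is anchored to, or None."""
    match = _HOST_RULE.match(rule.strip())
    return match.group(1).lower() if match else None


def rules_by_hostname(rules):
    """Group rules by hostname so each host is checked only once."""
    grouped = defaultdict(list)
    for rule in rules:
        host = rule_hostname(rule)
        if host is not None:
            grouped[host].append(rule)
    return dict(grouped)


def rules_for_dead_hosts(rules, is_dead):
    """Return the rules whose hostname is reported dead by is_dead(host)."""
    return [
        rule
        for host, host_rules in rules_by_hostname(rules).items()
        if is_dead(host)
        for rule in host_rules
    ]
```

The output would be a list of removal candidates for a human to review, not something to delete blindly, since a host can be down only temporarily.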
-
- New Member
- Posts: 8
- Joined: Sun Oct 17, 2021 7:36 am
fanboy wrote: Tue Jun 01, 2021 2:41 am
See https://github.com/easylist/easylist/is ... -821931112 We're currently around 360-370k since that graph was made.
Sorry, I missed that link. It's great if EasyList is cleaned up this way, but it doesn't seem to be enough. I considered different approaches and decided that relying on whois alone is not the best idea. It is better to check DNS records. What looks reliable enough to me is querying NS records and considering a domain dead if the reply is NXDOMAIN.
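A rough sketch of that NS check using only the Python standard library: it builds a DNS query for the NS record by hand, sends it over UDP, and reads the RCODE from the reply header (3 means NXDOMAIN). The resolver address `8.8.8.8`, the timeout, and the function names are my assumptions for illustration. It also assumes ASCII (punycode) domain names and does not retry or fall back to TCP.

```python
import secrets
import socket
import struct

NS_TYPE = 2    # DNS record type NS
IN_CLASS = 1   # DNS class IN
NXDOMAIN = 3   # RCODE 3, "Name Error"


def build_ns_query(domain, query_id):
    """Build a DNS query packet asking for the NS records of domain."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label
        for label in domain.rstrip(".").encode("ascii").split(b".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", NS_TYPE, IN_CLASS)


def response_rcode(response):
    """Extract the 4-bit RCODE from the flags field of a DNS reply."""
    (flags,) = struct.unpack("!H", response[2:4])
    return flags & 0x000F


def is_nxdomain(domain, resolver="8.8.8.8", timeout=3.0):
    """Return True if the resolver answers the NS query with NXDOMAIN."""
    query_id = secrets.randbits(16)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_ns_query(domain, query_id), (resolver, 53))
        response, _ = sock.recvfrom(512)
    if len(response) < 12 or response[:2] != struct.pack("!H", query_id):
        raise ValueError("malformed or mismatched DNS reply")
    return response_rcode(response) == NXDOMAIN
```

A timeout or SERVFAIL is deliberately not treated as "dead" here; only an explicit NXDOMAIN counts, which matches the criterion above.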
There are also IP addresses listed that need to be checked. The only idea I have is to try connecting to web servers on them. This should be repeated several times in case of failure to ensure that the server is permanently down.
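For the IP addresses, something along these lines could work. It is only a sketch: the ports (80 and 443), the number of attempts, and the delay are assumptions, and the `connect` parameter exists just so the function can be exercised without touching the network. In practice the attempts would probably be spread over days rather than seconds before calling anything permanently down.

```python
import socket
import time


def web_server_reachable(ip, ports=(80, 443), attempts=3, delay=60.0,
                         timeout=5.0, connect=socket.create_connection):
    """Return True as soon as a TCP connection to any port succeeds.

    Tries every port in each round and sleeps `delay` seconds between
    rounds. Returns False only if all attempts on all ports fail.
    """
    for attempt in range(attempts):
        for port in ports:
            try:
                with connect((ip, port), timeout=timeout):
                    return True
            except OSError:
                # Refused, unreachable or timed out: try the next port.
                continue
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Note that a successful TCP connection only shows that something is listening, not that it still serves ads, so a hit means "keep the rule for now" rather than "the rule is still useful".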
From https://github.com/easylist/easylist/tree/master/docs, an update on EasyList pruning from 2019-2021.