automate filter list generation
A script could crawl the web and search for known ad content, the same way spam filters scan for known spam mails. If ad content is found and the URL/domain is not in the list yet, a new filter rule would be added. If a script like that existed, an adblocker could have a simple report button that starts a scan of the reported site. So as long as new ad networks or locally hosted ads don't exclusively show content that has never been seen anywhere else, they can be blocked without any manual maintainer interaction.
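The crawl-and-match loop described above could be sketched roughly like this, assuming a hypothetical database of known ad creatives (stored here as plain content hashes) and Adblock-Plus-style blocking rules. Every name, URL, and the example creative are made up for illustration:

```python
# Sketch of the scan step: compare a fetched resource against known ad
# creatives; if it matches and its domain isn't covered yet, emit a rule.
import hashlib

def content_hash(data):
    return hashlib.sha256(data).hexdigest()

# Hypothetical database of ad creatives the crawler has already seen.
KNOWN_AD_HASHES = {content_hash(b"pink kangaroo selling toilet paper")}

# Filter rules already present in the list.
existing_rules = {"||ads.example.com^"}

def scan_resource(url, body, rules):
    """If a fetched resource matches a known ad creative and its domain
    is not covered yet, add and return a new blocking rule, else None."""
    if content_hash(body) not in KNOWN_AD_HASHES:
        return None  # not a known ad; a human would have to review it
    domain = url.split("/")[2]
    rule = "||%s^" % domain  # ABP-style syntax: block the whole domain
    if rule in rules:
        return None
    rules.add(rule)
    return rule

# A brand-new ad network serving a creative already seen elsewhere:
new_rule = scan_resource(
    "https://ads.newnetwork.example/banner.png",
    b"pink kangaroo selling toilet paper",
    existing_rules,
)
print(new_rule)  # ||ads.newnetwork.example^
```

The report button would just feed the reported site's resources through `scan_resource` one by one.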
Wouldn't that be unmanageable? How could a script distinguish between legitimate content and advertising? You'd still need someone to compile and check the lists, and someone else to check and correct false positives...
Not to mention you'll likely get blocked by many sites for wasting their bandwidth. Unless you can pretend you're Google.
Not a very scalable solution. It will only work for a short period of time.
"If it ain't broke don't fix it."
>How could a script distinguish between legit content and advertising?
If there is a pink-kangaroo-selling-toilet-paper image in the database, then the script will be able to identify it on a different ad network or website.
A manual filter would still be better in terms of speed, element hiding, and correctness, but manual labor doesn't scale very well. Imagine what would happen if the number of ad network domains exploded the way spam mail volume exploded some years back. And it's not an either/or choice: the automatic list could be used as a source for the manual one.
>you'll likely get blocked by many sites for wasting their bandwidth.
Checking a site once in a while doesn't get you blocked. You don't have to crawl every page on a site like Google does.
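Recognizing "the same ad on a different network" has to survive re-encoding and resizing, so an exact byte hash would be too brittle. A common technique for this (my suggestion, not something anyone in the thread specified) is a perceptual average hash over a down-scaled grayscale grid. A minimal sketch that skips the image-decoding step and starts from an 8x8 grid of brightness values (0-255):

```python
# Average hash ("aHash"): bit i is set iff pixel i is brighter than the
# image mean. Near-identical images get near-identical hashes, so the
# same creative still matches after a second JPEG encode.

def average_hash(pixels):
    mean = sum(pixels) / len(pixels)
    h = 0
    for i, p in enumerate(pixels):
        if p > mean:
            h |= 1 << i
    return h

def hamming(a, b):
    # Number of differing bits; a small distance means "probably the
    # same image".
    return bin(a ^ b).count("1")

# The same (fake) kangaroo creative seen on two ad networks, slightly
# brightened by re-encoding (simulated by adding 3 to every pixel):
original = [10, 200, 30, 220, 15, 210, 25, 205] * 8
reencoded = [p + 3 for p in original]

print(hamming(average_hash(original), average_hash(reencoded)))  # 0
```

A uniform brightness shift moves the mean by the same amount as every pixel, so the above-mean pattern, and therefore the hash, is unchanged.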
elypter wrote:
>Checking a site once in a while doesn't get you blocked. You don't have to crawl every page on a site like Google does.
Most small sites won't care, but those that have traffic analyzers will know what you're doing. You're welcome to try it and see how fast you won't be able to hit some sites.
"If it ain't broke don't fix it."
If curl or wget won't do the trick, then there is still Selenium. Telling a remotely controlled browser apart from a regular one won't be that easy.
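For the fetch itself, most sites can't tell a plain HTTP client sending ordinary browser headers from a real visitor; only JavaScript-rendered ads would force you up to a driven browser like Selenium. A stdlib-only sketch of the lightweight option (the User-Agent string is just an example):

```python
# Build a request that carries the headers a normal desktop Firefox
# would send, so the occasional check blends in with regular traffic.
from urllib.request import Request

def make_request(url):
    return Request(url, headers={
        "User-Agent": ("Mozilla/5.0 (X11; Linux x86_64; rv:115.0) "
                       "Gecko/20100101 Firefox/115.0"),
        "Accept": "text/html,application/xhtml+xml",
    })

req = make_request("https://example.com/")
# urllib.request.urlopen(req) would fetch the page. If the ads only
# appear after JavaScript runs, hand the same URL to a Selenium-driven
# browser instead, which is even harder to tell apart from a real one.
```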