Research to identify global ad networks, trackers

General information, announcements and questions about the EasyList subscriptions.

Moderator: EasyList authors

ponjovi
Guest

Post by ponjovi » Wed Sep 07, 2016 11:48 pm

Hi, I'm new to the community and looking for advice.

I'm doing research on online advertising. (The final report will be free and available to the public. You can see a previous report I did on the app economy here: https:[email protected]/winn ... .ixmaefl6e)


I want to do an inventory of top websites globally (based on Alexa rankings), documenting all the ad networks, trackers, analytics, and other 3rd-party tools that each one is using. I have two problems/needs that I'm looking for advice on:

First, I can do this manually by visiting each site with an ad blocker such as Ghostery installed and recording what it finds, but I want to run it on 1,000+ URLs, so I need an automated solution. Does such a tool already exist, or can one of the existing ad-blocking tools do this? Clearly I need to record, not just block.
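To make the "record, not just block" step concrete, here is a minimal Python sketch of the filtering part only. It assumes some other tool has already captured the request URLs for a page load (the `recorded` list is made up for illustration), and it treats the last two labels of a hostname as the "site", which is a simplification that breaks for suffixes like co.uk. The function names are hypothetical, not from any existing tool.

```python
from urllib.parse import urlparse


def naive_site(host):
    """Last two labels of a hostname; a rough stand-in for the registrable domain."""
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def third_party_hosts(page_url, request_urls):
    """Return the sorted hostnames contacted that are not on the page's own site."""
    first_party = naive_site(urlparse(page_url).hostname or "")
    hosts = set()
    for url in request_urls:
        host = urlparse(url).hostname
        # Skip URLs without a hostname and anything on the page's own site
        if host and naive_site(host) != first_party:
            hosts.add(host.lower())
    return sorted(hosts)


# Hypothetical recording for a single page load
recorded = [
    "https://www.example.com/style.css",
    "https://log.dmtry.com/pixel.gif",
    "https://cdn.example.com/app.js",
]
print(third_party_hosts("https://www.example.com/", recorded))
```

Running the same function over each of the 1,000+ sites would give one list of third-party hostnames per site; the capture step itself is the part that still needs a real tool.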

Second, even if I have a list of URLs that a website was calling, I need to translate those URLs into more useful company names. For example, the tool may record that a website was trying to reach "log.dmtry.com", but I want a list for matching that hostname to the company name, "Adometry."
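The matching step could look something like the following Python sketch. It assumes a lookup table keyed by domain already exists; the table here contains only the one example from this post, and `COMPANY_BY_DOMAIN` and `company_for_host` are hypothetical names for illustration. The idea is to try the full hostname first and then strip the leftmost label until something matches.

```python
# Hypothetical lookup table; a real one would have to come from a maintained list
COMPANY_BY_DOMAIN = {
    "dmtry.com": "Adometry",
}


def company_for_host(host, table=COMPANY_BY_DOMAIN):
    """Map a hostname to a company name by matching progressively shorter suffixes."""
    labels = host.lower().rstrip(".").split(".")
    # Try "log.dmtry.com", then "dmtry.com"; never try the bare top-level label
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in table:
            return table[candidate]
    return None


print(company_for_host("log.dmtry.com"))
print(company_for_host("unknown.example.org"))
```

Suffix matching means one table entry covers every subdomain of a tracker's domain, and hostnames with no entry come back as `None` so they can be reviewed by hand.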

Any advice or suggestions on existing tools or places to look would be very much appreciated.

Bryan

anonsubmitter
Postaholic
Posts: 1183
Joined: Sat Feb 07, 2015 9:18 pm
Reputation: 1

Post by anonsubmitter » Thu Sep 08, 2016 9:53 pm

Disconnect (https://disconnect.me/freeprotection) and Lightbeam (https://addons.mozilla.org/firefox/addon/lightbeam/) are good tracker visualisation tools for single websites, but there isn't any potential for automation there as far as I know.
If there is such a tool, it's most likely on GitHub.

Something that might be worth trying is asking the EFF for help (https://www.eff.org/about/contact). They are pretty bright when it comes to coding solutions like that. They have a history of creating custom software for Internet privacy and security purposes, as you can see on their GitHub profile (https://github.com/EFForg), and I think they would probably be pretty interested in your research.

WarFame
Postaholic
Posts: 713
Joined: Fri Nov 27, 2015 1:35 pm
Reputation: 0

Post by WarFame » Thu Sep 08, 2016 10:02 pm

Is something like this what you had in mind?
https://github.com/gorhill/uBlock/wiki/ ... C-malwares
Sessbench was used in the above benchmark.

anonsubmitter
Postaholic
Posts: 1183
Joined: Sat Feb 07, 2015 9:18 pm
Reputation: 1

Post by anonsubmitter » Thu Sep 08, 2016 10:16 pm

By the way, I've heard that SimilarWeb is more accurate than Alexa, so it might be worth comparing those two services before deciding which one to use.

ponjovi
New Member
Posts: 1
Joined: Sat Sep 10, 2016 4:22 pm
Reputation: 0

Post by ponjovi » Sat Sep 10, 2016 4:26 pm

Thanks for the suggestions. I'm actually working with the Mozilla Foundation and have asked for intros to the Lightbeam team, so there might be something there. I've also reached out to the folks at EFF who work on Privacy Badger, but I'm still waiting to hear back. My friend there says their devs get requests like this all the time and are overwhelmed, so I shouldn't expect a response...

I will look more closely at the uBlock and Sessbench tools; the latter seems like it might be able to handle batches of URLs.
