[GUIDE] How Ad Networks (and everyone else) Knows You're A Bot

AdvertisingGuy

Junior Member
May 8, 2019
Hi all! Long-time lurker, short-time poster.

A lot of BHW users want to do arbitrage, meaning buying traffic to a site on a CPC basis and getting paid by ad networks on a CPM or CPA/CPI basis. If you do this, it is essential that the traffic you buy is human, or at least is not identified as a bot. If you send bot traffic to the ads your ad network displays, the network is well within its rights to ban you and withhold your payment. Since you most likely paid for the traffic upfront, everything you spent on it is gone for good.

I've been in digital advertising a long time and I've worked with every major ad verification provider. I've also built my own "home rolled" ad verification tools. So while I certainly don't know everything, I know enough to (at least try to!) show you how to protect yourself against bot traffic. The idea here is to go very in-depth with data, screenshots, and the works so that you know exactly what you're dealing with.

Note that I've been posting bits of this in various other threads but having all the information in one place makes it easier to digest, so I've created this thread. I hope this is the appropriate place for the thread--any mods please move it at your discretion.

More content will follow below!

Mod Updates -- Bot Flags:
BOT FLAG #1 -- Automated Browsers
https://www.blackhatworld.com/seo/guide-how-ad-networks-and-everyone-else-knows-youre-a-bot.1119582/

BOT FLAG #2 -- DATA CENTER IP ADDRESSES
https://www.blackhatworld.com/seo/g...nows-youre-a-bot.1119582/page-2#post-12057559

BOT FLAG #3 -- Invalid Graphics Processing Units (GPUs)
https://www.blackhatworld.com/seo/g...nows-youre-a-bot.1119582/page-3#post-12060921

BOT FLAG #4 -- Missing / Invalid Plugins Media + Devices
https://www.blackhatworld.com/seo/g...nows-youre-a-bot.1119582/page-4#post-12066141

BOT FLAG #5 -- Browser Spoofing / Incongruous Browsers
https://www.blackhatworld.com/seo/g...nows-youre-a-bot.1119582/page-4#post-12069251

BOT FLAG #6 -- BAD / NO / IMPROPER Referrer URLs
https://www.blackhatworld.com/seo/g...nows-youre-a-bot.1119582/page-4#post-12089635

---

BOT FLAG #1 -- Automated Browsers

As the name suggests, browser automation tools (or WebDrivers) automate actions in a web browser. Anything that can be done in a web browser, such as visiting websites, moving the mouse, going back or forward, or clicking on links, can be done by a browser automation tool. Automation is often done with frameworks such as Selenium (using Java or Python) or Puppeteer (using JavaScript). These frameworks are primarily used for website testing, but they can also be used for scraping data, filling out forms, and taking screenshots.

An automated browser will open up a full web browser and load all content on a website. Chrome, Firefox, and even Safari have their own WebDrivers used for automated testing. Loading the full website, which includes JavaScript and images, will necessarily load ads as well. Because the browser is being operated by a computer program, and not by a human user interested in the content of a site, automated browsers are and should be flagged as invalid traffic for advertising 100% of the time. If you think about it, every bot that visits websites is an automated browser in one way or another.

Automated browsers are identified using the webdriver property of the navigator Web API (navigator.webdriver).

To see this in action, open up your browser, and open up developer tools (CTRL+SHIFT+i on your keyboard). From there, go to the (JavaScript) "Console."

If you type in 'navigator.webdriver' and hit Enter in Safari or Firefox (screenshot below), the response will be 'false'. If you do it in Chrome, it will be undefined (newer Chrome versions return 'false' instead). That's because you, as a normal browser user, are not using an automated browser.

[Screenshot: navigator.webdriver returning 'false' in the browser console]


Below is an ultra-lite Python script that opens an automated browser and sends it to Google. Python comes pre-installed on many Macs if you want to try it for yourself, but you will have to install the selenium package and download the Chrome WebDriver to get it to work.

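from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Start Chrome through the downloaded ChromeDriver binary (Selenium 4 syntax)
browser = webdriver.Chrome(service=Service('chromedriver.exe'))
# Load a page in the automated browser
browser.get('http://www.google.com')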

If you open up developer tools / console in that automated browser and type in "navigator.webdriver", the response will be true.

[Screenshot: navigator.webdriver returning 'true' in the automated browser's console]
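The console check can also be scripted rather than typed by hand. This is only a rough sketch: it assumes Selenium 4 with a ChromeDriver that Selenium can locate on its own, and the helper names read_webdriver_flag and main are made up for this example (they are not part of Selenium).

```python
def read_webdriver_flag(driver):
    """Return navigator.webdriver as page JavaScript sees it.

    `driver` is any object exposing Selenium's execute_script() method.
    """
    return driver.execute_script("return navigator.webdriver")


def main():
    # Imported inside main() so the helper above works without Selenium installed.
    from selenium import webdriver

    browser = webdriver.Chrome()  # assumption: ChromeDriver can be found automatically
    try:
        browser.get("http://www.google.com")
        # Prints the same value you would see in the developer console.
        print("navigator.webdriver =", read_webdriver_flag(browser))
    finally:
        browser.quit()
```

Call main() to launch the browser and print the flag. The value comes from the page's own JavaScript context, which is the same place a script embedded in a site would read it from.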


There are bot-blocking tools such as Incapsula, Distil Networks and PerimeterX which keep bad bots off websites. If you visit a site that uses one of these tools, such as twentytwowords.com or streeteasy.com, in a normal web browser, you should be able to view the site's content without issue. However, if you point your automated browser towards that same site (screenshot below), the browser will get blocked from entering, even if the same computer, IP address, and browser are used.


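from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Start Chrome through the downloaded ChromeDriver binary (Selenium 4 syntax)
browser = webdriver.Chrome(service=Service('chromedriver.exe'))
# Point the automated browser at a site that uses a bot-blocker
browser.get('http://www.streeteasy.com')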
[Screenshot: the automated browser being blocked by streeteasy.com]


Why? Because bot-blockers are identifying the browser as a bot from that navigator.webdriver parameter.
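On the receiving end, a check like this can be very small. Here is a minimal sketch, assuming a script on the page reports the value of navigator.webdriver to the server in a JSON payload. The "webdriver" field name and the is_flagged_as_automated function are hypothetical, and commercial bot-blockers presumably combine this flag with other signals.

```python
import json


def is_flagged_as_automated(payload_json):
    """Return True only when the client reported navigator.webdriver as true.

    An explicit false and a missing key (JSON.stringify drops `undefined`
    values) are both treated as "not flagged".
    """
    payload = json.loads(payload_json)
    return payload.get("webdriver") is True


print(is_flagged_as_automated('{"webdriver": true}'))   # True
print(is_flagged_as_automated('{"webdriver": false}'))  # False
print(is_flagged_as_automated('{}'))                    # False
```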
 
Now, the question is.. How do I set navigator.webdriver to false? AFAIK it's a readonly variable, but people seemed to have cracked it already. Like @jamie3000 i guess. :D
I would be glad to know the answer.
 
Book marked

Very interesting, would be nice to have more bot flag :)

Please do update as it is interesting

Interesting bookmarked!

Good share very interesting

Thank you! I'll probably post one per day and they take a lot of effort to put together.


That's a good question! From what I know you can add an extension to your browser to change the field, but I can't tell you exactly how to do it. I have been told a rumor (from a semi-reliable source) that because Google made/owns Chrome, there are some special ways for them to determine if someone is using an automated browser beyond that flag. But I don't know what they are and they sure ain't telling.
 
Hmm, I will give it a go when I have some free time. My idea was that if I could somehow custom build the chrome driver executable, I could set that var to false.
 
Object.defineProperty and javascript injection. There are limits to what you can do with it on chromium based browsers though.
 
Thanks @AdvertisingGuy
Any more bot flags you're aware of?
I heard that some advanced detection techniques involve measuring page load time, tracking user's mouse activity, using flash/java and storing files or so-called permanent cookies etc etc. Maybe there are even more twists you've seen. Maybe you know of any ways to distinguish a wiped or freshly installed browser from one in normal use?
 
Bookmarked for later, Btw nice profile picture. Haven't seen in a while Don, the boss who can hire himself for the job ahhaah

I am an AdvertisingGuy after all. :D

Ok, so how to get around it.


I can't comment beyond what I've already said. If you have the right answer and can confirm, put it in this thread. Some of these things can't be gotten around, however.


Some of those things I'll cover in this series. I don't know of any "permanent cookies" that track users today. Verizon used to do this and they were hit by a $1.35M FCC fine. According to the below link it was pretty prevalent once upon a time, but I'm not sure if it's done now. A fine like that is a good way to dissuade people.

https://qz.com/634294/a-short-guide-to-supercookies-whether-youre-being-tracked-and-how-to-opt-out/

As for the freshly wiped / installed browser, I know it's done with cookies. If you were to close Chrome, delete the Chrome cookie folder(s) and then open Chrome back up, a bunch of new cookie folders would be created, especially after you visit your first site. But beyond the cookie, ad tech companies have additional ways to ID users. There are advanced fingerprinting techniques which take your graphics card, IP address / DMA, and browser version / user agent and can ID people to a high degree of accuracy. I'll get into that stuff as well.
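To make the fingerprinting idea concrete, here is a toy sketch that hashes a few of the attributes mentioned above into one identifier. The function name, the choice of SHA-256, and the sample values are all assumptions for illustration. This is not how any particular vendor does it, and real systems likely use many more signals plus fuzzy matching.

```python
import hashlib


def fingerprint_id(gpu, ip_address, user_agent):
    """Combine a few browser/device attributes into one stable identifier."""
    # Join with a separator that is unlikely to appear inside the values.
    raw = "\x1f".join([gpu, ip_address, user_agent])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


# Made-up sample values; 203.0.113.7 is from a reserved documentation range.
first_visit = fingerprint_id("ExampleGPU 1000", "203.0.113.7", "ExampleBrowser/1.0")
after_cookie_wipe = fingerprint_id("ExampleGPU 1000", "203.0.113.7", "ExampleBrowser/1.0")

# Deleting cookies changes none of the inputs, so the ID is unchanged.
print(first_visit == after_cookie_wipe)
```

The point of the sketch is that none of these inputs live in the cookie folder, so wiping cookies does not reset the identifier.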
 
Hmmmm interesting! I will def try that out! Thanks!

Initial browser size, initial cursor position could also be tracked I guess. For a bot, it will be same mostly..
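A rough sketch of how that kind of sameness could be measured, assuming each session is logged as a dict with hypothetical keys width, height, cursor_x and cursor_y. The function name is invented here, and what counts as a suspicious share is left open.

```python
from collections import Counter


def most_common_initial_state(sessions):
    """Return (state, share) for the most frequent initial window/cursor state."""
    if not sessions:
        raise ValueError("need at least one session")
    counts = Counter(
        (s["width"], s["height"], s["cursor_x"], s["cursor_y"]) for s in sessions
    )
    state, hits = counts.most_common(1)[0]
    return state, hits / len(sessions)


# Made-up sample data: three identical sessions and one that differs.
sessions = [
    {"width": 800, "height": 600, "cursor_x": 0, "cursor_y": 0},
    {"width": 800, "height": 600, "cursor_x": 0, "cursor_y": 0},
    {"width": 800, "height": 600, "cursor_x": 0, "cursor_y": 0},
    {"width": 1440, "height": 900, "cursor_x": 312, "cursor_y": 188},
]
print(most_common_initial_state(sessions))  # ((800, 600, 0, 0), 0.75)
```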
 
@AdvertisingGuy you forgot one more thing. Ad networks are also checking your IP address's ASN. If you look at the tracking scripts they use, you will see that they call 3rd party JS scripts that return whether your IP is mobile, residential or a bot IP. The way they check for a bot IP is whether the IP address came from a datacenter.

You forgot to mention this, and it is extremely easy to spot if you look at your requests under Chrome Developer Tools. Just like Instagram and YouTube care about the type of IPs, ad networks care as well.
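In rough terms, a datacenter check boils down to a range lookup. The sketch below uses Python's standard ipaddress module. The ranges are placeholders taken from reserved documentation blocks, not real hosting providers, and is_datacenter_ip is a made-up name. A real service would work from an ASN or hosting-provider database instead.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), standing in for
# the address space of hosting providers.
DATACENTER_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_datacenter_ip(ip):
    """Return True if `ip` falls inside any of the listed ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DATACENTER_RANGES)


print(is_datacenter_ip("203.0.113.9"))  # True: inside the first placeholder range
print(is_datacenter_ip("192.0.2.1"))    # False: not in any listed range
```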
 

You just spoiled tomorrow's post! lol. :D :D

I'll be going a little bit more in-depth into data center IPs tomorrow.
 
here we go again :D btw you can also use any automation library that's based on cef, that will fix everything
 
Yeah I always tell people that if you really must use a browser then puppeteer and Object.defineProperty are your friends. This is a good example of how to do it properly: https://gist.github.com/nicoandmee/7d7dc2e79e2d553a22645d312f39bf3c
 
I personally use nightmare.js (based on Electron). I switch the user agent signature, proxy, and a bunch of other stuff as parameters in every call for full stealth.


it works.

i also tested the link you provided.


What's sweet about nightmare is that you can easily access a bunch of methods that define your emulated navigator, such as WebRTC etc.
 

Nico Mee is a beast some of his work is outstanding!
 
OP, getting into the tech-savvy side: since Selenium can be detected as an automated tool, this means any tool that uses it can be flagged, right?

I think I realized why many creation tools leave footprints that aren't related to proxies but to the tool itself.
 