Introduction

Selenium is a library and tool for automating browser operations from Python, and web scraping is one common use. Some websites can detect that a browser is being driven by Selenium, even when the session involves hardly any automation. A web search turns up several suggestions for keeping Selenium from being blocked or tracked:

  • Spoofing the user agent
  • Using proxies
  • Changing the value of the navigator.webdriver property to undefined, as follows:
    • Disable the AutomationControlled Blink feature (disable-blink-features)
    • Exclude the enable-automation switch (excludeSwitches)
    • Turn off useAutomationExtension
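
The first two items in the list map onto ordinary Chrome command-line switches. The following is a minimal sketch rather than a definitive recipe: `build_chrome_arguments` is a hypothetical helper name, and the user-agent string and proxy address are placeholder values chosen for illustration. `--user-agent` and `--proxy-server` are the Chrome switches being assumed here.

```python
def build_chrome_arguments(user_agent=None, proxy=None):
    """Return a list of Chrome switches for the optional settings given.

    Hypothetical helper for illustration; both values come from the caller.
    """
    arguments = []
    if user_agent:
        # Override the user-agent string the browser sends.
        arguments.append(f"--user-agent={user_agent}")
    if proxy:
        # Route browser traffic through the given proxy server.
        arguments.append(f"--proxy-server={proxy}")
    return arguments


# Placeholder values, not recommendations.
example_arguments = build_chrome_arguments(
    user_agent="Mozilla/5.0 (X11; Linux x86_64)",
    proxy="http://127.0.0.1:8080",
)
```

Each string in the returned list can then be passed to `options.add_argument(...)` on a Selenium `Options` object.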

Chrome Driver

The simplest solution turned out to be a single flag! 😃
Add the argument below so that navigator.webdriver no longer reports true:

options.add_argument("--disable-blink-features=AutomationControlled")

Code Example

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

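options = Options()
# Stop Blink from exposing the automation-controlled state (navigator.webdriver).
options.add_argument("--disable-blink-features=AutomationControlled")
# Drop the enable-automation switch that ChromeDriver adds by default.
options.add_experimental_option("excludeSwitches", ["enable-automation"])
# Do not load the automation extension.
options.add_experimental_option("useAutomationExtension", False)
options.add_argument("--incognito")
options.add_argument("--disable-cache")
options.add_argument("--disable-infobars")
options.add_argument("--log-level=3")  # log fatal errors only
options.add_argument("--headless")
# Quotes around the value are not needed and may end up in the user-agent string.
options.add_argument("--user-agent=Mozilla/5.0 (iPhone; CPU iPhone OS 13_5 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1")

# Replace the path with the location of your chromedriver executable.
driver = webdriver.Chrome(service=Service(r"C:\path\to\chromedriver.exe"), options=options)
try:
    driver.get("https://www.google.com/")
finally:
    # Always close the browser, even if the page load fails.
    driver.quit()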
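
To see whether the options had the intended effect, one can ask the loaded page what `navigator.webdriver` reports. This is a minimal sketch, assuming a `driver` created as in the example and still open; `read_webdriver_flag` and `looks_automated` are hypothetical helper names, while `execute_script` is the standard Selenium method for running JavaScript in the page.

```python
def read_webdriver_flag(driver):
    """Return the value of navigator.webdriver as seen by the current page."""
    return driver.execute_script("return navigator.webdriver")


def looks_automated(driver):
    """Return True only if the page reports navigator.webdriver as true."""
    return read_webdriver_flag(driver) is True
```

With the options above, the value is expected to be false or undefined (None in Python) rather than true, although the exact result may depend on the Chrome version.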

History

According to the W3C WebDriver Editor's Draft:

The webdriver-active flag is set to true when the user agent is under remote control. It is initially false.

The navigator.webdriver property reflects this flag, which is why websites can use it to detect automation.

Author: Coin
Copyright notice: Unless otherwise stated, all articles on this site are licensed under CC BY-NC-SA 4.0. Please credit Coin's blog when reposting.