Published 11/13/2024
Categories: scraping, automation, dataextraction, ai
Author: nearestnabors
Stealth Mode—Enhanced Bot Detection Evasion—Launch week day 3

One of the biggest challenges with scraping or automating actions on external websites is getting past bot detection. It’s a constant hurdle for developers who rely on web data but find their scripts blocked or throttled by anti-bot systems that detect automated activity.

That’s why we invested in Playwright Stealth (a port of puppeteer-extra-plugin-stealth) when its original project was no longer maintained. We call this powerful tool Stealth Mode.

AgentQL in Stealth Mode

Stealth Mode is AgentQL’s solution for minimizing the chance of detection when running scripts on third-party sites. Built with bot-detection evasion in mind, Stealth Mode leverages Playwright to simulate human-like browsing behaviors, reducing tell-tale signs of automation that can trigger anti-scraping mechanisms. It works by masking automation indicators, mimicking natural browsing, and adjusting browser settings to evade detection—making your interactions smoother, safer, and less likely to get blocked.

Modern websites often deploy sophisticated bot detection systems that analyze browser behavior, properties, and interactions to distinguish human users from bots. Without Stealth Mode, several signals can reveal the use of automation (a few of them are illustrated in the sketch after this list), such as:

  • navigator.webdriver property: Indicates whether a browser is controlled by automation.
  • Headless mode detection: Headless browsers behave in subtly different ways from regular browsers, and those differences can be measured.
  • Missing or inconsistent browser APIs: Bots often miss certain APIs or provide inconsistent values (for example, WebGL and media codecs).
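
To make these signals concrete, here is a minimal sketch (using plain Playwright, not AgentQL’s internals) that prints a few of the properties a detection script typically inspects; the target URL is just a placeholder.

import asyncio

from playwright.async_api import async_playwright


async def main():
    async with async_playwright() as playwright:
        browser = await playwright.chromium.launch(headless=True)
        page = await browser.new_page()
        await page.goto("https://example.com")

        # Read the same properties an anti-bot script would check inside the page.
        signals = await page.evaluate(
            """() => ({
                webdriver: navigator.webdriver,      // true under plain automation
                userAgent: navigator.userAgent,      // may contain "HeadlessChrome"
                plugins: navigator.plugins.length,   // often 0 in headless mode
                languages: navigator.languages,      // may be empty or inconsistent
            })"""
        )
        print(signals)
        await browser.close()


asyncio.run(main())

Stealth Mode’s job is to normalize these kinds of values so an automated session looks like an ordinary browser session.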

These detection methods can result in websites blocking automated sessions, presenting CAPTCHA challenges, or even banning IP addresses. AgentQL Stealth Mode helps bypass such measures by minimizing the traces of automation and simulating a real user in a real browser environment.

More than just bot detection: Why imitate regular users?

Stealth Mode is ideal for anyone using AgentQL to scrape or automate interactions with sites they don’t directly control. There are a few good reasons to imitate real humans:

  • Data Scrapers working with popular sites: If you’re extracting public data from sites with lots of bot traffic, Stealth Mode reduces the chance of interruptions, allowing your scripts to run smoothly.
  • More human-like interactions: Whether you’re testing in a live environment or automating workflows for dynamic web pages, Stealth Mode lets AgentQL behave more like a real person, which yields more consistent results.
  • Developers focused on reliability: Stealth Mode is a must-have for developers who need uninterrupted script execution, particularly when running frequent, high-volume, or long-duration tasks on external URLs.

As websites become increasingly sophisticated in detecting automation and bot activity, tools that simulate human behavior are essential for reliable, uninterrupted access. With Stealth Mode, your scripts can operate effectively on third-party sites, letting you focus on building and gathering the data you need rather than troubleshooting detection issues.

Stealth Mode is particularly beneficial for:

  • Ensuring Stability: Your scripts are less likely to be flagged and disrupted, so you can collect data or perform actions reliably.
  • Expanding Automation Possibilities: With fewer detection risks, you’re free to scale your operations without constant modifications.
  • Reducing Manual Intervention: Fewer interruptions mean less time tweaking or restarting scripts due to detection.

How to Enable and Use Stealth Mode

You can enable Stealth Mode in AgentQL by calling the enable_stealth_mode function on an AgentQL page object:

import asyncio

import agentql
from playwright.async_api import async_playwright


async def main():
    async with async_playwright() as playwright, await playwright.chromium.launch(
        headless=False,
    ) as browser:
        page = await agentql.wrap_async(browser.new_page())

        # Enable Stealth Mode before navigating to the target page
        await page.enable_stealth_mode()
        await page.goto("https://bot.sannysoft.com/")
        await page.wait_for_timeout(30000)


asyncio.run(main())

Visit our documentation to learn more about using Stealth Mode with headless browsers.

Run the sample script

We’ve prepared an in-depth working example script that shows how to combine Stealth Mode and modified browser attributes with other best practices (sketched briefly after this list) to overcome anti-bot measures by:

  • Randomizing the HTTP request headers the browser sends to the server. Varying User-Agent, Accept-Language, Referer, and similar headers helps consecutive requests look like they come from normal users.
  • Using real request headers. Set realistic request headers on a headless browser so its fingerprint matches that of a real human’s browser.
  • Randomizing browser window size. Real users browse with different window sizes. Some websites track the window size; if it’s identical across all requests, that’s a sign of a bot.
  • Randomizing timezone and geolocation. Some websites track the timezone and geolocation, and if they’re identical across all requests, that’s a sign of a bot.
  • (Optional) Using a proxy server. You’ll need a proxy service from an external provider (NetNut, Bright Data, or similar) and will configure details like host, username, and password separately.
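
As a rough illustration of these ideas (not the full example script), the sketch below randomizes the user agent, viewport, and timezone with plain Playwright; the header values, window sizes, timezones, and proxy settings are placeholder assumptions.

import asyncio
import random

from playwright.async_api import async_playwright

# Placeholder pools to draw from; a real script would use larger, realistic sets.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
]
VIEWPORTS = [{"width": 1280, "height": 800}, {"width": 1440, "height": 900}, {"width": 1920, "height": 1080}]
TIMEZONES = ["America/New_York", "Europe/Berlin", "Asia/Singapore"]


async def main():
    async with async_playwright() as playwright:
        browser = await playwright.chromium.launch(
            headless=False,
            # Optional: route traffic through an external proxy provider.
            # proxy={"server": "http://host:port", "username": "user", "password": "pass"},
        )
        context = await browser.new_context(
            user_agent=random.choice(USER_AGENTS),                      # randomized User-Agent
            viewport=random.choice(VIEWPORTS),                          # randomized window size
            timezone_id=random.choice(TIMEZONES),                       # randomized timezone
            extra_http_headers={"Accept-Language": "en-US,en;q=0.9"},   # realistic headers
        )
        page = await context.new_page()
        await page.goto("https://bot.sannysoft.com/")
        await page.wait_for_timeout(10000)
        await browser.close()


asyncio.run(main())

These are standard Playwright context options and should compose with AgentQL’s enable_stealth_mode, since AgentQL wraps ordinary Playwright pages.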

Best Practices for Avoiding Bot Detection

When using Stealth Mode, keep these practices in mind to maximize effectiveness:

  • Adjust interaction timing: Extend the time between actions and use random delays to mimic natural browsing patterns and reduce bot-detection triggers (see the sketch after this list).
  • Use rotating User Agents: Cycling through different user-agent strings can further reduce detection risks.
  • Reduce request frequency: A real human won’t flood a server with thousands of requests. Avoid overloading sites with excessive requests all at once—this improves reliability and reduces detection likelihood.
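
For the timing point in particular, a small helper like the hypothetical one below can space actions out with random delays; the selector and timing ranges are illustrative only.

import random

from playwright.async_api import Page


async def human_pause(page: Page, low_s: float = 1.0, high_s: float = 4.0) -> None:
    # Wait a random amount of time (in seconds) to mimic human pacing.
    await page.wait_for_timeout(random.uniform(low_s, high_s) * 1000)


async def browse(page: Page) -> None:
    await page.goto("https://example.com")
    await human_pause(page)
    await page.click("a#first-result")     # hypothetical selector
    await human_pause(page, 2.0, 6.0)      # longer pause before the next request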

Ready to Start Scraping & Automating in Stealth?

AgentQL’s Stealth Mode is an essential tool for reliably executing scripts and collecting data without triggering anti-bot detection. Visit our Stealth Mode reference for a full feature breakdown, check out our guide to using Stealth Mode, and explore our example scripts to see Stealth Mode in action.

Your feedback is what made this launch week possible, and we want more! Join our Discord or take our survey so we can keep making AgentQL serve your needs better!

We are keen to take on maintaining the original Playwright Stealth but are having difficulty reaching the maintainer. If you or someone you know can put us in touch, please reach out!

Stay tuned for tomorrow’s feature: Query Generation—an AI-powered way to create complex queries faster than ever.

Happy coding!

—The Tiny Fish Team Building AgentQL
