Improved Robots.txt Settings Integration
Robots.txt files that allow-list the "Dubbotbot" User Agent are now applied more reliably to those sites' crawls.
In addition, Crawl-delay settings in the robots.txt file are now combined with any delays set in a site's Advanced Settings.
For example, when a Site has a delay configured in its Advanced Settings (up to 10 seconds in DubBot) and its robots.txt sets a Crawl-delay (which can be any number), the crawler will use the larger of the two values. If the Site's delay is 5 and its robots.txt has Crawl-delay: 10, the crawler will use the 10-second delay.
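The larger-of-the-two rule can be sketched with Python's standard-library robots.txt parser. This is an illustration of the behavior described above, not DubBot's implementation; the function name and arguments are hypothetical.

```python
from urllib.robotparser import RobotFileParser

def effective_delay(site_delay, robots_lines, user_agent="Dubbotbot"):
    """Return the larger of the site's configured delay and the
    robots.txt Crawl-delay, per the rule described above."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    robots_delay = parser.crawl_delay(user_agent) or 0  # None when unset
    return max(site_delay, robots_delay)

robots_txt = [
    "User-agent: Dubbotbot",
    "Crawl-delay: 10",
    "Allow: /",
]
print(effective_delay(5, robots_txt))  # the larger value: 10
```

Here a site delay of 5 and a Crawl-delay of 10 yield a 10-second wait, matching the example above.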
Learn more about Robots.txt and site settings.
New 'Wait Until Element' Advanced Settings
To accommodate differing page load times across clients, we have added Wait Until Element settings. Using this setting tells a site's crawl to wait until a specific element is visible (or not visible) on a page before performing checks.
In the Advanced tab of a Site's Settings, the Wait Until Element Type dropdown has the following options:
None (default)
Attached - wait for element to be present in DOM (Document Object Model)
Detached - wait for element to not be present in DOM
Visible - wait for element to have non-empty bounding box and no visibility: hidden
Hidden - opposite of Visible; wait for element to be either detached from the DOM, or have an empty bounding box or visibility: hidden
Selecting any option other than the default None will expose the Wait Until Element Selector field. In this field, enter the CSS selector of the element the crawler should wait for, per the behavior chosen in the Wait Until Element Type dropdown.
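These four states mirror the element states used by browser-automation tools such as Playwright. As a rough sketch of their semantics, here is a simplified element model (hypothetical; not DubBot's implementation) that classifies an element against each option:

```python
def element_state(element):
    """Classify a page element the way the Wait Until Element options do.
    'element' is a simplified stand-in for a DOM node: None when detached,
    otherwise a dict with a bounding-box size and a 'hidden' flag for
    visibility: hidden. Returns which conditions the element meets."""
    attached = element is not None
    visible = (attached
               and element["width"] > 0
               and element["height"] > 0
               and not element["hidden"])
    return {
        "attached": attached,
        "detached": not attached,
        "visible": visible,
        # detached, empty bounding box, or visibility: hidden
        "hidden": not visible,
    }

banner = {"width": 300, "height": 50, "hidden": False}
print(element_state(banner)["visible"])  # True: attached, non-empty box, not hidden
print(element_state(None)["hidden"])     # True: a detached element counts as Hidden
```

Note that Hidden is simply the negation of Visible, so an element removed from the DOM satisfies both Detached and Hidden.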
Learn more about Advanced Settings for Sites.
If you have questions, please contact our DubBot Support team via email at help@dubbot.com or via the blue chat bubble in the lower right corner of your screen. We are here to help!
