Looking for basic information? Review Site Setup or Modifying an existing Site's Settings.
To configure a site's Advanced Settings, select the Advanced tab in the Site Settings section of the Site Settings panel.
Enable PDF Testing
Select this box to enable PDF checking on the site. If you don't see this box, the feature hasn't been turned on for your account. Contact us using the chat bubble, and we can get that set up for you.
Obey Robots.txt
By default, DubBot obeys the robots.txt rules configured for a site.
To ignore robots.txt rules, deselect the Obey robots.txt checkbox.
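For reference, a robots.txt file uses simple user-agent and path rules. A hypothetical example might look like the lines below; with Obey robots.txt selected, the crawler would skip the disallowed paths:

User-agent: *
Disallow: /admin/
Disallow: /search/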
Page Preview/Analysis Options
Viewport
By default, the Viewport is set to Use Default Viewport, 1440x900.
For more options, open the Viewport dropdown, where you will find the following options:
Use Default Viewport: 1440 x 900
Desktop: 1920 x 1080
MacBook Air: 1280 x 800
Tablets
iPad Pro: 834 x 1194
iPad Air: 820 x 1180
iPad Mini: 768 x 1024
Galaxy Tab S7: 800 x 1280
Apple Smartphones
iPhone 15 Plus/Pro Max: 430 x 932
iPhone 15 (Pro): 393 x 852
iPhone 13 Mini: 375 x 812
iPhone 12/13/14 (Pro): 390 x 844
iPhone 11/XR: 414 x 896
iPhone 5/SE: 320 x 568
Android Smartphones
Samsung Galaxy S20: 393 x 568
Google Pixel 5: 360 x 800
Custom Viewport
Custom Viewport
Selecting Custom Viewport in the Viewport dropdown allows you to enter the exact pixel dimensions you need.
The following fields will appear when this option is selected:
Viewport Width - Pixels (up to 2560) the crawler will use for the viewport width. Enter only a whole number, no unit is required.
Viewport Height - Pixels (up to 1440) the crawler will use for the viewport height. Enter only a whole number, no unit is required.
Crawler Viewport Pixel Ratio/Scale Factor - The number of physical device pixels that correspond to a single virtual (CSS) pixel on the device.
Crawl Using a Mobile Device and/or Touch Screen Device checkboxes - These let you choose whether the crawler emulates a mobile device and/or a touch screen device during the re-crawl needed after you adjust these settings.
The default values for the Viewport Width/Height and the Crawl Using a Mobile/Touch Screen checkboxes are taken from the last pre-set selection in the Viewport dropdown.
After adjusting the Viewport Width and/or Height, the site must be re-crawled for the data and viewports to be updated.
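For example, to preview pages as an iPhone 15 in landscape orientation (based on the 393 x 852 preset above), you might enter a Viewport Width of 852 and a Viewport Height of 393, check both the Mobile Device and Touch Screen Device boxes, and then re-crawl the site. These values are only an illustration; enter the dimensions your own testing requires.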
Selector to click on page load
Refer to the Remove optional banner/popup content from displaying on webpages within DubBot article to learn how to close out content sections such as cookie notifications, survey popups, and alerts before any analysis occurs on the page.
Entries in this field produce a click action, which can be useful for expanding accordions or accepting cookie consent prompts, as shown in the example below.
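For example, to click a cookie banner's accept button on page load, you might enter a CSS selector such as the one below. The ID and class names here are hypothetical; inspect your own page to find the selector that matches your banner's button:

#cookie-banner .accept-button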
Selector to remove matching elements from a cached page
This field is useful when an element makes a page hard to view and a JavaScript click action will not remove it.
Removing such elements makes the cached copy of a page easier to view inside the DubBot app. Common examples include loader icons that do not close on their own, chat boxes that take up space or cover important page elements, and modals or overlays that appear over the page.
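For example, to remove a chat widget and a lingering loading overlay from the cached page, you might enter a comma-separated CSS selector like the one below. The class names are hypothetical; use your browser's developer tools to find the selectors that match the elements on your own pages:

.chat-widget, .loading-overlay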
Delay processing (in seconds)
Admins can set the number of seconds DubBot waits between crawling webpages. This slows DubBot's traffic for web servers that require a lower rate of activity.
Note: Setting any delay above 0 seconds will also prevent DubBot from performing parallel crawls; entering 1 second or more will result in the application crawling pages one by one. The delay is entered in whole seconds, and the maximum delay is 10 seconds.
Changing this setting will result in a slower crawl and analysis by DubBot. We recommend starting with 1 second when updating this setting.
Scroll to the bottom of each page
If your site implements lazy loading, check the Scroll to the bottom of each page box. This ensures that all of your pages’ content is loaded before processing the page.
Advanced Crawler Configuration
Disable using proxy rewriting for URL
By default, DubBot uses proxy rewriting to display some content from a client’s site in the app. Sometimes, clients have a setup that doesn't allow this to work. A good, simple test if something isn’t displaying correctly in DubBot is to check this box and re-crawl the site.
Generally, disabling the proxy is fine, but there are two main reasons the proxy exists:
To show content (images, CSS) from HTTP sites (not as much of an issue currently, as HTTPS adoption is so high)
To access content when client servers have security settings that block our crawler.
Additional Domains Allowed in Crawl
The Additional Domains allowed in Crawl field allows you to enter supplementary domains that may be needed in your site crawl. Some possible uses:
The site uses both a www and www1 subdomain for its pages.
The site utilizes a media server with a different URL to hold PDF files.
One full domain, including http(s)://, should be entered on each line.
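For example, hypothetical entries might look like this:

https://www1.example.edu
https://media.example.edu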
Allow Crawler to check HTTPS and HTTP links
Check the Allow crawler to check HTTPS and HTTP links box if you have a mix of these protocols on pages in your site.
Page load timeout (in seconds)
The Page load timeout (in seconds) field allows administrators to set how long the crawler will wait before a page is timed out. The default timeout is 60 seconds, and it can be increased to up to 120 seconds, using whole numbers.
For websites with slower-loading webpages, increasing the Page load timeout can be necessary to ensure the webpages are properly inventoried and tested rather than reported as timed out.
Delay after each page crawl (in seconds)
In this field, enter the number of seconds (up to 10) the crawler will wait before crawling the next page or asset. Use the default value of 0 for parallel crawling. This sometimes needs to be adjusted due to settings on a site's server.
Custom Crawler User Agent
Setting a Custom Crawler User Agent is only for the advanced user. Refer to this article that outlines Example Custom User Agents.
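As a purely illustrative example, a custom user agent string might look like the line below; the product name and URL are hypothetical, and the linked article covers supported formats in more detail:

Mozilla/5.0 (compatible; ExampleBot/1.0; +https://www.example.edu/bot)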
Crawler Authentication
This option must be enabled by DubBot on the account. If you do not see this option in your Advanced tab, please contact DubBot's Support team via the app's chat, or you can email help@dubbot.com.
Select the Enable crawler authentication checkbox.
If the Site you are crawling is behind a login, you can have our crawler access that content. Learn more about Crawling behind a login.
Crawler Scheduling
Days between crawls
The default cadence for crawling and testing sites is every seven (7) days. To slow this cadence, update the Days between crawls field. DubBot requires whole numbers, and the cadence can be seven (7) or more days.
More on the Site Settings panel
Site Setup (General Tab)
Advanced Settings for Sites << You are here
If you have questions, please contact our DubBot Support team via email at help@dubbot.com or via the blue chat bubble in the lower right corner of your screen. We are here to help!