{"id":6386,"date":"2020-07-26T09:42:09","date_gmt":"2020-07-26T09:42:09","guid":{"rendered":"https:\/\/docs_v3.dataforseo.com\/v3\/?page_id=6386"},"modified":"2024-12-26T16:09:31","modified_gmt":"2024-12-26T16:09:31","slug":"on_page-overview","status":"publish","type":"page","link":"https:\/\/docs.dataforseo.com\/v3\/on_page-overview\/","title":{"rendered":"on_page\/overview"},"content":{"rendered":"<div class=\"wpb-content-wrapper\"><p>[vc_row][vc_column][vc_column_text]<\/p>\n<h1 id=\"introduction\">OnPage API: Overview<\/h1>\n    <div class=\"endpoint\">\n        <img decoding=\"async\" class=\"endpoint__icon\" src=\"https:\/\/docs.dataforseo.com\/v3\/wp-content\/themes\/dataforseo\/assets\/img\/icons\/checked-circle.svg\" alt=\"checked\">\n\n                    OnPage API is the customizable crawling engine for extracting website performance data            <\/div>\n    \n<h2>Endpoints and Parameters<\/h2>\n<p>OnPage API encompasses multiple endpoints, which allow you to crawl any website or webpage according to customizable parameters and evaluate its on-page optimization performance against a multitude of SEO and website health benchmarks.<\/p>\n<p>Sending a website for crawling is done through a POST request to the <a href=\"\/v3\/on_page\/task_post\/\" target=\"_blank\" rel=\"noopener noreferrer\">OnPage Task Post<\/a> endpoint. Alongside the required input fields (domain name or URL and maximum number of pages to crawl), you can also use additional customizable parameters, such as:<\/p>\n<p><strong>\u25cf Custom thresholds<\/strong> are applied through the <code>checks_threshold<\/code> field in the Task Post request and can be used to customize default threshold values for parameters in the <code>checks<\/code> array of OnPage API responses.<\/p>\n<p><strong>\u25cf Custom JavaScript rules<\/strong> are applied through the <code>custom_js<\/code> field in the Task Post request and can be used to execute a custom JavaScript code when crawling pages. 
You can also use the <code>enable_javascript<\/code> parameter to execute built-in JavaScript rules set on a crawled site.<\/p>\n<p><strong>\u25cf Store raw HTML<\/strong> is applied through the <code>store_raw_html<\/code> field in the Task Post request and can be used to obtain the HTML of the crawled page by making a request to the <a href=\"\/v3\/on_page\/raw_html\" target=\"_blank\" rel=\"noopener noreferrer\">Raw HTML<\/a> endpoint.<\/p>\n<p>Besides these parameters, you can also instruct our crawler to:<\/p>\n<p><strong>\u25cf <code>load_resources<\/code><\/strong> such as images, stylesheets, scripts, and broken resources;<\/p>\n<p><strong>\u25cf <code>enable_javascript<\/code><\/strong> &#8211; that is, execute JavaScript on the crawled pages;<\/p>\n<p><strong>\u25cf <code>enable_browser_rendering<\/code><\/strong> to measure Core Web Vitals;<\/p>\n<p><strong>\u25cf <code>calculate_keyword_density<\/code><\/strong> to obtain keyword density values for the target site.<\/p>\n<p><strong>Note:<\/strong> additional charges may apply. To learn more about the cost of all OnPage API parameters, please refer to <a href=\"https:\/\/dataforseo.com\/help-center\/cost-of-onpage-api-parameters\" target=\"_blank\" rel=\"noopener noreferrer\">this help article<\/a>. 
Check our <a title=\"Pricing\" href=\"https:\/\/dataforseo.com\/pricing\/on-page\" target=\"_blank\" rel=\"noopener noreferrer\">Pricing<\/a> to calculate the costs.<\/p>\n<p>After the website is fetched for crawling, you can start retrieving results using the following endpoints:<\/p>\n<div style=\"width: 92%; padding: 0 28px; box-sizing: border-box;\">\n<div class=\"wpb_column vc_column_container vc_col-lg-4 vc_col-md-12 vc_col-sm-12\">\n<ul>\n<li><a href=\"\/v3\/on_page\/summary\/\" target=\"_blank\" rel=\"noopener noreferrer\">Summary<\/a> &#8211; provides a summary of on-page issues found on a website;<\/li>\n<li><a href=\"\/v3\/on_page\/pages\/\" target=\"_blank\" rel=\"noopener noreferrer\">Pages<\/a> &#8211; returns a list of crawled pages with check-ups and other page performance metrics;<\/li>\n<li><a href=\"\/v3\/on_page\/page_by_resource\/\" target=\"_blank\" rel=\"noopener noreferrer\">Pages by Resource<\/a> &#8211; provides a list of pages and related data that contain a specific resource;\n<\/li>\n<li><a href=\"\/v3\/on_page\/resources\/\" target=\"_blank\" rel=\"noopener noreferrer\">Resources<\/a> &#8211; offers a list of resources on a website, including images, scripts, stylesheets, etc.;<\/li>\n<li><a href=\"\/v3\/on_page\/duplicate_tags\/\" target=\"_blank\" rel=\"noopener noreferrer\">Duplicate Tags<\/a> &#8211; returns a list of pages that contain duplicate title or description tags;<\/li>\n<li><a href=\"\/v3\/on_page\/duplicate_content\/\" target=\"_blank\" rel=\"noopener noreferrer\">Duplicate Content<\/a> &#8211;  returns a list of pages that have content similar to the page specified in the request;<\/li>\n<\/ul>\n<\/div>\n<div class=\"wpb_column vc_column_container vc_col-lg-4 vc_col-md-12 vc_col-sm-12\">\n<ul>\n<li><a href=\"\/v3\/on_page\/links\/\" target=\"_blank\" rel=\"noopener noreferrer\">Links<\/a> &#8211; provides a list of internal and external links detected on a target website;<\/li>\n<li><a 
href=\"\/v3\/on_page\/redirect_chains\/\" target=\"_blank\" rel=\"noopener noreferrer\">Redirect Chains<\/a> &#8211; helps to quickly identify and trace multiple-redirect issues;<\/li>\n<li><a href=\"\/v3\/on_page\/non_indexable\/\" target=\"_blank\" rel=\"noopener noreferrer\">Non-indexable<\/a> &#8211; returns a list of pages that are blocked from being indexed by search engines;<\/li>\n<li><a href=\"\/v3\/on_page\/waterfall\/\" target=\"_blank\" rel=\"noopener noreferrer\">Waterfall<\/a> &#8211; provides page speed insights data;<\/li>\n<li><a href=\"\/v3\/on_page\/keyword_density\/\" target=\"_blank\" rel=\"noopener noreferrer\">Keyword Density<\/a> &#8211; provides keyword density and keyword frequency data for terms appearing on the specified website or web page;<\/li>\n<li><a href=\"\/v3\/on_page\/raw_html\/\" target=\"_blank\" rel=\"noopener noreferrer\">Raw HTML<\/a> &#8211; returns the HTML of a page you indicate in the request.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<p>&nbsp;<br \/>\nYou can fetch data on pages gradually as our crawler processes the pages; this way you don&#8217;t have to wait until all the submitted pages are crawled. Alternatively, you can request complete results when the crawling is finished. Crawl progress is indicated by the <code>crawl_progress<\/code> field in the results of the <a href=\"\/v3\/on_page\/summary\/\" target=\"_blank\" rel=\"noopener noreferrer\">Summary<\/a> endpoint.<\/p>\n<p>OnPage API allows you to use pingbacks by specifying the <code>pingback_url<\/code> when setting a task, and we will notify you upon the completion of tasks. Learn more on our <a href=\"https:\/\/dataforseo.com\/help-center\/pingbacks-postbacks-with-dataforseo-api\" target=\"_blank\" rel=\"noopener noreferrer\">Help Center.<\/a> If you use the Standard method without specifying the <code>pingback_url<\/code>, you can receive the list of IDs for all completed tasks using the <strong>&#8216;Tasks Ready&#8217;<\/strong> endpoint. 
It is designed to provide you with a list of completed tasks that haven&#8217;t been collected yet.<\/p>\n<p>Besides auditing sites with the endpoints listed above, you can also perform quick scans of individual pages with the <a href=\"\/v3\/on_page\/instant_pages\" target=\"_blank\" rel=\"noopener noreferrer\">Instant Pages<\/a> endpoint and capture screenshots of individual pages using the <a href=\"\/v3\/on_page\/page_screenshot\" target=\"_blank\" rel=\"noopener noreferrer\">Page Screenshot<\/a> endpoint. Both of these endpoints work in Live mode, meaning that you&#8217;ll get the results right away in the API response without making a separate request.<\/p>\n<p>To find answers to common questions about OnPage API and guidance on the efficient use of its features, <a href=\"https:\/\/dataforseo.com\/help-center\/category\/onpage-api\" target=\"_blank\" rel=\"noopener noreferrer\">visit our Help Center.<\/a><\/p>\n<h2>Limits and Force Stop<\/h2>\n<p>Using the <a href=\"\/v3\/on_page\/task_post\/?bash\">Task POST<\/a> function, you can send up to 2000 API calls per minute, with each POST call containing no more than 100 tasks. Contact us if you&#8217;d like to raise the limit.<\/p>\n<p><strong>Note 1:<\/strong> For all other endpoints of OnPage API (except Instant Pages and Page Screenshot), we do not recommend sending several tasks in one POST call as it may result in system overload and undesirable 4xx or 5xx errors.<\/p>\n<p><strong>Note 2:<\/strong> Unlike other OnPage API endpoints, <a href=\"\/v3\/on_page\/instant_pages\" target=\"_blank\" rel=\"noopener noreferrer\">Instant Pages<\/a> and <a href=\"\/v3\/on_page\/page_screenshot\" target=\"_blank\" rel=\"noopener noreferrer\">Page Screenshot<\/a> work based on a Live method of data processing, meaning you don&#8217;t have to make a separate GET request to obtain the results. 
Using these endpoints, you can send up to 2000 API requests per minute, with each request containing no more than 20 tasks.<\/p>\n<p><strong>Note 3:<\/strong> The maximum number of simultaneous requests you can send is limited to 30.<\/p>\n<p>If you need to force stop the crawl process of websites you specified in a task, use <a href=\"\/v3\/on_page\/force_stop\/?bash\" target=\"_blank\" rel=\"noopener noreferrer\">the Force Stop endpoint<\/a>.<\/p>\n<p>Visit <a href=\"https:\/\/dataforseo.com\/help-center\/best-practices-for-handling-onpage-api-requests\" target=\"_blank\" rel=\"noopener noreferrer\">DataForSEO Help Center<\/a> to get practical tips for request handling depending on your OnPage API payload volume.<\/p>\n<h3>The crawling requests will be sent from the following IPs:<\/h3>\n<p><code>94.130.93.30<br \/>\n168.119.141.170<br \/>\n168.119.99.190<br \/>\n168.119.99.191<br \/>\n168.119.99.192<br \/>\n168.119.99.193<br \/>\n168.119.99.194<br \/>\n68.183.60.34<br \/>\n134.209.42.109<br \/>\n68.183.60.80<br \/>\n68.183.54.131<br \/>\n68.183.49.222<br \/>\n68.183.149.30<br \/>\n68.183.157.22<br \/>\n68.183.149.129<\/code><\/p>\n<h3>The default user agent of the DataForSEO OnPage Crawler<\/h3>\n<p><code>Mozilla\/5.0 (compatible; RSiteAuditor)<\/code><\/p>\n<p>Note that the user agent can be customized by the user.<\/p>\n<h2>Cost<\/h2>\n<p>The cost of using OnPage API endpoints depends on the parameters set in the <a href=\"\/v3\/on_page\/task_post\/\" target=\"_blank\" rel=\"noopener noreferrer\">OnPage Task Post<\/a> request. In particular, using <code>load_resources<\/code>, <code>enable_javascript<\/code>, <code>enable_browser_rendering<\/code>, <code>custom_js<\/code>, and <code>calculate_keyword_density<\/code> parameters will result in additional charges. 
To learn more about the cost of all OnPage API parameters, please refer to <a href=\"https:\/\/dataforseo.com\/help-center\/cost-of-onpage-api-parameters\" target=\"_blank\" rel=\"noopener noreferrer\">this help article<\/a>.<\/p>\n<p>Your account is charged for the actual number of crawled pages. If you specified more pages than a website contains, the difference will be refunded to your account after a task is completed.<\/p>\n<p>The cost can be calculated on the <a title=\"Pricing\" href=\"https:\/\/dataforseo.com\/pricing\/on-page\" target=\"_blank\" rel=\"noopener noreferrer\">Pricing<\/a> page. You can check your spending in your <a href=\"https:\/\/app.dataforseo.com\/api-access\" target=\"_blank\" rel=\"noopener noreferrer\">account dashboard<\/a> or by making a separate call to <a href=\"\/v3\/appendix\/user_data\/?php\" target=\"_blank\" rel=\"noopener noreferrer\">the User Data endpoint.<\/a><\/p>\n<p>You can test OnPage API for free using DataForSEO <a href=\"\/v3\/appendix\/sandbox\/\">Sandbox.<\/a><br \/>\n[\/vc_column_text][\/vc_column][\/vc_row]<\/p>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>[vc_row][vc_column][vc_column_text] OnPage API: Overview OnPage API is the customizable crawling engine for extracting website performance data Endpoints and Parameters OnPage API encompasses multiple endpoints, which allow you to crawl any website or webpage according to customizable parameters and evaluate its on-page optimization performance against a multitude of SEO and website health benchmarks. 
Sending a website [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"template.php","meta":{"apibase_doc_request_yaml":"","apibase_doc_request_additional_yaml":"","apibase_doc_response_yaml":"","footnotes":""},"class_list":["post-6386","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/docs.dataforseo.com\/v3\/wp-json\/wp\/v2\/pages\/6386","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/docs.dataforseo.com\/v3\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/docs.dataforseo.com\/v3\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/docs.dataforseo.com\/v3\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/docs.dataforseo.com\/v3\/wp-json\/wp\/v2\/comments?post=6386"}],"version-history":[{"count":53,"href":"https:\/\/docs.dataforseo.com\/v3\/wp-json\/wp\/v2\/pages\/6386\/revisions"}],"predecessor-version":[{"id":6770,"href":"https:\/\/docs.dataforseo.com\/v3\/wp-json\/wp\/v2\/pages\/6386\/revisions\/6770"}],"wp:attachment":[{"href":"https:\/\/docs.dataforseo.com\/v3\/wp-json\/wp\/v2\/media?parent=6386"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}