Setting Google Dataset Search Tasks

The Google Dataset Search API provides the top 20 Google Dataset Search results for the specified keyword. You can optionally specify other parameters.

There are two different priorities that stand for the relative speed of task execution: normal and high.

Pricing

Your account will be charged only for setting a task.
The cost can be calculated on the Pricing page.

All POST data should be sent in the JSON format (UTF-8 encoding). The task setting is done using the POST method. When setting a task, you should send all task parameters in the task array of the generic POST array. You can send up to 2000 API calls per minute, with each POST call containing no more than 100 tasks. If your POST call contains over 100 tasks, the tasks over this limit will return the 40006 error. Visit DataForSEO Help Center to get practical tips for request handling depending on your SERP API payload volume.
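The limit of 100 tasks per POST call can be respected by splitting a larger task list into batches before sending. The following is a minimal sketch; the helper name `chunk_tasks` is hypothetical and is not part of any DataForSEO client library.

```python
from typing import Iterator

MAX_TASKS_PER_POST = 100  # limit stated above; tasks beyond it return error 40006


def chunk_tasks(tasks: list[dict], size: int = MAX_TASKS_PER_POST) -> Iterator[list[dict]]:
    """Yield consecutive slices of `tasks`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(tasks), size):
        yield tasks[start:start + size]


# 250 illustrative tasks are split into three POST payloads.
tasks = [{"keyword": f"water quality {i}"} for i in range(250)]
batches = list(chunk_tasks(tasks))
print([len(batch) for batch in batches])  # prints [100, 100, 50]
```

Each batch can then be sent as the body of one POST call, while keeping the overall call rate under 2000 per minute.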

You can retrieve the results of completed tasks using the unique task identifier id. Alternatively, we can send them to you as soon as they are ready if you specify the postback_url or pingback_url when setting a task. Note that if your server doesn’t respond within 10 seconds, the connection will be aborted by timeout, and the task will be transferred to the ‘Tasks Ready’ list. The error code and message depend on your server’s configuration. See Help Center to learn more about using pingbacks and postbacks with DataForSEO APIs.

Below you will find a detailed description of the parameters that are required or recommended for setting a task.

Main Parameters

Field name Type Description
keyword string

keyword

required field

you can specify up to 700 characters in the keyword field

all %## will be decoded (plus character ‘+’ will be decoded to a space character)

if you need to use the “%” character for your keyword, please specify it as “%25”;

if you need to use the “+” character for your keyword, please specify it as “%2B”.


learn more about rules and limitations of keyword and keywords fields in DataForSEO APIs in this Help Center article
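The escaping rules for the keyword field can be sketched as a small helper. The function name `escape_keyword` is hypothetical, and applying the 700-character check to the escaped value (rather than the raw one) is an assumption made for this illustration.

```python
MAX_KEYWORD_LENGTH = 700  # character limit stated above


def escape_keyword(keyword: str) -> str:
    """Escape literal '%' and '+' so they survive the API's %## decoding."""
    # '%' is replaced first; otherwise the '%' in '%2B' would be escaped again.
    escaped = keyword.replace("%", "%25").replace("+", "%2B")
    # Assumption: the limit applies to the value as it is sent.
    if len(escaped) > MAX_KEYWORD_LENGTH:
        raise ValueError("keyword exceeds 700 characters")
    return escaped


print(escape_keyword("c++ 100% coverage"))  # prints c%2B%2B 100%25 coverage
```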

language_code string

search engine language code

optional field

possible value:

en

depth integer

parsing depth

optional field

number of results in SERP

default value: 20

max value: 700


your account will be billed for each SERP containing up to 20 results;

setting depth above 20 may result in additional charges if the search engine returns more than 20 results;

if the specified depth is higher than the number of results in the response, the difference will be refunded to your account balance automatically
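The billing rules for depth come down to simple arithmetic, sketched below under stated assumptions. The function `billed_serps` is hypothetical. It assumes that one billed SERP covers a block of up to 20 results and that at least one SERP is billed per task; neither the rounding nor the minimum is confirmed here, so check the Pricing page for the actual amounts.

```python
import math

RESULTS_PER_SERP = 20  # one billed SERP holds up to 20 results, per the text above
MAX_DEPTH = 700        # maximum depth stated above


def billed_serps(depth: int, returned_results: int) -> int:
    """Estimate how many SERPs remain billed after any automatic refund."""
    if not 1 <= depth <= MAX_DEPTH:
        raise ValueError("depth must be between 1 and 700")
    # Results actually delivered cannot exceed the requested depth.
    delivered = min(depth, max(returned_results, 0))
    # Assumption: at least one SERP is billed even when few results come back.
    return max(1, math.ceil(delivered / RESULTS_PER_SERP))


# depth=100 requests up to 5 SERPs; if only 45 results exist, 3 remain billed.
print(billed_serps(100, 45))  # prints 3
```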

priority integer

task priority

optional field

can take the following values:

1 – normal execution priority (set by default)

2 – high execution priority


You will be additionally charged for the tasks with high execution priority.

The cost can be calculated on the Pricing page.

device string

device type

optional field

return results for a specific device type

possible value: desktop

pingback_url string

notification URL of a completed task

optional field

when a task is completed we will notify you by GET request sent to the URL you have specified

you can use the ‘$id’ string as a $id variable and ‘$tag’ as urlencoded $tag variable. We will set the necessary values before sending the request.

example:

http://your-server.com/pingscript?id=$id

http://your-server.com/pingscript?id=$id&tag=$tag

Note: special characters in pingback_url will be urlencoded;

for example, the # character will be encoded into %23


learn more on our Help Center
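To see what your server can expect to receive, the substitution of `$id` and `$tag` can be imitated locally. This is only a sketch of the behavior described above: the function `expand_callback_url` is hypothetical, and the exact encoding the service applies to the tag (for example, `%20` versus `+` for spaces) is an assumption.

```python
from urllib.parse import quote


def expand_callback_url(template: str, task_id: str, tag: str) -> str:
    """Replace $id and $tag in a pingback or postback URL template."""
    # $id is inserted as-is; $tag is inserted urlencoded, as described above.
    return template.replace("$id", task_id).replace("$tag", quote(tag, safe=""))


url = expand_callback_url(
    "https://your-server.com/pingscript?id=$id&tag=$tag",
    "12141652-4426-0066-0000-db8c5b84a565",
    "some string/123",
)
print(url)
```

A handler on your side can then read the `id` and `tag` query parameters and url-decode the tag to match the notification with the task you set.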

postback_url string

URL for sending task results

optional field

once the task is completed, we will send a POST request with its results compressed in the gzip format to the postback_url you specified

you can use the ‘$id’ string as a $id variable and ‘$tag’ as urlencoded $tag variable. We will set the necessary values before sending the request

example:

http://your-server.com/postbackscript?id=$id

http://your-server.com/postbackscript?id=$id&tag=$tag

Note: special characters in postback_url will be urlencoded;

for example, the # character will be encoded into %23


learn more on our Help Center
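Because postback results arrive gzip-compressed, the receiving script has to decompress the request body before parsing it. The sketch below shows that step with the Python standard library only; `decode_postback` is a hypothetical helper, and the sample body is built locally for illustration rather than taken from a real postback.

```python
import gzip
import json


def decode_postback(body: bytes) -> dict:
    """Decompress a gzip-compressed POST body and parse it as JSON."""
    return json.loads(gzip.decompress(body).decode("utf-8"))


# Simulated postback body; a real one would carry the task results.
sample = gzip.compress(json.dumps({"status_code": 20000, "tasks": []}).encode("utf-8"))
print(decode_postback(sample)["status_code"])  # prints 20000
```

Keep the handler fast: as noted earlier, the connection is aborted if your server does not respond within 10 seconds.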

postback_data string

postback_url datatype

required field if you specify postback_url

corresponds to the datatype that will be sent to your server

only value: advanced


Below are the additional parameters you can use when setting a task.

Additional Parameters

Field name Type Description
language_name string

full name of search engine language

optional field

if you use this field, you don't need to specify language_code

possible value:

English

os string

device operating system

optional field

possible values: windows, macos

default value: windows

tag string

user-defined task identifier

optional field

the character limit is 255

you can use this parameter to identify the task and match it with the result

you will find the specified tag value in the data object of the response

last_updated string

last time the dataset was updated

optional field

possible values: 1m, 1y, 3y

file_formats array

file formats of the dataset

optional field

possible values: other, archive, text, image, document, tabular

usage_rights string

usage rights of the dataset

optional field

possible values: commercial, noncommercial

is_free boolean

indicates whether displayed datasets are free

optional field

possible values: true, false

topics array

dataset topics

optional field

possible values: humanities, social_sciences, life_sciences, agriculture, natural_sciences, geo, computer, architecture_and_urban_planning, engineering
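Since most of the additional parameters accept only a fixed set of values, a task can be checked locally before it is posted. The following is a minimal sketch: `validate_task` is a hypothetical helper, the allowed values are copied from the parameter descriptions above, and the API itself remains the authority on what it accepts.

```python
# Allowed values, copied from the parameter descriptions above.
SINGLE_VALUE_FIELDS = {
    "last_updated": ("1m", "1y", "3y"),
    "usage_rights": ("commercial", "noncommercial"),
    "os": ("windows", "macos"),
}
LIST_FIELDS = {
    "file_formats": ("other", "archive", "text", "image", "document", "tabular"),
    "topics": (
        "humanities", "social_sciences", "life_sciences", "agriculture",
        "natural_sciences", "geo", "computer",
        "architecture_and_urban_planning", "engineering",
    ),
}


def validate_task(task: dict) -> list[str]:
    """Return a list of problems found in `task`; an empty list means none."""
    problems = []
    if not task.get("keyword"):
        problems.append("keyword is required")
    for field, allowed in SINGLE_VALUE_FIELDS.items():
        if field in task and task[field] not in allowed:
            problems.append(f"{field}: unsupported value {task[field]!r}")
    for field, allowed in LIST_FIELDS.items():
        for value in task.get(field, []):
            if value not in allowed:
                problems.append(f"{field}: unsupported value {value!r}")
    if "is_free" in task and not isinstance(task["is_free"], bool):
        problems.append("is_free must be true or false")
    if len(task.get("tag", "")) > 255:
        problems.append("tag exceeds 255 characters")
    return problems


task = {
    "keyword": "water quality",
    "last_updated": "1m",
    "file_formats": ["archive", "image"],
    "usage_rights": "noncommercial",
    "is_free": True,
    "topics": ["natural_sciences", "geo"],
}
print(validate_task(task))  # prints []
```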


In response, the API server returns JSON-encoded data containing a tasks array with information specific to the tasks you set.

Description of the fields in the results array:

Field name Type Description
version string the current version of the API
status_code integer general status code
you can find the full list of the response codes here
Note: we strongly recommend designing a necessary system for handling related exceptional or error conditions
status_message string general informational message
you can find the full list of general informational messages here
time string execution time, seconds
cost float total tasks cost, USD
tasks_count integer the number of tasks in the tasks array
tasks_error integer the number of tasks in the tasks array returned with an error
tasks array array of tasks
        id string unique task identifier in our system
in the Universally unique identifier (UUID) format
        status_code integer status code of the task
generated by DataForSEO; can be within the following range: 10000-60000
        status_message string informational message of the task
        time string execution time, seconds
        cost float cost of the task, USD
        result_count integer number of elements in the result array
        path array URL path
        data object contains the same parameters that you specified in the POST request
        result array array of results
in this case, the value will be null
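The fields described above can be used to separate created tasks from failed ones after a POST call. The sketch below is illustrative only: `summarize_response` is a hypothetical helper, the codes 20000 and 20100 are the ones that appear in the examples on this page, and treating every other task-level code as a failure is an assumption.

```python
def summarize_response(response: dict) -> dict:
    """Collect ids of created tasks and details of tasks that were not created."""
    created, failed = [], []
    for task in response.get("tasks") or []:
        if task["status_code"] == 20100:  # "Task Created."
            created.append(task["id"])
        else:
            failed.append((task["id"], task["status_code"], task["status_message"]))
    return {
        "ok": response["status_code"] == 20000,  # general status of the call
        "created": created,
        "failed": failed,
    }


# Trimmed-down response in the shape described above.
response = {
    "status_code": 20000,
    "status_message": "Ok.",
    "tasks_count": 1,
    "tasks_error": 0,
    "tasks": [
        {
            "id": "12141652-4426-0066-0000-db8c5b84a565",
            "status_code": 20100,
            "status_message": "Task Created.",
            "result": None,
        }
    ],
}
print(summarize_response(response)["created"])  # prints ['12141652-4426-0066-0000-db8c5b84a565']
```

Store the returned ids: they are what you use later to retrieve the results of completed tasks.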

Instead of ‘login’ and ‘password’ use your credentials from https://app.dataforseo.com/api-access

# Instead of 'login' and 'password' use your credentials from https://app.dataforseo.com/api-access
login="login"
password="password"
cred="$(printf '%s' "${login}:${password}" | base64)"
curl --location --request POST 'https://api.dataforseo.com/v3/serp/google/dataset_search/task_post' \
--header "Authorization: Basic ${cred}" \
--header "Content-Type: application/json" \
--data-raw '[
  {
    "keyword": "water quality",
    "last_updated": "1m",
    "file_formats": [ "archive", "image" ],
    "usage_rights": "noncommercial",
    "is_free": true,
    "topics": [ "natural_sciences", "geo" ]
  }
]'
<?php
// You can download this file from here https://cdn.dataforseo.com/v3/examples/php/php_RestClient.zip
require('RestClient.php');
$api_url = 'https://api.dataforseo.com/';
try {
   // Instead of 'login' and 'password' use your credentials from https://app.dataforseo.com/api-access
   $client = new RestClient($api_url, null, 'login', 'password');
} catch (RestClientException $e) {
   echo "\n";
   print "HTTP code: {$e->getHttpCode()}\n";
   print "Error code: {$e->getCode()}\n";
   print "Message: {$e->getMessage()}\n";
   print  $e->getTraceAsString();
   echo "\n";
   exit();
}
$post_array = array();
// example #1 - a simple way to set a task
// this way requires you to specify a keyword.
$post_array[] = array(
   "keyword" => "water quality",
   "last_updated" => "1m",
   "file_formats" => [ "archive", "image" ],
   "usage_rights" => "noncommercial",
   "is_free" => true,
   "topics" => [ "natural_sciences", "geo" ]
);
// example #2 - a way to set a task with additional parameters
// high priority allows us to complete a task faster, but you will be charged more credits.
// after a task is completed, we will send a GET request to the address you specify. Instead of $id and $tag, you will receive actual values that are relevant to this task.
$post_array[] = array(
   "keyword" => "water quality",
   "last_updated" => "1m",
   "file_formats" => [ "archive", "image" ],
   "usage_rights" => "noncommercial",
   "is_free" => true,
   "topics" => [ "natural_sciences", "geo" ],
   "priority" => 2,
   "tag" => "some_string_123",
   "pingback_url" => 'https://your-server.com/pingscript?id=$id&tag=$tag'
);
// this example has 2 elements, but if you have a large number of tasks, send up to 100 elements per POST request
if (count($post_array) > 0) {
   try {
      // POST /v3/serp/google/dataset_search/task_post
      // in addition to 'google' and 'dataset_search' you can also set other search engine and type parameters
      $result = $client->post('/v3/serp/google/dataset_search/task_post', $post_array);
      print_r($result);
      // do something with post result
   } catch (RestClientException $e) {
      echo "\n";
      print "HTTP code: {$e->getHttpCode()}\n";
      print "Error code: {$e->getCode()}\n";
      print "Message: {$e->getMessage()}\n";
      print  $e->getTraceAsString();
      echo "\n";
   }
}
$client = null;
?>
const post_array = [];

post_array.push({
    "keyword": "water quality",
    "last_updated": "1m",
    "file_formats": [ "archive", "image" ],
    "usage_rights": "noncommercial",
    "is_free": true,
    "topics": [ "natural_sciences", "geo" ]
});

const axios = require('axios');

axios({
    method: 'post',
    url: 'https://api.dataforseo.com/v3/serp/google/dataset_search/task_post',
    auth: {
        username: 'login',
        password: 'password'
    },
    data: post_array,
    headers: {
        'content-type': 'application/json'
    }
}).then(function (response) {
    var result = response['data']['tasks'];
    // Result data
    console.log(result);
}).catch(function (error) {
    console.log(error);
});
from client import RestClient
# You can download this file from here https://cdn.dataforseo.com/v3/examples/python/python_Client.zip

client = RestClient("login", "password")

post_data = dict()
# example #1 - a simple way to set a task
# this way requires you to specify a keyword.
post_data[len(post_data)] = dict(
    keyword="water quality",
    last_updated="1m",
    file_formats=[ "archive", "image" ],
    usage_rights="noncommercial",
    is_free=True,
    topics=[ "natural_sciences", "geo" ]
)
# example #2 - a way to set a task with additional parameters
# high priority allows us to complete a task faster, but you will be charged more.
# after a task is completed, we will send a GET request to the address you specify. Instead of $id and $tag, you will receive actual values that are relevant to this task.
post_data[len(post_data)] = dict(
    keyword="water quality",
    last_updated="1m",
    file_formats=[ "archive", "image" ],
    usage_rights="noncommercial",
    is_free=True,
    topics=[ "natural_sciences", "geo" ],
    priority=2,
    tag="some_string_123",
    pingback_url="https://your-server.com/pingscript?id=$id&tag=$tag"
)
# POST /v3/serp/google/dataset_search/task_post
# in addition to 'google' and 'dataset_search' you can also set other search engine and type parameters
# the full list of possible parameters is available in documentation
response = client.post("/v3/serp/google/dataset_search/task_post", post_data)
# you can find the full list of the response codes here https://docs.dataforseo.com/v3/appendix/errors
if response["status_code"] == 20000:
    print(response)
    # do something with result
else:
    print("error. Code: %d Message: %s" % (response["status_code"], response["status_message"]))
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

namespace DataForSeoDemos
{
    public static partial class Demos
    {
        public static async Task serp_task_post()
        {
            var httpClient = new HttpClient
            {
                BaseAddress = new Uri("https://api.dataforseo.com/"),
                // Instead of 'login' and 'password' use your credentials from https://app.dataforseo.com/api-access
                DefaultRequestHeaders = { Authorization = new AuthenticationHeaderValue("Basic", Convert.ToBase64String(Encoding.ASCII.GetBytes("login:password"))) }
            };
            var postData = new List<object>();
            // example #1 - a simple way to set a task
            // this way requires you to specify a keyword.
            postData.Add(new
            {
                keyword = "water quality",
                last_updated = "1m",
                file_formats = new[] { "archive", "image" },
                usage_rights = "noncommercial",
                is_free = true,
                topics = new[] { "natural_sciences", "geo" }
            });
            // example #2 - a way to set a task with additional parameters
            // high priority allows us to complete a task faster, but you will be charged more money.
            // after a task is completed, we will send a GET request to the address you specify. Instead of $id and $tag, you will receive actual values that are relevant to this task.
            postData.Add(new
            {
                keyword = "water quality",
                last_updated = "1m",
                file_formats = new[] { "archive", "image" },
                usage_rights = "noncommercial",
                is_free = true,
                topics = new[] { "natural_sciences", "geo" },
                priority = 2,
                tag = "some_string_123",
                pingback_url = "https://your-server.com/pingscript?id=$id&tag=$tag"
            });
            // POST /v3/serp/google/dataset_search/task_post
            // in addition to 'google' and 'dataset_search' you can also set other search engine and type parameters
            // the full list of possible parameters is available in documentation
            var taskPostResponse = await httpClient.PostAsync("/v3/serp/google/dataset_search/task_post", new StringContent(JsonConvert.SerializeObject(postData)));
            var result = JsonConvert.DeserializeObject<dynamic>(await taskPostResponse.Content.ReadAsStringAsync());
            // you can find the full list of the response codes here https://docs.dataforseo.com/v3/appendix/errors
            if (result.status_code == 20000)
            {
                // do something with result
                Console.WriteLine(result);
            }
            else
                Console.WriteLine($"error. Code: {result.status_code} Message: {result.status_message}");
        }
    }
}

The above command returns JSON structured like this:

{
  "version": "0.1.20220819",
  "status_code": 20000,
  "status_message": "Ok.",
  "time": "0.0469 sec.",
  "cost": 0.0006,
  "tasks_count": 1,
  "tasks_error": 0,
  "tasks": [
    {
      "id": "12141652-4426-0066-0000-db8c5b84a565",
      "status_code": 20100,
      "status_message": "Task Created.",
      "time": "0.0043 sec.",
      "cost": 0.0006,
      "result_count": 0,
      "path": [
        "v3",
        "serp",
        "google",
        "dataset_search",
        "task_post"
      ],
      "data": {
        "api": "serp",
        "function": "task_post",
        "se": "google",
        "se_type": "dataset_search",
        "keyword": "water quality",
        "last_updated": "1m",
        "file_formats": [
          "archive",
          "image"
        ],
        "usage_rights": "noncommercial",
        "is_free": true,
        "topics": [
          "natural_sciences",
          "geo"
        ],
        "device": "desktop",
        "os": "windows"
      },
      "result": null
    }
  ]
}