NAME

gcloud dataplex tasks update - update a Dataplex task resource

SYNOPSIS

gcloud dataplex tasks update (TASK : --lake=LAKE --location=LOCATION) [--async] [--description=DESCRIPTION] [--display-name=DISPLAY_NAME] [--update-labels=[KEY=VALUE,...]] [--clear-labels | --remove-labels=[KEY,...]] [--execution-args=EXECUTION_ARGS --execution-project=EXECUTION_PROJECT --execution-service-account=EXECUTION_SERVICE_ACCOUNT --kms-key=KMS_KEY --max-job-execution-lifetime=MAX_JOB_EXECUTION_LIFETIME] [--notebook=NOTEBOOK --notebook-archive-uris=[NOTEBOOK_ARCHIVE_URIS,...] --notebook-file-uris=[NOTEBOOK_FILE_URIS,...] --notebook-batch-executors-count=NOTEBOOK_BATCH_EXECUTORS_COUNT --notebook-batch-max-executors-count=NOTEBOOK_BATCH_MAX_EXECUTORS_COUNT --notebook-container-image=NOTEBOOK_CONTAINER_IMAGE --notebook-container-image-java-jars=[NOTEBOOK_CONTAINER_IMAGE_JAVA_JARS,...] --notebook-container-image-properties=NOTEBOOK_CONTAINER_IMAGE_PROPERTIES --notebook-vpc-network-tags=[NOTEBOOK_VPC_NETWORK_TAGS,...] --notebook-vpc-network-name=NOTEBOOK_VPC_NETWORK_NAME | --notebook-vpc-sub-network-name=NOTEBOOK_VPC_SUB_NETWORK_NAME | --spark-archive-uris=[SPARK_ARCHIVE_URIS,...] --spark-file-uris=[SPARK_FILE_URIS,...] --batch-executors-count=BATCH_EXECUTORS_COUNT --batch-max-executors-count=BATCH_MAX_EXECUTORS_COUNT --container-image=CONTAINER_IMAGE --container-image-java-jars=[CONTAINER_IMAGE_JAVA_JARS,...] --container-image-properties=CONTAINER_IMAGE_PROPERTIES --container-image-python-packages=[CONTAINER_IMAGE_PYTHON_PACKAGES,...] --vpc-network-tags=[VPC_NETWORK_TAGS,...] --vpc-network-name=VPC_NETWORK_NAME --vpc-sub-network-name=VPC_SUB_NETWORK_NAME --spark-main-class=SPARK_MAIN_CLASS | --spark-main-jar-file-uri=SPARK_MAIN_JAR_FILE_URI | --spark-python-script-file=SPARK_PYTHON_SCRIPT_FILE | --spark-sql-script=SPARK_SQL_SCRIPT | --spark-sql-script-file=SPARK_SQL_SCRIPT_FILE] [--trigger-disabled --trigger-max-retires=TRIGGER_MAX_RETIRES --trigger-schedule=TRIGGER_SCHEDULE --trigger-start-time=TRIGGER_START_TIME] [GCLOUD_WIDE_FLAG ...]

DESCRIPTION

Update a Dataplex task resource with the given configurations.

EXAMPLES

To update a Dataplex task test-task within lake test-lake in location us-central1 and change the description to Updated Description, run:

$ gcloud dataplex tasks update projects/test-project/locations/us-central1/lakes/test-lake/tasks/test-task --description='Updated Description'

POSITIONAL ARGUMENTS

Task resource - Arguments and flags that specify the Dataplex Task you want to update. The arguments in this group can be used to specify the attributes of this resource. (NOTE) Some attributes are not given arguments in this group but can be set in other ways. To set the project attribute:

  • provide the argument task on the command line with a fully specified name;

  • set the property core/project;

  • provide the argument --project on the command line.

This must be specified.

TASK

ID of the task or fully qualified identifier for the task. To set the task attribute:

  • provide the argument task on the command line.

This positional argument must be specified if any of the other arguments in this group are specified.

--lake=LAKE

Identifier of the Dataplex lake resource.

To set the lake attribute:

  • provide the argument task on the command line with a fully specified name;

  • provide the argument --lake on the command line.

--location=LOCATION

Location of the Dataplex resource.

To set the location attribute:

  • provide the argument task on the command line with a fully specified name;

  • provide the argument --location on the command line;

  • set the property dataplex/location.
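The fully specified name mentioned above bundles the project, location, and lake into a single path, as in the EXAMPLES section. As a minimal sketch, the helper task_resource_name below is hypothetical (it is not part of gcloud) and only assembles that path from its parts; the IDs are the placeholder values used in EXAMPLES.

```shell
# Hypothetical helper (not part of gcloud): build the fully qualified task name.
task_resource_name() {
  # $1=project  $2=location  $3=lake  $4=task
  printf 'projects/%s/locations/%s/lakes/%s/tasks/%s\n' "$1" "$2" "$3" "$4"
}

# Print the command instead of running it, so the sketch has no side effects.
echo "gcloud dataplex tasks update $(task_resource_name test-project us-central1 test-lake test-task) --description='Updated Description'"
```

The same task can also be addressed in short form as test-task --lake=test-lake --location=us-central1, with the project taken from --project or the core/project property.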

FLAGS

--async

Return immediately, without waiting for the operation in progress to complete.

--description=DESCRIPTION

Description of the task.

--display-name=DISPLAY_NAME

Display name of the task.

--update-labels=[KEY=VALUE,...]

List of label KEY=VALUE pairs to update. If a label exists, its value is modified. Otherwise, a new label is created.

Keys must start with a lowercase character and contain only hyphens (-), underscores (_), lowercase characters, and numbers. Values must contain only hyphens (-), underscores (_), lowercase characters, and numbers.
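The key and value rules above can be checked locally before calling the command. The following is a sketch only: valid_label_key and valid_label_value are hypothetical helpers, not part of gcloud, and they assume that "lowercase characters" means ASCII a-z, which may be narrower than what the service accepts.

```shell
# Hypothetical pre-flight checks (not part of gcloud).
# Assumption: only ASCII lowercase letters are treated as lowercase characters.
valid_label_key() {
  # Must start with a lowercase letter; then letters, digits, '_' or '-'.
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9_-]*$'
}

valid_label_value() {
  # Letters, digits, '_' or '-' only (an empty value passes this check).
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9_-]*$'
}

for key in env Env 1team data_owner; do
  if valid_label_key "$key"; then
    echo "$key: ok"
  else
    echo "$key: rejected"
  fi
done
```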

At most one of these can be specified:
--clear-labels

Remove all labels. If --update-labels is also specified then --clear-labels is applied first.

For example, to remove all labels:

$ gcloud dataplex tasks update --clear-labels

To remove all existing labels and create two new labels, foo and baz:

$ gcloud dataplex tasks update --clear-labels --update-labels foo=bar,baz=qux

--remove-labels=[KEY,...]

List of label keys to remove. If a label does not exist it is silently ignored. If --update-labels is also specified then --update-labels is applied first.

Spec related to how a task is executed.
--execution-args=EXECUTION_ARGS

The arguments to pass to the task. The args can use placeholders of the format ${placeholder} as part of key/value string. These will be interpolated before passing the args to the driver. Currently supported placeholders:

  • ${task_id}

  • ${job_time}

To pass positional args, set the key to TASK_ARGS. The value should be a comma-separated string of all the positional arguments. To use a delimiter other than a comma, refer to https://cloud.google.com/sdk/gcloud/reference/topic/escaping. If other keys are present in the args, TASK_ARGS is passed as the last argument.
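As a rough illustration of the TASK_ARGS convention, the hypothetical helper join_task_args below (not part of gcloud) joins positional arguments into the comma-separated form described above. The argument values are made up for the example. Because the joined value itself contains commas, combining it with other keys in --execution-args probably requires an alternate delimiter as described on the escaping page; that step is deliberately left out of this sketch.

```shell
# Hypothetical helper (not part of gcloud): join positional args with commas.
# The function body runs in a subshell so the IFS change does not leak out.
join_task_args() (
  IFS=,
  printf 'TASK_ARGS=%s\n' "$*"
)

# Example positional arguments; the names and bucket are placeholders.
join_task_args --input gs://bucket-name/in --mode full
```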

--execution-project=EXECUTION_PROJECT

The project in which jobs are run. By default, the project containing the Lake is used. If a project is provided, the --execution-service-account must belong to this same project.

--execution-service-account=EXECUTION_SERVICE_ACCOUNT

Service account to use to execute a task. If not provided, the default Compute service account for the project is used.

--kms-key=KMS_KEY

The Cloud KMS key to use for encryption, of the form: projects/{project_number}/locations/{location_id}/keyRings/{key-ring-name}/cryptoKeys/{key-name}

--max-job-execution-lifetime=MAX_JOB_EXECUTION_LIFETIME

The maximum duration before the job execution expires.

Select which task you want to schedule and provide the required arguments:

  • spark tasks

  • notebook tasks

At most one of these can be specified:

Config related to running custom Notebook tasks.
--notebook=NOTEBOOK

Path to the input notebook. This can be the Google Cloud Storage URI of the notebook file or the path to a Notebook Content.

--notebook-archive-uris=[NOTEBOOK_ARCHIVE_URIS,...]

Google Cloud Storage URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

--notebook-file-uris=[NOTEBOOK_FILE_URIS,...]

Google Cloud Storage URIs of files to be placed in the working directory of each executor.

Compute resources needed for a Task when using Dataproc Serverless.
--notebook-batch-executors-count=NOTEBOOK_BATCH_EXECUTORS_COUNT

Total number of job executors.

--notebook-batch-max-executors-count=NOTEBOOK_BATCH_MAX_EXECUTORS_COUNT

Max configurable executors. If max_executors_count > executors_count, then auto-scaling is enabled.

Container Image Runtime Configuration.
--notebook-container-image=NOTEBOOK_CONTAINER_IMAGE

Optional custom container image for the job.

--notebook-container-image-java-jars=[NOTEBOOK_CONTAINER_IMAGE_JAVA_JARS,...]

A list of Java JARS to add to the classpath. Valid input includes Cloud Storage URIs to Jar binaries. For example, gs://bucket-name/my/path/to/file.jar

--notebook-container-image-properties=NOTEBOOK_CONTAINER_IMAGE_PROPERTIES

Override to common configuration of open source components installed on the Dataproc cluster. The properties to set on daemon config files. Property keys are specified in prefix:property format, for example core:hadoop.tmp.dir. For more information, see Cluster properties https://cloud.google.com/dataproc/docs/concepts/cluster-properties.

Cloud VPC Network used to run the infrastructure.
--notebook-vpc-network-tags=[NOTEBOOK_VPC_NETWORK_TAGS,...]

List of network tags to apply to the job.

The Cloud VPC network identifier.

At most one of these can be specified:

--notebook-vpc-network-name=NOTEBOOK_VPC_NETWORK_NAME

The Cloud VPC network in which the job is run. By default, the Cloud VPC network named Default within the project is used.

--notebook-vpc-sub-network-name=NOTEBOOK_VPC_SUB_NETWORK_NAME

The Cloud VPC sub-network in which the job is run.

Config related to running custom Spark tasks.
--spark-archive-uris=[SPARK_ARCHIVE_URIS,...]

Google Cloud Storage URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

--spark-file-uris=[SPARK_FILE_URIS,...]

Google Cloud Storage URIs of files to be placed in the working directory of each executor.

Compute resources needed for a Task when using Dataproc Serverless.
--batch-executors-count=BATCH_EXECUTORS_COUNT

Total number of job executors.

--batch-max-executors-count=BATCH_MAX_EXECUTORS_COUNT

Max configurable executors. If max_executors_count > executors_count, then auto-scaling is enabled.
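The auto-scaling condition above is a plain comparison between the two executor flags. A minimal sketch, using a hypothetical helper autoscaling_enabled (not part of gcloud) and made-up counts:

```shell
# Hypothetical helper (not part of gcloud): report whether the two values
# would enable auto-scaling, i.e. whether max executors > executors.
autoscaling_enabled() {
  # $1=value for --batch-executors-count  $2=value for --batch-max-executors-count
  [ "$2" -gt "$1" ]
}

if autoscaling_enabled 2 10; then
  echo "auto-scaling enabled"
else
  echo "auto-scaling disabled"
fi
```

The same comparison applies to the --notebook-batch-executors-count and --notebook-batch-max-executors-count pair.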

Container Image Runtime Configuration.
--container-image=CONTAINER_IMAGE

Optional custom container image for the job.

--container-image-java-jars=[CONTAINER_IMAGE_JAVA_JARS,...]

A list of Java JARS to add to the classpath. Valid input includes Cloud Storage URIs to Jar binaries. For example, gs://bucket-name/my/path/to/file.jar

--container-image-properties=CONTAINER_IMAGE_PROPERTIES

Override to common configuration of open source components installed on the Dataproc cluster. The properties to set on daemon config files. Property keys are specified in prefix:property format, for example core:hadoop.tmp.dir. For more information, see Cluster properties https://cloud.google.com/dataproc/docs/concepts/cluster-properties.

--container-image-python-packages=[CONTAINER_IMAGE_PYTHON_PACKAGES,...]

A list of Python packages to be installed. Valid formats include a Cloud Storage URI to a pip-installable library. For example, gs://bucket-name/my/path/to/lib.tar.gz

Cloud VPC Network used to run the infrastructure.
--vpc-network-tags=[VPC_NETWORK_TAGS,...]

List of network tags to apply to the job.

The Cloud VPC network identifier.
--vpc-network-name=VPC_NETWORK_NAME

The Cloud VPC network in which the job is run. By default, the Cloud VPC network named Default within the project is used.

--vpc-sub-network-name=VPC_SUB_NETWORK_NAME

The Cloud VPC sub-network in which the job is run.

The specification of the main method to call to drive the job. Specify either the jar file that contains the main class or the main class name.

At most one of these can be specified:

--spark-main-class=SPARK_MAIN_CLASS

The name of the driver's main class. The jar file that contains the class must be in the default CLASSPATH or specified in

--spark-main-jar-file-uri=SPARK_MAIN_JAR_FILE_URI

The Google Cloud Storage URI of the jar file that contains the main class. The execution args are passed in as a sequence of named process arguments (--key=value).

--spark-python-script-file=SPARK_PYTHON_SCRIPT_FILE

The Google Cloud Storage URI of the main Python file to use as the driver. Must be a .py file.

--spark-sql-script=SPARK_SQL_SCRIPT

The SQL query text.

--spark-sql-script-file=SPARK_SQL_SCRIPT_FILE

A reference to a query file. This can be the Google Cloud Storage URI of the query file or the path to a SqlScript Content.

Spec related to how often and when a task should be triggered.
--trigger-disabled

Prevent the task from executing. This does not cancel already running tasks. It is intended to temporarily disable RECURRING tasks.

--trigger-max-retires=TRIGGER_MAX_RETIRES

Number of retry attempts before aborting. Set to zero to never attempt to retry a failed task.

--trigger-schedule=TRIGGER_SCHEDULE

Cron schedule https://en.wikipedia.org/wiki/Cron for running tasks periodically.
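As a sketch of what a schedule value might look like, the snippet below assumes the classic five-field cron format (minute, hour, day of month, month, day of week) described on the linked page; the service may accept other forms. The helper cron_field_count is hypothetical and not part of gcloud, and the task, lake, and location names are the placeholders from EXAMPLES.

```shell
# Hypothetical helper (not part of gcloud): count whitespace-separated fields.
# The function body runs in a subshell so that 'set -f' does not leak out.
cron_field_count() (
  set -f        # keep '*' literal instead of expanding it to file names
  set -- $1     # split the schedule on whitespace
  echo "$#"
)

schedule='0 2 * * *'   # minute 0, hour 2, every day: daily at 02:00

if [ "$(cron_field_count "$schedule")" -eq 5 ]; then
  # Print the command instead of running it.
  echo "gcloud dataplex tasks update test-task --lake=test-lake --location=us-central1 --trigger-schedule='$schedule'"
fi
```

Quoting the schedule matters on the command line: without quotes, the shell would expand each * to file names before gcloud sees it.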

--trigger-start-time=TRIGGER_START_TIME

The first run of the task will be after this time. If not specified, the task will run shortly after being submitted if ON_DEMAND and based on the schedule if RECURRING.

GCLOUD WIDE FLAGS

These flags are available to all commands: --access-token-file, --account, --billing-project, --configuration, --flags-file, --flatten, --format, --help, --impersonate-service-account, --log-http, --project, --quiet, --trace-token, --user-output-enabled, --verbosity.

Run $ gcloud help for details.

API REFERENCE

This command uses the dataplex/v1 API. The full documentation for this API can be found at: https://cloud.google.com/dataplex/docs

NOTES

This variant is also available:

$ gcloud alpha dataplex tasks update