gcloud dataproc - create and manage Google Cloud Dataproc clusters and jobs
gcloud dataproc GROUP [GCLOUD_WIDE_FLAG ...]
The gcloud dataproc command group lets you create and manage Dataproc clusters and jobs.
Dataproc is an Apache Hadoop, Apache Spark, Apache Pig, and Apache Hive service. It easily processes big datasets at low cost, creating managed clusters of any size that scale down once processing is complete.
More information on Dataproc can be found here: https://cloud.google.com/dataproc and detailed documentation can be found here: https://cloud.google.com/dataproc/docs/
To see how to create and manage clusters, run:
$ gcloud dataproc clusters
To see how to submit and manage jobs, run:
$ gcloud dataproc jobs
These flags are available to all commands: --help.
Run $ gcloud help for details.
GROUP is one of the following:
- autoscaling-policies
Create and manage Dataproc autoscaling policies.
- batches
Submit Dataproc batch jobs.
- clusters
Create and manage Dataproc clusters.
- jobs
Submit and manage Dataproc jobs.
- node-groups
Manage Dataproc node groups.
- operations
View and manage Dataproc operations.
- workflow-templates
Create and manage Dataproc workflow templates.
These variants are also available:
$ gcloud alpha dataproc
$ gcloud beta dataproc