This is my current Hadoop job:
java -cp `hadoop classpath`:/usr/local/src/jobs/MyJob/tony-cli-0.1.5-all.jar com.linkedin.tony.cli.ClusterSubmitter \
--python_venv=/usr/local/src/jobs/MyJob/mnist_venv.zip \
--src_dir=/usr/local/src/jobs/MyJob/ \
--executes=/usr/local/src/jobs/MyJob/src/mnist_distributed.py \
--conf_file=/usr/local/src/jobs/MyJob/tony.xml \
--python_binary_path=venv/bin/python3.5
How can I convert it into a gcloud dataproc jobs submit hadoop job?
I tried:
gcloud dataproc jobs submit hadoop --cluster tony-dev \
--jar /usr/local/src/jobs/MyJob/tony-cli-0.1.5-all.jar --class com.linkedin.tony.cli.ClusterSubmitter -- \
--python_venv=/usr/local/src/jobs/MyJob/mnist_venv.zip \
--src_dir=/usr/local/src/jobs/MyJob/ \
--executes=/usr/local/src/jobs/MyJob/src/mnist_distributed.py \
--conf_file=/usr/local/src/jobs/MyJob/tony.xml \
--python_binary_path=venv/bin/python3.5
I keep getting:
ERROR: (gcloud.dataproc.jobs.submit.hadoop) argument --class: Exactly one of (--class | --jar) must be specified.
Usage: gcloud dataproc jobs submit hadoop --cluster=CLUSTER (--class=MAIN_CLASS | --jar=MAIN_JAR) [optional flags] [-- JOB_ARGS ...]
optional flags may be --archives | --async | --bucket | --class |
--driver-log-levels | --files | --help | --jar |
--jars | --labels | --max-failures-per-hour |
--properties | --region
For detailed information on this command and its flags, run:
gcloud dataproc jobs submit hadoop --help
If I instead pass the class name without the --class flag:
gcloud dataproc jobs submit hadoop --cluster tony-dev \
--jar /usr/local/src/jobs/MyJob/tony-cli-0.1.5-all.jar com.linkedin.tony.cli.ClusterSubmitter -- \
--python_venv=/usr/local/src/jobs/MyJob/mnist_venv.zip \
--src_dir=/usr/local/src/jobs/MyJob/ \
--executes=/usr/local/src/jobs/MyJob/src/mnist_distributed.py \
--conf_file=/usr/local/src/jobs/MyJob/tony.xml \
--python_binary_path=venv/bin/python3.5
I get:
ERROR: (gcloud.dataproc.jobs.submit.hadoop) unrecognized arguments: com.linkedin.tony.cli.ClusterSubmitter
For reference, see here.