[UI][EDP] Schema for "args" is incorrect for Pig jobs
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Sahara | Fix Released | Medium | Chad Roberts | | |
Bug Description
For Pig jobs, job_configs["args"] should be a list of strings. Savanna currently requires it to be a dictionary and generates a workflow which is incorrect but will still run.
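To make the expected shape concrete, here is a minimal sketch of a check that accepts a list of strings and rejects a dictionary. The function name `validate_pig_args` and the sample values are hypothetical and are not taken from the Savanna code base.

```python
def validate_pig_args(job_configs):
    """Return the "args" list, raising ValueError if it is malformed.

    Hypothetical helper for illustration only; not Savanna's validator.
    """
    args = job_configs.get("args", [])
    if not isinstance(args, list):
        raise ValueError('job_configs["args"] must be a list of strings')
    for item in args:
        if not isinstance(item, str):
            raise ValueError('every entry in job_configs["args"] must be a string')
    return args


# Shape this report asks for: extra Pig flags as a flat list of strings.
good = {"args": ["-v", "-x", "local"]}

# Shape currently required, per this report: a dictionary.
bad = {"args": {"-x": "local"}}
```

Calling `validate_pig_args(good)` returns the list unchanged, while `validate_pig_args(bad)` raises `ValueError`.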
Here is some background:
Oozie allows <param> tags and <argument> tags in pig actions. Both are used to pass options to the Pig script (Pig usage is shown at the end of this description).
For Oozie pig actions, <param> is shorthand for a pair of <argument> tags.
So, job_configs[
Savanna currently uses <param> tags to set the INPUT and OUTPUT values for the pig script based on data sources.
job_configs["args"] would then be used to set additional flags (see the Pig usage below).
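The division of labour described above (data sources become <param> tags, extra flags become <argument> tags) could be sketched as follows. This is an illustration under assumptions, not Savanna's workflow generator: the helper `build_pig_action`, the script name, and the paths are all hypothetical, and only the <script>, <param>, and <argument> children of the pig action are emitted.

```python
import xml.etree.ElementTree as ET


def build_pig_action(script, params, args):
    """Build a partial Oozie <pig> element (hypothetical helper).

    params: dict of name -> value, emitted as <param>NAME=VALUE</param>
    args:   list of strings, emitted one per <argument> tag
    """
    pig = ET.Element("pig")
    ET.SubElement(pig, "script").text = script
    for name, value in params.items():
        ET.SubElement(pig, "param").text = "%s=%s" % (name, value)
    for arg in args:
        ET.SubElement(pig, "argument").text = arg
    return pig


# INPUT/OUTPUT come from data sources; "-v" is an extra flag from "args".
pig = build_pig_action(
    "example.pig",
    {"INPUT": "/user/demo/in", "OUTPUT": "/user/demo/out"},
    ["-v"],
)
print(ET.tostring(pig, encoding="unicode"))
```

Because "args" is a list, the order of the flags is preserved in the generated action, which a dictionary would not express naturally.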
As an example, the following 2 Oozie workflows are equivalent:
<?xml version="1.0" ?>
<workflow-app name="job-wf" xmlns="
<start to="job-node"/>
<action name="job-node">
<pig>
<
<
<
<property>
</property>
<property>
</property>
<property>
</property>
<
<
<
<
</pig>
<ok to="end"/>
<error to="fail"/>
</action>
<?xml version="1.0" ?>
<workflow-app name="job-wf" xmlns="
<start to="job-node"/>
<action name="job-node">
<pig>
<
<
<
<property>
</property>
<property>
</property>
<
<
<
<
<
<
</pig>
<ok to="end"/>
Here is Pig usage, taken from Oozie logs on a failed job:
Apache Pig version 0.10.1 (r1426282)
compiled Dec 27 2012, 11:24:26
USAGE: Pig [options] [-] : Run interactively in grunt shell.
Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).
Pig [options] [-f[ile]] file : Run cmds found in file.
options include:
-4, -log4jconf - Log4j configuration file, overrides log conf
-b, -brief - Brief logging (no timestamps)
-c, -check - Syntax check
-d, -debug - Debug level, INFO is default
-e, -execute - Commands to execute (within quotes)
-f, -file - Path to the script to execute
-g, -embedded - ScriptEngine classname or keyword for the ScriptEngine
-h, -help - Display this message. You can specify topic to get help for that topic.
properties is the only topic currently supported: -h properties.
-i, -version - Display version information
-l, -logfile - Path to client side log file; default is current working directory.
-m, -param_file - Path to the parameter file
-p, -param - Key value pair of the form param=val
-r, -dryrun - Produces script with substituted parameters. Script is not executed.
-t, -optimizer_off - Turn optimizations off. The following values are supported:
All - Disable all optimizations
All optimizations listed here are enabled by default. Optimization values are case insensitive.
-v, -verbose - Print all error messages to screen
-w, -warning - Turn warning logging on; also turns warning aggregation off
-x, -exectype - Set execution mode: local|mapreduce, default is mapreduce.
-F, -stop_on_failure - Aborts execution on the first failed job; default is off
-M, -no_multiquery - Turn multiquery optimization off; default is on
-P, -propertyFile - Path to property file
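The `-p, -param` option in the usage above is what ties <param> tags to <argument> tags. As a rough sketch, assuming each <param> value corresponds to a `-param` flag followed by a `param=val` pair, the expansion could look like this. The function name `params_to_arguments` and the sample values are hypothetical.

```python
def params_to_arguments(params):
    """Expand a dict of Pig parameters into a flat argument list.

    Assumption: each parameter becomes the flag "-param" followed by
    "NAME=VALUE", matching the "-p, -param" entry in the Pig usage text.
    """
    arguments = []
    for name, value in params.items():
        arguments.extend(["-param", "%s=%s" % (name, value)])
    return arguments


# Parameters first, then any extra flags supplied through "args".
extra_flags = ["-v"]
all_arguments = params_to_arguments({"INPUT": "/user/demo/in"}) + extra_flags
```

Each entry of the resulting list would map to one <argument> tag in the pig action.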
Changed in savanna: | |
status: | New → Confirmed |
importance: | Undecided → Medium |
assignee: | nobody → Trevor McKay (tmckay) |
milestone: | none → icehouse-3 |
tags: | added: edp |
summary: |
- [EDP] Schema for "args" is incorrect for Pig jobs + [UI][EDP] Schema for "args" is incorrect for Pig jobs |
Changed in savanna: | |
assignee: | Trevor McKay (tmckay) → Chad Roberts (croberts) |
Changed in savanna: | |
status: | In Progress → Fix Committed |
Changed in savanna: | |
status: | Fix Committed → Fix Released |
Changed in sahara: | |
milestone: | icehouse-3 → 2014.1 |
Fix proposed to branch: master
Review: https://review.openstack.org/67588