9. Parsing

The following program examines its arguments:

   /* Parse the arguments */
   parse arg a.1 a.2 a.3 a.4
   do i=1 to 4
      say "Argument" i "was:" a.i
   end

Execute it as usual, except this time type "alpha beta gamma delta" after the program name on the command line, for example:

I  rexx arguments "alpha beta gamma delta"
O  argumnts alpha beta gamma delta

The program should print out:

   Argument 1 was: alpha
   Argument 2 was: beta
   Argument 3 was: gamma
   Argument 4 was: delta

The argument "alpha beta gamma delta" has been parsed into four components. The components were split up at the spaces in the input. If you experiment with the program you should see that if you do not type four words as arguments then the last components printed out are empty, and that if you type more than four words then the last component contains all the extra data. Also, even if multiple spaces appear between the words, only the last component contains spaces. This is known as "tokenisation".

It is not only possible to parse the arguments, but also the input. In the above program, replace "arg" by "pull". When you run this new program you will have to type in some input to be tokenised.

Replace "parse arg" with "parse upper arg" in the program. Now, when you supply input to be tokenised it will be uppercased. The instruction "parse upper" is a variant of "parse" which always translates the data to upper case.

"arg" and "pull" are, respectively, abbreviations for the instructions "parse upper arg" and "parse upper pull". That explains why the "pull" instruction appeared in previous examples, and why it was that input was always uppercased if you typed letters in response to it.

Other pieces of data may be parsed as well. "parse source" parses information about how the program was invoked, and what it is called, and "parse version" parses information about the interpreter itself.
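
As a rough illustration of "parse source" and "parse version", the sketch below tokenises both strings and prints the pieces. The variable names (system, invocation, name, language, level, released) are chosen here for illustration only, and the three-part layouts are assumed from the usual REXX convention. The actual values depend on your interpreter and on how the program was started, so no output is shown.

```rexx
/* Show what "parse source" and "parse version" provide.          */
/* Variable names are illustrative; values vary by interpreter.   */
parse source system invocation name
say "System:    " system       /* operating system or environment */
say "Invoked as:" invocation   /* how the program was called      */
say "Program:   " name         /* what the program is called      */

parse version language level released
say "Language:  " language     /* name of the interpreter         */
say "Level:     " level        /* language level it implements    */
say "Released:  " released     /* remaining tokens (release date) */
```

Because name and released are the last components of their templates, they receive whatever is left over, spaces included.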

However, the two most useful forms of the parse instruction are "parse var [variable]" and "parse value [expression] with". These allow you to parse arbitrary data supplied by the program. For example,

   /* Get information about the date and time */
   d=date()
   parse var d day month year
   parse value time() with hour ':' min ':' sec

The last line above illustrates a different way to parse data. Instead of tokenising the result of evaluating time(), we split it up at the character ':'. Thus, for example, "17:44:11" is split into 17, 44 and 11.

Any search string may be specified in the "template" of a "parse" instruction. The search string is simply placed in quotation marks, for example:

   parse arg first "beta" second "delta"

This line assigns to variable first anything which appears before "beta", and to second anything which appears between "beta" and "delta". If "beta" does not appear in the argument string, then the entire string is assigned to first, and the empty string is assigned to second. If "beta" does appear, but "delta" does not, then everything after "beta" will be assigned to second.

It is possible to tokenise the pieces of input appearing between search strings. For example,

   parse arg "alpha" first second "delta"

This tokenises everything between "alpha" and "delta" and places the tokens in first and second.

Placing a dot instead of a variable name during tokenising causes that token to be thrown away:

   parse pull a . c . e

This keeps the first and third tokens in a and c and places everything from the fifth token onwards in e, but throws away the second and fourth. It is often a good idea to place a dot after the last variable name, thus:

   parse pull first second third .

Not only does this throw away the unused tokens, but it also ensures that none of the tokens contains spaces (remember, only the last token may contain spaces; this is the one we are throwing away).

Finally, it is possible to parse by numeric position instead of by searching for a string.
Numeric positions start at 1, for the first character, and range upwards for further characters.

   parse var a 6 piece1 +3 piece2 +5 piece3

The value of piece1 will be the three characters of a starting at character 6; piece2 will be the next 5 characters, and piece3 will be the rest of a.
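
To make the positions concrete, here is a small sketch that applies the same template to a known string. The value given to a (the lower-case alphabet) is an assumption chosen purely so that the character positions are easy to count.

```rexx
/* Positional parsing on a fixed string (sample value is illustrative) */
a = "abcdefghijklmnopqrstuvwxyz"
parse var a 6 piece1 +3 piece2 +5 piece3
say piece1     /* fgh: characters 6 to 8           */
say piece2     /* ijklm: characters 9 to 13        */
say piece3     /* nopqrstuvwxyz: 14 to the end     */
```

The letter f is character 6, so piece1 takes "fgh"; the +3 moves the split point to character 9, and the +5 moves it on again to character 14.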