NAGWare Fortran Tools - f77 Tools - nag_polopt

 

Index

NAME
DESCRIPTION
OPTIONS FILE
DEFAULT FORMATTING FEATURES
BASIC FORMATTING OPTIONS
COMMON CUSTOMISATION
OTHER CUSTOMISATION PARAMETERS
CONVERSION OPTIONS
MULTIPLE-WORD TOKENS
SEE ALSO
 

NAME

nag_polopt - Polish Options Editor

 

DESCRIPTION

nag_polopt is the NAGWare f77 Tools Polish Option File Editor.

NAGWare f77 Tools has a pretty printer, nag_polish, which can be used specifically to reformat Fortran 77 programs. Other tools also make use of the polish routines to recreate Fortran 77 source code from the internal representation used by these tools. Hence the appearance of the output of all the transformation tools, including nag_polish itself, can be controlled by the polish options file. The routines that carry out the final conversion of the internal representation of a program to Fortran 77 are here referred to as "polish". nag_polopt is the recommended way of creating and editing this polish options file. It is a menu-driven editor with a built-in help facility.

 

OPTIONS FILE

The polish options file is taken from the first item found in the following list:

1. If the environment variable ``NAG_POLOPT77'' has been set, its value is used as the name of the polish options file, e.g. to set this from the C-shell:

setenv NAG_POLOPT77 ~/pol77.opt

2. The file ``./.polish_options77''

3. The file ``~/.polish_options77''

4. If none of the above is found, built-in defaults are used and the default name for the polish options file in the polish options editor is set to ``.polish_options77''.
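The search order above can be sketched as follows. This is only an illustration under stated assumptions: the function name and its arguments are hypothetical, and the text does not say what happens when NAG_POLOPT77 names a file that does not exist, so the sketch returns the value unchecked.

```python
import os


def find_polish_options_file(environ, cwd, home, exists=os.path.isfile):
    """Return the options file path that would be used, or None for built-in defaults.

    Hypothetical helper: `environ` is a mapping of environment variables,
    `cwd` and `home` are directory paths, `exists` tests for a file.
    """
    # 1. NAG_POLOPT77, if set, names the options file directly.
    name = environ.get("NAG_POLOPT77")
    if name:
        return name
    # 2. ./.polish_options77, then 3. ~/.polish_options77
    for directory in (cwd, home):
        candidate = os.path.join(directory, ".polish_options77")
        if exists(candidate):
            return candidate
    # 4. Nothing found: built-in defaults apply.
    return None
```

Passing the environment and the file test as arguments keeps the lookup order easy to check without touching the real file system.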

 

DEFAULT FORMATTING FEATURES

This section describes the NAGWare f77 Tools standard formatting features. These will not normally need to be altered.

Margins

The left margin for a statement is column 7 and the right margin is column 72. For comment text, the left margin is column 2 and the right margin is column 72.

Token Spacing

A token is a lexical element of the language. For example, keywords, numeric constants, character constants, operators etc. are tokens. Adjacent tokens are normally output with one space between them, except as noted below, where they are output with no intervening space:

Before a left parenthesis if the preceding token is a name;
After a left parenthesis;
Before a right parenthesis;
Before a comma or colon, and after a comma or colon if within parentheses;
Before and after a monadic arithmetic operator (plus and minus);
Before and after the monadic logical operator (.NOT.) within parentheses;
Before and after binary arithmetic operators, except plus and minus;
Around plus and minus within parentheses;
Before and after the assignment operator (equals sign) within parentheses;
Before and after relational operators;
Before and after binary logical operators within at least 2 levels of parentheses;
Around the concatenation operator;
Around a star occurring after a datatype keyword (e.g. CHARACTER*3).

Blank Lines

A blank line is placed between the last declaration and the first executable statement. Otherwise all blank line insertion options are switched off.

Breaking a statement line

A statement line is broken if it would otherwise extend beyond the right margin (normally column 72). The following rules control the line breaking action.

The token before the break is, in descending order of preference:

a separator at parenthesis level zero;
the lowest precedence binary operator with parenthesis level zero;
a separator at parenthesis level one;
the lowest precedence binary operator with parenthesis level one;
a separator at parenthesis level two or more;
the lowest precedence binary operator with parenthesis level two or more;
any token other than a left parenthesis;
a left parenthesis.

The break is located as far to the right as possible with the above constraints.

The break will be located at least halfway between the start of the line and the right margin.
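The preference order and the two positioning rules can be illustrated with a small selection routine. This is a sketch, not the tool's algorithm: the function names, the candidate tuples and the token kinds are hypothetical, and it assumes that candidates to the left of the halfway point are discarded before the preference order is applied.

```python
def break_rank(kind, paren_level):
    """Rank a token as a break point; lower is preferred (0 is best)."""
    level = min(paren_level, 2)  # levels two and above are treated alike
    if kind == "separator":
        return 2 * level
    if kind == "lowest_binary_operator":
        return 2 * level + 1
    if kind == "left_paren":
        return 7
    return 6  # any other token


def choose_break(candidates, start_col, right_margin=72):
    """Pick the token to break after.

    candidates: (end_column, kind, paren_level) tuples, one per token.
    Returns the chosen tuple, or None if no candidate qualifies.
    """
    halfway = (start_col + right_margin) / 2
    usable = [c for c in candidates if halfway <= c[0] <= right_margin]
    if not usable:
        return None
    # Best preference first, then as far to the right as possible.
    return min(usable, key=lambda c: (break_rank(c[1], c[2]), -c[0]))
```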

A continuation line is indented relative to its initial line according to a heuristic. This heuristic lines up the continuation text as follows:

(1) Following the assignment operator in an assignment statement.

(2) With the first actual parameter in a CALL statement.

(3) With the first variable name in a type declaration.

(4) With the token following the first keyword on the line with all other statements.

If the position produced by the heuristic is more than half the distance between the start of the initial line and the right margin, then the continuation is indented by two columns further than the initial line.

When the token which causes the line break is so large that it would not fit on the continuation line if the line were broken at the "best" position, the line is instead broken as far to the right as possible. If the token still will not fit on the continuation line, then the indentation is adjusted so that it will fit. If the token is longer than 66 characters, it will be continued on an additional continuation line.

The character placed in column 6 of the continuation line is a plus sign.

A line break will also be produced by an embedded comment line. This line break will occur in the same position as in the original source code.

Note that since polish works from a token stream, line breaks in the original source code are not received by the tool.

According to the Fortran 77 Standard, a statement may not have more than 19 continuation lines. If more continuation lines than this are output, an error message will be produced.

Indentation

Statement lines within the range of a DO-loop or block-IF statement are indented four columns relative to the current indentation. The final statement of a DO-loop, and the ELSE, ELSE IF and END IF statements, are not indented. A statement will never be indented more than two-thirds of the way along a line. Once this point has been reached, further block-IF and DO-loop nesting has no effect on the indentation.

 

BASIC FORMATTING OPTIONS

This section describes the basic formatting options.

Statement Relabelling

This is controlled by two LOGICAL parameters, RLBSTM and RLBFMT, which control relabelling of executable statements and FORMAT statements respectively. Setting either to .TRUE. will cause that type of statement to be relabelled. The default settings are .FALSE. (no relabelling).

Executable statements will be relabelled so that the first label which occurs will become 10, and successive labels will be 10 more than their predecessor. FORMAT statements will be labelled similarly, but beginning at 9000.

No check is made to prevent a clash between the executable statement label sequence and the FORMAT statement label sequence. For this reason it is best always to relabel both statement types, as otherwise the output could be incorrect (i.e. have duplicate labels).

Note that non-standard keywords in I/O statements which take labels as parameters will not have their values changed to the new label value. This is a permanent restriction. A warning message is not produced when a non-standard I/O keyword is encountered.

CONTINUE statement insertion

This is controlled by two LOGICAL parameters, DOCONI and IOTHCO. When DOCONI is set to .TRUE., polish will end every DO-loop on a unique CONTINUE statement; this requires that RLBSTM also be set to .TRUE..

When IOTHCO is set to .TRUE., then any labelled executable statement, apart from DO loop terminator statements, will be unlabelled, and a CONTINUE statement with that label inserted before it.
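The effect of IOTHCO can be pictured with the following sketch. The function and the tuple layout are hypothetical, chosen only for illustration. The text does not say how a statement that is already a labelled CONTINUE is treated, so the sketch does not special-case it.

```python
def insert_continues(statements):
    """Model IOTHCO=.TRUE. on a list of executable statements.

    statements: (label_or_None, text, ends_do_loop) tuples.
    Returns (label_or_None, text) tuples in output order.
    """
    out = []
    for label, text, ends_do_loop in statements:
        if label is None or ends_do_loop:
            # Unlabelled statements and DO loop terminators are unchanged.
            out.append((label, text))
        else:
            # The label moves to a new CONTINUE placed before the statement.
            out.append((label, "CONTINUE"))
            out.append((None, text))
    return out
```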

Move FORMAT Statements

This is controlled by the LOGICAL parameter MOVEF. When MOVEF is set to .TRUE., all FORMAT statements will be output just before the END of the program unit. In this case, there will be no blank line following the FORMAT statements, regardless of the setting of BLAFT(TFORMA) or BLBEF(TEND); this prevents the generation of an extra blank line each time the tool is run (on the same program unit).

Sequence Numbering

Polish can produce sequence numbers in columns 73-80 of the output file. This is controlled by the parameter SEQRQD. Sequence numbers consist of the first four characters of the program unit name (padded with trailing blanks if necessary) in columns 73-76 and the line number right-justified in columns 77-80 (with leading blanks if necessary). The first line of each program unit is numbered 1 and successive lines have numbers one more than their predecessor.

The name of an unnamed main program is considered to be "MAIN" and the name of an unnamed BLOCK DATA subprogram is considered to be blank. The default setting for SEQRQD is .FALSE. (i.e. no sequence numbers).

Progress Trace

This is controlled by the logical parameter TRACE. If TRACE is .TRUE., an information message will be displayed when beginning to process each program unit (except an unnamed main program). The default setting is .FALSE.

Continuation Character

The character placed in column 6 of continuation lines is controlled by the parameter CONCHR. This parameter may be set to "numeric", "alphabetic", "alphanumeric" or to a particular character (not a blank or the digit zero).

When CONCHR=numeric, the continuation characters of successive lines are 1, 2, ... 9, 1, ... 9, 1.

When CONCHR=alphabetic, the continuation characters of successive lines are A, B, ..., S.

When CONCHR=alphanumeric, the continuation characters of successive lines are 1, 2, ... 9, A, ... J.
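The three sequences above, and the single-character case, can be generated as in this sketch. The function name is hypothetical; the limit of 19 follows from the Fortran 77 restriction on continuation lines mentioned earlier.

```python
import string

MAX_CONTINUATIONS = 19  # Fortran 77 limit on continuation lines


def continuation_chars(conchr):
    """Return the column-6 characters for continuation lines 1..19."""
    digits = "123456789"
    if conchr == "numeric":
        seq = digits * 3                      # 1..9 repeated
    elif conchr == "alphabetic":
        seq = string.ascii_uppercase          # A, B, C, ...
    elif conchr == "alphanumeric":
        seq = digits + string.ascii_uppercase # 1..9 then A, B, ...
    else:
        # A particular character: blank and zero are not allowed.
        if len(conchr) != 1 or conchr in " 0":
            raise ValueError("CONCHR must be a single non-blank, non-zero character")
        seq = conchr * MAX_CONTINUATIONS
    return seq[:MAX_CONTINUATIONS]
```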

Error Messages

The insertion of error messages as comments in the output program is controlled by the parameter ERRCMT. If this is set to .FALSE., error messages will only be reported to the standard error file (normally the user's terminal). The default setting is .TRUE..

 

COMMON CUSTOMISATION

This section describes the parameters which may be commonly used to alter the operation of polish.

Sequence Number Format

The form of sequence numbers (which are inserted if SEQRQD is .TRUE.) is controlled by four parameters: SEQINI, SEQINC, SEQDIG and SEQFIL. SEQINI specifies the line number of the first line of a program unit and SEQINC specifies the amount by which to increment the line number on successive lines. SEQDIG specifies the number of digits in the sequence number; thus the amount of space for the program-unit name is equal to 8-SEQDIG characters. SEQFIL specifies the character both for padding out short program-unit names and for leading characters of the line number.

The default settings of the parameters are: SEQINI=1, SEQINC=1, SEQDIG=4 and SEQFIL='~'.
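As an illustration of how these four parameters combine, the sketch below builds the eight-character field for columns 73-80. The function name and the zero-based line index are hypothetical. The fill character defaults to a blank here, following the description under "Sequence Numbering", and raising an error when the number needs more than SEQDIG digits is an assumption, since the text does not describe that case.

```python
def sequence_field(unit_name, line_index, seqini=1, seqinc=1, seqdig=4, fill=" "):
    """Return the 8-character sequence field for one output line.

    line_index is 0 for the first line of the program unit.
    """
    name_width = 8 - seqdig
    number = str(seqini + line_index * seqinc)
    if len(number) > seqdig:
        raise ValueError("line number does not fit in SEQDIG digits")
    # Name is truncated or padded on the right; number is padded on the left.
    name_part = unit_name[:name_width].ljust(name_width, fill)
    return name_part + number.rjust(seqdig, fill)
```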

Relabelling parameters

The values used for the new labels in a program unit are controlled by the parameters FLBINI, FLBINC, SLBINI and SLBINC. The labels of the first occurring FORMAT statement and labelled executable statement are FLBINI and SLBINI respectively. Each label produced differs from the preceding label by FLBINC (for FORMAT statements) or SLBINC (for executable statements). The default settings are: SLBINI=10, SLBINC=10, FLBINI=9000 and FLBINC=10.

If it is required that FORMAT statements be labelled in the same sequence as executable statements, then FLBINI should be set to zero.

For compatibility with Polish-66, FLBINC may be given a negative value, thus labelling FORMAT statements in a descending sequence.
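The interaction of the four relabelling parameters can be sketched as below. The function name and the "stmt"/"format" tags are hypothetical, and the sketch assumes that both RLBSTM and RLBFMT are .TRUE. so that every label is renumbered.

```python
def new_labels(kinds, slbini=10, slbinc=10, flbini=9000, flbinc=10):
    """Return new labels for labelled statements in order of occurrence.

    kinds: sequence of "stmt" (executable) or "format" (FORMAT statement).
    """
    next_stmt, next_fmt = slbini, flbini
    labels = []
    for kind in kinds:
        if kind == "format" and flbini != 0:
            labels.append(next_fmt)
            next_fmt += flbinc  # may be negative for a descending sequence
        else:
            # FLBINI=0: FORMAT statements share the executable sequence.
            labels.append(next_stmt)
            next_stmt += slbinc
    return labels
```

With the defaults, enough labelled executable statements would eventually reach 9000 and collide with the FORMAT sequence; the sketch, like the tool as described, makes no check for that.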

Indentation Parameters

Indentation is controlled by 5 parameters: INDDO, INDIF, INDCON, INDCMT and INDDOC. INDDO and INDIF specify the indentation within DO-loops and block-IF statements respectively. INDCON specifies the additional indentation for a continuation line. If this parameter is negative, then polish will attempt instead to line up the continuation line according to the heuristic described previously (in the section "Breaking a statement line"). If this heuristic fails, then the absolute value of INDCON is used as the indentation.

INDCMT is a LOGICAL parameter which specifies whether comment text should be indented to the same level as the current statement's indentation. INDDOC is a LOGICAL parameter which specifies whether the CONTINUE statement at the end of a DO-loop will be indented to the same column as the body of the DO loop; this requires the parameter DOCONI to be .TRUE., to ensure that each DO loop ends on a unique CONTINUE statement.

The default values for these parameters are: INDDO=4, INDIF=4, INDCON=-2, INDCMT=.FALSE. and INDDOC=.FALSE..

Assignment Line-up

Polish can pad variable names on the left hand side of an assignment operator, so that consecutive assignment statements will have the assignment operator appearing in the same column. The parameter VLEN specifies the length to which each variable name should be padded (names as long as this or longer do not receive any padding at all). The default setting is VLEN=0.
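The padding rule can be shown with a one-line sketch. The function name is hypothetical, and the single space on either side of the equals sign follows the default token spacing described earlier.

```python
def line_up_assignment(lhs, rhs, vlen=0):
    """Pad the left-hand side to VLEN characters before the equals sign.

    Names of length VLEN or more are left unpadded.
    """
    return f"{lhs.ljust(vlen)} = {rhs}"
```

With VLEN=6, the names X and TOTAL both place the equals sign in the same position, while a twelve-character name is output unchanged.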

 

OTHER CUSTOMISATION PARAMETERS

This section describes parameters which will only rarely need to be changed.

Margin Control

The left and right margins for statements and comment lines are controlled by the parameters LMARGS, RMARGS, LMARGC and RMARGC. RMARGC only serves to prevent polish from indenting a comment line too far - polish does not automatically break long comment lines.

The default settings for these parameters are LMARGS=7, RMARGS=72, LMARGC=2 and RMARGC=72.

Blank Line Character

The character which appears in column one of a blank line (which may be in the input or generated by polish) is determined by the parameter BLCHAR. BLCHAR may take one of the values ' ', 'C', 'c' or '*'. The default setting is BLCHAR=' '.

Declaration Line-up

The parameters DLEN and DLUP control the column in which the body of a declaration appears. The parameter DLEN specifies the length to pad the declaration keywords out to (so that the body of the declaration will start in column LMARGS+DLEN). The LOGICAL parameter DLUP specifies whether polish should line up the bodies of the declarations with the first argument in the argument list of a function or subroutine subprogram.

These parameters do not affect PROGRAM, SUBROUTINE, FUNCTION and BLOCK DATA statements. They affect all other specification statements, including COMMON blocks and DATA statements. The setting DLUP=.TRUE. will take precedence over the DLEN setting within function and subroutine subprograms with argument lists. If the argument list is empty, then the declarations will be lined up with the closing parenthesis. If there is no argument list for the function or subroutine subprogram, then the setting of DLEN will remain in force.

The line break rules are changed when DLUP is in force. Normally a continuation line will not be indented to an "intelligent" position if this would use up more than half of the line (halfway between the current indentation and the right margin). When DLUP is .TRUE., declarative statements may have their continuation lines indented up to two-thirds along the line. This is to allow DLUP to work when the program unit header is something like "DOUBLE PRECISION FUNCTION ABCD".

The default settings are DLUP=.FALSE. and DLEN=0.

Label Formatting

The parameters LABELC and LABELF control the format of the labels occurring in the label field of a statement. LABELC specifies the leftmost column for the label; the rightmost column is always 5. LABELF specifies the format of the label, and may be set to left_justified, right_justified or zero_padded.

The default settings are LABELC=1 and LABELF=right_justified.
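The three label formats can be illustrated as follows. The function name is hypothetical, and rejecting a label that is too wide for columns LABELC to 5 is an assumption, since the text does not cover that case.

```python
def format_label(label, labelc=1, labelf="right_justified"):
    """Return the 5-character label field (columns 1-5) for a label."""
    width = 5 - labelc + 1  # columns LABELC..5 are available
    digits = str(label)
    if len(digits) > width:
        raise ValueError("label does not fit between LABELC and column 5")
    if labelf == "left_justified":
        body = digits.ljust(width)
    elif labelf == "right_justified":
        body = digits.rjust(width)
    elif labelf == "zero_padded":
        body = digits.zfill(width)
    else:
        raise ValueError("unknown LABELF value: " + labelf)
    # Columns to the left of LABELC stay blank.
    return " " * (labelc - 1) + body
```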

Comment Input

This is controlled by the parameter CMMODE. Normally, polish ensures that comment text does not start to the left of the current comment indentation column. If this parameter is set to "skip_leading_blanks", then the comment text will be shifted so that the first non-blank character always appears in the current comment indentation column (column 3). If CMMODE is set to "verbatim", then comment text is output exactly as it appears in the original program. The default setting is "normal".

Comment Decoration

The form of comments in the output file is governed by the parameters CBOX, CBTOP, CBSIDE and CMCHAR. CBOX determines the amount of decoration which is to be added to a block of comment lines (a block of comment lines is a sequence of two or more comment lines which are not entirely blank). CBOX may take the values none, half_box or whole_box. The values half_box and whole_box result in half a box or a whole one being placed around the comment text. A half box consists of the left-hand side and the bottom of a whole box. The character used for the top and bottom of the box is CBTOP; CBSIDE is used for the sides. CMCHAR determines the comment character (in column one) and may take the values 'C', 'c', '*' or a blank. If CMCHAR is blank then the comment character from the source is used.

The default settings are: CMCHAR=' ', CBOX=none, CBTOP='*' and CBSIDE='*'.

 

CONVERSION OPTIONS

Case Conversion

Polish may be used to intelligently convert the case of a program. The case of the output is controlled by the parameters KWCASE, IDCASE, STRCAS, CMCASE and FFCASE. KWCASE specifies the case for the keywords (IF, etc.), and is one of: uppercase, lowercase or mixedcase. Mixedcase means that the first character is in uppercase and the remainder is in lowercase (for items such as .GE., mixedcase is the same as lowercase). IDCASE specifies the case for identifiers (names), and may take the same values as KWCASE, plus original_case or invertcase. STRCAS, CMCASE and FFCASE specify the case for strings, comments and FORMAT fields (e.g. I2) respectively, and may have the values original_case, uppercase or lowercase.

The default settings are: KWCASE=uppercase, IDCASE=original_case, STRCAS=original_case, CMCASE=original_case and FFCASE=original_case.
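The case modes can be sketched per token as below. The function name is hypothetical, and reading invertcase as "swap the case of every letter" is an assumption, since the text does not define it.

```python
def convert_case(text, mode):
    """Apply one of the case modes to a single token."""
    if mode == "uppercase":
        return text.upper()
    if mode == "lowercase":
        return text.lower()
    if mode == "mixedcase":
        # First character upper, remainder lower; a leading "." is unchanged,
        # so .GE. comes out the same as in lowercase.
        return text[:1].upper() + text[1:].lower()
    if mode == "invertcase":
        return text.swapcase()  # assumed meaning of invertcase
    if mode == "original_case":
        return text
    raise ValueError("unknown case mode: " + mode)
```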

FORMAT Conversion Parameters

There are three conversion parameters: CVTHFM, FMSBRK and RMOPCF. CVTHFM specifies whether to change H-editing fields in FORMAT statements to the equivalent character strings, FMSBRK specifies whether to break strings within FORMAT statements differently from elsewhere, and RMOPCF specifies whether to remove optional commas in FORMAT statements.

The special string breaking rules within FORMAT are to insert a closing quote and trailing comma just short of or in column 72, and insert an opening quote before the rest of the string begins on the next line. Optional commas in FORMAT statements may be removed following a slash or colon; they will not be removed following any other editing field (so "I6,/,/,I6" will become "I6,//I6").
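The comma-removal rule can be illustrated with a regular expression. This is a simplified sketch with a hypothetical function name: it works on the raw text of a FORMAT specification and does not protect commas inside character strings or H-editing fields, which a real implementation working on tokens would leave alone.

```python
import re


def remove_optional_commas(format_spec):
    """Drop a comma that immediately follows a slash or colon."""
    return re.sub(r"([/:]),", r"\1", format_spec)
```

Applied to the example in the text, `remove_optional_commas("I6,/,/,I6")` gives `"I6,//I6"`: the comma after I6 is kept because it follows an editing field rather than a slash or colon.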

The default setting for these options is .FALSE..

Include Files

The internal token stream with which polish works has include files merged. The parameter DELINF controls the treatment of include files. If it is set to .TRUE. (the default), polish deletes the contents of each include file on output and reinstates the INCLUDE statement. If DELINF is set to .FALSE., include files are left merged into the program text, delimited by comments.

 

MULTIPLE-WORD TOKENS

Fortran 77 has a number of tokens which are often split into two words. Polish splits the following tokens into two words: DOUBLE COMPLEX, DOUBLE PRECISION, END DO, GO TO, ELSE IF and END IF. The ENDFILE token is returned as a single word. This action is not user configurable.

 

SEE ALSO

nag_apt, nag_chname, nag_decs, nag_polish, nag_struct

Copyright, Numerical Algorithms Group, Oxford, 1991-2001