Chapter 6 Louden Semantic Analysis deals with features not expressed by the grammar. 2 aspects Such things as type checking Code optimization Has to do with meaning. Static vs Dynamic LISP and Smalltalk have no static semantic analysis. Pascal has significant static semantic analysis. C falls in the middle. Static Analysis involves description implementation Description uses attribute grammars - writes semantic rules. Syntax-directed semantics the semantics are closely related to its syntax. Attributes are associated with the syntactic constructs. Attribute - property of a programming language construct. Examples: data type value of expression (if constant or if dynamic) location of variable in memory object code of a procedure number of significant digits (if dynamic) When an attribute is fixed is the binding time: prior to execution - static during execution - dynamic Type checking - c and pascal during compile time lisp during execution - compiler generates code to do type checking during execution Constant Folding - 2+3*5 replaced by 17 Allocation of variables Fortran - all are static LISP - all are dynamic C - a combination Object code is static - generated at compile time Number of significant digits often dynamic 6.1.1 Attribute Grammars Attributes are associated with the syntactic constructs. X.a is how attributes are written, a is an attribute (value) of the grammar symbol X. Table 6.1 gives an attribute grammar for simple numbers, where the attribute being represented is the value of the number. (this example might be used to find the value of an integer constant, for example) Table 6.2 gives an attribute grammar for simple expressions involving numbers where again, the attribute of the production is its value. Note that + in the grammar is playing a dual role, one of grammar symbol and one of addition of values. (this is a bit confusing, we are referring to values as attributes, but the values could be real values, as in this example.) exp1 -> exp2 + term exp1.val = exp2.val + term.val Your compiler is not to evaluate expressions... here we are using the value as an example of the attribute only. Table 6.3 gives an attribute grammar for C like declarations. dtype is the name for the data type, the attribute of importance. GRAMMAR RULE SEMANTIC RULE decl -> type var-list var-list.dtype = type.dtype type -> int type.dtype = integer type -> float type.dtype = float var-list1 -> id, var-list2 id.dtype = var-list1.dtype var-list2.dtype = var-list1.dtype var-list -> id id.dtype = var-list.dtype decl => type var-list => int var-list => int id, var-list => integer integer integer integer integer int id, id, var-list => int id, id, id integer integer Symbols in attribute grammars may have more than one attribute. Table 6.4 gives an attribute grammar for base 8 or 10 as well as the value. Has two attributes. 6.1.2 Simplifications and Extensions to Attribute Grammars if-then-else extends attribute grammars - metalanguage. Use digit.val = numval(D) -- here we use D for all the digits. Use ambiguous grammars table 6.5 E -> E + E | E * E | E - E | E / E | ( E ) | num e1.val = e2.val + e3.val ... Table 6.6 shows an example attribute grammar for constructing abstract syntax trees. e1 -> e2 + t e1.tree = mkOpNode(+,e2.tree,t.tree) again, actions or attributes ... 6.2 Algorithms for Attribute Computation An attribute depends on the attributes of other components. To compute the attribute of an item, the attributes of the subordinate items must be present in the proper order. A dependency graph can be used to specify the order. 6.2.1 Dependency Graph and Evaluation Order Looks like parse tree with arrows pointing up indicating the dependency. Consider Example 6.7 versus Example 6.6. The dependency arrows point up in 6.6 but point down in 6.7. The difference is that 6.6 is referring to the parse tree or a derivation sequence associated with the grammar whereas 6.7 is referring to the grammar rules. The dependencies vary depending on what is being shown. The dependency is shown as A <- B if A is dependent on B for an attribute. Remaining sections indicate routines for checking attributes. Symbol table efficiency scope static dynamic Example: int i = 1; void f(void) { print i; } void main() { int i = 2; f(); return; } What is printed under static or dynamic scope? scope considerations int i = 1; void f() { int i = 2, j = i + 1; ... } Is j 2 or 3? If 3 then i must be placed in the symbol table immediately when seen, rather than delayed. C does make it 3. This is called sequential declarations. Alternatively, collateral declaration -- languages like ML and Scheme. Declarations are held temporarily and added at once, thus j above would be 2. Recursive declaration must have the function name added before processing the body, but this in itself is inadequate. f() { g(); } g() { f(); } to do checking either must add g without a declaration or insist on a prototype. Most languages rich in types allow the definition of types. The symbol table must hold the definition of types as well as the variables declared of that type. Type Equivalence becomes an issue. Consider: struct one { float * x; int y[10]; }; Creates tree: struct / var x ------- var y | | ptr array (10) | | float int STRUCTURAL EQUIVALENCE if syntax tree is identical then two types are the same. ie two defined types are equivalent if they have the same structure. (syntax tree) Consider: struct one { float * x; int y[10]; }; and typedef t1 * float; typedef t2 int t2[10]; struct two { t1 x; t2 y; } NAME EQUIVALENCE - two types are the same only if they have the same name. Thus one and two above are STRUCTURALLY equivalent but are not NAME equivalent. int x[10]; int y[10]; x and y are name equivalent. DECLARATION EQUIVALENCE - if a type name is equivalent to some base type, they are equivalent. typedef t1 int; typedef t2 int; t1 x; t2 y; x and y would be equivalent. Additional example: typedef t1 int mm[10]; typedef t2 int nn[10]; typedef t3 t1; t3 and t1 are equivalent under declaration equivalence Does declaration equivalence always yield the same result as structural equivalence? Attributes to consider: Coertion to float: float = int / float ... array[index type] of component type function argument/parameters type and number static scope declared variables duplicate declared variables insure that array variables are subscripted and not. x = y + foo() where foo is declared void. if foo is void it cannot be an argument to a function or an array index. the following would be ok if foo is of type void. but the number of parameters and arguments do not agree, hence a semantic error. x = y(foo); int y() { ... } the following is bad. int y() { ... return; ... } the following is good. int y() { int x; ... return x; ... } note all returns when more than one must return and int if required.