Loaders and Linkers

Loading - puts an object program in memory.
Relocation - modifies an object program so that it can be loaded at an address different from the one originally specified.
Linking - combines two or more separate object programs and supplies the information needed to allow references between them.

Loader - the system program that does the loading. Some loaders also do linking and relocation.
Linkers (linkage editors) may exist separately from the loader.

3.1 BASIC LOADER FUNCTIONS

3.1.1 Design of an Absolute Loader

  Uses a single pass.
  Checks the Header record to verify that the correct program is being loaded.
  Checks that there is sufficient memory.
  Moves each Text record to the indicated address.
  When the End record is encountered, begins execution at the indicated address.

Figure 3.1

Comment: the object programs shown in the text are in character format, with each character stored in 1 byte. The opcode of an STL instruction, for example, appears as the character '1' followed by the character '4' (two bytes), but it must be loaded as the single byte with hex value 14. Character format is inefficient, so real object programs are stored in binary, which is harder to read.

3.1.2 A Simple Bootstrap Loader

  A special absolute loader.
  Loads the OS.

SIC Loader (example)
  Loads the program at location 80; the object code carries no addresses.
  Figure 3.3 is an example of an absolute loader that loads at address 80 and also converts each 1-byte character code into a half byte (one hex digit).

Disadvantages of an absolute loader:
  Running several programs in memory at once requires the ability to relocate.
  The programmer must know the load address in advance.
  Libraries must be written so that they are relocatable.

3.2 MACHINE-DEPENDENT LOADER FEATURES

3.2.1 Relocation

Method 1: Modification records

In Fig 3.4 (the program of Fig 2.6), the instructions on lines 15, 35, and 65 are affected by relocation.
  HCOPY  000000001077
  T0000001D17202D69202D4B1010360320262900003320074B10105D3F2FEC032010
  T00001D130F20160100030F200D4B10105D3E2003454F46
  T0010361DB410B400B44075101000E32019332FFADB2013A00433200857C003B850
  T0010531D3B2FEA1340004F0000F1B410774000E32011332FFA53C003DF2008B850
  T001070073B2FEF4F000005
  M00001405+COPY
  M00000705+COPY
  M00002705+COPY
  E000000

Method 2: Relocation bits

  Usable when addressing is direct and the instruction format is fixed.
  No Modification records.
  Same Text records, but with a relocation bit for each word. For SIC, where each instruction is a word, there is 1 bit per instruction.
  The relocation bits are gathered into a bit mask that follows the length field.
  If a relocation bit is 1, the program's starting address is added to the corresponding word.

  HCOPY  00000000107A
  T0000001EFFC1400334810390000362800303000154810613C000300002A0C003900002D
      FFC = 1111 1111 1100: all 10 words need modification
  T00001E15E000C00364810610800334C0000454F46000003000000
  T0000391EFFC0400300000030E0105D30103FD8105D2800303010575480392C105E38103F
  T0010570A8001000364C0000F1001000
      The 1-byte F1 breaks word alignment, so a new Text record had to be started after it.
      The bit mask assumes that every Text record holds properly aligned words.
  T00106119FE0040030E01079301064508039DC10792C00363810644C000005
  E000000

3.2.2 Program Linking

The problem arises when a program is divided into control sections (CSECTs), each of which generates separate object code.

Three control sections (PROGA, PROGB, and PROGC) each define one list: LISTA-ENDA, LISTB-ENDB, and LISTC-ENDC. Each of these labels is an external symbol available for linking.

Each program contains exactly the same set of references to these external symbols.
  Instruction operands: REF1, REF2, and REF3
  Values of data words: REF4, REF5, REF6, REF7, and REF8

Consider the differences in how these identical expressions are handled.
------------------------- study Fig 3.8, 3.9, 3.10 ----------

In PROGA the statement

  0020  REF1  LDA   LISTA         03201D

assembles normally, PC relative. In PROGB, however, the statement

  0036  REF1  +LDA  LISTA         03100000

is an external reference, so the address field is all zeros and will be filled in by the linker. A Modification record is required:

  M 000037 05 + LISTA

PROGC handles the reference in the same way.

Consider REF2 in PROGA:

  0023  REF2  +LDT  LISTB+4       77100004

The Modification record:

  M 000024 05 + LISTB

In PROGB the same expression is a local reference and is assembled using PC-relative addressing; no Modification record is required.

Consider REF3. In PROGA

  0027  REF3  LDX   #ENDA-LISTA   050014

can be assembled normally. For PROGB and PROGC the values of the labels are unknown:

  003D  REF3  +LDX  #ENDA-LISTA   05100000

Two Modification records are required, even though the difference is an absolute value:

  M 00003E 05 + ENDA
  M 00003E 05 - LISTA

Fig 3.11 shows how REF4 is loaded.

In PROGA:
  REF4 is at relative 0054, actual 4054.
  The value from the Text record is 000014.
  LISTC is at the beginning address of PROGC + 30 = 4112.
  000014 + addr(LISTC) = 000014 + 4112 = 004126

In PROGB:
  REF4 is at relative 70, actual 40D3.
  To the initial 000000 the loader adds the values of ENDA (4054) and LISTC (4112) and subtracts the value of LISTA (4040).
  The result is 004126, as it was in PROGA.

In PROGC:
  A similar computation is done for REF4.

In the above, the loaded values are the same in all three programs. This is not true for references in instruction operands, since PC-relative (or base-relative) instructions involve an additional address calculation. In these cases it is the target addresses that are the same. For example:

  PROGA's REF1 is PC relative with displacement 01D and PC 4023; the resulting target address is 4023 + 01D = 4040. No relocation is necessary.
  PROGB's REF1 is an extended-format instruction containing the actual address. The address after linking is 4040, the same target address as in PROGA.
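The REF4 and REF1 arithmetic above can be checked with a short Python sketch. This is only an illustration: the function names and the list-of-tuples form of the Modification records are assumptions made here, and only the addresses quoted above (LISTA, ENDA, LISTC) are used.

```python
# Addresses of external symbols after linking, as quoted in the notes (hex).
ESTAB = {"LISTA": 0x4040, "ENDA": 0x4054, "LISTC": 0x4112}


def apply_modifications(initial, mods, estab=ESTAB):
    """Apply +symbol / -symbol Modification records to an initial field value.

    mods is a list of (sign, symbol) pairs; this representation is an
    assumption for illustration, not the object-file format itself.
    """
    value = initial
    for sign, symbol in mods:
        if sign == "+":
            value += estab[symbol]
        else:
            value -= estab[symbol]
    return value & 0xFFFFFF  # keep the result within a 24-bit word


def pc_relative_target(pc, disp):
    """Target address of a PC-relative instruction with a positive displacement."""
    return pc + disp


# REF4 in PROGA: the assembler already computed ENDA-LISTA = 000014.
ref4_proga = apply_modifications(0x000014, [("+", "LISTC")])

# REF4 in PROGB: nothing is known locally, so the field starts at 000000.
ref4_progb = apply_modifications(
    0x000000, [("+", "ENDA"), ("-", "LISTA"), ("+", "LISTC")]
)

# REF1: PC-relative in PROGA, extended format with a Modification record in PROGB.
ref1_proga = pc_relative_target(0x4023, 0x01D)
ref1_progb = apply_modifications(0x00000, [("+", "LISTA")])

print(f"{ref4_proga:06X} {ref4_progb:06X} {ref1_proga:04X} {ref1_progb:04X}")
```

Both REF4 values come out equal, and both REF1 forms reach the same target, which is the point of the comparison.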
Work through REF2 and REF3, as well as REF5 through REF8, to see that they are the same for each of the three programs.

3.2.3 Algorithm and Data Structures for a Linking Loader

An external symbol may be referenced before it is defined, so a linking loader makes 2 passes.
  Pass 1 assigns addresses to all external symbols.
  Pass 2 performs the actual loading, relocation, and linking.

Data structures:

1) ESTAB - an external symbol table
     Stores the name, address, and CSECT of each external symbol in the set of CSECTs being loaded.
     Typically a hash table.
     Holds:
       the CSECT name from the Header record, with the value of CSADDR
       all external symbols from Define records, each with a value found by adding CSADDR to the value in the Define record
     The CSECT length CSLTH (taken from the Header record) is added to CSADDR when the End record is read, giving the starting address of the next CSECT.
2) PROGADDR - the program load address, provided by the OS
3) CSADDR - the control section address, the starting address of the CSECT currently being processed

Pass 1 provides a load map, mainly the information in the symbol table:

  Control    Symbol
  section    name      Address   Length
  PROGA                4000      0063
             LISTA     4040
             ENDA      4054
  PROGB                4063      007F
             LISTB     40C3
             ENDB      40D3
  PROGC                40E2      0051
             LISTC     4112
             ENDC      4124

The loader takes the address in the End record as the point where execution begins. If more than one End record specifies an address, the last one is used; if none does, PROGADDR is used.

Efficiency can be increased if a reference number is assigned to each external symbol. The Modification records then use the reference number rather than the characters of the name. This number can be used to index an array, thus avoiding a table lookup.

3.3 MACHINE-INDEPENDENT LOADER FEATURES

Loader functions:
  load the object program
  include library routines
  specify options

3.3.1 Automatic Library Search

Requires that the routines to be found are named as external references in the program.

Symbols in ESTAB that do not have an address at the end of Pass 1 are unresolved. The loader searches the libraries for routines that define them; any that remain unresolved after the search are treated as errors.

Users can supply their own version of a routine such as sqrt, overriding the library's version: since sqrt is then already defined in ESTAB, the library is never searched for it. This is allowed deliberately.
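The search for unresolved symbols can be pictured with a small Python sketch. Everything here is hypothetical: the function name library_search, the use of None to mark a symbol that was referenced but not defined, and a directory that maps routine names to lengths are simplifications chosen for illustration, not the layout of any real loader or library.

```python
def library_search(estab, directory, next_free):
    """Resolve undefined ESTAB entries from a library directory.

    estab:      symbol -> address, or None if referenced but not yet defined
    directory:  routine name -> length in bytes (a stand-in for a real
                library directory; this shape is an assumption)
    next_free:  first address after the control sections already placed
    Returns the symbols that are still unresolved, which would be errors.
    """
    errors = []
    unresolved = [name for name, addr in estab.items() if addr is None]
    for name in unresolved:
        if name in directory:
            # Place the library routine after everything loaded so far.
            estab[name] = next_free
            next_free += directory[name]
        else:
            errors.append(name)
    return errors


# SQRT is defined by the user, so the library copy is never consulted.
estab = {"PROGA": 0x4000, "SQRT": 0x4200, "STDDEV": None, "PLOT": None}
directory = {"SQRT": 0x80, "STDDEV": 0x120}

errors = library_search(estab, directory, next_free=0x5000)
print({k: hex(v) if v is not None else None for k, v in estab.items()}, errors)
```

In this run STDDEV is taken from the library, the user's SQRT keeps its own address, and PLOT is reported as unresolved.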
Library search is really a search of a directory that contains the address of each routine.

3.3.2 Loader Options

Options may be given in a command language in a separate file, or as part of the compiled/assembled program.

*Select alternative sources of input.
    INCLUDE program-name(library-name)
 directs the loader to read the named program from a library as if it were part of the primary loader input.

*Delete external symbols or entire CSECTs.
    DELETE csect-name
 deletes the CSECT from the set being loaded.

*Change names.
    CHANGE name1,name2

Example: Fig 2.15 is COPY using RDREC and WRREC. Suppose new routines READ and WRITE are to replace them, but we want to test READ and WRITE first. Without reassembling, we could give the loader:

    INCLUDE READ(UTLIB)
    INCLUDE WRITE(UTLIB)
    DELETE RDREC, WRREC
    CHANGE RDREC, READ
    CHANGE WRREC, WRITE

Now the new routines are used in execution without removing the old ones and reassembling.

*Specify alternative libraries to be searched. These are searched before the system libraries, allowing user versions to replace system versions.
    LIBRARY MYLIB

*Specify that certain library routines not be included, for example if statistics are normally computed but are not needed in this run.
    NOCALL STDDEV, PLOT, CORREL
 leaves these references unresolved but still lets the load succeed.

*Specify that no external references be resolved. Useful when programs are linked but not executed immediately. Any call to an unresolved external reference will, of course, be an error.

*Control the output from the loader. The load map can vary in its level of detail: CSECT names only; CSECT names and addresses; external symbol addresses; or a cross-reference table showing where each symbol is used.

*Specify the location at which execution begins.

*Allow execution even when there are unresolved external references.

3.4 LOADER DESIGN OPTIONS

Choices:
  Linking loaders - perform all linking and relocation at load time.
  Linkage editors - perform linking prior to load time.
  Dynamic linking - linking is performed at execution time.

3.4.1 Linkage Editors

A linkage editor writes the linked, executable image to a file. This file is called a load module.
To run the program, a relocating loader just loads the load module. Loading can be done in one pass without an external symbol table. All relocatable values are relative to the start of the linked program, so the only modification needed at load time is adding the actual load address.

A linkage editor is good for programs that are run over and over without modification. If a program is run only once, a linking loader is better.

A linked program can be reprocessed to replace control sections or modify external references.

A linkage editor can be used to replace one subroutine without relinking everything else, much as make does for compiling.

    INCLUDE PLANNER(PROGLIB)
    DELETE PROJECT             (delete from existing PLANNER)
    INCLUDE PROJECT(NEWLIB)    (include new version)
    REPLACE PLANNER(PROGLIB)

A linkage editor can be used to combine several library routines into a package, so that they do not need to be recombined each time a program that uses the package is run.

    INCLUDE READR(FTNLIB)
    INCLUDE WRITER(FTNLIB)
    INCLUDE BLOCK(FTNLIB)
    .
    .
    .
    SAVE FTNIO(SUBLIB)

The result is much more efficient linking of these functions.

A linkage editor can indicate that external references are not to be resolved by automatic library search. Example: suppose 100 programs use the I/O routines. If all external references were resolved, there would be 100 copies of the library. Using linkage editor commands like those above, the user could specify that the library not be included. A linking loader could then be used to include the routines at run time. There would be a little more overhead, since two linking operations would be done: one for the user's external references by the linkage editor, and one for the libraries by the linking loader.

3.4.2 Dynamic Linking

Performs the linking operations described above, but at execution time rather than load time.

A subroutine is loaded and linked to the rest of the program when it is first called.

Used to allow several executing programs to share one copy of a subroutine or library. One copy of the function can be provided for all executing programs that use it.
Dynamic linking is used for objects in an object-oriented language, allowing an object to be shared by several programs. The implementation of an object can be changed without affecting the programs that use it.

A subroutine is loaded only if it is needed. An error-handling routine, for example, would never be loaded if the error never occurred, saving time and space.

To make this work, the loader must remain available during execution and be invoked when the function is first needed. In this case the loader can be thought of as part of the OS, so an OS call occurs.

The binding happens at execution time rather than load time. Delayed binding gives more capabilities at higher cost.

3.4.3 Bootstrap Loaders

How is the loader itself loaded?

The machine is idle and empty, so there is no need for relocation. Simply specify the absolute address for the first program (the OS).

Some computers have an absolute loader permanently resident in read-only memory (ROM). When a hardware signal occurs, the machine executes this ROM program.

An alternative solution is a built-in hardware function or short ROM program that reads a fixed-length record from some device into memory at a fixed location. The device can often be selected via switches. The record contains instructions that load the absolute program. This first record can cause other records to be read, hence the name bootstrap.

Read 3.5 IMPLEMENTATION EXAMPLES

Next: Chapter 4