Cray Assembler for MPP (CAM) Reference Manual SR–2510 2.2

Copyright © 1993, 1996 Cray Research, Inc. All Rights Reserved. This manual or parts thereof may not be reproduced in any form unless permitted by contract or by written permission of Cray Research, Inc.

Portions of this product may still be in development. The existence of those portions still in development is not a commitment of actual release or support by Cray Research, Inc. Cray Research, Inc. assumes no liability for any damages resulting from attempts to use any functionality or documentation not officially released and supported. If it is released, the final form and the time of official release and start of support is at the discretion of Cray Research, Inc.

Autotasking, CF77, CRAY, Cray Ada, CraySoft, CRAY Y-MP, CRAY-1, CRInform, CRI/TurboKiva, HSX, LibSci, MPP Apprentice, SSD, SUPERCLUSTER, SUPERSERVER, UNICOS, and X-MP EA are federally registered trademarks and Because no workstation is an island, CCI, CCMT, CF90, CFT, CFT2, CFT77, ConCurrent Maintenance Tools, COS, Cray Animation Theater, CRAY APP, CRAY C90, CRAY C90D, Cray C++ Compiling System, CrayDoc, CRAY EL, CRAY J90, Cray NQS, Cray/REELlibrarian, CRAY S-MP, CRAY SSD-T90, CRAY SUPERSERVER 6400, CRAY T90, CRAY T3D, CRAY T3E, CrayTutor, CRAY X-MP, CRAY XMS, CRAY-2, CS6400, CSIM, CVT, Delivering the power . . ., DGauss, Docview, EMDS, GigaRing, HEXAR, IOS, ND Series Network Disk Array, Network Queuing Environment, Network Queuing Tools, OLNET, RQS, SEGLDR, SMARTE, SUPERLINK, System Maintenance and Remote Testing Environment, Trusted UNICOS, UNICOS MAX, and UNICOS/mk are trademarks of Cray Research, Inc.

DEC, DECchip, VAX, and VMS are trademarks of Digital Equipment Corporation. UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company Limited. X/Open is a registered trademark, and the X device is a trademark, of X/Open Company Ltd.
The UNICOS operating system is derived from UNIX® System V. The UNICOS operating system is also based in part on the Fourth Berkeley Software Distribution (BSD) under license from The Regents of the University of California.

New Features

The following new features are included in CAM 2.2:

• Support for CRAY T3E systems
  CRAY T3E systems use the EV-5 version of the DEC Alpha microprocessor. CAM 2.2 includes instructions and assembler directives that are specific to CRAY T3E systems.

• A .uses_eregs assembler directive
  The .uses_eregs assembler directive indicates that E registers are used. This assembler directive applies only to CRAY T3E systems. Assembler directives are described in Section 5.6, page 93.

• The wmb and excb user instructions
  These user instructions apply only to CRAY T3E systems.

On CRAY T3D systems, instructions that perform byte operations on registers (such as extxx, mskxx, and insxx) exhibited little endian behavior. On CRAY T3E systems these instructions exhibit big endian behavior. Endian behavior is described in Section 5.6, page 93.

Record of Revision

The date of printing or software version number is indicated in the footer.

Version   Description
1.0       November 1993. Draft version to support the MPP 1.0 release.
1.1       March 1994. Original printing to support the CAM 1.1 release.
2.0       June 1994. Rewrite to support the CAM 2.0 release.
2.1       October 1994. Rewrite to support the CAM 2.1 release.
2.2       July 1996. Draft printing to support the CAM 2.2 release.
2.2       September 1996. Rewrite to support the CAM 2.2 release.

Contents

Preface
    Related Publications
    Ordering publications
    Conventions
    Online information
    Reader comments

Introduction [1]
    CRAY T3D operation
    CRAY T3E operation
    Manual organization
    Capabilities
    Limitations
    Execution of the CAM assembler
    Source statement format
    Listing file format

System Information and Usage [2]
    cam(1) command line
    The execution environment
    Execution on the CRAY T3D system
    Execution on the CRAY T3E system
    Environment variables

The CAM Program [3]
    Program segment
    Global definitions
    Program module
    Source statement
    Statement editing
    Instructions and directives
    Assembler directives
    Assembler instructions
    Micros
    regnum function
    DEX function
    Location counter
    Register designators
    Identifiers
    Symbols
    Labels
    Operators
    Operator precedence
    Operator definitions
    Numeric constants
    Expressions
    Constant integer expressions
    Floating-point expressions
    DEX expressions

CAM Instruction Set [4]
    Operand qualifiers
    CAM instructions
    Integer load and store instructions
    Integer control instructions
    Integer arithmetic instructions
    Integer compare instructions
    Logical instructions
    Conditional move instructions
    Shift instructions
    Byte manipulation instructions
    Floating-point load and store instructions
    Floating-point control instructions
    Floating-point copy instructions
    Floating-point conversion instructions
    Floating-point move instructions
    Floating-point compare instructions
    Floating-point arithmetic instructions
    Miscellaneous instructions

Assembler Directives [5]
    Conditional assembly
    Data definition
    Macro control
    Message/listing control
    Program control
    Assembler directive descriptions
    .align
    .ascic
    .ascii
    .asciz
    .bits
    .blk_bits
    .blkb
    .blkl
    .blkq
    .blks
    .blkt
    .blkw
    .byte
    .comment
    .dexend
    .dexstart
    .double
    .else
    .end
    .endc
    .endif
    .endm
    .endp
    .endr
    .error
    .even
    .extern
    .external
    .float
    .ident
    .if
    .if_false
    .iff
    .iif
    .list
    .long
    .macro
    .mdelete
    .mexit
    .odd
    .print
    .psect
    .quad
    .repeat
    .restore
    .restore_psect
    .s_float
    .s_floating
    .save
    .save_psect
    .soft
    .stack
    .start
    .subtitle
    .t_floating
    .title
    .uses_eregs (CRAY T3E systems only)
    .warning
    .weak
    .word

Macros [6]
    CAM macro facility
    Macro definitions
    Formal arguments
    Default values
    String arguments
    Macro-defined temporary labels
    Argument concatenation
    Macro calls
    Actual arguments
    Keyword arguments
    Passing numeric values of symbols
    Macro call nesting

Micros [7]
    Register micros
    Numeric micros
    String micros
    Assembler-defined micros

Appendix A  Interlanguage Calling Protocol
    Subroutine linkage
    Accessing the linkage macros
    Register use conventions
    Linkage macro descriptions
    The ALLOC, LOAD, and STORE macros
    The DEFARG, ENTER, and EXIT macros
    The ADDRESS and VALUE macros
    The CALL and MXCALLEN macros
    The SETARG and CALLV macros
    The CRI_REGISTER_NAMES and CRI_STACK_DEFINITIONS macros
    The calling sequence for Cray MPP systems
    Background
    Differences
    Contents of the calling sequence
    Terminology
    Register use conventions
    Integer register usage conventions
    Floating-point register usage conventions
    E register usage conventions (CRAY T3E systems only)
    Data structures
    Private stack frame
    Shared stack frame
    Calling sequence elements
    Program start-up state
    User subprogram entry and exit
    Call site actions

Appendix B  Privileged Architecture Library (PAL Code)

Index

Figures
    Figure 1.  CAM program structure
    Figure 2.  E register allocation
    Figure 3.  Private stack frame
    Figure 4.  Dynamic subprogram information block (DSIB)
    Figure 5.  Call information word
    Figure 6.  Static subprogram information block (SSIB)
    Figure 7.  Shared stack frame

Tables
    Table 1.   Operand qualifiers
    Table 2.   Integer load and store instructions
    Table 3.   Integer control instructions
    Table 4.   Integer arithmetic instructions
    Table 5.   Integer compare instructions
    Table 6.   Logical instructions
    Table 7.   Conditional move instructions
    Table 8.   Shift instructions
    Table 9.   Byte manipulation instructions
    Table 10.  Floating-point load and store instructions
    Table 11.  Floating-point control instructions
    Table 12.  Floating-point copy instructions
    Table 13.  Floating-point conversion instructions
    Table 14.  Floating-point move instructions
    Table 15.  Floating-point compare instructions
    Table 16.  Floating-point arithmetic instructions
    Table 17.  Miscellaneous instructions
    Table 18.  Integer register usage conventions
    Table 19.  Floating-point register usage conventions
    Table 20.  E register usage conventions (CRAY T3E systems only)
    Table 21.  User PAL codes

Preface

This publication documents CAM release 2.2 running on Cray MPP systems. CAM assembly language is a mnemonic-based language that generates object code for execution on Cray MPP systems.
Related Publications

The following documents contain additional information that may be helpful:

• UNICOS User Commands Reference Manual, publication SR–2011
• UNICOS Macros and Opdefs Reference Manual, publication SR–2403
• CF90 Fortran Language Reference Manual, Volume 1, publication SR–3902
• CF77 Fortran Language Reference Manual, publication SR–3772
• UNICOS System Libraries Reference Manual, publication SR–2080
• Cray C/C++ Reference Manual, publication SR–2179
• Cray Research MPP Software Guide, publication SG–2508
• CRAY T3D Administrator’s Guide, publication SG–2507
• CRAY T3D Emulator User’s Guide, publication SG–2500
• Cray MPP Simulator User’s Guide, publication SG–2503

The Alpha Architecture Handbook, publication TPD-0007, and the Alpha AXP Architecture Handbook, publication TPD-0012, which are Digital Equipment Corporation publications, also provide information related to the CAM assembler.

Ordering publications

The User Publications Catalog, publication CP–0099, describes the availability and content of all Cray Research hardware and software documents that are available to customers. Cray Research customers who subscribe to the Cray Inform (CRInform) program can access this information on the CRInform system.

To order a document, either call the Distribution Center in Mendota Heights, Minnesota, at +1–612–683–5907, or send a facsimile of your request to fax number +1–612–452–0141. Cray Research employees may send electronic mail to orderdsk (UNIX system users).

Customers who subscribe to the CRInform program can order software release packages electronically by using the Order Cray Software option.

Customers outside of the United States and Canada should contact their local service organization for ordering and documentation information.
Conventions

The following conventions are used throughout this document:

Convention      Meaning

command         This fixed-space font denotes literal items such as commands, files, routines, path names, signals, messages, and programming language structures.

manpage(x)      Man page section identifiers appear in parentheses after man page names. The following list describes the identifiers:

                1     User commands
                1B    User commands ported from BSD
                2     System calls
                3     Library routines, macros, and opdefs
                4     Devices (special files)
                4P    Protocols
                5     File formats
                7     Miscellaneous topics
                7D    DWB-related information
                8     Administrator commands

                Some internal routines (for example, the ddcntl() routine) do not have man pages associated with them.

variable        Italic typeface denotes variable entries and words or concepts being defined.

user input      This bold, fixed-space font denotes literal items that the user enters in interactive sessions. Output is shown in nonbold, fixed-space font.

[]              Brackets enclose optional portions of a command line.

...             Ellipses indicate that a preceding command-line element can be repeated.

The following machine naming conventions may be used throughout this document:

Term                        Definition

Cray PVP systems            All configurations of Cray parallel vector processing (PVP) systems, including the following:

                            CRAY C90 series
                            CRAY C90D series
                            CRAY EL series (including CRAY Y-MP EL systems)
                            CRAY J90 series
                            CRAY T90 series
                            CRAY Y-MP E series
                            CRAY Y-MP M90 series

Cray MPP systems            All configurations of Cray massively parallel processing (MPP) systems, including the CRAY T3D series and CRAY T3E series

All Cray Research systems   All configurations of Cray PVP and Cray MPP systems that support this release

SPARC systems               All SPARC platforms that run the Solaris operating system version 2.3 or later

The default shell in the UNICOS and UNICOS/mk operating systems, referred to in Cray Research documentation as the standard shell, is a version of the Korn shell that conforms to the following standards:

• Institute of Electrical and Electronics Engineers (IEEE) Portable Operating System Interface (POSIX) Standard 1003.2–1992
• X/Open Portability Guide, Issue 4 (XPG4)

The UNICOS and UNICOS/mk operating systems also support the optional use of the C shell.

Cray UNICOS Version 9.0 is an X/Open Base 95 branded product.

The POSIX standard uses utilities to refer to executable programs that Cray Research documentation usually refers to as commands. Both terms may appear in this document.

Online information

The following types of online information products are available to Cray Research customers:

• Cray DynaWeb server, which allows you to view documents online by using a World Wide Web (WWW) browser such as Netscape or Mosaic. To access the Cray DynaWeb server, see your Cray Research system administrator for the local URL.
• Man pages, which describe a particular element of the operating system or a compatible product. To see a detailed description of a particular command or routine, use the man(1) command.
• UNICOS message system, which provides explanations of error messages. To see an explanation of a message, use the explain(1) command.
• Cray Research online glossary, which explains the terms used in a document. To get a definition, use the define(1) command.
• xhelp help facility. This online help system is available within tools such as the Program Browser (xbrowse) and the MPP Apprentice tool.

For detailed information on these topics, see the User’s Guide to Online Information, publication SG–2143.

Reader comments

If you have comments about the technical accuracy, content, or organization of this document, please tell us. You can contact us in any of the following ways:

• Send us electronic mail from a UNICOS or UNIX system, using the following UUCP address: uunet!cray!publications
• Send us electronic mail from any system connected to the Internet, using the following Internet address: [email protected]
• Contact your Cray Research representative and ask that a Software Problem Report (SPR) be filed. Use PUBLICATIONS for the group name, PUBS for the command, and NO-LICENSE for the release name.
• Call our Software Publications Group in Eagan, Minnesota, through the Customer Service Call Center, using either of the following numbers: 1–800–950–2729 (toll free from the United States and Canada) or +1–612–683–5600
• Send a facsimile of your comments to the attention of “Software Publications Group” in Eagan, Minnesota, at fax number +1–612–683–5599.
• Use the postage-paid Reader’s Comment Form at the back of the printed document.

We value your comments and will respond to them promptly.

Introduction [1]

Massively parallel processing (MPP) is the execution of operations in parallel on independent code segments or on portions of data sets using a large number of processors. For certain types of algorithms and applications, this type of processing provides results at rates greatly exceeding those of conventional parallel-vector-scalar supercomputers.
Massively parallel processing is used on both the CRAY T3D and CRAY T3E systems. This manual describes the Cray Assembler for MPP (CAM).

The underlying microprocessor in Cray MPP systems is an Alpha reduced instruction set computer (RISC) 64-bit microprocessor developed by Digital Equipment Corporation. The CRAY T3D system uses the EV-4 version of the Alpha microprocessor and the CRAY T3E system uses the EV-5 version of the Alpha microprocessor. The Alpha architecture is described in the Alpha Architecture Handbook, publication TPD-0007, and the Alpha AXP Architecture Handbook, publication TPD-0012, by Digital Equipment Corporation.

Note: Specific differences in CAM between CRAY T3D and CRAY T3E systems are noted as appropriate in this manual.

1.1 CRAY T3D operation

The CRAY T3D system combines the strengths of a Cray PVP host supercomputer and the CRAY T3D system into a scalable heterogeneous system (SHS). The host system can be any Cray Research system in the CRAY Y-MP E, CRAY Y-MP M90, or CRAY C90 series. The host system provides support for applications running on the CRAY T3D system.

Applications written for the CRAY T3D system are compiled on the host system, but run on the CRAY T3D system. The Cray Assembler for MPP (CAM), version 2.2, runs on the host system and generates object code for execution on a CRAY T3D system.

Modifications were made to the UNICOS operating system with the release of UNICOS 7.C.3 and UNICOS 8.0 to support CRAY T3D systems. These modifications are discussed in further detail in the UNICOS 7.C.3 Release Letter, publication RL-5001, the UNICOS 8.0 Release Overview, publication RO-5000, and in the Cray Research MPP Software Guide, publication SG–2508.

The CRAY T3D system uses the UNICOS MAX operating system.

1.2 CRAY T3E operation

The CRAY T3E system does not require a host system.
Applications written for the CRAY T3E system are compiled and executed on the CRAY T3E system. CAM 2.2 runs on the CRAY T3E system and generates object code that runs on the CRAY T3E system.

The CRAY T3E system uses the UNICOS/mk operating system.

1.3 Manual organization

This publication is organized as follows:

Chapter   Description
1         Provides an overview of the capabilities and features of the CAM assembler and lists the new features available in CAM 2.2
2         Describes the CAM command line and execution environment
3         Describes the organization of a CAM program
4         Lists and describes the CAM instruction set
5         Describes the assembler directives that are available with the CAM assembler
6         Describes macros
7         Describes micros
A         Describes the linkage macros associated with CAM and the Cray MPP system’s calling sequence

1.4 Capabilities

CAM provides the following capabilities:

• The free-format source statements of CAM let you control the size and location of source statement fields.
• With some exceptions, source statements can be entered in either uppercase or lowercase letters.
• Data within CAM is individualized. You can define data areas during assembly and load them along with the program.
• Data can be designated in integer, floating-point, or character notation.
• You can control the content of the assembler listing.

1.5 Limitations

CAM assembly language has the following limitations:

• Constants defined using the == operator must appear within the bounds of a program module.
• Global symbol sets that are defined within a program module are visible to other program modules only through the use of the .external assembler directive.
• All definitions and references to user symbols are case-sensitive; assembler directives and instructions are not.
• The maximum number of instructions allowed per assembly is 1,048,575.
• Micro names are case-sensitive.
• The valid range of registers is 0 through 31 for both integer and floating-point registers.
• The first character of an identifier cannot be a number, and special characters should generally be avoided.
• Because names of program sections, symbols, and labels occupy the same name space, they cannot share the same name.
• Global symbols can be referenced only within the program module in which they are defined.
• Global symbols cannot be redefined.
• If a label definition is used in a statement, it must be the first item to appear on the statement line.
• User-defined temporary labels must be in the range from 0 through 29999.
• Only comments and assembly time symbols (for example, micros, macros, and local symbols) can precede the .ident assembly directive in a program module.
• A macro must be defined prior to use.
• Macro definitions can be nested; however, the enclosed macro definition is not defined until the enclosing macro is invoked.
• If an actual argument is a string containing characters that the assembler interprets as separators (such as a tab, space, or comma), the string must be enclosed by delimiters.
• Macro-defined temporary labels must be associated with positional actual arguments; they cannot be associated with keyword actual arguments.
• Actual arguments in macro calls must be separated by commas.
• Micro expansion will not occur in macro definition code, conditional code that is skipped, or in repeat block definition code.

1.6 Execution of the CAM assembler

The CAM assembler executes under the control of the operating system. It has no hardware requirements beyond those required for the minimum system configuration. When you specify the CAM invocation statement, the assembler is loaded and begins executing.
Parameters can be specified on the invocation statement to define the characteristics of an assembler run, such as the file containing source statements. For more information on the CAM command line, see Section 2.1, page 11, or the cam(1) man page.

Note: Because CAM is a one-pass assembler, all macros and assembler constants must be defined before they are used. Assembly code and data definitions are processed as they are read.

For more information concerning the format of the listing file, see Section 1.8, page 7.

The object code must be linked and loaded before execution. References to external symbols are resolved during the link and load phase. The absolute file that the linker or loader creates is ready for execution. Execution can only occur on the MPP system or in the Cray MPP simulator, mppsim.

The assembler returns an exit status when assembly is completed. If no errors have occurred, a zero is returned. If the assembly results in errors, the assembler returns a status greater than zero. If a preprocessor has been invoked by using command-line options and preprocessing is not successful, the assembly is halted and one of the following exit statuses is returned:

• If the preprocessor has exited through an exit, the exit status value is returned.
• If the preprocessor stops without exiting through an exit or is stopped by a signal, the assembler returns 1 as its exit status.

If the assembly detects errors, the object file that is being created by the assembler is not kept.

For more information about the Cray MPP simulator, see the Cray MPP Simulator User’s Guide, publication SG–2503.

1.7 Source statement format

A CAM assembler source statement can be a mnemonic machine instruction, an assembler directive, a macro instruction, a label, a symbol definition, a micro definition, or a comment.
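The statement kinds listed above can be told apart, very roughly, by their leading characters. The following Python sketch is not part of CAM and is not how the assembler itself is implemented; it is an illustration under stated assumptions. The function name classify is hypothetical. The sketch assumes that a label ends with a colon, that a directive name begins with a period, that a symbol definition uses = or ==, and that a comment begins with a semicolon, as the examples in this manual suggest. It cannot distinguish machine instructions, macro calls, and micro definitions from one another, because that requires the assembler's own tables, so it reports them as one group.

```python
import re

# Hypothetical helper, not a CAM API: coarse classification of one
# logical source statement based only on its leading characters.
LABEL = re.compile(r"^([A-Za-z_$][\w$]*):\s*(.*)$")
SYMBOL_DEF = re.compile(r"^[A-Za-z_$.][\w$]*\s*==?\s*\S")


def classify(line: str) -> str:
    """Return a coarse statement kind for one source line."""
    # Assumption: a semicolon always starts a comment. A semicolon
    # inside a quoted string would be misread by this sketch.
    code = line.split(";", 1)[0].strip()
    if not code:
        return "comment" if ";" in line else "empty"

    # A label definition, when present, is the first item on the line.
    match = LABEL.match(code)
    if match:
        code = match.group(2).strip()
        if not code:
            return "label"

    # Check assignments first so that ". = ." is not taken for a directive.
    if SYMBOL_DEF.match(code):
        return "symbol definition"
    if code.split()[0].startswith("."):
        return "directive"
    return "instruction or macro call"


if __name__ == "__main__":
    for text in ("ABC: addq R10,R11,R12; Sum",
                 ".ident instruction_test",
                 "bpt = 24",
                 "; a comment on its own line"):
        print(f"{text!r:35} {classify(text)}")
```

A real assembler would tokenize the line and consult its instruction, macro, and micro tables; the string tests here only mirror the surface syntax visible in the manual's examples.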
Mnemonic machine instructions provide a way of expressing all supported functions of a Cray MPP system processor. Assembler directives let you control the assembly process. Macros define sequences of instructions to be called later in the program. Labels, symbol definitions, micro definitions, and comments are defined by the user.

Source statements can contain a maximum of 256 characters on a physical line. A physical line can be continued onto the next physical line by ending it with a single dash character (-). This continuation indicator must appear before any comments on that line. Logical lines (statements) are not limited in length except by memory considerations.

The following is an example of a source statement that has been continued:

    lda R1, (val1 - val2)*size

CAM source statements are free-format and can contain any or all of the following:

• Identifiers
  An identifier is chosen by the user and describes a name, symbol, or label. If no identifier appears on a statement line, an instruction can be the first item on the line. See Chapter 3, page 21, for more information on names, labels, and symbols.

• Operators/operands
  An operator identifies the operation to be performed by the statement, and the operand is the item or items affected by the operation. An operator can be an instruction that generates binary code in the object module, an assembler directive that performs control operations during the actual assembly, or a call to a macro that expands the macro into its statements. The operand is usually a memory location or a register that is used by an instruction, but operands can also be used with macros. Operands are separated by commas. Operators start at the end of an optional location field and are separated by white space. Each field is terminated by the start of a comment or by the end of the line.
• Comments
  Comments are documentation that explains the source statement. Comments are optional and are ignored by the CAM assembler. They can start after the statement is finished or they can be on a line by themselves. They must be preceded by a semicolon (;), and terminate at the end of the line.

The following is an example source statement:

    ABC: addq R10,R11,R12; Sum

1.8 Listing file format

The format of the listing file is as follows:

    # [offset] [value] [state] source line [:macro_name]

The components of the listing file are as follows:

#
    Line number in the source code.

offset
    Hexadecimal byte offset from the beginning of the current program section. It is blank when not associated with a particular Alpha instruction, a macro call in a code program section, or a data directive that reserves space.

value
    Hexadecimal value of an Alpha instruction, data item initialized by a data directive, or the value of a local or global set. The value of a global set is supplied only when the initialization expression contains no label or global set identifiers. Otherwise, it is blank.

state
    Assembler-generated commentary controlled by the -e and -d command line parameters and the .list assembler directive. The commentary generated takes one of the following forms:

    state     Description
    CNDSKP    Code skipped inside of a conditional
    MACALL    Macro call line
    MACDEF    Macro definition
    MACEXP    Macro text expanded
    REPDEF    Repeat (.repeat) code being collected
    REPEXP    Repeat code expanded

source line
    User-supplied source line.

macro_name
    The name of the macro being expanded.

An example listing file follows.

    instruction_test -MPP Assembler: Version 2.2 (ed 2706) page: 1

     1                    .ident instruction_test
     2
     3                    .start begin
     4
     5                    .psect INS_DATA,data
     6                    data: .blkq 50
     7 00000000 0000000a  quad: .quad 10
     8                    bpt = 24
     9                    fred = .
    10
    11                    .psect INSTRS,code
    12                    . = . + fred
    13                    begin::
    14 00000cc0 203f0020  lda R1,32(r31)
    15 00000ce0 203f0018  lda R1,bpt(r31)
    16 00000d00 203f0000  lda R1,quad(r31)
    17                    . = . + 52
    18 00000ec0 20ab0018  lda R5,bpt & ^xffff(R11)
    19 00000ee0 d33fff88  bsr r25,begin
    20 00000f00 45cf0410  bis R14,R15,R16
    21 00000f20 6aad0000  jmp R21,(r13),21
    22 00000f40 463ff412  bis R17,^xff,R18
    23
    24 00000f60 601d8000  fetch 0(R29)
    25
    26 00000f80 00000018  call_pal bpt
    27 00000fa0 00000078  call_pal 120
    28                    . = 0
    29
    30 00000000 44220403  bis R1,R2,R3
    31 00000020 44220403  bis R1,R2,R3
    32 00000040 44220403  bis R1,R2,R3
    33
    34
    35                    end:

System Information and Usage [2]

This section describes the cam(1) command, its options, and the environment variables that are associated with it. The cam(1) command is also documented on the cam(1) man page.

2.1 cam(1) command line

The cam(1) command invokes the Cray Assembler for MPP (CAM). The format of the command is as follows:

    cam [-C target] [-d opts] [-D symbol [=def]] [-e opts] [-g] [-i] [-I include_dir] [-l list_file] [-m mlevel] [-M] [-o obj_file] [-P] [-U symbol] [-v] [-V] [--] [source_file]

-C target
    Determines the system for which code will be generated. Overrides the value set in the TARGET environment variable. Enter one of the following for target:

    target                Target Systems
    cray-t3d, CRAY-T3D    CRAY T3D systems
    cray-t3e, CRAY-T3E    CRAY T3E systems

-d opts
    Disables listing options. If a specified option is turned on, it is turned off. Otherwise, there is no effect. The default option is -da. Enter one or more of the following for opts:

    opts  Listing Action
    a     All listing options are turned off (the d, m, r, and s options are implied), and no listing file is generated. (default)
    c     Show only the code that is not skipped by conditional assembly.
          Conditional directives are not shown.
    d     Macro definition expansion is not included in the listing file.
    e     Edited line listing display is turned off.
    m     Macro expansion is not included in the listing file.
    q     An unlimited number of errors is displayed.
    r     Repeat block expansion is not included in the listing file.
    s     No listing file is generated.

    To specify multiple opts, enter them with no separator. The following command line example disables the m, r, and s listing options:

        cam -dmrs myfile.s

-D symbol [=def]
    Defines variables used for source preprocessing. This option is used in conjunction with the -P and -M options. Specifying this option sets a preprocessor to run automatically. If neither the m4(1) nor the cpp(1) preprocessor has been selected, the cpp(1) preprocessor is run. If passed to cpp(1) (see -P) and [=def] is not present, the def is set to 1. If passed to m4(1) (see -M) and [=def] is not present, the def is set to NULL. To specify more than one symbol or def, specify multiple -D options.

-e opts
    Enables listing options. The default is for all options to be off unless the -e option is specified. Enter one or more of the following for opts:

    opts  Listing Action
    a     All listing options are to be turned on (the d, m, r, and s options are implied).
    c     Code that is skipped during conditional assembly is shown.
    d     Macro definition expansion is included in the listing file (s option implied).
    e     Edited line listing display is turned on. The lines displayed include the expansion of all micro and macro arguments. Comments are not displayed.
    m     Macro expansion is included in the listing file (s option implied).
    q     The number of errors listed is limited to 100.
    r     Repeat block expansion is included in the listing file (s option implied).
    s     Implies default listing options are to be used (default is that offsets and opcodes are generated by the user source file).

    To specify multiple opts, enter them with no separator. The following command line example enables the m, r, and s listing options:

        cam -emrs myfile.s

-g
    Generates symbolic debugging tables.

-i
    Retains the preprocessed source file as source_file.i. By default, this file is deleted.

-I include_dir
    Changes the #include file search algorithm. For #include files whose names do not begin with a slash (/), the search looks in the directory specified by include_dir before looking in the directories on the standard list. The algorithm first searches in the directory of the input file for #include files with names that are enclosed in quotation marks (" "). All #include files (those whose names are enclosed in angle brackets < > or quotation marks " ") are then searched for in directories named in -I options and finally in the standard directory /usr/include. To specify more than one include_dir, specify multiple -I options. If multiple -I options are specified, the directories are searched in the order specified on the command line. This option invokes the cpp(1) preprocessor.

-l list_file
    Writes the listing to list_file. By default, a listing file is not created. This option must be used with the -e option to actually obtain a listing.

-m mlevel
    Changes the severity level of the messages that are reported by the assembler. The default mlevel is 3 (Warning).

    mlevel  Message type
    0       Comment
    1       Note
    2       Caution
    3       Warning (default)
    4       Error

-M
    Executes the /usr/bin/m4 preprocessor and writes the result to source_file.i. By default, the preprocessor is not invoked.

-o obj_file
    Writes the relocatable assembly output to obj_file. By default, the relocatable assembly output is written to source_file.o.
    The link editor or loader processes obj_file.

-P
    Executes the /lib/cpp preprocessor or executes the preprocessor specified by the CAM_CPP_LOCATION environment variable. The result is written to source_file.i. The -P option is similar to the -E and -N options of the cpp(1) command. The exception is that the preprocessed source code does not go to standard output. By default, the preprocessor is not invoked.

    The default setting for the TARGET environment variable is CRAY-T3D (on CRAY T3D systems) or CRAY-T3E (on CRAY T3E systems). The -C command line option overrides the setting in the TARGET environment variable. mpp include files are picked up by cpp(1).

-U symbol
    Undefines variables that had been defined by the -D option. If passed to either cpp(1) or m4(1) (see -P and -M), any initial value of symbol is removed. This option is used in conjunction with the -P and -M options and is passed through to the preprocessor. Specifying this option invokes a preprocessor. If neither the m4(1) nor the cpp(1) preprocessor has been selected, the cpp(1) preprocessor is run. For more information, see cpp(1) and m4(1). To undefine more than one symbol, specify multiple -U options.

-v
    Echoes the preprocessor statement used by the assembler when the preprocessor is used.

-V
    Reports the version information of the assembler.

--
    Marks the end of the options list; only source file names can follow the double dash.

source_file
    CAM source file to be assembled. All options must precede source_file.

2.2 The execution environment

This subsection describes aspects of the execution environment that affect the behavior of CAM.
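As a recap of how the options in Section 2.1 combine, the following Python sketch builds an argument list for cam(1). The helper name build_cam_command and its parameters are hypothetical and are not part of CAM; only the option letters and the rules in the comments are taken from this manual. Passing -l and -o as separate arguments is an assumption based on the command format shown above. The sketch only constructs the argument list and does not run the assembler.

```python
def build_cam_command(source_file, target=None, enable_opts="",
                      list_file=None, obj_file=None):
    """Build an argument list for cam(1). Hypothetical helper, not part of CAM."""
    if list_file and not enable_opts:
        # Section 2.1: -l must be used with -e to actually obtain a listing.
        raise ValueError("-l list_file requires -e opts")
    argv = ["cam"]
    if target:
        argv += ["-C", target]            # overrides the TARGET environment variable
    if enable_opts:
        argv.append("-e" + enable_opts)   # multiple opts are entered with no separator
    if list_file:
        argv += ["-l", list_file]
    if obj_file:
        argv += ["-o", obj_file]
    argv.append(source_file)              # all options must precede source_file
    return argv


# Reproduces the manual's example: cam -emrs myfile.s
print(build_cam_command("myfile.s", enable_opts="mrs"))
# prints ['cam', '-emrs', 'myfile.s']
```

Rejecting -l without -e in the helper mirrors the manual's statement that a listing is not produced unless listing options are enabled.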
2.2.1 Execution on the CRAY T3D system

Applications are run interactively on the CRAY T3D system by logging into and interacting with the Cray Research host to manage and monitor the application running on the CRAY T3D system. You can submit a job interactively to the CRAY T3D system by executing an a.out file that has been built on the Cray Research host to run on the CRAY T3D system.

To assemble and execute a CAM assembler program interactively, enter the following:

    host$ cam [options] source_file.s
    host$ mppldr source_file.o
    host$ a.out

The cam command specified at the first system prompt assembles the code from source_file.s into binary code and stores that output in source_file.o. The mppldr command loads the binary file and creates an executable file suitable for execution on the CRAY T3D system. The a.out file calls the CRAY T3D mppexec command.

As a result of the mppexec command, a process is created that initializes the hardware and operating system environment appropriately to manage the CRAY T3D partition into which an application will be loaded. This involves setting up a number of memory-mapped registers on each PE in the assigned partition and allows the creation of partitions to be completely dynamic. See the mppexec(1) man page for more information.

Once an application is downloaded, mppexec initiates a UNIX agent process on the Cray Research host. The agent process continues to serve the application by handling system calls, exceptions, and signals during the entire execution of the application.

While the application is running on the CRAY T3D system, you can issue the ps(1) command with the -m option to monitor the job. The -m option has been added to the ps(1) command. This option displays information about the active application running on the CRAY T3D system.
The following information is displayed:

• User ID
• Process ID of the mppexec process
• Terminal identifier
• Cumulative time that processes have run on the Cray Research host
• CRAY T3D partition ID and type (OS or HW)
• Partition shape
• Partition state
• Number of PEs allocated to the partition
• Wall-clock time used by the CRAY T3D application
• CRAY T3D command name

You can monitor your CRAY T3D process using either the -m or the -M option. The -m option provides process information about your processes running on both the host system and the CRAY T3D system. The -M option displays only information from the preceding list that pertains to CRAY T3D processes. See the ps(1) man page for more information.

2.2.2 Execution on the CRAY T3E system

To assemble and execute a CAM assembler program interactively, enter the following:

    host$ cam [options] source_file.s
    host$ cld source_file.o
    host$ a.out

This executes directly on the CRAY T3E hardware under UNICOS/mk.

2.2.3 Environment variables

The following environment variables affect the execution environment:

Variable
    Properties

CAM_CPP_LOCATION
    CAM_CPP_LOCATION specifies the location of an alternate preprocessor. The default is /lib/cpp.

NLSPATH
    The CAM assembler uses the NLSPATH environment variable. NLSPATH specifies the location of the CAM message catalog for use with the explain(1) command. NLSPATH needs to be set only if the message catalog has been installed in an alternate location.

TARGET
    Setting the TARGET environment variable to CRAY-T3D or cray-t3d tells the assembler to accept and generate CRAY T3D code. Setting the TARGET environment variable to CRAY-T3E or cray-t3e tells the assembler to accept and generate CRAY T3E code.

The CAM Program [3]

This section describes the components, organization, and features of a Cray Assembler for MPP (CAM) program.
This section discusses the following:

• Program segment
• Source statement
• Statement editing
• Instructions and directives
• Micros
• regnum function
• DEX function
• Location counter
• Register designators
• Identifiers
• Symbols
• Labels
• Operators
• Numeric constants
• Expressions

3.1 Program segment

A program segment is contained within an assembly file and includes global definitions and program modules (for example, the structure of Figure 1, page 23). Sequences of global definitions and program modules can appear one or more times within a program segment.

    Program segment
        Global definitions:
            Macros
            Micros
            Local symbol sets
        Program module:
            .ident name
            Special global definitions:
                Macros
                Micros
                Local symbol sets
            Module definitions:
                Global symbol sets
                Code psects
                Data psects
            .end name

Figure 1. CAM program structure

3.1.1 Global definitions

Global definitions are valid throughout the entire assembly file. Local symbol sets, macros, and micros are valid throughout the entire assembly file regardless of where they are defined. Definitions only change if they are redefined or deleted. Constants defined using the == operator must appear within the bounds of a program module. Code is not generated by a global definition.

Note: Local symbol sets, micros, and macros can be defined anywhere in the program.

3.1.2 Program module

Program modules begin with a .ident assembler directive; terminate with a .end assembler directive; and can include code, data, and assembler definitions. Global definitions (local symbol sets, macros, and micros) within a program module are visible throughout the entire assembly file and can be used within any program module.
Global symbol sets defined within a program module are visible to other program modules only through the use of the .external assembler directive.

3.2 Source statement

A CAM program is a sequence of source statements. Figure 1, page 23, shows the structure of a CAM program. A source statement can be an instruction, an assembler directive, a macro call, a symbol definition, a label, a micro definition, or a comment.

Note: All definitions and references to user symbols are case-sensitive; however, assembler directives and instructions are not.

The following example demonstrates the use of case in source statements (source statements are free-format):

    .ident case_TEST
    .STart Main
    xYz= 1
    .PSECT code_sec, code
    Main::
    la R1,xYz
    .enD case_TEST

3.3 Statement editing

The CAM assembler processes source statements sequentially from the source file. As each statement is read in, all necessary editing functions are performed on the statement. If the statement is coming from a macro body or a repeat body, most of these same functions are applied. The editing functions performed by the assembler are as follows:

• Concatenation
  The assembler supports both line concatenation and character concatenation. Any line ending in the dash (-) character (before the optional comment field) will have the next line appended to it. Character editing is supported only during the expansion of a macro argument. If a single quote (') appears at the beginning or end of a macro argument, the replacement string for the macro argument is concatenated to the character that preceded or followed the single quote.

• Micro substitution
  A micro can appear in any assembler statement (nonassembler keyword character sequence), primarily in strings that form arguments to instructions or assembler directives. The micro is delimited by the grave accent character (`).
  This character is sometimes referred to as the backtick character. The grave accent character must appear at the beginning and end of the micro name, except in the case of register micros. Register micros can appear only in the context of a register where the delimiter is not necessary. See Chapter 7, page 133, for more information on micros.

Note: You can choose to list the edited statement rather than the original source line by using the -ee command-line option or by using the EDIT option with the .list assembler directive.

3.4 Instructions and directives

The CAM assembler recognizes assembler directives and assembler instructions.

3.4.1 Assembler directives

Assembler directives assist the assembler in its task of interpreting the source statements and generating an object program. The CAM assembler has a large complement of directives, each with a unique identifier. The contents of the operand field depend on the assembler directive.

Through the use of the .macro assembler directive, the user can identify a sequence of instructions that will be saved for assembly at a later point in the source program. This sequence of instructions is called a macro. A macro can be called any number of times after it has been defined.

The .repeat directive can be used to define a sequence of instructions that will be inserted into the code a specified number of times at the location of the .repeat directive.

Chapter 5, page 87, describes individual assembler directives and their formats. Macros are described in Chapter 6, page 121.

3.4.2 Assembler instructions

Assembler instructions are machine instructions that manipulate data by performing arithmetic operations, memory retrieval and storage, and transfer of control. Each machine instruction is represented mnemonically in the
CAM assembler. The assembler identifies a machine instruction according to its mnemonic and generates a binary machine instruction in object code.

The maximum number of instructions allowed per assembly is 1,048,575. If an attempt is made to generate more instructions, an error message is generated. Assembly continues; however, no more instructions will be issued and an object file will not be generated. The assembly code must be partitioned into smaller units in order to assemble.

An optional label can be included on a line before every instruction. If included, the optional label is not redefinable and receives a value equal to the value of the current location counter.

Machine instruction syntax is completely mnemonic based. All instruction formats and descriptions can be found in Chapter 4, page 45. Contact your systems support staff for more information.

3.5 Micros

Through the use of micros, you can assign a name to a character string, number, or register identifier and subsequently refer to the item by its name. A reference to a micro causes the character string, number, or register identifier to replace the name before assembly of the source statement containing the reference. Micro names are case sensitive. The <- character sequence assigns the name to the micro value.

The following are examples of valid micro assignments:

    str <- "tempstring"    ;String micro
    num <- 5               ;Numeric micro
    sink <- r31            ;Register micro

Micros are described in Chapter 7, page 133.

3.6 regnum function

The regnum function is a built-in function. The regnum function takes the name of a floating-point or integer register as an argument and returns the register number as an integer. The regnum function can only be used in expressions within macros.
In the following example the symbol S receives the value 3:

    S = regnum(R3)

3.7 DEX function

The DEX function is a built-in function. It references a delayed expression (DEX) by taking a DEX expression number as a single argument. The referenced DEX expression appears within the DEX section of the .ident in which the DEX reference appears. The value of the DEX expression is used as the value of the DEX function. DEX expressions are described in more detail in Section 3.15.3, page 42.

3.8 Location counter

The CAM location counter, specified by a period (.), identifies the current offset of a program within a program section. The location counter can be used in any integer constant expression on the right side of a local set (=) operation. The value specified by the user is assumed to be in bytes. It can also be the result of a local set operation. When used as the result, the offset in the current program section will be reset with the value of the integer constant expression on the right side of the set.

Note: The location counter (.) can be set either forward or backward in the program section.

In the following example, note the two columns of hexadecimal numbers following the line number. The left column is the offset in the program section. Notice the change in offset after the location counter (.) assignment.

    instruction_test -MPP Assembler: Version 2.2 (ed 2706) page: 1

     1                    .ident instruction_test
     2
     3                    .start begin
     4
     5                    .psect INS_DATA,data
     6                    data: .blkq 50
     7 00000000 0000000a  quad: .quad 10
     8                    bpt = 24
     9                    fred = .
    10
    11                    .psect INSTRS,code
    12                    . = . + fred
    13                    begin::
    14 00000cc0 203f0020  lda R1,32(r31)
    15 00000ce0 203f0018  lda R1,bpt(r31)
    16 00000d00 203f0000  lda R1,quad(r31)
    17                    . = . + 52
    18 00000ec0 20ab0018  lda R5,bpt & ^xffff(R11)
    19 00000ee0 d33fff88  bsr r25,begin
    20 00000f00 45cf0410  bis R14,R15,R16
    21 00000f20 6aad0000  jmp R21,(r13),21
    22 00000f40 463ff412  bis R17,^xff,R18
    23
    24 00000f60 601d8000  fetch 0(R29)
    25
    26 00000f80 00000018  call_pal bpt
    27 00000fa0 00000078  call_pal 120
    28                    . = 0
    29
    30 00000000 44220403  bis R1,R2,R3
    31 00000020 44220403  bis R1,R2,R3
    32 00000040 44220403  bis R1,R2,R3
    33
    34
    35                    end:

3.9 Register designators

Register designators are used in mnemonic machine instructions to identify the register(s) that will be used in an operation. Registers are identified by an alphabetic character followed by a decimal number. Registers are designated as integer registers or floating-point registers. The integer registers are identified by R or r and the floating-point registers are identified by F or f. For example, R1 is a register from the integer set of registers. The valid range of registers is 0 through 31 for both integer and floating-point registers.

CAM accepts register mnemonics that are specified in uppercase or lowercase. For example:

    addl R20,r21,R22    ;Perform an integer
                        ;add of registers R20
                        ;and r21 placing the
                        ;result in R22.
    divs f2,F3,f4       ;Perform a floating
                        ;point divide of f2
                        ;and F3,the result is
                        ;in f4.

A register can be renamed to any legal identifier by using the CAM micro facility. This redefinition must come prior to use and can be anywhere in the program. This allows for programs to be written using symbolic names of registers that represent specific functions. For example, R31 is generally used as a sink register for many instructions. The following example shows the syntax of the redefinition:

    sink <- R31
3.10 Identifiers

Identifiers are used to identify macros, micros, program modules, program sections, symbols, and labels. An identifier can be from 1 through 128 characters in length and can contain any of the following:

• Alphabetic characters A through Z (uppercase or lowercase)
• Decimal digits 0 through 9
• Dollar signs ($), periods (.), "at" (@) symbols, or underscores (_)

The following are examples of valid identifiers:

    count     Lowercase alphabetic characters are permitted.
    $ADD      $ is a legal beginning character.
    INDEX@    @ is a valid character.
    ABC5      Combinations of letters and digits are legal.

The following are examples of identifiers that are not valid:

    Y+Z3      The name contains an illegal character.
    9count    The first character is a number.

Note: The first character of an identifier cannot be a number, and special characters should generally be avoided. However, temporary labels can begin with a number.

CAM supports separate name spaces for the following:

• Program modules
• Program sections, symbols, and labels
• Macros
• Micros

Because names of program sections, symbols, and labels occupy the same name space, they cannot share the same name. For example, a program section (psect) can have the same name as a macro, but not a symbol or label. The example that follows illustrates the use of the same name to identify both a program section and a macro:

    .ident name
    .psect name
    .macro name
    (macro coding here)
    .endm
    .end

3.11 Symbols

Symbols are defined by the user. They can be defined as either local or global. The symbol is assigned the value of the expression on the right side of the operator.

Local symbols must be defined prior to use and terminated by an equal sign (=). Symbols defined as local can be referenced from anywhere within the assembly file following their definition.
Local symbols can only be defined using constant integer expressions and can be redefined later in the program.

Global symbols can be referenced only within the program module in which they are defined and are terminated by a double equal sign character (==). They can be referenced outside of the program module in which they are defined only through the use of the .external assembler directive. Global symbols can be defined using constant integer expressions or DEX expressions and cannot be redefined.

The following are examples of valid symbol definitions:

    SYM1 = 1      ;Defines a local symbol to
                  ;have the value 1
    SYM2 == 2     ;Defines a global symbol to have
                  ;the value 2
    SYM3=3        ;Symbols may or may not
    SYM4 == 4     ;have spaces separating the
                  ;symbol, the equal sign,
                  ;and the value

User symbol definitions are always the first item on a statement line and, generally, start in the first column; however, they are not restricted to the first column. Because symbols, labels, and program sections share the same name space, their names must be distinct.

3.12 Labels

A label is an identifier that identifies a location in a program. Labels are created and named by users. If a label definition is used in a statement, it must be the first item to appear on the statement line. Because labels, symbols, and program sections share the same name space, their names must be distinct.

CAM supports three types of labels:

• Local labels are defined by terminating an identifier with a colon (:). Local labels can only be referenced by the program module in which they are defined. The following are legal local labels:

    local:
    label:

• Global labels are defined by terminating an identifier with a double colon (::). Global labels can be used by other program modules through the use of the .external assembler directive. The following are legal global labels:
    global::
    label::

• Temporary labels are defined by the user. They are defined by terminating a numeric string of characters with a dollar sign ($) and a single colon (:). The scope of temporary labels is as follows:

  – Between two local or global labels
  – Between the beginning and ending of a program section
  – Between .save_psect and .restore_psect assembler directives

  User-defined temporary labels must be in a range from 0 through 29,999. Labels between 30,000 and 65,535 are reserved for use by the assembler.

  Note: An identifier that terminates with a dollar sign ($) is not a user-defined temporary label. If such a construct is defined by the user, it will be viewed as a local label. If such a construct is used in a forward reference, it will be classified as unknown until defined by a label definition or the .external assembler directive.

  The following examples are legal temporary labels:

    00223$:
    101$:

  The following example illustrates a forward reference to the temporary label 30$:

    bne r12,30$
    .
    .
    30$:

3.13 Operators

Assembly time operators can operate on integer values, floating-point expression values, or DEX expression values.

Mixed mode operators have as one operand an integer value and as the other a floating-point value. The integer value is converted to floating point and the operation result is floating point.
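The mixed mode rule above can be illustrated with a short Python sketch. It is only a model of the promotion rule as stated in this section, not CAM's implementation; the function name mixed_mode_apply and the operator table are hypothetical. The sketch deliberately accepts only mixed operand pairs, because the behavior of pure integer operations (for example, integer division) is not described here.

```python
import operator

# Operators that this section lists as accepting mixed mode operands.
MIXED_MODE_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def mixed_mode_apply(symbol, left, right):
    """Apply a binary operator to one integer and one floating-point operand."""
    if isinstance(left, float) == isinstance(right, float):
        raise TypeError("expected exactly one integer and one floating-point operand")
    # The integer operand is converted to floating point first,
    # so the result is always floating point.
    return MIXED_MODE_OPS[symbol](float(left), float(right))


print(mixed_mode_apply("+", 2, 1.5))
# prints 3.5
```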
In the following table, operators are classified by the type of operand they use:

Operator                                     Symbol    Operand types
One's complement                             ~         Integer, DEX
Unary minus                                  -         Integer, floating-point, mixed mode, DEX
Unary plus                                   +         Integer, floating-point, mixed mode, DEX
Binary plus                                  +         Integer, floating-point, mixed mode, DEX
Binary minus                                 -         Integer, floating-point, mixed mode, DEX
Multiplication                               *         Integer, floating-point, mixed mode, DEX
Division                                     /         Integer, floating-point, mixed mode, DEX
Remainder                                    %         Integer, DEX
Left shift                                   <<        Integer, DEX
Right shift                                  >>        Integer, DEX
Bitwise AND (logical product)                &         Integer, DEX
Bitwise exclusive OR (logical difference)    \         Integer, DEX
Bitwise OR                                   |         Integer, DEX
Exponentiation                               pow       DEX
Log to the base 2                            log2      DEX
Maximum                                      max       DEX
Minimum                                      min       DEX
Mask left                                    #<        DEX
Mask right                                   #>        DEX
Arithmetic right shift                       #>>       DEX
Selection                                    ?         DEX
Selection else                               :         DEX
Cast                                         mode      DEX
Mode                                         (mode)    DEX

Note: Operators pow, log2, max, and min are recognized only if the user has not redefined the names before the operators occur in an expression.

3.13.1 Operator precedence

The operators associated with CAM 2.2 are summarized in the following table, in order of descending precedence:

Order            Operator   Description
Right to left    ~          One's complement
                 -          Negative
                 +          Unary plus
                 log2       Log to the base 2
                 mode       Cast
                 (mode)     Mode set
Left to right    pow        Exponentiation
Left to right    *          Multiplication
                 /          Division
                 max        Maximum value of its operands
                 min        Minimum value of its operands
                 %          Remainder
Left to right    +          Binary add
                 -          Binary minus
Left to right    <<         Left shift (zero fill)
                 >>         Right shift (zero fill)
                 #<         Mask left
                 #>         Mask right
                 #>>        Right shift with sign extension
Left to right    ?          Selection
Left to right    :          Selection else
Left to right    &          Bitwise AND (logical product)
Left to right    \          Bitwise exclusive OR (logical difference)
Left to right    |          Bitwise OR (logical sum)

3.13.2 Operator definitions

The operators used in CAM 2.2 are defined as follows:

Operator   Definition
~          One's complement
-          Unary minus
+          Unary plus
+          Binary plus
-          Binary minus
*          Multiplication
/          Division
%          Remainder
<<         Left shift
>>         Right shift
&          Bitwise AND (logical product)
\          Bitwise exclusive OR (logical difference)
|          Bitwise OR
pow        Exponentiation
log2       Log to the base 2
max        Maximum value of its operands
min        Minimum value of its operands
#<         Mask left
#>         Mask right
#>>        Right shift with sign extension
?          The left operand is evaluated, and its value is used to select one of the values of the following selection else (:) operator.
:          The value of the preceding selection operator (?) is used to select the left operand if the value is 1. Otherwise, the right operand is selected.
mode       Its operand is converted to mode.
(mode)     The mode of a DEX expression evaluation is set to mode.

3.14 Numeric constants

Numeric constants can be floating-point numbers or integers. Floating-point constants are specified in scientific notation or contain a decimal point. Integer constants are specified either as a string of digits without a decimal point or in radix format. The radix format has the following forms:

• Radix operator preceding a digit string

  Radices can be specified in binary, decimal, octal, or hexadecimal format. A radix is specified by a caret (^) followed by a radix indicator. The following listing describes the radices that can be specified in CAM assembly language.
  Radix indicator   Specification
  ^B or ^b          The following binary digits (0 and 1) form an unsigned binary constant.
  ^D or ^d          The following decimal digits (0 through 9) form an unsigned decimal constant.
  ^O or ^o          The following octal digits (0 through 7) form an unsigned octal constant.
  ^X or ^x          The following hexadecimal digits (0 through 9, A through F, and a through f) form an unsigned hexadecimal constant.

  The following are examples of radix specifications:

    ^B0101    Specifies a binary constant
    ^d99      Specifies a decimal constant
    ^o127     Specifies an octal constant
    ^XFF      Specifies a hexadecimal constant

• Hexadecimal C format

  The hexadecimal C format is specified by the 0x designation followed by a string of hexadecimal digits (0 through 9, A through F, and a through f). The following is an example of the hexadecimal C format:

    0xA72

• Octal C format

  The octal C format is specified by the 0 designation followed by a string of octal digits (0 through 7). In octal C format, the decimal number 511 would be specified as follows:

    0777

  Note: In octal C format, if the digits 8 or 9 appear, the constant is a decimal constant. For example, if 078 appears, the number is decimal 78.

3.15 Expressions

Expressions in CAM assembly language are either integer or floating-point expressions. Expressions can contain symbol names, label names, and the current location indicator (specified by .). The value of a local symbol replaces the local symbol name, and the value of the current location counter replaces the current location indicator. Types of expressions cannot be mixed. Expressions in which global symbols or labels appear are known as DEX expressions.
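The integer constant forms of Section 3.14 can be summarized as a small Python sketch. It is a model of the rules as stated there, not the assembler's own parser; the function name parse_integer_constant is hypothetical, and error handling for malformed input is left to Python's int().

```python
# Radix indicators from Section 3.14, keyed by lowercase letter.
_RADIX = {"b": 2, "d": 10, "o": 8, "x": 16}


def parse_integer_constant(text):
    """Return the value of an integer constant (hypothetical helper).

    Handles the caret radix form (^B, ^D, ^O, ^X in either case),
    the hexadecimal C form (0x...), the octal C form (leading 0),
    and plain decimal digit strings.
    """
    if text.startswith("^"):
        base = _RADIX[text[1].lower()]
        return int(text[2:], base)
    if text.startswith("0x"):
        return int(text[2:], 16)
    if len(text) > 1 and text.startswith("0"):
        # Octal C format, unless an 8 or 9 appears, in which case the
        # constant is decimal (the 078 case in the note).
        if all(ch in "01234567" for ch in text):
            return int(text, 8)
        return int(text, 10)
    return int(text, 10)


examples = {s: parse_integer_constant(s)
            for s in ("^B0101", "^d99", "^o127", "^XFF", "0xA72", "0777", "078")}
```

The entries in examples correspond to the constants used as illustrations in Section 3.14; for instance, 0777 evaluates to decimal 511 and 078 to decimal 78.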
The subsections that follow describe the implementation of constant integer, floating-point, and DEX expressions as they apply to CAM.

3.15.1 Constant integer expressions

Constant integer expressions can use any operator that is not specific to DEX expressions, and every value must be one of the following:

• An integer constant
• A local symbol whose value is an integer
• The current location indicator (specified by .)
• Micros that have an expanded value that is an integer
• The regnum function

CAM 2.2 subexpression values are limited to 64 bits or less in size. If overflow occurs, an appropriate ERROR message is issued to the user and assembly continues.

The following is an example of an integer expression:

    s = 10*2

3.15.2 Floating-point expressions

Floating-point expressions can be used to define IEEE double-precision numbers. They can be used only with the floating-point data description assembler directives, which allow the user to initialize words of data by specifying an expression. The assembler calculates the value. See Section 5.2, page 90, for more information on the data description directives.

The following example illustrates the use of a floating-point expression:

    X = 8                       ;Assign a value to X
    i = 2                       ;Assign a value to i
    .t_floating (2.0*X)/3*i     ;Initialize a long word
                                ;of data with the value
                                ;of the expression

3.15.3 DEX expressions

Delayed expressions (DEX) are constant integer expressions that can contain global symbol names, label names, and references to other DEX expressions in addition to integer and floating-point expressions. They can also use DEX expression extended operators.
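As a cross-check of the floating-point example in Section 3.15.2, the following Python sketch evaluates (2.0*X)/3*i with X = 8 and i = 2 and packs the result as an IEEE double-precision word. It assumes the expression is evaluated left to right in double precision, with the integer operands converted to floating point; the variable names mirror the example, and everything else is illustrative.

```python
import struct

X = 8   # same value as the symbol X in the example
i = 2   # same value as the symbol i in the example

# Python converts the integer operands to floating point here, which is
# the behavior the mixed mode rule describes.
value = (2.0 * X) / 3 * i

# Reinterpret the double as a 64-bit unsigned integer to see the bit
# pattern an IEEE double-precision word would hold.
bits = struct.unpack(">Q", struct.pack(">d", value))[0]
hex_word = f"{bits:016x}"
```

The value comes out as 32/3 (approximately 10.67), and hex_word holds the 16 hexadecimal digits of the corresponding 64-bit word.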
DEX expressions can appear in any of the following:

• Instruction operands, anywhere that an integer expression can appear
• The DEX section
• Global symbol initialization expressions
• Distribution specifications on .psect declarations
• The size, value, or repeat count on data directives

References to DEX expressions are of the following form:

    DEX(constant integer expression)

Such references can appear anywhere in a DEX expression as the operand of an operator or as a subexpression. Each referenced DEX expression must be defined in the DEX section that appears at the end of the .ident in which the reference occurs. The DEX section is bounded by the .dexstart and .dexend assembler directives.

The default evaluation mode for a DEX expression is long (64-bit integer); however, it can be set by using the following syntax:

    (mode)

The type of an individual item within a DEX expression can be changed by applying the cast operator, mode, to that item.

The mode variable is specified in the following form:

    #[sign]mode_type

The sign variable is specified as either signed or unsigned. The default is signed. The mode_type variable can be specified as any of the following values:

    char       8-bit character
    complex    64-bit complex
    float      32-bit floating point
    float16    16-bit floating point
    int        32-bit integer
    long       64-bit integer
    short      16-bit integer

If long is specified for mode_type, one of the following values must also be specified:

    double            128-bit integer
    double complex    128-bit complex
    complex           64-bit complex

CAM Instruction Set [4]

This section describes the set of instructions and the mnemonics supported by Cray Assembler for MPP (CAM) 2.2. See the Alpha AXP Architecture Handbook, publication TPD-0007, or the Alpha AXP Architecture Handbook, publication TPD-0012, for more information.

Note: Instructions are valid only when used within a code program section.
4.1 Operand qualifiers

Operand qualifiers provide variations to various integer and floating-point instructions. Qualifiers are specified by immediately following an instruction with a forward slash (/) and the letters that specify each particular qualifier. Table 1, page 45, lists the operand qualifiers available to users of CAM.

Table 1. Operand qualifiers

Operand qualifier   Description

c   Results of floating-point operations are rounded toward zero (chopped). Negative numbers become smaller negative numbers and positive numbers become smaller positive numbers.

d   Results of floating-point operations are rounded according to the rounding mode specified in the floating-point control register (FPCR). The rounding mode specified in the FPCR is dynamic. The FPCR is accessed by the mf_fpcr (read) and the mt_fpcr (write) instructions. For more information on these instructions, see Table 14, page 79.

i   Inexact result trapping is enabled. An inexact result occurs if the infinitely precise result differs from the rounded result. If an inexact result occurs and the qualifier is not specified in the instruction, the rounded result is stored in the result register.

m   Results of floating-point operations are rounded to the smaller of the two surrounding representable results (toward minus infinity). Negative numbers become larger negative numbers and positive numbers become smaller positive numbers.

s   Software trapping is enabled. Software assistance is provided for programs that deliberately operate outside the underflow/overflow range to complete floating-point operations correctly.

u   An underflow arithmetic trap is enabled. An underflow occurs if the rounded result of an operation is smaller than the smallest finite number of the destination format. If underflow occurs and the qualifier is not specified in the instruction, zero is stored in the result register.

v   Integer overflow checking is enabled. In conversions from floating point to integer, integer overflow occurs if the rounded result is outside the range -2^63 through 2^63-1. In conversions from 64-bit integers to 32-bit integers, integer overflow occurs if the result is outside the range -2^31 through 2^31-1. If integer overflow occurs and the qualifier is not specified in the instruction, the true result truncated to the low-order 64 bits is stored in the result register.

4.2 CAM instructions

The CAM instruction set includes instructions of the following types:

• Integer load and store
• Integer control
• Integer arithmetic
• Integer compare
• Logical
• Conditional move
• Shift
• Byte manipulation
• Floating-point load and store
• Floating-point control
• Floating-point copy
• Floating-point conversion
• Floating-point move
• Floating-point compare
• Floating-point arithmetic
• Miscellaneous

Each subsection that follows provides an overall description and a format for the instructions within the instruction type. The table within each subsection provides a description and the valid qualifiers (if applicable) for each mnemonic.

Note: In the tables, the term literal refers to a DEX expression that does not contain label references. The term target_address refers to a DEX expression that specifies an address in memory.

4.2.1 Integer load and store instructions

The integer load and store instructions move data between integer registers and memory. The format for these instructions is as follows:

    mnemonic RA,literal[(RB)]

If the term (RB) is not supplied by the user, the assembler substitutes R31.
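The effect of omitting (RB) can be sketched in Python. This is a model under two assumptions that come from the Alpha architecture rather than from this manual: R31 always reads as zero, and the literal is treated as a 16-bit signed displacement. The names sign_extend and effective_address are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def effective_address(literal, rb_value=0):
    """Model the address formed by literal(RB), wrapped to 64 bits.

    rb_value defaults to 0 to model the assembler substituting R31
    (assumed to read as zero) when (RB) is omitted.
    """
    return (rb_value + sign_extend(literal, 16)) & MASK64


addr_without_rb = effective_address(0x10)         # literal only, RB = R31
addr_with_rb = effective_address(0x10, 0x1000)    # literal(RB) with RB = 0x1000
addr_negative = effective_address(0xFFF8, 0x1000) # 0xFFF8 sign-extends to -8
```

Under these assumptions, a load or store written without (RB) addresses memory at the sign-extended literal itself.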
Table 2, page 48, describes the integer load and store instructions. Operand qualifiers are not valid for this type of instruction.

Table 2. Integer load and store instructions

Mnemonic     la
Description  The 48-bit constant specified by target_address is loaded into the specified register. The assembler generates the instructions required to form the constant.
Usage        la R10,target_address

Mnemonic     lal
Description  The sign-extended lower 16 bits (0–15) of literal are added to the contents of the register specified in parentheses, and the result is stored in the first specified register.
Usage        lal R10,literal(R11)

Mnemonic     lalm
Description  The sign-extended lower-middle 16 bits (16–31) of literal are multiplied by 65,536 and added to the contents of the register specified in parentheses. The sum is stored in the first specified register.
Usage        lalm R10,literal(R11)

Mnemonic     lau
Description  The sign-extended upper 16 bits (48–63) of literal are multiplied by 65,536 and added to the contents of the register specified in parentheses. The sum is stored in the first specified register. The value stored must be shifted into the correct alignment using a separate shift instruction.
Usage        lau R10,literal(R11)

Mnemonic     laum
Description  The sign-extended upper-middle 16 bits (32–47) of literal are added to the contents of the register specified in parentheses, and the result is stored in the first specified register. The value stored must be shifted into the correct alignment using a separate shift instruction.
Usage        laum R10,literal(R11)

Mnemonic     lda
Description  The result of adding the sign-extended literal to the contents of the register specified in parentheses is stored in the first specified register.
Usage        lda R10,literal(R11)

Mnemonic     ldah
Description  The result of multiplying the sign-extended literal by 65,536 is added to the contents of the register specified in parentheses. The sum is stored in the first specified register.
Usage        ldah R10,literal(R11)

Mnemonic     ldl
Description  The 32-bit value located in memory at the address specified by the sum of the sign-extended literal and the contents of the register specified in parentheses is stored in the first specified register.
Usage        ldl R10,literal(R11)

Mnemonic     ldl_l
Description  The implementation of this instruction for Cray MPP systems differs from that of other Alpha architectures. This instruction loads a 32-bit value from the DTB annex portion of the Cray Research supporting logic. This value is stored in the first specified register.
Usage        ldl_l R10,literal(R11)

Mnemonic     ldq
Description  The 64-bit value located in memory at the address specified by the sum of the sign-extended literal and the contents of the register specified in parentheses is stored in the first specified register.
Usage        ldq R10,literal(R11)

Mnemonic     ldq_l
Description  The implementation of this instruction for Cray MPP systems differs from that of other Alpha architectures. This instruction loads a 64-bit value from the DTB annex portion of the Cray Research supporting logic. This value is stored in the first specified register.
Usage        ldq_l R10,literal(R11)

Mnemonic     ldq_u
Description  The contents of the memory location formed by adding the sign-extended literal to the contents of the register in parentheses and then clearing the low-order 3 bits of that 64-bit value are stored in the first specified register.
Usage        ldq_u R10,literal(R11)

Mnemonic     li
Description  The 64-bit constant specified by constant is loaded into the specified register. The assembler generates the instructions required to form the constant.
Usage        li R10,constant

Mnemonic     stl
Description  The 32-bit value in the first specified register is stored at the memory address specified by the sum of the sign-extended literal and the contents of the register specified in parentheses.
Usage        stl R10,literal(R11)

Mnemonic     stl_c
Description  The implementation of this instruction for Cray MPP systems differs from that of other Alpha architectures. This instruction stores a 32-bit value from the specified register to the DTB annex portion of the Cray Research supporting logic.
Usage        stl_c R10,literal(R11)

Mnemonic     stq
Description  The 64-bit value in the first specified register is stored at the memory location specified by the sum of the sign-extended literal and the contents of the register specified in parentheses.
Usage        stq R10,literal(R11)

Mnemonic     stq_c
Description  The implementation of this instruction for Cray MPP systems differs from that of other Alpha architectures. This instruction stores a 64-bit value from the specified register to the DTB annex portion of the Cray Research supporting logic.
Usage        stq_c R10,literal(R11)

Mnemonic     stq_u
Description  The contents of the first specified register are stored at the memory location formed by adding the sign-extended literal to the contents of the register in parentheses and then clearing the low-order 3 bits of that 64-bit value.
Usage        stq_u R10,literal(R11)

4.2.2 Integer control instructions

The integer control instructions provide a means of transferring program control to a different location within the program. The updated program counter (PC) is an implied register in all of these instructions.

Instructions beq through bsr are branch instructions and use the following format:

    mnemonic RA,target_address

Instructions jmp through ret are jump instructions and use the following format:

    mnemonic RA,(RB)[,literal]

Table 3, page 53, describes the integer control instructions. Operand qualifiers are not valid for this type of instruction.

Table 3. Integer control instructions

Mnemonic     beq
Description  If the specified register is equal to zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
Usage        beq R10,target_address

Mnemonic     bge
Description  If the specified register is greater than or equal to zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
Usage        bge R10,target_address

Mnemonic     bgt
Description  If the specified register is greater than zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
Usage        bgt R10,target_address

Mnemonic     blbc
Description  If the low bit of the specified register is clear, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
Usage        blbc R10,target_address

Mnemonic     blbs
Description  If the low bit of the specified register is set, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
Usage        blbs R10,target_address

Mnemonic     ble
Description  If the specified register is less than or equal to zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
Usage        ble R10,target_address

Mnemonic     blt
Description  If the specified register is less than zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
Usage        blt R10,target_address

Mnemonic     bne
Description  If the specified register is not equal to zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
Usage        bne R10,target_address

Mnemonic     br
Description  The current program counter address is written to the specified register and the specified target address is loaded into the program counter. The program continues with the next sequential instruction at that point.
Usage        br R10,target_address

Mnemonic     bsr
Description  The current program counter address is written to the specified register and the specified target address is loaded into the program counter. The program continues with the next sequential instruction at that point.
Usage        bsr R10,target_address

Mnemonic     jmp
Description  The program counter value of the instruction following the jmp instruction is written to the first specified register, and program execution continues at the address specified in the register in parentheses.
Usage        jmp R10,(R11)[,literal]

Mnemonic     jsr
Description  The program counter value of the instruction following the jsr instruction is written to the first specified register, and program execution continues at the address specified in the register in parentheses until the ret instruction is encountered.
Usage        jsr R10,(R11)[,literal]

Mnemonic     jsr_coroutine
Description  The program counter value of the instruction following the jsr_coroutine instruction is written to the first specified register, and program execution continues at the address specified in the register in parentheses until the end of the subroutine. Program execution then returns to the original address.
Usage        jsr_coroutine R10,(R11)[,literal]

Mnemonic     ret
Description  Signals the end of a subroutine and returns execution of the program to the original address.
Usage        ret R10,(R11)[,literal]

4.2.3 Integer arithmetic instructions

The integer arithmetic instructions perform add, subtract, multiply, and signed/unsigned compare operations.
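The 32-bit and 64-bit forms of these operations differ mainly in how the result is truncated and sign extended. The following Python sketch models a few of them without the v qualifier, assuming that an overflowing result simply wraps to the low-order bits (as Table 4 states for mull and mulq). The function names follow the mnemonics, but the helper sext and the whole model are illustrative, not the assembler's or the hardware's definition.

```python
MASK64 = (1 << 64) - 1


def sext(value, bits):
    """Keep the low `bits` bits of value and read them as a signed number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def addl(ra, rb):
    """Sign-extended 32-bit sum."""
    return sext(ra + rb, 32)


def addq(ra, rb):
    """64-bit sum."""
    return sext(ra + rb, 64)


def s4addq(ra, rb):
    """First operand scaled by 4, then added to the second (64-bit)."""
    return sext(4 * ra + rb, 64)


def umulh(ra, rb):
    """High-order 64 bits of the unsigned 128-bit product."""
    return ((ra & MASK64) * (rb & MASK64)) >> 64


wrapped32 = addl(0x7FFFFFFF, 1)   # wraps to the most negative 32-bit value
scaled = s4addq(3, 5)             # 4*3 + 5
high = umulh(1 << 63, 4)          # 2^65 >> 64
```

With the v qualifier, the wrapped32 case would be reported as an integer overflow instead of silently wrapping.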
The format for these instructions can be either of the following:

    mnemonic/v RA,RB,RC
    mnemonic/v RA,literal,RC

Table 4, page 57, describes the integer arithmetic instructions. The v qualifier is optional with some of these instructions. See Table 1, page 45, for information on the v qualifier.

Table 4. Integer arithmetic instructions

Mnemonic     addl/v
Description  Performs addition of the contents of two specified integer registers or a register and a literal and places the sign-extended 32-bit sum in a third specified register.
Usage        addl R20,R21,R22
             addl R20,literal,R22

Mnemonic     addq/v
Description  Performs addition of the contents of two specified integer registers or a register and a literal and places the 64-bit sum in a third specified register.
Usage        addq R20,R21,R22
             addq R20,literal,R22

Mnemonic     mull/v
Description  The first specified register is multiplied by the second specified register or a literal, and the sign-extended 32-bit product is stored in the third specified register. On overflow, the proper sign extension of the least significant 32 bits is written to the destination register.
Usage        mull R10,R11,R12
             mull R10,literal,R12

Mnemonic     mulq/v
Description  The first specified register is multiplied by the second specified register or a literal, and the 64-bit product is stored in the third specified register. On overflow, the least significant 64 bits are written to the destination register.
Usage        mulq R10,R11,R12
             mulq R10,literal,R12

Mnemonic     s4addl/v
Description  The first specified register is scaled by 4 and added to the second specified register or a literal. The sign-extended 32-bit sum is stored in the third specified register.
Usage        s4addl R10,R11,R12
             s4addl R10,literal,R12

Mnemonic     s4addq/v
Description  The first specified register is scaled by 4 and added to the second specified register or a literal. The 64-bit sum is stored in the third specified register.
Usage        s4addq R10,R11,R12
             s4addq R10,literal,R12

Mnemonic     s4subl/v
Description  The first specified register is scaled by 4 and the second specified register or a literal is subtracted from it. The sign-extended 32-bit difference is stored in the third specified register.
Usage        s4subl R10,R11,R12
             s4subl R10,literal,R12

Mnemonic     s4subq/v
Description  The first specified register is scaled by 4 and the second specified register or a literal is subtracted from it. The 64-bit difference is stored in the third specified register.
Usage        s4subq R10,R11,R12
             s4subq R10,literal,R12

Mnemonic     s8addl/v
Description  The first specified register is scaled by 8 and added to the second specified register or a literal. The sign-extended 32-bit sum is stored in the third specified register.
Usage        s8addl R10,R11,R12
             s8addl R10,literal,R12

Mnemonic     s8addq/v
Description  The first specified register is scaled by 8 and added to the second specified register or a literal. The 64-bit sum is stored in the third specified register.
Usage        s8addq R10,R11,R12
             s8addq R10,literal,R12

Mnemonic     s8subl/v
Description  The first specified register is scaled by 8 and the second specified register or a literal is subtracted from it. The sign-extended 32-bit difference is stored in the third specified register.
Usage        s8subl R10,R11,R12
             s8subl R10,literal,R12

Mnemonic     s8subq/v
Description  The first specified register is scaled by 8 and the second specified register or a literal is subtracted from it. The 64-bit difference is stored in the third specified register.
Usage        s8subq R10,R11,R12
             s8subq R10,literal,R12

Mnemonic     subl/v
Description  The second specified register or a literal is subtracted from the first specified register, and the sign-extended 32-bit difference is stored in the third specified register.
Usage        subl R10,R11,R12
             subl R10,literal,R12

Mnemonic     subq/v
Description  The second specified register or a literal is subtracted from the first specified register, and the 64-bit difference is stored in the third specified register.
Usage        subq R10,R11,R12
             subq R10,literal,R12

Mnemonic     umulh
Description  The first specified register and the second specified register or a literal are multiplied as unsigned numbers. The high-order 64 bits of the 128-bit product are stored in the third specified register.
Usage        umulh R10,R11,R12
             umulh R10,literal,R12

4.2.4 Integer compare instructions

Integer compare instructions perform signed and unsigned comparisons between the contents of two integer registers, or an integer register and a literal, and write a 1 or a 0 to a third register based upon the relationship specified by the mnemonic. The format for the compare instructions can be either of the following:

    mnemonic RA,RB,RC
    mnemonic RA,literal,RC

Table 5, page 60, describes the integer compare instructions. Operand qualifiers are not valid for this type of instruction.

Table 5. Integer compare instructions

Mnemonic     cmpeq
Description  Performs a signed comparison between the value of the first specified register and the value of the second specified register or a literal and writes a 1 to the third specified register if they are equal or a 0 if they are not equal.
Usage        cmpeq R10,R11,R12
             cmpeq R10,literal,R12

Mnemonic     cmple
Description  Performs a signed comparison between the value of the first specified register and the value of the second specified register or a literal and writes a 1 to the third specified register if the value of the first specified register is less than or equal to the value of the second specified register or literal. A 0 is written to the third specified register if the first specified register is greater than the second specified register or literal.
Usage        cmple R10,R11,R12
             cmple R10,literal,R12

Mnemonic     cmplt
Description  Performs a signed comparison between the value of the first specified register and the value of the second specified register or a literal and writes a 1 to the third specified register if the value of the first specified register is less than the value of the second specified register or literal. A 0 is written to the third specified register if the first specified register is greater than or equal to the second specified register or literal.
Usage        cmplt R10,R11,R12
             cmplt R10,literal,R12

Mnemonic     cmpule
Description  Performs an unsigned 32-bit comparison between the value in the first specified register and the value in the second specified register or a literal. If the value in the first specified register is less than or equal to the value in the second specified register or the literal, a 1 is written to the third specified register. Otherwise, a 0 is written to the third specified register.
Usage        cmpule R10,R11,R12
             cmpule R10,literal,R12

Mnemonic     cmpult
Description  Performs an unsigned 32-bit comparison between the value in the first specified register and the value in the second specified register or a literal. If the value in the first specified register is less than the value in the second specified register or the literal, a 1 is written to the third specified register. Otherwise, a 0 is written to the third specified register.
Usage        cmpult R10,R11,R12
             cmpult R10,literal,R12

4.2.5 Logical instructions

The logical instructions perform 64-bit Boolean operations between two integer registers or an integer register and a literal. The format for these instructions can be either of the following:

    mnemonic RA,RB,RC
    mnemonic RA,literal,RC

Note: The NOT function can be performed by doing an ORNOT with zero (RA = R31).

Table 6, page 62, describes the logical instructions.
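The note about NOT can be checked with a small Python model of the 64-bit logical operations. The sketch assumes that R31 reads as zero, as in the Alpha architecture, and that the complement forms (bic, ornot, eqv) complement the second operand; the function names follow the mnemonics, with a trailing underscore where a name would clash with a Python keyword.

```python
MASK64 = (1 << 64) - 1
R31 = 0  # assumed to always read as zero


def and_(ra, rb):
    """Logical product."""
    return ra & rb & MASK64


def bic(ra, rb):
    """Logical product with complement: ra AND (NOT rb)."""
    return ra & ~rb & MASK64


def bis(ra, rb):
    """Logical sum (OR)."""
    return (ra | rb) & MASK64


def eqv(ra, rb):
    """Logical equivalence (XORNOT): ra XOR (NOT rb)."""
    return ~(ra ^ rb) & MASK64


def ornot(ra, rb):
    """Logical sum with complement: ra OR (NOT rb)."""
    return (ra | ~rb) & MASK64


def xor(ra, rb):
    """Exclusive OR."""
    return (ra ^ rb) & MASK64


def not_(rb):
    """NOT expressed as ORNOT with RA = R31."""
    return ornot(R31, rb)


inverted = not_(0xFF)  # every bit set except the low eight
```

Because ORing with zero leaves the complemented operand unchanged, ornot with RA = R31 yields the bitwise complement of RB.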
Operand qualifiers are not valid for this type of instruction. Table 6. Logical instructions Mnemonic Description Usage and Performs the Boolean logical product function between the contents of two specified integer registers or an integer register and a literal and places the result in a third specified integer register. and R20,R21,R22 and R20,literal,R20 bic Performs the Boolean logical product with complement function between the contents of two specified integer registers or an integer register and a literal and places the result in a third specified integer register. bic R20,R21,R22 bic R20,literal,R20 62 Cray Research, Inc. SR–2510 2.2 Cray Assembler for MPP (CAM) Reference Manual CAM Instruction Set [4] Mnemonic Description Usage bis Performs the Boolean logical sum (OR) function between the contents of two specified integer registers or an integer register and a literal and places the result in a third specified integer register. bis R20,R21,R22 bis R20,literal,R20 eqv Performs the Boolean logical equivalence function (XORNOT) between the first specified register and the second specified register or a literal and stores the result in the third specified register. eqv R10,R11,R12 eqv R10,literal,R12 ornot Performs the Boolean logical sum with complement function between the contents of two specified integer registers or an integer register and a literal and places the result in a third specified integer register. ornot R10,R11,R12 ornot R10,literal,R12 xor Performs the Boolean exclusive OR function between the first specified register and the second specified register or literal and stores the result in the third specified register. xor R10,R11,R12 xor R10,literal,R12 4.2.6 Conditional move instructions Conditional move instructions move data between integer registers if the relationship specified in the mnemonic is true. The format for these instructions can be either of the following: mnemonic RA,RB,RC SR–2510 2.2 Cray Research, Inc. 
  mnemonic RA,literal,RC

Table 7, page 64, describes the conditional move instructions. Operand qualifiers are not valid for this type of instruction.

Table 7. Conditional move instructions

cmoveq
  Description: If the value of the first specified register is equal to zero, the value of the second specified register or a literal is moved to the third specified register.
  Usage: cmoveq R10,R11,R12
         cmoveq R10,literal,R12

cmovge
  Description: If the value of the first specified register is greater than or equal to zero, the value of the second specified register or a literal is moved to the third specified register.
  Usage: cmovge R10,R11,R12
         cmovge R10,literal,R12

cmovgt
  Description: If the value of the first specified register is greater than zero, the value of the second specified register or a literal is moved to the third specified register.
  Usage: cmovgt R10,R11,R12
         cmovgt R10,literal,R12

cmovlbc
  Description: If the low bit of the first specified register is clear, the value of the second specified register or a literal is moved to the third specified register.
  Usage: cmovlbc R10,R11,R12
         cmovlbc R10,literal,R12

cmovlbs
  Description: If the low bit of the first specified register is set, the value of the second specified register or a literal is moved to the third specified register.
  Usage: cmovlbs R10,R11,R12
         cmovlbs R10,literal,R12

cmovle
  Description: If the value of the first specified register is less than or equal to zero, the value of the second specified register or a literal is moved to the third specified register.
  Usage: cmovle R10,R11,R12
         cmovle R10,literal,R12

cmovlt
  Description: If the value of the first specified register is less than zero, the value of the second specified register or a literal is moved to the third specified register.
  Usage: cmovlt R10,R11,R12
         cmovlt R10,literal,R12

cmovne
  Description: If the value of the first specified register is not equal to zero, the value of the second specified register or a literal is moved to the third specified register.
  Usage: cmovne R10,R11,R12
         cmovne R10,literal,R12

4.2.7 Shift instructions

Shift instructions perform left and right logical shift and right arithmetic shift of data within an integer register. The format for these instructions can be either of the following:

  mnemonic RA,RB,RC
  mnemonic RA,literal,RC

Note: An arithmetic shift left is accomplished by using a logical shift left.

Table 8, page 66, describes the shift instructions. Operand qualifiers are not valid for this type of instruction.

Table 8. Shift instructions

sll
  Description: The first specified register is shifted left by the number of bits specified in the second specified register or a literal. The vacated bit positions become zeros, and the result is stored in the third specified register.
  Usage: sll R10,R11,R12
         sll R10,literal,R12

sra
  Description: The first specified register is shifted right arithmetically by the number of bits specified in the second specified register or a literal. The sign bit is propagated into the vacated bit positions, and the result is stored in the third specified register.
  Usage: sra R10,R11,R12
         sra R10,literal,R12

srl
  Description: The first specified register is shifted right by the number of bits specified in the second specified register or a literal. The vacated bit positions become zeros, and the result is stored in the third specified register.
  Usage: srl R10,R11,R12
         srl R10,literal,R12

4.2.8 Byte manipulation instructions

Byte manipulation instructions perform operations on byte operands within registers. The format for these instructions can be either of the following:

  mnemonic RA,RB,RC
  mnemonic RA,literal,RC
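As a rough illustration of what operating on byte operands within a 64-bit register means, the following Python sketch models four of the byte manipulation instructions (cmpbge, extbl, zap, and zapnot) as functions on integers. It is a model for reasoning about results, not CAM code. It assumes that byte 0 is the least significant byte and that extbl uses only the low 3 bits of its second operand as the byte count, as in the Alpha architecture; the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1  # registers are modeled as 64-bit unsigned values


def byte_of(value: int, index: int) -> int:
    """Return byte 'index' of a 64-bit value (byte 0 is least significant)."""
    return (value >> (8 * index)) & 0xFF


def cmpbge(ra: int, rb: int) -> int:
    """Eight parallel unsigned byte compares; bit i is set if byte i of RA >= byte i of RB."""
    result = 0
    for i in range(8):
        if byte_of(ra, i) >= byte_of(rb, i):
            result |= 1 << i
    return result  # only the low 8 bits can be set; the high 56 bits are zero


def extbl(ra: int, rb: int) -> int:
    """Shift RA right by (RB mod 8) bytes and keep the lowest byte (assumed count rule)."""
    return (ra >> ((rb & 7) * 8)) & 0xFF


def zap(ra: int, rb: int) -> int:
    """Clear byte i of RA wherever bit i of RB is 1."""
    result = ra & MASK64
    for i in range(8):
        if (rb >> i) & 1:
            result &= ~(0xFF << (8 * i)) & MASK64
    return result


def zapnot(ra: int, rb: int) -> int:
    """Clear byte i of RA wherever bit i of RB is 0."""
    return zap(ra, ~rb & 0xFF)


if __name__ == "__main__":
    value = 0x1122334455667788
    print(hex(extbl(value, 1)))      # prints 0x77
    print(hex(zapnot(value, 0x0F)))  # prints 0x55667788
    print(hex(cmpbge(value, value))) # prints 0xff
```

Modeling zapnot as zap with the inverted mask mirrors the two table entries, which differ only in whether a 1 or a 0 bit selects a byte for clearing.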
Table 9, page 67, describes the byte manipulation instructions. Operand qualifiers are not valid for this type of instruction.

Table 9. Byte manipulation instructions

cmpbge
  Description: Performs eight parallel unsigned byte comparisons between corresponding bytes of the first two specified registers or the first specified register and a literal and stores the results in the lower 8 bits of the third specified register. The high 56 bits of the third specified register are set to zero.
  Usage: cmpbge R10,R11,R12
         cmpbge R10,literal,R12

extbl
  Description: The first specified register is shifted right by the number of bytes specified by the value of the second specified register or a literal. The lowest byte is extracted, replaced by zeros, and stored in the third specified register.
  Usage: extbl R10,R11,R12
         extbl R10,literal,R12

extlh
  Description: The first specified register is shifted left by the number of bytes specified by the value of the second specified register or a literal. The highest 4 bytes are extracted, replaced by zeros, and stored in the third specified register.
  Usage: extlh R10,R11,R12
         extlh R10,literal,R12

extll
  Description: The first specified register is shifted right by the number of bytes specified by the value of the second specified register or a literal. The lowest 4 bytes are extracted, replaced by zeros, and stored in the third specified register.
  Usage: extll R10,R11,R12
         extll R10,literal,R12

extqh
  Description: The first specified register is shifted left by the number of bytes specified by the value of the second specified register or a literal. The highest 8 bytes are extracted, replaced by zeros, and stored in the third specified register.
  Usage: extqh R10,R11,R12
         extqh R10,literal,R12

extql
  Description: The first specified register is shifted right by the number of bytes specified by the value of the second specified register or a literal. The lowest 8 bytes are extracted, replaced by zeros, and stored in the third specified register.
  Usage: extql R10,R11,R12
         extql R10,literal,R12

extwh
  Description: The first specified register is shifted left by the number of bytes specified by the value of the second specified register or a literal. The highest 2 bytes are extracted, replaced by zeros, and stored in the third specified register.
  Usage: extwh R10,R11,R12
         extwh R10,literal,R12

extwl
  Description: The first specified register is shifted right by the number of bytes specified by the value of the second specified register or a literal. The lowest 2 bytes are extracted, replaced by zeros, and stored in the third specified register.
  Usage: extwl R10,R11,R12
         extwl R10,literal,R12

insbl
  Description: Shifts the first specified register left by the amount in the second specified register or literal and inserts the low byte into a field of zeros. The result is placed in the third specified register.
  Usage: insbl R10,R11,R12
         insbl R10,literal,R12

inslh
  Description: Shifts the first specified register right by the amount in the second specified register or literal and inserts the high 32 bits into a field of zeros. The result is placed in the third specified register.
  Usage: inslh R10,R11,R12
         inslh R10,literal,R12

insll
  Description: Shifts the first specified register left by the amount in the second specified register or literal and inserts the low 32 bits into a field of zeros. The result is placed in the third specified register.
  Usage: insll R10,R11,R12
         insll R10,literal,R12

insqh
  Description: Shifts the first specified register right by the amount in the second specified register or literal and inserts the high 64 bits into a field of zeros. The result is placed in the third specified register.
  Usage: insqh R10,R11,R12
         insqh R10,literal,R12

insql
  Description: Shifts the first specified register left by the amount in the second specified register or literal and inserts the low 64 bits into a field of zeros. The result is placed in the third specified register.
  Usage: insql R10,R11,R12
         insql R10,literal,R12

inswh
  Description: Shifts the first specified register right by the amount in the second specified register or literal and inserts the high 16 bits into a field of zeros. The result is placed in the third specified register.
  Usage: inswh R10,R11,R12
         inswh R10,literal,R12

inswl
  Description: Shifts the first specified register left by the amount in the second specified register or literal and inserts the low 16 bits into a field of zeros. The result is placed in the third specified register.
  Usage: inswl R10,R11,R12
         inswl R10,literal,R12

mskbl
  Description: The byte to the right of the starting position in the first specified register is set to zero. The value of the second specified register or a literal selects the starting position. The results are stored in the third specified register.
  Usage: mskbl R10,R11,R12
         mskbl R10,literal,R12

msklh
  Description: The 4 bytes to the left of the starting position in the first specified register are set to zero. The value of the second specified register or a literal selects the starting position. The results are stored in the third specified register.
  Usage: msklh R10,R11,R12
         msklh R10,literal,R12

mskll
  Description: The 4 bytes to the right of the starting position in the first specified register are set to zero. The value of the second specified register or a literal selects the starting position. The results are stored in the third specified register.
  Usage: mskll R10,R11,R12
         mskll R10,literal,R12

mskqh
  Description: The 8 bytes to the left of the starting position in the first specified register are set to zero. The value of the second specified register or a literal selects the starting position. The results are stored in the third specified register.
  Usage: mskqh R10,R11,R12
         mskqh R10,literal,R12

mskql
  Description: The 8 bytes to the right of the starting position in the first specified register are set to zero. The value of the second specified register or a literal selects the starting position. The results are stored in the third specified register.
  Usage: mskql R10,R11,R12
         mskql R10,literal,R12

mskwh
  Description: The 2 bytes to the left of the starting position in the first specified register are set to zero. The value of the second specified register or a literal selects the starting position. The results are stored in the third specified register.
  Usage: mskwh R10,R11,R12
         mskwh R10,literal,R12

mskwl
  Description: The 2 bytes to the right of the starting position in the first specified register are set to zero. The value of the second specified register or a literal selects the starting position. The results are stored in the third specified register.
  Usage: mskwl R10,R11,R12
         mskwl R10,literal,R12

zap
  Description: Sets the selected bytes of the first specified register to zero. Bytes are selected by the value of the second specified register or literal. The results are stored in the last specified register. A result byte is set to zero if the corresponding bit of the second specified register is a 1.
  Usage: zap R10,R11,R12
         zap R10,literal,R12

zapnot
  Description: Sets the selected bytes of the first specified register to zero. Bytes are selected by the value of the second specified register or literal. The results are stored in the last specified register. A result byte is set to zero if the corresponding bit of the second specified register is a 0.
  Usage: zapnot R10,R11,R12
         zapnot R10,literal,R12

4.2.9 Floating-point load and store instructions

Floating-point load and store instructions move data between floating-point registers and memory. The format for these instructions is as follows:

  mnemonic FA,literal[(RB)]

If the term (RB) is not supplied by the user, the assembler substitutes R31.

Table 10, page 73, describes the floating-point load and store instructions. Operand qualifiers are not valid for this type of instruction.

Table 10. Floating-point load and store instructions

lds
  Description: The single-precision 32-bit value located in memory at the address specified by the sum of the sign-extended literal and the contents of the register specified in parentheses is loaded into the first specified floating-point register.
  Usage: lds F10,literal(R11)

ldt
  Description: The double-precision 64-bit value located in memory at the address specified by the sum of the sign-extended literal and the contents of the register specified in parentheses is loaded into the first specified floating-point register.
  Usage: ldt F10,literal(R11)

sts
  Description: The single-precision 32-bit value in the first specified floating-point register is stored at the memory location formed by the sum of the sign-extended literal and the contents of the register specified in parentheses.
  Usage: sts F10,literal(R11)

stt
  Description: The double-precision 64-bit value in the first specified floating-point register is stored at the memory location specified by the sum of the sign-extended literal and the contents of the register specified in parentheses.
  Usage: stt F10,literal(R11)

4.2.10 Floating-point control instructions

Floating-point control instructions test the value of a floating-point register and change the program counter if the condition specified by the mnemonic is true.
The format for these instructions is as follows:

  mnemonic FA,target_address

Table 11, page 74, describes the floating-point control instructions. Operand qualifiers are not valid for this type of instruction.

Table 11. Floating-point control instructions

fbeq
  Description: If the specified floating-point register is equal to zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
  Usage: fbeq F10,target_address

fbge
  Description: If the specified floating-point register is greater than or equal to zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
  Usage: fbge F10,target_address

fbgt
  Description: If the specified floating-point register is greater than zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
  Usage: fbgt F10,target_address

fble
  Description: If the specified floating-point register is less than or equal to zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
  Usage: fble F10,target_address

fblt
  Description: If the specified floating-point register is less than zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
  Usage: fblt F10,target_address

fbne
  Description: If the specified floating-point register is not equal to zero, the specified target address is loaded into the program counter. Otherwise, the program continues with the next sequential instruction.
  Usage: fbne F10,target_address

4.2.11 Floating-point copy instructions

Floating-point copy instructions perform copy operations on 64-bit register values.
The format for floating-point copy instructions is as follows:

  mnemonic FA,FB,FC

Table 12, page 76, describes the floating-point copy instructions. Operand qualifiers are not valid for this type of instruction.

Table 12. Floating-point copy instructions

cpys
  Description: Concatenates the sign bit of the floating-point value in the first specified register with the exponent and fraction bits of the second specified register and stores the result in the third specified register.
  Usage: cpys F10,F11,F12

cpyse
  Description: Concatenates the sign and exponent bits of the floating-point value in the first specified register with the fraction bits of the second specified register and stores the result in the third specified register.
  Usage: cpyse F10,F11,F12

cpysn
  Description: Concatenates the complemented sign bit of the floating-point value in the first specified register with the exponent and fraction bits of the second specified register and stores the result in the third specified register.
  Usage: cpysn F10,F11,F12

4.2.12 Floating-point conversion instructions

Floating-point conversion instructions perform conversion operations on 64-bit register values. The format for the conversion instructions is as follows:

  mnemonic FA,FB

Operand qualifiers are valid for each instruction only as described in Table 13, page 77. See Table 1, page 45, for more information on operand qualifiers.

Table 13. Floating-point conversion instructions

cvtlq
  Description: The floating-point operand in the first specified register is converted to a twos complement 64-bit number and stored in the second specified register.
  Usage: cvtlq F10,F11

cvtql/sv
  Description: Converts the 32-bit integer in the first specified register to a 64-bit integer and stores the result in the second specified register.
  Usage: cvtql F10,F11

cvtgq
  Description: The VAX-format floating-point operand in the first specified register is converted to a twos complement 32-bit number and stored in the second specified register.
  Usage: cvtgq F10,F11

cvtqs/[cmd]usi
  Description: The integer in the first specified register is converted to a single-precision floating-point number and stored in the second specified register. The result is complemented if it is negative, normalized, rounded to the target precision, and packed with the appropriate sign and exponent field.
  Usage: cvtqs F10,F11

cvtqt/[cmd]usi
  Description: The integer in the first specified register is converted to a double-precision floating-point number and stored in the second specified register. The result is complemented if it is negative, normalized, rounded to the target precision, and packed with the appropriate sign and exponent field.
  Usage: cvtqt F10,F11

cvttq/[cmd]usvi
  Description: The floating operand in the first specified register is converted to a twos complement number and stored in the second specified register. The operand fraction is aligned with the binary point to the right of bit zero, rounded as specified, and complemented if negative.
  Usage: cvttq F10,F11

cvtts/[cmd]usi
  Description: The double-precision floating-point number in the first specified register is converted to a single-precision floating-point number and stored in the second specified register.
  Usage: cvtts F10,F11

4.2.13 Floating-point move instructions

Floating-point move instructions move the contents of a floating-point register to another floating-point register if the condition specified by the mnemonic is true. Instructions that change the contents of the floating-point control register (FPCR) are also included. The format for these instructions is as follows:

  mnemonic FA,FB,FC

Note: When writing to or reading from the FPCR, FA, FB, and FC must point to the same register.
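Apart from the FPCR forms, the conditional moves in this group share one pattern: test FA against zero and, if the test succeeds, copy FB to FC; otherwise leave FC unchanged. The Python sketch below models that pattern only. It treats register contents as ordinary Python floats, which is a simplification (NaN and other special values are not modeled), and the function name fcmov is hypothetical.

```python
import operator

# Condition tested against zero for each mnemonic.
CONDITIONS = {
    "fcmoveq": operator.eq,
    "fcmovne": operator.ne,
    "fcmovlt": operator.lt,
    "fcmovle": operator.le,
    "fcmovgt": operator.gt,
    "fcmovge": operator.ge,
}


def fcmov(mnemonic: str, fa: float, fb: float, fc: float) -> float:
    """Return the value FC holds after 'mnemonic FA,FB,FC'.

    FC receives FB when FA satisfies the condition against zero;
    otherwise FC keeps its previous value.
    """
    condition = CONDITIONS[mnemonic]
    return fb if condition(fa, 0.0) else fc


if __name__ == "__main__":
    print(fcmov("fcmoveq", 0.0, 2.5, 9.0))  # prints 2.5 (condition true, FB copied)
    print(fcmov("fcmovlt", 1.0, 2.5, 9.0))  # prints 9.0 (condition false, FC unchanged)
```

Passing the old value of FC into the model makes the "otherwise unchanged" rule explicit, which is the property that distinguishes a conditional move from a compare followed by an unconditional copy.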
Table 14, page 79, describes the floating-point move instructions. Operand qualifiers are not valid for this type of instruction.

Table 14. Floating-point move instructions

fcmoveq
  Description: If the first specified register is equal to zero, the content of the second specified register is written to the third specified register. Otherwise, the third specified register is unchanged.
  Usage: fcmoveq F10,F11,F12

fcmovge
  Description: If the first specified register is greater than or equal to zero, the content of the second specified register is written to the third specified register. Otherwise, the third specified register is unchanged.
  Usage: fcmovge F10,F11,F12

fcmovgt
  Description: If the first specified register is greater than zero, the content of the second specified register is written to the third specified register. Otherwise, the third specified register is unchanged.
  Usage: fcmovgt F10,F11,F12

fcmovle
  Description: If the first specified register is less than or equal to zero, the content of the second specified register is written to the third specified register. Otherwise, the third specified register is unchanged.
  Usage: fcmovle F10,F11,F12

fcmovlt
  Description: If the first specified register is less than zero, the content of the second specified register is written to the third specified register. Otherwise, the third specified register is unchanged.
  Usage: fcmovlt F10,F11,F12

fcmovne
  Description: If the first specified register is not equal to zero, the content of the second specified register is written to the third specified register. Otherwise, the third specified register is unchanged.
  Usage: fcmovne F10,F11,F12

mf_fpcr
  Description: The contents of the FPCR are moved to the specified floating-point register.
  Usage: mf_fpcr F1,F1,F1

mt_fpcr
  Description: The contents of the specified floating-point register are moved to the FPCR.
  Usage: mt_fpcr F1,F1,F1

4.2.14 Floating-point compare instructions

Floating-point compare instructions compare the contents of two floating-point registers and write a nonzero floating value to a third register based upon the relationship specified by the mnemonic. The format for the compare instructions is as follows:

  mnemonic FA,FB,FC

Table 15, page 81, describes the floating-point compare instructions. The s and u operand qualifiers are valid for all floating-point compare instructions. See Table 1, page 45, for more information on operand qualifiers.

Table 15. Floating-point compare instructions

cmpteq/su
  Description: Compares the operands in the first two specified floating-point registers. If the operands are equal, a nonzero floating value is written to the third specified floating-point register. Otherwise, a zero is written to the third specified register.
  Usage: cmpteq F10,F11,F12

cmptle/su
  Description: Compares the operands in the first two specified floating-point registers. If the operand in the first specified floating-point register is less than or equal to the operand in the second specified floating-point register, a nonzero floating value is written to the third specified floating-point register. Otherwise, a zero is written to the third specified register.
  Usage: cmptle F10,F11,F12

cmptlt/su
  Description: Compares the operands in the first two specified floating-point registers. If the operand in the first specified floating-point register is less than the operand in the second specified floating-point register, a nonzero floating value is written to the third specified floating-point register. Otherwise, a zero is written to the third specified register.
  Usage: cmptlt F10,F11,F12

cmptun/su
  Description: Compares the operands in the first two specified floating-point registers. If the operand in either floating-point register is not a number, a nonzero floating value is written to the third specified floating-point register. Otherwise, a zero is written to the third specified register.
  Usage: cmptun F10,F11,F12

4.2.15 Floating-point arithmetic instructions

Floating-point arithmetic instructions perform arithmetic operations using the contents of two floating-point registers. The result is stored in a third floating-point register. The format for the arithmetic instructions is as follows:

  mnemonic FA,FB,FC

Operand qualifiers are valid for each floating-point arithmetic instruction as specified in Table 16, page 83. See Table 1, page 45, for more information on operand qualifiers.

Table 16. Floating-point arithmetic instructions

adds/[cmd]usi
  Description: Performs single-precision 32-bit addition of two specified floating-point registers and places the sum in a third specified register.
  Usage: adds F10,F11,F12

addt/[cmd]usi
  Description: Performs double-precision 64-bit addition of two specified floating-point registers and places the sum in a third specified register.
  Usage: addt F10,F11,F12

divs/[cmd]usi
  Description: The single-precision 32-bit operand in the first specified register is divided by the single-precision operand in the second specified register and the quotient is stored in the third specified register. The quotient is rounded to the specified precision and checked for underflow/overflow.
  Usage: divs F10,F11,F12

divt/[cmd]usi
  Description: The double-precision 64-bit operand in the first specified register is divided by the double-precision operand in the second specified register and the quotient is stored in the third specified register. The quotient is rounded to the specified precision and checked for underflow/overflow.
  Usage: divt F10,F11,F12

muls/[cmd]usi
  Description: The single-precision 32-bit operand in the first specified floating-point register is multiplied by the single-precision 32-bit operand in the second specified floating-point register and the product is stored in the third specified floating-point register.
  Usage: muls F10,F11,F12

mult/[cmd]usi
  Description: The double-precision 64-bit operand in the first specified floating-point register is multiplied by the double-precision 64-bit operand in the second specified floating-point register and the product is stored in the third specified floating-point register.
  Usage: mult F10,F11,F12

subs/[cmd]usi
  Description: The single-precision 32-bit subtrahend operand in the second specified floating-point register is subtracted from the single-precision 32-bit minuend operand in the first specified floating-point register and the single-precision 32-bit difference is stored in the third specified floating-point register.
  Usage: subs F10,F11,F12

subt/[cmd]usi
  Description: The double-precision 64-bit subtrahend operand in the second specified floating-point register is subtracted from the double-precision 64-bit minuend operand in the first specified floating-point register and the double-precision 64-bit difference is stored in the third specified floating-point register.
  Usage: subt F10,F11,F12

4.2.16 Miscellaneous instructions

The following miscellaneous instructions perform various operations related to program execution. The format for these instructions varies. See Table 17, page 85, for the format and description of these miscellaneous instructions. Operand qualifiers are not valid for these instructions.

Table 17. Miscellaneous instructions

call_pal
  Description: Transfers control to the operating system routine determined by the specified function_code.
  See the Alpha AXP Architecture Handbook, publication TPD-0007, for more information on PAL code.
  Usage: call_pal function_code

excb (CRAY T3E only)
  Description: Allows software to guarantee that in a pipelined implementation, all previous instructions have completed any behavior related to exceptions or rounding modes before any further instructions are issued.
  Usage: excb

fetch
  Description: Prefetches data.
  Usage: fetch 0(R11)

fetch_m
  Description: Prefetches data with modify intent.
  Usage: fetch_m 0(R11)

mb
  Description: Allows memory accesses to be serialized on the issuing processor as seen by other processors.
  Usage: mb

rpcc
  Description: The content of the read process cycle counter (RPCC) is written to the specified register.
  Usage: rpcc R10

trapb
  Description: Stalls instruction issuing until all prior instructions are guaranteed to complete without incurring arithmetic traps.
  Usage: trapb

wmb (CRAY T3E only)
  Description: Provides a way for software to control write buffers.
  Usage: wmb

Assembler Directives [5]

Cray Assembler for MPP (CAM) assembly language uses assembler directives to assist the assembler in interpreting source statements and generating an object program. Assembler directives can be given in upper, lower, or mixed case.

Note: In this manual, assembler directives appear in lowercase.

Each program module begins with a .ident assembler directive and ends with a .end assembler directive. Only comments and assembly time symbols (for example, micros, macros, and local symbols) can precede the .ident assembler directive in a program module.

Assembler directives are classified and described according to their applications, as follows:
• Conditional assembly
• Data definition
• Macro control
• Message/listing control
• Program control

Individual assembler directive descriptions are listed alphabetically in Section 5.6, page 93.
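The module rule stated above (each module runs from a .ident directive to a .end directive, and directives may be written in any case) can be illustrated with a short Python sketch that groups source lines into modules. This is not how the assembler itself is implemented. It assumes, for simplicity, that a directive is the first token on its line, and the operands shown after .ident in the sample are hypothetical placeholders, since the operand syntax is not described in this section.

```python
def split_modules(source: str) -> list[list[str]]:
    """Group source lines into modules delimited by .ident and .end.

    Simplifying assumption: a directive is the first token on its line.
    Lines that appear before a .ident (comments, assembly time symbols)
    are skipped by this sketch rather than validated.
    """
    modules: list[list[str]] = []
    current = None
    for line in source.splitlines():
        tokens = line.split()
        # Directives are case-insensitive, so compare in lowercase.
        directive = tokens[0].lower() if tokens else ""
        if directive == ".ident":
            if current is not None:
                raise ValueError(".ident found before the previous module's .end")
            current = [line]
        elif current is not None:
            current.append(line)
            if directive == ".end":
                modules.append(current)
                current = None
    if current is not None:
        raise ValueError(".ident without a matching .end")
    return modules


# Sample text; the names after .ident are placeholders for illustration.
SAMPLE = """\
.IDENT first
and R20,R21,R22
.End
.ident second
mb
.end
"""

if __name__ == "__main__":
    for module in split_modules(SAMPLE):
        print(module[0], "...", module[-1])
```

Comparing on the exact lowercase token keeps .end distinct from directives such as .endc, .endif, .endm, .endp, and .endr, which share its prefix.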

5.1 Conditional assembly

The conditional assembly assembler directives let the programmer specify conditional assembly of a section of code. If the outcome of the specified condition is true, the section of code that follows the assembler directive is assembled.

All operands used in the conditional tests must be defined at the point where the test is being done. Operands can be integer expressions or strings. If strings are being tested, they must be enclosed in double quotation marks.

Conditional assembly assembler directives include the following:
• .else / .iff / .if_false
• .endc / .endif
• .if
• .iif

Most conditional tests work in straight line assembly code; however, conditional tests that use the intreg, fltreg, and b operators are valid only when used within macros.

The eq, ne, gt, ge, lt, and le conditional operators are binary operators (that is, they operate on two operands). These operators take integer or string (in double quotation marks) expressions as operands. The user is required to specify at least one operand. If the second operand is not specified, it is assumed by the assembler to be 0 (for integer operands) or the null string (for string operands). An integer expression can contain a label or a local symbol name. A symbol name is not a macro, macro argument, or micro.

The df, ndf, b, nb, intreg, and fltreg conditional operators are unary operators (that is, they operate on one operand). The b and nb operators take macro arguments as operands.

Conditional operators are described as follows:

Operator  Description

eq
  True if operand1 is equal to operand2, or if operand1 is equal to 0 or the null string.

ne
  True if operand1 is not equal to operand2, or if operand1 is not equal to 0 or the null string.

gt
  True if operand1 is greater than operand2, or if operand1 is greater than 0 or the null string. Takes integer expressions or strings (in double quotation marks) as operands.

ge
  True if operand1 is greater than or equal to operand2, or if operand1 is greater than or equal to 0 or the null string. Takes integer expressions or strings (in double quotation marks) as operands.

lt
  True if operand1 is less than operand2, or if operand1 is less than 0 or the null string. Takes integer expressions or strings (in double quotation marks) as operands.

le
  True if operand1 is less than or equal to operand2, or if operand1 is less than or equal to 0 or the null string. Takes integer expressions or strings (in double quotation marks) as operands.

df
  True if operand1 is defined; operand1 can be the name of any legal user symbol. Takes symbol names defined using the = or == operators or labels as operands.

ndf
  True if operand1 is not defined; operand1 can be the name of any legal user symbol. Takes symbol names defined using the = or == operators or labels as operands.

intreg
  True if the argument is an integer register; used within macros to test for macro arguments that are integer registers. This operator is valid only within a macro definition or expansion and takes a register specification (such as R1) as an argument.

fltreg
  True if the argument is a floating-point register; used within macros to test for macro arguments that are floating-point registers. This operator is valid only within a macro definition or expansion and takes a register specification (such as R1) as an argument.

b
  True if operand1 is blank; primary use is to test for the existence of a macro argument. This operator is valid only within a macro definition or expansion and takes a register specification (such as R1) as an argument.

nb          True if operand1 is not blank; primary use is to test for the existence of a macro argument. This operator is valid only within a macro definition or expansion and takes a register specification (such as R1) as an argument.

5.2 Data definition

Data definition assembler directives allow the programmer to define data with initial values or as uninitialized data. The data definition assembler directives can be used only in data or common program sections.

The .blkx directives (.blk_bits, .blkb, .blkl, .blkq, .blks, .blkt, and .blkw) define uninitialized sources of data; the rest of these directives define data sources that are initialized.

The appropriate type of expression must be used to satisfy the arguments used with data definition assembler directives. The value and repeat arguments are stated as integer expressions unless the data source is in floating-point format. Directives that define sources of floating-point data need floating-point expressions to satisfy the value argument and integer expressions to satisfy the repeat argument. A repeat count cannot be negative. See Section 3.15, page 40, for more information on expressions.

Data definition assembler directives include the following:

• .ascic
• .ascii
• .asciz
• .bits
• .blk_bits
• .blkb
• .blkl
• .blkq
• .blks
• .blkt
• .blkw
• .byte
• .long
• .quad
• .s_floating / .float
• .t_floating / .double
• .word

5.3 Macro control

The macro control assembler directives let the user define a sequence of instructions or directives that can be stored and used in subsequent parts of the program like a new instruction. See Chapter 6, page 121, for more information on macros.

The macro control assembler directives include the following:

• .endm
• .macro
• .mexit
• .mdelete

5.4 Message/listing control

The message/listing control assembler directives allow the user to control certain types of messages and the content of the listing file. The message level of the assembler controls the printing of these messages and the built-in assembler messages. (See the -m option in Section 2.1, page 11.)

The message/listing control assembler directives include the following:

• .error
• .list
• .print / .comment
• .subtitle
• .title
• .warning

5.5 Program control

The program control assembler directives define the limits of a program module.

The program control assembler directives include the following:

• .align
• .dexend
• .dexstart
• .end
• .endp
• .endr
• .even
• .external / .extern
• .ident
• .odd
• .psect
• .repeat
• .restore_psect / .restore
• .save_psect / .save
• .soft / .weak
• .stack
• .start

5.6 Assembler directive descriptions

The assembler directives from the preceding subsections are described in alphabetical order in the subsections that follow.

5.6.1 .align

The .align assembler directive sets the location counter to the next byte address boundary specified by the parameter. The parameter can be either an alignment code or an integer.

The alignment caused by the .align assembler directive is relative to the beginning of the program section in which it appears. Therefore, an alignment must also be specified in the .psect statement to ensure correct alignment within the program section.

The .align assembler directive can be used only within a code or data program section.

The syntax for the .align assembler directive can be either of the following:

    .align align_code
    .align integer

The following list describes the alignment codes (align_code) that can be used with the .align directive:

byte        Starts the next code or data item on the next byte boundary (for the CRAY T3D system, the next 1 (2^0) byte boundary).

cache       Starts the next code or data item on the next cache line boundary (for the CRAY T3D system, the next 32 (2^5) byte boundary).

long        Starts the next code or data item on the next long line boundary (for the CRAY T3D system, the next 4 (2^2) byte boundary).

page        Starts the next code or data item on the next page boundary (for the CRAY T3D system, the next 64 (2^6) byte boundary).

quad        Starts the next code or data item on the next quad line boundary (for the CRAY T3D system, the next 8 (2^3) byte boundary).

word        Starts the next code or data item on the next word line boundary (for the CRAY T3D system, the next 2 (2^1) byte boundary).

The value of integer must be 0 through 15. The next code or data item will start on the next 2^integer byte boundary. In other words, the lower integer bits of the byte address will be zero.

5.6.2 .ascic

The .ascic assembler directive performs the same function as the .ascii assembler directive, but includes the number of bytes contained in the string as the first byte of the string. This byte count does not include the byte that contains the count.

The .ascic assembler directive can be used only in data or common program sections.

The syntax of the .ascic assembler directive is as follows:

    .ascic "data_string"

5.6.3 .ascii

The .ascii assembler directive allows the programmer to enter an ASCII string into memory. The string must be enclosed by double quotation marks.
To use double quotation marks within a string, precede the double quotation mark with a backslash (\). To use a backslash in a string, enter two backslashes (\\). Other allowable ASCII codes include \b, \n, and \r. Decimal ASCII codes may be used in a string by using the backslash followed by the 3-digit decimal ASCII code. Hexadecimal ASCII codes may be used in a string by using the backslash followed by either an uppercase or lowercase x and the 2-digit hexadecimal ASCII code.

The .ascii assembler directive can be used only in data or common program sections.

The syntax for the .ascii assembler directive is as follows:

    .ascii "data_string"

5.6.4 .asciz

The .asciz assembler directive performs the same function as the .ascii assembler directive, but enters a null as the last character of the string.

The .asciz assembler directive can be used only in data or common program sections.

The syntax for the .asciz assembler directive is as follows:

    .asciz "data_string"

5.6.5 .bits

The .bits assembler directive allows the user to initialize a bit field with a specified integer constant (number).

The .bits assembler directive can be used only in data or common program sections.

The syntax for the .bits assembler directive is as follows:

    .bits number:value[:repeat][,number:value[:repeat]]

The value specified must fit into the field specified or it will be truncated. As an optional operand, a repeat count can be specified. This causes the field and initial value to be repeated as if there were that number of consecutive .bits specifications. A repeat count cannot be negative. All data specifications are packed within the program section.

The number, value, and repeat variables can be specified as DEX expressions.

5.6.6 .blk_bits

The .blk_bits assembler directive allows the programmer to reserve a number of bits (value) of memory. Memory will not be initialized, only reserved.

The .blk_bits assembler directive can be used only in data or common program sections.

The syntax for the .blk_bits assembler directive is as follows:

    .blk_bits value

The value variable can be specified as a DEX expression.

5.6.7 .blkb

The .blkb assembler directive allows the programmer to reserve a number of bytes (value) of memory. Memory will not be initialized, only reserved.

The .blkb assembler directive can be used only in data or common program sections.

The syntax for the .blkb assembler directive is as follows:

    .blkb value

The value variable can be specified as a DEX expression.

5.6.8 .blkl

The .blkl assembler directive allows the programmer to reserve a number (value) of longwords (4 bytes) of memory. Memory will not be initialized, only reserved.

The .blkl assembler directive can be used only in data or common program sections.

The syntax for the .blkl assembler directive is as follows:

    .blkl value

The value variable can be specified as a DEX expression.

5.6.9 .blkq

The .blkq assembler directive allows the programmer to reserve a number (value) of quadwords (8 bytes) of memory. Memory will not be initialized, only reserved.

The .blkq assembler directive can be used only in data or common program sections.

The syntax for the .blkq assembler directive is as follows:

    .blkq value

The value variable can be specified as a DEX expression.

5.6.10 .blks

The .blks assembler directive allows the programmer to reserve a number (value) of IEEE single-precision, floating-point format words (4 bytes) of memory. Memory will not be initialized, only reserved.

The .blks assembler directive can be used only in data or common program sections.

The syntax for the .blks assembler directive is as follows:

    .blks value

The value variable can be specified as a DEX expression.

5.6.11 .blkt

The .blkt assembler directive allows the programmer to reserve a number (value) of IEEE double-precision, floating-point format words (8 bytes) of memory. Memory will not be initialized, only reserved.

The .blkt assembler directive can be used only in data or common program sections.

The syntax for the .blkt assembler directive is as follows:

    .blkt value

The value variable can be specified as a DEX expression.

5.6.12 .blkw

The .blkw assembler directive allows the programmer to reserve a number (value) of words (2 bytes) of memory. Memory will not be initialized, only reserved.

The .blkw assembler directive can be used only in data or common program sections.

The syntax for the .blkw assembler directive is as follows:

    .blkw value

The value variable can be specified as a DEX expression.

5.6.13 .byte

The .byte assembler directive allows the user to initialize a byte of data with a specified integer constant. A single expression or a list of expressions, separated by commas, may be given as arguments.

The .byte assembler directive can be used only in data or common program sections.

The syntax for the .byte assembler directive is as follows:

    .byte number:value[:repeat][,number:value[:repeat]]

The value of the expression must be in the range 0 through 255 for unsigned data or –128 through +127 for signed data. The most significant bits of values outside of these ranges are truncated. As an optional operand, a repeat count can be specified. This causes the byte value to be repeated as if there were that many consecutive .byte specifications. A repeat count cannot be negative.

The number, value, and repeat variables can be specified as DEX expressions.

5.6.14 .comment

See the .print assembler directive.
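
The unit sizes given for the .blkx directives in Sections 5.6.6 through 5.6.12 can be summarized in a short Python sketch. It is an illustration only, not part of CAM: the names BLK_UNIT_BITS and reserved_bits are hypothetical, and rejecting a negative value is an assumption carried over from the rule for repeat counts.

```python
# Unit size, in bits, reserved per count for each .blkx directive,
# as stated in the directive descriptions (Sections 5.6.6 to 5.6.12).
BLK_UNIT_BITS = {
    ".blk_bits": 1,   # individual bits
    ".blkb": 8,       # bytes
    ".blkw": 16,      # words (2 bytes)
    ".blkl": 32,      # longwords (4 bytes)
    ".blkq": 64,      # quadwords (8 bytes)
    ".blks": 32,      # IEEE single-precision words (4 bytes)
    ".blkt": 64,      # IEEE double-precision words (8 bytes)
}


def reserved_bits(directive, value):
    """Return the number of bits reserved by `directive value`.

    Hypothetical helper for illustration; it models only the size
    arithmetic, not the assembler itself.
    """
    if value < 0:
        # Assumption: a negative count is treated as an error.
        raise ValueError("value must not be negative")
    return BLK_UNIT_BITS[directive] * value


# ".blkq 4" reserves 4 quadwords.
print(reserved_bits(".blkq", 4) // 8)  # prints 32 (bytes)
```

Because the memory is reserved but not initialized, only the size matters here; the sketch says nothing about the contents of the reserved area.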

5.6.15 .dexend

The .dexend assembler directive ends a DEX section.

The syntax for the .dexend assembler directive is as follows:

    .dexend

5.6.16 .dexstart

The .dexstart assembler directive marks the beginning of a DEX section. The DEX section contains definitions of DEX expressions. The DEX section can also contain definitions of macros, micros, and local and global set values. These local and global set values can be used within the DEX expressions that appear within the current DEX section.

Only one .dexstart can appear within an .ident and it must be the last assembler directive used before the .dexend and .end directives.

The syntax for the .dexstart assembler directive is as follows:

    .dexstart

DEX expressions are defined within the DEX section according to the following syntax:

    DEX(constant_integer_expression) = dex_expression

The DEX section is ended with the .dexend assembler directive or the .end assembler directive of the current .ident.

DEX expressions are described in Section 3.15.3, page 42.

5.6.17 .double

See the .t_floating assembler directive.

5.6.18 .else

The .else assembler directive marks the end of the true portion of a conditional assembly block, which was started with the .if assembler directive. If the logical test of the .if was false, the code block between the .if and .else is skipped and assembly continues with the statement following the .else assembler directive. This block of code is terminated with .endif.

Note: .iff and .if_false are alternate forms of the same assembler directive.

5.6.19 .end

The .end assembler directive marks the end of the program module.

The syntax for the .end assembler directive is as follows:

    .end [name]

The use of name with the .end directive is optional; however, if name is used, it must match the name used with the preceding .ident directive.

Note: An identifier placed on an .end directive is not used to specify the program entry point.

5.6.20 .endc

See the .endif assembler directive.

5.6.21 .endif

The .endif assembler directive marks the end of a conditional assembly block that was started with the .if assembler directive. Assembly continues with the statement following the .endif assembler directive.

Note: .endc is an alternate form of the .endif assembler directive.

5.6.22 .endm

The .endm assembler directive marks the end of a macro definition. All instructions and directives that fall between the .macro and .endm assembler directives make up the body of the macro.

Note: If a .endm directive is used out of context, a WARNING message is generated and assembly continues.

5.6.23 .endp

The .endp assembler directive terminates the current program section. The .endp directive is required before additional global definitions can be created. Once the .endp directive has been issued, a new program section must be started before any code or data can be specified.

5.6.24 .endr

The .endr assembler directive marks the end of the block of instructions or directives that will be repeated.

The following example demonstrates the use of the .endr assembler directive:

    .repeat 64
    .s_float 1.0        ;Define a pattern of
    .s_float 2.0        ;data in memory
    .endr               ;End data definitions

Note: If a .endr directive is used out of context, a WARNING message is generated and assembly continues.

5.6.25 .error

The .error assembler directive allows for a level 4 severity, programmer-defined ERROR message. See the -m option in Section 2.1, page 11, for more information.

The syntax for the .error assembler directive is as follows:

    .error "text_string"/expression

The string of characters between the double quotation marks (text_string) will appear on the programmer's display during the assembly process and in the listing file. If an expression is specified, it will be evaluated and printed. The expression must be a constant integer expression.

5.6.26 .even

The .even assembler directive sets the location counter to the next even-byte boundary. If the location counter is odd, 1 is added to make the counter even.

5.6.27 .extern

See the .external assembler directive.

5.6.28 .external

The .external assembler directive indicates that the specified symbols are defined in another program module. The definition of the symbols will be resolved at link time. The .external assembler directive is valid only within a program module.

The syntax for the .external assembler directive is as follows:

    .external symbol1[,symbol2]

Multiple symbols can be specified by using a comma to separate them.

Note: .extern is an alternate form of the .external assembler directive.

5.6.29 .float

See the .s_floating assembler directive.

5.6.30 .ident

The .ident assembler directive specifies the beginning of a module. The module may be a program module or a subprogram module (a module without a .start directive). If a .ident assembler directive is to be used within a program module, a .start assembler directive must also be used.

The syntax for the .ident assembler directive is as follows:

    .ident name

5.6.31 .if_false

See the .else assembler directive.

5.6.32 .if

The .if assembler directive marks the beginning of a conditional assembly block.
The logical expression is evaluated and, if true, code is assembled up to a corresponding .else or .endif assembler directive. Code that follows the .else assembler directive up to the matching .endif assembler directive will be skipped.

The syntax for the .if assembler directive is as follows:

    .if operator,operand1[,operand2]

In the preceding syntax, if operand2 is not specified, zero or the null string is supplied by the assembler according to the type (integer or string) of operand1. The operand1 and operand2 variables must be either a constant integer expression or a string.

5.6.33 .iff

See the .else assembler directive.

5.6.34 .iif

The .iif assembler directive allows the conditional assembly of a single statement. Assembly of the statement occurs only if the condition is true. Use this directive to simplify single test conditions.

The syntax for the .iif assembler directive is as follows:

    .iif operator,operand1[,operand2],statement

In the preceding syntax, if operand2 is not specified, zero or the null string is supplied by the assembler according to the type (integer or string) of operand1. The operand1 and operand2 variables must be either a constant integer expression or a string.

5.6.35 .list

The .list assembler directive turns listing control flags on and off. For more information, see the -e and -d command options on the cam(1) man page or refer to Chapter 2, page 11.

The syntax for the .list assembler directive is as follows:

    .list flag[,flag][,flag]...

The following table lists the possible values for flag that can be used with .list:

EDIT        Turns edited line display on
NOEDIT      Turns edited line display off
MAC         Shows macro expansion
NOMAC       Turns macro expansion off
MACDEF      Shows macro definitions
NOMACDEF    Turns macro definitions off
NOREP       Turns repeat block expansion off
REP         Shows repeat block expansion

The following example demonstrates the use of the .list assembler directive:

    .list MAC,NOMACDEF,NOREP    ;Macro expansion
                                ;is turned on, macro
                                ;definition listing
                                ;status is turned off,
                                ;and repeat block
                                ;expansion is turned
                                ;off.

5.6.36 .long

The .long assembler directive allows the programmer to define a long word (4 bytes) of data with a specified integer constant. A single expression or a list of expressions, separated by commas, may be given as arguments.

The .long assembler directive can be used only in data or common program sections.

The syntax for the .long assembler directive is as follows:

    .long value[:repeat][,value[:repeat]]

The value of the expression must be in the range of 0 through 4,294,967,295 for unsigned data or –2,147,483,648 through +2,147,483,647 for signed data. The most significant bits of values outside of these ranges will be truncated. As an optional operand, a repeat count can be specified. This will cause the long value to be repeated as if there were that many consecutive .long specifications. A repeat count cannot be negative.

The value and repeat variables can be specified as DEX expressions.

5.6.37 .macro

The .macro assembler directive starts a macro definition and assigns a name to the sequence of instructions being used in the definition. You can specify an optional value (val1, val2, val3, ...) for each argument.

The syntax for the .macro assembler directive is as follows:

    .macro name arg1[=val1],arg2[=val2][,...]

The following examples demonstrate the use of the .macro assembler directive:

    .macro name arg1,arg2               ;Macro definition
                                        ;with positional
                                        ;arguments
    .endm

    .macro name arg1=val1,arg2=val2     ;Macro definition
                                        ;with keyword
                                        ;arguments
    .endm

5.6.38 .mdelete

The .mdelete assembler directive deletes the definitions of specified macro(s). If you delete a macro that is currently expanding, the macro name is removed from the macro name space immediately; however, the macro is not deleted until expansion is completed.

The syntax for the .mdelete assembler directive is as follows:

    .mdelete name[,name][,name][,...]

5.6.39 .mexit

The .mexit assembler directive indicates that expansion of the macro should terminate. This happens only during macro call processing. Statement processing will continue with the statement following the macro statement.

5.6.40 .odd

The .odd assembler directive sets the location counter to the next odd-byte boundary. If the location counter is even, 1 is added to make the counter odd.

5.6.41 .print

The .print assembler directive allows for a level 4 severity, programmer-defined PRINT message. See the -m option in Section 2.1, page 11, for more information.

The syntax for the .print assembler directive is as follows:

    .print "text_string"/expression

The string of characters between the double quotation marks (text_string) will appear on the programmer's display during the assembly process and in the listing file. If an expression is specified, it will be evaluated and printed out. The expression variable must be specified as a constant integer expression.

Note: .comment is an alternate form of the .print assembler directive.

5.6.42 .psect

The .psect assembler directive names a program section (name) and assigns attributes (attr1, attr2, ...) to that section. An alignment can be specified (align_code) to indicate where the section should start.
Attributes and alignment codes need not appear in a specified order; however, the name of the program section must follow the .psect directive.

The syntax for the .psect assembler directive is as follows:

    .psect name,[align_code],attr1[,attr2,...]

There are two groups of attributes: attributes that directly support the CAM implementation and DEC attributes that have a mapping to the CAM implementation. Some, but not all, of the DEC attributes are supported to reduce the amount of change required in ported codes.

The .psect attributes that directly support the CAM implementation are as follows:

• code
  Only instructions may be placed in a code .psect. Use of the code attribute will also cause the section to be relocated by the loader.

• common
  Sections with the same name and attributes will share the same area of memory. These sections will also be relocated. The common attribute supports Fortran common blocks.

• data
  Only data can be placed in a data section. Use of the data attribute also causes the section to be relocated by the loader.

• shared
  Marks the data within the section as shared data. The shared attribute is valid only when used in combination with common or data attributes. Distribution information is specified following the shared attribute on a dimension-by-dimension basis, using the keyword DIM, as follows:

    DIM #(cycle,pe,block)[,DIM #(cycle,pe,block)]

  The DIM keyword specifies the dimension number (#), the number of passes (cycle) made across the number of processing elements (pe), and the number of bytes of memory (block) allocated to each processor. cycle, pe, and block are expressed as 32-bit integer expressions (block is expressed in bytes). Up to seven dimensions can be specified.

  The following example demonstrates the use of the DIM keyword with a .psect assembler directive:

    .psect NAME,data,shared,DIM 1(1,2,2)

  The # variable must be a constant integer expression. The cycle, pe, and block variables can be specified as DEX expressions.

The following list describes DEC attributes that support the CAM implementation of .psect:

exe         Executable. Only instructions can be placed in an executable program section. This does not imply any relocation of the program section; to get relocation, use the rel attribute. (See also code attribute.)

noexe       No execute. When a program section has the noexe attribute, only data may be placed in it.

lcl         Local. The lcl attribute defines the program section as local to the current module.

rel         Relocatable. The contents of the program section are relocated at link time.

The align_code parameter starts the program section at the next specified boundary. See the .align assembler directive for valid values. Only one align_code parameter is allowed.

Several attributes can be assigned to a program section; however, not all attributes are compatible with one another. If incompatible attributes are assigned to a program section, an error message is generated. The following combinations of attributes are legal:

• code
• data
• common
• rel and exe
• rel, lcl, and exe
• rel and noexe
• shared and common
• shared and data

The following example demonstrates the use of the .psect assembler directive:

    .psect FX_DATA,cache,common     ;Align start of
                                    ;FX_DATA on cache
                                    ;line boundary

5.6.43 .quad

The .quad assembler directive allows the programmer to define a quad word (8 bytes) of data with a specified integer constant. A single expression or a list of expressions, separated by commas, may be given as arguments.

The .quad assembler directive can be used only in data or common program sections.

The syntax for the .quad assembler directive is as follows:

    .quad value[:repeat][,value[:repeat]]

The value of the expression must be in the range of 0 through 18,446,744,073,709,551,615 for unsigned data, or –9,223,372,036,854,775,808 through +9,223,372,036,854,775,807 for signed data. The most significant bits of values outside of these ranges are truncated. As an optional operand, a repeat count can be specified. This causes the quad value to be repeated as if there were that many consecutive .quad specifications. A repeat count cannot be negative.

The value and repeat variables can be specified as DEX expressions.

5.6.44 .repeat

The .repeat assembler directive marks the beginning of a block of instructions or directives that will be assembled a specified number of times. The result of the repetition is the same as if the code had been duplicated expression number of times in the original source. The expression attribute must be specified as a constant integer expression.

The syntax for the .repeat assembler directive is as follows:

    .repeat expression

Repeat blocks can be used in either data or code program sections. The .repeat assembler directive can appear anywhere in a CAM program.

Note: Nesting of .repeat assembler directives is not supported.

The following example demonstrates the use of the .repeat assembler directive in a data program section:

    .repeat 64
    .s_float 1.0        ;Define a pattern of
    .s_float 2.0        ;data in memory
    .endr               ;End data definitions

The result of the above example is 64 duplications of the two data specifications as follows:

    .s_float 1.0
    .s_float 2.0
    .s_float 1.0
    .s_float 2.0

The preceding example could also be accomplished as follows:

    .byte 1.0:64,2.0:64

See the .byte assembler directive for more information.
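
The duplication performed by a .repeat block, as described in Section 5.6.44, can be modeled with a minimal Python sketch. It is an illustration only, not part of CAM: the function name expand_repeat is hypothetical, statements are represented as plain strings, and rejecting a negative count is an assumption (the text requires only a constant integer expression).

```python
def expand_repeat(count, body):
    """Return the statements produced by assembling `body` `count` times.

    Hypothetical helper for illustration; `body` is the list of
    statements that appear between .repeat and .endr.
    """
    if count < 0:
        # Assumption: a negative repeat expression is treated as an error.
        raise ValueError("repeat count must not be negative")
    expanded = []
    for _ in range(count):
        # Each pass appends one full copy of the block body, in order.
        expanded.extend(body)
    return expanded


body = [".s_float 1.0", ".s_float 2.0"]
expanded = expand_repeat(64, body)
print(len(expanded))   # prints 128
print(expanded[:4])    # first two copies of the block body
```

The whole body is copied once per pass, so the two .s_float statements alternate in the expanded source.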

5.6.45 .restore

See the .restore_psect assembler directive.

5.6.46 .restore_psect

The .restore_psect assembler directive restores the program section from the top of the internal program section stack of the assembler. If .restore_psect is used and there are no entries on the program section stack, an error message will be generated.

Note: .restore_psect and .restore are alternate forms of the same assembler directive.

5.6.47 .s_float

See the .s_floating assembler directive.

5.6.48 .s_floating

The .s_floating assembler directive allows the programmer to define an IEEE single-precision number (32 bits). A single floating constant or list of floating constants, separated by commas, may be given as arguments.

The .s_floating assembler directive can be used only in data or common program sections.

The syntax for the .s_floating assembler directive is as follows:

    .s_floating value[:repeat][,value[:repeat]]

The constant value variables must be in the range of magnitude 1.175e–38 through 3.40e38. This format provides for about seven decimal digits of accuracy. As an optional operand, a repeat count can be specified. This causes the value to be repeated as if there were that many consecutive .s_floating specifications. When specified, a repeat count cannot be negative and can be represented by a DEX expression.

The value variable must be a floating-point expression.

Note: .s_floating, .s_float, and .float are alternate forms of the same assembler directive.

5.6.49 .save

See the .save_psect assembler directive.

5.6.50 .save_psect

The .save_psect assembler directive stores the current program section information on the top of the internal program section stack of the assembler. If a program section is not in progress when .save_psect is issued, a WARNING message is issued. The maximum number of program sections that can be stacked is 50.
If the internal program section stack is full, an error message will be generated.

Note: .save is an alternate form of the .save_psect assembler directive.

5.6.51 .soft

The .soft assembler directive indicates that the specified symbols are defined in another program module. The definition of the symbols will be resolved at link time.

The difference between .soft and .external is that the specific symbols defined by .soft will be linked if one or more non-.soft references to that symbol exist.

The syntax for the .soft assembler directive is as follows:

    .soft symbol1,symbol2,...

Symbols specified in the .soft directive must be external. Multiple symbols can be specified by using a comma between symbols.

Note: .soft and .weak are alternate forms of the same assembler directive.

5.6.52 .stack

The .stack assembler directive indicates that a stack section is to be created. Stacks can either be local or distributed when the shared attribute is specified. Stacks are data sections that are both started and ended by the .stack directive.

The syntax for the .stack directive is as follows:

    .stack size[,shared[,distribution]]

The shared attribute marks the data within the section as shared data. The shared attribute is valid only when used in combination with common or data attributes. Distribution information is specified following the shared attribute on a dimension-by-dimension basis, using the keyword DIM, as follows:

    DIM #(cycle,pe,block)[,DIM #(cycle,pe,block)]

See the .psect assembler directive for more information.

5.6.53 .start

The .start assembler directive specifies the primary entry point for a program module. It is allowed only outside of a program section.
The syntax for the .start assembler directive is as follows:

    .start symbol

A symbol is required for the .start assembler directive, and the symbol must be defined as a global label (see Section 3.11, page 32, for more information). Only one symbol can be specified as the primary entry point. This name can be the same as the name specified with .ident.

5.6.54 .subtitle

The .subtitle assembler directive lets a programmer specify subprogram names. The name must be in the form of a string. This string is used in the banner of every page in the listing file. This directive helps mark subsections of the user program. Any number of .subtitle assembler directives can be used per .ident directive.

The syntax for the .subtitle assembler directive is as follows:

    .subtitle "subtitle_string"

5.6.55 .t_floating

The .t_floating assembler directive allows the programmer to define an IEEE double-precision number. A single floating constant or a list of floating constants, separated by commas, may be given as arguments. The .t_floating assembler directive can be used only in data or common program sections.

The syntax for the .t_floating assembler directive is as follows:

    .t_floating value[:repeat][,value[:repeat]]

The magnitude of each constant value must be in the range 2.225e-308 through 1.798e308. This format provides about 16 decimal digits of accuracy. Because the .t_floating assembler directive defines data, it cannot be used in a code .psect.

As an optional operand, a repeat count can be specified, and it can be a DEX expression. This causes the quad value to be repeated as if there were that many consecutive .t_floating specifications. A repeat count cannot be negative. The value variable must be specified as a floating-point expression.
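The magnitude limits quoted for .s_floating and .t_floating correspond to the smallest normalized and largest finite values of the IEEE single- and double-precision formats. As a cross-check only, the following Python sketch (not CAM code; the function name ieee_limits is an illustrative assumption) derives those limits from the exponent and fraction widths of each format:

```python
import struct
import sys

def ieee_limits(exp_bits, frac_bits):
    """Return (smallest normalized, largest finite) magnitude of a binary IEEE format."""
    bias = 2 ** (exp_bits - 1) - 1
    smallest = 2.0 ** (1 - bias)
    largest = (2.0 - 2.0 ** -frac_bits) * 2.0 ** bias
    return smallest, largest

single = ieee_limits(8, 23)     # 32-bit format, as defined by .s_floating
double = ieee_limits(11, 52)    # 64-bit format, as defined by .t_floating

# Cross-check against the host's IEEE double and a packed IEEE single.
assert double == (sys.float_info.min, sys.float_info.max)
assert single[1] == struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]

print("single: %.3e .. %.3e" % single)
print("double: %.3e .. %.3e" % double)
```

Rounded to four significant digits, the results agree with the ranges given for the two directives.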
Note: .t_float and .double are alternate forms of the same assembler directive.

5.6.56 .title

The .title assembler directive lets a programmer specify a program name. The name must be in the form of a string. This string is used in the banner of every page in the listing file. .title can be specified only once per .ident directive.

The syntax for the .title assembler directive is as follows:

    .title "title_string"

5.6.57 .uses_eregs

(CRAY T3E systems only) The .uses_eregs assembler directive notifies the CAM assembler that a bit in the relocatable object file (specified with a .o suffix) is to be set. This bit indicates that E registers are used. The .uses_eregs assembler directive can be specified anywhere between the .ident and .end assembler directives. It can be specified multiple times; however, only the first occurrence is needed.

The syntax for the .uses_eregs assembler directive is as follows:

    .uses_eregs

The following example illustrates the use of the .uses_eregs assembler directive:

    .ident test
    .uses_eregs
    .
    .
    .
    .end

5.6.58 .warning

The .warning assembler directive issues a programmer-defined WARNING message of severity level 3. See the -m option in Section 2.1, page 11, for more information.

The syntax for the .warning assembler directive is as follows:

    .warning "text_string"/expression

The string of characters between the double quotation marks (text_string) is printed in the same manner as WARNING messages generated by the assembler. If an expression is specified, it is evaluated and printed.

5.6.59 .weak

See the .soft assembler directive.

5.6.60 .word

The .word assembler directive allows the programmer to initialize a word (2 bytes) of data with a specified integer constant. A single expression or a list of expressions, separated by commas, may be given as arguments.
The .word assembler directive can be used only in data or common program sections.

The syntax for the .word assembler directive is as follows:

    .word value[:repeat][,value[:repeat]]

The value of the expression must be in the range of 0 through 65,535 for unsigned data or -32,768 through +32,767 for signed data. The most significant bits of values outside of these ranges are truncated.

As an optional operand, a repeat count can be specified. This causes the word of data to be repeated as if there were that many consecutive .word specifications. A repeat count cannot be negative. The value and repeat variables can be specified as DEX expressions.

Macros [6]

Macros are defined by the user. After a macro is defined, it is visible to the rest of the source file.

6.1 CAM macro facility

The Cray Assembler for MPP (CAM) macro facility lets the user identify a sequence of source lines that can be inserted into a program by including the macro name as a source statement later in the program. A macro must be defined prior to use.

Using macros in a CAM program requires the following:

• A macro definition that contains the source lines of the macro
• A macro call that consists of the macro name followed by optional arguments

6.1.1 Macro definitions

A macro definition is a sequence of assembler instructions and/or assembler directives that can be saved and called as a unit later in the program. It starts with the .macro assembler directive and ends with the .endm assembler directive. Macro definitions can be nested; however, the enclosed macro definition is not defined until the enclosing macro is invoked.

Note: If a micro is used in a macro definition, it must be defined prior to the macro definition.

6.1.1.1 Formal arguments

The macro definition can include optional formal arguments.
These formal arguments can be used throughout the sequence of source lines. When a macro is called, the formal arguments are replaced by the actual arguments in the macro call.

Formal arguments are specified by name in the macro definition; that is, after the macro name in the .macro directive. There can be up to 16 formal arguments specified in a macro definition and they must be separated by commas (,).

The following is an example of a macro definition that uses formal arguments:

    .macro  STORE ARG1,ARG2,ARG3
    .long   ARG1            ;ARG1 is first argument
    .word   ARG3            ;ARG3 is third argument
    .byte   ARG2            ;ARG2 is second argument
    .endm   STORE

6.1.1.2 Default values

Default values are values that are defined in the macro definition. They are used when no value for a formal argument is specified in the macro call.

Default values are specified in the .macro directive as follows:

    formal argument name = default value

Note: If a macro is called without a value for a formal argument and no default value exists for that argument, the assembler substitutes a blank for that value. Default values are not allowed for formal arguments that are labels.

The following is an example of a macro definition that specifies default values:

    .macro  STORE ARG1=12,ARG2=0,ARG3=1000
    .long   ARG1
    .word   ARG3
    .byte   ARG2
    .endm   STORE

6.1.1.3 String arguments

If an actual argument is a string containing characters that the assembler interprets as separators (such as a tab, space, or comma), the string must be enclosed by delimiters. String delimiters for macro arguments are usually paired parentheses (). A string literal enclosed in double quotes is also a valid string argument. The difference between using parentheses and quotes is that the parentheses are not included in the substitution, but the quotes are included.
The following are examples of delimited macro arguments:

    (HAVE THE SUPPLIES RUN OUT?)
    (TAB: CTRL R4)
    "A string literal is taken as a single parameter value."
    "ARGUMENT IS (LAST,FIRST) FOR CALL"

The assembler interprets a string argument enclosed by delimiters as one actual argument and associates it with one formal argument. For example, the following macro definition has one formal argument:

    .macro  DOUBLE_ASCII STRING
    .ascii  "STRING"
    .ascii  "STRING"
    .endm   DOUBLE_ASCII

The following two macro calls demonstrate actual arguments with and without delimiters:

    DOUBLE_ASCII (A B C D E)
    .ascii "A B C D E"
    .ascii "A B C D E"

    DOUBLE_ASCII A B C D E
    cam-127 cam:ERROR Line=#.Column#,File=name
    Bad Macro Argument

Note that the assembler interpreted the second macro call as having five actual arguments instead of one actual argument with spaces.

When a macro is called, the assembler removes normal delimiters around a string before associating it with the formal arguments. However, a string literal within double quotes is treated as a single token and retains its double quote delimiters.

If a string contains a semicolon (;), the string must be enclosed by delimiters; otherwise, the semicolon marks the start of the comment field.

6.1.1.4 Macro-defined temporary labels

Temporary labels defined within the body of a macro differ from user-defined temporary labels. For more information on user-defined temporary labels, see Section 3.12, page 33.

Defining temporary labels within the body of a macro is often very useful. Because a temporary label is created by the assembler when the macro is expanded, multiple calls of the same macro could create duplicate temporary labels.

Preceding the formal argument name with a question mark (?) tells the assembler that the label is a macro-defined temporary label. This causes the assembler to create a unique label for each call to the macro.
These labels are in the range from 30000$ through 65535$. Macro-defined temporary labels must be associated with positional actual arguments; they cannot be associated with keyword actual arguments. Arguments that specify a temporary label cannot be assigned a default value.

Note: If the corresponding actual argument is specified in the macro call, the assembler substitutes the actual argument for the formal argument instead of creating a unique label.

The following example is a macro definition specifying a macro-defined temporary label:

    .macro  POSITIVE ARG1,?L1
    bge     ARG1,L1
    negq    ARG1,ARG1
    L1:
    .endm   POSITIVE

The following calls and expansions of the macro defined previously show both created temporary labels and a user-defined temporary label:

    POSITIVE R5
    bge     R5,30000$
    negq    R5,R5
    30000$:                     ;Assembler created

    POSITIVE R15
    bge     R15,30001$
    negq    R15,R15
    30001$:                     ;Assembler created

    POSITIVE R15,U_lbl$         ;User-defined
    bge     R15,U_lbl$
    negq    R15,R15
    U_lbl$:
    .endp
    .end

6.1.1.5 Argument concatenation

The argument concatenation operator, the apostrophe ('), concatenates a macro argument with constant text or another argument. Apostrophes can either precede or follow a formal argument name in the macro definition. Argument concatenation occurs inside of strings used in a macro definition and in strings enclosed within double quotation marks (" ").

If an apostrophe (') precedes the argument name, the text before the apostrophe is concatenated with the actual argument when the macro is expanded. For example, if ARG1 is a formal argument associated with the actual argument TEST, then ABCDE'ARG1 is expanded to ABCDETEST.

If an apostrophe follows the formal argument name, the actual argument is concatenated with the text that follows the apostrophe when the macro is expanded. The apostrophe itself does not appear in the macro expansion.
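The concatenation rule can be modelled outside the assembler. The following Python sketch is only an approximation of the behavior described above, not CAM's implementation; expand_line and the _END suffix are hypothetical names, and the sketch treats a formal argument as a whole word and uses the ASCII apostrophe:

```python
import re

def expand_line(line, bindings):
    """Substitute formal arguments in one macro body line.

    An apostrophe directly before or after a formal name is consumed,
    which joins the actual argument to the neighbouring text.
    """
    names = "|".join(re.escape(name) for name in bindings)
    pattern = re.compile(r"'?\b(" + names + r")\b'?")
    return pattern.sub(lambda m: bindings[m.group(1)], line)

# Apostrophe before the formal argument: preceding text is joined to it.
print(expand_line("ABCDE'ARG1", {"ARG1": "TEST"}))
# Apostrophe after the formal argument: following text is joined to it.
print(expand_line("ARG1'_END", {"ARG1": "TEST"}))
```

Because the substitution does not look at quoting, the same function also replaces a formal argument name that appears inside a double-quoted string, as the manual describes for macro expansion.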
In the following example of a macro definition, two successive apostrophes are used when concatenating the two formal arguments A and B:

    .macro  CONCAT A,B
    A''B:   .word 0
    .endm   CONCAT

An example of a macro call and expansion follows:

    CONCAT  X,Y
    XY:     .word 0

If the macro argument is not being replaced by the assembler, enclose the macro argument in the body of the macro definition in apostrophes (').

6.1.2 Macro calls

The macro call consists of the macro name, optionally followed by actual arguments. The assembler replaces the line containing the macro call with the source lines in the macro definition. It replaces every occurrence of a formal argument with the corresponding actual argument specified in the macro call. This includes argument names found inside of strings. This process is called macro expansion.

6.1.2.1 Actual arguments

Actual arguments are the text given in the macro call after the name of the macro. Actual arguments in macro calls must be separated by commas.

Note: Nonlabel arguments that do not have default values and are not specified in the macro call are expanded with a null value (that is, the formal argument is replaced with no characters). Default values are discussed in Section 6.1.1.2, page 122.

The following two examples show possible calls and expansions of the macro defined in Section 6.1.1, page 121:

    STORE   3,2,1           ;Macro call
    .long   3               ;3 is first argument
    .word   1               ;1 is third argument
    .byte   2               ;2 is second argument

    STORE   X,XY,Z          ;Macro call
    .long   X               ;X is first argument
    .word   Z               ;Z is third argument
    .byte   XY              ;XY is second argument

The following examples show possible calls and expansions that use default values in the macro defined in Section 6.1.1, page 121:

    STORE                   ;No arguments supplied
    .long   12
    .word   1000
    .byte   0

    STORE   ,5,X            ;First argument not supplied,
    .long   12              ;last two arguments supplied
    .word   X
    .byte   5

    STORE   1               ;First argument supplied only
    .long   1
    .word   1000
    .byte   0

6.1.2.2 Keyword arguments

Keyword arguments allow a macro call to specify arguments in any order. In this case, the macro call must specify the formal argument names that appear in the macro definition. Keyword arguments are useful when a macro contains more formal arguments than need to be specified in the call.

Note: In any one macro call, positional arguments and keyword arguments can be mixed; however, positional arguments must be specified before keyword arguments in the macro call.

For example, the following macro definition specifies three arguments:

    .macro  STORE ARG1,ARG2,ARG3
    .long   ARG1
    .word   ARG3
    .byte   ARG2
    .endm   STORE

The following macro call specifies keyword arguments:

    STORE   ARG3=27+5/4, ARG2=5, ARG1=SYMBL
    .long   SYMBL
    .word   27+5/4
    .byte   5

Because the keywords are specified in the macro call, the arguments in the macro call do not need to be given in the order they were listed in the macro definition. Also, if an expression contains white space, the expression must be enclosed in parentheses or double quotation marks (" "). The parentheses are removed by the assembler; however, the double quotes are not.
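The binding rules for positional, keyword, and default arguments can be summarized in a short model. The Python sketch below is an illustration under stated assumptions rather than the assembler's algorithm: bind_arguments is a hypothetical name, the call is assumed to be already split into argument strings, and an omitted argument without a default is bound to the null value described in Section 6.1.2.1.

```python
def bind_arguments(formals, actuals):
    """Bind the actual arguments of one macro call to the formal arguments.

    formals: list of (name, default) pairs from the .macro line; default is
             None when the definition gives none.
    actuals: list of strings from the call, either "value" (positional)
             or "NAME=value" (keyword). An empty string is an omitted
             positional argument.
    """
    names = [name for name, _ in formals]
    bound = {}
    seen_keyword = False
    position = 0
    for actual in actuals:
        key, sep, value = actual.partition("=")
        if sep and key.strip() in names:
            seen_keyword = True
            bound[key.strip()] = value.strip()
        else:
            if seen_keyword:
                raise ValueError("positional argument after keyword argument")
            if actual.strip():
                bound[names[position]] = actual.strip()
            position += 1
    # Fall back to the default, or to a null value when there is none.
    return {name: bound.get(name, default if default is not None else "")
            for name, default in formals}

STORE = [("ARG1", "12"), ("ARG2", "0"), ("ARG3", "1000")]
for call in ([], ["", "5", "X"], ["ARG3=27+5/4", "ARG2=5", "ARG1=SYMBL"]):
    args = bind_arguments(STORE, call)
    print(".long %(ARG1)s / .word %(ARG3)s / .byte %(ARG2)s" % args)
```

The three calls correspond to STORE with no arguments, STORE ,5,X, and the keyword call shown above, applied to the STORE definition that carries default values.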
6.1.2.3 Passing numeric values of symbols

When a symbol is specified in the macro definition as an actual argument, the name of the symbol, not the numeric value of the symbol, is passed to the macro on a macro call.

6.1.2.4 Macro call nesting

Macro expansion can be nested; that is, a macro definition can contain a call to another macro. If, within a macro definition, another macro is called and is passed a string argument, you must delimit the argument so that the entire string is passed to the second macro as one argument.

The following macro definition contains a call to the DOUBLE_ASCII macro defined earlier:

    .macro  CNTDA LAB1,LAB2,CNT,STR_ARG
    LAB1 =.
            DOUBLE_ASCII ('STR_ARG')
    LAB2 =.
    LEN'CNT: .byte LAB2-LAB1
    .endm   CNTDA

Note that the argument in the call to DOUBLE_ASCII is enclosed in parentheses, even though it does not contain any separator characters. The argument is delimited because it is a formal argument in the definition of the macro CNTDA and will be replaced with an actual argument that may contain separator characters.

The following example calls the macro CNTDA, which in turn calls the macro DOUBLE_ASCII:

    CNTDA ST, FIN, 0, (Learn your ABC's)

Another way to pass string arguments in nested macros is to enclose the macro argument in nested delimiters. Do not use delimiters around the macro calls in the macro definitions. Each time you use an argument that is delimited by parentheses in a macro call, the assembler removes the outermost pair of delimiters before associating it with the formal argument. This method is not recommended because it requires knowledge of how deeply a macro is nested.

The following macro definition also contains a call to the DOUBLE_ASCII macro:

    .macro  CNTDA2 LAB1,LAB2,CNT,STR_ARG
            CNTDA LAB1, LAB2, CNT, STR_ARG
    .endm   CNTDA2

Note: The argument that is, in turn, passed on to DOUBLE_ASCII is not enclosed in parentheses.
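The effect of nested delimiters can be followed with a small model. This Python sketch assumes, as the text above states, that each macro call removes only the outermost pair of parentheses and that a double-quoted literal keeps its quotes; strip_outer and after_calls are hypothetical names, and the sketch does not validate that the parentheses are properly paired.

```python
def strip_outer(arg):
    """Remove one pair of enclosing parentheses, as one macro call does.

    A double-quoted string literal is returned unchanged, because the
    quotes stay part of the argument.
    """
    if len(arg) >= 2 and arg[0] == "(" and arg[-1] == ")":
        return arg[1:-1]
    return arg

def after_calls(arg, levels):
    """Return the argument as seen after passing through `levels` calls."""
    for _ in range(levels):
        arg = strip_outer(arg)
    return arg

# One call leaves a delimited string; a second call exposes the bare text.
print(after_calls("((Mind your P's and Q's))", 1))
print(after_calls("((Mind your P's and Q's))", 2))
print(after_calls('"quoted, with a comma"', 2))
```

An argument that must survive two levels of calls without being split therefore needs two pairs of parentheses, which is why this method depends on knowing the nesting depth.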
The following example calls the macro CNTDA2:

    CNTDA2 ST1, FIN1, 1, ((Mind your P's and Q's))

The following example shows the use of the double quote delimiters in an example message:

    .macro MSG LEVEL, msg
    .if eq, "NOTE", LEVEL
        .print msg
    .else
        .if eq, "WARNING", LEVEL
            .warning msg
        .else
            .if eq, "ERROR", LEVEL
                .error msg
            .else
                .error "MSG MACRO: Unknown message severity"
                .error msg
            .endif
        .endif
    .endif
    .endm MSG

The following example shows a call and the expansion of the macro:

    MSG "WARNING", "test of warning message"
    .if eq, "NOTE", "WARNING"
        .print "test of warning message"
    .else
        .if eq, "WARNING", "WARNING"
    cam-171 cam: WARNING Line = 24, Column = 40, File = msg.s
    User defined WARNING number 1: "test of warning message"
            .warning "test of warning message"
        .else
            .if eq, "ERROR", "WARNING"
                .error "test of warning message"
            .else
                .error "MSG MACRO: Unknown message severity"
                .error "test of warning message"
            .endif
        .endif
    .endif

Micros [7]

Micros allow for the textual substitution of strings, integer expressions, or registers. Micros can be redefined and used in the body of a macro.

Micros consist of the following types:

• Register micros
• Numeric micros
• String micros
• Assembler-defined micros

Micro definitions are allowed anywhere within a source file; however, micro expansion does not occur in macro definition code, in conditional code that is skipped, or in repeat block definition code. To ensure correct recognition of conditional code and macro or repeat definitions, micros should not be inserted into assembler directives.

Note: Micros must be defined prior to using them in a macro definition. Micro definitions are visible from anywhere in the source file.
When a micro is redefined, the original definition is lost; however, previous definitions of items using the original micro do not change. References to micros are resolved when encountered by the assembler.

To reference a micro, enclose the micro name in grave accent characters (`) anywhere in a source statement. This includes strings that are enclosed in double quotation marks. Comments are not included (except for register micros). An error occurs if the micro name is unknown.

Note: The names of register micros do not have to be enclosed in grave accent characters (`).

When a source statement containing a micro is edited, the source statement is changed. The result of all substitutions produces a new source statement for processing.

Note: The unedited line is put into the listing file.

CAM resolves micro definitions immediately upon encountering them. This means that redefinition of a micro is allowed, but once a definition is resolved it remains unchanged (resubstitution does not occur).

The following example shows a legal micro substitution:

    str1 <- "abc"
    strZ <- "xyz`str1`"
    str1 <- "newstring"

The result of strZ is xyzabc, not xyznewstring.

Note: Micros can be defined anywhere in a program and remain visible to the rest of the source file.

7.1 Register micros

The register micro is set to any valid hardware user register. The register micro can be used only in valid register locations.

The following example illustrates the use of a register micro:

    .ident  example             ;Start assembly module
    Rresult <- R5               ;Integer register micro definition
    Fresult <- F5               ;Floating point register micro definition
    .psect  csec, code          ;Start code section csec
    addq    R1,R2,Rresult       ;Put sum of R1 and R2 in register
                                ;specified by micro Rresult
    adds    F1,F2,Fresult       ;Put sum of F1 and F2 in register
                                ;specified by micro Fresult
    .endp                       ;End code section csec
    .end                        ;End module example 1

7.2 Numeric micros

The numeric micro can be set to any valid integer expression. The identifier assigned to the integer value can be redefined. The numeric micro can be used only in places where integer expressions are valid operands. The use of the numeric micro requires delimiters. The grave accent (`) character must delimit the micro name at the point of use.

A numeric micro can also be used as part of a label to create a sequence of numbered labels. For example:

    .ident  example             ;Start assembly module
    num1 <- 10                  ;Numeric micro definition
    tmp1 = 1                    ;Set local assembler variable
    .psect  dsec,data           ;Start data section dsec
    .repeat 20                  ;Start repeat block
    count <- tmp1               ;Define/redefine numeric micro
    lab`count`:                 ;Create unique local label
    .quad   tmp1                ;Quad data specification
    tmp1 = tmp1 + 1             ;Increment tmp variable
    .endr                       ;End repeat block
    .endp                       ;End data section dsec
    .psect  csec,code           ;Start code section csec
    addq    R1, `num1`, R3      ;Put sum of num1 and R1 in R3
    .endp                       ;End code section csec
    .end                        ;End module example 1

7.3 String micros

The string micro can be set to any valid string. The identifier assigned to the string can be redefined. The string micro can be used only in places where strings are valid operands. The use of the string micro requires delimiters. The grave accent (`) character must delimit the micro name at the point of use.
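The resolution rules for numeric and string micros (references are resolved when encountered, and an earlier result is never resubstituted) can be illustrated with a toy model. The Python sketch below is not CAM code; MicroTable and its methods are hypothetical names, and the grave accent is written as the ASCII backquote.

```python
import re

class MicroTable:
    """Toy model of micro definition and reference."""

    def __init__(self):
        self.micros = {}

    def substitute(self, text):
        """Replace every `name` reference in text with the micro's value."""
        def lookup(match):
            name = match.group(1)
            if name not in self.micros:
                raise KeyError("unknown micro: " + name)
            return self.micros[name]
        return re.sub(r"`([^`]+)`", lookup, text)

    def define(self, name, value):
        """Define or redefine a micro; references in value resolve now."""
        self.micros[name] = self.substitute(str(value))

table = MicroTable()
table.define("str1", "abc")
table.define("strZ", "xyz`str1`")
table.define("str1", "newstring")
print(table.micros["strZ"])             # earlier result is not resubstituted

labels = []
for tmp1 in range(1, 4):
    table.define("count", tmp1)         # numeric micro redefined each pass
    labels.append(table.substitute("lab`count`:"))
print(labels)
```

The first part mirrors the str1 and strZ example, and the loop mirrors the first three passes of the repeat block that builds numbered labels.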
The following example illustrates the use of string micros:

    .ident  example             ;Start assembly module
    str_1 <- "message string"   ;String micro definition
    .psect  csec,code           ;Start code section csec
    bis     R1,R2,R3            ;Arbitrary user instruction
    .print  `str_1`             ;Print message, supplied by
                                ;string micro str_1
    .endp                       ;End code section csec
    .end                        ;End module example 1

7.4 Assembler-defined micros

The following predefined micros are available in the CAM assembler:

    _TIME_      A string micro that specifies the time of assembly.
    _DATE_      A string micro that specifies the date of assembly.
    _FILE_      A string micro that names the source file being assembled. This micro is affected by the #line preprocessor directive.
    _ERRNO_     A numeric micro that reflects the number of the last assembler message.
    _LINENO_    A numeric micro that reflects the current line number. This micro is affected by the #line preprocessor directive.

Note: References to assembler-defined micros must be enclosed in grave accent (`) characters.

Interlanguage Calling Protocol [A]

This appendix describes the interlanguage calling protocol associated with Cray Assembler for MPP (CAM) 2.2. Topics covered in this appendix include subroutine linkage and the calling sequence for Cray MPP systems.

A.1 Subroutine linkage

When using a higher-level programming language, such as C or Fortran, you do not need to be concerned about the low-level details of subprogram linkage. Subprogram linkage includes such tasks as calling subprograms, passing arguments to and from subprograms, allocating and placing local variables, and so on. When using CAM, however, you must be concerned with the details of subroutine linkage as well as the instruction set related to the MPP processor and the capabilities of the assembler.
The general form and structure of the Cray MPP systems linkage macros are derived from the UNICOS linkage macros. Familiarity with other Cray Research assembly languages and the UNICOS assembly language linkage macros may be helpful. The UNICOS assembly language linkage macros are described in the UNICOS Macros and Opdefs Reference Manual, publication SR–2403.

Note: In this appendix, names of macros appear in uppercase.

A.1.1 Accessing the linkage macros

The assembly language linkage macros for Cray MPP systems are located in the /mpp/asdef.h file. They are available to a program when the following preprocessor directive is used:

    #include

A.1.2 Register use conventions

The linkage macros described in this appendix adhere to the Cray MPP systems register naming conventions and use the following conventions:

• The scratch registers (t0 through t13 and ft0 through ft13) and return value registers (v0, and fv0 through fv1) are intended for user-defined tasks at the programmer's discretion.

• The save registers (s0 through s5 and fs0 through fs7) can be used in linkage macros, provided that they are saved and restored.

  Note: The save registers are automatically saved and restored for user-level routines and optionally saved and restored in leaf-level routines by the ENTER and EXIT macros.

• The argument registers (a0 through a5 and fa0 through fa5) can be used once any argument contained within the register has been used.

• The return address (ra) and call information (ci) registers can be used freely in user-level routines. If they are used in leaf-level routines, however, they must be restored prior to using the EXIT macro.

• The frame pointer (fp) and stack pointer (sp) registers should not be modified by the user.

For more information on register use conventions, see Section A.2.5.3, page 158.
A.1.3 Linkage macro descriptions

The following subsections describe the linkage macros available to CAM programmers. Each subsection describes a group of linkage macros that are related by their functions.

A.1.3.1 The ALLOC, LOAD, and STORE macros

The ALLOC macro allows you to define variables that are local to the subprogram in terms of 64-bit words of stack space. The ALLOC macro must be used prior to using the ENTER macro. The LOAD and STORE macros can then be used to access these local variables.

The format of the ALLOC macro is as follows:

    ALLOC name[,SIZE=n][,ALIGN=alignment]

In the preceding format, the optional SIZE parameter indicates the number of 64-bit words of stack space (n) to be allocated to the variable specified by name. If the SIZE parameter is not specified, it defaults to 1 word.

The optional ALIGN parameter specifies the alignment of the variable name, as follows:

• If quad is specified for alignment, the variable is aligned on quad word (8 byte) boundaries. This is the default alignment.

• If cache is specified for alignment, the variable is aligned on cache word (32 byte) boundaries.

The LOAD and STORE macros access local variables defined by the ALLOC macro or arguments passed by address.

The format of the LOAD macro is as follows:

    LOAD REGX,name[,INDEX=REGI][,USE=REGJ]

The format of the STORE macro is as follows:

    STORE REGX,name[,INDEX=REGI][,USE=REGJ]

In the preceding formats, REGX is the register that receives or provides the value of a local variable and name is the symbolic name assigned to the first word of stack space allocated by the ALLOC macro. The INDEX and USE parameters are optional and specify registers that index arrays. (REGI and REGJ are integer registers.)
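The arithmetic behind SIZE, ALIGN, and INDEX can be sketched in a few lines of Python. This is an illustration only: the manual does not state how ALLOC orders variables on the stack, so the sequential layout from offset 0 used here is an assumption, and layout, align_up, and element_offset are hypothetical names. What the sketch does take from the text is the 8-byte word, the 8-byte quad boundary, and the 32-byte cache boundary.

```python
WORD_BYTES = 8                      # one 64-bit word
ALIGN_BYTES = {"quad": 8, "cache": 32}

def align_up(offset, boundary):
    """Round offset up to the next multiple of boundary."""
    return (offset + boundary - 1) // boundary * boundary

def layout(allocs):
    """Assign a byte offset to each (name, size_in_words, align) entry.

    Assumption: variables are placed one after another from offset 0.
    """
    offsets, cursor = {}, 0
    for name, size, align in allocs:
        cursor = align_up(cursor, ALIGN_BYTES[align])
        offsets[name] = cursor
        cursor += size * WORD_BYTES
    return offsets

def element_offset(offsets, name, index=0):
    """Byte offset of word `index` within the variable `name`."""
    return offsets[name] + index * WORD_BYTES

offsets = layout([("temp", 1, "quad"),
                  ("array", 100, "cache"),
                  ("complex", 2, "quad")])
print(offsets)
print(element_offset(offsets, "array", 3))
```

The last line shows the word-index to byte-offset conversion that an indexed LOAD or STORE needs: word 3 of a variable lies 24 bytes past its first word.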
The following example demonstrates the use of the ALLOC and LOAD macros:

    ALLOC   temp                    ;Temporary word
    ALLOC   array,SIZE=100          ;100-word array
    ALLOC   complex,SIZE=2          ;Complex value
    ...
    LOAD    t2,temp                 ;Load temp
    LOAD    ft2,array,INDEX=t3      ;Load array[t3]

If the INDEX register is not specified, it defaults to zero (R31) and no indexing occurs. If the INDEX register is specified, the USE register defaults to t0 and indexing occurs based on the value in t0. If the USE register is specified, indexing occurs based on the value in the specified register to assist in converting the word index to a byte address.

A.1.3.2 The DEFARG, ENTER, and EXIT macros

The ENTER and EXIT macros, like their UNICOS counterparts, control the basic subprogram entry and exit process. They are responsible for declaring the entry point, updating the stack pointer, saving and restoring certain registers, and providing an interface to the arguments.

The DEFARG macro lets the user declare formal arguments. Compared with the UNICOS calling sequence, the MPP calling sequence is considerably more complicated because some arguments (up to the first six) are passed in registers, and floating-point and nonfloating-point arguments are passed in different registers.

The format of the DEFARG macro is as follows:

    DEFARG name[,SIZE=n][,TYPE=argtype][,LOCATION=loc]

In the preceding format, name is the symbolic name of the formal argument and the optional SIZE parameter indicates the number of 64-bit words (n) in the argument (defaults to 1).

The optional TYPE parameter indicates the type of argument (argtype), as follows:

• If address is specified for argtype, the argument is an address and can be passed through an integer register (default).
• If shared is specified for argtype, the argument is the address of a shared data descriptor (SDD) and can be passed through an integer register (deferred).

• If integer is specified for argtype, the argument is a value and can be passed through an integer register.

• If float is specified for argtype, the argument is a 32-bit floating-point value and can be passed through a floating-point register.

• If double is specified for argtype, the argument is a 64-bit floating-point value and can be passed through a floating-point register.

• If quad is specified for argtype, the argument is a 128-bit floating-point value and can be passed through a floating-point register (deferred).

The optional LOCATION parameter indicates the residence of the argument (loc), as follows:

• If register is specified as the residence of the argument, the argument is eligible to be passed through a register (default).

• If memory is specified as the residence of the argument, the argument must be passed through memory.

Note: Fortran passes all arguments by address, and all addresses are eligible to be passed through integer registers. However, no more than six arguments are passed through registers, and once a memory argument is specified, all subsequent arguments are passed through memory. Multiword arguments are not split between registers and memory.

The LOAD and STORE macros can be used to access arguments passed by address.

The following example demonstrates the use of the DEFARG macro:

    DEFARG  I,TYPE=integer              ;An integer argument
    DEFARG  X,TYPE=float,SIZE=2         ;2-word floating point arg.
    DEFARG  J,TYPE=address              ;Address argument
    DEFARG  Q,LOCATION=memory,SIZE=2    ;2-word Memory argument
    DEFARG  Y,TYPE=float                ;floating point argument
    ...
    LOAD    t2,J                        ;Load (J) into integer reg.
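The placement rules in the preceding note, applied to this example, can be modelled as follows. The Python sketch is a reading of the rules, not the macro package's implementation: assign_arguments is a hypothetical name, register numbers are taken from the argument's word position, and the treatment of a multiword argument that no longer fits in the six register slots (it and all later arguments go to memory) is an assumption.

```python
MAX_REGISTER_SLOTS = 6              # a0-a5 or fa0-fa5
FLOAT_TYPES = {"float", "double", "quad"}

def assign_arguments(args):
    """Map each (name, type, size, location) argument to registers or memory."""
    placement = {}
    slot = 0
    in_memory = False
    for name, argtype, size, location in args:
        fits = slot + size <= MAX_REGISTER_SLOTS
        if in_memory or location == "memory" or not fits:
            in_memory = True        # all later arguments follow into memory
            placement[name] = "memory"
            continue
        prefix = "fa" if argtype in FLOAT_TYPES else "a"
        placement[name] = [prefix + str(slot + i) for i in range(size)]
        slot += size
    return placement

example = [
    ("I", "integer", 1, "register"),
    ("X", "float", 2, "register"),
    ("J", "address", 1, "register"),
    ("Q", "address", 2, "memory"),
    ("Y", "float", 1, "register"),
]
for name, where in assign_arguments(example).items():
    print(name, where)
```

For the five DEFARG declarations above, the model yields the register and memory assignment that the following paragraph describes.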
In this example, the first three arguments are passed through registers (argument Q must be passed through memory and, therefore, argument Y is also passed through memory). Argument I is placed in register a0, argument X is placed in registers fa1 and fa2, and argument J is placed in register a3.

Note: The DEFARG macro must be used prior to using the ENTER macro.

The ENTER macro declares the actual subprogram name and generates the entry code sequence. The format of the ENTER macro is as follows:

    ENTER name[,NUMARG=REGI][,LEVEL=level][,FSAVE=m][,RSAVE=n]

In the preceding format, name specifies the name of the subprogram.

The optional NUMARG parameter specifies the integer register that receives the argument word count (REGI). The NUMARG register defaults to zero (R31), meaning the argument word count is not loaded into a register.

The optional LEVEL parameter specifies the level of the subprogram (level). It controls the assumptions made by the ENTER and EXIT macros with respect to registers saved, subprograms called, and so on. The LEVEL parameter is currently specified as follows:
• If leaf is specified for level, the subprogram is a leaf-level subprogram.
• If user is specified for level, the subprogram is a user-level subprogram.
• If baselvl or library is specified for level, it is the same as specifying it as a leaf-level subprogram.

The difference between a leaf-level subprogram and a user-level subprogram is register usage. If a subprogram is specified as leaf-level, the save registers (s0 through s5 and fs0 through fs7) are not automatically saved and restored by the ENTER and EXIT macros, and the ci and ra registers must not be altered.

The optional FSAVE (m) and RSAVE (n) parameters specify the number of floating-point and integer save registers, respectively, to be automatically saved and restored by the subprogram.
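Returning to the DEFARG example earlier in this subsection, the register placement it describes can be modeled in a few lines. The following Python sketch is only an illustration and is not part of CAM: the function name assign_arguments and the tuple layout are hypothetical, and the rule that argument word k uses register a<k> or fa<k> is inferred from the placement of I, X, and J in that example. The sketch also assumes that an argument too large to fit in the remaining register words is placed in memory.

```python
# Hypothetical model of DEFARG argument placement; not part of CAM.
FLOAT_TYPES = {"float", "double", "quad"}
MAX_REG_WORDS = 6  # at most six argument words travel in registers


def assign_arguments(args):
    """Map each argument to a list of register names or to "memory".

    args is a list of (name, argtype, size_in_words, location) tuples.
    Assumption: argument word k uses a<k> or fa<k>, as in the DEFARG example.
    """
    placement = {}
    word = 0           # position of the next argument word in the call list
    in_memory = False  # becomes True at the first memory argument
    for name, argtype, size, location in args:
        fits = word + size <= MAX_REG_WORDS  # multiword args are never split
        if in_memory or location == "memory" or not fits:
            in_memory = True  # all subsequent arguments also go to memory
            placement[name] = "memory"
        else:
            prefix = "fa" if argtype in FLOAT_TYPES else "a"
            placement[name] = [f"{prefix}{word + i}" for i in range(size)]
        word += size
    return placement


example = [
    ("I", "integer", 1, "register"),
    ("X", "float", 2, "register"),
    ("J", "address", 1, "register"),
    ("Q", "address", 2, "memory"),   # TYPE defaults to address
    ("Y", "float", 1, "register"),
]
print(assign_arguments(example))
```

For the five arguments of the example, the model places I in a0, X in fa1 and fa2, J in a3, and both Q and Y in memory, which agrees with the placement stated in the text.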
The FSAVE and RSAVE counts default to 6 and 8, respectively, for user-level subprograms, and to 0 for leaf-level subprograms.

The EXIT macro restores the registers saved by the ENTER macro and returns to the calling program. You are responsible for setting the return value (v0 or fv0 and fv1) registers appropriately. The format of the EXIT macro is as follows:

    EXIT

The following example demonstrates the use of the ENTER and EXIT macros:

    ENTER   subprogram R10,4,5,6
    ...
    EXIT

A.1.3.3 The ADDRESS and VALUE macros

The ADDRESS macro lets the user obtain the address of a call-by-address subprogram argument, local stack variable, or static variable. The format of the ADDRESS macro is as follows:

    ADDRESS REGI,name

In the preceding format, name specifies one of the following: a formal argument that is passed by address and defined by the DEFARG macro (with a TYPE of address or shared), a local variable (allocated by the ALLOC macro), or a static variable. The address of name is placed in the integer register specified by REGI.

The following example demonstrates the use of the ADDRESS macro:

    ALLOC    Z,100            ;100-element local stack array
    DEFARG   Q,TYPE=address   ;Argument passed by address
    ADDRESS  t1,Q             ;Address of argument Q to t1
    ADDRESS  t2,Z             ;Address of Z[0] to t2
    ADDRESS  t3,K             ;Address of static datum K to t3

The VALUE macro lets the user obtain the value of a subprogram call-by-value argument. The format of the VALUE macro is as follows:

    VALUE REGX,name

The value of the subprogram call-by-value argument, as defined by the DEFARG macro with a TYPE of integer or float and specified by name, is placed in the register specified by REGX. REGX can be any register.

The following example demonstrates the use of the VALUE macro:

    DEFARG  I,TYPE=integer   ;Integer argument passed by value
    DEFARG  X,TYPE=float     ;Floating-point value
    DEFARG  Y,TYPE=float     ;Floating-point value
    ...
    VALUE   t1,I             ;Value of I to t1
    VALUE   ft2,X            ;Value of X to ft2
    VALUE   t3,Y             ;Value of Y to t3

Note: An integer register can be used to receive the value of a floating-point argument or vice versa. In this case, an additional memory store may be required.

A.1.3.4 The CALL and MXCALLEN macros

The CALL macro generates code that calls a subprogram with arguments that are passed by address. The format of the CALL macro is as follows:

    CALL name[,ARG1 [,ARG2 [...]]]

In the preceding format, name specifies the subprogram to be called and ARG1 through ARGn are the optional arguments. Arguments are separated by commas and can include any of the following:
• An integer register (for example, t5, s4, zero) that is presumed to contain an address. A null argument is assumed to be zero.
• A local variable declared using the ALLOC macro (the stack address of the declared variable will be passed).
• A formal argument declared using the DEFARG macro. It is the user's responsibility to ensure that the formal argument is a call-by-address argument.
• A static variable (the address of the static variable will be passed).

The CALL macro places the first six arguments in registers a0 through a5. If more than six arguments are specified, the remaining arguments are placed on the stack. The call information word (ci) register is set to indicate which arguments have been placed in registers. Finally, a transfer to the subprogram is initiated.

The following example demonstrates the use of the CALL macro:

    CALL subprogram arg1,arg2,arg3

The MXCALLEN macro declares the size of the largest argument list used in the subprogram.
This value is used by the ENTER macro to ensure that enough stack space is allocated to accommodate the largest possible call list that will be generated by either the CALL or SETARG macros. The format of the MXCALLEN macro is as follows:

    MXCALLEN n

In the preceding format, n specifies the number of words of stack space that are reserved for arguments.

Note: The MXCALLEN macro must precede the ENTER macro.

The following example demonstrates the use of the MXCALLEN macro:

    MXCALLEN 5   ;Reserves space for 5 words of args

A.1.3.5 The SETARG and CALLV macros

The SETARG and CALLV macros generate code that calls a subprogram. They provide more detailed control than the CALL macro and, in particular, allow call-by-value arguments. The subprogram arguments are set up by successive invocations of the SETARG macro in the desired order. The CALLV macro completes the call.

The format of the SETARG macro is as follows:

    SETARG object[,SIZE=n][,TYPE=argtype][,LOCATION=loc]

In the preceding format, object represents a register or the symbolic name of the argument. The optional SIZE parameter defines the number of 64-bit words (n) that are in the argument (defaults to 1).

The optional TYPE parameter specifies the type of the argument (argtype), as follows:
• If address is specified for argtype, the argument is an address and can be passed through an integer register. This is the default.
• If shared is specified for argtype, the argument is the address of a shared data descriptor and can be passed through an integer register (deferred).
• If integer is specified for argtype, the argument is a value and can be passed through an integer register.
• If float is specified for argtype, the argument is a 32-bit floating-point value and can be passed through a floating-point register.
• If double is specified for argtype, the argument is a 64-bit floating-point value and can be passed through a floating-point register.
• If quad is specified for argtype, the argument is a 128-bit floating-point value and can be passed through a floating-point register (deferred).

The optional LOCATION parameter specifies the location of the argument (loc), as follows:
• If register is specified for loc, the argument is eligible to be passed through a register. This is the default.
• If memory is specified for loc, the argument must be passed through memory.

Note: Fortran passes all arguments by address and all addresses are eligible to be passed through integer registers. However, no more than six arguments will be passed through registers and, once a memory argument is specified, all subsequent arguments will be passed through memory. Multiword arguments are not split between registers and memory.

The format of the CALLV macro is as follows:

    CALLV name

In the preceding format, name specifies the subprogram to be called.

Note: For a subprogram invocation with no arguments, the CALL and CALLV macros are equivalent.

For example, to call a subprogram with two arguments, one call-by-address and one call-by-value, the following code could be used:

    SETARG  t2,TYPE=address   ;t2 contains address of first argument
    SETARG  t4,TYPE=value     ;t4 contains value of second argument
    CALLV   name

The SETARG macro is permitted only when the CALLV macro is used; it is not used with the CALL macro. No subprogram calls or other uses of the argument registers should be initiated between the first SETARG macro and the CALLV macro.

A.1.3.6 The CRI_REGISTER_NAMES and CRI_STACK_DEFINITIONS macros

The CRI_REGISTER_NAMES and CRI_STACK_DEFINITIONS macros are provided to assist the assembly language programmer. These macros are called from within the ENTER macro.
The CRI_REGISTER_NAMES macro defines the register names identified in Table 18, page 155. These register names are obtained automatically when the ENTER macro is initiated. The CRI_STACK_DEFINITIONS macro defines the layout of the stack frame, primarily the dynamic subprogram information block (DSIB).

A.2 The calling sequence for Cray MPP systems

This subsection describes the calling sequence for Cray MPP systems, which have an underlying microprocessor that is based on the DECchip 21064 RISC microprocessor by Digital Equipment Corporation (DEC). The CRAY T3D system uses the Alpha EV-4 architecture and the CRAY T3E uses the Alpha EV-5 architecture. The calling convention must accommodate protocols for the C, CFT77, CF90, and assembly programming languages.

A.2.1 Background

The CRAY T3D calling sequence design is based on the DEC Alpha OSF/1 Calling Standard. The Alpha standard is not completely adequate for Cray MPP systems because features such as distributed data objects, memory-mapped registers, and the block transfer engine are not considered. The Alpha standard design was extended by Cray Research to include shared stacks, distributed data, and some capabilities that are specific to Cray MPP systems.

The Alpha architecture is often referred to as a base register architecture; that is, it does not allow instructions to contain full virtual addresses. This means that the implementation must define a method of putting the addresses of objects into a register. Rather than using a group pointer register, as is done in the Alpha standard, the Cray MPP convention uses a load-immediate-address macro. This allows a more efficient method of handling run-time constant expressions, such as those that are based on N$PES.

In general, there appears to be a need for both a shared and private version of the static, stack, and heap data segments.
The calling conventions apply only to static and stack data.

A.2.2 Differences

The CRAY T3E calling sequence is based on the CRAY T3D conventions. The calling sequence has been modified to support the new hardware, CF90 2.0, and Cray Research Adaptive Fortran (CRAFT) and to improve performance of the entry and exit sequences.

The largest change related to the new hardware involves conventions for E-register usage. Because cache coherence is maintained by the hardware, several minor changes have been made to remove software-controlled cache invalidation.

Shared data descriptor conventions are not required by CRAFT, so they have been removed. Shared-to-private coercion and the shared flag in the call information word are not necessary when using CRAFT and have been removed.

To improve the execution performance of entry and exit sequences, argument mismatch checking is performed only for routines with variable arguments, and a modified entry/exit sequence has been defined.

The subprogram linkage for the call information has also been changed. The call information is now allocated statically, and an index to the call information word is passed to the callee in the subprogram linkage. The numargs field has been split into two fields: the number of words in the call list and the number of arguments passed. In Fortran on the CRAY T3E system, numargs is the number of arguments rather than the number of words in the call list, as it is on the CRAY T3D system. In addition, many of the fields in the call information word (ci_word) have been reordered and renamed.

The Fortran character descriptor (FCD) has been changed. The first word remains the byte address of the character, but the second word now contains the length in bytes of the character. On the CRAY T3D system the length was in bits.
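The change in the FCD length unit is easy to get wrong when code is moved between the two systems. The following Python sketch models the two-word descriptor described above. It is an illustration only: the FCD class and the make_fcd and fcd_length_in_bytes helpers are hypothetical names, not definitions taken from stk.h or from any Cray library.

```python
from typing import NamedTuple

BITS_PER_BYTE = 8


class FCD(NamedTuple):
    """Two-word Fortran character descriptor (hypothetical Python model)."""
    address: int  # word 1: byte address of the character data
    length: int   # word 2: length in bytes (CRAY T3E) or in bits (CRAY T3D)


def make_fcd(address, nbytes, system="T3E"):
    """Build a descriptor for nbytes of character data on the given system."""
    if system == "T3E":
        return FCD(address, nbytes)
    if system == "T3D":
        return FCD(address, nbytes * BITS_PER_BYTE)
    raise ValueError(f"unknown system: {system}")


def fcd_length_in_bytes(fcd, system="T3E"):
    """Return the character length in bytes, whatever unit the system stores."""
    if system == "T3E":
        return fcd.length
    if system == "T3D":
        return fcd.length // BITS_PER_BYTE
    raise ValueError(f"unknown system: {system}")


t3e = make_fcd(0x1000, 10, "T3E")
t3d = make_fcd(0x1000, 10, "T3D")
print(t3e.length, t3d.length)  # 10 80
```

The same 10-character string therefore carries a second word of 10 on the CRAY T3E system and 80 on the CRAY T3D system, while the first word is unchanged.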
C and C++ programmers can access the definitions for most of the structures defined in this document by including the stk.h header file, as follows:

    #include <stk.h>

A.2.3 Contents of the calling sequence

The calling sequence defines the following items:
• Methods for referencing declared static and stack data
• Conventions for register use
• Sequence for program startup
• Code needed at a call site
• Methods used to pass arguments across subprogram boundaries
• Code needed at entry points to subprograms
• Code needed at exits from subprograms
• Termination of program execution

A.2.4 Terminology

This subsection describes the terminology associated with the calling sequences, as follows:
• Compilation unit
Any module of source code that can be compiled separately from the main body of a program. In C, this is an entire source file (a .c file); in Fortran, this is a subprogram; and in assembly language, this is a program module.
• Subprogram
A program, function, or subroutine in Fortran; a function in C; or an entry/exit sequence in assembly language. This appendix defines three types or levels of subprograms: user, leaf, and needle. User subprograms are standard subprograms. A leaf subprogram does not call other subprograms, and certain optimizations can be performed on the entry/exit sequence. A needle subprogram is a minimal subprogram that does not call another subprogram and does not have any stack variables (either private or shared). Additional optimizations can be performed on a needle subprogram.
• Stack
Memory allocated for data objects upon entrance to a subprogram and at certain times during program execution (for example, Fortran automatic arrays, C variable-length arrays, and dynamic temporary arrays used for intermediate results of array syntax expressions). This memory is freed automatically upon exit from the subprogram.
• Private stack
A stack that is private to a processing element (PE). Each PE has its own private stack; however, all private stacks begin at the same virtual address. Each private stack occupies contiguous virtual memory.
• Shared stack
A stack that is distributed across all existing PEs. All shared stacks are identical. Any space that is on an individual PE is contiguous for both the remote shared stack and local shared stack segments. The local shared segment allows access to that portion of the shared stack that is local to the PEs through faster, cacheable references. The top of the shared stack is stored in the memory location shared_stack_top.
• Data widths
Data widths are referred to as follows:
– char refers to 8-bit values
– short and half-precision refer to 32-bit values
– pointer, integer, and single-precision refer to 64-bit values
– double-precision and FCD refer to 128-bit floating-point values
– Complex refers to any of the forms of complex numbers as follows:
  • half-complex (two 32-bit floating-point numbers)
  • single-complex (two 64-bit floating-point numbers)
  • double-complex (two 128-bit floating-point numbers)
– Double-precision and double-complex are not currently implemented on Cray MPP systems
– All numbers, collectively, are referred to as scalar

A Fortran character descriptor is represented as a two-word structure; the first word is a C pointer and the second word is the length of the data, in bytes.

A.2.5 Register use conventions

The following subsections describe the register usage conventions assumed by Cray MPP calling sequences.
Registers are classified by the following types:
• Integer registers
• Floating-point registers
• E-registers (CRAY T3E systems only)

It is recommended that assembly language programmers use the symbolic names when referring to the registers in a program.

A.2.5.1 Integer register usage conventions

Table 18, page 155 describes the conventions for using the 32 integer registers on Cray MPP systems. Integer registers are designated as R0 through R31.

Table 18. Integer register usage conventions

Register    Symbolic name    Description
R0          v0               Return value register if the function returns an integer or pointer. This register may be modified by the callee without being saved and restored.
R1-R8       t0-t7            Scratch registers. These registers may be modified by the callee without being saved and restored.
R9-R14      s0-s5            Saved registers. If these registers are modified by the callee, they must be saved and restored.
R15         fp               Private frame pointer. This register is used to point at the base of the stack frame. It is used to restore the stack pointer (sp) on subprogram exit and by traceback during abnormal program termination. This register is saved, updated, and restored during subprogram entry and exit. Assembly language programmers should exercise caution when updating this register as it is used for traceback purposes in the event of an abnormal termination.
R16-R21     a0-a5            Argument registers. Up to six integer and pointer values of the calling list are passed in these registers. These registers may be modified by the callee without being saved and restored.
R22-R24     t8-t10           Scratch registers, continued.
R25         ci               Call information register. Contains the size of the call list, source line number of the call, size of the argument list, a bit-field indicating the type of each argument register, and the number of arguments passed in registers. It is saved during subprogram entry.
R26         ra               Return address register. The return address must be passed in this register. The contents of this register are saved during subprogram entry.
R27-R29     t11-t13          Scratch registers, continued.
R30         sp               Private stack pointer register. This register contains a pointer to the top of the private stack. This register is saved, updated, and restored during subprogram entry and exit. Assembly language programmers should exercise extreme caution when updating this register because the operating system is free to use any memory beyond the stack pointer.
R31         zero             ReadAsZero or Sink register. This register is defined by hardware. It takes binary zero as a source operand and sink (no effect) as a result operand.

A.2.5.2 Floating-point register usage conventions

Table 19, page 157 describes the conventions for using the 32 floating-point registers on Cray MPP systems. Floating-point registers are designated as F0 through F31.

Table 19. Floating-point register usage conventions

Register    Symbolic name    Description
F0          fv0              Register return value if a function returns a floating-point, real part of a complex result, or the most significant part of a double-precision result. This register may be modified by the callee without being saved and restored.
F1          fv1              Register return value for the imaginary part of a complex result or the least significant part of a double-precision result. This register may be modified by the callee without being saved and restored.
F2-F9       fs0-fs7          Saved registers. If these registers are modified by the callee, they must be saved and restored.
F10-F15     ft0-ft5          Scratch registers. These registers may be modified by the callee without being saved and restored.
F16-F21     fa0-fa5          Argument registers. Up to six floating-point arguments may be passed by value in these registers. These registers may be modified by the callee without being saved and restored.
F22-F30     ft6-ft14         Scratch registers, continued.
F31         fzero            ReadAsZero/Sink register. This register is hardware-defined to be floating-point zero as a source operand and sink (no effect) as a result operand.

A.2.5.3 E register usage conventions (CRAY T3E systems only)

Table 20, page 160 describes conventions for using 512 E registers on CRAY T3E systems. E registers are designated as E0 through E511 (see Figure 2, page 159). The more operands blocks of E registers (MOBEs) are allocated starting from the smallest E register address (&_E[0] in Figure 2, page 159). The source and destination E registers (SADEs) are allocated immediately after the last allocated MOBE. The values of save MOBEs are preserved across function boundaries, but the values of scratch MOBEs and all SADEs are not.

Figure 2. E register allocation. From &_E[0] upward: the stride-8 MOBE, the stride-4 MOBE, 2 callee save MOBEs, 4 scratch MOBEs, and 480 scratch SADEs ending at &_E[N-1]. Each MOBE consists of words MO+0 through MO+3.

Users and system software must adhere to the following conventions regarding the use of E registers. There are four groups of E registers, each with different uses.

Table 20. E register usage conventions (CRAY T3E systems only)

Register        Description
E[0]-E[3]       Stride-8 MOBE. Read-only block of E registers. This block of E registers contains a default centrifuge mask, a 0 offset, and a stride of 8 bytes. Word 3 is set to 0x100000000 for 32-bit fetch-and-increment via GET_ADD operations.
E[4]-E[6]       Stride-4 MOBE. Read-only partial MO block. This block of E registers contains a default centrifuge mask, a 0 offset, and a stride of 4 bytes.
E[7]            Word 3 of stride-4 MOBE. Word 3 of this MOBE block can be set to any value. This is useful for fetch-and-add operations.
E[8]-E[15]      2 callee save MOBEs. A subroutine must restore any of these E registers before returning to the caller.
E[16]-E[31]     4 scratch MOBEs. Any subroutine can use these E registers as MOBEs. Any subroutine can write to these E registers. These E registers can be used as SADEs only if they never go to full-fault state (_MPC_STC_FULL_F).
E[32]-E[511]    480 scratch SADEs. A subroutine can use these registers as SADEs. Any subroutine call can write to these E registers. These E registers can also be used as MOBEs; however, the user must ensure that any full-fault (_MPC_STC_FULL_F) state is cleared before they can be used.

A.2.6 Data structures

The data structures associated with the Cray MPP calling sequences are contained within stack frames. Stack frames are areas of memory specifically designated by the subprogram. Stack frames can be private or shared, depending on the type of data that must be referenced. The data structures associated with private and shared stack frames are described in the following subsections.

A.2.6.1 Private stack frame

The private stack frame contains all stack variables within a single invocation of a subprogram (see Figure 3, page 162). On CRAY T3E systems there is a fixed stack frame size that is known at compile time and a dynamic stack size that is allocated for Fortran automatic arrays, C variable length arrays, and dynamic temporary arrays used for intermediate results of array syntax expressions.

By convention, each private stack frame starts on a cache-word (32 byte) boundary.
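As a small worked example of the cache-word convention, the following Python sketch rounds an address to a 32-byte boundary. It assumes that alignment moves the address toward low memory, because the private stack grows in that direction; the helper names align_down and padding_below are hypothetical and do not come from the manual or from any Cray header.

```python
CACHE_WORD_BYTES = 32  # cache-word size given in the text


def align_down(address, alignment=CACHE_WORD_BYTES):
    """Round address down to a multiple of alignment (a power of two)."""
    return address & ~(alignment - 1)


def padding_below(address, alignment=CACHE_WORD_BYTES):
    """Number of pad bytes skipped when address is aligned downward."""
    return address - align_down(address, alignment)


candidate = 0x1F58
print(hex(align_down(candidate)), padding_below(candidate))  # 0x1f40 24
```

An address that is already a multiple of 32 is returned unchanged and needs no padding.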
In addition, various languages may permit private stack variables to be aligned on a cache-word boundary. In either case, padding is added, as needed.

Figure 3. Private stack frame. From low memory (sp) to high memory (fp): calling list, private stack variables, hosted stack variables, register save area, and dynamic subprogram information block (DSIB). The stack grows toward low memory.

The private stack frame includes the following components (see Figure 3, page 162):
• Calling list
• Private stack variables
• Hosted stack variables
• Register save area
• Dynamic subprogram information block (DSIB)

A.2.6.1.1 Calling list

The calling list contains the arguments for any called procedures. The size of the calling list is determined by the maximum call list size for all the calls in the calling procedure. Each argument is a word-aligned entry that contains an address, an aggregate return pointer (see section 6.3), a value, or a pointer to a Fortran 90 dope vector.

Arguments can be either aggregates or private/shared data arguments. The contents of the calling list are as follows:
• A hidden first argument, which is actually a pointer to the memory location of an aggregate. Most high-level languages allow functions that return structures or aggregates. Since these aggregates can be of arbitrary size, they are not returned in registers. Using this argument, the callee then stores the value of the aggregate in memory instead of placing it in registers. Compilers automatically generate code to set and use aggregate return pointer arguments; assembly language programs must do this explicitly. In C and Fortran, double-precision, half-complex, and single-complex values are always returned in two registers.
In C, all structures and unions, regardless of size, are considered aggregates. A Fortran CHARACTER value is also considered to be an aggregate where the hidden first argument is a Fortran character descriptor.
• A private data argument that represents a value or a virtual address on the processing element (PE). Because it is a pointer to private (PE local) data, the sign bit is not set.

A.2.6.1.2 Private stack variables

The private stack variables area of the private stack frame is designated to contain the variables that are associated with the subprogram. It expands and contracts depending on the number of variables.

A.2.6.1.3 Hosted stack variables

The hosted stack variables area of the private stack frame is designated to contain the variables that are associated with the host (parent) procedure in CF90. It expands and contracts depending on the number of variables in the host procedure.

A.2.6.1.4 Register save area

The register save area is the area of the private stack frame immediately following the dynamic subprogram information block (DSIB), where registers used by the subprogram are saved. First the integer (R) registers are saved, then the floating-point (F) registers, in descending order (R31, R30, ...R1, R0, F31, F30, ...F1, F0), and, finally, the E registers.

Note: Normally, not all registers are saved. The static subprogram information block (SSIB) contains register bit maps that indicate the registers saved by each subprogram.

A.2.6.1.5 Dynamic subprogram information block

The dynamic subprogram information block (DSIB) includes information generated at execution time about the subprogram. The DSIB includes the context of the caller and information about the call (see Figure 4, page 165).
Figure 4. Dynamic subprogram information block (DSIB). Byte offsets are relative to fp:

Byte offset    Contents
-48            Parent procedure pointer (f90 procedures only)
-40            ci, call information word
-32            SSIB, static subprogram information block address
-24            shared_stack_top, caller's shared_stack_top
-16            fp, caller's frame pointer
-8             ra, return address

The contents of the DSIB are as follows (see Figure 4, page 165):
• The parent procedure pointer pertains only to Fortran 90 procedures. It contains a pointer to the Fortran procedure that is doing the calling.
• The call information word (see Figure 5, page 166) contains information that is passed from each call site to the callee. An index to this information is passed by the caller in register ci.

Figure 5. Call information word:

Bits     Field
63-52    argucnt
51-32    argcnt
31-16    lineno
15       A
14-8     u
7-3      at
2-0      aw

The contents of the call information word are as follows (see Figure 5, page 166):
– argucnt is 12 bits long and contains the number of user arguments in the calling list.
– argcnt is 20 bits long and contains the number of words in the calling list.
– lineno is 16 bits long and contains the source line number of the calling program.
– A is 1 bit. It is set if run-time argument checking is enabled.
– u is 7 bits long and is reserved for future use.
– at is 5 bits long. Each bit indicates the type of the corresponding argument register (0: integer, 1: floating-point) and the number of arguments passed in registers.
– aw is 3 bits long and contains the number of argument words passed in registers. Up to 6 words can be passed in registers.

The at field is a 5-bit field where each bit indicates whether the corresponding argument is in an integer or floating-point register.
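The field widths listed above add up to exactly 64 bits, so the call information word can be packed and unpacked with shifts and masks. The following Python sketch illustrates this under stated assumptions: the fields are taken in the order shown in Figure 5, from the most significant bit (argucnt) to the least significant bit (aw), and the names CI_FIELDS, pack_ci_word, and unpack_ci_word are hypothetical rather than definitions from stk.h.

```python
# (name, width in bits), most significant field first, as in Figure 5.
CI_FIELDS = [
    ("argucnt", 12),  # number of user arguments in the calling list
    ("argcnt", 20),   # number of words in the calling list
    ("lineno", 16),   # source line number of the call
    ("A", 1),         # set if run-time argument checking is enabled
    ("u", 7),         # reserved
    ("at", 5),        # argument register types (0: integer, 1: floating-point)
    ("aw", 3),        # number of argument words passed in registers
]
assert sum(width for _, width in CI_FIELDS) == 64


def pack_ci_word(**values):
    """Pack named field values into one 64-bit integer; missing fields are 0."""
    word = 0
    for name, width in CI_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_ci_word(word):
    """Split a 64-bit call information word into a dict of field values."""
    fields = {}
    shift = 64
    for name, width in CI_FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


ci = pack_ci_word(argucnt=3, argcnt=4, lineno=120, A=1, at=0b00010, aw=4)
print(unpack_ci_word(ci))
```

The sketch does not interpret the individual bits of at, because the mapping between bit positions and argument registers is not stated at this point in the text.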
The aw field is a 3-bit field that indicates the number of argument words that are passed in registers (up to a maximum of 6). An aw value of 6 (110) means that the sixth parameter is in an R register; a value of 7 (111) indicates that there are still only six parameters, but the sixth parameter is in an F register.

When character or 32-bit values are passed by value (or returned as the result of a function) they are right-justified without sign extension. When they are passed by address, the callee is responsible for a proper declaration that ensures the appropriate load/store instruction is generated (for example, LDL/STL vs. LDL/STS).
• The DSIB contains the address of the static subprogram information block (SSIB), which contains static information generated at compile time about the subprogram (see Figure 6, page 167).

Figure 6. Static subprogram information block (SSIB):

Word offset    Contents
0              U (bits 63-60), fpcount (59-48), ci_info (47-40), ver (39-32), kind (31-24), lang (23-16), len (15-8), length (7-0)
1              saved_R_bit_map, saved_F_bit_map
2              entry point address
3              language specific information
4              machine specific information (saved_E)
5              subprogram name

The contents of the SSIB are as follows:
– The U field is 4 bits long and is not used.
– The fpcount field is 12 bits long and contains the number of formal arguments.
– The ci_info field is 8 bits long and includes the callee's arg_reg_type and #reg_args in the same format as the lower 8 bits of the call information word.
– The ver field is 8 bits long and includes the version number of the SSIB block.
– The kind field is 8 bits long and indicates the kind of subprogram (for example, 1 if leaf, 0 if nonleaf).
– The lang field is 8 bits long and specifies the language of the subprogram (0 for CFT/CFT2, 1 for Old PCC, 2 for Pascal, 3 for CFT77, 4 for CAL/assembly language, 5 for CFT90, 6 for C++, 7 is unassigned, 8 for SCC, 9 for Ada, 10 for LISP, and 11 for PCC).
– The len field is 8 bits long and contains the length of the subprogram name in bytes, excluding the null terminator character.
– The length field is 8 bits long and specifies the length of the SSIB (excluding the name) in bytes.
– The saved_R_bit_map field is 32 bits long and includes a bit map for R registers (the rightmost bit represents R0 and the leftmost bit represents R31). The bit is set for each callee saved/restored register.
– The saved_F_bit_map field is 32 bits long and has the same format as saved_R_bit_map.
– The entry-point address is 64 bits long and contains the address of the subprogram entry point.
– The language specific information is 64 bits long and is reserved for language implementation.
– The machine specific information is 64 bits long and is reserved for machine characteristics, such as saved_E (number of callee saved/restored E registers), starting at E0.
– The subprogram name is variable in length and contains the name of the subprogram in ASCII, terminated by a null character.

A.2.6.1.6 The call information index

The call information index is a quad word offset from the SSIB that points to the call information word. The index can be positive, negative, or zero. A zero value indicates that no call information is available for the call. Note that the call information index is relative to the caller’s SSIB, not the callee’s SSIB.

A.2.6.2 Shared stack frame

A shared stack frame contains the memory for all SHARED stack variables in a single invocation of a subprogram.
There is a fixed stack frame size that is allocated upon entrance to a subprogram, and a dynamic stack size that is allocated for Fortran automatic arrays, C variable length arrays, and dynamic temporary arrays. The dynamic stack is used for intermediate results of array syntax expressions. The base of the shared frame pointer (shared_stack_top) is stored on the private stack for routines that use the shared stack. It is undefined if the shared stack is not used (see Figure 7). Subprograms that do not use the shared stack have no overhead associated with managing the shared stack.

Low memory
  shared_stack_top ->            SHARED stack variables and compiler temporaries (variable size)
                                 SHARED stack variables and compiler temporaries (fixed size)
  Shared stack frame pointer ->
High memory
(Stack grows toward low memory)

Figure 7. Shared stack frame

A.2.7 Calling sequence elements

The elements of the calling sequence are as follows:

• Program start-up state
• User subprogram entry and exit
• Leaf and needle subprogram entry/exit
• Call site actions
• Subprogram exit

The subsections that follow describe each of these elements.

A.2.7.1 Program start-up state

At program start-up time, certain memory locations and registers must be initialized. The argv and envp pointers are arguments to the main entry point. Memory locations are initialized as follows:
Memory location             Value
MY_PE()                     Each PE has its own identity
argv                        Invocation arguments
envp                        Environment pointer
shared_stack_top            Initial top of shared stack
Parallel Region flag        Initialized to TRUE
Work Sharing Region flag    Initialized to FALSE

Registers are initialized as follows:

Registers     Description
sp and fp     Initial private stack limits
ra            Address of exit routine

A.2.7.2 User subprogram entry and exit

The tasks performed upon entry into a subprogram are determined by the type of routine being called and whether it uses the SHARED stack.

Note: Assembly language programs should use the ENTER macro.

The compiler classifies routines into leaf and nonleaf routines. Nonleaf routines are routines that call other procedures. Leaf routines do not call other procedures. You must determine the routine category before determining the calling sequence.

The tasks that are performed by the compiler upon entry into a subprogram are as follows:

1. Declare the entry point. Regardless of the type of the routine, include the entry label for the subprogram, as follows:

    proc_name::

2. Each subprogram must allocate stack space for the following:

• Dynamic subprogram information block (DSIB)
• Saved registers, if any
• Local variables, if any
• Calling list (nonleaf routines only)

The operating system increases the size of the stack on any reference that is not valid. However, all memory below the private stack pointer (that is, at lower addresses) belongs to the operating system, and any references to that memory are unpredictable.

To make sure that the traceback routines always find a consistent stack, the stack pointer is updated first and then all locations in the DSIB are set. Finally, the frame pointer is updated so that it defines the subprogram in which the program is executing.
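The ordering described above (stack pointer first, then the DSIB, then the frame pointer) can be modeled in a few lines. This is a minimal sketch under stated assumptions, not the code the compiler emits: memory is modeled as a Python dictionary keyed by byte address, the function name enter_frame and its parameters are hypothetical, and the DSIB byte offsets are taken from Figure 4.

```python
# Hypothetical model of private stack frame setup.
# Assumption: DSIB byte offsets relative to the new fp follow Figure 4.
DSIB_OFFSETS = {
    "ra": -8,
    "fp": -16,
    "shared_stack_top": -24,
    "ssib": -32,
    "ci": -40,
    "parent": -48,  # Fortran 90 only; not written in this sketch
}


def enter_frame(memory, sp, caller_fp, ra, ssib_addr, ci, fixed_frame_size):
    """Allocate a private stack frame and fill in the DSIB.

    Returns the new (sp, fp) pair. The stack grows toward low memory.
    """
    old_sp = sp
    # 1. Update the stack pointer first, so every DSIB store below lands
    #    at or above sp (memory below sp belongs to the operating system).
    sp = old_sp - fixed_frame_size
    # 2. Fill in the DSIB, addressed from the old stack pointer.
    memory[old_sp + DSIB_OFFSETS["ra"]] = ra
    memory[old_sp + DSIB_OFFSETS["fp"]] = caller_fp
    memory[old_sp + DSIB_OFFSETS["shared_stack_top"]] = 0
    memory[old_sp + DSIB_OFFSETS["ssib"]] = ssib_addr
    memory[old_sp + DSIB_OFFSETS["ci"]] = ci
    # 3. Only now does the frame pointer identify the new subprogram.
    new_fp = old_sp
    return sp, new_fp
```

Because the frame pointer is set last, a traceback that starts from fp always finds a fully initialized DSIB.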
The following example illustrates the allocation of stack space for a new private stack frame:

    bis     sp,sp,t0                ;Save old stack pointer
    la      t1,fixed_frame_size     ;Size of private stack frame
    subq    sp,t1,sp                ;Update stack pointer
    stq     ra,-8(t0)               ;Save return address
    stq     fp,-16(t0)              ;Save caller’s frame pointer
    stq     zero,-24(t0)            ;Clear shared stack top
    la      t1,SSIB(zero)           ;Address of static subprogram information block
    stq     t1,-32(t0)              ;Save pointer to SSIB
    stq     ci,-40(t0)              ;Save call information word
    bis     t0,t0,fp                ;New fp is old sp

Note: Some of the code sequences in the preceding example can and will be optimized. For example, the second and third instructions could be consolidated into a single instruction if the fixed_frame_size is sufficiently small.

3. Check for argument mismatch. The call information word (ci register) contains information about the number and location (integer register, floating-point register, or memory) of the arguments. If there is a mismatch between what the caller passed and what the callee expected, or if the callee is doing variable-argument (varargs) processing, control branches to a section of code that moves the arguments to their expected locations and then returns to the subprogram entry sequence.

The following example illustrates the check for argument mismatch:

    ldq     t2,-32(t2)              ;Load address of caller’s SSIB
    ldq     t0,0(t2)                ;Set formal argument info from SSIB
    s8addq  t2,ci,t2                ;Add call information word index to SSIB
    ldq     t1,0(t2)                ;Load call information word
    bsr     t2,_T3E_MISMATCH        ;Check for argument mismatch

Note: No argument mismatch code is generated if a subprogram has no formal arguments.

4. Save any callee registers. The following code stores callee save registers s4 and s5:

    stq     s5,-48(fp)
    stq     s4,-56(fp)
If the subprogram uses any MOBE save E registers, they must be saved. The following example illustrates how these registers are saved:

    lda     t3,-64(fp)              ;Address of E register save area
    lda     t1,num_Es(zero)         ;Number of requested E registers
    la      t4,_E(zero)             ;&_E[0]
    s8addq  t4,t3,t4                ;&_E[IMOB]
loop:
    ldq     t5,0(t4)                ;Load MO+0 from _E
    ldq     t6,8(t4)                ;Load MO+1 from E
    ldq     t7,16(t4)               ;Load MO+2 from E
    ldq     t8,24(t4)               ;Load MO+3 from E
    stq     t5,0(t3)                ;Store MO+0 to stack
    stq     t6,8(t3)                ;Store MO+1 to stack
    stq     t7,16(t3)               ;Store MO+2 to stack
    stq     t8,24(t3)               ;Store MO+3 to stack
    s8addq  t4,4,t4                 ;Next MO in E registers
    s8addq  t3,4,t3                 ;Next MO stack address
    subq    t1,4,t1                 ;Decrement counter
    ble     t1,loop                 ;If more MOBEs to save

5. Set up the shared stack frame. If the shared stack frame is not used in this subprogram, this step is not necessary; otherwise, set up the new shared stack frame. A barrier is required after allocating any shared stack space.

The value of the memory location shared_stack_top is defined on all PEs at program startup. The value of shared_stack_top should remain consistent across all PEs for the life of the program. When shared stack space is allocated, all processors must store the current stack top in their respective private stack frames and store the new stack top in the memory location shared_stack_top.
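The bookkeeping described in this step can be sketched as a small simulation. The code below is an illustration only: the class PE and the functions allocate_shared and free_shared are hypothetical names, the barrier is not modeled (the PEs are simply processed in a loop), and the per-PE list saved_tops stands in for the shared_stack_top slot of each active private stack frame. The release function mirrors the exit step, in which the saved value becomes the new shared_stack_top.

```python
# Hypothetical simulation of shared stack allocation across PEs.
# Assumption: no real barrier; PEs are updated one after another.
class PE:
    def __init__(self, initial_top):
        self.shared_stack_top = initial_top  # memory location shared_stack_top
        self.saved_tops = []  # stands in for the DSIB slot of each active frame


def allocate_shared(pes, size):
    """Every PE saves its current top, then moves the top down by size."""
    for pe in pes:
        pe.saved_tops.append(pe.shared_stack_top)
        pe.shared_stack_top -= size  # shared stack grows toward low memory


def free_shared(pes):
    """Every PE restores the top saved by the matching allocation."""
    for pe in pes:
        pe.shared_stack_top = pe.saved_tops.pop()
```

Because every PE performs the same allocation with the same size, shared_stack_top stays identical on all PEs, which is the consistency requirement stated above.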
The following example illustrates how the shared stack frame is set up:

    barrier                                     ;Barrier is required
    ldq     t1,shared_stack_top                 ;Get current top of stack
    stq     t1,-24(t0)                          ;Current shared frame pointer (sfp)
    la      t2,shared_fixed_frame_size(zero)    ;New shared stack pointer
    subq    t1,t2,t1                            ;Check for dynamic shared stack usage
    stq     t1,shared_stack_top(zero)           ;Store new top of shared stack
    barrier                                     ;Barrier is required

Note: If a SHARED automatic array is declared or if a SHARED array is redistributed, additional space must be allocated dynamically on the shared stack. This is not shown in the preceding example.

6. Subprogram body. The body of the subprogram that is being executed appears at this point.

7. Set return value. If necessary, copy the return value back to memory at the address in the return value register (v0). Return values that are of the char, short, or int types are returned in v0. Half-precision and single-precision return values are returned in fv0. Double-precision, half-precision complex, and single-precision complex values are returned in fv0 (most significant or real part) and fv1 (least significant or imaginary part). Aggregate return values are stored in memory.

8. Free shared stack space. The following sequence of code frees up shared stack space by storing sfp (current shared frame base) as the new shared_stack_top:

    ldq     t0,-24(fp)                  ;Load saved shared frame pointer (sfp)
    la      t1,shared_stack_top(zero)   ;Address of shared_stack_top
    stq     t0,0(t1)                    ;Store sfp as new top of shared stack

9. Restore any callee save registers. The following code sequence restores callee save registers s4 and s5:

    ldq     s5,-48(fp)
    ldq     s4,-56(fp)

10. Restore return address (ra), stack pointer (sp), and frame pointer (fp) registers.

Note: The frame pointer is reloaded before the stack pointer is updated to prevent a memory reference beyond the stack pointer.
The following code sequence restores the ra, sp, and fp registers:

    ldq     ra,-8(fp)       ;Restore return address
    bis     fp,fp,t0        ;Save frame pointer (old stack pointer)
    ldq     fp,-16(fp)      ;Restore frame pointer
    bis     t0,t0,sp        ;Restore stack pointer

Note: The code sequences in this appendix are for illustration purposes only. They are intended to clarify the intention of the entry and exit code.

A.2.7.3 Call site actions

At a call site, the following actions occur:

• Conventional floating-point, integer, and MOBE scratch registers that hold intermediate values are spilled.

• Actual arguments are passed. Up to six scalar arguments are placed in the argument registers by type. FCDs and 128-bit call-by-value aggregates occupy two consecutive integer (R) registers. Double-precision, half-complex, and single-complex call-by-value arguments occupy two consecutive floating-point (F) registers. Because Fortran is call-by-address, only integer (R) registers are used.

Arguments are placed in registers as follows:

Argument 1: a0 or fa0
Argument 2: a1 or fa1
Argument 3: a2 or fa2
Argument 4: a3 or fa3
Argument 5: a4 or fa4
Argument 6: a5 or fa5

The following example demonstrates register-to-register transfers or memory references:

    {
        int i, j;
        double x;
        f(i,x,j,i+j);       /* call site */
    }

    ldq     a0,-96(fp)      ; i
    ldt     fa1,-112(fp)    ; x
    ldq     a2,-104(fp)     ; j
    addq    a0,a2,a3        ; i+j
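The register assignment rule for simple scalar arguments can be expressed as a short function. This is a hedged sketch rather than the compiler's algorithm: the function name assign_argument_registers and the kind labels "int" and "float" are hypothetical, every argument is assumed to occupy a single word (two-register arguments such as FCDs and structure values are not modeled), and arguments beyond the sixth are reported only as residing in memory.

```python
# Hypothetical sketch of argument register assignment.
# Assumption: each argument is a single-word scalar, "int" or "float".
MAX_REGISTER_ARGS = 6


def assign_argument_registers(arg_kinds):
    """Return the location of each argument: a<n>, fa<n>, or "memory"."""
    locations = []
    for position, kind in enumerate(arg_kinds):
        if position >= MAX_REGISTER_ARGS:
            # Only the first six arguments travel in registers.
            locations.append("memory")
        elif kind == "int":
            locations.append(f"a{position}")
        elif kind == "float":
            locations.append(f"fa{position}")
        else:
            raise ValueError(f"unsupported argument kind: {kind!r}")
    return locations
```

The register number follows the argument position, not a per-type counter, so a floating-point second argument uses fa1 and the integer third argument still uses a2.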
The following example places argument 7 and beyond in memory on the private stack:

    {
        int i1, i2, i3, i5, i6, i7, i8;
        double x4;
        g(i1, i2, i3, x4, i5, i6, i7, i8);
    }

    ldq     a0,-80(fp)      ; i1
    ldq     a1,-88(fp)      ; i2
    ldq     a2,-96(fp)      ; i3
    ldt     fa3,-136(fp)    ; x4
    ldq     a4,-104(fp)     ; i5
    ldq     a5,-112(fp)     ; i6
    ldq     t0,-120(fp)     ; i7
    stq     t0,48(sp)       ; Put on stack
    ldq     t0,-128(fp)     ; i8
    stq     t0,56(sp)       ; Put on stack

The presence of a structure value causes that argument and all remaining arguments to be placed on the stack.

• Set the call information register (ci). The following statement sets the call information into register ci:

    lda     ci,call_information_index(zero)

• Branch to the entry point. The following statement branches to the entry point:

    bsr     ra,func

Privileged Architecture Library (PAL Code) [B]

The Privileged Architecture Library (PAL) code provides the user of CAM 2.2 access to a number of special CRAY T3D and CRAY T3E functions. PAL code is defined within the pal.h file, which is located in the /usr/include/mpp/mpp/ directory.

PAL code is invoked by means of a CAM instruction in the following form:

    call_pal #

The # parameter in the preceding instruction specifies the function code of the function you want to access.

PAL calls cause the execution of system codes defined for CRAY T3D and CRAY T3E systems. Most PAL calls are defined for system use only, but a subset, described below, is accessible to users of CAM. Table 21 lists the user PAL calls, a brief description of each, and a mnemonic substitute for the call_pal syntax. The mnemonics are defined in the include file pal.h.

Users can define PAL codes that invoke operating system functions within the pal.h file.
The format for defining a PAL code is as follows:

    #define mnemonic call_pal # [/*comment]

In the preceding format, mnemonic specifies the name of the command, # specifies the function code of the command, and comment can be used to describe the command.

Table 21. User PAL codes

Mnemonic    Function code   Description
bpt         128             The bpt instruction is provided for program debugging. It switches
                            the processor mode to kernel, builds a stack frame on the kernel
                            stack, and dispatches to the breakpoint code.
bugchk      129             The bugchk instruction is provided for error reporting. It switches
                            the processor mode to kernel, builds a stack frame on the kernel
                            stack, and dispatches to the breakpoint code.
uaclwr      130             User atomic cache line write.
wrinvcr     133             Write invalidate control register.
imb         134             The instruction memory barrier code ensures that the contents of
                            the instruction cache are coherent after the instruction stream
                            has been modified by software or I/O devices.
udcflush    135             User data cache invalidate.
rscc        157             The read system cycle counter code causes the contents of the
                            system cycle counter to be written to integer register R0. This
                            counter is a 64-bit integer that increments at the same rate as
                            the process cycle counter.
slrcv       158             Read unique.
slxmit      159             Write unique.
gentrap     170             The gentrap trap code permits reporting runtime software
                            conditions. It switches the processor mode to kernel and pushes
                            registers R2 through R7, the updated PC, and the PS onto the
                            kernel stack. It then dispatches to the address of the gentrap
                            vector, stored in a control block.
simctl      189             Simulator control call.
callsys     190             The callsys instruction switches the processor mode to kernel,
                            builds a callsys stack frame, and dispatches to the system call
                            code.
