                       -(G)-(E)-(M)-(A)-

              [G]enPC [E]lite [M]acro [A]ssembler
                (C)oderite SECTOR ONE 1994-95

            English documentation for version 2.6


  I.   Introduction
         1. Shareware
         2. Credits
         3. Greetings
  II.  Generalities
         1. Addressing modes
         2. Arithmetic
         3. Assembly directives
  III. Mnemonics
  IV.  Conclusion

                            --==--

I. Introduction
_______________

GenPC aka GEMA is a new symbolic assembler for MS-DOS. It is mainly based
upon the 68k reference : GenST on Atari ST. Moreover, the logical structure
of Motorola 680x0 was adapted to Intel mnemonics, as it is actually easier
and more logical.

Unlike TASM, which goes for a lousy pseudo-structured style, features lotsa
bugs ( especially with 386+ instructions ) and doesn't let us really guess
how our source code will be assembled, GEMA lets you enjoy heavy coding and
was especially designed for 32-bit processing. It now supports all the
opcodes of Intel processors, from 8086 to P6, including all discovered but
undocumented opcodes !

In addition, it is much faster than TASM, doesn't need any linker, and
features handy assembly directives, especially INCBIN, which has always
been missing from TASM and MASM.

If you never coded in machine language before, GEMA is the tool you need to
discover the marvellous ( well... just about ) world of 80x86. And you'll
easily be able to learn 680x0 if you need to.

If you already know the 680x0 joys, you won't have to worry about the
lousiness of Intel stuff anymore and won't yell about the lameness of the
classical assembly tools.

If you're coding on 80x86, you must be fed up with TASM and MASM. So GEMA
is the assembler you need ! It is especially designed for 32-bit coding
( protected or flat-real modes ) and is really easy to use in this context,
unlike the previous references.

A full integrated environment with a powerful editor and debugger
( MonST-like ) is about to be released.

I.1 Shareware
-------------

GEMA is SHAREWARE.
Installing it on your hard disk implies your agreement to the following
terms :

 - If you are working in a telco company, you have to clean up my
   phone-bills,
 - If you think the world of COBOL, you have to kill yourself,
 - If you're one of my teachers, you have to gimme good results to the
   exams and projects,
 - If you're a coder, graphixx-man, music-man, or courier, love demos, and
   don't belong to any crew, you'd better join Sector One,
 - If you're a 20-year-old female, you must fall in love with me,
 - If you can get cheap hardware, let me know,
 - If you find bugs in GEMA, let me know as well,
 - If you find none, it's because you never used it :) Bad boy.

BTW, this is just a beta-release, which is why some bugs may still be
present. Please report them to me by writing down exactly what you did,
what happened, what should have happened, and the version of your release.

If you'd like to register as a kewl official GEMA user, please send me a
nice letter telling me what you think about it. You can also help me code
a part of the complete assembly-package. Just ask for the source code and
tell me what you'd like to do on it.

You can also send me a little money, at least to cover the mail charges.
The official registration fee was 50 FF or $10.00 . If you do it :

 - You will have a good conscience,
 - You will support the author in improving GEMA and coding other
   shareware,
 - You will get all the new releases by e-mail or snail-mail,
 - Your name will be quoted in our greetings-list.

I'd also enjoy receiving anything done with GEMA.

I.2 Credits
-----------

Assembler : design, documentation, coding ........... Jedi/Sector One
Moral support : .................................... Stephanie Mauger
Spelling correction of the French doc : ....................... Mogar
Used software : QEdit, Gema, Hacker's View, DJGPP
Beta-testing : MJS, Altomcat/Sector One, Oxygene, Keops/Equinox,
               Alexey Voinov

You can get in touch with me at the following address :

        Frank DENIS
        2, rue Madeleine Laffite
        F-93100 MONTREUIL
        FRANCE

Or thru InterNet : j@nether.net

You can get the latest release by sending an e-mail titled "GET GEMA" to
the previous address, or on ftp.nether.net in the directory : /pub/gema/* .
You can also leech it without any ratio on ACE BBS +33-1-4588-7548 or thru
FidoNet with "GEMA" as keyword on the node 2:320/305.

But please, never phone me...

I.3 Greetings
-------------

Dream regards to :

Infiny ( LCA, Gandalf ), Eclipse ( Hacker Croll ), CyberPunk, Trash,
Dream Syndicate, Underground Tectonics, EKO ( Maxx-Out, McDo, Createur,
Jool ), Eagles ( Ard ), Equinox ( Keops, Checksum, Al Cool ),
Lego System ( Skill ), Dune, Fantasy, Genesis, DBA ( Bonus Software ),
Sentry ( Eagle ), Isiolis, Imphobia, Dead Hacker Society, Control Team,
Quicky, Daniel Bozinov, Michel Furic, Live!, Fantasy, Anixter, Fongus,
Bresil, DSK, Alexey Voinov, Oxygene, Jared, Impact Studios, Kloon,
Antares, Pulse, RealTech, Animal Mine, Oxyron, Max in the Star System,
Epsilon, EMF, Plant, Cascada, Cubic Team, and you !

II. Generalities
----------------

GEMA works on 386, 486, Pentium or P6+. It takes one or two parameters :
the name of the source code and, optionally, the name of the executable
file.

eg :    gema foo.s
or :    gema foo.s bar.exe

If the second parameter is missing, the name of the executable file will be
the name of the source with a COM or EXE suffix.
Some options may prefix the file names :

  -E  or --preprocess : display each processed line
  -v  or --verbose    : verbose output
  -q  or --quiet      : reduced output
  -o  or --optimize   : extended automatic optimizations ( 3 passes )
  -nw or --nowarning  : disable warnings
  -a  or --autoalign  : enable automatic alignment ( experimental )

  -86, -88, --cpu=86 or --cpu=88,
  -186 or --cpu=186,
  -286 or --cpu=286,
  -386 or --cpu=386,
  -486 or --cpu=486,
  -586, -pentium, --cpu=586 or --cpu=pentium,
  -686, -p6, --cpu=686 or --cpu=p6
                      : assemble only the opcodes supported by the
                        designated kind of processor. By default, all
                        instructions from 8086 to P6 are supported.

II.1 Addressing modes
---------------------

All addressing modes conform to the Motorola 680x0 format, ie :

Designation                     Intel                   GEMA
------------------------------------------------------------------
Short immediate                 12                      #12.b
Immediate (word)                32000                   #32000.w
Immediate (long word)           99999                   #99999.l

.b, .w and .l are optional. They allow forcing a type, for instance to cast
a value that fits into one byte as a long word. This can be quite useful
with self-modifying code. If the size isn't explicitly fixed, GEMA
automatically finds the smallest one.

Designation                     Intel                   GEMA
------------------------------------------------------------------
Direct                          AH, BX, ECX, SI, CS     AH, BX, CE, SI, CS

Under GEMA, the registers are named the same way as in TASM, except for
32-bit registers, which look like : AE, BE, CE, DE, SIE, DIE, BPE and SPE,
as it's more logical. But in most cases, just one letter ( A, B, C or D )
is enough. Indeed, like 680x0, the size of operands is explicitly part of
the instruction and GEMA sets them automatically. So that :

        NEG.B A         under GEMA means NEG AL with TASM.

By default, a ".B" instruction applied to a single-letter register is
interpreted as AL, BL, CL or DL. To mention AH, BH, CH or DH, just write
the full designation.
For instance :

        NEG.B AH

        NEG.W A         under GEMA means NEG AX with TASM

Except for instructions that don't deal with words as operands, the default
size of all mnemonics is the word. So that NEG.W A and NEG A have the same
effect. The same goes for most instructions.

        NEG.L A         under GEMA means NEG EAX with TASM

We can always use the full designation of a register instead of a
single-letter shorthand. For instance, NEG.L EAX is okay. But NEG.L AX will
produce a warning, as the operand size is inconsistent with the instruction
size.

If you ever think it is harder to state the size explicitly in the mnemonic
than to rely on an implicit guess based upon operand sizes, you've
understood nothing about life. 'Coz it's actually clearer like this and you
don't have to add gadgets like WORD PTR in ambiguous cases. So that :

        NEG.L (SI,DI)   under GEMA means NEG DWORD PTR [SI][DI] with TASM.

Oh BTW, I was about to forget it... You can always use ".s" instead of ".b"
if you prefer... This was just designed to stick to the norms of GenST.

Designation                     Intel                   GEMA
------------------------------------------------------------------
Short absolute                  [12]                    12.b
Absolute (word)                 [32000]                 32000.w
Absolute (long)                 [99999]                 99999.l

Once again, the ".b", ".w" and ".l" are optional. They are only useful to
cast a type. Otherwise, the assembler guesses them on its own.

Under GEMA, like in 680x0, an immediate value is prefixed with a "#", so an
absolute address has no prefix. The same goes for labels :

        NEG label       under GEMA means NEG WORD PTR [label] with TASM

Designation                     Intel                   GEMA
------------------------------------------------------------------
Indirect                        [SI]                    (SI)
Indirect with register          [SI+BX]                 (SI,BX) or (SI,B)

The default size for an index register is a word.
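The register naming rules described so far can be summed up in a few lines
of Python. This is only a sketch built from the examples given in this
document : the function name to_intel and its lookup tables are
hypothetical, it is not how GEMA itself is coded, and it does not try to
detect mismatches such as NEG.L AX.

```python
# Sketch only: maps a GEMA register name plus the instruction size suffix
# to the TASM/Intel register name, following the examples in the text.
# The function name and the tables are hypothetical.

# GEMA 32-bit register names and their Intel equivalents.
GEMA_32BIT = {
    "AE": "EAX", "BE": "EBX", "CE": "ECX", "DE": "EDX",
    "SIE": "ESI", "DIE": "EDI", "BPE": "EBP", "SPE": "ESP",
}

# How a one-letter shorthand (A, B, C, D) expands for each size suffix.
SHORTHAND = {"b": "{}L", "w": "{}X", "l": "E{}X"}


def to_intel(register, size="w"):
    """Return the Intel name of a GEMA register used with size b, w or l."""
    reg = register.upper()
    size = size.lower()
    if size == "s":                 # ".s" is accepted as a synonym of ".b"
        size = "b"
    if reg in ("A", "B", "C", "D"):
        return SHORTHAND[size].format(reg)
    if reg in GEMA_32BIT:
        return GEMA_32BIT[reg]
    return reg                      # full names (AH, BX, SI, CS...) unchanged


print(to_intel("A", "b"), to_intel("A"), to_intel("A", "l"), to_intel("SIE", "l"))
```

The default size argument is "w", mirroring the rule that the word is the
default size of most mnemonics.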
Designation                     Intel                   GEMA
------------------------------------------------------------------
Indirect with reg and offset.b  [SI+BX+12]              12.b(SI,BX)
Indirect with reg and offset.w  [SI+BX+32000]           32000.w(SI,BX)
Indirect with reg and offset.l  [ESI+EBX+99999]         99999.l(SIE,BE)

As expected, ".b", ".w" and ".l" are optional, they're only useful as
casts. For instance 12(SI,BX) is exactly equivalent to 12.b(SI,BX). Most of
the time, there's no point in adding these suffixes. Offsets can be written
as they are, GEMA will find a way outta difficulties.

Designation                     Intel                   GEMA
------------------------------------------------------------------
Indirect reg/off/factor         [ESI+EBX*factor+off]    off(SIE,BE*factor)

Once again, offsets can be bytes, words or long words. BTW, a factor, like
a long offset, can only be used with 32-bit registers ( someone told me
that was a bug of GEMA, but it is not : this is the tricky Intel
architecture ) .

THE *SOURCE* ARGUMENT IS ALWAYS THE *FIRST* ONE AND AN OPTIONAL TARGET
ARGUMENT, ALWAYS THE NEXT ONE. This is nothing but the intelligent Motorola
logic. So that with GEMA, to copy the value of AX into BX, just write :

        MOVE A,B ( or MOV A,B )

... and not MOV B,A as TASM would expect it.

Of course, instructions like ENTER that don't manage source and destination
arguments have no reason to get reversed arguments. But for the rest, the
Motorola syntax has to be used.

II.2 Arithmetic
---------------

All classical operators are available for offsets and immediate values.
Here is the list, in decreasing order :

  [] : These act as parentheses
  -  : Negation of a number
  <  : Left shift. For instance : 3<2 gives 12
  >  : Right shift. 6>1 gives 3
  ^  : Exclusive OR
  &  : Logical AND
  |  : Logical OR
  /  : Divide
  *  : Multiply
  -  : Sub
  +  : Add
  %  : Modulo
  =  : Equal : gives 0 if the expression is false, 1 otherwise
  @  : Divide what follows by 16. For instance, @fooLabel gives the
       segment address of the label fooLabel, according to the current
       ASSUME value.
       This operator can be stacked, for instance @@fooLabel divides
       fooLabel by 256. But it's just an arithmetic operation, there will
       be no relocation ( see \ below ) .
  ~  : Logical NOT
  \  : Gives the segment address of what follows. This forces the result
       of the expression to a word, and it will be relocated if the
       executable code is a .EXE . It's a kinda SEG prefix like with MASM
       and TASM.
  :  : ( Yes it's the ':' character )
       Versions < 2.5 : returns the 4 lower bits of what follows,
       Versions >=2.5 : returns what follows modulo the current segment
                        size.
       Was changed for compatibility purposes with A2G. In practice, this
       is quite a dummy operator...

Stupid but funny instance :

        2+3*4/[-5-7]<[~3^5]

Apart from numbers, lotsa other things may be used inside arithmetic
operations :

  *  : An asterisk means the address of the current instruction, or to be
       more precise, its offset from the beginning of the program.
       For instance :
                        bra.s *
       equals :
                foo     bra.s foo

  '' : ASCII values of one or more characters.
       For instance :   'A' equals 65
                        'AB' equals $4142

Bases :
-------

A raw number is parsed as decimal by default, even if there are one or more
zeros at the beginning.

Hexadecimal numbers just have to be prefixed by a '$' ( how silly is the
'h' suffix with sometimes a '0' prefix to avoid confusions ! )

For instance :  $ABCD1234 under GEMA means 0ABCD1234h with TASM

A binary constant has to be prefixed by a '%'.

For instance :  %101 means 5

An octal number has to begin with the '§' sign.

Of course, arithmetic operations can mix several bases. And casting is
always possible on a whole expression...

For instance :  [1+$12A/%10001001].b

...But it can also always be done on some terms of an expression :

For instance :  $1234.b+1 means $35 and not $1235 because the ".b" reduced
                the first term to a single byte.

So GEMA can evaluate complex expressions mixing different bases with type
casts. Results can be used anywhere as constants, offsets or immediates.

But GEMA can also process these operations on symbols.
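The notations for bases, character constants and casts can be summed up
with a small Python sketch. It only mirrors the examples given above : the
names parse_literal and cast are hypothetical, and the cast is assumed to
simply keep the low-order bits, which is consistent with $1234.b+1 = $35.

```python
# Sketch only: reads the GEMA number notations described in the text and
# applies a ".b" / ".w" / ".l" cast. Function names are hypothetical.

# Width in bits for each size suffix (".s" is a synonym of ".b").
CAST_BITS = {"b": 8, "s": 8, "w": 16, "l": 32}


def parse_literal(text):
    """Convert one GEMA literal ($hex, %binary, §octal, 'chars', decimal)."""
    if text.startswith("$"):
        return int(text[1:], 16)
    if text.startswith("%"):
        return int(text[1:], 2)
    if text.startswith("§"):
        return int(text[1:], 8)
    if len(text) >= 2 and text[0] == "'" and text[-1] == "'":
        value = 0
        for ch in text[1:-1]:       # first character ends up in the high byte
            value = (value << 8) | ord(ch)
        return value
    return int(text, 10)            # leading zeros do not change the base


def cast(value, size):
    """Assumed behaviour of a size cast: keep only the low-order bits."""
    return value & ((1 << CAST_BITS[size.lower()]) - 1)


print(hex(cast(parse_literal("$1234"), "b") + 1))
```

Parsing a full expression with operators is deliberately left out, since
it depends on details that this document only outlines.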
Symbols :
---------

There are 3 sorts of symbols :

- Labels
  ------

The general format of an instruction line under GEMA is :

        [Label[:]] [Mnemonic] [Arguments] [;] [Comments]

A label always has to be at the very beginning of a line ( without any
leading space ) . So the classical colon after it is now optional.

A mnemonic must not be directly at the beginning of a line. Spaces or tabs
must precede it, to avoid it being interpreted as a label. Reserved
keywords can always be used as label names, this is not ambiguous.

Comments don't need a semicolon. But avoid this kinda comments, as they
overload the source code, which becomes less readable. And the heavy
parsing technique used by GEMA sometimes has problems distinguishing
comments from arguments. But don't panic, in all cases you'll just get an
error message, and never an unexpected result.

A complete line of comments has to begin with a '*', ';', '%' or '/' .

For instance :

foo     move.l a,b              this is a comment
        addx (si,bx),c
bar     bra.s foo
* dummy line
/ Another line of comments
junk:   nop

Usually, the semicolon is not necessary to distinguish instructions and
comments. But look at the following case :

        rts     returns to the main routine

rts can be used without any argument or with a number corresponding to an
extra depth of the stack. That's why GEMA will think "returns" is that
depth ( being a label ) and "to the main routine" is a comment. To avoid
this, just add a semicolon :

        rts     ; returns to the main routine

And there will be no possible confusion anymore. Thanks to Altomcat for
having reported that problem to me. In case of doubt you can add semicolons
all the time, this will be ok.

You can also add extra spaces everywhere, GEMA will ignore them.

For instance :

        addx.l  4 + 3 / [ 1 + 2 ] ( sie , be * 8 ) , d

But never insert spaces between a mnemonic and its optional size indicator
( "addx.l" is ok, not "addx . l" ) .

The START label is always defined as a null label, ie.
the beginning of the program, and can be used by pedants the same way as
NULL in C language.

A label can be used inside an arithmetic expression. In this case, it
represents the offset from the beginning of the program, less the ASSUME
value ( see ASSUME below ) + the last ORG value ( see ORG below ) .

For instance :

foo     move.l #[foo-bar]/2,foo+2
        ...
bar     flush

Such labels can be defined only once in a source code, otherwise GEMA
produces a redeclaration error. They have a constant value for the whole
assembly, unlike local labels and variables.

- Local labels
  ------------

Same as global labels, except that they can be redefined. Their names have
to begin with a dot. It's a pretty handy feature for loops.

For instance :

        move #$1234,c
.wait   dec c
        bne.s .wait             bne = jnz
        ... some other piece of code ...
        move a,c
.wait   nop
        cmps.b
        dbeq .wait              dbeq = loopnz

Local labels can be used in arithmetic expressions and equal the last point
of their definition. They can even be set to any value with the SET
directive and act like real numeric constants or counters.

For instance :

.wait   set $1234

Oh yeah, that local label now has the $1234 value. But before being
assigned a constant value, it is assigned the instruction offset like any
other label.

So what, would you say ? Yup, just try for instance :

.wait   set .wait+2

Each time you do that, .wait evaluates to two bytes more than its last
definition. IMHO, it's really useful for self-modifying code.

But now, let us imagine another situation, where you have to build a table
of the multiples of 3. We'll explain later that the REPT...ENDR structure
allows repeating a piece of source code several times and that DC.L is used
to insert a constant long word ( like DD with TASM ) .

What would be handy in this case is for .wait to take the offset it was
declared at only the first time, and to depend only on the SET value
afterwards.
Of course, GEMA manages it with "assembly variables".

- Assembly variables
  ------------------

Their names begin with a '!'. They can be used as traditional labels. They
are in fact local labels that receive the offset they stand at only the
first time they're assigned. So here is the way we can build our table of
multiples ( 0, 3, 6, 9, ... 255*3 ) :

!foo    set 0
        rept 256
        dc.l !foo
!foo    set !foo+3
        endr

Marvellous, isn't it ?

Any arithmetic expression can mix these 3 sorts of symbols and can even
cast them to any type.

Reserved keywords are supported, and there is no limit on significant
length. Upper and lower case are distinguished. Allowed characters for a
label name are alphanumerics ( of course with no digit as a first
character ), underscore, '!' and '.' .

II.3 Assembly directives
------------------------

They have to be placed in the second field of a line, like mnemonics.

- REPT <count> ... ENDR
  ---------------------

Repeat a piece of code <count> times.

For instance :

        rept 5
        nop
        xlat
        endr

will produce :

        nop
        xlat
        nop
        xlat
        nop
        xlat
        nop
        xlat
        nop
        xlat

- SET
  ---

See the previous part.

- ORG
  ---

Set the base offset. Exactly like its equivalents in all other assemblers,
except that GEMA supports more than one, anywhere in a program ( even
though it is interesting only at the beginning ) .

For instance :  org $100
GEMA will also agree with org #$100

- TITLE
  -----

Gives a title to the current source. Not used yet.

- INCLUDE <file>
  --------------

Merge the file at this point of the current source and continue the
assembly procedure ( like #include in C ) . Any source can include another
one that may include other ones that may... There is no depth limit, but a
basic check for circular references is done. The file name can be quoted or
double-quoted ( or not quoted at all ) .

- USE16
  -----

Assume that the following code will be in a 16-bit code segment by default
( ie. needs prefixes for 32-bit accesses ). Enabled by default.
- USE32
  -----

Assume that the following code will be in a 32-bit code segment by default.
This is only possible in protected mode, and needed by most of the
DOS-Extenders ( like PMODE ) .

- OPT
  ---

Enable or disable several options. Overrides command line options.

        OPT o+ : enable automatic optimizations
        OPT o- : disable them ( default )
        OPT w+ : enable all warnings ( default )
        OPT w- : disable them
        OPT v+ : verbose mode
        OPT v- : normal mode ( default )
        OPT q+ : quiet mode
        OPT q- : normal mode ( default )
        OPT a+ : enable automatic alignment
        OPT a- : disable automatic alignment ( default )

- ONCE
  ----

Like #pragma once implemented in most C compilers. Don't include the file
if it was already included before.

- INCBIN <file>
  -------------

Here comes THE missing directive on TASM and MASM. It allows you to insert
any BINARY file in the executable code. Forget lousy hexadecimal
conversions or tiny file loaders. To insert the picture of your girlfriend
at the "tut" label, you now just have to do :

tut     incbin cindy.jpg
or
tut     incbin "cindy.jpg"

- DC
  --

Inserts a byte, a word, a long word or a string.

For instance :

        dc.b 1,2,3,4,"Tototata",'t',10
        dc.w $1234,"tuttut",4

In the last example, "tuttut" is inserted as single bytes, as if we had
done :

        dc.w $1234
        dc.b "tuttut"
        dc.w 4

- DS
  --

Inserts several null bytes.

For instance :

        ds.b 4          means dc.b 0,0,0,0
        ds.l 3          means dc.l 0,0,0

- EVEN, ALIGN, SEGMENT, PAGE, DPAGE, PPAGE
  ----------------------------------------

Inserts several nops so that the next instruction will be aligned.

        EVEN                    = 2 bytes
        ALIGN.B / .W / .L / .Q  = Try and guess
                                  ( .Q = Quad, .B is just here for fun ! )
        SEGMENT                 = 16 bytes
        PAGE                    = 256 bytes
        DPAGE                   = 512 bytes
        PPAGE                   = 2048 bytes

- MIN or MINI <xxx>
  -----------------

Minimal size your program needs ( .EXE files only ) in 16-byte chunks.
Default is the size of your program.

- MAX or LIMIT <xxx>
  ------------------

Sets the maximal memory size your program has to reserve when loaded, in
16-byte chunks.
Useful for overlays and resident programs.

- OVERLAY <xxx>
  -------------

Set an overlay ID.

- HEADER
  ------

Inserts the traditional header of a .EXE file, with the relocation table
and all that stuff. It's usually the first instruction of a .EXE program.
But GEMA allows you to put some everywhere you want ( might be useful for
self-extracting archives or nested executables ) .

- STACK
  -----

Describes where the initial stack ( SS:SP ) will be located when a .EXE
file is launched. All .EXE files should have a STACK.

For instance :

        header
* lotsa code
        ds.l 256
* 256*4 bytes for the stack, that's enough !
        stack
* other code or data

- ASSUME
  ------

Sets the reference used to compute the following label offsets inside
arithmetic expressions. TASM allows this directive to have different values
for all segment registers. Well... IMHO there is no point in doing that, as
we should be able to do anything we need with our segment pointers, but
maybe this feature will be implemented in the next releases of GEMA if
requested by enough people.

- FATAL
  -----

Stop assembling.

- SECTION BSS or BSS
  ------------------

What follows will be "passively" assembled : the inline labels will be
computed but no code will be integrated in the executable file. Usually
used with "ds" at the end of a program.

- SECTION TEXT or SECTION DATA or TEXT or DATA
  --------------------------------------------

Cancel the effects of the previous directives.

- REAL or REALMODE
  ----------------

Assume that the next segments ( from where one of these directives is, and
after SEGMENT, PAGE, DPAGE or PPAGE ) are 64 Kb long like in real-mode.
These directives are only useful to produce an error in case of overflow.
They have no consequence on code generation.

- UNREAL or UNLIMIT or FLAT
  -------------------------

As opposed to the previous directives, this set assumes that the next
segments ( until a new directive of this kind is encountered ) have no size
limit. By default, all segments have an infinite size for GEMA.
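As a side note on the alignment directives listed earlier ( EVEN, SEGMENT,
PAGE, DPAGE, PPAGE ), the amount of padding they imply can be sketched in
Python. This is only an illustration : the function name padding is
hypothetical, the boundaries are the ones given in this document, and
ALIGN is left out since its sizes are not spelled out here.

```python
# Sketch only: number of filler bytes needed so that the next instruction
# starts on the boundary requested by an alignment directive.
# The function name is hypothetical; boundaries come from the text.

ALIGNMENTS = {
    "EVEN": 2,
    "SEGMENT": 16,
    "PAGE": 256,
    "DPAGE": 512,
    "PPAGE": 2048,
}


def padding(offset, directive):
    """Return how many bytes separate offset from the next boundary."""
    boundary = ALIGNMENTS[directive.upper()]
    return -offset % boundary       # 0 when offset is already aligned


print(padding(5, "EVEN"), padding(0x123, "PAGE"), padding(32, "SEGMENT"))
```

An offset that already sits on the boundary needs no padding at all, which
is why the modulo form is used rather than a plain subtraction.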
- SEGSIZE <size>
  --------------

The following segments will be <size> bytes long.

These three sets of directives may be prefixed with an alignment directive
( SEGMENT, PAGE, DPAGE or PPAGE ) or can themselves be prefixes to any
instruction.

For instance :

        SEGMENT:REAL
is exactly like :
        SEGMENT
        REAL

        SEGSIZE 4096:PAGE
means :
        SEGSIZE 4096
        PAGE

In both cases, the order has no influence. So that DPAGE:UNREAL and
UNREAL:DPAGE have the same meaning.

These directives are usually useful in real mode or in protected mode with
funny segment sizes. In all other cases, there is no point in using them,
as GEMA assumes all segments are infinite by default.

III. Mnemonics
--------------

All ( but two ) TASM and MASM mnemonics are available with the same
designation under GEMA, even all synonyms ( such as JZ and JE ) . Some
mnemonics have different names with TASM and MASM ( for instance XLAT and
XLATB ) : both designations are supported by GEMA and have of course the
same effect.

But GEMA also features alternative synonyms. Most of them are 680x0
equivalents or more logical forms. The following list shows some equivalent
mnemonics and those that need some extra comments :

        LEAVE   = UNLINK
        MOV     = MOVE
        MOVSX   = MOVESX
        MOVZX   = MOVEZX
        TRAPV   = INTO
        WBINVD  = FLUSH
        TRAP    = INT

TRAP supports an absolute way of writing its argument. For instance,
TRAP #14 is exactly identical to TRAP 14 or INT 14.

        RTED    = RTID = IRETD
        RTE     = RTI = IRET

        BRAF    ~ JMPF

JMPF is the FAR alternative of JMP. It expects two arguments, the segment
and the offset, for instance JMPF $14c9,$418db2a. But as we often use a FAR
JMP with a label address or an absolute address, writing
"JMPF \label,:label" all the time would be really lousy. Instead, just use
BRAF, which works like JMPF except that it expects only one argument, a
32-bit address that it automatically converts to a segment and an offset.
For instance, BRAF $12345 is similar to JMPF $1234,5

386+ allow use of these instructions with a 32-bit offset.

        BRA     ~ JMP

BRA is similar to JMP except that, as expected, JMP label under GEMA means
JMP [label] to TASM, ie. a jump to the address contained in "label", and
not a direct jump to "label". That's why the logical way of doing a direct
jump is JMP #label. But as "JMP #label" is more often useful than
"JMP label", you'd rather use BRA, which is similar except in this case :
"BRA label" means "BRA #label" or "JMP #label". In other cases, BRA can
deal with all the addressing modes a JMP would support, such as
"BRA (si, dx)". A BRA or JMP can be short, word or long ( for flat-real and
protected modes ) .

        REP     = REPE = REPZ

There are two ways of using these prefixes :

 - Either as independent instructions on a single line, keeping all the
   features of any instruction,
 - Or as prefixes. In this case, as any good prefix, they have to be placed
   before the instruction they act on. A colon must separate both elements.

For instance :

toto    ds
        gs: move.l (si),a
        rep : outs.b
        repne
        ins.l

As you can see, you can add extra spaces before and after the colon without
any problem, as always.

        HLT     = STOP
        XOR     = EOR
        CMC     = NGC
        CLD     = D+            ( '+' means : increment )
        STD     = D-            ( '-' means : decrement )
        CLI     = INTOFF
        STI     = INTON         ( Saturn coders will love these ! )
        ADDX    = ADC
        BS+     = BSF           ( idem, IMHO, a '+' is clearer than
                                  (F)orward )
        BS-    != BSR

WARNING : BSR DOES NOT HAVE THE SAME MEANING UNDER GEMA AND UNDER
TASM&MASM. Indeed, it is used to call subroutines, as we'll explain later.
So that to scan bits in reverse mode, you MUST use BS-.

        BTSTC   = BTSTN = BTC
        BTSTR   = BTR
        TAS     = BTS = BTSTS
        BTST    = BT

        (BSRF = JSRF) ~ CALLF

These are the FAR versions of CALL. They work the same way as JMPF and
BRAF, but the instructions that only expect one argument are BSRF and JSRF.
The CALL instruction as we always knew it is logically named CALLF.
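The conversion done by the single-argument far forms ( BRAF, BSRF, JSRF )
can be sketched in Python. This is only an illustration based on the single
example given above ( BRAF $12345 = JMPF $1234,5 ) : it assumes 16-byte
real-mode paragraphs, the function name split_far is hypothetical, and the
32-bit offset case is not covered.

```python
# Sketch only: splits a linear real-mode address into a segment and an
# offset, matching the example BRAF $12345 = JMPF $1234,5.
# The function name is hypothetical.


def split_far(address):
    """Return (segment, offset) with the smallest possible offset."""
    segment = address >> 4          # one segment unit = 16 bytes
    offset = address & 0xF          # what is left inside the paragraph
    return segment, offset


seg, off = split_far(0x12345)
print(hex(seg), hex(off))
```

Multiplying the segment by 16 and adding the offset gives back the original
address, which is a quick way to check any such split by hand.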
It is absolutely RIDICULOUS to declare subroutines with PROC NEAR or FAR
foo in machine language. Assembly was designed for heavy coders wanting to
make their compi blow out their silicon minds and not for
structured-language fans who would better try PASCAL or another shit like
that. So that under GEMA, you just have to do a BSRF or BSR, and you know
exactly how it will be assembled.

        (BSR = JSR) ~ CALL

NEAR version of CALL. See BRA and JMP.

        RTS     = RTN
        RTSF    = RTNF

NEAR and FAR ways of coming back from a subroutine. May be followed by an
immediate to adjust the stack depth.

        EXTA.Q  = EXT.Q = CDQ
        EXTA    = EXT or EXT.W = CBW
        CWD     = EXT.L = EXTA.L
        DIVS    = IDIV
        DIVU    = DIV
        LINK    = ENTER
        WAIT    = FWAIT         ( Microsoft / Borland )
        MULS    = IMUL
        MULU    = MUL

        INS, OUTS, MOVES, LODS and co. :

String instructions have to follow GEMA's logic, that is an instruction
optionally followed by its size, .W being the default. So that instead of
OUTSD with TASM, just write : OUTS.L

        BHI = JNBE      BCC = JAE = JNB = JNC   BCS = JNAE
        BLS = JBE = JNA BGE = JGE = JNL         BVC = JNO
        BLT = JNGE      BLE = JLE = JNG         BCXZ = JCXZ
        BEQ = JE = JZ   BGT = JNLE              BECXZ = JECXZ = JCEZ = BCEZ
        BPL = JNS       BNE = JNE = JNZ         BPO = JPO = JNP
        BVS = JO        BPE = JPE = JP          BMI = JS

I may have forgotten some, but the following are equivalent :

 - All synonyms recognized by MASM and TASM
 - Their Motorola 680x0 equivalents

        SETxx   = Sxx

Idem. For instance, SZ can be used as well as SEQ ... All Microsoft,
Borland and Motorola forms are recognized.

        LOOPE   = LOOPZ = DBEQ
        LOOPNE  = LOOPNZ = DBNE
        LOOP    = DBF = DBRA
        ROXL    = RCL
        ROXR    = RCR
        SAHF    = SAF
        ASL     = SAL = SHL
        ASR     = SAR           ( != SHR )
        SUBX    = SBB
        XLAT    = XLATB

Of course, whatever synonym you choose, the addressing modes keep to
GEMA's norm.
 - When an immediate value is always expected, removing the '#' is
   tolerated,
 - When the register size depends on the instruction size ( 95% of the
   usual cases ), you can use one-letter designations for registers,
 - When both a source and a target argument are involved, they always have
   to be in this order,
 - Instructions featuring a 32-bit alternative should enable it with ".L",
   or it is done automatically with 32-bit offsets or immediates.

Anyway, when there is no possible ambiguity, GEMA tolerates some abuses
( like INT 14 that normally should be only INT #14 ), and in all cases, the
most logical way is assumed. And in case of trouble, lotsa pretty precise
errors and warnings will help you.

        AAM and AAD

These instructions may be followed by an immediate number ( with or without
a # ) and allow any divisor ( and not only 10 ) . This works on all CPUs,
but was never officially documented.

        SALC
        ICEBP   = ICE01 = TRAP01
        UMOV    = UMOVE
        LOADALL

Undocumented opcodes of 386+. See www.x86.org for more information. All
these opcodes are implemented in GEMA, but a warning appears when the
verbose ( -v or --verbose ) option is enabled.

        CMOV    = CMOVE
        RDPMC
        UD
        UD2

New opcodes of the P6 CPU, fully implemented in GEMA. As UD and UD2 seem to
be working on other sorts of CPU too, they never generate any warning.

Here is a very complex and original program, displaying "Hello World" on
your screen :

        org $100
        push cs
        pop ds                  ds = cs
        move #plouf,d           plouf's offset in dx
        move.b #9,ah            9 in ah
        trap #$21               system call #$21
        move #$4c00,a           exit(0)
        trap #$21               pooeek !
plouf   dc.b "Hello world !",13,10,'$'  the marvellous text we want to display

Oh god ! Assembling this tiny piece of source code will produce a .COM file
displaying an odd message with the ugly default DOS font.

Here is another version of it, ridiculously longer. But this shows the
structure of an executable ( .EXE ) program.
At least half of the following instructions are dumbshit, but this can help
you understand GEMA's point of view on segmentation :

        header                  create a .EXE and not a .COM
        overlay $1234           overlay number ( not very useful here )
        min 1+@fin              minimal necessary memory ( implicit )
        max 1+@fin              ...equals the max one ( just for fun )
        move cs,a               code segment address
        move a,ds               ...into ds
        move.b #9,ah            function 9
        move #plouf,d           plouf's offset into DX
        trap #$21               call the DOS
        move #$4c00,a           return code to the DOS
        trap #$21               let's call it
plouf   dc.b "Hello world !",13,10,'$'  text to be displayed
        segment                 alignment to a multiple of 16 bytes
        ds.l 128                some space for the stack ( 128 Lwords )
fin     stack                   the new stack starts here

If despite this summary you still have trouble using GEMA, please report it
to me, the fastest way being thru e-mail to j@nether.net

IV. That's all folks
--------------------

I hope you'll be able to use this fabulous ( well... just about ) tool and
see its advantages over TASM and MASM. All suggestions, criticisms, remarks
and bug-reports will be welcomed.