What is Assembly Language?
Each personal computer has a microprocessor that manages the computer's arithmetic, logical and control activities.
Each family of processors has its own set of instructions for handling operations such as getting input from the keyboard, displaying information on the screen and performing various other jobs. This set of instructions is called 'machine language instructions'.
A processor understands only machine language instructions, which are strings of 1s and 0s. However, machine language is too obscure and complex to use directly in software development. So a low-level assembly language is designed for each family of processors; it represents the instructions in symbolic code and a more understandable form.
Advantages of Assembly Language
An understanding of assembly language provides knowledge of:
- How programs interface with the OS, the processor and the BIOS;
- How data is represented in memory and in other external devices;
- How the processor accesses and executes instructions;
- How instructions access and process data;
- How a program accesses external devices.
Other advantages of using assembly language are:
- It can require less memory and execution time;
- It makes hardware-specific complex jobs easier;
- It is suitable for time-critical jobs;
- It is well suited to writing interrupt service routines and other memory-resident programs.
Basic Features of PC Hardware
The main internal hardware of a PC consists of the processor, memory and the registers. The registers are processor components that hold data and addresses. To execute a program, the system copies it from an external device into internal memory. The processor then executes the program instructions.
The fundamental unit of computer storage is a bit; it can be on (1) or off (0). A group of eight related bits makes a byte. Memory that supports parity checking stores a ninth bit with each byte: eight bits are used for data and the last one is used for parity. According to the rule of odd parity, the number of bits that are on (1) in each nine-bit group should always be odd.
So the parity bit is set or cleared to make the number of on bits odd. If the system finds an even number of on bits, it assumes that a parity error has occurred, which is rare and might be caused by a hardware fault or electrical disturbance.
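The counting rule behind parity can be tried out with the x86 parity flag (PF). This is only a sketch: PF is a processor flag, not the memory parity bit described above, and it is set when the low byte of a result has an even number of 1 bits. The program assumes 32-bit Linux and the int 0x80 exit call; the value 0x25 is chosen for illustration.

```nasm
; parity.asm - sketch: is the number of 1 bits in a byte odd?
; Assumed build: nasm -f elf32 parity.asm && ld -m elf_i386 parity.o -o parity
section .text
    global _start
_start:
    mov   al, 0x25        ; 0010 0101b: three bits are on
    test  al, al          ; PF = 1 if AL has an EVEN number of 1 bits
    setnp bl              ; BL = 1 if the number of 1 bits is odd, else 0
    movzx ebx, bl         ; exit status = BL (1 for 0x25)
    mov   eax, 1          ; sys_exit
    int   0x80
```

Since 0x25 already has an odd number of bits on, an odd-parity scheme would store 0 as its parity bit.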
The processor supports the following data sizes:
- Word: a 2-byte data item
- Doubleword: a 4-byte (32 bit) data item
- Quadword: an 8-byte (64 bit) data item
- Paragraph: a 16-byte (128 bit) area
- Kilobyte: 1024 bytes
- Megabyte: 1,048,576 bytes
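In NASM, the byte, word, doubleword and quadword sizes correspond to the db, dw, dd and dq directives. The following sketch assumes 32-bit Linux; the label names are hypothetical. It lets the assembler add up the sizes of one item of each kind.

```nasm
; sizes.asm - sketch: one data item of each basic size
; Assumed build: nasm -f elf32 sizes.asm && ld -m elf_i386 sizes.o -o sizes
section .data
items:
    byte_val   db 0x12                 ; byte:       1 byte
    word_val   dw 0x1234               ; word:       2 bytes
    dword_val  dd 0x12345678           ; doubleword: 4 bytes
    qword_val  dq 0x1122334455667788   ; quadword:   8 bytes
    total_len  equ $ - items           ; 1 + 2 + 4 + 8 = 15

section .text
    global _start
_start:
    mov  ebx, total_len   ; exit status = total size in bytes (15)
    mov  eax, 1           ; sys_exit
    int  0x80
```

Running the program and then typing echo $? in the shell shows the exit status.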
Addressing Data in Memory
The process through which the processor controls the execution of instructions is referred to as the fetch-decode-execute cycle or the execution cycle. It consists of three continuous steps:
- Fetching the instruction from memory
- Decoding or identifying the instruction
- Executing the instruction
The processor may access one or more bytes of memory at a time. Let us consider a hexadecimal number 0725H. This number will require two bytes of memory. The high-order byte or most significant byte is 07 and the low-order byte is 25.
The processor stores data in reverse-byte sequence, i.e., the low-order byte is stored at the low memory address and the high-order byte at the high memory address. So if the processor moves the value 0725H from a register to memory, it transfers 25 first to the lower memory address and 07 to the next memory address.
If x is the memory address of the low-order byte, then address x holds 25 and address x + 1 holds 07.
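The byte order described above can be checked with a short program. It is a minimal sketch that assumes 32-bit Linux; the label val is hypothetical. It stores 0725H as a word and reads back only the byte at the lower address.

```nasm
; endian.asm - sketch: which byte of 0725H lands at the lower address?
; Assumed build: nasm -f elf32 endian.asm && ld -m elf_i386 endian.o -o endian
section .bss
    val  resw 1                 ; room for one 2-byte word

section .text
    global _start
_start:
    mov   word [val], 0x0725    ; store the two-byte value
    movzx ebx, byte [val]       ; byte at the lower address: 0x25
    ; movzx ecx, byte [val + 1] would load the high-order byte, 0x07
    mov   eax, 1                ; sys_exit
    int   0x80                  ; exit status 0x25 = 37
```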
When the processor gets the numeric data from memory to register, it again reverses the bytes. There are two kinds of memory addresses:
- An absolute address: a direct reference to a specific location.
- A segment address with an offset: the starting address of a memory segment plus an offset value.
Assembly - Environment Setup
Assembly language is dependent upon the instruction set and the architecture of the processor. In this tutorial, we focus on 32-bit Intel processors such as the Pentium. To follow this tutorial, you will need:
- An IBM PC or any equivalent compatible computer
- A copy of the Linux operating system
- A copy of the NASM assembler program
There are many good assembler programs, like:
- Microsoft Assembler (MASM)
- Borland Turbo Assembler (TASM)
- The GNU assembler (GAS)
We will use the NASM assembler, as it is:
- Free. You can download it from various web sources.
- Well documented, and you will find lots of information about it on the net.
- Usable on both Linux and Windows.
Installing NASM
If you select "Development Tools" while installing Linux, you may get NASM installed along with the Linux operating system and you do not need to download and install it separately. To check whether you already have NASM installed, take the following steps:
- Open a Linux terminal.
- Type whereis nasm and press ENTER.
- If it is already installed, a line like nasm: /usr/bin/nasm appears. Otherwise, you will see just nasm: and you need to install NASM.
To install NASM, take the following steps:
- Check the Netwide Assembler (NASM) website for the latest version.
- Download the Linux source archive nasm-X.XX.tar.gz, where X.XX is the NASM version number in the archive.
- Unpack the archive into a directory, which creates a subdirectory nasm-X.XX.
- cd to nasm-X.XX and type ./configure. This shell script will find the best C compiler to use and set up Makefiles accordingly.
- Type make to build the nasm and ndisasm binaries.
- Type make install to install nasm and ndisasm in /usr/local/bin and to install the man pages.
This should install NASM on your system. Alternatively, you can use an RPM distribution for Fedora Linux. This version is simpler to install: just double-click the RPM file.
Assembly - Basic Syntax
An assembly program can be divided into three sections:
- The data section
- The bss section
- The text section
The data Section
The data section is used for declaring initialized data or constants. The size of this data does not change at runtime. You can declare various constant values, file names, buffer sizes, etc., in this section.
The syntax for declaring the data section is:
section .data
The bss Section
The bss section is used for declaring variables, that is, for reserving space for uninitialized data. The syntax for declaring the bss section is:
section .bss
The text Section
The text section is used for keeping the actual code. This section must begin with the declaration global _start, which tells the linker where the program execution begins.
The syntax for declaring the text section is:
section .text
global _start
_start:
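Put together, the three sections might look like the following minimal sketch. It assumes 32-bit Linux and the int 0x80 system-call interface (sys_write = 4, sys_exit = 1); the names msg, len and buf are chosen for illustration.

```nasm
; skeleton.asm - sketch of a program with data, bss and text sections
; Assumed build: nasm -f elf32 skeleton.asm && ld -m elf_i386 skeleton.o -o skeleton
section .data
    msg  db "Hello", 0xA      ; initialized data: text plus a newline
    len  equ $ - msg          ; length of msg in bytes

section .bss
    buf  resb 16              ; 16 bytes reserved, not initialized (unused here)

section .text
    global _start             ; make the entry point visible to the linker
_start:
    mov  eax, 4               ; sys_write
    mov  ebx, 1               ; file descriptor 1 = stdout
    mov  ecx, msg             ; address of the text
    mov  edx, len             ; number of bytes to write
    int  0x80

    mov  eax, 1               ; sys_exit
    xor  ebx, ebx             ; exit status 0
    int  0x80
```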
Comments
An assembly language comment begins with a semicolon (;). It may contain any printable character, including blanks. It can appear on a line by itself, like:
; This program displays a message on screen
or on the same line along with an instruction, like:
add eax, ebx ; adds ebx to eax
Assembly Language Statements
Assembly language programs consist of three types of statements:
- Executable instructions or instructions
- Assembler directives or pseudo-ops
- Macros
The executable instructions or simply instructions tell the processor what to do. Each instruction consists of an operation code (opcode). Each executable instruction generates one machine language instruction.
The assembler directives or pseudo-ops tell the assembler about the various aspects of the assembly process. These are non-executable and do not generate machine language instructions.
Macros are basically a text substitution mechanism.
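As a sketch of this text substitution, NASM defines a multi-line macro with %macro and %endmacro and refers to its first parameter as %1. The macro name exit_with is hypothetical, and the program assumes 32-bit Linux.

```nasm
; macro.asm - sketch: a one-parameter macro that ends the program
; Assumed build: nasm -f elf32 macro.asm && ld -m elf_i386 macro.o -o macro
%macro exit_with 1          ; hypothetical macro taking one parameter
    mov  eax, 1             ; sys_exit
    mov  ebx, %1            ; %1 is replaced by the macro argument
    int  0x80
%endmacro

section .text
    global _start
_start:
    exit_with 7             ; expands to the three instructions above with %1 = 7
```

The assembler replaces each use of the macro with its body before generating code, so the executable contains only the three instructions.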
Syntax of Assembly Language Statements
Assembly language statements are entered one statement per line. Each statement has the following format:
[label] mnemonic [operands] [;comment]
The fields in the square brackets are optional. A basic instruction has two parts: the first is the name of the instruction (the mnemonic) to be executed, and the second is the operands or parameters of the command.
Following are some examples of typical assembly language statements:
INC COUNT ; Increment the memory variable COUNT
MOV TOTAL, 48 ; Transfer the value 48 to the memory variable TOTAL
ADD AH, BH ; Add the content of the BH register into the AH register
AND MASK1, 128 ; Perform AND operation on the variable MASK1 and 128
ADD MARKS, 10 ; Add 10 to the variable MARKS
MOV AL, 10 ; Transfer the value 10 to the AL register
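The statements above use a generic notation. In NASM, a memory operand is written in square brackets and needs a size keyword when no register fixes the size. The following sketch shows the same statements in NASM syntax for 32-bit Linux; the variable definitions and initial values are assumptions made for illustration.

```nasm
; statements.asm - sketch: the sample statements in NASM syntax
; Assumed build: nasm -f elf32 statements.asm && ld -m elf_i386 statements.o -o statements
section .data
    count  db 0
    total  dw 0
    mask1  db 0xFF
    marks  db 50

section .text
    global _start
_start:
    inc   byte [count]        ; count becomes 1
    mov   word [total], 48    ; total becomes 48
    mov   ah, 1
    mov   bh, 2
    add   ah, bh              ; AH = 1 + 2 = 3
    and   byte [mask1], 128   ; 0xFF AND 0x80 = 0x80
    add   byte [marks], 10    ; marks becomes 60
    mov   al, 10

    movzx ebx, byte [marks]   ; exit status = marks (60)
    mov   eax, 1              ; sys_exit
    int   0x80
```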
Assembly - Constants
There are several directives provided by NASM that define constants. We have already used the EQU directive in previous chapters. We will particularly discuss three directives:
- EQU
- %assign
- %define
The EQU Directive
The EQU directive is used for defining constants. The syntax of the EQU directive is as follows:
CONSTANT_NAME EQU expression
For example,
TOTAL_STUDENTS equ 50
You can then use this constant value in your code, like:
mov ecx, TOTAL_STUDENTS
cmp eax, TOTAL_STUDENTS
The operand of an EQU statement can be an expression:
LENGTH equ 20
WIDTH equ 10
AREA equ LENGTH * WIDTH
The above code segment defines AREA as 200.
Assembly - Arithmetic Instructions
The INC Instruction
The INC instruction is used for incrementing an operand by one. It works on a single operand that can be either in a register or in memory.
SYNTAX:
The INC instruction has the following syntax:
INC destination
The operand destination could be an 8-bit, 16-bit or 32-bit operand.
EXAMPLE:
INC EBX ; Increments 32-bit register
INC DL ; Increments 8-bit register
INC WORD [count] ; Increments the 16-bit count variable; NASM needs a size keyword for a memory operand
The DEC Instruction
The DEC instruction is used for decrementing an operand by one. It works on a single operand that can be either in a register or in memory.
SYNTAX:
The DEC instruction has the following syntax:
DEC destination
The operand destination could be an 8-bit, 16-bit or 32-bit operand.
EXAMPLE:
segment .data
    count dw 0
    value db 15
segment .text
    inc word [count]
    dec byte [value]
    mov ebx, count
    inc word [ebx]
    mov esi, value
    dec byte [esi]
The ADD and SUB Instructions
The ADD and SUB instructions are used for performing simple addition/subtraction of binary data in byte, word and doubleword size, i.e., for adding or subtracting 8-bit, 16-bit or 32-bit operands, respectively.
SYNTAX:
The ADD and SUB instructions have the following syntax:
ADD/SUB destination, source
The ADD/SUB instruction can take place between:
- Register to register
- Memory to register
- Register to memory
- Register to constant data
- Memory to constant data
However, as with most other instructions, memory-to-memory operations are not possible using ADD/SUB instructions. An ADD or SUB operation sets or clears the overflow and carry flags.
EXAMPLE:
The following example asks the user for two digits, stores the digits in the EAX and EBX registers, respectively, adds the values, stores the result in a memory location 'res' and finally displays the result. Because the result is printed as a single character, the sum must not exceed 9.
SYS_EXIT equ 1
SYS_READ equ 3
SYS_WRITE equ 4
STDIN equ 0
STDOUT equ 1

segment .data
    msg1 db "Enter a digit ", 0xA, 0xD
    len1 equ $ - msg1
    msg2 db "Please enter a second digit", 0xA, 0xD
    len2 equ $ - msg2
    msg3 db "The sum is: "
    len3 equ $ - msg3

segment .bss
    num1 resb 2
    num2 resb 2
    res resb 1

section .text
    global _start ; must be declared for the linker (ld)
_start: ; tell linker entry point
    mov eax, SYS_WRITE
    mov ebx, STDOUT
    mov ecx, msg1
    mov edx, len1
    int 0x80

    mov eax, SYS_READ
    mov ebx, STDIN
    mov ecx, num1
    mov edx, 2
    int 0x80

    mov eax, SYS_WRITE
    mov ebx, STDOUT
    mov ecx, msg2
    mov edx, len2
    int 0x80

    mov eax, SYS_READ
    mov ebx, STDIN
    mov ecx, num2
    mov edx, 2
    int 0x80

    mov eax, SYS_WRITE
    mov ebx, STDOUT
    mov ecx, msg3
    mov edx, len3
    int 0x80

    ; move the first digit to eax and the second digit to ebx (one byte each, zero-extended)
    ; and subtract ASCII '0' to convert each into a decimal number
    movzx eax, byte [num1]
    sub eax, '0'
    movzx ebx, byte [num2]
    sub ebx, '0'

    ; add eax and ebx
    add eax, ebx
    ; add '0' to convert the sum from decimal to ASCII
    add eax, '0'

    ; store the low byte of the sum in memory location res
    mov [res], al

    ; print the sum
    mov eax, SYS_WRITE
    mov ebx, STDOUT
    mov ecx, res
    mov edx, 1
    int 0x80

exit:
    mov eax, SYS_EXIT
    xor ebx, ebx
    int 0x80
When the above code is assembled, linked and executed, it produces a result like the following (with 3 and 4 typed as input):
Enter a digit
3
Please enter a second digit
4
The sum is: 7
The MUL/IMUL Instruction
There are two instructions for multiplying binary data. The MUL (Multiply) instruction handles unsigned data and the IMUL (Integer Multiply) instruction handles signed data. Both instructions affect the Carry and Overflow flags.
SYNTAX:
The syntax for the MUL/IMUL instructions is as follows:
MUL/IMUL multiplier
In both cases the multiplicand is in an accumulator register chosen by the size of the multiplier, and the generated product is stored in a register pair (AH:AL, DX:AX or EDX:EAX) that depends on the size of the operands. The following section explains the MUL instruction with three different cases:
1. When two bytes are multiplied
The multiplicand is in the AL register, and the multiplier is a byte in memory or in another register. The product is in AX. The high-order 8 bits of the product are stored in AH and the low-order 8 bits are stored in AL.
2. When two one-word values are multiplied
The multiplicand should be in the AX register, and the multiplier is a word in memory or in another register. For example, for an instruction like MUL DX, you must store the multiplier in DX and the multiplicand in AX. The resultant product is a doubleword, which needs two registers. The high-order (leftmost) portion is stored in DX and the low-order (rightmost) portion is stored in AX.
3. When two doubleword values are multiplied
The multiplicand should be in EAX, and the multiplier is a doubleword value stored in memory or in another register. The product is stored in the EDX:EAX registers, i.e., the high-order 32 bits are stored in the EDX register and the low-order 32 bits are stored in the EAX register.
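The byte and word cases can be sketched in one short program. It assumes 32-bit Linux; the operand values are chosen for illustration so that the word product needs both DX and AX.

```nasm
; mul.asm - sketch: unsigned multiplication of bytes and of words
; Assumed build: nasm -f elf32 mul.asm && ld -m elf_i386 mul.o -o mul
section .text
    global _start
_start:
    ; byte case: AL * BL -> AX
    mov   al, 20
    mov   bl, 10
    mul   bl              ; AX = 200 (AH = 0x00, AL = 0xC8)

    ; word case: AX * CX -> DX:AX
    mov   ax, 1000
    mov   cx, 1000
    mul   cx              ; DX:AX = 1,000,000 = 0x000F4240
                          ; DX = 0x000F, AX = 0x4240

    movzx ebx, dx         ; exit status = high-order word (15)
    mov   eax, 1          ; sys_exit
    int   0x80
```

The doubleword case works the same way with 32-bit registers, for example mul ecx, which leaves the product in EDX:EAX.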