Compilation proceeds as a sequence of phases. Each phase takes its input from the previous phase and passes its output on to the next.
LEXICAL ANALYSIS
The first phase of the compiler works as a text scanner. The lexical analyzer reads the source code as a stream of characters, groups the characters into meaningful lexemes, and represents each lexeme as a token of the form <token-name, attribute-value>.
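A scanner of this kind can be sketched with regular expressions. The following is a minimal illustration, not a production lexer: the token categories (NUMBER, ID, OP), the tiny expression language, and the function name tokenize are all assumptions made for the example.

```python
import re

# Hypothetical token categories for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),          # whitespace is matched but not emitted
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token_name, lexeme) pairs for the source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("x = 3 + 42"))
# [('ID', 'x'), ('OP', '='), ('NUMBER', '3'), ('OP', '+'), ('NUMBER', '42')]
```

Here the lexeme itself stands in for the attribute value. A real compiler would more likely store a symbol-table reference or a converted numeric value in that slot.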
SYNTAX ANALYSIS
Syntax analysis takes the tokens produced by lexical analysis as input and generates a parse tree (or syntax tree). The arrangement of the tokens is checked against the grammar of the source language.
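As a rough sketch of how tokens become a tree, here is a small recursive-descent parser for arithmetic expressions. The grammar, the (token_name, lexeme) pair format, and the nested-tuple tree shape are assumptions chosen for the example, not a fixed convention.

```python
def parse(tokens):
    """Parse a list of (token_name, lexeme) pairs into a nested-tuple syntax tree.

    Assumed grammar:
        expr   -> term (('+' | '-') term)*
        term   -> factor (('*' | '/') factor)*
        factor -> NUMBER | ID
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        name, lexeme = advance()
        if name == "NUMBER":
            return ("num", int(lexeme))
        if name == "ID":
            return ("id", lexeme)
        raise SyntaxError(f"expected a number or identifier, got {lexeme!r}")

    def term():
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree

tokens = [("ID", "a"), ("OP", "+"), ("NUMBER", "2"), ("OP", "*"), ("ID", "b")]
print(parse(tokens))
# ('+', ('id', 'a'), ('*', ('num', 2), ('id', 'b')))
```

Because term is called from expr, multiplication binds tighter than addition, which is why 2 * b ends up as a subtree of the + node. A token sequence that does not fit the grammar raises SyntaxError.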
SEMANTIC ANALYSIS
Semantic analysis checks whether the parse tree follows the rules of the language, for example that values are assigned only between compatible data types and that a string is not added to an integer. It keeps track of identifiers, their types, and expressions, and produces an annotated syntax tree as output.
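The type check described above might look like the following sketch. The node shapes ("num", "str", "id", and binary operators as nested tuples), the symbol table contents, and the rule that both operands must have the same type are simplifying assumptions for illustration.

```python
def annotate(node, symbols):
    """Return a copy of node with its type appended as the last element."""
    kind = node[0]
    if kind == "num":
        return node + ("int",)
    if kind == "str":
        return node + ("string",)
    if kind == "id":
        if node[1] not in symbols:
            raise NameError(f"undeclared identifier {node[1]!r}")
        return node + (symbols[node[1]],)
    # Anything else is treated as a binary operator: (op, left, right).
    op, left, right = node
    left, right = annotate(left, symbols), annotate(right, symbols)
    if left[-1] != right[-1]:
        raise TypeError(f"operator {op!r} applied to {left[-1]} and {right[-1]}")
    return (op, left, right, left[-1])

symbols = {"count": "int", "name": "string"}   # hypothetical symbol table

print(annotate(("+", ("id", "count"), ("num", 1)), symbols))
# ('+', ('id', 'count', 'int'), ('num', 1, 'int'), 'int')

try:
    annotate(("+", ("id", "name"), ("num", 1)), symbols)
except TypeError as error:
    print("semantic error:", error)
```

The first call succeeds and returns the tree with a type attached to every node. The second call is rejected because it tries to add a string to an integer.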
INTERMEDIATE CODE GENERATION
After semantic analysis, the compiler generates an intermediate representation of the source program. This intermediate code is a program for some abstract machine, and it is designed to be easy to translate into the target machine code.
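One common intermediate form is three-address code, in which each instruction has at most one operator. The sketch below flattens an expression tree into that form. The nested-tuple tree shape and the temporary names t1, t2, ... are assumptions made for the example.

```python
def to_three_address(tree):
    """Flatten a nested-tuple expression tree into three-address instructions.

    Returns (instructions, name_holding_the_result).
    """
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if node[0] in ("num", "id"):
            return str(node[1])          # leaves are used directly as operands
        op, left, right = node
        left_name, right_name = emit(left), emit(right)
        counter += 1
        temp = f"t{counter}"             # fresh temporary for this sub-result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = emit(tree)
    return code, result

tree = ("+", ("id", "a"), ("*", ("num", 2), ("id", "b")))
code, result = to_three_address(tree)
for line in code:
    print(line)
# t1 = 2 * b
# t2 = a + t1
```

Each instruction refers only to names and temporaries, so it no longer depends on the nesting of the source expression. That is what makes the later translation to machine instructions simpler.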
CODE OPTIMIZATION
Optimization removes unnecessary code and rearranges the sequence of statements so that the program executes faster without wasting resources.
CODE GENERATION
This phase takes the optimized intermediate code and maps it to the target machine language, producing a sequence of relocatable machine code.
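The last two phases can be illustrated together on a toy scale. The sketch below applies one simple optimization, constant folding, and then emits instructions for a hypothetical stack machine. The instruction names (PUSH, LOAD, ADD, SUB, MUL) and the nested-tuple tree shape are invented for the example and do not correspond to any real instruction set.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL"}   # hypothetical instruction names

def fold(node):
    """Constant folding: evaluate operators whose operands are both literals."""
    if node[0] in ("num", "id"):
        return node
    op, left, right = node
    left, right = fold(left), fold(right)
    if left[0] == "num" and right[0] == "num":
        return ("num", OPS[op](left[1], right[1]))
    return (op, left, right)

def generate(node, out):
    """Emit code for a hypothetical stack machine, operands before operator."""
    if node[0] == "num":
        out.append(f"PUSH {node[1]}")
    elif node[0] == "id":
        out.append(f"LOAD {node[1]}")
    else:
        op, left, right = node
        generate(left, out)
        generate(right, out)
        out.append(MNEMONICS[op])
    return out

tree = ("+", ("id", "a"), ("*", ("num", 2), ("num", 3)))
print(generate(tree, []))        # unoptimized
# ['LOAD a', 'PUSH 2', 'PUSH 3', 'MUL', 'ADD']
print(generate(fold(tree), []))  # after constant folding
# ['LOAD a', 'PUSH 6', 'ADD']
```

Folding 2 * 3 at compile time removes two instructions from the output while leaving the computed result unchanged. Real code generators also handle register allocation and address relocation, which this sketch leaves out.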