The Structure of a .COM File

A .COM file consists entirely of executable code and data. When the file Hello.COM is executed (by typing either Hello or Hello.COM at the DOS prompt), the contents of the file are simply loaded into memory, and execution starts with the first byte. All of the segment registers are set to point to a single 64K segment starting 256 bytes before the address where the program was loaded, so execution in fact starts at CS:0100h. The first 256 bytes of the segment make up the Program Segment Prefix (PSP), which contains various pieces of information about the executing program. Most of it is obsolete, a holdover from the days when MS-DOS was first designed as a clone of the old CP/M operating system, which was developed in the mid-1970s to run on the original Intel 8080.
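To make the loading rule concrete, here is a minimal sketch of what the source for Hello.COM might look like. The labels start and message, the message text, and the use of DOS functions 09h and 4Ch are illustrative choices, not something the description above prescribes. The org 100h directive tells NASM that the first byte of the file will sit at offset 100h, directly after the PSP.

```nasm
; hello.asm - a minimal .COM program (illustrative sketch)
; Assemble with: nasm -f bin hello.asm -o hello.com

        org     100h            ; file contents are loaded at offset 100h, after the PSP

start:                          ; first byte of the file = first instruction executed
        mov     dx, message     ; DS already equals CS, so DS:DX points at the string
        mov     ah, 09h         ; DOS function 09h: print a '$'-terminated string
        int     21h
        mov     ax, 4C00h       ; DOS function 4Ch: terminate with exit code 0
        int     21h

message db      'Hello, world!', 0Dh, 0Ah, '$'
```

Because all segment registers already point at the same segment, the program never has to load DS before using the address of message.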

The most useful field in the PSP is the tail of the command line; for example, if Hello.COM had been executed by typing Hello/full C:\Temp, then the string /full C:\Temp would be stored in the PSP. The program can access this argument string starting at offset 80h; the first byte gives the length of the tail (13 in the example), and that many bytes starting at 81h contain the string itself. The string is terminated with a carriage return character (ASCII code 0Dh), which is not included in the count.
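As a rough illustration of reading the command tail, the sketch below prints it back to the screen. It assumes the layout described above (length byte at 80h, text from 81h, carriage return after the last character) and uses DOS function 09h, which expects a '$' terminator; the file name tail.asm and the label start are hypothetical.

```nasm
; tail.asm - print the command tail stored in the PSP (illustrative sketch)
; Assemble with: nasm -f bin tail.asm -o tail.com

        org     100h

start:
        xor     bx, bx
        mov     bl, [80h]               ; BX = length byte at PSP offset 80h
        mov     byte [81h + bx], '$'    ; overwrite the terminating CR with '$'
        mov     dx, 81h                 ; DS:DX -> first character of the tail
        mov     ah, 09h                 ; DOS function 09h: print up to the '$'
        int     21h
        mov     ax, 4C00h               ; DOS function 4Ch: terminate with exit code 0
        int     21h
```

One limitation of this shortcut is that a '$' typed on the command line would cut the output short; a more careful version would loop over the counted bytes and print them one at a time.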

Since all of the segment registers point to the same segment, the structure of a typical .COM program in memory is as follows:

PSP
Program Text
Initialized Data
Uninitialized Data
Free Space
Stack

The program text and initialized data are the bytes that are read in from the .COM file, corresponding to the .text and .data sections of the NASM source. The PSP is generated by the operating system, and the stack is automatically arranged to grow down from the top of the segment. The uninitialized data, corresponding to the bytes reserved in the .bss section, are carved out of the free space between the loaded bytes and the growing stack; since they were not explicitly initialized before execution, they will start out containing whatever garbage was left in those locations of physical memory by the previous programs.
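The mapping from NASM sections to this layout can be sketched as follows. The names greeting, buffer and BUFSIZE are made up for the example. With the flat binary output format, NASM writes the .text and .data bytes to the file and only reserves addresses for .bss after them, so buffer takes up no space in the .COM file itself.

```nasm
; layout.asm - the three sections of a .COM program (illustrative sketch)
; Assemble with: nasm -f bin layout.asm -o layout.com

        org     100h

BUFSIZE equ     64              ; hypothetical buffer size

section .text
start:
        mov     di, buffer      ; ES = DS = CS, so ES:DI -> buffer
        mov     cx, BUFSIZE
        xor     al, al
        cld
        rep stosb               ; zero the buffer: .bss starts out holding garbage

        mov     dx, greeting
        mov     ah, 09h         ; DOS function 09h: print a '$'-terminated string
        int     21h
        mov     ax, 4C00h       ; DOS function 4Ch: terminate with exit code 0
        int     21h

section .data
greeting db     'Ready', 0Dh, 0Ah, '$'

section .bss
buffer  resb    BUFSIZE         ; reserved after the loaded bytes, not stored in the file
```

Clearing the buffer by hand is the design point here: nothing in the loader does it, so a program that relies on zeroed .bss storage has to do the zeroing itself.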