R. Clayton (rclayton@monmouth.edu)
(no date)
I compile and link a program:
/home/nobody/useful_dir/gal>ls -lrt hello*
-rwxr-xr-x 1 nobody devl 20624 Nov 12 01:32 hello
-rw-r--r-- 1 nobody devl 88 Nov 12 01:32 hello.c
Can I conclude that the size of the hello-world process is 20,624 bytes?
Not really. The disk-file size of an executable does not have a strong
relation to the size of the process created from the executable. If the
executable file is big, chances are the process will be big too, but not
necessarily; and small executables don't necessarily lead to small processes
either. The exact nature and strength of the relation depends on the OS, but
the imprecision holds on all major general-purpose OSs.
Although I haven't had a chance to cover it in lecture yet, the internal
structure of a process is more complicated than we've been assuming.
However, because you've already covered this when you read Chapter 11
(particularly Figure 11.11), I'll over-simplify. A process is split into
three parts: code, uninitialized globals, and initialized globals. You can
use the size command on *nix to find the size of each of these parts:
$ cat t.c
#include <stdio.h>
int main() {
    printf("hello world!\n");
    return 0;
}
$ gcc -o t t.c
$ ls -l t
-rwx------ 1 rclayton faculty 6608 Nov 12 10:04 t
$ size t
   text    data     bss     dec     hex filename
   1778     264      32    2074     81a t
$
The executable file for t contains 1778 code (text) bytes, 264 bytes of
initialized globals (data), and 32 bytes of uninitialized globals (bss, block
started by symbol) for a total of 2074 bytes (decimal). The other 6608 -
2074 = 4534 bytes are overhead required by the linker, loader, and other
executable-manipulation tools, including the OS.
To emphasize the point, let's throw more stuff into the executable file by
compiling and linking for debug, which adds symbol-table information that
normally isn't included.
$ gcc -gstabs -o t t.c
$ ls -l t
-rwx------ 1 rclayton faculty 9088 Nov 12 10:10 t
$ size t
   text    data     bss     dec     hex filename
   1778     264      32    2074     81a t
$
The text, data, and bss sizes are the same, but the file size has increased
by about 2500 bytes (9088 - 6608 = 2480). The executable part (text + data +
bss) doesn't change because the extra information is used only by the
debugger and isn't required for execution.
We can beat this point into the ground by compiling the program for
profiling, which does modify the code, in this case to collect the
function-call counts and timing data used by gprof:
$ gcc -pg -o t t.c
$ ls -l t
-rwx------ 1 rclayton faculty 11396 Nov 12 10:17 t
$ size t
   text    data     bss     dec     hex filename
   5689     280      80    6049    17a1 t
$
The text has roughly tripled in size (1778 to 5689 bytes), mostly due to the
code added to collect the profiling data. The data has grown by 16 bytes and
the bss by 48 bytes, which the profiling code uses for its bookkeeping.
We can beat this point into the ground in a different way by giving the
hello-world program some bss:
$ cat t.c
#include <stdio.h>
int data[100];
int main() {
    printf("hello world!\n");
    return 0;
}
$ gcc -o t t.c
$ ls -l t
-rwx------ 1 rclayton faculty 6656 Nov 12 10:26 t
$ size t
   text    data     bss     dec     hex filename
   1803     264     432    2499     9c3 t
$
As expected, the bss size increased by 400 (= 100*sizeof(int)) bytes and the
data size didn't increase because the amount of initialized global data
hasn't changed. I don't know why the text size changed; the code is the same
in either program.
If we add some initialized global data, the data size will change too:
$ cat t.c
#include <stdio.h>
int data[100];
char tag[] = "A man likes milk, so he owns a million cows.";
int main() {
    printf("hello world!\n");
    return 0;
}
$ gcc -o t t.c
$ ls -l t
-rwx------ 1 rclayton faculty 6748 Nov 12 10:35 t
$ size t
   text    data     bss     dec     hex filename
   1827     312     432    2571     a0b t
$
Now the bss size is unchanged and the data size has increased by 48 bytes (44
for the string, 1 for the null byte, and 3 pad bytes to maintain an 8-byte
alignment for whatever follows). The code size has changed again, and again
I don't know why.
(C-C++ savvy readers might want to object to my characterization of data[] as
uninitialized globals by pointing out that the C-C++ standards require that
global ints be initialized to 0. That's true, but because the
initialization value (0) is known to the OS and the executable-manipulation
tools, it can be delayed until as late as possible, usually when a page of
bss data is being paged in for the first time.)