Many files in the UNIX system have standard headers that contain a magic number. For example, the magic number may indicate that the file is executable only on a particular machine architecture, or that it is a file archive. The exec system call, debuggers such as dbXtra, object library archivers such as ar, and file archivers such as cpio look for a valid header in files presented to them for processing. Programs that can handle several different file formats can switch how they deal with a file depending on its magic number.
The file /etc/magic is a database of magic numbers used by file (and by utilities such as more) to help determine the probable use of a regular file. If none of the magic numbers in /etc/magic matches, file applies other heuristic tests to the first 512 bytes of the file. For example, /etc/magic does not define magic numbers for data files, ASCII or English text files, or assembler, C, or shell programs; these are recognized heuristically.
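The lookup order described above can be sketched in Python. This is a minimal illustration, not the implementation of file: the table `MAGIC_TABLE`, the function `classify`, and the printable-text fallback are hypothetical names and assumptions. Only the two sample signatures and the 512-byte window are taken from this page.

```python
import string

# Hypothetical in-memory stand-in for a few string-type /etc/magic entries:
# (offset, expected bytes, message).
MAGIC_TABLE = [
    (0, b"MZ", "DOS executable (EXE)"),
    (0, b"%!", "PostScript text"),
]

# Byte values considered printable ASCII by the fallback heuristic.
PRINTABLE = set(string.printable.encode("ascii"))


def classify(data: bytes) -> str:
    """Guess a file's type from its leading bytes."""
    head = data[:512]  # only the first 512 bytes are examined
    for offset, expected, message in MAGIC_TABLE:
        if head[offset:offset + len(expected)] == expected:
            return message
    # No magic number matched: fall back to a crude heuristic. This is an
    # assumption, far simpler than the tests file really applies.
    if not head:
        return "empty"
    if all(b in PRINTABLE for b in head):
        return "ascii text"
    return "data"
```

A caller would pass the leading bytes of a file, for example `classify(open(path, "rb").read(512))`, where `path` is whatever file is being examined.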
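/etc/magic contains four fields separated by tab characters. As the entries under ``Examples'' show, these are, in order: a byte offset into the file, the type of the value stored there (for example short, long, or string), the value to match, and the text to print when the entry matches.

The value may be expressed in decimal, octal (with a leading 0), or hexadecimal (with a leading 0x). The type string always requires an exact match.

The text to print may contain a printf-style format specification: %s for type string, and %ld, %lo, or %lx for the other types. (Internally, the numeric types are converted to signed longs.) file uses the format specification to print the magic number that it finds.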
Empty lines are ignored.
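The field layout can be made concrete with a small parser. The following Python sketch is an illustration only: `MagicEntry`, `parse_number`, and `parse_entry` are hypothetical names, and the handling of a leading = or > on the value is inferred from the sample entries on this page rather than from a full specification.

```python
import re
from typing import NamedTuple, Optional, Union


class MagicEntry(NamedTuple):
    continuation: bool       # True if the offset had a leading '>'
    offset: int
    type: str                # e.g. "short", "long", "string"
    operator: str            # "=" or ">"; "=" when none is written
    value: Union[int, str]   # int for numeric types, str for type string
    message: str


def parse_number(text: str) -> int:
    """Read decimal, octal (leading 0) or hexadecimal (leading 0x)."""
    if text.lower().startswith("0x"):
        return int(text, 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


def parse_entry(line: str) -> Optional[MagicEntry]:
    """Split one tab-separated entry into its four fields."""
    line = line.rstrip("\n")
    if not line.strip():
        return None  # empty lines are ignored
    # Assumption: a run of several tabs counts as one separator.
    offset_field, type_field, value_field, message = re.split(
        r"\t+", line, maxsplit=3)
    continuation = offset_field.startswith(">")
    offset = parse_number(offset_field.lstrip(">"))
    if type_field == "string":
        # Strings are compared literally, so no operator is parsed.
        return MagicEntry(continuation, offset, type_field, "=",
                          value_field, message)
    operator = "="
    if value_field[0] in "=>":
        operator, value_field = value_field[0], value_field[1:]
    return MagicEntry(continuation, offset, type_field, operator,
                      parse_number(value_field), message)
```

For instance, `parse_entry("0\tshort\t070707\tcpio archive")` yields an entry whose value is the octal number 070707 read as an integer.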
You can create your own magic file as described in ``Examples''.
    0x61    long      0x20000000    tar archive
    0x61    long      0x30000000    tar archive
    0x61    long      0x31000000    tar archive
    0       short     070707        cpio archive
    0       short     0xa01f        LZH-compressed data
    0       string    MZ            DOS executable (EXE)
    0       short     =0514         iAPX 386 executable

The /etc/magic entries for tar archives could be modified to display the magic number value found. In this example, the values would be printed in decimal, octal, and hexadecimal respectively for the three archive types:
    0x61    long    0x20000000    tar archive - dec magic %ld
    0x61    long    0x30000000    tar archive - oct magic %lo
    0x61    long    0x31000000    tar archive - hex magic %lx

The following entry from /etc/magic demonstrates the use of continuation lines to extract more information from a file:
    0       short    =0514    iAPX 386 executable
    >12     long     >0       not stripped
    >22     short    >0       - version %ld

If file(C) identifies a file as an iAPX 386 executable (in COFF or Common Object File Format), it looks at the values of the long and short words at offsets of 12 and 22 bytes. If the long word is greater than zero, file reports that the executable has not been stripped of its symbol table and line number information. A non-zero value in the short word is interpreted as a version number placed in the file by an assembler or by a linker.
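The way the continuation lines build on the first match can be sketched in Python. This is a hedged illustration, not the code of file: the function name `describe_i386_coff` is hypothetical, little-endian byte order is assumed because the target is an Intel 386, and joining the message parts with single spaces is also an assumption. The sketch follows the entry as written, so both continuation tests require a value greater than zero.

```python
import struct
from typing import Optional


def describe_i386_coff(data: bytes) -> Optional[str]:
    """Apply the iAPX 386 entry and its two continuation lines."""
    if len(data) < 24:
        return None  # too short to hold the short word at offset 22
    # Offset 0, type short, value =0514 (octal). Numeric values are
    # treated as signed, as the format description states.
    (magic,) = struct.unpack_from("<h", data, 0)
    if magic != 0o514:
        return None
    parts = ["iAPX 386 executable"]
    # Continuation line: >12 long >0
    (long_word,) = struct.unpack_from("<l", data, 12)
    if long_word > 0:
        parts.append("not stripped")
    # Continuation line: >22 short >0, printed through %ld.
    (short_word,) = struct.unpack_from("<h", data, 22)
    if short_word > 0:
        # Python ignores the 'l' length modifier, so %ld behaves like %d.
        parts.append("- version %ld" % short_word)
    return " ".join(parts)
```

The continuation lines are consulted only after the offset 0 test succeeds, which is why the function returns early when the magic number does not match.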
The file utility does not recognize PostScript headers. To allow it to do so, you can create your own magic file by copying /etc/magic and adding the following line to it:
    0    string    %!    PostScript text

file will now identify PostScript files if you tell it to use your magic file: