file

What is file?

file is a Linux tool that identifies the type of a file, mainly by reading the magic bytes of the file. For example:

file example

We have the following 3 files:

  • image.jpeg

  • information.txt

  • test.html

In this scenario we have an image file, a text file and an HTML file. The file command tells these files apart by looking at their content (the magic bytes, or the text itself for plain text files), not at their names. The output below shows what the command "file" reports for each of them:

root@Corrosie:~/test# file image.jpeg 
image.jpeg: JPEG image data, JFIF standard 1.01, aspect ratio, density 1x1, segment length 16, baseline, precision 8, 371x136, components 3
root@Corrosie:~/test# file information.txt 
information.txt: ASCII text 
root@Corrosie:~/test# file test.html 
test.html: HTML document, ASCII text

How to use file

# syntax of "file"
file <OPTIONS> <FILE>

##### examples
file <FILE> # shows the file type based on the magic bytes, no matter which extension the file has
file -i <FILE> # outputs the MIME type (content type) and the charset of a file
file --mime-type <FILE> # outputs only the MIME type of a file
file -z <COMPRESSEDFILE> # tries to look inside a compressed file

file --help # shows the help menu of the tool "file"
file --version # shows the current version of the tool "file"
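The options above can be combined in a script. The sketch below is only an illustration: classify is a hypothetical helper name, the demo path is made up, and it assumes a Linux file that supports -b (brief output, which drops the "name:" prefix) together with --mime-type.

```shell
# Hypothetical helper: sort a path into a rough category
# based on the MIME type that file reports for its content.
classify() {
    # -b prints only the MIME type, without the leading "name:" part.
    mime=$(file -b --mime-type -- "$1") || return 1
    case "$mime" in
        image/*) echo "image" ;;
        text/*)  echo "text" ;;
        *)       echo "other ($mime)" ;;
    esac
}

# Demo on a throwaway text file (path is an assumption for illustration).
demo_dir=$(mktemp -d)
printf 'just some notes\n' > "$demo_dir/information.txt"
classify "$demo_dir/information.txt"
```

Matching on the MIME type is usually easier than parsing the long human-readable description, which differs between versions of file.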

A file type is usually recognized in two ways: by the extension in the name of the file, and by the magic bytes (signature) in its content. The file command relies on the content, not on the extension.

What are magic bytes?

Magic bytes are usually the first few bytes of a file, and in some cases also the last few bytes. Magic bytes are also called a file signature.
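To look at these bytes directly, the first or last bytes of a file can be dumped as hex. This is a minimal sketch using head, tail and od; the helper names first_bytes_hex and last_bytes_hex are hypothetical, and the sample file is a tiny stand-in, not a valid PDF.

```shell
# Hypothetical helpers: print the first or last N bytes (default 8) as hex pairs.
# The unquoted $(...) is on purpose: word splitting collapses od's padding and newlines.
first_bytes_hex() {
    echo $(head -c "${2:-8}" < "$1" | od -An -tx1)
}
last_bytes_hex() {
    echo $(tail -c "${2:-8}" < "$1" | od -An -tx1)
}

# Stand-in sample that starts with %PDF- and ends with %%EOF.
sample=$(mktemp)
printf '%%PDF-1.4\n...\n%%%%EOF\n' > "$sample"

first_bytes_hex "$sample" 5   # 25 50 44 46 2d
last_bytes_hex  "$sample" 6   # 25 25 45 4f 46 0a
```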

Let's say that we have 2 files, image.txt and information.txt. Going by the names and extensions alone, we cannot know for certain whether a file is a text file or an image file. This is where magic bytes come into play: they let us check whether a file really is what its name claims it is.
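The scenario can be tried out with two throwaway files. This sketch assumes file is installed; the image.txt created here holds only a short JPEG/JFIF header written with printf, not a viewable image, and the exact description that file prints may differ between versions.

```shell
# Throwaway directory; the names mirror the image.txt / information.txt scenario.
work=$(mktemp -d)

# image.txt: starts with a JPEG/JFIF header (FF D8 FF E0 ...) despite the .txt name.
printf '\377\330\377\340\000\020JFIF\000' > "$work/image.txt"

# information.txt: ordinary text.
printf 'just some notes\n' > "$work/information.txt"

# file looks at the content, so the two .txt files are reported differently.
file "$work/image.txt" "$work/information.txt"
```

Both names end in .txt, yet file should report the first one as JPEG image data because of its leading bytes.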

When looking up this link, you will find a lot of different file signatures (file signatures = magic bytes). Magic bytes are usually written in hexadecimal: each pair of hex digits is one byte, and those bytes often show up as a few special characters when the file itself is opened as text.

For example :

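JPEG image files start with the value :

FF D8 FF (hexadecimal) = ÿØÿ (what you will see in the file itself, the signature). The fourth byte varies with the kind of JPEG, for example FF D8 FF DB (ÿØÿÛ) or FF D8 FF E0 (ÿØÿà, used by JFIF files such as the image.jpeg above).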

PDF files will always start with the value :

25 50 44 46 2D (hexadecimal) = %PDF- (what you will see in the file itself, the signature)
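A signature check like this can be sketched in a few lines of shell. The helper name sniff_type is hypothetical and it only knows the two signatures from this section. For JPEG it matches on the first three bytes (FF D8 FF), an assumption made so that files with a different fourth byte are also recognized.

```shell
# Hypothetical helper: guess a type from the leading bytes of a file.
sniff_type() {
    # First 5 bytes as lowercase hex pairs on one line, e.g. "25 50 44 46 2d".
    # The inner $(...) is unquoted on purpose so od's padding is collapsed.
    hex=$(echo $(head -c 5 < "$1" | od -An -tx1))
    case "$hex" in
        "ff d8 ff"*)      echo "JPEG" ;;      # FF D8 FF, fourth byte varies
        "25 50 44 46 2d") echo "PDF" ;;       # %PDF-
        *)                echo "unknown" ;;
    esac
}

# Demo files that contain only a signature, not a valid document.
sig_dir=$(mktemp -d)
printf '\377\330\377\333' > "$sig_dir/a.bin"   # FF D8 FF DB
printf '%%PDF-1.4\n'      > "$sig_dir/b.bin"   # 25 50 44 46 2D ...
printf 'plain text\n'     > "$sig_dir/c.bin"   # no known signature

for f in "$sig_dir"/*.bin; do
    printf '%s: %s\n' "${f##*/}" "$(sniff_type "$f")"
done
# a.bin: JPEG
# b.bin: PDF
# c.bin: unknown
```

The real file command works from a much larger database of signatures and also inspects text content, so this is only a toy version of the idea.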

