How to Archive and Compress Files with the tar and gzip Commands in Linux

June 29th, 2022
Archiving, compressing, and extracting files are some of the most common tasks for a Linux administrator. If you've ever worked with "tarball" files that have the .tar, .tar.gz, .tar.xz, or .tar.bz2 extension, there's a good chance they were created using the tar utility.

In this article, we'll demonstrate how you can use the tar utility to archive, compress, and extract files on Linux systems. We'll use Ubuntu 20.04 for all examples, but you can follow along on any Linux system that uses tar.

What is tar?

tar — short for "tape archive" — is a GNU command line tool for creating and extracting archives.

An archive is a single file that contains multiple files or directories. In open source and Linux communities, tarballs are one of the most common methods for distributing source code and other important files.

In addition to creating archives, tar can perform compression and decompression using several different compression utilities such as gzip and bzip2.

tar vs. gzip

When dealing with Linux archives, you're likely to hear about tar and gzip, often in similar contexts.

The basic difference between the two tools is: tar creates archives from multiple files while gzip compresses files.

However, these tools are not mutually exclusive. tar can use gzip to compress the files it archives. tar's z switch makes the tar command use gzip.
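The relationship between the two tools can be sketched as follows. This is a minimal example that assumes tar and gzip are installed; the demo directory and the archive names are made up for illustration.

```shell
set -e
# Work in a throwaway directory (hypothetical demo data).
workdir=$(mktemp -d)
cd "$workdir"
mkdir demo
echo "first file" > demo/a.txt
echo "second file" > demo/b.txt

# Two steps: tar bundles the files, then gzip compresses the bundle.
tar -cf two-step.tar demo
gzip two-step.tar            # replaces two-step.tar with two-step.tar.gz

# One step: the z switch tells tar to run the archive through gzip.
tar -czf one-step.tar.gz demo

# Both archives list the same members.
tar -tzf two-step.tar.gz
tar -tzf one-step.tar.gz
```

Either route should leave you with a gzip-compressed tarball; the one-step form is simply more convenient.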

gzip vs. bzip2 vs. xz Compression

gzip isn't the only compression program tar can use. It also supports bzip2 and xz. The table below details some of the basic differences between these compression tools.

Feature                | gzip               | bzip2           | xz
Compression algorithm  | DEFLATE            | Burrows–Wheeler | LZMA
Common file extensions | .tar.gz, .tgz, .gz | .tar.bz2, .bz2  | .tar.xz, .xz
tar command switch     | -z                 | -j              | -J

Generally, gzip is the fastest and most widely used of the three, while bzip2 usually produces somewhat smaller files at the cost of speed. xz tends to give the best overall compression but also takes more time and memory.

Note: In our examples, we'll focus on using gzip. Replacing -z with -j in the commands will use bzip2 instead of gzip. Using -J instead of -z will use xz instead of gzip.
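To see how the three switches compare on your own data, a rough sketch like the one below may help. It assumes GNU tar and the seq utility are available, checks for bzip2 and xz before using them, and uses a generated demo/numbers.txt file that is purely illustrative. Actual sizes depend heavily on the input.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir demo
# Generate some repetitive text so compression has something to work with.
seq 1 20000 > demo/numbers.txt

tar -czf demo.tar.gz demo    # gzip
# bzip2 and xz are not installed everywhere, so check before using them.
if command -v bzip2 >/dev/null 2>&1; then tar -cjf demo.tar.bz2 demo; fi
if command -v xz >/dev/null 2>&1; then tar -cJf demo.tar.xz demo; fi

# Compare the resulting archive sizes in bytes.
wc -c demo.tar.*
```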

How to Compress a Single File or Directory

The general command to compress a single file or directory in Linux is:

tar -czvf <archive name> </path/to/file/or/directory>

Here is what each of those switches means:

  • c - Create an archive.
  • z - Run the archive through gzip.
  • v - Verbosely list the files as they are processed.
  • f - Use the archive file named next on the command line.

For example, to compress the /pepper directory to an archive named egg.tar.gz, run this command:

tar -czvf egg.tar.gz /pepper

The output will look similar to:

tar: Removing leading `/' from member names
/pepper/
/pepper/pepperAndegg.log
/pepper/pepperAndEgg.txt
/pepper/pepperandegg.log

[Figure: Compressing and archiving files with tar on Linux]

If we omitted the v switch and instead used the command tar -czf egg.tar.gz /pepper, the output would not include each file. Instead, it would look similar to this:

tar: Removing leading `/' from member names

And if there were no errors or characters that needed to be removed from the member names — for example, if we were compressing files in our current working directory — there would be no output.
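The difference between relative and absolute paths can be checked with a small experiment. This is only a sketch: it builds a hypothetical pepper directory inside a temporary folder rather than at the filesystem root.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir pepper
echo "log line" > pepper/pepperandegg.log

# Relative path, no v switch: on success tar prints nothing.
output=$(tar -czf egg.tar.gz pepper 2>&1)
echo "relative path output: [$output]"

# Absolute path, no v switch: tar reports that it strips the leading /.
tar -czf egg-abs.tar.gz "$workdir/pepper"

# The stored member names no longer start with /.
tar -tzf egg-abs.tar.gz
```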

Note: There's more than one way to specify tar switches. You'll notice that we're using - before specifying our tar switches. While that is a common convention, it is not generally required. tar czvf <archive name> </path/to/file/or/directory> would work too. As would tar -cf <archive name> -vz </path/to/file/or/directory>. We'll stick to the convention we used here in the rest of our examples, but keep in mind there is more than one way to specify tar options.
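As a quick check of the three styles mentioned in the note, the sketch below builds the same archive three ways and lists each result. It assumes GNU tar; the demo directory and archive names are placeholders.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir demo
echo "hello" > demo/hello.txt

tar -czvf dash.tar.gz demo       # switches with a leading dash
tar czvf nodash.tar.gz demo      # traditional style, no dash
tar -cf split.tar.gz -vz demo    # switches split around the archive name

# All three archives are gzip-compressed and hold the same members.
for archive in dash.tar.gz nodash.tar.gz split.tar.gz; do
    tar -tzf "$archive"
done
```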

How to Compress Multiple Files or Directories to a Single Archive

The general command to compress multiple files or directories into a single archive in Linux is:

tar -czvf <archive name> </path/to/file/or/directory1> </path/to/file/or/directory2> ... </path/to/file/or/directoryN>

For example, to compress the files one.txt, two.mp4, and three.iso to an archive named egg.tar.gz, run this command:

tar -czvf egg.tar.gz one.txt two.mp4 three.iso

[Figure: Compressing multiple files to a single archive with tar on a Linux system]
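Files and directories can be mixed in the same command. The following sketch uses a hypothetical one.txt file and logs directory created in a temporary folder, then lists the archive to confirm what was stored.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
echo "notes" > one.txt
mkdir logs
echo "entry" > logs/app.log

# Files and directories can be mixed freely after the archive name.
tar -czvf egg.tar.gz one.txt logs

# Confirm what ended up in the archive.
tar -tzf egg.tar.gz
```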

How to Exclude Directories and Files when Archiving

If you specify a directory to create an archive, there may be some files you want to exclude from the archive. The --exclude option lets you specify patterns to exclude from your archive. Any file that matches the patterns passed to the --exclude option will NOT be included in the archive tar creates.

The general command to exclude files from a tar archive is:

tar --exclude=<PATTERN> <Options> <archive name> </path/to/directory>

For example, suppose we had these files in our /pepper directory:

  • one.txt
  • two.mp4
  • three.iso
  • four.log
  • output.log

And we want to compress everything except the .log files to an egg.tar.gz archive. We can use this command:

tar --exclude='*.log' -czvf egg.tar.gz /pepper

[Figure: Excluding files from a tar archive in Linux]

If needed, you can specify multiple --exclude patterns in a single command.
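For instance, to leave out both the .log and the .iso files, you could repeat the option once per pattern. The sketch below recreates the example files as empty placeholders in a temporary pepper directory, so it is safe to run anywhere.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir pepper
touch pepper/one.txt pepper/two.mp4 pepper/three.iso pepper/four.log pepper/output.log

# Each --exclude takes one pattern; repeat the option to add more.
tar --exclude='*.log' --exclude='*.iso' -czf egg.tar.gz pepper

# Only the directory, one.txt, and two.mp4 should remain.
tar -tzf egg.tar.gz
```

Quoting the patterns keeps the shell from expanding them before tar sees them.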

How to Add Files to an Existing Archive

If you have an existing archive and you want to add files to it, you can use the -r or --append options. A general command to append to .tar archives is:

tar -rf <tar archive> </path/to/file>

However, -r and --append are incompatible with compressed archives. That means you can only use them with tarballs you have not run through compression programs like gzip, bzip2, or xz. If you attempt to use -r or --append on a compressed archive, you may see an error similar to:

tar: Cannot update compressed archives
tar: Error is not recoverable: exiting now

Because of this limitation and some of the other nuances of -r and --append, in many cases it's easier to create a new archive with the additional files.
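If you do want to append, one possible workaround for a gzip-compressed archive is to decompress it, append, and compress it again. The sketch below shows both the plain .tar case and that workaround; the file names are placeholders, and gzip and gunzip are assumed to be installed.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
echo one > one.txt
echo two > two.txt
echo three > three.txt

# Appending works directly on an uncompressed archive.
tar -cf egg.tar one.txt
tar -rf egg.tar two.txt

# For a gzip-compressed archive: decompress, append, recompress.
gzip egg.tar                 # now egg.tar.gz
gunzip egg.tar.gz            # back to egg.tar
tar -rf egg.tar three.txt
gzip egg.tar                 # egg.tar.gz again, now with three members

tar -tzf egg.tar.gz
```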

How to List the Contents of an Archive

You can list the contents of an archive using the -t or --list options. The general command to list the contents of an archive is:

tar -tvf <archive>

The -t and --list options work on compressed and uncompressed archives.

For example, to list the contents of an egg.tar.xz archive in your current working directory, run this command:

tar -tvf egg.tar.xz

[Figure: Listing files in an archive on Linux]
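The v switch changes how much detail the listing shows. As a rough illustration, the sketch below builds a small gzip archive from placeholder files (gzip is used here only because it is the most likely to be installed) and lists it both ways.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir pepper
echo a > pepper/one.txt
echo b > pepper/two.txt
tar -czf egg.tar.gz pepper

tar -tf egg.tar.gz           # member names only
tar -tvf egg.tar.gz          # adds permissions, owner, size, and timestamp

# Count the members: the directory itself plus its two files.
member_count=$(tar -tf egg.tar.gz | wc -l)
echo "members: $member_count"
```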

How to Extract an Archive

tar's -x switch is for extracting archives. The general command to extract an archive in Linux is:

tar -xf <archive>

The tar -xf command works with both compressed and uncompressed archives because GNU tar detects the compression format automatically when it reads an archive.

For example, to extract an egg.tar.gz archive in our current working directory, we can use this command:

tar -xf egg.tar.gz

[Figure: Listing the contents of an archive and then extracting it]
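A simple round trip shows that extraction restores what was archived. This sketch uses a hypothetical pepper directory in a temporary folder: it archives the directory, deletes the original, and extracts it again.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir pepper
echo "original content" > pepper/one.txt
tar -czf egg.tar.gz pepper

# Remove the original so we can see that extraction restores it.
rm -r pepper
tar -xf egg.tar.gz
cat pepper/one.txt
```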

How to Extract an Archive to a Specific Directory

In some cases, you may want to extract files to a directory other than your current working directory. tar's -C switch is useful in this case. The general command to extract an archive to a specific directory is:

tar -xf <archive> -C </path/to/destination>

For example, to extract our egg.tar.gz archive to /tmp/cherry, we can use this command:

tar -xf egg.tar.gz -C /tmp/cherry

[Figure: Extracting a Linux archive to a specific directory]
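Keep in mind that tar does not create the destination directory for you, so it has to exist before you extract. The sketch below uses a cherry directory inside a temporary folder (a stand-in for /tmp/cherry) and creates it with mkdir -p first.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir pepper
echo "hello" > pepper/one.txt
tar -czf egg.tar.gz pepper

# tar does not create the -C target, so make it first.
dest="$workdir/cherry"
mkdir -p "$dest"
tar -xf egg.tar.gz -C "$dest"

# The archive's pepper directory now lives under the destination.
ls "$dest/pepper"
```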

Conclusion

Now that you know the basics of working with tar, you can work with "tarballs" like a pro. Keep in mind, tar is flexible, and you can combine different switches to produce different results and tweak output. To take a deeper dive on tar, check out the official GNU tar manual.

Mantas is a hands-on growth marketer with expertise in Linux, Ansible, Python, Git, Docker, dbt, PostgreSQL, Power BI, analytics engineering, and technical writing. With more than seven years of experience in a fast-paced Cloud Computing market, Mantas is responsible for creating and implementing data-driven growth marketing strategies concerning PPC, SEO, email, and affiliate marketing initiatives in the company. In addition to business expertise, Mantas also has hands-on experience working with cloud-native and analytics engineering technologies. He is also an expert in authoring topics like Ubuntu, Ansible, Docker, GPU computing, and other DevOps-related technologies. Mantas received his B.Sc. in Psychology from Vilnius University and resides in Siauliai, Lithuania.
