Python tarfile Module Usage Guide | Generated by AI
Importing the tarfile Module
To use the tarfile
library in Python, first import it:
import tarfile
This module is part of Python’s standard library, so no installation is needed. It’s based on the POSIX tar format and supports reading from and writing to tar archives compressed or uncompressed.
Opening and Creating Tar Archives
- Opening an existing tar file: Use
tarfile.open()
with a mode. Modes include'r'
(read),'w'
(write empty),'a'
(append), or variants with compression like'r:gz'
for gzip.import tarfile # Open for reading (uncompressed) with tarfile.open('example.tar', 'r') as tar: # Work with tar here pass
- Creating a new tar file: Use
'w'
mode to create an empty archive. Add compression prefixes like'w:gz'
for gzip or'w:bz2'
for bzip2.# Create a compressed tar.gz file with tarfile.open('archive.tar.gz', 'w:gz') as tar: pass
Adding Files to an Archive
- Add a single file: Call
add()
on the tar file object, passing the file path. You can specify an arcname for a different name inside the archive.with tarfile.open('archive.tar.gz', 'w:gz') as tar: tar.add('file.txt') # Adds file.txt as-is tar.add('data.csv', arcname='renamed_data.csv') # Rename inside archive
- Add multiple files or a directory: Use
add()
in a loop or add entire directories recursively.import os with tarfile.open('backup.tar', 'w') as tar: for root, dirs, files in os.walk('my_folder'): for file in files: tar.add(os.path.join(root, file))
Extracting Files from an Archive
- Extract all files: Use
extractall()
to extract to a specified directory (defaults to current).with tarfile.open('archive.tar.gz', 'r:gz') as tar: tar.extractall(path='extracted_folder') # Creates folder if needed
- Extract specific files: List members first with
getmembers()
, then useextract()
.with tarfile.open('example.tar', 'r') as tar: members = tar.getmembers() for member in members: if 'important' in member.name: # Example filter tar.extract(member, path='output_dir')
- Note: For security, filter paths (e.g., avoid
..
) to prevent path traversal attacks.tarfile
does some checks, but manual filtering is advised for untrusted archives.
Listing Archive Contents
- Iterate over files: Use
getmembers()
or directly iterate the tar object for TarInfo objects.with tarfile.open('example.tar', 'r') as tar: for member in tar.getmembers(): print(f"{member.name}: {member.size} bytes")
Key Modes and Compression
'r'
: Read uncompressed tar.'r:gz'
,'r:bz2'
,'r:xz'
: Read compressed varieties.'w'
: Write new uncompressed tar (overwrites).'a'
: Append to existing tar.- Compression auto-detection on read; specify on write.
Tips and Best Practices
- Always use context managers (
with
statements) to ensure files close properly. - For large archives, consider filtering or extracting selectively to save memory.
tarfile
doesn’t handle encrypted archives directly—use external tools or combine withgzip
manually if needed.- Error handling: Catch
tarfile.TarError
for archive issues. - Full docs: Refer to Python’s official tarfile documentation for advanced features like custom filters.