A module that reads and writes tar format archives. It has been included in the Python Standard Library since Python 2.3.
Abstract
tarfile is a comprehensive implementation of the tar archive format as a module for the
Python Language. It enables read/write access to common tar
archives including support for gzip/bzip2 compressed archives.
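As a rough illustration of that read/write access, the following sketch uses the tarfile module that ships with a current Python 3 rather than the 0.8.0 release published here. The file names and the helper name `roundtrip` are made up for the example.

```python
import os
import tarfile
import tempfile


def roundtrip(text):
    """Pack one file into a gzip-compressed archive, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "hello.txt")
        with open(src, "wb") as f:
            f.write(text.encode("utf-8"))

        archive = os.path.join(tmp, "example.tar.gz")
        # "w:gz" writes a gzip-compressed archive; "w:bz2" would use bzip2.
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(src, arcname="hello.txt")

        # Plain "r" detects the compression transparently.
        with tarfile.open(archive, "r") as tar:
            names = tar.getnames()
            data = tar.extractfile("hello.txt").read().decode("utf-8")
    return names, data


names, data = roundtrip("hello tar\n")
print(names, repr(data))
```

The archive is written with an explicit compression mode and read back with plain "r", so the reading side does not need to know how the archive was compressed.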
2006-12-23: I was granted access to the Python source repository today, which gives me the opportunity to develop the tarfile.py version that is distributed together with Python. My main goal is to implement support for the POSIX.1-2001 pax format for the 2.6 release.
2006-10-15: Somehow I have managed to keep this tarfile version in sync with the official version in Python's stdlib for about 3½ years now. I think the time has come to put an end to it. It has been a lot of work and I increasingly doubt the benefit. The official version is fine, well-tested, has a large group of people using it and, apart from me, some really experienced developers working on it.
I will leave this page as it is. tarfile 0.8.0 has some severe bugs in it, so do not use it.
tarfile has been part of the Python Standard Library since release 2.3.
The tarfile version that is published here resides on a separate development branch and serves as a
test bed for new features and bugfixes.
Some features unique to this version:
- transparent compression detection for the stream interface (mode="r|*"). (accepted for Python 2.5)
- TarFile.extractall() method that addresses issues with directories (see documentation). (Python patch #1043890) (accepted for Python 2.5)
- exclude argument to the TarFile.add() method that accepts a function which should return True for each filename to exclude from the archive.
- tarfile.open() accepts a fileobj argument for "r:bz2" and "w:bz2" modes, made possible by a workaround for a limitation in the bz2 module. (accepted for Python 2.5)
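Two of these features, the fileobj argument for "w:bz2" and the "r|*" stream mode, can be tried together. This is only a sketch against the tarfile module of a current Python 3, where both were accepted. The member name `data.bin` and the helper names are assumptions made for the example.

```python
import io
import tarfile


def make_bz2_archive(payload):
    """Write a one-member bzip2 archive into an in-memory file object."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        info = tarfile.TarInfo(name="data.bin")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_stream(raw):
    """Read an archive sequentially; "r|*" detects the compression itself."""
    result = {}
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r|*") as tar:
        for member in tar:
            # In stream mode each member has to be read while iterating,
            # because the underlying file object is never rewound.
            result[member.name] = tar.extractfile(member).read()
    return result


raw = make_bz2_archive(b"abc" * 1000)
print(raw[:3], sorted(read_stream(raw)))
```

The stream interface only ever reads forward, which is why it also works on pipes and sockets that cannot seek.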
Documentation
Documentation for tarfile can be found here.
Downloads
Current release: 0.8.0
- 2006-04-25:
tarfile-0.8.0.tar.gz (60.4 KB)
This version has a number of known bugs, so do not use it!
- fixed unintended transformation of some GNU longnames to dirtypes (SF #1471427).
- added ReadGNULongTest to test for SF #1471427.
- validate header blocks using checksums in TarInfo.frombuf().
- rewrite of TarFile.next() and the member processing methods, which affects subclassing to implement custom types.
The patch by Luis Caamano for compatibility with Python 2.0/2.1 is included in the distribution.
- 2002-06-29:
Greg Lewis created a tarfile version for Python 1.5.2. Great work!
You can download his version below, but I do not provide support for it:
tarfile152.tar.gz (15.7 KB) is based on tarfile 0.4.5
Source repository
The development source code for tarfile is available from a public
subversion repository. You can check out
the trunk anonymously:
$ svn co http://gustaebel.de/svn/lars/tarfile/trunk tarfile
You can browse the repository online
here.
History
0.7.9 - 2006-03-10: download
- added base-256 encoding for all number fields in a header, not only the size field.
- added support for posix and sun header checksums.
- added posix argument to TarInfo.tobuf().
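To make the checksum entry more concrete: a tar header is a 512-byte block whose checksum field (bytes 148 to 155) is counted as eight spaces while summing. The POSIX variant sums the bytes as unsigned values, while some old Sun implementations summed them as signed chars. The sketch below is not the module's own code. The function names are hypothetical, and it relies on the `format` argument of `TarInfo.tobuf()` in a current Python 3 instead of the `posix` argument mentioned above.

```python
import tarfile


def header_checksums(block):
    """Return the (unsigned, signed) checksums of a 512-byte header block."""
    if len(block) != 512:
        raise ValueError("a tar header block is exactly 512 bytes")
    # The checksum field itself counts as eight spaces.
    blanked = block[:148] + b" " * 8 + block[156:]
    unsigned = sum(blanked)
    signed = sum(b - 256 if b > 127 else b for b in blanked)
    return unsigned, signed


def stored_checksum(block):
    """Parse the octal checksum that is stored in the header."""
    field = block[148:156].rstrip(b"\0 ")
    return int(field.decode("ascii"), 8)


info = tarfile.TarInfo("example.txt")
block = info.tobuf(format=tarfile.USTAR_FORMAT)
print(stored_checksum(block), header_checksums(block))
```

For a header that contains only ASCII bytes both sums are identical. They differ only when a byte above 127 occurs, for example in a non-ASCII file name.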
0.7.8 - 2005-10-26: download
- fixed: tarfile.add() accidentally creates hardlink members (SF #1330039).
- fixed: workaround for DIRTYPE members in old tar archives does not work (SF #1336623).
0.7.7 - 2005-08-28: download
- fixed external GNU tar documentation URL.
- synchronized with Python branch, see SF #1107973, #1168594, #1262036.
- changed TarFile.add()'s exclude keyword argument to accept a function object instead of patterns.
- copied Python's test_tarfile.py and changed it to work standalone.
- added: tarfile.open() accepts fileobj argument for "r:bz2" and "w:bz2" modes (workaround for a limitation in the bz2 module).
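The exclude function described above returns True for names to leave out. In the tarfile module of current Python 3 releases the exclude argument no longer exists and the comparable hook is the `filter` argument of `TarFile.add()`, which receives a TarInfo object and returns None to skip a member. The sketch below uses that hook. The backup-file rule and all names are assumptions made for the example.

```python
import io
import os
import tarfile
import tempfile


def skip_backups(tarinfo):
    """Return None to leave a member out, or the TarInfo to keep it."""
    if tarinfo.name.endswith("~"):
        return None
    return tarinfo


def pack_directory(directory):
    """Archive a directory in memory and return the stored member names."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(directory, arcname="project", filter=skip_backups)
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return sorted(tar.getnames())


with tempfile.TemporaryDirectory() as tmp:
    for name in ("notes.txt", "notes.txt~"):
        with open(os.path.join(tmp, name), "w") as f:
            f.write("x")
    packed = pack_directory(tmp)
print(packed)
```

Passing a function instead of a list of patterns keeps the decision logic with the caller, who can look at more than just the name.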
0.7.6 - 2005-01-18: download
- fixed TarIter which breaks with certain archives under Windows (Python bug #1100429).
0.7.5 - 2004-10-10: download
- added extractall() method (Python patch #1043890) (thanks to Joeseph Jones).
- fixed filemode() function (Python bug #1017553).
0.7.4 - 2004-09-15: download
- fixed truncated longnames in TarFile.getnames() (reported by Ignacio Feijoo).
0.7.3 - 2004-08-31: download
- synchronized with Python branch, see SF #988444, #995126, #1013882, #1014992.
0.7.2 - 2004-03-08: download
- fixed broken hardlink extraction (Python bug #857297).
- fixed missing trailing null character for names in TarFile._create_gnulong() (see Python patch #846659).
0.7.1 - 2003-11-14: download
- fixed bogus largefile debug messages that were introduced in version 0.7.
- fixed a bug in TarFile._create_gnulong() that had several side-effects, e.g. faulty end-of-tar padding.
- fixed missing end-of-tar padding in _Stream for non-10240-byte blocksizes (reported by Johan Fredrik Øhman).
0.7 - 2003-11-07: download
- added support for files exceeding 8 GB size (hint by Johan Fredrik Øhman).
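The 8 GB limit comes from the header layout: the size field is 12 bytes wide and holds 11 octal digits plus a terminator, so the largest value is 8**11 - 1 bytes. The GNU base-256 extension marks a field with a leading 0x80 byte and stores the number in binary in the remaining bytes. The following is a simplified sketch for non-negative numbers only, with hypothetical function names. It is not the code used in tarfile.py.

```python
def encode_number(n, digits):
    """Encode a non-negative number for a header field of `digits` bytes."""
    if n < 0:
        raise ValueError("this sketch handles non-negative numbers only")
    if n < 8 ** (digits - 1):
        # Classic form: zero-padded octal digits followed by a NUL byte.
        return ("%0*o" % (digits - 1, n)).encode("ascii") + b"\0"
    if n < 256 ** (digits - 1):
        # Base-256 form: marker byte 0x80, then the number big-endian.
        return b"\x80" + n.to_bytes(digits - 1, "big")
    raise ValueError("number does not fit into the field")


def decode_number(field):
    """Decode a field written by encode_number()."""
    if field[0] == 0x80:
        return int.from_bytes(field[1:], "big")
    text = field.rstrip(b"\0 ").decode("ascii")
    return int(text or "0", 8)


limit = 8 ** 11  # first size that no longer fits into 11 octal digits
print(encode_number(limit - 1, 12), encode_number(limit, 12))
```

Sizes below the limit stay in the octal form, so archives with only small members remain readable by tools that do not know the extension.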
0.6.9 - 2003-11-05: download
- fix for Python bug #822668.
0.6.8 - 2003-10-31: download
- added exclude keyword to TarFile.add() (inspired by Andrea Bolzonella).
0.6.7 - 2003-08-07: download
- added new mode "r|*" which enables the stream interface to use transparent detection of compression (inspired by Jacob Weismann Poulsen).
0.6.6 - 2003-04-25: download
- fix for Python bug #721871.
0.6.5 - 2003-01-07: download
- merged changes and fixes from the Python tarfile module.
- in TarFile.bz2open() compresslevel was not used.
- fixed a bug in TarFile.proc_gnulong().
- fixed missing slash bug with directories and GNU longnames.
0.6.4 - 2002-12-17: download
- gettarinfo() is now more performant (patch from Chris Jaeger).
- fixed a bug in TarFile.bz2open() (reported by Iustin Pop).
0.6.3 - 2002-11-22: download
- made subclassing TarFile easier:
- - open() and *open() are now classmethods.
- - set up an internal interface that allows replacing methods and adding callbacks.
- _buftoinfo went into TarInfo as frombuf(), TarInfo.getheader() is now tobuf().
- changed exceptions: TarError is the base exception for all the others.
- error tuple is removed.
- improved debugging output process, debugging output is now written to stderr.
- list() handles uname, gname, uid, gid differently.
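With TarError as the common base class, a caller can catch every tarfile-specific failure in one place. A small sketch against the tarfile module of a current Python 3 follows. The helper name `try_open` and the garbage input are assumptions made for the example.

```python
import io
import tarfile


def try_open(raw):
    """Return the member names, or the name of the tarfile error raised."""
    try:
        with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
            return tar.getnames()
    except tarfile.TarError as exc:
        # ReadError, CompressionError and the others all derive from TarError.
        return type(exc).__name__


outcome = try_open(b"this is not a tar archive")
print(outcome)
```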
0.6.2 - 2002-11-14: download
- TarInfo has a __repr__() now.
- getnames(), getmember(), getmembers() and list() now work when writing, too.
0.6.1 - 2002-11-11: download
- applied a patch by Chris Jaeger (gettarinfo() with fileobj).
- fixed bug in _Stream._init_write_gz() when self.name is None.
- TarFile.list() shows uid/gid when uname/gname are not present.
- added support for bzip2 compression in _Stream.
- synced docstrings with tex documentation.
- open() with mode "w|" truncates an existing file.
0.6 - 2002-10-22: download
- tarfile.open() now supports stream-like objects and tape devices (thanks to Richard Townsend!).
- support for bzip2 compression (requires bz2 module).
- added tardump.py, a small script that displays detailed information on tar files.
- fixed error in TarFile.add() when TarFile.name is None.
- defined maximum size of a file member (<8GB).
- TarInfo.name and .prefix are used more efficiently.
- fixed problem with hardlink extraction.
- fixed: TarInfo.linkname was always "." by mistake (reported by David Levine).
- GNU longname/longlink extensions now support names of unlimited length.
- fixed bugs with append mode.
0.5 - 2002-10-02: download
- intelligent open() method with transparent compression (suggestion by Jason Petrone).
- improved POSIX tar compliance (suggestion by Jörg Schilling):
- - TarFile.posix toggles posix-compliance (on by default).
- - TarInfo.mode uses only the lower 12 bits.
- - TarInfo.prefix is used for reading and writing.
- unknown type members are now extracted as normal files (suggestion by Ben Escoto).
- added class TarInc to tartools.py (inspired by Ben Escoto).
- rewrote/updated/improved the docs.
- rewrote/updated/improved the unittest.
- fixed error with GNU sparse files.
- some minor adjustments.
0.4.9 - 2002-07-22: download
- TarFile.getinfo() will be replaced by TarFile.getmember(), added DeprecationWarning.
- replaced TarFile.ignore_errors by TarFile.errorlevel under Python >= 2.3, True and False are returned where possible.
- added TarInfo.isdev().
- small fix for default of devmajor, devminor.
- complete rewrite of test_tarfile.py.
0.4.8 - 2002-07-14: download
- massive code-cleanup in _extract_member().
- reworked exception handling:
- - introduced TarFile.ignore_errors which enables the caller to react to errors when extracting.
- - small doc-update.
0.4.7 - 2002-07-06: download
- fixed copyfileobj() which failed on files that have no remainder.
- set tarfile.py and _tarfile.c in sync with Gustavo Niemeyer's latest patches to posixmodule.c.
- fixed bug with chown() under root.
- added tartools.py which contains TarCount class.
0.4.6 - 2002-06-29: download
- improved output of TarFile.list() method.
- added os.lchown() call for future Python version.
- fixed link extraction.
- a zero block is now regarded as the end of an archive by default; this can be avoided by setting TarFile.ignore_zeros to True.
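One case where the zero-block rule matters is an archive that was created by concatenating two archives: the end-of-archive blocks of the first one hide the second. The sketch below shows the effect with the tarfile module of a current Python 3, where ignore_zeros can be passed to tarfile.open(). The member names and helper functions are made up for the example.

```python
import io
import tarfile


def make_archive(name, payload):
    """Build an uncompressed one-member archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_members(raw, ignore_zeros):
    """List member names, optionally reading past zero blocks."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:",
                      ignore_zeros=ignore_zeros) as tar:
        return tar.getnames()


joined = make_archive("a.txt", b"first") + make_archive("b.txt", b"second")
print(list_members(joined, False), list_members(joined, True))
```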
0.4.5 - 2002-06-17: download
- fixed the _tarfile.c extension for FreeBSD support.
0.4.4 - 2002-06-10: download
- bugfix for devmajor, devminor handling on extraction.
- more elegant debug handling.
0.4.3 - 2002-06-08: download
- added support for character and block device extraction and addition; you need the _tarfile.c extension for that.
0.4.2 - 2002-06-02: download
- fixed a major bug in TarFile.extract().
- some minor enhancements.
0.4.1 - 2002-05-19: download
- fixed a small bug in gzopen().
0.4 - 2002-05-01: download
- added support for GNU sparse files (read-only).
- improved FileObject class with sparse-file support and seek(), tell(), readline() and readlines() methods.
- created a unittest suite and removed several bugs.
0.3.4 - 2002-04-26: download
- extractfile() and addfile() take file objects.
- included some code improvements by Niels Gustäbel.
- changed names for read*() and write*() to extract*() and add*().
- added isfile(), isdir() etc. methods to TarInfo.
0.3.3 - 2002-04-10: download
- added TarFileCompat class, which is compatible with zipfile.ZipFile (suggestion by Thomas Heller).
- added is_tarfile().
0.3.1 - 2002-03-27: download
- changed TarFile.__init__ to work more as expected.
- removed stdout class, it is not needed anymore.
- some other fine adjustments.
0.3 - 2002-03-23: download
- !dropped zipfile compatibility completely.
- changed class interface.
- removed gzip() and gunzip() functions.
- added method gettarinfo() to TarFile.
- added read support for "--portability" tar format.
0.2.7 - 2002-03-17: download
- hardlinks are now archived and extracted correctly.
- improved error handling.
- improved commandline interface.
- renamed TarFile.followsymlinks to TarFile.dereference, like the GNU tar option.
0.2.6 - 2002-03-15: download
- added some patches from Detlef Lannert.
- fixed some bugs.
0.2.5 - 2002-03-14: download
- added the stdout class for gzipped writing to sys.stdout.
- kicked out the last bugs.
- !first public release.
License
tarfile is distributed under a BSD-style license.
tarfile - Copyright © 2006, Lars Gustäbel (lars@gustaebel.de)
All rights reserved.
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following
conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.