File: APPNOTE.TXT - .ZIP File Format Specification
Version: 6.3.6
Status: FINAL - replaces version 6.3.5
Revised: April 26, 2019
Copyright (c) 1989 - 2014, 2018, 2019 PKWARE Inc., All Rights Reserved.
1.0 Introduction
---------------
1.1 Purpose
-----------
1.1.1 This specification is intended to define a cross-platform,
interoperable file storage and transfer format. Since its
first publication in 1989, PKWARE, Inc. ("PKWARE") has remained
committed to ensuring the interoperability of the .ZIP file
format through periodic publication and maintenance of this
specification. We trust that all .ZIP compatible vendors and
application developers that use and benefit from this format
will share and support this commitment to interoperability.
1.2 Scope
---------
1.2.1 ZIP is one of the most widely used compressed file formats. It is
universally used to aggregate, compress, and encrypt files into a single
interoperable container. No specific use or application need is
defined by this format and no specific implementation guidance is
provided. This document provides details on the storage format for
creating ZIP files. Information is provided on the records and
fields that describe what a ZIP file is.
1.3 Trademarks
--------------
1.3.1 PKWARE, PKZIP, Smartcrypt, SecureZIP, and PKSFX are registered
trademarks of PKWARE, Inc. in the United States and elsewhere.
PKPatchMaker, Deflate64, and ZIP64 are trademarks of PKWARE, Inc.
Other marks referenced within this document appear for identification
purposes only and are the property of their respective owners.
1.4 Permitted Use
-----------------
1.4.1 This document, "APPNOTE.TXT - .ZIP File Format Specification" is the
exclusive property of PKWARE. Use of the information contained in this
document is permitted solely for the purpose of creating products,
programs and processes that read and write files in the ZIP format
subject to the terms and conditions herein.
1.4.2 Use of the content of this document within other publications is
permitted only through reference to this document. Any reproduction
or distribution of this document in whole or in part without prior
written permission from PKWARE is strictly prohibited.
1.4.3 Certain technological components provided in this document are the
patented proprietary technology of PKWARE and as such require a
separate, executed license agreement from PKWARE. Applicable
components are marked with the following, or similar, statement:
'Refer to the section in this document entitled "Incorporating
PKWARE Proprietary Technology into Your Product" for more information'.
1.5 Contacting PKWARE
---------------------
1.5.1 If you have questions on this format, its use, or licensing, or if you
wish to report defects, request changes or additions, please contact:
PKWARE, Inc.
201 E. Pittsburgh Avenue, Suite 400
Milwaukee, WI 53204
+1-414-289-9788
+1-414-289-9789 FAX
zipformat@pkware.com
1.5.2 Information about this format and a reference copy of this document
is publicly available at:
http://www.pkware.com/appnote
1.6 Disclaimer
--------------
1.6.1 Although PKWARE will attempt to supply current and accurate
information relating to its file formats, algorithms, and the
subject programs, the possibility of error or omission cannot
be eliminated. PKWARE therefore expressly disclaims any warranty
that the information contained in the associated materials relating
to the subject programs and/or the format of the files created or
accessed by the subject programs and/or the algorithms used by
the subject programs, or any other matter, is current, correct or
accurate as delivered. Any risk of damage due to any possible
inaccurate information is assumed by the user of the information.
Furthermore, the information relating to the subject programs
and/or the file formats created or accessed by the subject
programs and/or the algorithms used by the subject programs is
subject to change without notice.
2.0 Revisions
--------------
2.1 Document Status
--------------------
2.1.1 If the STATUS of this file is marked as DRAFT, the content
defines proposed revisions to this specification which may consist
of changes to the ZIP format itself, or that may consist of other
content changes to this document. Versions of this document and
the format in DRAFT form may be subject to modification prior to
publication STATUS of FINAL. DRAFT versions are published periodically
to provide notification to the ZIP community of pending changes and to
provide opportunity for review and comment.
2.1.2 Versions of this document having a STATUS of FINAL are
considered to be in the final form for that version of the document
and are not subject to further change until a new, higher version
numbered document is published. Newer versions of this format
specification are intended to remain interoperable with all prior
versions whenever technically possible.
2.2 Change Log
--------------
Version   Change Description                          Date
-------   ------------------                          ----------
5.2       -Single Password Symmetric Encryption       07/16/2003
           storage
6.1.0     -Smartcard compatibility                    01/20/2004
          -Documentation on certificate storage
6.2.0     -Introduction of Central Directory          04/26/2004
           Encryption for encrypting metadata
          -Added OS X to Version Made By values
6.2.1     -Added Extra Field placeholder for          04/01/2005
           POSZIP using ID 0x4690
          -Clarified size field on
           "zip64 end of central directory record"
6.2.2     -Documented Final Feature Specification     01/06/2006
           for Strong Encryption
          -Clarifications and typographical
           corrections
6.3.0     -Added tape positioning storage             09/29/2006
           parameters
          -Expanded list of supported hash algorithms
          -Expanded list of supported compression
           algorithms
          -Expanded list of supported encryption
           algorithms
          -Added option for Unicode filename
           storage
          -Clarifications for consistent use
           of Data Descriptor records
          -Added additional "Extra Field"
           definitions
6.3.1     -Corrected standard hash values for         04/11/2007
           SHA-256/384/512
6.3.2     -Added compression method 97                09/28/2007
          -Documented InfoZIP "Extra Field"
           values for UTF-8 file name and
           file comment storage
6.3.3     -Formatting changes to support              09/01/2012
           easier referencing of this APPNOTE
           from other documents and standards
6.3.4     -Address change                             10/01/2014
6.3.5     -Documented compression methods 16          11/31/2018
           and 99 (4.4.5, 4.6.1, 5.11, 5.17,
           APPENDIX E)
          -Corrected several typographical
           errors (2.1.2, 3.2, 4.1.1, 10.2)
          -Marked legacy algorithms as no
           longer suitable for use (4.4.5.1)
          -Added clarity on MS DOS time format
           (4.4.6)
          -Assign extrafield ID for Timestamps
           (4.5.2)
          -Field code description correction (A.2)
          -More consistent use of MAY/SHOULD/MUST
          -Expanded 0x0065 record attribute codes (B.2)
          -Initial information on 0x0022 Extra Data
6.3.6     -Corrected typographical error              04/26/2019
           (4.4.1.3)
3.0 Notations
-------------
3.1 Use of the term MUST or SHALL indicates a required element.
3.2 MUST NOT or SHALL NOT indicates an element is prohibited from use.
3.3 SHOULD indicates a RECOMMENDED element.
3.4 SHOULD NOT indicates an element NOT RECOMMENDED for use.
3.5 MAY indicates an OPTIONAL element.
4.0 ZIP Files
-------------
4.1 What is a ZIP file
----------------------
4.1.1 ZIP files MAY be identified by the standard .ZIP file extension
although use of a file extension is not required. Use of the
extension .ZIPX is also recognized and MAY be used for ZIP files.
Other common file extensions using the ZIP format include .JAR, .WAR,
.DOCX, .XLSX, .PPTX, .ODT, .ODS, .ODP and others. Programs reading or
writing ZIP files SHOULD rely on internal record signatures described
in this document to identify files in this format.
4.1.2 ZIP files SHOULD contain at least one file and MAY contain
multiple files.
4.1.3 Data compression MAY be used to reduce the size of files
placed into a ZIP file, but is not required. This format supports the
use of multiple data compression algorithms. When compression is used,
one of the documented compression algorithms MUST be used. Implementors
are advised to experiment with their data to determine which of the
available algorithms provides the best compression for their needs.
Compression method 8 (Deflate) is the method used by default by most
ZIP compatible application programs.
4.1.4 Data encryption MAY be used to protect files within a ZIP file.
Keying methods supported for encryption within this format include
passwords and public/private keys. Either MAY be used individually
or in combination. Encryption MAY be applied to individual files.
Additional security MAY be used through the encryption of ZIP file
metadata stored within the Central Directory. See the section on the
Strong Encryption Specification for information. Refer to the section
in this document entitled "Incorporating PKWARE Proprietary Technology
into Your Product" for more information.
4.1.5 Data integrity MUST be provided for each file using CRC32.
4.1.6 Additional data integrity MAY be included through the use of
digital signatures. Individual files MAY be signed with one or more
digital signatures. The Central Directory, if signed, MUST use a
single signature.
4.1.7 Files MAY be placed within a ZIP file uncompressed or stored.
The term "stored" as used in the context of this document means the file
is copied into the ZIP file uncompressed.
4.1.8 Each data file placed into a ZIP file MAY be compressed, stored,
encrypted or digitally signed independent of how other data files in the
same ZIP file are archived.
4.1.9 ZIP files MAY be streamed, split into segments (on fixed or on
removable media) or "self-extracting". Self-extracting ZIP
files MUST include extraction code for a target platform within
the ZIP file.
4.1.10 Extensibility is provided for platform or application specific
needs through extra data fields that MAY be defined for custom
purposes. Extra data definitions MUST NOT conflict with existing
documented record definitions.
4.1.11 Common uses for ZIP MAY also include the use of manifest files.
Manifest files store application specific information within a file stored
within the ZIP file. This manifest file SHOULD be the first file in the
ZIP file. This specification does not provide any information or guidance on
the use of manifest files within ZIP files. Refer to the application developer
for information on using manifest files and for any additional profile
information on using ZIP within an application.
4.1.12 ZIP files MAY be placed within other ZIP files.
4.2 ZIP Metadata
----------------
4.2.1 ZIP files are identified by metadata consisting of defined record types
containing the storage information necessary for maintaining the files
placed into a ZIP file. Each record type MUST be identified using a header
signature that identifies the record type. Signature values begin with the
two byte constant marker of 0x4b50, representing the characters "PK".
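As a minimal sketch of signature-based identification (4.1.1, 4.2.1), the
Python fragment below maps the 4-byte signature values listed in section 4.3
of this document to record names. The names SIGNATURES and identify_record
are hypothetical and exist only for this illustration; the signature is read
in little-endian order, so 0x4b50 appears in the file as the bytes "PK".

```python
import struct
from typing import Optional

# Signature values as listed in section 4.3 of this document.
SIGNATURES = {
    0x04034B50: "local file header",
    0x08074B50: "data descriptor (commonly adopted signature)",
    0x08064B50: "archive extra data record",
    0x02014B50: "central directory file header",
    0x05054B50: "digital signature",
    0x06064B50: "zip64 end of central directory record",
    0x07064B50: "zip64 end of central directory locator",
    0x06054B50: "end of central directory record",
}


def identify_record(buf: bytes, offset: int = 0) -> Optional[str]:
    """Return the record name for the 4-byte signature at offset, or None."""
    if len(buf) - offset < 4:
        return None
    (sig,) = struct.unpack_from("<I", buf, offset)
    return SIGNATURES.get(sig)
```

A reader would typically call identify_record on the first four bytes of a
candidate record and fall back to other checks when None is returned.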
4.3 General Format of a .ZIP file
---------------------------------
4.3.1 A ZIP file MUST contain an "end of central directory record". A ZIP
file containing only an "end of central directory record" is considered an
empty ZIP file. Files MAY be added or replaced within a ZIP file, or deleted.
A ZIP file MUST have only one "end of central directory record". Other
records defined in this specification MAY be used as needed to support
storage requirements for individual ZIP files.
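To illustrate the empty ZIP file described in 4.3.1, the sketch below builds
a file that consists solely of an "end of central directory record" with all
counts, sizes and offsets set to zero. The field order follows the record
layout given in 4.3.16; the function name empty_zip is an assumption made
for this example.

```python
import struct


def empty_zip() -> bytes:
    """Build a ZIP file holding only an end of central directory record."""
    return struct.pack(
        "<IHHHHIIH",
        0x06054B50,  # end of central dir signature
        0,           # number of this disk
        0,           # number of the disk with the start of the central directory
        0,           # total number of entries in the central directory on this disk
        0,           # total number of entries in the central directory
        0,           # size of the central directory
        0,           # offset of start of central directory
        0,           # .ZIP file comment length
    )
```

The result is 22 bytes long, which is the smallest ZIP file this layout
allows.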
4.3.2 Each file placed into a ZIP file MUST be preceded by a "local
file header" record for that file. Each "local file header" MUST be
accompanied by a corresponding "central directory header" record within
the central directory section of the ZIP file.
4.3.3 Files MAY be stored in arbitrary order within a ZIP file. A ZIP
file MAY span multiple volumes or it MAY be split into user-defined
segment sizes. All values MUST be stored in little-endian byte order unless
otherwise specified in this document for a specific data element.
4.3.4 Compression MUST NOT be applied to a "local file header", an "encryption
header", or an "end of central directory record". Individual "central
directory records" MUST NOT be compressed, but the aggregate of all central
directory records MAY be compressed.
4.3.5 File data MAY be followed by a "data descriptor" for the file. Data
descriptors are used to facilitate ZIP file streaming.
4.3.6 Overall .ZIP file format:
[local file header 1]
[encryption header 1]
[file data 1]
[data descriptor 1]
.
.
.
[local file header n]
[encryption header n]
[file data n]
[data descriptor n]
[archive decryption header]
[archive extra data record]
[central directory header 1]
.
.
.
[central directory header n]
[zip64 end of central directory record]
[zip64 end of central directory locator]
[end of central directory record]
4.3.7 Local file header:
    local file header signature     4 bytes  (0x04034b50)
    version needed to extract       2 bytes
    general purpose bit flag        2 bytes
    compression method              2 bytes
    last mod file time              2 bytes
    last mod file date              2 bytes
    crc-32                          4 bytes
    compressed size                 4 bytes
    uncompressed size               4 bytes
    file name length                2 bytes
    extra field length              2 bytes
    file name                       (variable size)
    extra field                     (variable size)
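The local file header layout can be read with a fixed 30-byte structure
followed by the two variable-size fields. The following is a sketch only:
the names LOCAL_HEADER, LocalFileHeader and parse_local_header are
hypothetical, and no ZIP64 or encryption handling is attempted.

```python
import struct
from typing import NamedTuple

# Fixed part of the local file header: 30 bytes, little-endian.
LOCAL_HEADER = struct.Struct("<IHHHHHIIIHH")


class LocalFileHeader(NamedTuple):
    version_needed: int
    flags: int
    method: int
    mod_time: int
    mod_date: int
    crc32: int
    compressed_size: int
    uncompressed_size: int
    file_name: bytes
    extra_field: bytes
    data_offset: int  # offset of the first byte following the header


def parse_local_header(buf: bytes, offset: int = 0) -> LocalFileHeader:
    """Parse one local file header starting at offset."""
    sig, *fixed, name_len, extra_len = LOCAL_HEADER.unpack_from(buf, offset)
    if sig != 0x04034B50:
        raise ValueError("not a local file header")
    name_start = offset + LOCAL_HEADER.size
    extra_start = name_start + name_len
    end = extra_start + extra_len
    return LocalFileHeader(
        *fixed, buf[name_start:extra_start], buf[extra_start:end], end
    )
```

The file name is returned as raw bytes because its encoding depends on
bit 11 of the general purpose bit flag.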
4.3.8 File data
Immediately following the local header for a file
SHOULD be placed the compressed or stored data for the file.
If the file is encrypted, the encryption header for the file
SHOULD be placed after the local header and before the file
data. The series of [local file header][encryption header]
[file data][data descriptor] repeats for each file in the
.ZIP archive.
Zero-byte files, directories, and other file types that
contain no content MUST NOT include file data.
4.3.9 Data descriptor:
crc-32 4 bytes
compressed size 4 bytes
uncompressed size 4 bytes
4.3.9.1 This descriptor MUST exist if bit 3 of the general
purpose bit flag is set (see below). It is byte aligned
and immediately follows the last byte of compressed data.
This descriptor SHOULD be used only when it was not possible to
seek in the output .ZIP file, e.g., when the output .ZIP file
was standard output or a non-seekable device. For ZIP64(tm) format
archives, the compressed and uncompressed sizes are 8 bytes each.
4.3.9.2 When compressing files, compressed and uncompressed sizes
SHOULD be stored in ZIP64 format (as 8 byte values) when a
file's size exceeds 0xFFFFFFFF. However ZIP64 format MAY be
used regardless of the size of a file. When extracting, if
the zip64 extended information extra field is present for
the file the compressed and uncompressed sizes will be 8
byte values.
4.3.9.3 Although not originally assigned a signature, the value
0x08074b50 has commonly been adopted as a signature value
for the data descriptor record. Implementers SHOULD be
aware that ZIP files MAY be encountered with or without this
signature marking data descriptors and SHOULD account for
either case when reading ZIP files to ensure compatibility.
4.3.9.4 When writing ZIP files, implementors SHOULD include the
signature value marking the data descriptor record. When
the signature is used, the fields currently defined for
the data descriptor record will immediately follow the
signature.
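A reader that accounts for both cases in 4.3.9.3 might look like the sketch
below. It treats a leading 0x08074b50 as the optional signature, which is a
heuristic: a CRC-32 value could by chance equal the signature, so a careful
implementation would cross-check the result against the central directory.
The function name parse_data_descriptor and the zip64 parameter are
assumptions made for this example.

```python
import struct


def parse_data_descriptor(buf: bytes, offset: int, zip64: bool = False):
    """Return (crc32, compressed_size, uncompressed_size, bytes_consumed)."""
    # Sizes are 8 bytes each for ZIP64 format archives, otherwise 4 bytes.
    fmt = "<IQQ" if zip64 else "<III"
    consumed = 0
    (first,) = struct.unpack_from("<I", buf, offset)
    if first == 0x08074B50:
        # Heuristic: assume this is the commonly adopted signature and skip it.
        offset += 4
        consumed = 4
    crc, csize, usize = struct.unpack_from(fmt, buf, offset)
    return crc, csize, usize, consumed + struct.calcsize(fmt)
```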
4.3.9.5 An extensible data descriptor will be released in a
future version of this APPNOTE. This new record is intended to
resolve conflicts with the use of this record going forward,
and to provide better support for streamed file processing.
4.3.9.6 When the Central Directory Encryption method is used,
the data descriptor record is not required, but MAY be used.
If present, and bit 3 of the general purpose bit field is set to
indicate its presence, the values in fields of the data descriptor
record MUST be set to binary zeros. See the section on the Strong
Encryption Specification for information. Refer to the section in
this document entitled "Incorporating PKWARE Proprietary Technology
into Your Product" for more information.
4.3.10 Archive decryption header:
4.3.10.1 The Archive Decryption Header is introduced in version 6.2
of the ZIP format specification. This record exists in support
of the Central Directory Encryption Feature implemented as part of
the Strong Encryption Specification as described in this document.
When the Central Directory Structure is encrypted, this decryption
header MUST precede the encrypted data segment.
4.3.10.2 The encrypted data segment SHALL consist of the Archive
extra data record (if present) and the encrypted Central Directory
Structure data. The format of this data record is identical to the
Decryption header record preceding compressed file data. If the
central directory structure is encrypted, the location of the start of
this data record is determined using the Start of Central Directory
field in the Zip64 End of Central Directory record. See the
section on the Strong Encryption Specification for information
on the fields used in the Archive Decryption Header record.
Refer to the section in this document entitled "Incorporating
PKWARE Proprietary Technology into Your Product" for more information.
4.3.11 Archive extra data record:
archive extra data signature 4 bytes (0x08064b50)
extra field length 4 bytes
extra field data (variable size)
4.3.11.1 The Archive Extra Data Record is introduced in version 6.2
of the ZIP format specification. This record MAY be used in support
of the Central Directory Encryption Feature implemented as part of
the Strong Encryption Specification as described in this document.
When present, this record MUST immediately precede the central
directory data structure.
4.3.11.2 The size of this data record SHALL be included in the
Size of the Central Directory field in the End of Central
Directory record. If the central directory structure is compressed,
but not encrypted, the location of the start of this data record is
determined using the Start of Central Directory field in the Zip64
End of Central Directory record. Refer to the section in this document
entitled "Incorporating PKWARE Proprietary Technology into Your
Product" for more information.
4.3.12 Central directory structure:
[central directory header 1]
.
.
.
[central directory header n]
[digital signature]
File header:
    central file header signature   4 bytes  (0x02014b50)
    version made by                 2 bytes
    version needed to extract       2 bytes
    general purpose bit flag        2 bytes
    compression method              2 bytes
    last mod file time              2 bytes
    last mod file date              2 bytes
    crc-32                          4 bytes
    compressed size                 4 bytes
    uncompressed size               4 bytes
    file name length                2 bytes
    extra field length              2 bytes
    file comment length             2 bytes
    disk number start               2 bytes
    internal file attributes        2 bytes
    external file attributes        4 bytes
    relative offset of local header 4 bytes
    file name                       (variable size)
    extra field                     (variable size)
    file comment                    (variable size)
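Because each central directory file header carries its own three length
fields, a reader can walk the headers one after another. The sketch below
assumes an unencrypted, uncompressed central directory held in memory and
ignores ZIP64 sentinel values; CENTRAL_HEADER and iter_central_directory are
hypothetical names.

```python
import struct

# Fixed part of the central directory file header: 46 bytes, little-endian.
CENTRAL_HEADER = struct.Struct("<I6H3I5H2I")


def iter_central_directory(buf: bytes, offset: int, count: int):
    """Yield (file_name, local_header_offset, compressed_size, uncompressed_size)."""
    for _ in range(count):
        f = CENTRAL_HEADER.unpack_from(buf, offset)
        if f[0] != 0x02014B50:
            raise ValueError("not a central directory file header")
        csize, usize = f[8], f[9]
        name_len, extra_len, comment_len = f[10], f[11], f[12]
        local_offset = f[16]
        name_start = offset + CENTRAL_HEADER.size
        yield buf[name_start:name_start + name_len], local_offset, csize, usize
        # The next header starts after the name, extra field and comment.
        offset = name_start + name_len + extra_len + comment_len
```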
4.3.13 Digital signature:
header signature 4 bytes (0x05054b50)
size of data 2 bytes
signature data (variable size)
With the introduction of the Central Directory Encryption
feature in version 6.2 of this specification, the Central
Directory Structure MAY be stored both compressed and encrypted.
Although not required, it is assumed when encrypting the
Central Directory Structure, that it will be compressed
for greater storage efficiency. Information on the
Central Directory Encryption feature can be found in the section
describing the Strong Encryption Specification. The Digital
Signature record will be neither compressed nor encrypted.
4.3.14 Zip64 end of central directory record
    zip64 end of central dir
      signature                           4 bytes  (0x06064b50)
    size of zip64 end of central
      directory record                    8 bytes
    version made by                       2 bytes
    version needed to extract             2 bytes
    number of this disk                   4 bytes
    number of the disk with the
      start of the central directory      4 bytes
    total number of entries in the
      central directory on this disk      8 bytes
    total number of entries in the
      central directory                   8 bytes
    size of the central directory         8 bytes
    offset of start of central
      directory with respect to
      the starting disk number            8 bytes
    zip64 extensible data sector          (variable size)
4.3.14.1 The value stored into the "size of zip64 end of central
directory record" SHOULD be the size of the remaining
record and SHOULD NOT include the leading 12 bytes.
Size = SizeOfFixedFields + SizeOfVariableData - 12.
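As a worked instance of this formula, the fixed fields listed in 4.3.14 add
up to 56 bytes, so a Version 1 record with an empty extensible data sector
stores 44 in its size field. The names below are assumptions made for this
sketch.

```python
# Sum of the fixed field sizes listed in 4.3.14 (signature through offset).
ZIP64_EOCD_FIXED_SIZE = 4 + 8 + 2 + 2 + 4 + 4 + 8 + 8 + 8 + 8  # 56 bytes


def zip64_eocd_size_field(extensible_data_len: int = 0) -> int:
    """Value for "size of zip64 end of central directory record"."""
    # The leading 12 bytes (4-byte signature and this 8-byte size field)
    # are excluded from the stored size.
    return ZIP64_EOCD_FIXED_SIZE + extensible_data_len - 12
```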
4.3.14.2 The above record structure defines Version 1 of the
zip64 end of central directory record. Version 1 was
implemented in versions of this specification preceding
6.2 in support of the ZIP64 large file feature. The
introduction of the Central Directory Encryption feature
implemented in version 6.2 as part of the Strong Encryption
Specification defines Version 2 of this record structure.
Refer to the section describing the Strong Encryption
Specification for details on the version 2 format for
this record. Refer to the section in this document entitled
"Incorporating PKWARE Proprietary Technology into Your Product"
for more information applicable to use of Version 2 of this
record.
4.3.14.3 Special purpose data MAY reside in the zip64 extensible
data sector field following either a V1 or V2 version of this
record. To ensure identification of this special purpose data
it MUST include an identifying header block consisting of the
following:
Header ID - 2 bytes
Data Size - 4 bytes
The Header ID field indicates the type of data that is in the
data block that follows.
Data Size identifies the number of bytes that follow for this
data block type.
4.3.14.4 Multiple special purpose data blocks MAY be present.
Each MUST be preceded by a Header ID and Data Size field. Current
mappings of Header ID values supported in this field are as
defined in APPENDIX C.
4.3.15 Zip64 end of central directory locator
    zip64 end of central dir locator
      signature                           4 bytes  (0x07064b50)
    number of the disk with the
      start of the zip64 end of
      central directory                   4 bytes
    relative offset of the zip64
      end of central directory record     8 bytes
    total number of disks                 4 bytes
4.3.16 End of central directory record:
    end of central dir signature          4 bytes  (0x06054b50)
    number of this disk                   2 bytes
    number of the disk with the
      start of the central directory      2 bytes
    total number of entries in the
      central directory on this disk      2 bytes
    total number of entries in
      the central directory               2 bytes
    size of the central directory         4 bytes
    offset of start of central
      directory with respect to
      the starting disk number            4 bytes
    .ZIP file comment length              2 bytes
    .ZIP file comment                     (variable size)
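Since the end of central directory record is followed only by a comment of
at most 65,535 bytes, readers commonly locate it by searching backwards from
the end of the file. This document does not prescribe a search procedure;
the sketch below is one possible approach and assumes the comment runs
exactly to the end of the file, which rejects most signature look-alikes
inside the comment. EOCD and find_eocd are hypothetical names.

```python
import struct

# Fixed part of the end of central directory record: 22 bytes.
EOCD = struct.Struct("<IHHHHIIH")
EOCD_SIGNATURE = b"PK\x05\x06"  # 0x06054b50 in little-endian byte order


def find_eocd(buf: bytes):
    """Return (offset, fields) for the end of central directory record."""
    # The record cannot start earlier than 22 + 0xFFFF bytes before the end.
    lowest = max(0, len(buf) - EOCD.size - 0xFFFF)
    pos = buf.rfind(EOCD_SIGNATURE, lowest)
    while pos != -1:
        if pos + EOCD.size <= len(buf):
            fields = EOCD.unpack_from(buf, pos)
            comment_len = fields[-1]
            # Assumption: the comment ends exactly at the end of the file.
            if pos + EOCD.size + comment_len == len(buf):
                return pos, fields
        # Continue with the next candidate that starts before pos.
        pos = buf.rfind(EOCD_SIGNATURE, lowest, pos + 3)
    raise ValueError("end of central directory record not found")
```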
4.4 Explanation of fields
--------------------------
4.4.1 General notes on fields
4.4.1.1 All fields unless otherwise noted are unsigned and stored
in Intel low-byte:high-byte, low-word:high-word order.
4.4.1.2 String fields are not null terminated, since the length
is given explicitly.
4.4.1.3 The entries in the central directory MAY NOT necessarily
be in the same order that files appear in the .ZIP file.
4.4.1.4 If one of the fields in the end of central directory
record is too small to hold required data, the field SHOULD be
set to -1 (0xFFFF or 0xFFFFFFFF) and the ZIP64 format record
SHOULD be created.
4.4.1.5 The end of central directory record and the Zip64 end
of central directory locator record MUST reside on the same
disk when splitting or spanning an archive.
4.4.2 version made by (2 bytes)
4.4.2.1 The upper byte indicates the compatibility of the file
attribute information. If the external file attributes
are compatible with MS-DOS and can be read by PKZIP for
DOS version 2.04g then this value will be zero. If these
attributes are not compatible, then this value will
identify the host system on which the attributes are
compatible. Software can use this information to determine
the line record format for text files etc.
4.4.2.2 The current mappings are:
     0 - MS-DOS and OS/2 (FAT / VFAT / FAT32 file systems)
     1 - Amiga                     2 - OpenVMS
     3 - UNIX                      4 - VM/CMS
     5 - Atari ST                  6 - OS/2 H.P.F.S.
     7 - Macintosh                 8 - Z-System
     9 - CP/M                     10 - Windows NTFS
    11 - MVS (OS/390 - Z/OS)      12 - VSE
    13 - Acorn Risc               14 - VFAT
    15 - alternate MVS            16 - BeOS
    17 - Tandem                   18 - OS/400
    19 - OS X (Darwin)            20 thru 255 - unused
4.4.2.3 The lower byte indicates the ZIP specification version
(the version of this document) supported by the software
used to encode the file. The value/10 indicates the major
version number, and the value mod 10 is the minor version
number.
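The two bytes of "version made by" can be separated as sketched below. The
dictionary holds only a subset of the mappings in 4.4.2.2, and the names
HOST_SYSTEMS and decode_version_made_by are hypothetical.

```python
# Subset of the host system mappings from 4.4.2.2; extend as needed.
HOST_SYSTEMS = {
    0: "MS-DOS and OS/2",
    3: "UNIX",
    7: "Macintosh",
    10: "Windows NTFS",
    19: "OS X (Darwin)",
}


def decode_version_made_by(value: int):
    """Split the 2-byte field into (host system, major version, minor version)."""
    host = (value >> 8) & 0xFF   # upper byte: attribute compatibility
    spec = value & 0xFF          # lower byte: specification version
    return HOST_SYSTEMS.get(host, f"host {host}"), spec // 10, spec % 10
```

For example, a field value of 0x033F decodes to UNIX and specification
version 6.3, since the lower byte 0x3F is 63.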
4.4.3 version needed to extract (2 bytes)
4.4.3.1 The minimum supported ZIP specification version needed
to extract the file, mapped as above. This value is based on
the specific format features a ZIP program MUST support to
be able to extract the file. If multiple features are
applied to a file, the minimum version MUST be set to the
feature having the highest value. New features or feature
changes affecting the published format specification will be
implemented using higher version numbers than the last
published value to avoid conflict.
4.4.3.2 Current minimum feature versions are as defined below:
1.0 - Default value
1.1 - File is a volume label
2.0 - File is a folder (directory)
2.0 - File is compressed using Deflate compression
2.0 - File is encrypted using traditional PKWARE encryption
2.1 - File is compressed using Deflate64(tm)
2.5 - File is compressed using PKWARE DCL Implode
2.7 - File is a patch data set
4.5 - File uses ZIP64 format extensions
4.6 - File is compressed using BZIP2 compression*
5.0 - File is encrypted using DES
5.0 - File is encrypted using 3DES
5.0 - File is encrypted using original RC2 encryption
5.0 - File is encrypted using RC4 encryption
5.1 - File is encrypted using AES encryption
5.1 - File is encrypted using corrected RC2 encryption**
5.2 - File is encrypted using corrected RC2-64 encryption**
6.1 - File is encrypted using non-OAEP key wrapping***
6.2 - Central directory encryption
6.3 - File is compressed using LZMA
6.3 - File is compressed using PPMd+
6.3 - File is encrypted using Blowfish
6.3 - File is encrypted using Twofish
4.4.3.3 Notes on version needed to extract
* Early 7.x (pre-7.2) versions of PKZIP incorrectly set the
version needed to extract for BZIP2 compression to be 50
when it SHOULD have been 46.
** Refer to the section on Strong Encryption Specification
for additional information regarding RC2 corrections.
*** Certificate encryption using non-OAEP key wrapping is the
intended mode of operation for all versions beginning with 6.1.
Support for OAEP key wrapping MUST only be used for
backward compatibility when sending ZIP files to be opened by
versions of PKZIP older than 6.1 (5.0 or 6.0).
+ Files compressed using PPMd MUST set the version
needed to extract field to 6.3, however, not all ZIP
programs enforce this and MAY be unable to decompress
data files compressed using PPMd if this value is set.
When using ZIP64 extensions, the corresponding value in the
zip64 end of central directory record MUST also be set.
This field SHOULD be set appropriately to indicate whether
Version 1 or Version 2 format is in use.
4.4.4 general purpose bit flag: (2 bytes)
Bit 0: If set, indicates that the file is encrypted.
(For Method 6 - Imploding)
Bit 1: If the compression method used was type 6,
Imploding, then this bit, if set, indicates
an 8K sliding dictionary was used. If clear,
then a 4K sliding dictionary was used.
Bit 2: If the compression method used was type 6,
Imploding, then this bit, if set, indicates
3 Shannon-Fano trees were used to encode the
sliding dictionary output. If clear, then 2
Shannon-Fano trees were used.
(For Methods 8 and 9 - Deflating)
    Bit 2  Bit 1
      0      0    Normal (-en) compression option was used.
      0      1    Maximum (-exx/-ex) compression option was used.
      1      0    Fast (-ef) compression option was used.
      1      1    Super Fast (-es) compression option was used.
(For Method 14 - LZMA)
Bit 1: If the compression method used was type 14,
LZMA, then this bit, if set, indicates
an end-of-stream (EOS) marker is used to
mark the end of the compressed data stream.
If clear, then an EOS marker is not present
and the compressed data size must be known
to extract.
Note: Bits 1 and 2 are undefined if the compression
method is any other.
Bit 3: If this bit is set, the fields crc-32, compressed
size and uncompressed size are set to zero in the
local header. The correct values are put in the
data descriptor immediately following the compressed
data. (Note: PKZIP version 2.04g for DOS only
recognizes this bit for method 8 compression, newer
versions of PKZIP recognize this bit for any
compression method.)
Bit 4: Reserved for use with method 8, for enhanced
deflating.
Bit 5: If this bit is set, this indicates that the file is
compressed patched data. (Note: Requires PKZIP
version 2.70 or greater)
Bit 6: Strong encryption. If this bit is set, you MUST
set the version needed to extract value to at least
50 and you MUST also set bit 0. If AES encryption
is used, the version needed to extract value MUST
be at least 51. See the section describing the Strong
Encryption Specification for details. Refer to the
section in this document entitled "Incorporating PKWARE
Proprietary Technology into Your Product" for more
information.
Bit 7: Currently unused.
Bit 8: Currently unused.
Bit 9: Currently unused.
Bit 10: Currently unused.
Bit 11: Language encoding flag (EFS). If this bit is set,
the filename and comment fields for this file
MUST be encoded using UTF-8. (see APPENDIX D)
Bit 12: Reserved by PKWARE for enhanced compression.
Bit 13: Set when encrypting the Central Directory to indicate
selected data values in the Local Header are masked to
hide their actual values. See the section describing
the Strong Encryption Specification for details. Refer
to the section in this document entitled "Incorporating
PKWARE Proprietary Technology into Your Product" for
more information.
Bit 14: Reserved by PKWARE.
Bit 15: Reserved by PKWARE.
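The bit assignments above can be decoded as in the following sketch. Bits 1
and 2 are interpreted according to the compression method, as 4.4.4
describes; reserved and unused bits are ignored. The function name
decode_flags and the dictionary keys are assumptions made for this example.

```python
def decode_flags(flags: int, method: int) -> dict:
    """Interpret the general purpose bit flag for a given compression method."""
    info = {
        "encrypted": bool(flags & 0x0001),            # bit 0
        "data_descriptor": bool(flags & 0x0008),      # bit 3
        "patched_data": bool(flags & 0x0020),         # bit 5
        "strong_encryption": bool(flags & 0x0040),    # bit 6
        "utf8": bool(flags & 0x0800),                 # bit 11
        "local_header_masked": bool(flags & 0x2000),  # bit 13
    }
    if method == 6:  # Imploding
        info["sliding_dictionary"] = "8K" if flags & 0x0002 else "4K"
        info["shannon_fano_trees"] = 3 if flags & 0x0004 else 2
    elif method in (8, 9):  # Deflating
        # Index is (bit 2, bit 1) read as a two-bit number.
        options = ("normal", "maximum", "fast", "super fast")
        info["deflate_option"] = options[(flags >> 1) & 0b11]
    elif method == 14:  # LZMA
        info["lzma_eos_marker"] = bool(flags & 0x0002)
    return info
```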
4.4.5 compression method: (2 bytes)
0 - The file is stored (no compression)
1 - The file is Shrunk
2 - The file is Reduced with compression factor 1
3 - The file is Reduced with compression factor 2
4 - The file is Reduced with compression factor 3
5 - The file is Reduced with compression factor 4
6 - The file is Imploded
7 - Reserved for Tokenizing compression algorithm
8 - The file is Deflated
9 - Enhanced Deflating using Deflate64(tm)
10 - PKWARE Data Compression Library Imploding (old IBM TERSE)
11 - Reserved by PKWARE
12 - File is compressed using BZIP2 algorithm
13 - Reserved by PKWARE
14 - LZMA
15 - Reserved by PKWARE
16 - IBM z/OS CMPSC Compression
17 - Reserved by PKWARE
18 - File is compressed using IBM TERSE (new)
19 - IBM LZ77 z Architecture (PFS)
96 - JPEG variant
97 - WavPack compressed data
98 - PPMd version I, Rev 1
99 - AE-x encryption marker (see APPENDIX E)
4.4.5.1 Methods 1-6 are legacy algorithms and are no longer
recommended for use when compressing files.
4.4.6 date and time fields: (2 bytes each)
The date and time are encoded in standard MS-DOS format.
If input came from standard input, the date and time are
those at which compression was started for this data.
If encrypting the central directory and general purpose bit
flag 13 is set indicating masking, the value stored in the
Local Header will be zero. MS-DOS time format is different
from more commonly used computer time formats such as
UTC. For example, MS-DOS uses year values relative to 1980
and 2 second precision.
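This section does not spell out the bit layout. The sketch below assumes the
conventional MS-DOS packing: the date holds day in bits 0-4, month in bits
5-8 and year minus 1980 in bits 9-15; the time holds seconds divided by two
in bits 0-4, minutes in bits 5-10 and hours in bits 11-15. The function
names are hypothetical.

```python
from datetime import datetime


def dos_to_datetime(dos_date: int, dos_time: int) -> datetime:
    """Convert MS-DOS date and time fields to a naive datetime."""
    # A masked value of zero has month 0 and day 0, so datetime() raises
    # ValueError for it; callers may want to handle that case separately.
    return datetime(
        1980 + (dos_date >> 9),
        (dos_date >> 5) & 0x0F,
        dos_date & 0x1F,
        dos_time >> 11,
        (dos_time >> 5) & 0x3F,
        (dos_time & 0x1F) * 2,
    )


def datetime_to_dos(dt: datetime):
    """Convert a datetime to (dos_date, dos_time); odd seconds round down."""
    dos_date = ((dt.year - 1980) << 9) | (dt.month << 5) | dt.day
    dos_time = (dt.hour << 11) | (dt.minute << 5) | (dt.second // 2)
    return dos_date, dos_time
```

Because seconds are stored divided by two, a time with an odd seconds value
does not survive a round trip unchanged.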
4.4.7 CRC-32: (4 bytes)
The CRC-32 algorithm was generously contributed by
David Schwaderer and can be found in his excellent
book "C Programmers Guide to NetBIOS" published by
Howard W. Sams & Co. Inc. The 'magic number' for
the CRC is 0xdebb20e3. The proper CRC pre and post
conditioning is used, meaning that the CRC register
is pre-conditioned with all ones (a starting value
of 0xffffffff) and the value is post-conditioned by
taking the one's complement of the CRC residual.
If bit 3 of the general purpose flag is set, this
field is set to zero in the local header and the correct
value is put in the data descriptor and in the central
directory. When encrypting the central directory, if the
local header is not in ZIP64 format and general purpose
bit flag 13 is set indicating masking, the value stored
in the Local Header will be zero.
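The pre- and post-conditioning described above can be made explicit with a
bit-at-a-time sketch. The reflected polynomial constant 0xEDB88320 is not
stated in this section; it is the constant conventionally used for this CRC
and is assumed here. The function name crc32_bitwise is hypothetical, and
Python's zlib.crc32 is used only as a cross-check.

```python
import struct
import zlib


def crc32_bitwise(data: bytes) -> int:
    """Compute the CRC-32 one bit at a time (slow, for illustration only)."""
    crc = 0xFFFFFFFF  # pre-conditioning: register starts as all ones
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # 0xEDB88320 is the assumed reflected polynomial constant.
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF  # post-conditioning: one's complement


def residual(data: bytes) -> int:
    """CRC register (before post-conditioning) over data plus its own CRC."""
    framed = data + struct.pack("<I", zlib.crc32(data))
    return zlib.crc32(framed) ^ 0xFFFFFFFF
```

For any input, residual() yields the 'magic number' 0xdebb20e3 quoted above,
which gives a reader a way to verify data and its stored CRC in one pass.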
4.4.8 compressed size: (4 bytes)
4.4.9 uncompressed size: (4 bytes)
The size of the file compressed (4.4.8) and uncompressed
(4.4.9), respectively. When a decryption header is present it
will be placed in front of the file data and the value of the
compressed file size will include the bytes of the decryption
header. If bit 3 of the general purpose bit flag is set,
these fields are set to zero in the local header and the
correct values are put in the data descriptor and
in the central directory. If an archive is in ZIP64 format
and the value in this field is 0xFFFFFFFF, the size will be
in the corresponding 8 byte ZIP64 extended information
extra field. When encrypting the central directory, if the
local header is not in ZIP64 format and general purpose bit
flag 13 is set indicating masking, the value stored for the
uncompressed size in the Local Header will be zero.
4.4.10 file name length: (2 bytes)
4.4.11 extra field length: (2 bytes)
4.4.12 file comment length: (2 bytes)
The length of the file name, extra field, and comment
fields respectively. The combined length of any
directory record and these three fields SHOULD NOT
generally exceed 65,535 bytes. If input came from standard
input, the file name length is set to zero.
4.4.13 disk number start: (2 bytes)
The number of the disk on which this file begins. If an
archive is in ZIP64 format and the value in this field is
0xFFFF, the disk number will be in the corresponding 4 byte
zip64 extended information extra field.
4.4.14 internal file attributes: (2 bytes)
Bits 1 and 2 are reserved for use by PKWARE.
4.4.14.1 The lowest bit of this field indicates, if set,
that the file is apparently an ASCII or text file. If not
set, the file apparently contains binary data.
The remaining bits are unused in version 1.0.
4.4.14.2 The 0x0002 bit of this field indicates, if set, that
a 4 byte variable record length control field precedes each
logical record indicating the length of the record. The
record length control field is stored in little-endian byte
order. This flag is independent of text control characters,
and if used in conjunction with text data, includes any
control characters in the total length of the record. This
value is provided for mainframe data transfer support.
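As a small illustration of the two bits described in 4.4.14.1 and
4.4.14.2, the sketch below tests them with bit masks. The function and
key names are hypothetical and not part of this specification.

```python
TEXT_FILE_BIT = 0x0001       # 4.4.14.1: file is apparently text
RECORD_LENGTH_BIT = 0x0002   # 4.4.14.2: record length control fields present


def describe_internal_attributes(attrs: int) -> dict:
    """Report the two documented bits of the 2-byte internal attributes."""
    return {
        "apparently_text": bool(attrs & TEXT_FILE_BIT),
        "has_record_length_control": bool(attrs & RECORD_LENGTH_BIT),
    }
```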
4.4.15 external file attributes: (4 bytes)
The mapping of the external attributes is
host-system dependent (see 'version made by'). For
MS-DOS, the low order byte is the MS-DOS directory
attribute byte. If input came from standard input, this
field is set to zero.
4.4.16 relative offset of local header: (4 bytes)
This is the offset from the start of the first disk on
which this file appears, to where the local header SHOULD
be found. If an archive is in ZIP64 format and the value
in this field is 0xFFFFFFFF, the offset will be in the
corresponding 8 byte zip64 extended information extra field.
4.4.17 file name: (Variable)
4.4.17.1 The name of the file, with optional relative path.
The path stored MUST NOT contain a drive or
device letter, or a leading slash. All slashes
MUST be forward slashes '/' as opposed to
backwards slashes '\' for compatibility with Amiga
and UNIX file systems etc. If input came from standard
input, there is no file name field.
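The path rules in 4.4.17.1 can be checked before a name is written.
The sketch below is one possible check, not a complete validator: the
drive letter test is a simple heuristic (a letter followed by a colon),
and the function name and messages are hypothetical.

```python
import re


def check_entry_name(name: str) -> list:
    """Return a list of rule violations for a proposed entry name."""
    problems = []
    if "\\" in name:
        problems.append("contains a backslash; use '/' as the separator")
    if name.startswith("/"):
        problems.append("has a leading slash")
    if re.match(r"^[A-Za-z]:", name):
        problems.append("starts with a drive letter")
    return problems
```

An empty list means none of the three checks fired; it does not prove
the name is acceptable to every reader.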
4.4.17.2 If using the Central Directory Encryption Feature and
general purpose bit flag 13 is set indicating masking, the file
name stored in the Local Header will not be the actual file name.
A masking value consisting of a unique hexadecimal value will
be stored. This value will be sequentially incremented for each
file in the archive. See the section on the Strong Encryption
Specification for details on retrieving the encrypted file name.
Refer to the section in this document entitled "Incorporating PKWARE
Proprietary Technology into Your Product" for more information.
4.4.18 file comment: (Variable)
The comment for this file.
4.4.19 number of this disk: (2 bytes)
The number of this disk, which contains central
directory end record. If an archive is in ZIP64 format
and the value in this field is 0xFFFF, the disk number will
be in the corresponding 4 byte zip64 end of central
directory field.
4.4.20 number of the disk with the start of the central
directory: (2 bytes)
The number of the disk on which the central
directory starts. If an archive is in ZIP64 format
and the value in this field is 0xFFFF, the disk number will
be in the corresponding 4 byte zip64 end of central
directory field.
4.4.21 total number of entries in the central dir on
this disk: (2 bytes)
The number of central directory entries on this disk.
If an archive is in ZIP64 format and the value in
this field is 0xFFFF, the entry count will be in the
corresponding 8 byte zip64 end of central
directory field.
4.4.22 total number of entries in the central dir: (2 bytes)
The total number of files in the .ZIP file. If an
archive is in ZIP64 format and the value in this field
is 0xFFFF, the total will be in the corresponding 8 byte
zip64 end of central directory field.
4.4.23 size of the central directory: (4 bytes)
The size (in bytes) of the entire central directory.
If an archive is in ZIP64 format and the value in
this field is 0xFFFFFFFF, the size will be in the
corresponding 8 byte zip64 end of central
directory field.
4.4.24 offset of start of central directory with respect to
the starting disk number: (4 bytes)
Offset of the start of the central directory on the
disk on which the central directory starts. If an
archive is in ZIP64 format and the value in this
field is 0xFFFFFFFF, the offset will be in the
corresponding 8 byte zip64 end of central
directory field.
4.4.25 .ZIP file comment length: (2 bytes)
The length of the comment for this .ZIP file.
4.4.26 .ZIP file comment: (Variable)
The comment for this .ZIP file. ZIP file comment data
is stored unsecured. No encryption or data authentication
is applied to this area at this time. Confidential information
SHOULD NOT be stored in this section.
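Fields 4.4.19 through 4.4.26 together make up the end of central
directory record. The sketch below unpacks them from a byte buffer. It
assumes the record signature 0x06054b50 and the field order given in
the record layout elsewhere in this specification. Searching backwards
for the signature is a simplification that can be fooled by a comment
containing the same bytes. All Python names are hypothetical.

```python
import struct
from typing import NamedTuple

EOCD_SIGNATURE = b"PK\x05\x06"          # 0x06054b50 stored low byte first
_EOCD = struct.Struct("<4sHHHHIIH")     # 22 bytes before the comment


class EndOfCentralDirectory(NamedTuple):
    disk_number: int            # 4.4.19
    cd_start_disk: int          # 4.4.20
    entries_on_this_disk: int   # 4.4.21
    total_entries: int          # 4.4.22
    cd_size: int                # 4.4.23
    cd_offset: int              # 4.4.24
    comment: bytes              # 4.4.25 gives the length, 4.4.26 the data


def parse_eocd(buf: bytes) -> EndOfCentralDirectory:
    """Locate the last signature in buf and unpack the record after it."""
    pos = buf.rfind(EOCD_SIGNATURE)
    if pos < 0 or pos + _EOCD.size > len(buf):
        raise ValueError("end of central directory record not found")
    (_sig, disk, cd_disk, n_here, n_total,
     cd_size, cd_offset, comment_len) = _EOCD.unpack_from(buf, pos)
    start = pos + _EOCD.size
    comment = buf[start:start + comment_len]
    return EndOfCentralDirectory(disk, cd_disk, n_here, n_total,
                                 cd_size, cd_offset, comment)


def needs_zip64(e: EndOfCentralDirectory) -> bool:
    """True if any field holds the 0xFFFF or 0xFFFFFFFF placeholder value."""
    small = (e.disk_number, e.cd_start_disk,
             e.entries_on_this_disk, e.total_entries)
    large = (e.cd_size, e.cd_offset)
    return 0xFFFF in small or 0xFFFFFFFF in large
```

When needs_zip64() returns True, the real values are expected in the
zip64 end of central directory record rather than in these fields.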
4.4.27 zip64 extensible data sector (variable size)
(currently reserved for use by PKWARE)
4.4.28 extra field: (Variable)
This SHOULD be used for storage expansion. If additional
information needs to be stored within a ZIP file for special
application or platform needs, it SHOULD be stored here.
Programs supporting earlier versions of this specification can
then safely skip the file, and find the next file or header.
This field will be 0 length in version 1.0.
Existing extra fields are defined in the section
Extensible data fields that follows.
4.5 Extensible data fields
--------------------------
4.5.1 In order to allow different programs and different types
of information to be stored in the 'extra' field in .ZIP
files, the following structure MUST be used for all
programs storing data in this field:
header1+data1 + header2+data2 . . .
Each header MUST consist of:
Header ID - 2 bytes
Data Size - 2 bytes
Note: all fields stored in Intel low-byte/high-byte order.
The Header ID field indicates the type of data that is in
the following data block.
Header IDs of 0 thru 31 are reserved for use by PKWARE.
The remaining IDs can be used by third party vendors for
proprietary usage.
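The header1+data1 + header2+data2 structure above can be walked with a
short loop. This is a sketch under the stated layout (2 byte Header ID,
2 byte Data Size, both low byte first). The function name is
hypothetical, and raising on leftover bytes is a choice made for this
example rather than a requirement of the specification.

```python
import struct
from typing import Iterator, Tuple


def iter_extra_fields(extra: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (header_id, data) for each block in an 'extra' field."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, data_size = struct.unpack_from("<HH", extra, pos)
        pos += 4
        if pos + data_size > len(extra):
            raise ValueError("data block overruns the extra field")
        yield header_id, extra[pos:pos + data_size]
        pos += data_size
    if pos != len(extra):
        # Fewer than 4 bytes remain, so no complete header can follow.
        raise ValueError("trailing bytes after the last block")
```

A reader that does not recognize a Header ID can skip that block using
its Data Size and continue with the next one.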
4.5.2 The current Header ID mappings defined by PKWARE are:
0x0001        Zip64 extended information extra field
0x0007        AV Info
0x0008        Reserved for extended language encoding data (PFS)
              (see APPENDIX D)
0x0009        OS/2
0x000a        NTFS
0x000c        OpenVMS
0x000d        UNIX
0x000e        Reserved for file stream and fork descriptors
0x000f        Patch Descriptor
0x0014        PKCS#7 Store for X.509 Certificates
0x0015        X.509 Certificate ID and Signature for
              individual file
0x0016        X.509 Certificate ID for Central Directory
0x0017        Strong Encryption Header
0x0018        Record Management Controls
0x0019        PKCS#7 Encryption Recipient Certificate List
0x0020        Reserved for Timestamp record
0x0021        Policy Decryption Key Record
0x0022        Smartcrypt Key Provider Record
0x0023        Smartcrypt Policy Key Data Record
0x0065        IBM S/390 (Z390), AS/400 (I400) attributes
              - uncompressed
0x0066        Reserved for IBM S/390 (Z390), AS/400 (I400)
              attributes - compressed
0x4690        POSZIP 4690 (reserved)
4.5.3 -Zip64 Extended Information Extra Field (0x0001):
The following is the layout of the zip64 extended
information "extra" block. If one of the size or
offset fields in the Local or Central directory
record is too small to hold the required data,
a Zip64 extended information record is created.
The order of the fields in the zip64 extended
information record is fixed, but the fields MUST
only appear if the corresponding Local or Central
directory record field is set to 0xFFFF or 0xFFFFFFFF.
Note: all fields stored in Intel low-byte/high-byte order.
        Value      Size       Description
        -----      ----       -----------
(ZIP64) 0x0001     2 bytes    Tag for this "extra" block type
        Size       2 bytes    Size of this "extra" block
        Original
        Size       8 bytes    Original uncompressed file size
        Compressed
        Size       8 bytes    Size of compressed data
        Relative Header
        Offset     8 bytes    Offset of local header record
        Disk Start
        Number     4 bytes    Number of the disk on which