File:    APPNOTE.TXT - .ZIP File Format Specification
Version: 6.3.6
Status:  FINAL - replaces version 6.3.5
Revised: April 26, 2019
Copyright (c) 1989 - 2014, 2018, 2019 PKWARE Inc., All Rights Reserved.

1.0 Introduction
---------------

1.1 Purpose
-----------

   1.1.1 This specification is intended to define a cross-platform,
   interoperable file storage and transfer format. Since its
   first publication in 1989, PKWARE, Inc. ("PKWARE") has remained
   committed to ensuring the interoperability of the .ZIP file
   format through periodic publication and maintenance of this
   specification. We trust that all .ZIP compatible vendors and
   application developers that use and benefit from this format
   will share and support this commitment to interoperability.

1.2 Scope
---------

   1.2.1 ZIP is one of the most widely used compressed file formats. It is
   universally used to aggregate, compress, and encrypt files into a single
   interoperable container. No specific use or application need is
   defined by this format and no specific implementation guidance is
   provided. This document provides details on the storage format for
   creating ZIP files. Information is provided on the records and
   fields that describe what a ZIP file is.

1.3 Trademarks
--------------

   1.3.1 PKWARE, PKZIP, Smartcrypt, SecureZIP, and PKSFX are registered
   trademarks of PKWARE, Inc. in the United States and elsewhere.
   PKPatchMaker, Deflate64, and ZIP64 are trademarks of PKWARE, Inc.
   Other marks referenced within this document appear for identification
   purposes only and are the property of their respective owners.

1.4 Permitted Use
-----------------

   1.4.1 This document, "APPNOTE.TXT - .ZIP File Format Specification" is the
   exclusive property of PKWARE. Use of the information contained in this
   document is permitted solely for the purpose of creating products,
   programs and processes that read and write files in the ZIP format
   subject to the terms and conditions herein.

   1.4.2 Use of the content of this document within other publications is
   permitted only through reference to this document. Any reproduction
   or distribution of this document in whole or in part without prior
   written permission from PKWARE is strictly prohibited.

   1.4.3 Certain technological components provided in this document are the
   patented proprietary technology of PKWARE and as such require a
   separate, executed license agreement from PKWARE. Applicable
   components are marked with the following, or similar, statement:
   'Refer to the section in this document entitled "Incorporating
   PKWARE Proprietary Technology into Your Product" for more information'.

1.5 Contacting PKWARE
---------------------

   1.5.1 If you have questions on this format, its use, or licensing, or if you
   wish to report defects, request changes or additions, please contact:

      PKWARE, Inc.
      201 E. Pittsburgh Avenue, Suite 400
      Milwaukee, WI 53204
      +1-414-289-9788
      +1-414-289-9789 FAX
      zipformat@pkware.com

   1.5.2 Information about this format and a reference copy of this document
   is publicly available at:

      http://www.pkware.com/appnote

1.6 Disclaimer
--------------

   1.6.1 Although PKWARE will attempt to supply current and accurate
   information relating to its file formats, algorithms, and the
   subject programs, the possibility of error or omission cannot
   be eliminated. PKWARE therefore expressly disclaims any warranty
   that the information contained in the associated materials relating
   to the subject programs and/or the format of the files created or
   accessed by the subject programs and/or the algorithms used by
   the subject programs, or any other matter, is current, correct or
   accurate as delivered. Any risk of damage due to any possible
   inaccurate information is assumed by the user of the information.
   Furthermore, the information relating to the subject programs
   and/or the file formats created or accessed by the subject
   programs and/or the algorithms used by the subject programs is
   subject to change without notice.

2.0 Revisions
--------------

2.1 Document Status
--------------------

   2.1.1 If the STATUS of this file is marked as DRAFT, the content
   defines proposed revisions to this specification which may consist
   of changes to the ZIP format itself, or that may consist of other
   content changes to this document. Versions of this document and
   the format in DRAFT form may be subject to modification prior to
   publication STATUS of FINAL. DRAFT versions are published
   periodically to provide notification to the ZIP community
   of pending changes and to provide opportunity for review and
   comment.

   2.1.2 Versions of this document having a STATUS of FINAL are
   considered to be in the final form for that version of the document
   and are not subject to further change until a new, higher version
   numbered document is published. Newer versions of this format
   specification are intended to remain interoperable with all prior
   versions whenever technically possible.

2.2 Change Log
--------------

   Version  Change Description                         Date
   -------  ------------------                         ----------
   5.2      -Single Password Symmetric Encryption      07/16/2003
             storage

   6.1.0    -Smartcard compatibility                   01/20/2004
            -Documentation on certificate storage

   6.2.0    -Introduction of Central Directory         04/26/2004
             Encryption for encrypting metadata
            -Added OS X to Version Made By values

   6.2.1    -Added Extra Field placeholder for         04/01/2005
             POSZIP using ID 0x4690
            -Clarified size field on
             "zip64 end of central directory record"

   6.2.2    -Documented Final Feature Specification    01/06/2006
             for Strong Encryption
            -Clarifications and typographical
             corrections

   6.3.0    -Added tape positioning storage            09/29/2006
             parameters
            -Expanded list of supported hash algorithms
            -Expanded list of supported compression
             algorithms
            -Expanded list of supported encryption
             algorithms
            -Added option for Unicode filename
             storage
            -Clarifications for consistent use
             of Data Descriptor records
            -Added additional "Extra Field"
             definitions

   6.3.1    -Corrected standard hash values for        04/11/2007
             SHA-256/384/512

   6.3.2    -Added compression method 97               09/28/2007
            -Documented InfoZIP "Extra Field"
             values for UTF-8 file name and
             file comment storage

   6.3.3    -Formatting changes to support             09/01/2012
             easier referencing of this APPNOTE
             from other documents and standards

   6.3.4    -Address change                            10/01/2014

   6.3.5    -Documented compression methods 16         11/31/2018
             and 99 (4.4.5, 4.6.1, 5.11, 5.17,
             APPENDIX E)
            -Corrected several typographical
             errors (2.1.2, 3.2, 4.1.1, 10.2)
            -Marked legacy algorithms as no
             longer suitable for use (4.4.5.1)
            -Added clarity on MS DOS time format
             (4.4.6)
            -Assign extrafield ID for Timestamps
             (4.5.2)
            -Field code description correction (A.2)
            -More consistent use of MAY/SHOULD/MUST
            -Expanded 0x0065 record attribute codes
             (B.2)
            -Initial information on 0x0022 Extra Data

   6.3.6    -Corrected typographical error             04/26/2019
             (4.4.1.3)

3.0 Notations
-------------

   3.1 Use of the term MUST or SHALL indicates a required element.

   3.2 MUST NOT or SHALL NOT indicates an element is prohibited from use.

   3.3 SHOULD indicates a RECOMMENDED element.

   3.4 SHOULD NOT indicates an element NOT RECOMMENDED for use.

   3.5 MAY indicates an OPTIONAL element.

4.0 ZIP Files
-------------

4.1 What is a ZIP file
----------------------

   4.1.1 ZIP files MAY be identified by the standard .ZIP file extension
   although use of a file extension is not required. Use of the
   extension .ZIPX is also recognized and MAY be used for ZIP files.
   Other common file extensions using the ZIP format include .JAR, .WAR,
   .DOCX, .XLSX, .PPTX, .ODT, .ODS, .ODP and others. Programs reading or
   writing ZIP files SHOULD rely on internal record signatures described
   in this document to identify files in this format.

   4.1.2 ZIP files SHOULD contain at least one file and MAY contain
   multiple files.

   4.1.3 Data compression MAY be used to reduce the size of files
   placed into a ZIP file, but is not required. This format supports the
   use of multiple data compression algorithms.
   When compression is used, one of the documented compression
   algorithms MUST be used. Implementers are advised to experiment
   with their data to determine which of the available algorithms
   provides the best compression for their needs. Compression method 8
   (Deflate) is the method used by default by most ZIP compatible
   application programs.

   4.1.4 Data encryption MAY be used to protect files within a ZIP file.
   Keying methods supported for encryption within this format include
   passwords and public/private keys. Either MAY be used individually
   or in combination. Encryption MAY be applied to individual files.
   Additional security MAY be used through the encryption of ZIP file
   metadata stored within the Central Directory. See the section on the
   Strong Encryption Specification for information. Refer to the section
   in this document entitled "Incorporating PKWARE Proprietary Technology
   into Your Product" for more information.

   4.1.5 Data integrity MUST be provided for each file using CRC32.

   4.1.6 Additional data integrity MAY be included through the use of
   digital signatures. Individual files MAY be signed with one or more
   digital signatures. The Central Directory, if signed, MUST use a
   single signature.

   4.1.7 Files MAY be placed within a ZIP file uncompressed or stored.
   The term "stored" as used in the context of this document means the file
   is copied into the ZIP file uncompressed.

   4.1.8 Each data file placed into a ZIP file MAY be compressed, stored,
   encrypted or digitally signed independent of how other data files in the
   same ZIP file are archived.

   4.1.9 ZIP files MAY be streamed, split into segments (on fixed or on
   removable media) or "self-extracting". Self-extracting ZIP
   files MUST include extraction code for a target platform within
   the ZIP file.

   4.1.10 Extensibility is provided for platform or application specific
   needs through extra data fields that MAY be defined for custom
   purposes.
   Extra data definitions MUST NOT conflict with existing
   documented record definitions.

   4.1.11 Common uses for ZIP MAY also include the use of manifest files.
   Manifest files store application specific information within a file stored
   within the ZIP file. This manifest file SHOULD be the first file in the
   ZIP file. This specification does not provide any information or guidance on
   the use of manifest files within ZIP files. Refer to the application developer
   for information on using manifest files and for any additional profile
   information on using ZIP within an application.

   4.1.12 ZIP files MAY be placed within other ZIP files.

4.2 ZIP Metadata
----------------

   4.2.1 ZIP files are identified by metadata consisting of defined record types
   containing the storage information necessary for maintaining the files
   placed into a ZIP file. Each record type MUST be identified using a header
   signature that identifies the record type. Signature values begin with the
   two byte constant marker of 0x4b50, representing the characters "PK".

4.3 General Format of a .ZIP file
---------------------------------

   4.3.1 A ZIP file MUST contain an "end of central directory record". A ZIP
   file containing only an "end of central directory record" is considered an
   empty ZIP file. Files MAY be added or replaced within a ZIP file, or deleted.
   A ZIP file MUST have only one "end of central directory record". Other
   records defined in this specification MAY be used as needed to support
   storage requirements for individual ZIP files.

   4.3.2 Each file placed into a ZIP file MUST be preceded by a "local
   file header" record for that file. Each "local file header" MUST be
   accompanied by a corresponding "central directory header" record within
   the central directory section of the ZIP file.

   4.3.3 Files MAY be stored in arbitrary order within a ZIP file. A ZIP
   file MAY span multiple volumes or it MAY be split into user-defined
   segment sizes.
   All values MUST be stored in little-endian byte order unless
   otherwise specified in this document for a specific data element.

   4.3.4 Compression MUST NOT be applied to a "local file header", an "encryption
   header", or an "end of central directory record". Individual "central
   directory records" MUST NOT be compressed, but the aggregate of all central
   directory records MAY be compressed.

   4.3.5 File data MAY be followed by a "data descriptor" for the file. Data
   descriptors are used to facilitate ZIP file streaming.

   4.3.6 Overall .ZIP file format:

      [local file header 1]
      [encryption header 1]
      [file data 1]
      [data descriptor 1]
      .
      .
      .
      [local file header n]
      [encryption header n]
      [file data n]
      [data descriptor n]
      [archive decryption header]
      [archive extra data record]
      [central directory header 1]
      .
      .
      .
      [central directory header n]
      [zip64 end of central directory record]
      [zip64 end of central directory locator]
      [end of central directory record]

   4.3.7 Local file header:

      local file header signature     4 bytes  (0x04034b50)
      version needed to extract       2 bytes
      general purpose bit flag        2 bytes
      compression method              2 bytes
      last mod file time              2 bytes
      last mod file date              2 bytes
      crc-32                          4 bytes
      compressed size                 4 bytes
      uncompressed size               4 bytes
      file name length                2 bytes
      extra field length              2 bytes

      file name (variable size)
      extra field (variable size)

   4.3.8 File data

      Immediately following the local header for a file
      SHOULD be placed the compressed or stored data for the file.
      If the file is encrypted, the encryption header for the file
      SHOULD be placed after the local header and before the file
      data. The series of [local file header][encryption header]
      [file data][data descriptor] repeats for each file in the
      .ZIP archive.

      Zero-byte files, directories, and other file types that
      contain no content MUST NOT include file data.

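The local file header layout in 4.3.7 can be sketched with Python's `struct` module. This is a minimal illustration, not a reference implementation: the function name `pack_local_file_header` and its parameters are hypothetical, and it assumes a stored (method 0), unencrypted entry with no extra field and no data descriptor.

```python
import struct
import zlib

LOCAL_FILE_HEADER_SIGNATURE = 0x04034B50
# "<" selects little-endian packing without padding (see 4.3.3).
LOCAL_FILE_HEADER_FORMAT = "<IHHHHHIIIHH"


def pack_local_file_header(name: bytes, data: bytes,
                           dos_time: int = 0, dos_date: int = 0x0021) -> bytes:
    """Hypothetical helper: local file header for one stored entry."""
    fixed = struct.pack(
        LOCAL_FILE_HEADER_FORMAT,
        LOCAL_FILE_HEADER_SIGNATURE,
        10,                # version needed to extract (1.0, the default)
        0,                 # general purpose bit flag
        0,                 # compression method 0: stored
        dos_time,          # last mod file time
        dos_date,          # last mod file date (0x0021 is 1980-01-01)
        zlib.crc32(data),  # crc-32
        len(data),         # compressed size (equals uncompressed when stored)
        len(data),         # uncompressed size
        len(name),         # file name length
        0,                 # extra field length
    )
    # The variable-size file name follows the fixed fields.
    return fixed + name


header = pack_local_file_header(b"hello.txt", b"hello")
print(len(header), header[:4])
```

The file data would follow this header directly, as 4.3.8 describes.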
   4.3.9 Data descriptor:

      crc-32                          4 bytes
      compressed size                 4 bytes
      uncompressed size               4 bytes

      4.3.9.1 This descriptor MUST exist if bit 3 of the general
      purpose bit flag is set (see below). It is byte aligned
      and immediately follows the last byte of compressed data.
      This descriptor SHOULD be used only when it was not possible to
      seek in the output .ZIP file, e.g., when the output .ZIP file
      was standard output or a non-seekable device. For ZIP64(tm) format
      archives, the compressed and uncompressed sizes are 8 bytes each.

      4.3.9.2 When compressing files, compressed and uncompressed sizes
      SHOULD be stored in ZIP64 format (as 8 byte values) when a
      file's size exceeds 0xFFFFFFFF. However ZIP64 format MAY be
      used regardless of the size of a file. When extracting, if
      the zip64 extended information extra field is present for
      the file the compressed and uncompressed sizes will be 8
      byte values.

      4.3.9.3 Although not originally assigned a signature, the value
      0x08074b50 has commonly been adopted as a signature value
      for the data descriptor record. Implementers SHOULD be
      aware that ZIP files MAY be encountered with or without this
      signature marking data descriptors and SHOULD account for
      either case when reading ZIP files to ensure compatibility.

      4.3.9.4 When writing ZIP files, implementers SHOULD include the
      signature value marking the data descriptor record. When
      the signature is used, the fields currently defined for
      the data descriptor record will immediately follow the
      signature.

      4.3.9.5 An extensible data descriptor will be released in a
      future version of this APPNOTE. This new record is intended to
      resolve conflicts with the use of this record going forward,
      and to provide better support for streamed file processing.

      4.3.9.6 When the Central Directory Encryption method is used,
      the data descriptor record is not required, but MAY be used.
      If present, and bit 3 of the general purpose bit field is set to
      indicate its presence, the values in fields of the data descriptor
      record MUST be set to binary zeros. See the section on the Strong
      Encryption Specification for information. Refer to the section in
      this document entitled "Incorporating PKWARE Proprietary Technology
      into Your Product" for more information.

   4.3.10 Archive decryption header:

      4.3.10.1 The Archive Decryption Header is introduced in version 6.2
      of the ZIP format specification. This record exists in support
      of the Central Directory Encryption Feature implemented as part of
      the Strong Encryption Specification as described in this document.
      When the Central Directory Structure is encrypted, this decryption
      header MUST precede the encrypted data segment.

      4.3.10.2 The encrypted data segment SHALL consist of the Archive
      extra data record (if present) and the encrypted Central Directory
      Structure data. The format of this data record is identical to the
      Decryption header record preceding compressed file data. If the
      central directory structure is encrypted, the location of the start of
      this data record is determined using the Start of Central Directory
      field in the Zip64 End of Central Directory record. See the
      section on the Strong Encryption Specification for information
      on the fields used in the Archive Decryption Header record.
      Refer to the section in this document entitled "Incorporating
      PKWARE Proprietary Technology into Your Product" for more information.

   4.3.11 Archive extra data record:

      archive extra data signature    4 bytes  (0x08064b50)
      extra field length              4 bytes
      extra field data                (variable size)

      4.3.11.1 The Archive Extra Data Record is introduced in version 6.2
      of the ZIP format specification. This record MAY be used in support
      of the Central Directory Encryption Feature implemented as part of
      the Strong Encryption Specification as described in this document.
      When present, this record MUST immediately precede the central
      directory data structure.

      4.3.11.2 The size of this data record SHALL be included in the
      Size of the Central Directory field in the End of Central
      Directory record. If the central directory structure is compressed,
      but not encrypted, the location of the start of this data record is
      determined using the Start of Central Directory field in the Zip64
      End of Central Directory record. Refer to the section in this document
      entitled "Incorporating PKWARE Proprietary Technology into Your
      Product" for more information.

   4.3.12 Central directory structure:

      [central directory header 1]
      .
      .
      .
      [central directory header n]
      [digital signature]

      File header:

        central file header signature   4 bytes  (0x02014b50)
        version made by                 2 bytes
        version needed to extract       2 bytes
        general purpose bit flag        2 bytes
        compression method              2 bytes
        last mod file time              2 bytes
        last mod file date              2 bytes
        crc-32                          4 bytes
        compressed size                 4 bytes
        uncompressed size               4 bytes
        file name length                2 bytes
        extra field length              2 bytes
        file comment length             2 bytes
        disk number start               2 bytes
        internal file attributes        2 bytes
        external file attributes        4 bytes
        relative offset of local header 4 bytes

        file name (variable size)
        extra field (variable size)
        file comment (variable size)

   4.3.13 Digital signature:

        header signature                4 bytes  (0x05054b50)
        size of data                    2 bytes
        signature data (variable size)

      With the introduction of the Central Directory Encryption
      feature in version 6.2 of this specification, the Central
      Directory Structure MAY be stored both compressed and encrypted.
      Although not required, it is assumed when encrypting the
      Central Directory Structure, that it will be compressed
      for greater storage efficiency. Information on the
      Central Directory Encryption feature can be found in the section
      describing the Strong Encryption Specification. The Digital
      Signature record will be neither compressed nor encrypted.

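The central file header in 4.3.12 has 46 bytes of fixed fields followed by three variable-size fields. A minimal parsing sketch follows; the names `CentralHeader` and `parse_central_header` are hypothetical, and the sketch assumes an unencrypted, uncompressed central directory and does not interpret the extra field.

```python
import struct
from collections import namedtuple

CENTRAL_HEADER_SIGNATURE = 0x02014B50
# Little-endian, unpadded; field order follows the table in 4.3.12.
CENTRAL_HEADER_FORMAT = "<IHHHHHHIIIHHHHHII"
CENTRAL_HEADER_SIZE = struct.calcsize(CENTRAL_HEADER_FORMAT)  # 46 bytes

CentralHeader = namedtuple("CentralHeader", [
    "signature", "version_made_by", "version_needed", "flags",
    "method", "mod_time", "mod_date", "crc32",
    "compressed_size", "uncompressed_size",
    "name_length", "extra_length", "comment_length",
    "disk_number_start", "internal_attrs", "external_attrs",
    "local_header_offset",
    "file_name", "extra_field", "file_comment",
])


def parse_central_header(buf: bytes, pos: int = 0) -> CentralHeader:
    """Hypothetical helper: parse one central file header at buf[pos:]."""
    fixed = struct.unpack_from(CENTRAL_HEADER_FORMAT, buf, pos)
    if fixed[0] != CENTRAL_HEADER_SIGNATURE:
        raise ValueError("central file header signature not found")
    name_len, extra_len, comment_len = fixed[10], fixed[11], fixed[12]
    # Variable-size fields follow the fixed part in this order.
    name_start = pos + CENTRAL_HEADER_SIZE
    extra_start = name_start + name_len
    comment_start = extra_start + extra_len
    return CentralHeader(
        *fixed,
        buf[name_start:extra_start],
        buf[extra_start:comment_start],
        buf[comment_start:comment_start + comment_len],
    )


# A hand-built header for illustration only (values are made up).
sample = struct.pack(
    CENTRAL_HEADER_FORMAT, CENTRAL_HEADER_SIGNATURE, 0x031E, 20, 0, 8,
    0, 0x0021, 0x12345678, 3, 5, 5, 0, 2, 0, 0, 0, 64,
) + b"a.txt" + b"hi"
entry = parse_central_header(sample)
print(entry.file_name, entry.local_header_offset)
```

The next header, if any, would start at `pos + CENTRAL_HEADER_SIZE` plus the three variable lengths.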
   4.3.14 Zip64 end of central directory record

        zip64 end of central dir
        signature                       4 bytes  (0x06064b50)
        size of zip64 end of central
        directory record                8 bytes
        version made by                 2 bytes
        version needed to extract       2 bytes
        number of this disk             4 bytes
        number of the disk with the
        start of the central directory  4 bytes
        total number of entries in the
        central directory on this disk  8 bytes
        total number of entries in the
        central directory               8 bytes
        size of the central directory   8 bytes
        offset of start of central
        directory with respect to
        the starting disk number        8 bytes
        zip64 extensible data sector    (variable size)

      4.3.14.1 The value stored into the "size of zip64 end of central
      directory record" SHOULD be the size of the remaining
      record and SHOULD NOT include the leading 12 bytes.

      Size = SizeOfFixedFields + SizeOfVariableData - 12.

      4.3.14.2 The above record structure defines Version 1 of the
      zip64 end of central directory record. Version 1 was
      implemented in versions of this specification preceding
      6.2 in support of the ZIP64 large file feature. The
      introduction of the Central Directory Encryption feature
      implemented in version 6.2 as part of the Strong Encryption
      Specification defines Version 2 of this record structure.
      Refer to the section describing the Strong Encryption
      Specification for details on the version 2 format for
      this record. Refer to the section in this document entitled
      "Incorporating PKWARE Proprietary Technology into Your Product"
      for more information applicable to use of Version 2 of this
      record.

      4.3.14.3 Special purpose data MAY reside in the zip64 extensible
      data sector field following either a V1 or V2 version of this
      record. To ensure identification of this special purpose data
      it MUST include an identifying header block consisting of the
      following:

         Header ID  -  2 bytes
         Data Size  -  4 bytes

      The Header ID field indicates the type of data that is in the
      data block that follows.

      Data Size identifies the number of bytes that follow for this
      data block type.

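The size rule in 4.3.14.1 is easy to get wrong by 12 bytes, so a short worked sketch may help. It packs a Version 1 record for a single-disk archive; the function name `pack_zip64_eocd` and its parameters are hypothetical, and the version values of 45 are an assumption based on the "4.5 - File uses ZIP64 format extensions" entry in 4.4.3.2.

```python
import struct

ZIP64_EOCD_SIGNATURE = 0x06064B50
# Little-endian, unpadded; field order follows the table in 4.3.14.
ZIP64_EOCD_FORMAT = "<IQHHIIQQQQ"


def pack_zip64_eocd(entries: int, cd_size: int, cd_offset: int,
                    extensible_data: bytes = b"") -> bytes:
    """Hypothetical helper: Version 1 zip64 end of central directory record."""
    size_of_fixed_fields = struct.calcsize(ZIP64_EOCD_FORMAT)  # 56 bytes
    # Size = SizeOfFixedFields + SizeOfVariableData - 12: the signature
    # (4 bytes) and the size field itself (8 bytes) are not counted.
    record_size = size_of_fixed_fields + len(extensible_data) - 12
    fixed = struct.pack(
        ZIP64_EOCD_FORMAT,
        ZIP64_EOCD_SIGNATURE,
        record_size,
        45,         # version made by (assumed: 4.5)
        45,         # version needed to extract (assumed: 4.5)
        0,          # number of this disk
        0,          # disk with the start of the central directory
        entries,    # entries in the central directory on this disk
        entries,    # total entries in the central directory
        cd_size,    # size of the central directory
        cd_offset,  # offset of start of central directory
    )
    return fixed + extensible_data


record = pack_zip64_eocd(entries=1, cd_size=100, cd_offset=200)
stored_size = struct.unpack_from("<Q", record, 4)[0]
print(len(record), stored_size)
```

With an empty extensible data sector the stored size works out to 56 - 12 = 44.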
      4.3.14.4 Multiple special purpose data blocks MAY be present.
      Each MUST be preceded by a Header ID and Data Size field. Current
      mappings of Header ID values supported in this field are as
      defined in APPENDIX C.

   4.3.15 Zip64 end of central directory locator

      zip64 end of central dir locator
      signature                       4 bytes  (0x07064b50)
      number of the disk with the
      start of the zip64 end of
      central directory               4 bytes
      relative offset of the zip64
      end of central directory record 8 bytes
      total number of disks           4 bytes

   4.3.16 End of central directory record:

      end of central dir signature    4 bytes  (0x06054b50)
      number of this disk             2 bytes
      number of the disk with the
      start of the central directory  2 bytes
      total number of entries in the
      central directory on this disk  2 bytes
      total number of entries in
      the central directory           2 bytes
      size of the central directory   4 bytes
      offset of start of central
      directory with respect to
      the starting disk number        4 bytes
      .ZIP file comment length        2 bytes
      .ZIP file comment       (variable size)

4.4 Explanation of fields
--------------------------

   4.4.1 General notes on fields

      4.4.1.1 All fields unless otherwise noted are unsigned and stored
      in Intel low-byte:high-byte, low-word:high-word order.

      4.4.1.2 String fields are not null terminated, since the length
      is given explicitly.

      4.4.1.3 The entries in the central directory MAY NOT necessarily
      be in the same order that files appear in the .ZIP file.

      4.4.1.4 If one of the fields in the end of central directory
      record is too small to hold required data, the field SHOULD be
      set to -1 (0xFFFF or 0xFFFFFFFF) and the ZIP64 format record
      SHOULD be created.

      4.4.1.5 The end of central directory record and the Zip64 end
      of central directory locator record MUST reside on the same
      disk when splitting or spanning an archive.

   4.4.2 version made by (2 bytes)

      4.4.2.1 The upper byte indicates the compatibility of the file
      attribute information.
      If the external file attributes
      are compatible with MS-DOS and can be read by PKZIP for
      DOS version 2.04g then this value will be zero. If these
      attributes are not compatible, then this value will
      identify the host system on which the attributes are
      compatible. Software can use this information to determine
      the line record format for text files etc.

      4.4.2.2 The current mappings are:

         0 - MS-DOS and OS/2 (FAT / VFAT / FAT32 file systems)
         1 - Amiga                     2 - OpenVMS
         3 - UNIX                      4 - VM/CMS
         5 - Atari ST                  6 - OS/2 H.P.F.S.
         7 - Macintosh                 8 - Z-System
         9 - CP/M                     10 - Windows NTFS
        11 - MVS (OS/390 - Z/OS)      12 - VSE
        13 - Acorn Risc               14 - VFAT
        15 - alternate MVS            16 - BeOS
        17 - Tandem                   18 - OS/400
        19 - OS X (Darwin)            20 thru 255 - unused

      4.4.2.3 The lower byte indicates the ZIP specification version
      (the version of this document) supported by the software
      used to encode the file. The value/10 indicates the major
      version number, and the value mod 10 is the minor version
      number.

   4.4.3 version needed to extract (2 bytes)

      4.4.3.1 The minimum supported ZIP specification version needed
      to extract the file, mapped as above. This value is based on
      the specific format features a ZIP program MUST support to
      be able to extract the file. If multiple features are
      applied to a file, the minimum version MUST be set to the
      feature having the highest value. New features or feature
      changes affecting the published format specification will be
      implemented using higher version numbers than the last
      published value to avoid conflict.

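The split described in 4.4.2.1 and 4.4.2.3 (host system in the upper byte, specification version in the lower byte) can be sketched as follows. The names `HOST_SYSTEMS` and `decode_version_made_by` are hypothetical; the table is copied from 4.4.2.2, and any value from 20 through 255 is reported as "unused".

```python
# Host system mappings from 4.4.2.2.
HOST_SYSTEMS = {
    0: "MS-DOS and OS/2 (FAT / VFAT / FAT32 file systems)",
    1: "Amiga", 2: "OpenVMS", 3: "UNIX", 4: "VM/CMS", 5: "Atari ST",
    6: "OS/2 H.P.F.S.", 7: "Macintosh", 8: "Z-System", 9: "CP/M",
    10: "Windows NTFS", 11: "MVS (OS/390 - Z/OS)", 12: "VSE",
    13: "Acorn Risc", 14: "VFAT", 15: "alternate MVS", 16: "BeOS",
    17: "Tandem", 18: "OS/400", 19: "OS X (Darwin)",
}


def decode_version_made_by(value: int):
    """Hypothetical helper: return (host system, major, minor)."""
    host = (value >> 8) & 0xFF   # upper byte: attribute compatibility
    spec = value & 0xFF          # lower byte: specification version
    # value/10 is the major version, value mod 10 the minor version.
    return HOST_SYSTEMS.get(host, "unused"), spec // 10, spec % 10


print(decode_version_made_by(0x033F))
```

The lower-byte arithmetic applies equally to the "version needed to extract" field, which is "mapped as above" per 4.4.3.1.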
      4.4.3.2 Current minimum feature versions are as defined below:

         1.0 - Default value
         1.1 - File is a volume label
         2.0 - File is a folder (directory)
         2.0 - File is compressed using Deflate compression
         2.0 - File is encrypted using traditional PKWARE encryption
         2.1 - File is compressed using Deflate64(tm)
         2.5 - File is compressed using PKWARE DCL Implode
         2.7 - File is a patch data set
         4.5 - File uses ZIP64 format extensions
         4.6 - File is compressed using BZIP2 compression*
         5.0 - File is encrypted using DES
         5.0 - File is encrypted using 3DES
         5.0 - File is encrypted using original RC2 encryption
         5.0 - File is encrypted using RC4 encryption
         5.1 - File is encrypted using AES encryption
         5.1 - File is encrypted using corrected RC2 encryption**
         5.2 - File is encrypted using corrected RC2-64 encryption**
         6.1 - File is encrypted using non-OAEP key wrapping***
         6.2 - Central directory encryption
         6.3 - File is compressed using LZMA
         6.3 - File is compressed using PPMd+
         6.3 - File is encrypted using Blowfish
         6.3 - File is encrypted using Twofish

      4.4.3.3 Notes on version needed to extract

      * Early 7.x (pre-7.2) versions of PKZIP incorrectly set the
      version needed to extract for BZIP2 compression to be 50
      when it SHOULD have been 46.

      ** Refer to the section on Strong Encryption Specification
      for additional information regarding RC2 corrections.

      *** Certificate encryption using non-OAEP key wrapping is the
      intended mode of operation for all versions beginning with 6.1.
      Support for OAEP key wrapping MUST only be used for
      backward compatibility when sending ZIP files to be opened by
      versions of PKZIP older than 6.1 (5.0 or 6.0).

      + Files compressed using PPMd MUST set the version
      needed to extract field to 6.3, however, not all ZIP
      programs enforce this and MAY be unable to decompress
      data files compressed using PPMd if this value is set.

      When using ZIP64 extensions, the corresponding value in the
      zip64 end of central directory record MUST also be set.
      This field SHOULD be set appropriately to indicate whether
      Version 1 or Version 2 format is in use.

   4.4.4 general purpose bit flag: (2 bytes)

      Bit 0: If set, indicates that the file is encrypted.

      (For Method 6 - Imploding)
      Bit 1: If the compression method used was type 6,
             Imploding, then this bit, if set, indicates
             an 8K sliding dictionary was used. If clear,
             then a 4K sliding dictionary was used.

      Bit 2: If the compression method used was type 6,
             Imploding, then this bit, if set, indicates
             3 Shannon-Fano trees were used to encode the
             sliding dictionary output. If clear, then 2
             Shannon-Fano trees were used.

      (For Methods 8 and 9 - Deflating)
      Bit 2  Bit 1
        0      0    Normal (-en) compression option was used.
        0      1    Maximum (-exx/-ex) compression option was used.
        1      0    Fast (-ef) compression option was used.
        1      1    Super Fast (-es) compression option was used.

      (For Method 14 - LZMA)
      Bit 1: If the compression method used was type 14,
             LZMA, then this bit, if set, indicates
             an end-of-stream (EOS) marker is used to
             mark the end of the compressed data stream.
             If clear, then an EOS marker is not present
             and the compressed data size must be known
             to extract.

      Note:  Bits 1 and 2 are undefined if the compression
             method is any other.

      Bit 3: If this bit is set, the fields crc-32, compressed
             size and uncompressed size are set to zero in the
             local header. The correct values are put in the
             data descriptor immediately following the compressed
             data. (Note: PKZIP version 2.04g for DOS only
             recognizes this bit for method 8 compression, newer
             versions of PKZIP recognize this bit for any
             compression method.)

      Bit 4: Reserved for use with method 8, for enhanced
             deflating.

      Bit 5: If this bit is set, this indicates that the file is
             compressed patched data. (Note: Requires PKZIP
             version 2.70 or greater)

      Bit 6: Strong encryption. If this bit is set, you MUST
             set the version needed to extract value to at least
             50 and you MUST also set bit 0.
             If AES encryption is used, the version needed to
             extract value MUST be at least 51. See the section
             describing the Strong Encryption Specification for
             details. Refer to the section in this document entitled
             "Incorporating PKWARE Proprietary Technology into Your
             Product" for more information.

      Bit 7: Currently unused.

      Bit 8: Currently unused.

      Bit 9: Currently unused.

      Bit 10: Currently unused.

      Bit 11: Language encoding flag (EFS). If this bit is set,
              the filename and comment fields for this file
              MUST be encoded using UTF-8. (see APPENDIX D)

      Bit 12: Reserved by PKWARE for enhanced compression.

      Bit 13: Set when encrypting the Central Directory to indicate
              selected data values in the Local Header are masked to
              hide their actual values. See the section describing
              the Strong Encryption Specification for details. Refer
              to the section in this document entitled "Incorporating
              PKWARE Proprietary Technology into Your Product" for
              more information.

      Bit 14: Reserved by PKWARE.

      Bit 15: Reserved by PKWARE.

   4.4.5 compression method: (2 bytes)

       0 - The file is stored (no compression)
       1 - The file is Shrunk
       2 - The file is Reduced with compression factor 1
       3 - The file is Reduced with compression factor 2
       4 - The file is Reduced with compression factor 3
       5 - The file is Reduced with compression factor 4
       6 - The file is Imploded
       7 - Reserved for Tokenizing compression algorithm
       8 - The file is Deflated
       9 - Enhanced Deflating using Deflate64(tm)
      10 - PKWARE Data Compression Library Imploding (old IBM TERSE)
      11 - Reserved by PKWARE
      12 - File is compressed using BZIP2 algorithm
      13 - Reserved by PKWARE
      14 - LZMA
      15 - Reserved by PKWARE
      16 - IBM z/OS CMPSC Compression
      17 - Reserved by PKWARE
      18 - File is compressed using IBM TERSE (new)
      19 - IBM LZ77 z Architecture (PFS)
      96 - JPEG variant
      97 - WavPack compressed data
      98 - PPMd version I, Rev 1
      99 - AE-x encryption marker (see APPENDIX E)

      4.4.5.1 Methods 1-6 are legacy algorithms and are no longer
      recommended for use when compressing files.

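Because bits 1 and 2 of the general purpose bit flag change meaning with the compression method (4.4.4), decoding them needs both fields. The sketch below is illustrative only: `describe_flags` and its output strings are hypothetical, and reserved or unused bits are ignored.

```python
def describe_flags(flags: int, method: int) -> list:
    """Hypothetical helper: list the documented meanings of set flag bits."""
    notes = []
    if flags & (1 << 0):
        notes.append("encrypted")

    # Bits 1 and 2 depend on the compression method (4.4.5).
    bit1 = bool(flags & (1 << 1))
    bit2 = bool(flags & (1 << 2))
    if method == 6:  # Imploding
        notes.append("8K dictionary" if bit1 else "4K dictionary")
        notes.append("3 Shannon-Fano trees" if bit2 else "2 Shannon-Fano trees")
    elif method in (8, 9):  # Deflating
        options = {
            (False, False): "normal",
            (False, True): "maximum",
            (True, False): "fast",
            (True, True): "super fast",
        }
        # The table in 4.4.4 lists the columns as (bit 2, bit 1).
        notes.append(options[(bit2, bit1)] + " deflate option")
    elif method == 14:  # LZMA
        notes.append("EOS marker used" if bit1 else "no EOS marker")

    if flags & (1 << 3):
        notes.append("crc-32 and sizes in data descriptor")
    if flags & (1 << 5):
        notes.append("compressed patched data")
    if flags & (1 << 6):
        notes.append("strong encryption")
    if flags & (1 << 11):
        notes.append("UTF-8 file name and comment")
    if flags & (1 << 13):
        notes.append("local header values masked")
    return notes


print(describe_flags(0x0808, 8))
```

A streaming writer that cannot seek back would typically set bit 3 (0x0008), and bit 11 (0x0800) when names are UTF-8, which is the combination shown in the call above.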
   4.4.6 date and time fields: (2 bytes each)

      The date and time are encoded in standard MS-DOS format.
      If input came from standard input, the date and time are
      those at which compression was started for this data.
      If encrypting the central directory and general purpose bit
      flag 13 is set indicating masking, the value stored in the
      Local Header will be zero. MS-DOS time format is different
      from more commonly used computer time formats such as
      UTC. For example, MS-DOS uses year values relative to 1980
      and 2 second precision.

   4.4.7 CRC-32: (4 bytes)

      The CRC-32 algorithm was generously contributed by
      David Schwaderer and can be found in his excellent
      book "C Programmers Guide to NetBIOS" published by
      Howard W. Sams & Co. Inc. The 'magic number' for
      the CRC is 0xdebb20e3. The proper CRC pre and post
      conditioning is used, meaning that the CRC register
      is pre-conditioned with all ones (a starting value
      of 0xffffffff) and the value is post-conditioned by
      taking the one's complement of the CRC residual.
      If bit 3 of the general purpose flag is set, this
      field is set to zero in the local header and the correct
      value is put in the data descriptor and in the central
      directory. When encrypting the central directory, if the
      local header is not in ZIP64 format and general purpose
      bit flag 13 is set indicating masking, the value stored
      in the Local Header will be zero.

   4.4.8 compressed size: (4 bytes)
   4.4.9 uncompressed size: (4 bytes)

      The size of the file compressed (4.4.8) and uncompressed,
      (4.4.9) respectively. When a decryption header is present it
      will be placed in front of the file data and the value of the
      compressed file size will include the bytes of the decryption
      header. If bit 3 of the general purpose bit flag is set,
      these fields are set to zero in the local header and the
      correct values are put in the data descriptor and
      in the central directory.
        If an archive is in ZIP64 format and the value in this field
        is 0xFFFFFFFF, the size will be in the corresponding 8 byte
        ZIP64 extended information extra field. When encrypting the
        central directory, if the local header is not in ZIP64 format
        and general purpose bit flag 13 is set indicating masking, the
        value stored for the uncompressed size in the Local Header
        will be zero.

   4.4.10 file name length: (2 bytes)
   4.4.11 extra field length: (2 bytes)
   4.4.12 file comment length: (2 bytes)

        The length of the file name, extra field, and comment
        fields respectively. The combined length of any
        directory record and these three fields SHOULD NOT
        generally exceed 65,535 bytes. If input came from standard
        input, the file name length is set to zero.

   4.4.13 disk number start: (2 bytes)

        The number of the disk on which this file begins. If an
        archive is in ZIP64 format and the value in this field is
        0xFFFF, the disk number will be in the corresponding 4 byte
        zip64 extended information extra field.

   4.4.14 internal file attributes: (2 bytes)

        Bits 1 and 2 are reserved for use by PKWARE.

        4.4.14.1 The lowest bit of this field indicates, if set,
        that the file is apparently an ASCII or text file. If not
        set, the file apparently contains binary data.
        The remaining bits are unused in version 1.0.

        4.4.14.2 The 0x0002 bit of this field indicates, if set, that
        a 4 byte variable record length control field precedes each
        logical record indicating the length of the record. The
        record length control field is stored in little-endian byte
        order. This flag is independent of text control characters,
        and if used in conjunction with text data, includes any
        control characters in the total length of the record. This
        value is provided for mainframe data transfer support.

   4.4.15 external file attributes: (4 bytes)

        The mapping of the external attributes is
        host-system dependent (see 'version made by'). For
        MS-DOS, the low order byte is the MS-DOS directory
        attribute byte.
        If input came from standard input, this field is set to zero.

   4.4.16 relative offset of local header: (4 bytes)

        This is the offset from the start of the first disk on
        which this file appears, to where the local header SHOULD
        be found. If an archive is in ZIP64 format and the value
        in this field is 0xFFFFFFFF, the offset will be in the
        corresponding 8 byte zip64 extended information extra field.

   4.4.17 file name: (Variable)

        4.4.17.1 The name of the file, with optional relative path.
        The path stored MUST NOT contain a drive or
        device letter, or a leading slash. All slashes
        MUST be forward slashes '/' as opposed to
        backwards slashes '\' for compatibility with Amiga
        and UNIX file systems etc. If input came from standard
        input, there is no file name field.

        4.4.17.2 If using the Central Directory Encryption Feature and
        general purpose bit flag 13 is set indicating masking, the file
        name stored in the Local Header will not be the actual file name.
        A masking value consisting of a unique hexadecimal value will
        be stored. This value will be sequentially incremented for each
        file in the archive. See the section on the Strong Encryption
        Specification for details on retrieving the encrypted file name.
        Refer to the section in this document entitled "Incorporating
        PKWARE Proprietary Technology into Your Product" for more
        information.

   4.4.18 file comment: (Variable)

        The comment for this file.

   4.4.19 number of this disk: (2 bytes)

        The number of this disk, which contains the central
        directory end record. If an archive is in ZIP64 format
        and the value in this field is 0xFFFF, the disk number
        will be in the corresponding 4 byte zip64 end of central
        directory field.

   4.4.20 number of the disk with the start of the central
          directory: (2 bytes)

        The number of the disk on which the central
        directory starts. If an archive is in ZIP64 format
        and the value in this field is 0xFFFF, the disk number
        will be in the corresponding 4 byte zip64 end of central
        directory field.
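
The file name rules in 4.4.17.1 can be applied before a name is written
to a header. The Python sketch below is one possible approach, not a
prescribed one. The helper name normalize_entry_name is hypothetical,
and the sketch assumes the input is a Windows or UNIX style path in
which a backslash is always a separator and a leading letter followed
by a colon is always a drive letter.

```python
import re


def normalize_entry_name(path: str) -> str:
    """Return a path that satisfies the constraints of 4.4.17.1."""
    # All slashes MUST be forward slashes.
    name = path.replace("\\", "/")
    # The stored path MUST NOT contain a drive letter such as "C:".
    name = re.sub(r"^[A-Za-z]:", "", name)
    # The stored path MUST NOT start with a slash.
    name = name.lstrip("/")
    if not name:
        raise ValueError("path is empty after normalization")
    return name
```

For example, normalize_entry_name("C:\\docs\\a.txt") yields
"docs/a.txt", and a name that is already relative, such as
"docs/a.txt", is returned unchanged.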
   4.4.21 total number of entries in the central dir on
          this disk: (2 bytes)

        The number of central directory entries on this disk.
        If an archive is in ZIP64 format and the value in
        this field is 0xFFFF, the number of entries will be in
        the corresponding 8 byte zip64 end of central
        directory field.

   4.4.22 total number of entries in the central dir: (2 bytes)

        The total number of files in the .ZIP file. If an
        archive is in ZIP64 format and the value in this field
        is 0xFFFF, the number of entries will be in the
        corresponding 8 byte zip64 end of central directory field.

   4.4.23 size of the central directory: (4 bytes)

        The size (in bytes) of the entire central directory.
        If an archive is in ZIP64 format and the value in
        this field is 0xFFFFFFFF, the size will be in the
        corresponding 8 byte zip64 end of central
        directory field.

   4.4.24 offset of start of central directory with respect to
          the starting disk number: (4 bytes)

        Offset of the start of the central directory on the
        disk on which the central directory starts. If an
        archive is in ZIP64 format and the value in this
        field is 0xFFFFFFFF, the offset will be in the
        corresponding 8 byte zip64 end of central
        directory field.

   4.4.25 .ZIP file comment length: (2 bytes)

        The length of the comment for this .ZIP file.

   4.4.26 .ZIP file comment: (Variable)

        The comment for this .ZIP file. ZIP file comment data
        is stored unsecured. No encryption or data authentication
        is applied to this area at this time. Confidential information
        SHOULD NOT be stored in this section.

   4.4.27 zip64 extensible data sector (variable size)

        (currently reserved for use by PKWARE)

   4.4.28 extra field: (Variable)

        This SHOULD be used for storage expansion. If additional
        information needs to be stored within a ZIP file for special
        application or platform needs, it SHOULD be stored here.
        Programs supporting earlier versions of this specification can
        then safely skip the file, and find the next file or header.
        This field will be 0 length in version 1.0.
        Existing extra fields are defined in the section
        Extensible data fields that follows.

4.5 Extensible data fields
--------------------------

   4.5.1 In order to allow different programs and different types
   of information to be stored in the 'extra' field in .ZIP
   files, the following structure MUST be used for all
   programs storing data in this field:

        header1+data1 + header2+data2 . . .

   Each header MUST consist of:

        Header ID - 2 bytes
        Data Size - 2 bytes

   Note: all fields stored in Intel low-byte/high-byte order.

   The Header ID field indicates the type of data that is in
   the following data block.

   Header IDs of 0 thru 31 are reserved for use by PKWARE.
   The remaining IDs can be used by third party vendors for
   proprietary usage.

   4.5.2 The current Header ID mappings defined by PKWARE are:

        0x0001        Zip64 extended information extra field
        0x0007        AV Info
        0x0008        Reserved for extended language encoding data (PFS)
                      (see APPENDIX D)
        0x0009        OS/2
        0x000a        NTFS
        0x000c        OpenVMS
        0x000d        UNIX
        0x000e        Reserved for file stream and fork descriptors
        0x000f        Patch Descriptor
        0x0014        PKCS#7 Store for X.509 Certificates
        0x0015        X.509 Certificate ID and Signature for
                      individual file
        0x0016        X.509 Certificate ID for Central Directory
        0x0017        Strong Encryption Header
        0x0018        Record Management Controls
        0x0019        PKCS#7 Encryption Recipient Certificate List
        0x0020        Reserved for Timestamp record
        0x0021        Policy Decryption Key Record
        0x0022        Smartcrypt Key Provider Record
        0x0023        Smartcrypt Policy Key Data Record
        0x0065        IBM S/390 (Z390), AS/400 (I400) attributes
                      - uncompressed
        0x0066        Reserved for IBM S/390 (Z390), AS/400 (I400)
                      attributes - compressed
        0x4690        POSZIP 4690 (reserved)

   4.5.3 -Zip64 Extended Information Extra Field (0x0001):

      The following is the layout of the zip64 extended
      information "extra" block. If one of the size or
      offset fields in the Local or Central directory
      record is too small to hold the required data,
      a Zip64 extended information record is created.
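
The header+data structure from 4.5.1 can be walked with a short loop.
The Python sketch below is illustrative only; iter_extra_fields is a
hypothetical name, and raising ValueError on a malformed field is a
design choice of this sketch rather than behavior required by this
document.

```python
import struct


def iter_extra_fields(extra: bytes):
    """Yield (header_id, data) pairs from a raw 'extra' field."""
    offset = 0
    while offset + 4 <= len(extra):
        # Header ID and Data Size are 2 bytes each, low byte first.
        header_id, size = struct.unpack_from("<HH", extra, offset)
        offset += 4
        data = extra[offset:offset + size]
        if len(data) != size:
            raise ValueError("data block is shorter than its Data Size")
        yield header_id, data
        offset += size
    if offset != len(extra):
        raise ValueError("trailing bytes do not form a complete header")
```

A reader that only cares about one block type can filter on the Header
ID, for example 0x0001 for the Zip64 extended information extra field,
and skip every other block without interpreting its data.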
      The order of the fields in the zip64 extended
      information record is fixed, but the fields MUST
      only appear if the corresponding Local or Central
      directory record field is set to 0xFFFF or 0xFFFFFFFF.

      Note: all fields stored in Intel low-byte/high-byte order.

        Value                   Size     Description
        -----                   ----     -----------
(ZIP64) 0x0001                  2 bytes  Tag for this "extra" block type
        Size                    2 bytes  Size of this "extra" block
        Original Size           8 bytes  Original uncompressed file size
        Compressed Size         8 bytes  Size of compressed data
        Relative Header Offset  8 bytes  Offset of local header record
        Disk Start Number       4 bytes  Number of the disk on which