Internet-Draft | CCSV | June 2024 |
Rankin | Expires 20 December 2024 |
This document specifies the format used for Control-Character-Separated Values (CCSV) files and registers the associated media type "text/ccsv".¶
This note is to be removed before publishing as an RFC.¶
The latest revision of this draft can be found at https://oldgrognard.github.io/ccsv-id/draft-rankin-ccsv.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-rankin-ccsv/.¶
Source for this draft and an issue tracker can be found at https://github.com/oldgrognard/ccsv-id.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 20 December 2024.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
CCSV (Control-Character-Separated Values) is a file format for moving data between spreadsheets, statistical analysis programs, databases, and any other programs that work with rectangular data. It is very similar to Comma-Separated Values (CSV) files [RFC4180], Tab-Separated Values (TSV) files, and their derivatives. Unlike those formats, CCSV reduces ambiguity by using non-printable control characters as delimiters. The two delimiter characters may not appear in the content of a field, so there is no need to escape certain characters or to wrap certain strings in additional delimiters. This document defines the format of CCSV files and formally registers the "text/ccsv" media type for CCSV in accordance with [RFC6838].¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
A CCSV file MUST adhere to the following formatting rules:¶
A Record Separator, RS (U+001E), is used between records in the file, including between the header and the first record.¶
A Unit Separator, US (U+001F), is used between fields in a record.¶
A CCSV MUST begin with a header. The header consists of the names of the columns separated with US (U+001F) entities.¶
The header is terminated with the RS (U+001E) entity if the document contains any records.¶
The header and each record MUST contain the same number of US (U+001F) entities; that is, the header and each record MUST have the same number of fields.¶
Each record in the body is delimited with an RS (U+001E) entity. Note that carriage returns and line feeds are not part of the delimiter and are valid characters in the body of a field.¶
Each field within a record is delimited with the US (U+001F) entity.¶
The US (U+001F) entity and the RS (U+001E) separator MUST NOT appear in the body of a field.¶
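The rules above can be illustrated with a minimal Python sketch of a writer. The function name `encode_ccsv` is hypothetical and not part of this document. The sketch assumes that every row, including the last, is followed by RS, which the grammar in this document permits, and that fields are already text strings.

```python
RS = "\x1e"  # Record Separator, U+001E
US = "\x1f"  # Unit Separator, U+001F


def encode_ccsv(header, records):
    """Serialize a header and records as CCSV text (hypothetical helper)."""
    header = list(header)
    if not header:
        raise ValueError("a CCSV file must begin with a header")
    rows = [header] + [list(record) for record in records]
    for row in rows:
        # The header and every record must have the same number of fields.
        if len(row) != len(header):
            raise ValueError("record does not match the header field count")
        for field in row:
            # The two delimiters must never appear inside a field.
            if RS in field or US in field:
                raise ValueError("fields must not contain RS or US")
    # Assumption: each row is terminated with RS, including the last one.
    return "".join(US.join(row) + RS for row in rows)


# Commas, quotes, and line breaks pass through without any escaping.
text = encode_ccsv(["id", "note"], [["1", 'said "hi",\nthen left']])
```

Because the delimiters are rejected rather than escaped, a caller that may hold RS or US in its data has to decide separately how to handle such values.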
The ABNF grammar [STD68] appears as follows:¶
file   = header RS *(record RS) [record]
header = name *( US name )
record = field *( US field )
name   = field
field  = *VCHAR
VCHAR  = %x21-7E ; visible characters
RS     = %x1E    ; record separator
US     = %x1F    ; unit separator¶
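As an illustration only, a reader following the shape of this grammar might look like the Python sketch below. The name `decode_ccsv` is hypothetical. The sketch makes two assumptions: a trailing RS is read as a terminator rather than as an empty final record, and the VCHAR range is not enforced, since the formatting rules allow carriage returns and line feeds inside a field.

```python
RS = "\x1e"  # Record Separator, U+001E
US = "\x1f"  # Unit Separator, U+001F


def decode_ccsv(text):
    """Split CCSV text into (header, records); hypothetical helper."""
    if RS not in text:
        # The grammar requires RS after the header.
        raise ValueError("missing RS after the header")
    chunks = text.split(RS)
    # Assumption: a trailing RS leaves one empty chunk, which is treated
    # as a terminator and not as an empty final record.
    if chunks[-1] == "":
        chunks.pop()
    header = chunks[0].split(US)
    records = [chunk.split(US) for chunk in chunks[1:]]
    for number, record in enumerate(records, start=1):
        # The header and each record must have the same number of fields.
        if len(record) != len(header):
            raise ValueError(
                f"record {number} has {len(record)} fields, "
                f"expected {len(header)}"
            )
    return header, records


header, records = decode_ccsv("id\x1fname\x1e1\x1fAda\x1e2\x1fLin\x1e")
```

Splitting on the two delimiters is sufficient here because neither may occur inside a field, so no quoting or escape state has to be tracked.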
TODO Security¶
This section provides the media-type registration application (as per [RFC6838]).¶
Type name: text¶
Subtype name: ccsv¶
Required parameters: N/A¶
Optional parameters: N/A¶
Encoding considerations: utf-8¶
Security considerations:¶
Interoperability considerations:¶
Published specification: TBD¶
Applications that use this media type:¶
Fragment identifier considerations: N/A¶
Additional information:¶
Deprecated alias names for this type: N/A
Magic number(s): N/A
File extension(s): CCSV
Macintosh file type code(s): TEXT¶
Person & email address to contact for further information:¶
Intended usage: COMMON¶
Restrictions on usage: N/A¶
Author: Mike Rankin¶
Change controller:¶
Provisional registration?¶
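Until the registration is complete, an application may need to map the file extension to the media type itself. The Python sketch below uses the standard `mimetypes` module for that. The lowercase `.ccsv` spelling and the file name `report.ccsv` are assumptions made for illustration, and the mapping applies only to the running process.

```python
import mimetypes

# Assumption: files use a lowercase ".ccsv" extension.
mimetypes.add_type("text/ccsv", ".ccsv")

# "report.ccsv" is a hypothetical file name.
media_type, encoding = mimetypes.guess_type("report.ccsv")

# The registration defines no parameters, so none are appended.
content_type_header = f"Content-Type: {media_type}"
```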
TODO acknowledge.¶