

See the Data Visualization column for a series of articles


Use d3.dsvFormat(delimiter) to create a generic DSV parser and builder (hereafter referred to as dsv).

The delimiter parameter specifies the delimiter that separates the data values to be parsed or built. The returned object then exposes a rich set of parsing and building methods.

Building DSV data

The following methods build DSV data from an iterable, which is either an array of objects or a nested array:

  • dsv.format(rows[, columns]) builds DSV data (a string) from an iterable. It is the inverse of the dsv.parse() method

    The first argument, rows, is an iterable (an array of objects). Each element is an object that represents one data item (a row of the original table), and each of its properties corresponds to a column of that table

    The second (optional) argument, columns, is an array that specifies the column names (that is, the header) of the DSV. If it is omitted, D3 uses the union of the property names of all elements in the iterable as the header (the first row) of the resulting DSV.

    ⚠️ However, when D3 infers the columns automatically, the order of the columns is not guaranteed to be the same every time for the same iterable. In addition, field values that are null or undefined are converted to the empty string "". It is therefore recommended to pass the columns parameter, which states the header explicitly and gives precise control over the column order.

    const data = [{a: "1", b: "2", c: "3"}, {a: "4", b: "5", c: "6"}];

    const dsv = d3.dsvFormat(",");
    dsv.format(data, ["a", "b", "c"]); // "a,b,c\n1,2,3\n4,5,6"

    If a value contains the delimiter (for example ,), a double quotation mark ("), or a newline (\n), it is wrapped in double quotation marks in the output, and any double quotation mark inside the value is escaped by doubling it

    const data = [{a: '1,', b: '2\n', c: '"3"'}];

    const dsv = d3.dsvFormat(",");
    dsv.format(data, ["a", "b", "c"]); // 'a,b,c\n"1,","2\n","""3"""'

    💡 To build DSV data that contains only the table body and no header row from an iterable (an array of objects), use the dsv.formatBody(rows[, columns]) method

    const data = [{a: "1", b: "2", c: "3"}, {a: "4", b: "5", c: "6"}];

    const dsv = d3.dsvFormat(",");
    dsv.format(data, ["a", "b", "c"]); // "a,b,c\n1,2,3\n4,5,6"
    dsv.formatBody(data); // "1,2,3\n4,5,6"

    💡 Internally, these building methods use dsv.formatValue(value) to convert each data value, which ensures that the generated DSV data (a string) can be parsed correctly later

    const dsv = d3.dsvFormat(",");

    dsv.formatValue('a'); // 'a'
    dsv.formatValue('a,'); // '"a,"'
    dsv.formatValue('a\n'); // '"a\n"'
    dsv.formatValue(null); // ''
    dsv.formatValue(undefined); // ''
  • dsv.formatRows(rows) builds DSV data (a string) from an iterable. It is the inverse of the dsv.parseRows() method

    The first argument, rows, is an iterable (a nested array). Each element is itself an array of strings and represents one data item (that is, one row of the original table)

    💡 Because this method assumes that the iterable contains no header information, it does not take a columns parameter, unlike dsv.format()

    In effect, this method joins the string elements of each inner array with the delimiter (for example ,), and then joins the resulting rows with a newline character \n

    const data = [["1997", "Ford", "E350", "2.34"], ["2000", "Mercury", "2.38"]];

    const dsv = d3.dsvFormat(",");
    dsv.formatRows(data); // "1997,Ford,E350,2.34\n2000,Mercury,2.38"
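
To make the role of the columns argument of dsv.format() more concrete, here is a minimal sketch in plain JavaScript that does not call D3. The helpers inferColumns and formatSimple are hypothetical names introduced only for illustration. The first-seen column order used here is an assumption of the sketch, since D3 describes the inferred order as nondeterministic, and quoting of special characters is left out.

```javascript
// Hypothetical helper: collect the union of property names across all rows.
// First-seen order is an assumption of this sketch, not a D3 guarantee.
function inferColumns(rows) {
  const seen = new Set();
  for (const row of rows) {
    for (const key of Object.keys(row)) seen.add(key);
  }
  return [...seen];
}

// Hypothetical helper: build a header line plus one line per row.
// Missing, null, or undefined values become empty strings.
// Quoting of delimiters, quotes, and newlines is deliberately omitted.
function formatSimple(rows, columns = inferColumns(rows)) {
  const body = rows.map((row) =>
    columns.map((name) => (row[name] == null ? "" : String(row[name]))).join(",")
  );
  return [columns.join(","), ...body].join("\n");
}

// The two rows do not share the same set of properties.
const rows = [{a: "1", b: "2"}, {a: "4", c: "6"}];

const inferred = formatSimple(rows);             // "a,b,c\n1,2,\n4,,6"
const explicit = formatSimple(rows, ["c", "a"]); // "c,a\n,1\n6,4"

console.log(inferred);
console.log(explicit);
```

Passing the columns explicitly fixes both which columns appear and their order, which is why the article recommends it.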

🎉 If you want to customize how the elements of an iterable (an array of objects) are converted during the build:

  • First use JavaScript's built-in array method arr.map() to convert each element (an object) into an array of strings, which gives a new array newArr (the header information stored in the original array's arr.columns property is lost)
  • Since the header information is missing, use another built-in array method, [arr.columns].concat(newArr), to build an array that includes the header row
  • Finally, pass the nested array to dsv.formatRows() to obtain the DSV data
const arr = [{a: "1", b: "2", c: "3"}, {a: "4", b: "5", c: "6"}];

// Convert the array of objects to a nested array
const newArr = arr.map((d, i) => {
  return [
    d.a,
    d.b,
    d.c,
  ];
});

// Prepend the header row
const newArrWithHeader = [["a", "b", "c"]].concat(newArr);

// Build the DSV data
const dsv = d3.dsvFormat(",");
dsv.formatRows(newArrWithHeader); // "a,b,c\n1,2,3\n4,5,6"
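
The example above copies the values unchanged, so it may help to see the same three steps with an actual conversion applied. The following is only a sketch: the data, the name and price fields, and the chosen conversions are made up for illustration, and the final join is a plain-JavaScript stand-in for dsv.formatRows() that is valid only while no value contains the delimiter, a double quotation mark, or a newline.

```javascript
// Hypothetical data and column names, used only for illustration.
const arr = [{name: "apple", price: 1.5}, {name: "pear", price: 2}];
const columns = ["name", "price"];

// Step 1: custom conversion of each object into an array of strings
// (upper-case the name, fix the price to two decimals).
const newArr = arr.map((d) => [d.name.toUpperCase(), d.price.toFixed(2)]);

// Step 2: put the header row back in front of the converted rows.
const withHeader = [columns].concat(newArr);

// Step 3: stand-in for dsv.formatRows(withHeader); no quoting is performed here.
const text = withHeader.map((row) => row.join(",")).join("\n");

console.log(text); // "name,price\nAPPLE,1.50\nPEAR,2.00"
```

With D3 available, the last step would be replaced by a call to dsv.formatRows(withHeader), which also takes care of quoting.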
  • dsv.formatRow(row) builds DSV data (a string) from an array of strings.

    This method is similar to dsv.formatRows(rows). The difference is that rows must be a nested array in which each element represents one row of the original table, whereas row is a single array of strings that represents one row, with each element holding the value of one column. dsv.formatRows() generates a whole table of data; dsv.formatRow() generates only one row.

    const dataset = [[1, 2, 3]];
    const rowData = [1, 2, 3];

    const dsv = d3.dsvFormat(",");

    // Generate a table without a header row
    dsv.formatRows(dataset); // "1,2,3"
    // Generate a single row of data
    dsv.formatRow(rowData); // "1,2,3"
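
How value formatting, row formatting, and table formatting relate to each other can be sketched in plain JavaScript. These functions are simplified stand-ins written for illustration, not D3's source: they assume a comma delimiter and do not cover special cases such as Date values.

```javascript
// Simplified stand-in for dsv.formatValue() with a comma delimiter.
function formatValue(value) {
  if (value == null) return ""; // null and undefined become empty strings
  const text = String(value);
  // Quote values containing a comma, a double quote, or a line break,
  // and double any embedded double quotes.
  return /[",\n\r]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// One row: format each value, then join with the delimiter.
function formatRow(row) {
  return row.map(formatValue).join(",");
}

// Many rows: format each row, then join with newlines.
function formatRows(rows) {
  return rows.map(formatRow).join("\n");
}

const single = formatRow(["a,", 'say "hi"', null, 3]);
const table = formatRows([[1, 2, 3], ["x", "y\nz", ""]]);

console.log(single); // "a,","say ""hi""",,3
console.log(table);  // 1,2,3 followed by x,"y\nz", on the next row
```

In this sketch the table builder is just the row builder applied to every row, which mirrors the relationship between dsv.formatRows() and dsv.formatRow() described above.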

For the two most common DSV formats, CSV and TSV, D3 provides dedicated building methods that can be called directly:

  • d3.csvFormat(rows[, columns]) builds a CSV table (with a header row) from an iterable (an array of objects).

    d3.csvFormat([{foo: "1", bar: "2"}]); // "foo,bar\n1,2"

    💡 Equivalent to d3.dsvFormat(",").format()

    To build CSV data that contains only the table body and no header row from an iterable (an array of objects), use d3.csvFormatBody(rows[, columns]). It is equivalent to d3.dsvFormat(",").formatBody()

  • d3.csvFormatRows(rows) builds a CSV table (without a header row) from an iterable (a nested array).

    💡 Equivalent to d3.dsvFormat(",").formatRows()

  • d3.csvFormatRow(row) builds a single CSV row from an array of strings.

    💡 Internally, these CSV building methods use d3.csvFormatValue(value) to convert data values. It is equivalent to d3.dsvFormat(",").formatValue()

  • d3.tsvFormat(rows[, columns]) builds a TSV table (with a header row) from an iterable (an array of objects).

    d3.tsvFormat([{foo: "1", bar: "2"}]); // "foo\tbar\n1\t2"

    💡 Equivalent to d3.dsvFormat("\t").format()

    To build TSV data that contains only the table body and no header row from an iterable (an array of objects), use d3.tsvFormatBody(rows[, columns]). It is equivalent to d3.dsvFormat("\t").formatBody()

  • d3.tsvFormatRows(rows) builds a TSV table (without a header row) from an iterable (a nested array).

    💡 Equivalent to d3.dsvFormat("\t").formatRows()

  • d3.tsvFormatRow(row) builds a single TSV row from an array of strings.

    💡 Internally, these TSV building methods use d3.tsvFormatValue(value) to convert data values. It is equivalent to d3.dsvFormat("\t").formatValue()
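
The equivalences listed above suggest that the CSV and TSV helpers are the same builder configured with a different delimiter. A rough plain-JavaScript sketch of that idea follows. The makeFormatter factory is a hypothetical name that only loosely imitates d3.dsvFormat(delimiter); it is not D3's implementation, and the sample row is made up.

```javascript
// Hypothetical factory, loosely modelled on d3.dsvFormat(delimiter).
function makeFormatter(delimiter) {
  // A value needs quoting if it contains the delimiter, a double quote,
  // or a line break.
  const needsQuotes = (text) =>
    text.includes(delimiter) || /["\n\r]/.test(text);

  const formatValue = (value) => {
    if (value == null) return "";
    const text = String(value);
    return needsQuotes(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
  };

  const formatRow = (row) => row.map(formatValue).join(delimiter);

  return {formatValue, formatRow};
}

const csv = makeFormatter(",");
const tsv = makeFormatter("\t");

const row = ["Ford, Inc.", "E350"];
const csvLine = csv.formatRow(row); // '"Ford, Inc.",E350'
const tsvLine = tsv.formatRow(row); // 'Ford, Inc.\tE350'

console.log(csvLine);
console.log(tsvLine);
```

The same value is quoted in the CSV output but not in the TSV output, because whether a value needs quoting depends on the delimiter in use.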

Command line conversion tools

The d3-dsv module also provides some command-line tools for converting between DSV and JSON documents

To install the d3-dsv module, run one of the following commands in the terminal

# Install the d3-dsv module in the current project
npm install d3-dsv

# Install the d3-dsv module globally
npm install -g d3-dsv

You can then use the command-line conversion tools. For example, if the current directory contains a data table named test.csv, enter the following command in the terminal to convert it to a result.json document

# If the d3-dsv module is installed in the project
./node_modules/.bin/csv2json test.csv -o result.json

# If the d3-dsv module is installed globally on the system
csv2json test.csv -o result.json
  • The dsv2dsv, csv2tsv, and tsv2csv commands convert a DSV document from one delimiter to another
  • The dsv2json, csv2json, and tsv2json commands, together with the json2dsv, json2csv, and json2tsv commands, convert between DSV and JSON documents

💡 For more command options, see the official documentation