Niklaus Wirth, the Turing Award winner and father of Pascal, famously said, “Algorithms + Data Structures = Programs.” Let’s look at how data structures and algorithms can be applied to solve real business requirements in the process canvas scenario.

Trees and graphs

In data structure terms, a tree is defined as follows: a tree is a finite set of n (n ≥ 0) nodes. When n = 0 it is an empty tree; for a non-empty tree T:

  1. There is exactly one node called the root;

  2. The remaining nodes can be divided into m (m > 0) disjoint finite sets T1, T2, …, Tm, where each set is itself a tree, called a subtree of the root.

As the description above shows, the definition of a tree is recursive: the definition of a tree is used inside the definition of a tree, which reveals the inherent character of the structure. The common storage structures for trees are sequential storage and linked storage.
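To make the recursion concrete, here is a minimal TypeScript sketch (illustrative only, not the canvas code) of a tree in linked storage; note how the operation recurses exactly the way the definition does:

interface TreeNode<T> {
  value: T;
  children: TreeNode<T>[]; // each entry is itself the root of a subtree
}

// Counting nodes mirrors the recursive definition:
// an empty tree has 0 nodes; a non-empty tree is one root plus its subtrees.
const countNodes = <T>(root: TreeNode<T> | null): number =>
  root === null
    ? 0
    : 1 + root.children.reduce((sum, child) => sum + countNodes(child), 0);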

A graph is described as follows: graph G consists of two sets V and E, denoted G = (V, E), where V is a finite non-empty set of vertices and E is a finite set of pairs of vertices in V. V(G), the vertex set of the graph, cannot be empty; E(G), the edge set, can be empty. If E(G) is a set of directed edges, the graph is called a directed graph; if E(G) is a set of undirected edges, it is called an undirected graph.

The commonly used storage structures for graphs are the adjacency matrix, adjacency list, orthogonal list (cross-linked list), and adjacency multilist.
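As an illustration (again a sketch, not the canvas code), a small directed graph G = (V, E) in adjacency-list storage:

interface DirectedGraph {
  vertices: string[];               // V: a finite, non-empty set of vertices
  adjacency: Map<string, string[]>; // E: for each vertex, the vertices its edges point to
}

const g: DirectedGraph = {
  vertices: ['a', 'b', 'c'],
  adjacency: new Map([
    ['a', ['b', 'c']], // directed edges a -> b and a -> c
    ['b', ['c']],      // directed edge b -> c
    ['c', []],         // E(G) may be empty for a vertex
  ]),
};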

Trees and graphs are the two data structures most closely related to the canvas scenario (the most basic one, of course, being the linked list).

G6

G6 is a graph visualization engine. It provides basic capabilities such as graph drawing, layout, analysis, interaction, and animation. With these capabilities we can quickly build our own graph analysis or graph editing applications. The process canvas layer uses AntV G6 to implement graph visualization.

The core concepts of G6, organized by how heavily we use each capability, are summarized as follows:

The parameters received by Graph in G6 are defined as follows:

export interface GraphData {
  nodes?: NodeConfig[];
  edges?: EdgeConfig[];
  combos?: ComboConfig[];
  [key: string]: any;
}

The simplest code snippet for a quick start demo is as follows:

 const data = {
  nodes: [
    {
      id: 'node1',
      x: 100,
      y: 200,
    },
    {
      id: 'node2',
      x: 300,
      y: 200,
    },
  ],
  edges: [
    {
      source: 'node1',
      target: 'node2',
    },
  ],
};
 
const graph = new G6.Graph({
  container: 'mountNode',
  width: 800,
  height: 500,
});
graph.data(data);
graph.render();

As these two snippets show, nodes and edges carry the data for the vertex set and edge set of the graph structure, and the combos field provides vertex grouping. From this we can already conclude that we rarely need to care about the element-drawing part; most of our work is managing the vertex and edge data.

The x and y fields of a node item are optional. When the data carries no node position information, or the position information does not meet the requirements, a layout algorithm is needed to lay out the graph.
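For example, when no coordinates are supplied, one of G6’s built-in layouts can be enabled at construction time (a sketch based on the G6 v4 API; option names may vary between versions):

const graph = new G6.Graph({
  container: 'mountNode',
  width: 800,
  height: 500,
  // With no x/y in the data, G6 positions nodes using the configured layout.
  layout: {
    type: 'dagre',  // a built-in hierarchical layout, a natural fit for flow graphs
    rankdir: 'LR',  // arrange ranks from left to right
  },
});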

As the core concepts of G6 show, the framework itself implements a variety of classic layout algorithms, including some mind-map layouts based on trees. However, given the UI design and interaction defined in the requirements, the existing layout algorithms cannot meet our needs. We therefore need not only custom vertices and edges but also real-time computation of each vertex’s coordinates, i.e. custom layout logic.

While providing greater flexibility, G6 also brings unavoidable complexity in data processing; node coordinate calculation and node management (adding and deleting) are typical examples.

Data structure definition

G6 itself is a graph visualization implementation, but in the process canvas scenario our actual implementation uses a tree in linked storage.

In the actual development process, the data types of nodes are defined as follows:

interface IWorkflowNode<T extends WorkflowNodeParamsXOR> {
  id: string;
  type: TWorkflowNodeType;
  params: T;
  parent: string[];
  children: string[];
}

Description:

  • parent: a reference to the parent node (stored as an id)

  • children: references to the child nodes (stored as ids)

This doubly linked structure satisfies the definition of a binary tree.
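Because references are stored in both directions, indexing the node array by id is enough to walk the tree either way. A small sketch (the helper names are ours, not from the canvas code):

type NodeMap = Record<string, IWorkflowNode<any>>;

const toNodeMap = (nodes: IWorkflowNode<any>[]): NodeMap =>
  Object.fromEntries(nodes.map((n) => [n.id, n]));

// Walk upward from any node to the root via the parent references.
const pathToRoot = (id: string, map: NodeMap): string[] => {
  const node = map[id];
  return node.parent.length === 0 ? [id] : [id, ...pathToRoot(node.parent[0], map)];
};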

Algorithm

There are two main tasks in the data processing:

  1. Node coordinate calculation

  2. On node deletion, removing the current node, its associated edges, and its subtree

Based on the data structure defined above, both functions are implemented with the same algorithm: a recursive depth-first pre-order traversal of the binary tree.

As shown in the figure above, with a depth-first pre-order traversal the nodes are visited in the order 1, 2, 4, 8, 5, 9, 3, 6, 7.
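A minimal sketch of that traversal over the node type defined above (the visit callback is where the per-node work, such as coordinate calculation, happens):

const preOrder = (
  node: IWorkflowNode<any>,
  map: Record<string, IWorkflowNode<any>>,
  visit: (n: IWorkflowNode<any>) => void
): void => {
  visit(node); // visit the root first ...
  node.children.forEach((childId) => preOrder(map[childId], map, visit)); // ... then each subtree in order
};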

That covers the prerequisites; next, the implementation details.

Implementation

Suppose you have a canvas like the one below:

The nodes data is as follows (fields irrelevant to coordinate calculation are omitted):

[
  {
    id: 'root-1630463843227',
    type: 'start',
    parent: [],
    children: ['root-1630463843227-0'],
  },
  {
    id: 'root-1630463843227-0',
    type: 'strategy',
    parent: ['root-1630463843227'],
    children: ['root-1630463843227-0-0', 'root-1630463843227-0-1'],
  },
  {
    id: 'root-1630463843227-0-0',
    type: 'strategy',
    parent: ['root-1630463843227-0'],
    children: ['root-1630463843227-0-0-0', 'root-1630463843227-0-0-1'],
  },
  {
    id: 'root-1630463843227-0-0-0',
    type: 'action',
    parent: ['root-1630463843227-0-0'],
    children: [],
  },
  {
    id: 'root-1630463843227-0-0-1',
    type: 'trigger',
    parent: ['root-1630463843227-0-0'],
    children: ['root-1630463843227-0-0-1-0'],
  },
  {
    id: 'root-1630463843227-0-0-1-0',
    type: 'action',
    parent: ['root-1630463843227-0-0-1'],
    children: [],
  },
  {
    id: 'root-1630463843227-0-1',
    type: 'action',
    parent: ['root-1630463843227-0'],
    children: [],
  },
]

Coordinate calculation

1. Data preprocessing

  • Generate id-keyed objects from the nodes and edges arrays to make subsequent lookups easy

  • Initialize the cumulative y offset MaxOffsetY to 0

  • Find the start node and set its starting coordinates to (0, 0)

  • Recursively solve for the coordinates of each node, as in the code below

/**
 * Variables the calculation depends on:
 * - indentX: fixed offset in the x direction
 * - gapVertical: fixed offset in the y direction
 * - MaxOffsetY: cumulative offset in the y direction at the time the current node is calculated
 */
const indentX = 370;
const gapVertical = 50;
let MaxOffsetY = 0;

const parseWorkflowToDraw = (
  workflow: IWorkflow,
  resolveNode: ResolveNode
): [GraphData, IWorkflowNodeMap] => {
  const nodes = workflow.nodes;
  const edges = workflow.edges;
  const nodeMap: IWorkflowNodeMap = {};
  const edgeMap: IWorkflowEdgeMap = {};
  let rootId = '';

  for (const item of nodes) {
    if (item.type === 'start') {
      rootId = item.id;
    }
    nodeMap[item.id] = { ...item };
  }

  for (const edge of edges) {
    edgeMap[edge.target] = { ...edge };
  }

  if (!rootId) {
    throw new Error('Workflow node type error: no start node!');
  }

  MaxOffsetY = 0;

  const graphData: GraphData = {
    nodes: [],
    edges,
  };

  parseWorkflowToGraphData(
    nodeMap[rootId],
    0,
    0,
    null,
    [0, 0],
    nodeMap,
    edgeMap,
    resolveNode,
    graphData
  );

  return [graphData, nodeMap];
};
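Tying this back to the quick-start snippet, the result feeds straight into G6 (a sketch; workflow and resolveNode come from the host application):

// Preprocess the workflow, then hand the computed graph data to G6.
const [graphData, nodeMap] = parseWorkflowToDraw(workflow, resolveNode);
graph.data(graphData);
graph.render();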

2. Coordinate calculation

  • Preprocess the data (injecting the extra fields the draw function of the subsequent custom nodes requires)

  • Calculate the node’s own size

  • Calculate the current position from the parent node’s position, whether the node sits on a branch, and its index within that branch

  • Decide, based on a series of conditions, whether to add to the y offset

  • If there are children, recursively solve each child’s position, as in the code below

const parseWorkflowToGraphData = (
  node: IWorkflowNode,
  indexInSibling: number,
  parentMatrixX: number,
  parentNodeType: TWorkflowNodeType,
  parentNodeSize: number[],
  nodeMap: IWorkflowNodeMap,
  edgeMap: IWorkflowEdgeMap,
  resolveNode: ResolveNode,
  graphData: GraphData
): IWorkflowDrawNode => {
  const nodeType = node.type;
  const condition = nodeType === 'start' ? true : edgeMap[node.id]?.condition;
  let newNode = {
    ...resolveNode(node),
    nodeType: nodeType,
    type: tagRenderNodeType(nodeType),
    condition,
  };

  // Compute the node size
  const size = nodeSize(newNode);
  newNode.size = size;
  newNode.domSize = isBranchedNodeType(nodeType)
    ? [size[0] - padding - 130, size[1] - padding - 75]
    : size;

  // Compute the node coordinates
  const matrixX = parentNodeType === null ? 0 : parentMatrixX + indentX;
  const matrixY =
    !condition && indexInSibling === 0
      ? MaxOffsetY + parentNodeSize[1] + gapVertical
      : MaxOffsetY;
  newNode.x = matrixX;
  newNode.y = matrixY;

  if (!condition && indexInSibling === 0) {
    MaxOffsetY += parentNodeSize[1] + gapVertical;
  }

  const children = [];
  if (node.children.length > 0) {
    node.children.forEach((childId, index) => {
      const childNode = parseWorkflowToGraphData(
        nodeMap[childId],
        index,
        matrixX,
        nodeType,
        size,
        nodeMap,
        edgeMap,
        resolveNode,
        graphData
      );
      children.push(childNode);
    });
  } else {
    MaxOffsetY += size[1] + gapVertical;
  }

  newNode.children = children;
  graphData.nodes.push(newNode);

  return newNode;
};

Node deletion

When a node is deleted, not only the current node but also its subtree and its inbound edges must be removed. Here the recursive solution over the n-ary tree is used again. Because the arrays are passed by reference, the function mutates the source data directly.

/**
 * Recursively remove a node, its subtree, and its inbound edges.
 * @param node
 * @param nodes
 * @param edges
 */
const removeNode = (
  node: IWorkflowNode<any>,
  nodes: IWorkflowNode<any>[],
  edges: IWorkflowEdge[]
) => {
  const { children } = node;
  if (children?.length > 0) {
    children.forEach((child) =>
      removeNode(
        nodes.find((n) => n.id === child),
        nodes,
        edges
      )
    );
  }
  handleNodeDelete(node, nodes, edges);
};

const handleNodeDelete = (
  node: IWorkflowNode<any>,
  nodes: IWorkflowNode<any>[],
  edges: IWorkflowEdge[]
) => {
  // Locate the elements
  const nodeIndex = nodes.findIndex((n) => n.id === node.id);
  const foundEdges = edges.filter((edge) => edge.target === node.id);
  const parentNode = nodes.find((n) => nodes[nodeIndex].parent[0] === n.id);
  if (parentNode.children?.length) {
    parentNode.children = parentNode.children.filter(
      (child) => child !== node.id
    );
  }

  // Delete the elements
  nodes.splice(nodeIndex, 1);
  if (foundEdges?.length > 0) {
    foundEdges.forEach((v) => {
      const edgeIndex = edges.findIndex((_) => _.id === v.id);
      edges.splice(edgeIndex, 1);
    });
  }
};
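To complete the picture, a sketch of how deletion plugs back into rendering (targetNode is a hypothetical node picked by the user; graph.changeData is G6’s API for replacing the rendered data):

// Delete a node in place, then recompute coordinates and refresh the canvas.
removeNode(targetNode, workflow.nodes, workflow.edges);
const [nextGraphData] = parseWorkflowToDraw(workflow, resolveNode);
graph.changeData(nextGraphData);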

ABTest

Before the ABTest node was introduced, the data structure of the whole canvas satisfied the definition of a binary tree. After the extension, a node may have more than two children, so it no longer satisfies that definition, but on the whole the linked list + tree structure still holds; it is simply an extension from a binary tree to an n-ary tree.

The particularity of ABTest, however, lies not only in the unlimited number of successors of the current node; more importantly, its successor node acts as a container holding a special small canvas of its own.

Therefore, on top of the original data structure, we extend the node type definition with a childNodes field. childNodes stores the data of the combo group that follows the ABTest node. It serves as an entry point for grafting in another tree structure without breaking the original one. Strictly speaking, the structure is no longer a simple doubly linked list; it has taken on the flavor of a generalized list.

interface IWorkflowNode<T extends WorkflowNodeParamsXOR> {
  comboId?: string;
  id: string;
  type: TWorkflowNodeType;
  params: T;
  parent: string[];
  children: string[];
  childNodes?: string[];
}

In fact, the extended data structure does not affect the original algorithm logic. What we need to handle is the subtree logic introduced by the embedded generalized list (the coordinate calculation of that subtree and its influence on the position coordinates of subsequent nodes).

For a tree like the one above, under the extended type definition and the n-ary traversal, the visiting order is 1, 2, 3, 5, 4, 6, 10, 11, 13, 12, 14, 7, 8, 9.
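A sketch of the extended pre-order traversal; we assume, mirroring the deletion logic shown later, that a node’s childNodes subtree (the embedded sub-canvas) is visited before its ordinary children:

const preOrderN = (
  node: IWorkflowNode<any>,
  map: Record<string, IWorkflowNode<any>>,
  visit: (n: IWorkflowNode<any>) => void
): void => {
  visit(node);
  // Descend into the generalized-list part (the embedded sub-canvas) first ...
  node.childNodes?.forEach((id) => preOrderN(map[id], map, visit));
  // ... then continue with the ordinary successors.
  node.children.forEach((id) => preOrderN(map[id], map, visit));
};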

Therefore, in the concrete implementation the data preprocessing logic is essentially unchanged; we only add a MaxOffsetYSnapshot variable that snapshots the y offset before the generalized list is calculated. If calculating the generalized-list subtree disturbs the y offset of the successor nodes, the offset can be restored from the snapshot, ensuring that the coordinate calculation of the original tree structure is unaffected.

The processing of the generalized-list data carried by childNodes is added:

// Snapshot of the cumulative y offset, taken before a generalized-list (combo)
// subtree is calculated (see the explanation above).
let MaxOffsetYSnapshot = 0;

const parseWorkflowToGraphData = (
  node: IWorkflowNode<WorkflowNodeParamsXOR>,
  indexInSibling: number,
  parentMatrixX: number,
  parentNodeType: TWorkflowNodeType,
  parentNodeSize: number[],
  nodeMap: IWorkflowNodeMap,
  edgeMap: IWorkflowEdgeMap,
  resolveNode: ResolveNode,
  graphData: GraphData
): IWorkflowDrawNode => {
  const nodeType = node.type;
  const condition = ['start', 'combo'].includes(nodeType)
    ? true
    : edgeMap[node.id]?.condition;
  let newNode = {
    ...resolveNode(node),
    nodeType: nodeType,
    type: tagRenderNodeType(nodeType),
    condition,
  };

  // Compute the node size
  const size = nodeSize(newNode);
  newNode.size = size;
  newNode.domSize = isBranchedNodeType(nodeType)
    ? [size[0] - padding - 130, size[1] - padding - 75]
    : size;

  // Compute the node coordinates
  let matrixX =
    parentNodeType === null && !node.id.startsWith('combo')
      ? 0
      : parentMatrixX + indentX;
  const matrixY =
    !condition && indexInSibling === 0
      ? MaxOffsetY + parentNodeSize[1] + gapVertical
      : MaxOffsetY;
  newNode.x = matrixX;
  newNode.y = matrixY;

  if (!condition && indexInSibling === 0) {
    MaxOffsetY += parentNodeSize[1] + gapVertical;
  }

  // Handle the generalized-list subtree carried by a combo node
  if (nodeType === 'combo') {
    if (node.childNodes?.length > 0) {
      MaxOffsetY -= gapVertical;
    }
    const comboGraphData: GraphData = {
      nodes: [],
      edges: [],
    };
    if (node.childNodes?.length) {
      parseWorkflowToGraphData(
        nodeMap[node.id.replace('root', 'combo')],
        0,
        matrixX - size[0] - 90,
        null,
        [0, 0],
        nodeMap,
        edgeMap,
        resolveNode,
        comboGraphData
      );
    }

    // Size the combo so it encloses its sub-canvas
    let maxXNode = null;
    let maxYNode = null;
    comboGraphData.nodes.forEach((node) => {
      if (!maxXNode || node.x > maxXNode.x) maxXNode = node;
      if (!maxYNode || node.y > maxYNode.y) maxYNode = node;
    });
    const width =
      maxXNode && maxXNode.x > 0
        ? maxXNode.x + maxXNode.size[0] - matrixX
        : size[0];
    const height =
      maxYNode && maxYNode.y > 0
        ? maxYNode.y + maxYNode.size[1] - matrixY
        : size[1];
    newNode.size = [width, height];
    graphData.nodes.push(...comboGraphData.nodes);
    graphData.edges.push(...comboGraphData.edges);

    if (indexInSibling === 0) {
      MaxOffsetYSnapshot = matrixY + newNode.size[1] + gapVertical;
      const nodeIds = nodeMap[node.parent[0]].children.map((id) =>
        id.replace('root', 'combo')
      );
      const maxWidth = computeComboMaxWidth(nodeIds, nodeMap, edgeMap, resolveNode);
      matrixX = matrixX + maxWidth;
    }
    MaxOffsetY = matrixY;
  }

  const children = [];
  if (node.children.length > 0) {
    node.children.forEach((childId, index) => {
      const childNode = parseWorkflowToGraphData(
        nodeMap[childId],
        index,
        matrixX,
        nodeType,
        size,
        nodeMap,
        edgeMap,
        resolveNode,
        graphData
      );
      children.push(childNode);
    });
    MaxOffsetY = Math.max(MaxOffsetYSnapshot, MaxOffsetY);
  } else {
    MaxOffsetY += newNode.size[1] + gapVertical;
  }

  newNode.children = children;
  graphData.nodes.push(newNode);

  return newNode;
};

For node deletion, handling logic for childNodes is added:

/**
 * Recursively remove a node, its subtree (including childNodes), and its inbound edges.
 * @param node
 * @param nodes
 * @param edges
 */
const removeNode = (
  node: IWorkflowNode<any>,
  nodes: IWorkflowNode<any>[],
  edges: IWorkflowEdge[]
) => {
  const { children, childNodes } = node;
  if (childNodes?.length > 0) {
    childNodes.forEach((child) =>
      removeNode(
        nodes.find((n) => n.id === child),
        nodes,
        edges
      )
    );
  }
  if (children?.length > 0) {
    children.forEach((child) =>
      removeNode(
        nodes.find((n) => n.id === child),
        nodes,
        edges
      )
    );
  }
  handleNodeDelete(node, nodes, edges);
};

With the above in place, the complex policy configuration we can implement looks like this:

Conclusion

Software complexity is an essential property, not an accident. But data structures and algorithms are effective means of controlling that complexity, flattening what the developer has to hold in mind.