Introduction to StreamSets Data Collector



StreamSets Data Collector (hereinafter referred to as StreamSets) is a lightweight, powerful design and execution engine for routing and processing data in data streams. It uses pipelines to define the data flow to be processed: a pipeline describes the origin of the data, the destination, and any other processing you want to perform along the way. StreamSets processes data as it arrives at the origin and waits quietly when there is nothing to process. You can view real-time statistics about the data, inspect the data as it passes through the pipeline, or take a closer look at a snapshot of the data.

Interface

Area/Icon | Name | Description
1 | Pipeline canvas | Canvas used to configure, preview, or monitor a pipeline.
2 | Properties panel / Preview panel / Monitor panel | When you configure a pipeline, the Properties panel displays the properties of the pipeline or of the selected stage. You can resize, minimize, and maximize the panel. When you preview data, the Preview panel displays the data entering and leaving the selected stage or group of stages. It can also display stage properties and the preview configuration. When you monitor a running pipeline, the Monitor panel displays real-time metrics and statistics.
(icon) | Home icon | Displays the home page, which lists the pipelines and their status and lets you perform pipeline maintenance and navigate to individual pipelines.
(icon) | Package Manager icon | Opens the Package Manager, where you can install additional stage libraries.
(icon) | Notifications icon | Displays notifications.
(icon) | Administration icon | Provides access to StreamSets configuration properties, directories, and logs. It also lets you restart and shut down StreamSets.
(icon) | User icon | Displays the active user and the roles assigned to that user. It also lets you log out of StreamSets.
(icon) | Help icon | Provides context-sensitive help based on the information in the panel. It also lets you configure display settings and choose between the local and the hosted version of the help.
(icon) | Pipelines list link | Links to the list of pipelines on the home page. Use it to view the available pipelines, perform pipeline maintenance such as starting or sharing pipelines, and navigate to individual pipelines.
(icon) | More icon | Provides additional actions for the pipeline.

Preview function (similar to debug output)

1. Overview of data preview

You can preview data to help build or fine-tune pipelines. You can also use data preview at development time, with complete or incomplete pipelines and fragments. Several options are available for supplying the source data for the preview. During a preview, the source data passes through the pipeline or fragment, so you can see how the data moves and changes at each stage. You can edit stage properties and run the preview again to see how the changes affect the data. You can also edit the preview data itself to test and tune the pipeline logic. You can preview data for one stage at a time or for a group of stages, view the data in list or table view, and refresh the preview data.
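The idea of watching data change at each stage can be illustrated outside the tool. The following is a minimal sketch in plain Python, not StreamSets code: the `preview` function, the two example stages, and the record layout are all hypothetical names chosen for illustration. It passes a small set of source records through a list of stages and keeps the input and output of every stage, which is roughly what the Preview panel lets you inspect.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Stage = Callable[[List[Record]], List[Record]]


def preview(source_records: List[Record],
            stages: List[Tuple[str, Stage]]) -> List[dict]:
    """Run records through each stage and keep every stage's input and output."""
    trace = []
    # Work on copies so the original source records are never modified.
    current = [dict(r) for r in source_records]
    for name, stage in stages:
        output = stage([dict(r) for r in current])
        trace.append({"stage": name, "input": current, "output": output})
        current = output
    return trace


# Two hypothetical stages, for illustration only.
def drop_empty_names(records: List[Record]) -> List[Record]:
    return [r for r in records if r.get("name")]


def uppercase_names(records: List[Record]) -> List[Record]:
    return [{**r, "name": str(r["name"]).upper()} for r in records]


sample = [{"id": 1, "name": "ada"}, {"id": 2, "name": ""}]
trace = preview(sample, [("filter", drop_empty_names),
                         ("uppercase", uppercase_names)])

for step in trace:
    print(step["stage"], step["input"], "->", step["output"])
```

Changing a stage function and calling `preview` again mirrors the edit-and-rerun loop described above; editing `sample` mirrors editing the preview data.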

2. Data preview availability

You can preview complete and incomplete pipelines. When data preview is available, the Data Preview icon becomes active. You can preview data when the following conditions are met:

  • All stages in the pipeline are connected
  • All required properties are defined

Tip: Stage configuration does not need to be accurate or complete to preview data. Once all stages are connected, you can enable data preview by entering any valid value for each required property.
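The two conditions can be expressed as a small check. This is only a sketch of the rule as stated here, under stated assumptions: `StageConfig`, `preview_available`, and the property names are hypothetical, "connected" is modelled as every stage being reachable from every other when links are treated as undirected, and a required property counts as defined when its value is neither missing nor an empty string. It is not how Data Collector itself validates a pipeline.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class StageConfig:
    name: str
    required: List[str]                      # names of required properties
    properties: Dict[str, object] = field(default_factory=dict)


def preview_available(stages: List[StageConfig],
                      links: List[Tuple[str, str]]) -> bool:
    """Return True when all stages are connected and all required properties are set."""
    if not stages:
        return False

    # Build an undirected adjacency map from the links.
    neighbours = {s.name: set() for s in stages}
    for a, b in links:
        if a not in neighbours or b not in neighbours:
            return False  # a link points at an unknown stage
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Condition 1: every stage is reachable from the first one.
    seen = set()
    todo = [stages[0].name]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(neighbours[current] - seen)
    all_connected = len(seen) == len(neighbours)

    # Condition 2: every required property has some value.
    all_defined = all(
        s.properties.get(prop) not in (None, "")
        for s in stages
        for prop in s.required
    )
    return all_connected and all_defined


origin = StageConfig("origin", ["path"], {"path": "/tmp/in"})
target = StageConfig("destination", ["topic"])

print(preview_available([origin, target], [("origin", "destination")]))
target.properties["topic"] = "placeholder"   # any valid value is enough
print(preview_available([origin, target], [("origin", "destination")]))
```

The second call succeeds even though `placeholder` is not a meaningful value, which is the point of the tip: preview only needs the required properties to hold something valid.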

3. Data preview source data

The following types of data can be used for data preview:

  • From the source – Use data from the source stage configured in the pipeline.
  • From the test source – Use data from the test source configured in the pipeline properties.
  • From a data snapshot – Use snapshot data from the same pipeline or from another pipeline. Applies to pipelines only.

4. Writing to destinations

As a development tool, data preview does not write data to destinations by default. If you wish, you can configure the preview to write data to destinations. We recommend that you do not write preview data to production destinations.

5. Notes

Keep the following considerations in mind when previewing data:

  • Date, datetime, and time data – Data preview displays date, datetime, and time data using the default format of the browser locale. For example, if the browser uses the en_US locale, the preview displays dates in the following format: MMM d, y h:mm:ss a.
  • Oracle CDC Client pipelines – When you preview a pipeline that uses an Oracle CDC Client source, the data preview may time out before it connects to the source system. When this happens, try increasing the preview timeout to 120,000 milliseconds to give the source time to connect.
  • Whole file data format – Preview shows only one record for a pipeline that processes whole file data.
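To make the en_US date format in the first note concrete, here is a small Python sketch that renders a timestamp the same way, reading the pattern as abbreviated month, day, year, then 12-hour time with an AM/PM marker. The function name `format_like_en_us_preview` is hypothetical, and the formatting is done by hand rather than through any StreamSets or browser API.

```python
from datetime import datetime

# English month abbreviations, spelled out so the result does not depend
# on the locale of the machine running the example.
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_like_en_us_preview(value: datetime) -> str:
    """Format a datetime as 'MMM d, y h:mm:ss a' (for example 'Mar 5, 2024 2:07:09 PM')."""
    hour12 = value.hour % 12 or 12            # 0 and 12 both display as 12
    meridiem = "AM" if value.hour < 12 else "PM"
    return (f"{_MONTHS[value.month - 1]} {value.day}, {value.year} "
            f"{hour12}:{value.minute:02d}:{value.second:02d} {meridiem}")


print(format_like_en_us_preview(datetime(2024, 3, 5, 14, 7, 9)))
# prints: Mar 5, 2024 2:07:09 PM
```

A browser set to a different locale would show the same instant in that locale's default format, so values in the preview can look different from the stored data even though nothing has changed.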

6. Data Collector UI – Preview mode

You can use the Data Collector UI to see how data moves through the pipeline. The following image shows the UI in preview mode: