Understanding your dataset is crucial before undertaking any analysis. The Statistics Panel in Tabula.io offers a range of profiling widgets to make this process straightforward and insightful.
1. Open your dataset in the Explorer Page. The Statistics panel appears on the right side, next to the table.
2. Alternatively, open the Flow Page, select any node, and then select the "Stats" tab in the right panel.
Depending on the columns you select, you will see different widgets and visualizations:
- No columns selected: A summary widget for each column is displayed. Click on a row above the widget to see extended analytics for a specific column.
- One column selected: A set of widgets related to the selected column is shown.
- Two or more columns selected: A list of summary widgets for the selected columns is displayed. You can drill down into any widget for more information.
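The selection rules above can be read as a simple three-way switch. The following is only an illustrative sketch in Python: the function name `panel_mode` and the returned strings are hypothetical and are not part of any Tabula.io API.

```python
def panel_mode(selected_columns):
    """Describe what the Statistics panel shows for a given selection.

    Hypothetical helper that mirrors the three documented cases;
    it is not Tabula.io code.
    """
    count = len(selected_columns)
    if count == 0:
        # Nothing selected: one summary widget per column in the dataset.
        return "summary widget for every column"
    if count == 1:
        # Exactly one column: the full set of widgets for that column.
        return "full widget set for the selected column"
    # Two or more columns: summary widgets for the selection only.
    return "summary widgets for the selected columns"


print(panel_mode([]))
print(panel_mode(["price"]))
print(panel_mode(["price", "country"]))
```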
The following widgets are available for data profiling:
- Data Quality Widget: Displays a summary of data quality across all rows in a column.
- Summary Widget: Shows summary statistics based on the column's data type, split into main and additional statistics.
- Frequency (unique values) Widget: Displays the most (or least) frequent values in a column, sorted in descending order by default.
- Histogram Widget: Visualizes the distribution of values in a column as bars.
- Periodic Histogram Widget: Available for periodic data types only (e.g., DateTime, Date, Time). Shows the value distribution grouped by part of the date or time, such as year, month, week, day, or hour.
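To give a rough sense of the kind of numbers behind these widgets, here is a minimal sketch using only the Python standard library. It is not how Tabula.io computes its statistics. The function names are hypothetical, and treating `None` as the only kind of missing value is an assumption made for the example.

```python
from collections import Counter
from datetime import date


def data_quality(values):
    """Count valid versus missing entries (assumes None means missing)."""
    missing = sum(1 for v in values if v is None)
    return {"total": len(values), "valid": len(values) - missing, "missing": missing}


def frequency(values, descending=True):
    """Count each non-missing value, most frequent first by default."""
    counts = Counter(v for v in values if v is not None)
    return sorted(counts.items(), key=lambda item: item[1], reverse=descending)


def periodic_histogram(dates, part="month"):
    """Group dates by one part ("year", "month", or "day") and count each group."""
    counts = Counter(getattr(d, part) for d in dates if d is not None)
    return dict(sorted(counts.items()))


colors = ["red", "blue", "red", None, "green", "red", "blue"]
print(data_quality(colors))
print(frequency(colors))

orders = [date(2024, 1, 5), date(2024, 3, 1), date(2023, 1, 9)]
print(periodic_histogram(orders, part="month"))
```

Passing `descending=False` to `frequency` lists the least frequent values first, which corresponds to the "least frequent" view mentioned above.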
Depending on the data type of the columns, specific widgets and visualizations will be displayed:
- String: Data Quality, Summary, and Frequency
- Boolean: Data Quality and Frequency
- Integer and Decimal: Histogram, Data Quality, Summary, and Frequency
- DateTime, Date, and Time: Histogram, Periodic Histogram, Data Quality, Summary, and Frequency
- Array and Object: Data Quality
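The mapping above can also be written down as a lookup table, which may be handy if you keep notes or scripts alongside your profiling work. This is a sketch only: the dictionary `WIDGETS_BY_TYPE` and the helper `widgets_for` are hypothetical names, and the contents simply restate the list above.

```python
# Widgets per data type, copied from the list above (not a Tabula.io API).
_NUMERIC = ["Histogram", "Data Quality", "Summary", "Frequency"]
_PERIODIC = ["Histogram", "Periodic Histogram", "Data Quality", "Summary", "Frequency"]

WIDGETS_BY_TYPE = {
    "String": ["Data Quality", "Summary", "Frequency"],
    "Boolean": ["Data Quality", "Frequency"],
    "Integer": _NUMERIC,
    "Decimal": _NUMERIC,
    "DateTime": _PERIODIC,
    "Date": _PERIODIC,
    "Time": _PERIODIC,
    "Array": ["Data Quality"],
    "Object": ["Data Quality"],
}


def widgets_for(data_type):
    """Return the widgets listed for a data type, or an empty list if unknown."""
    return list(WIDGETS_BY_TYPE.get(data_type, []))


print(widgets_for("Boolean"))
print(widgets_for("Date"))
```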