Reports
You manage reports on the Reports page. To access the Reports page, click Data Discovery and Classification > Reports from the left sidebar.
Viewing reports
The Reports page lists all available reports including the newly configured reports. Additionally, the page shows the total number of available reports.
The page shows the following details:
Column | Description |
---|---|
Report Name | Name of the report. |
Status | Status of the report. The status could be Not Generated, In Progress, Completed, or Failed. |
Type | Type of the report (Aggregated or Trend). |
Last Run | Time when the report was run. |
Duration | Duration of the report run. |
Re-generation | Whether to automatically trigger report regeneration on subsequent runs of the linked scans. Select to enable automatic report regeneration, clear to disable. |
You can use the Search box to filter reports. Search results display reports that contain specified text in their names. Reports can be sorted by their name (Report Name), and the last time that the report was generated (Last Run).
To view the details of a specific report, click the report name.
Alternatively, click the ellipsis icon corresponding to the report name and select View from the context menu. The context menu also provides options to Generate, Export D.O., and Remove the report.
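The search and sort behavior described above can be pictured with a small sketch. It is only an illustration, not DDC code: the dictionary keys `name` and `last_run` are hypothetical, and case-insensitive matching is an assumption, since the page does not state whether the search is case sensitive.

```python
from datetime import datetime


def filter_reports(reports, text="", sort_by="name", descending=False):
    """Keep reports whose name contains `text`, then sort by one field."""
    needle = text.casefold()  # assumption: matching ignores case
    matches = [r for r in reports if needle in r["name"].casefold()]

    def sort_key(report):
        value = report[sort_by]
        # Compare names without regard to case; other values as they are.
        return value.casefold() if isinstance(value, str) else value

    return sorted(matches, key=sort_key, reverse=descending)


# Hypothetical sample data for illustration only.
sample_reports = [
    {"name": "Quarterly PII", "last_run": datetime(2024, 3, 1)},
    {"name": "pii trend", "last_run": datetime(2024, 4, 1)},
    {"name": "Finance", "last_run": datetime(2024, 2, 1)},
]
pii_reports = filter_reports(sample_reports, "pii")
newest_first = filter_reports(sample_reports, sort_by="last_run", descending=True)
```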
Viewing versions of a report
DDC supports report regeneration. As a result, multiple versions of a report can be generated. The details view of the Reports page shows available versions of a report.
To view details of a report version:
Click the report name. The details of the report are displayed.
At the top of the details view, click the expand icon to expand the versions list. The available versions of the report are displayed.
Click a version link to view its details.
The details view shows the details of the selected report version. To switch to a different version, click its link in the details view under the expand icon.
Report types
You can generate two types of reports with DDC.
Aggregate report: Aggregate report displays consolidated information for multiple scan executions. The two ways to generate aggregate reports are:
Scan run on specific date: A report that is based on a scan run on a specific date. Such a report shows the scan findings on the selected date. In case of multiple executions of the scan for the selected day, the report will include the information for the latest execution on that day.
Latest scan execution: A report that always shows the information found in the latest scan execution. This way, the results reflect changes in the sensitive data discovered if the underlying data in the Data Store or the Classification Profile is modified. The report is still generated only when the user chooses to generate it.
Trend Report: The trend report presents information about the trend of the chosen scan over the last 15 scan executions. It includes trends for various elements such as scanned data objects, identified sensitive data objects, and infotypes. This report assists in comprehending how the scan information develops over time.
Note
To update a trend or aggregated report with the results of the latest scan execution, regenerate the report manually. Alternatively, for aggregated reports, select the Enable Report Re-generation check box in the General Info section to regenerate the report automatically for the latest scan execution.
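The selection rules for the two report types can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the product's implementation: `ScanExecution`, `pick_execution`, and `trend_window` are hypothetical names, and an execution is reduced to a scan name and a completion time.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass(frozen=True)
class ScanExecution:
    scan_name: str
    finished_at: datetime


def pick_execution(executions, on_date: Optional[date] = None):
    """Return the execution an aggregated report would be based on.

    on_date=None models "Latest scan execution". Otherwise the latest
    execution on the given day is chosen, or None if there was none.
    """
    candidates = [
        e for e in executions
        if on_date is None or e.finished_at.date() == on_date
    ]
    return max(candidates, key=lambda e: e.finished_at, default=None)


def trend_window(executions, size=15):
    """Return the most recent `size` executions, oldest first."""
    return sorted(executions, key=lambda e: e.finished_at)[-size:]
```

With three executions on one day, `pick_execution(runs, that_day)` returns the last of the three, which matches the rule that the report includes the latest execution on the selected day.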
Configure reports
You can configure a report to include information of one or more scans. After the report is configured, it can be generated any number of times. The generated report contains the results of the executed scans.
To configure a report:
In the Data Discovery and Classification application, click Reports > + Add Report.
Complete the following steps:
a. General Info
Specify the following details:
Name: Provide a unique name for the report. The name must be between 3 and 64 characters long. This field is mandatory.
Description (optional): Provide a description for the report.
Select report type: Select the type of the report to generate: Aggregated or Trend.
Enable Report Re-generation (Applicable only for Aggregated reports): Select the check box to trigger automatic report regeneration. The report will be regenerated automatically whenever any scan linked with the report is executed subsequently (that is, when Latest Execution time is changed).
Click Next.
Configure Content
The Configure Content screen shows available scans with their number and the number of selected scans.
For Aggregated reports, the scan executions identified in this step are merged into a single report. For Trend reports, you can select the scan whose trend you want to analyze.
Tip
To include the removed scans in the list, leave the Show Removed Scans check box selected.
Use the Search box to filter available scans. Search results display scans that contain specified text in their names.
Select the desired Scan Name.
After selecting the scans, turn the Selected Only toggle on to list only the selected scans.
(Applicable only for Aggregated reports) Select the preferred way for generating the report.
A report that is based on a scan run on a specific date. For such a report, click "Latest Execution" and select the date of the scan that you wish to use.
If Enable Report Re-generation was selected and:
An earlier scan run date is selected for the report, the report will not be automatically regenerated.
The current date is selected, the report will be automatically regenerated for subsequent scan runs for the current day. However, after the date changes, any new scan runs won't trigger report regeneration.
A report that can change if the underlying data store or protection profile is modified and a scan is run again. For such a report, leave "Scan Execution" as "Latest Execution".
Leave the default setting for the Generate Now check box. This generates the report immediately after it is configured. Alternatively, clear the check box to generate the report at a later stage. See Generating reports.
Click Save.
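The interaction between Enable Report Re-generation and the selected scan execution can be summarized as a small decision function. This is a sketch of the rules stated in the procedure above, assuming a hypothetical function name and using `None` to stand for "Latest Execution"; it is not taken from the product.

```python
from datetime import date
from typing import Optional


def should_regenerate(regeneration_enabled: bool,
                      selected_date: Optional[date],
                      run_date: date) -> bool:
    """Decide whether a new scan run on `run_date` regenerates the report.

    selected_date=None models leaving Scan Execution as "Latest Execution".
    """
    if not regeneration_enabled:
        return False
    if selected_date is None:
        # Latest Execution: every subsequent run regenerates the report.
        return True
    # Specific date: only runs on that same day regenerate the report.
    # An earlier date can never match a new run, so it never regenerates.
    return selected_date == run_date
```

For example, a report pinned to today's date is regenerated by a second run later today, but not by a run tomorrow.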
Generating reports
After you have configured a report, it can be generated at any time and any number of times.
To generate a report:
On the Reports page, search for the report that you want to generate.
Tip
Use the Search box to filter the reports. Search results display reports that contain specified text in their names.
By default, reports are listed in ascending alphabetic order of their names. Reports can be sorted by their name (Report Name) and by the last time that the report was generated (Last Run).
Click the ellipsis icon corresponding to the desired report.
Click Generate.
As soon as the report starts to run, its status becomes Not Generated. The status of the report changes in the sequence: Not Generated > In Progress > Completed / Failed.
Note
Permissions to access the data stores accessed by the scans included in a scan-based report are checked every time the report is run. If the current user no longer has the correct permission for any of them, an error is displayed.
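The status sequence for a single report run can be modeled as a small state machine. The sketch below is an assumption-laden illustration: the enum and function names are hypothetical, and it treats Completed and Failed as the end of one run (generating the report again would start a new sequence).

```python
from enum import Enum


class ReportStatus(Enum):
    NOT_GENERATED = "Not Generated"
    IN_PROGRESS = "In Progress"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Allowed next statuses within one report run.
NEXT_STATUSES = {
    ReportStatus.NOT_GENERATED: {ReportStatus.IN_PROGRESS},
    ReportStatus.IN_PROGRESS: {ReportStatus.COMPLETED, ReportStatus.FAILED},
    ReportStatus.COMPLETED: set(),
    ReportStatus.FAILED: set(),
}


def is_valid_run(statuses):
    """Return True if the observed statuses follow the documented order."""
    if not statuses or statuses[0] is not ReportStatus.NOT_GENERATED:
        return False
    return all(
        later in NEXT_STATUSES[earlier]
        for earlier, later in zip(statuses, statuses[1:])
    )
```

A check like this could be useful when polling a report's status from a script, to detect an unexpected jump such as Not Generated directly to Completed.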
Report details
The report details page displays the detailed information about the configured report. The upper part of the report details page displays general information about the report, such as the report name, version, and the number of scans included in the report.
You can use the Print Preview button in the top right corner of the page to print the report. See Printing report details.
Cards
The findings from the scan executions are reported on the following four cards:
TOTAL DATA OBJECTS SCANNED - The count of all data objects that were included in the scan.
Note
When scanning Exchange Online and Exchange Server notes data objects, DDC counts notes and their associated attachments as separate data objects.
TOTAL DATA OBJECTS SCANNED count is equal to the sum of number of notes and number of attachments linked with each note in the corresponding target path. For example, if you scan a target path containing 4 notes, one of which includes an image attachment, the TOTAL DATA OBJECTS SCANNED count will be 5.
Due to a known limitation, in reports for MongoDB, Azure Table, Salesforce, and G-Mail data stores, you will see "N/A" for the TOTAL DATA OBJECTS SCANNED values.
For G-Mail, DDC ignores copies of the emails that were received in the same second, from the same sender, and with the same subject.
In case of Binary Large Objects (BLOBs), TOTAL DATA OBJECTS SCANNED reports the parent and child data objects in a given blob.
SENSITIVE DATA OBJECTS FOUND - The count of all data objects containing sensitive data identified during the scan run.
SENSITIVE DATA MATCHES - The count of all sensitive pieces of information found inside the sensitive data objects.
SELECTED INFOTYPES FOUND - The count of infotypes found during the scan out of those configured for the scan when it was created (shown in the form "found out of configured").
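The counting rule for Exchange notes in the TOTAL DATA OBJECTS SCANNED card can be checked with simple arithmetic. The function below is a hypothetical helper written for this illustration: each note counts as one data object, and each of its attachments counts as one more.

```python
def total_data_objects(attachments_per_note):
    """Count notes plus their attachments as separate data objects.

    `attachments_per_note` holds one entry per note: the number of
    attachments linked with that note.
    """
    return sum(1 + attachments for attachments in attachments_per_note)


# The example from the text: 4 notes, one of which has an image attachment.
example_total = total_data_objects([0, 0, 0, 1])
```

Here `example_total` is 5, which agrees with the count given in the note above.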
The lower part of the report details screen displays the report information in the following tabs.
Scans tab
This tab provides the list of scans that contributed information to the report. Additionally, you can see the report data in graphical format, containing the following information:
Infotypes Discovered
Sensitive Data Objects by Content
Sensitive Infotype Distribution
Sensitive Data Objects
Information about scan filters
You can view the information about filters applied in the scans. On the Scans tab, click the expander arrow next to the scan name. This will display the information about the number and types of filters applied.
Example:
1 Scan Filter
Exclude DO greater than size: 14000 MB
Data Stores tab
This tab provides the list of data stores included in the report, with the information about their risk score, sensitivity level, scan name, last scan time, targets, infotypes, data objects scanned, and sensitive objects found in each data store that was scanned.
Expand the data store name to see the following information in graphical format.
Sensitive infotypes
Sensitive content
Classification profiles
Business department (applicable only for ML infotypes)
Data Objects tab
This tab lists the scanned data objects with the details described below. Only the top 1000 data objects are displayed, sorted by their risk score. The first 25 data objects are displayed in the list, but you can view more by clicking Show more.
Column Name | Description |
---|---|
Data Object Name | The name of the data object scanned and listed in the report details. Click the expand icon next to the name to view the metadata fields of the data object. For Oracle and IBM DB2, the result is displayed in uppercase. |
Risk | The number of risks found by the report in the given scanned data object. A risk is the presence of a sensitive item of data. |
Type | The type of the scanned data object listed in the report details, such as "File" or "Folder". |
Path | The path to the object that is listed in the report details. |
Data Store | The name of the data store where the object listed in the report details was found. |
Infotypes | The number of information types found in the data object that is listed in the report. |
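The display limits of the Data Objects tab can be pictured as a sort followed by two cuts. The sketch below rests on assumptions that the page does not confirm: that higher risk comes first, that each click on Show more reveals another 25 rows, and that a data object can be represented by a dictionary with a hypothetical `risk` key.

```python
def visible_data_objects(objects, pages_shown=1, page_size=25, cap=1000):
    """Return the rows the tab would list after `pages_shown` pages.

    Objects are ranked by risk (highest first, an assumption), limited
    to the top `cap`, and revealed `page_size` rows at a time.
    """
    ranked = sorted(objects, key=lambda o: o["risk"], reverse=True)[:cap]
    return ranked[: pages_shown * page_size]
```

Under these assumptions, a report with 1500 data objects initially lists 25 rows and never lists more than 1000, however many times Show more is clicked.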
Filter Data Objects:
Use the fields of the Filter Data Objects filter to search for specific data objects.
Field | Description |
---|---|
Search | Use the search box to search data objects by name or path. The data object search has the following characteristics: |
Sort By | Use the Sort By options to sort the search results in ascending or descending order by data object name or risk score. |
Type | Select the data object type to narrow down search results. |
Data Store | Select the associated data store to narrow down search results. |
Note
When you search for the first time, the results take some time (usually 1 to 3 minutes) to display. During this time, you can navigate through the different tabs inside the report. However, if you leave the Reports page, you must repeat the search. The results of a successfully completed search are cached, so repeating the same search takes only a few seconds. Generating the report after the search is completed invalidates the cache because the cached information becomes outdated.
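The caching behavior described in this note follows a common pattern: store the result of a slow search under a key, and discard stored results when the underlying report changes. The sketch below illustrates that pattern only. The class, its method names, and the use of a report ID as part of the key are assumptions made for this example, not details of DDC.

```python
class SearchCache:
    """Cache slow search results until the report is generated again."""

    def __init__(self, run_search):
        # run_search(report_id, query) is the slow operation to avoid repeating.
        self._run_search = run_search
        self._results = {}

    def search(self, report_id, query):
        key = (report_id, query)
        if key not in self._results:
            # First search: slow path, then remember the result.
            self._results[key] = self._run_search(report_id, query)
        return self._results[key]

    def on_report_generated(self, report_id):
        # Generating the report makes cached results for it outdated.
        self._results = {
            key: value
            for key, value in self._results.items()
            if key[0] != report_id
        }
```

Repeating the same search hits the stored result, while calling `on_report_generated` forces the next search for that report to take the slow path again.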
Note
Details of a data object scanned partially due to any issue are displayed in the Inaccessible Data Objects tab.
A scanned ZIP file is displayed under the Data Objects tab. However, this tab doesn't indicate whether all the files included in the ZIP file are scanned successfully. To confirm this, review the Inaccessible Data Objects tab.
Inaccessible Data Objects tab
This tab is visible for Aggregated reports. This tab shows the list of inaccessible, skipped, or partially scanned data objects.
The table in the report detail lists the following findings distributed among columns:
Column Name | Description |
---|---|
Data Object Name | The name of the data object scanned and listed in the report details. |
Data Store | The name of the data store where the object listed in the report details was found. |
Path | The path to the object that is listed in the report details. |
Severity | The severity of the issue that caused the listed data object to be inaccessible, skipped, or partially scanned. The severity could be Intervention (least severe), Notice, Error, or Critical (most severe). |
Reason | The reason why the listed data object is inaccessible. Some of the possible reasons could be: |
Date | The date and time when the report is generated. |
Note
Due to some known limitations, you might see the same data objects listed in both the Sensitive and Inaccessible tabs of aggregated reports.
Examples:
- A partially scanned file (possibly due to insufficient buffer memory).
- A table with partially scanned rows (due to some password-protected content).
Data objects that the DDC ML agent could not access, skipped, or scanned only partially are not listed on the Inaccessible Data Objects tab.
Additional metadata fields in reports
For each report, metadata fields of data objects can be displayed on the Reports page. To see the metadata fields, go to the Data Objects tab and click the expander arrow next to the data object name.
Note
The number of fields displayed on the GUI depends on the level of details selected in the scan configuration, data store, and file types.
For MongoDB and Azure Table, no metadata is displayed in Aggregated reports.
Metadata information of the data object is not displayed when the data object contains only ML infotypes.
The following table shows the complete list of metadata fields that can be displayed for the data objects.
Key | Description |
---|---|
Catalog | Name of the database or catalog. |
Client Modified | Client modification timestamp. |
Category | Category to which the data object belongs. See Category for list of supported categories. |
Date | Date for the resource. |
Date Modified | Date when the resource is modified. |
Document Created | Date when the document is created. |
Document Creator | Name of the document creator. |
Document Modified | Date when the document is last modified. |
Document Modifier | User who last modified the document. |
Encoding | Whether the match is found in an EBCDIC-encoded resource. |
File Created | Date when the file is created. |
File Modified | Date when the file is last modified. |
File Owner | Owner of the file. |
Filename | File name for the resource. |
Folder | Folder name for the resource. |
Instance | Name of the database instance. |
Key Columns | Name of the column(s) used as a key in the database scan. If multiple columns are used as the key, the column names will be comma-separated. |
Key Source | Source of the key used in the "Key" and "Column:Key" metadata (for example, whether the key is obtained from a primary key column or a unique key column). Enum: Primary Key, Integer Unique Column, String Unique Column, Blob*, Integer Non Unique Column, String Non Unique Column. *Blob is the default "Key Source" metadata value for matches detected in BLOB objects, as no key column information is stored for BLOB objects. |
Number of matches | Displays the number of instances of an infotype found in the data object. |
Median of score | Displays the median similarity score per data object and infotype. |
Object Created | Date when the Google Cloud Storage object is created. |
Object Modified | Date when the Google Cloud Storage object is last modified. |
Permission Execute | List of groups, users, or user classes that have execute permissions to the matched object. |
Permission Full | List of groups, users, or user classes that have full permissions to the matched object. |
Permission Modify | List of groups, users, or user classes that have modify permissions to the matched object. |
Permission Read | List of groups, users, or user classes that have read permissions to the matched object. |
Permission Special | List of groups, users, or user classes that have special permissions to the matched object. |
Permission Write | List of groups, users, or user classes that have write permissions to the matched object. |
Processed Rows | Number of rows that were scanned for the table in a database scan. |
Schema | Name of the database schema. |
Server Modified | Date and time of the last modification by the server. Applicable only to the Dropbox Business Target types. |
Subcategory | Subcategory to which the data object belongs. See Subcategory for list of supported subcategories. |
Table | Name of the database table in a database. |
Track 1 | Whether a Track 1 data type was detected. |
Track 2 | Whether a Track 2 data type was detected. |
Printing report details
On the Reports page, click the report name.
Click Print Preview > Print.
Click Exit Print View to return to report details page.
Note
Use A3 paper size and landscape orientation as print settings to avoid printing distorted charts.
For the best experience of exporting reports to PDF, use Chrome or Firefox.
Risk score
The risk score reflects the level of the risk to the business that would result from the exposure of the sensitive data objects (i.e., sensitive information) found by a scan. A low risk score number represents lower risk. The risk score depends on the type and number of sensitive data objects found. For example, if a risk score for a single email address found during scan is 10, the risk score for a document that contains thousands of such email addresses will be much higher.
Note
Only a complete removal/deletion of a sensitive data object would reduce the risk score to zero.
Removing reports
You can remove a report on the Reports page. Because reports have no dependencies (that is, they do not affect other resources), you can remove them safely.
Note
Only users with the right permissions can remove reports: Admin, DDC Admin, DDC Report Admin, and DDC Full Report Admin.
To remove a report:
Click the ellipsis icon corresponding to the desired report.
In the shortcut menu that is displayed, select the Remove option.
A warning message "Remove Report? Are you sure you want to remove this report? This cannot be undone." is displayed.
Confirm the report removal by clicking the Remove button in the warning message dialogue box. To cancel the report removal, click the Cancel button.
Note
After deleting a report, you can create another report with the same name as the one that you deleted.
Reports are not deleted from TDP, which means that if you have the URL of the removed report, with the report ID, you can still view the report after you removed it.
Exporting report's data objects
You can export all the data objects of a report in Newline Delimited JSON (NDJSON) format. You can then view the exported NDJSON file in any editor supporting this format.
There are two ways of exporting those data objects.
From the Reports page, using the Export D.O. option.
From the Report Details page, using the Export All Data Objects button.
To export the data objects from the Reports page:
Click the ellipsis icon corresponding to the desired report.
Click Export D.O.
Choose the target location for the exported file.
To export data objects from the report details page:
Click the report name.
Go to the Data Objects tab.
Click Export All Data Objects to export the data objects.
Choose the target location for the exported file.
Tip
Please check the ELK Reference to see how to use the exported data.
Note
When you export the data objects of a database data store of a report, the exported NDJSON file also contains sensitive columns that were extracted from the scan results. The list of sensitive columns gathered in the exported file could be partial. The number of columns in the list depends on the scan configuration and the number of sensitive data matches found.
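Besides viewing the exported file in an editor, you can process it with a short script, because NDJSON stores one JSON document per line. The sketch below reads such a stream with the Python standard library. The field names in the sample (`name`, `risk`) are hypothetical placeholders and do not describe the actual export schema; inspect a real export to see which keys it contains.

```python
import io
import json


def read_ndjson(stream):
    """Yield one parsed JSON object per non-empty line of `stream`."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number} is not valid JSON: {exc}") from exc


# Hypothetical sample standing in for an exported file.
sample = io.StringIO('{"name": "a.txt", "risk": 10}\n\n{"name": "b.txt", "risk": 70}\n')
records = list(read_ndjson(sample))
```

To read a real export, pass an open file instead of the sample, for example `with open(path, encoding="utf-8") as f: records = list(read_ndjson(f))`, where `path` is the location you chose when exporting.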
Exporting report's inaccessible data objects
You can export the inaccessible data objects of a report in Newline Delimited JSON (NDJSON) format. You can then view the exported NDJSON file in any editor supporting this format.
To export the inaccessible data objects associated with a report from its Report Details page:
Click the report name.
Go to the Inaccessible Data Objects tab.
Click Export Data Objects to export inaccessible data objects.
The inaccessible data objects report is downloaded as a .ndjson file.
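Once the records of the downloaded file are parsed into dictionaries, they can be summarized by severity, using the order given for the Severity column (Intervention is least severe, Critical is most severe). The sketch below is an illustration only: the key name `severity` is an assumption about the export, so adjust it to match the keys in your file.

```python
from collections import Counter

# Order taken from the Severity column description, least to most severe.
SEVERITY_ORDER = ["Intervention", "Notice", "Error", "Critical"]


def count_by_severity(records, field="severity"):
    """Count records per severity, most severe first.

    Records without the field are counted as "Unknown"; labels outside
    SEVERITY_ORDER are listed last in alphabetical order.
    """
    counts = Counter(record.get(field, "Unknown") for record in records)
    ranked = [
        (level, counts[level])
        for level in reversed(SEVERITY_ORDER)
        if level in counts
    ]
    others = sorted(
        (level, n) for level, n in counts.items() if level not in SEVERITY_ORDER
    )
    return ranked + others
```

A summary like this gives a quick view of how many data objects need attention before you review the individual reasons.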