DEEPSCAN - GDPR

Personal Data Discovery Using GDPR Deep Scan Tool

Introduction to GDPR

How will the GDPR deep scan tool be useful for your company?

How does the GDPR deep scan tool work?

------

What does the GDPR metadata repository (the intelligence powering the tool) contain?

Foundation for GDPR metadata preparation

1. With the help of ABAP Dictionary tables

DD04T - Identify the relevant data elements for each GDPR attribute based on their description or text
DD03L - Identify the field names based on the data elements
DD02L - Identify the relevant tables based on the field names and data elements
(Note - Only transparent tables and fields of the CHAR data type are considered)

2. Using the keywords below, an initial set of data elements was filtered from the more than 4 million data elements in DD04T. For Name & User ID -

"user, first, last, display, person, business, legal, second, associate, birth, middle, surname, full, family, forename, employee, sponsor, dependent, maternal, father, mother, child, paternal, enterer, signer, translator, physician, format, consult, provident, pension, supervisor, doctor, initiator, Author, representative, lawyer, writer, reporter, responsible, inspector, name1, name2, name3, by, witness, creator, delete, account, patient, examiner, injured, personnel, doctor, liable, bank,
physician, recipient, beneficiary, payee, payer, passenger, nickname, initials, short name, authorized, authorized, contact, participant, full, partner, Business Partner, creditor, debtor, third party, 3rd, party, 3, auditor, instructor, tenant, grantee, guardian, guarantor, vendor, supplier, consumer, producer, customer, owner, who, chief, manager, sender, receiver, signed, signature, correspondence, participant, insured, child, payer, payee, driver, member, contact, applicant, spouse, wife, parents, sender, receiver, title, trustee"

3. For Email Address -

"Mail"

For SSN -

"Employee Identification Number (EIN), identification, pin, number, SSN, social, security, customer number, personnel number, personal id, account number, Personnel no., Pers.no., tax number"

The data elements are searched for each keyword using queries like the one below (shown here for the keyword "name").

SELECT ROLLNAME, DDTEXT, REPTEXT, SCRTEXT_L, SCRTEXT_M, SCRTEXT_S
FROM DD04T
WHERE DDLANGUAGE = 'E'
  AND (UPPER(DDTEXT) LIKE '%NAME%'
    OR UPPER(REPTEXT) LIKE '%NAME%'
    OR UPPER(SCRTEXT_S) LIKE '%NAME%'
    OR UPPER(SCRTEXT_M) LIKE '%NAME%'
    OR UPPER(SCRTEXT_L) LIKE '%NAME%')
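Since the same query is repeated for every keyword, it can be generated rather than written by hand. The following is a minimal sketch, not the tool's actual implementation: the helper name build_dd04t_query and the sample keywords are hypothetical, and the sketch only builds the SQL text without running it against a system.

```python
# Text columns of DD04T that the keyword search above inspects.
TEXT_COLUMNS = ("DDTEXT", "REPTEXT", "SCRTEXT_S", "SCRTEXT_M", "SCRTEXT_L")


def build_dd04t_query(keyword: str, language: str = "E") -> str:
    """Return the DD04T keyword search as a SQL string (hypothetical helper)."""
    # Upper-case the keyword and double any single quote so it stays a valid literal.
    # Note: '%' and '_' inside a keyword would still act as LIKE wildcards.
    pattern = "%" + keyword.strip().upper().replace("'", "''") + "%"
    conditions = " OR ".join(
        f"UPPER({column}) LIKE '{pattern}'" for column in TEXT_COLUMNS
    )
    return (
        "SELECT ROLLNAME, DDTEXT, REPTEXT, SCRTEXT_L, SCRTEXT_M, SCRTEXT_S "
        f"FROM DD04T WHERE DDLANGUAGE = '{language}' AND ({conditions})"
    )


# One query per keyword; the keyword list here is only an illustrative subset.
queries = {kw: build_dd04t_query(kw) for kw in ("name", "mail", "ssn")}
```

Generating the statements keeps the five text columns consistent across keywords, so a keyword added to the list later is searched in exactly the same way.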

The data elements are then filtered further by reviewing the description/text, data type and length of each one individually; only the relevant data elements are retained.
From the final list of data elements, only the relevant field names and tables are considered.

SELECT TABNAME, FIELDNAME, ROLLNAME, DOMNAME, DATATYPE, LENG
FROM DD03L
WHERE TABNAME IN (SELECT DISTINCT TABNAME FROM DD02L WHERE TABCLASS = 'TRANSP')
  AND DATATYPE = 'CHAR'
  AND ROLLNAME IN (<list of data elements filtered out using the above keywords>)
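The placeholder list of data elements in the query above has to be filled in from the reviewed results of the keyword search. A minimal sketch of that step follows, assuming the DD02L column is TABCLASS; the helper name build_dd03l_query is hypothetical and the two data element names in the usage line are illustrative inputs only.

```python
def build_dd03l_query(data_elements):
    """Return the DD03L field lookup as a SQL string (hypothetical helper)."""
    if not data_elements:
        raise ValueError("at least one data element is required")
    # De-duplicate, upper-case and quote each data element name.
    names = sorted({name.strip().upper() for name in data_elements})
    quoted = ", ".join("'" + name.replace("'", "''") + "'" for name in names)
    return (
        "SELECT TABNAME, FIELDNAME, ROLLNAME, DOMNAME, DATATYPE, LENG "
        "FROM DD03L WHERE TABNAME IN "
        "(SELECT DISTINCT TABNAME FROM DD02L WHERE TABCLASS = 'TRANSP') "
        "AND DATATYPE = 'CHAR' "
        f"AND ROLLNAME IN ({quoted})"
    )


# Illustrative input; a real list would come from the reviewed DD04T results.
query = build_dd03l_query(["name1_gp", "AD_SMTPADR", "NAME1_GP"])
```

The subquery restricts the result to transparent tables and the DATATYPE condition to CHAR fields, matching the note in the foundation section.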

FAQ?

Support & maintenance

Support availability

Entitled to version upgrades

Customer customization

Additional personal data attributes

Customer-specific visualizations, etc.

Additional solution packages

Additional systems such as SAP CRM

Other system packages are also available on demand

GDPR deep scan tool foundation

Rapid deployment solution packages - SAP ERP/CRM

3 personal data attributes (Name, E-mail address, SSN)

Full system scan, Standard and Z-tables

If I search for a customer’s name “Harald”, will only a selection of tables/fields with the right attributes be searched?
Yes, the right tables and fields for this name attribute are stored in our GDPR metadata repository.
If Z_ solutions have been developed and Z_ tables/fields created, will we know whether they are related to “name”? What if the developer has given them very strange names?
Our deep scan search scripts also pick up the data elements/fields whose descriptions come nearest to the keywords. If the custom data elements/fields are completely misnamed, then there is a gap, of course.
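One way to picture "nearest description" matching is approximate string comparison between the words of a field description and the keyword list. The sketch below uses Python's standard difflib module for this; it is an assumption made for illustration, not the matching logic the tool actually uses, and the keyword subset, the function name nearest_keywords and the 0.8 cutoff are all hypothetical.

```python
import difflib

# Illustrative subset of the keyword list from the metadata preparation.
KEYWORDS = ["name", "surname", "customer", "employee", "mail"]


def nearest_keywords(description, keywords=KEYWORDS, cutoff=0.8):
    """Return the keywords that closely match any word in a field description."""
    words = description.lower().replace("_", " ").split()
    hits = set()
    for word in words:
        # Keep at most the single best keyword per word, if it is similar enough.
        hits.update(difflib.get_close_matches(word, keywords, n=1, cutoff=cutoff))
    return sorted(hits)


print(nearest_keywords("Custmer surnme"))  # misspelled but still recognizable
print(nearest_keywords("ZZ field 7"))      # completely misnamed: the gap case
```

A misspelled description such as "Custmer surnme" still maps to keywords, while a label like "ZZ field 7" yields nothing, which is the gap described in the answer above.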
In some cases, users can enter a name or a social security number in a field that is not designed for that information, for example by writing the social security number in a free-text field. Does our current search cover this? If we search for my number, will it be discovered in fields not classified as SSN fields?
No. That would require a more sophisticated unstructured search. That is a different line of solution, one that is typically unavailable for non-HANA based SAP systems.
Following on from the above question: if the answer is no today, could this be offered as some kind of "extended" search?
Yes. The complexity rises, but it is doable in a consulting mode. Out of the box, SAP doesn't offer the capability for unstructured search, so we would need to use database-specific mechanisms or TREX for non-HANA databases.
A related question is whether we can create an extended search for certain sensitive words across the full environment. These could be words indicating political opinion, labor union membership, etc., as well as words regarding an individual’s health. This is very critical GDPR information and is not allowed to exist without clear approval from the individual. All companies want to make sure no such information is stored without their knowledge.
We could take a repository of free-text fields and run the extended search on those fields only. However, it is highly unlikely that this kind of information is in an enterprise system. We would either use text analysis with open source solutions like Elasticsearch or, if the customer is already on HANA, use the HANA native text analysis functions. However, this currently falls outside the scope of the tool, which is based on structured data.
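To make the idea of an extended search over free-text fields concrete, here is a minimal sketch that checks already-extracted field values against a list of sensitive terms. It is a simple whole-word match and does not represent Elasticsearch or the HANA text analysis functions; the term list, the function name find_sensitive_terms and the table and field names in the sample rows are all hypothetical.

```python
import re

# Hypothetical examples of sensitive terms; a real list would be agreed with the customer.
SENSITIVE_TERMS = ["union", "political", "diagnosis"]


def find_sensitive_terms(rows, terms=SENSITIVE_TERMS):
    """Return (table, field, matched terms) for each free-text value containing a term."""
    # Whole-word, case-insensitive match on any of the terms.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for table, field, value in rows:
        found = sorted({match.lower() for match in pattern.findall(value)})
        if found:
            hits.append((table, field, found))
    return hits


# Sample rows as (table, field, value); names and contents are made up.
rows = [
    ("ZNOTES", "REMARK", "Member of labor union since 2010"),
    ("ZNOTES", "REMARK", "Delivery postponed to next week"),
]
hits = find_sensitive_terms(rows)
```

Restricting the scan to a repository of free-text fields, as suggested above, keeps the volume of values to inspect manageable compared with a scan of the full environment.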