Data & More Curated Privacy Classification Subscription
Empowering Organizations With a Validated Privacy Classification
As part of its software subscription, Data & More provides a fully managed, high-performance global classification system that continuously identifies privacy-sensitive data and critical security information across all your unstructured data sources and keeps those classifications up to date.
This article describes the privacy classification. For a description of the security classification, see this article: https://support.dataandmore.com/en/knowledge/data-more-curated-security-classification-subscription
Your organization benefits from an expertly curated, multilingual, and always-current classification, without needing internal specialists or ongoing maintenance.
We provide:
1. Expert Multilanguage classification team
A dedicated international team maintains and validates the entire classification set across dozens of countries and languages.
This ensures:
- Regulatory accuracy
- Linguistic and cultural correctness
- Advanced feedback loops for false positives and false negatives
- Correct mapping of privacy and security concepts unique to each region
- Customer-specific extensions when needed
- High-performance classification: 100k+ files per hour per server instance
Your organization gets expert-level precision without the cost of building or maintaining it internally.
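To put the throughput figure in perspective, the following minimal sketch estimates how long an initial full classification might take. It treats 100,000 files per hour per instance as a lower bound and assumes throughput scales linearly with the number of instances; that scaling, the function name, and the example volumes are assumptions made for illustration, not figures from Data & More.

```python
# Lower bound quoted above ("100k+ files per hour per server instance").
FILES_PER_HOUR_PER_INSTANCE = 100_000


def estimated_scan_hours(total_files: int, instances: int = 1) -> float:
    """Rough estimate of hours needed to classify total_files.

    Assumes linear scaling across instances, which is an illustrative
    simplification and not a documented guarantee.
    """
    if total_files < 0 or instances < 1:
        raise ValueError("total_files must be >= 0 and instances >= 1")
    return total_files / (FILES_PER_HOUR_PER_INSTANCE * instances)


if __name__ == "__main__":
    # Hypothetical environment: 50 million files on 4 server instances.
    hours = estimated_scan_hours(50_000_000, instances=4)
    print(f"{hours:.0f} hours (~{hours / 24:.1f} days)")
```

Under these assumptions, 50 million files on four instances would take about 125 hours. Actual times depend on file sizes, file types (OCR is typically slower than plain text), and source system limits.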
2. Global Privacy Classification system
Data & More classification models have been rigorously validated through the analysis of billions of real-world files, images, and other data types.
- Worldwide privacy entity dictionaries & databases (500k+)
- Specialized language models, distinct from large language models (LLMs), developed and trained to identify and categorize PII with precision and at scale
- Custom-built AI models for image analysis and OCR
- Thousands of advanced country- and language-specific classification rules
- Thousands of subtypes of privacy and security classification entities
Best in class: precise classification validated on real-life data
3. Weekly Global Updates & Automatic Reprofiling
With each subscription update:
- New and improved privacy classifications are released weekly
- All your data is automatically delta-classified
- High-risk content is identified
- Classification stays aligned with regulatory and industry changes
The result: Your environment remains continuously accurate and protected.
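This article does not describe how delta classification is implemented. One plausible reading is that, after an update, only the new or changed rules are evaluated against each file, while labels produced by unchanged rules are kept. The sketch below illustrates that idea only; the function name, the rule representation (a callable per rule ID), and the sample data are all hypothetical.

```python
from typing import Callable

Rule = Callable[[str], bool]  # hypothetical rule type: text in, match or no match out


def delta_classify(
    existing_labels: dict[str, set[str]],
    files: dict[str, str],
    changed_rules: dict[str, Rule],
) -> dict[str, set[str]]:
    """Re-evaluate only the changed rules and keep all other labels as they were."""
    updated: dict[str, set[str]] = {}
    for path, text in files.items():
        labels = set(existing_labels.get(path, set()))
        for rule_id, rule in changed_rules.items():
            if rule(text):
                labels.add(rule_id)      # new or improved rule now matches
            else:
                labels.discard(rule_id)  # improved rule no longer matches
        updated[path] = labels
    return updated


if __name__ == "__main__":
    existing = {"a.txt": {"health_info"}, "b.txt": set()}
    files = {"a.txt": "Sick leave note", "b.txt": "Member of Union X"}
    weekly_update = {"union_membership": lambda text: "union" in text.lower()}
    print(delta_classify(existing, files, weekly_update))
```

The design choice here is that the cost of an update is proportional to the number of changed rules rather than to the full rule set, which is what makes frequent updates practical on large data volumes.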
4. Trustworthy Foundations for Compliance & Security
Our curated classification underpins:
- GDPR Privacy compliance (Articles 5, 25, 30, 32, 35)
- Data minimization, retention & lifecycle controls
- Security policies, retention, and access-control clean-up
- AI governance (integration with Microsoft Purview and Copilot)
You get clear visibility into where privacy-sensitive and security-critical information lives.
Why It Matters
Reduce Risk. Strengthen Security. Simplify Compliance.
Organizations today face increasing regulatory pressure and rapidly growing data volumes.
A curated global classification service provides:
- Accuracy: validated on petabytes of real data rather than relying on generic DLP rules or LLM guesswork
- Consistency: a unified standard across languages, regions, and industries
- Predictability: always up-to-date classifications built by experts
- Scalability: applies seamlessly to billions of data sets
- Operational savings: no internal team needed to maintain rules or dictionaries
You gain a reliable foundation for both privacy management and security protection.
Data & More Classification Subscription Benefits at a Glance
✔ Weekly classification updates
✔ Multilanguage & country-specific coverage
✔ Privacy & critical security data identification
✔ Industry-specific extensions
✔ Customer-specific classification support
✔ Expert-driven accuracy & continuous improvement
Data & More's Privacy Classification:
Personally Identifiable Information (PII)
At Data & More, we’ve developed a comprehensive and granular approach to classifying Personally Identifiable Information (PII). Our system is grounded in a deep understanding of GDPR and other global privacy regulations, enabling us to accurately analyze and categorize data across various contexts.
We’ve broken down PII into hundreds of distinct, generic types, each representing a unique category of personal data. These subcategories allow for detailed analysis and recognition of PII in diverse countries and languages. Each country and language introduces its own complexity, including national IDs, specific official documents, specific certificates, and country-specific entities such as churches, political parties, and unions. To accommodate this, our classification system maps thousands of unique country- and language-specific PII categories.
Here is an overview of the high-level privacy document classes:
| Name | Description of Personally Identifiable Information (PII) document class |
| --- | --- |
| Criminal Behavior | Documents that include criminal offenses, convictions, police reports, or other information about someone’s criminal behavior. |
| Criminal Record | A document showing whether someone has been convicted of a criminal offense. |
| Driver’s license | Data for driving licenses that can be attributed to one or more people. Algorithms search for the unique codes that appear on driving licenses, for words found on driving licenses, and for whether a picture of a person is present. |
| Education info | Educational diplomas, exam certificates, certificates, and other data that provide information about the education of one or more individuals. |
| Employee termination | Data sets with information about the termination of an employee’s employment within an organization, including resignations, departures, layoffs, and more. The search covers data set titles that indicate resignations and words that are particularly relevant to resignations. |
| Employee warning | Data concerning internal warnings to one or more individuals due to actions that violate the specific organization’s guidelines. |
| Employment info | Data for employment agreements between employee and employer, whether the terms are described in documents or in written communication. The search covers a large collection of word combinations and phrases that are unique to employment agreements between an employee and an employer. Contracts that do not relate to employment, e.g. business leases, are excluded. |
| Ethnic orientation | Data that contain information about the ethnic orientation of one or more persons. Searches cover all known ethnic orientations as well as statements that a person comes from a certain country. |
| Grant application | Personally identifiable data that appears in applications to foundations for financial support. |
| Health card | Data containing health cards, such as the health insurance card and the blue EU health insurance card. The search requires that a social security number appears and that the file is an image. Health cards are primarily found using OCR scanning. |
| Health info | Data that provide information about the health of one or more people, such as sick leave. The search excludes general coronavirus information, safety data sheets, newsletters, internal manuals, etc. It looks for phrases that clearly indicate sick leave, a specific diagnosis, a visit to a general practitioner or similar, or medical preparations. Both a data subject and specific health information must be present. |
| Insurance info | Data for insurance documents that describe how one or more persons are insured, such as home insurance policies and accident insurance policies. |
| Location | Personal location data. |
| Misc. ID | Various forms of international identity cards or identifications. |
| National ID card | Data for ID cards belonging to identifiable persons from different countries. |
| National ID number | National ID numbers that appear in files or text, or that are found through OCR scanning, e.g. of image and PDF files. Algorithms search for numbers that meet the criteria for being a real national ID, using positive keywords such as “personal identification number” and negative keywords such as “invoice no.” |
| Passport | When passports are found in the scanned data. To achieve the categorization, the data must contain an image of a person, and the unique entry codes that appear on passports must be included. We also search for individual passport numbers, which, for example, may be included in email correspondence between multiple people. |
| Payment card | Data containing information about a person’s payment card, such as a credit card. Algorithms specifically look for the number logic that characterizes card numbers. |
| Personal certificates | Each country issues a variety of official documents in different languages for purposes such as naming, marriage, birth, partnerships, and more. These documents are unique for both their type and name, serving as essential legal records for individuals within each respective nation. These documents are found by searching for content unique to these certificates. This includes the following types of documents: |
| Political orientation | Data that contain information about one or more persons’ membership of a political party or their political affiliation. All existing political parties in your country are searched for. |
| Recruitment | Personal data that appear in solicited or unsolicited applications, as well as in CVs. This document class also contains rejections of job applications. The search covers phrases that are unique to job applications, whether solicited or unsolicited. |
| Referral consent | Data related to personal consent, where a person gives consent for their personal information to be shared with an organization or similar. |
| Religious orientation | Data that contain information about a person’s religious orientation. Searches are made for all known religious orientations and membership of state-recognised churches. |
| Salary / financial info | Data that contain information about a person’s salary, for example payslips and fee statements. Information about bonus schemes is also searched for. The search uses combinations of words that only appear on payslips, as well as phrases that appear when information is given about one or more people’s salary, such as a person’s monthly salary. |
| Sexual orientation | Scanned data that contains information about the sexual orientation of one or more persons. In the search, all sexual orientations are searched for. |
| Tax info | Data that contain a person’s tax information, especially in the form of annual statements. PDFs are specifically searched for so that documents such as annual statements are prioritized. |
| Travel info | Data that contain information about a person’s travels at specific times, such as hotel, airline and restaurant bookings. |
| Union membership | Data containing information about one or more persons’ membership of a trade union. A search is made of all existing trade unions in your country. |
| Wills | Personal data regarding one or more persons’ wills. |
| Work absence | Data concerning cases where an employee fails to work on scheduled days. |
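Two of the techniques mentioned in the table above, number logic for payment cards and negative keywords such as “invoice no.” for ID numbers, can be illustrated together. The sketch below uses the Luhn checksum, the standard check-digit scheme for payment card numbers, and suppresses candidates that follow a negative keyword. It is an illustration only and not Data & More's implementation: the regular expression, the keyword list, and the context window size are assumptions.

```python
import re

# Candidate: 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

# Hypothetical negative keywords; a real list would be larger and per-language.
NEGATIVE_KEYWORDS = ("invoice no", "order no")
CONTEXT_WINDOW = 40  # characters inspected before each candidate (assumed value)


def luhn_valid(number: str) -> bool:
    """Return True if the digits in number pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_payment_cards(text: str) -> list[str]:
    """Return Luhn-valid card-like numbers not preceded by a negative keyword."""
    found = []
    for match in CANDIDATE.finditer(text):
        start = match.start()
        context = text[max(0, start - CONTEXT_WINDOW):start].lower()
        if any(keyword in context for keyword in NEGATIVE_KEYWORDS):
            continue  # looks like an invoice or order number, not a card
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found


if __name__ == "__main__":
    print(find_payment_cards("Card: 4111 1111 1111 1111, exp 12/30"))
    print(find_payment_cards("Invoice no. 4111111111111111"))
```

The checksum removes most random digit sequences, and the keyword check handles numbers that pass the checksum by coincidence but appear in a clearly non-personal context.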

| Name | Description of Critical Security information document class |
| --- | --- |
| Passwords & Secrets | Passwords and login information for end-user access to systems, as well as keys used for encryption of communication and for machine-to-machine communication. |
| Source code | Data that expose secrets and other information that can potentially help malicious actors get access to systems and data. |
| Log files | Log files from application systems or servers. |
| Infrastructure config | Various infrastructure configuration information, including infrastructure automation such as Ansible scripts. |
| Vulnerability Assessments | Documents assessing the security of infrastructure and applications, including assessments of CVE vulnerabilities and results from penetration testing. |
| Security incidents | Security incident reports that describe and evaluate security incidents. |
| Digital certificates | Digital certificates used for authentication, encryption, signatures, etc. Certificates such as pki, pem, cert are searched for. |
| CCTV camera locations | Information about the location of CCTV cameras. |
| Security requirements | Details of security requirements for service providers working for the company. |
| Cryptographic Signature | Digital seals that confirm a file or message is authentic and hasn’t been altered. They are commonly used in signed documents, emails, or software. |
| Network access control | Information about network access, such as firewall rules, VPN settings, etc. |
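As an illustration of how certificate and key material can be recognized in text files, the sketch below looks for PEM-encoded blocks, which are delimited by `-----BEGIN <label>-----` and `-----END <label>-----` lines. Mapping private keys to the "Passwords & Secrets" class and certificates to "Digital certificates" follows the descriptions in the table above, but the function names and the mapping logic are assumptions, not the product's actual rules.

```python
import re

# A PEM block: matching BEGIN/END lines around base64 content.
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----.*?-----END \1-----",
    re.DOTALL,
)


def find_pem_labels(text: str) -> list[str]:
    """Return the label of every complete PEM block found in text."""
    return [match.group(1) for match in PEM_BLOCK.finditer(text)]


def classify_security(text: str) -> set[str]:
    """Map PEM labels to document classes (illustrative mapping only)."""
    classes = set()
    for label in find_pem_labels(text):
        if label.endswith("PRIVATE KEY"):
            classes.add("Passwords & Secrets")
        elif label == "CERTIFICATE":
            classes.add("Digital certificates")
    return classes


if __name__ == "__main__":
    # Placeholder content; a real certificate body is base64-encoded DER.
    sample = "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----"
    print(classify_security(sample))
```

Binary formats such as DER or PKCS#12 contain no text markers, so a check like this covers only text-encoded material.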