Module 2: PII Control Categories

Collection Limitation

ISO 27018 requires organizations to collect only the PII that is necessary and relevant for the documented purposes. This principle protects individuals by preventing excessive data gathering.

Core Principle

Collection Limitation Definition: "PII shall be collected only to the extent necessary to fulfill the purposes for which it is being processed."

Key Requirements

  1. Necessity Test - Every data field must be justified
  2. Relevance Test - Data must be relevant to the purpose
  3. Adequacy Test - Collected data must be sufficient for the purpose
  4. Proportionality - Balance collection with purpose and risk

Control CLD.6.5: Collection Limitation

ISO 27018 Requirement: "The organization shall limit the collection of PII to what is adequate, relevant and necessary for the purposes identified."

The Three Questions

Before collecting any PII field, ask:

  1. Is it necessary? - Can we fulfill the purpose without it?
  2. Is it adequate? - Do we need this much detail?
  3. Is it relevant? - Does it directly relate to our purpose?

Data Collection Matrix

Essential vs. Optional

Essential PII (Cannot Provide Service Without):

  • Account credentials (email, password)
  • Payment information (for paid services)
  • Legal identity (for regulated services)
  • Contact information (for service delivery)

Optional PII (Nice to Have):

  • Demographics (age, gender)
  • Preferences (marketing interests)
  • Secondary contact methods
  • Extended profile information

Never Collect Without a Specific Need:

  • Government ID numbers (SSN, passport)
  • Biometric data
  • Health information
  • Financial details beyond payment
  • Precise geolocation
  • Children's data

Implementation Framework

Step 1: Data Mapping

Create Collection Inventory:

PII Collection Register:

Data Field: [Name of field]
PII Category: [Type - name, email, phone, etc.]
Sensitivity Level: [Low/Medium/High/Critical]
Collection Point: [Where/when collected]
Purpose: [Why it's collected]
Necessity Justification: [Why it's required]
Alternatives Considered: [Less invasive options]
Retention Period: [How long kept]
Access Controls: [Who can access]
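
A register is easier to keep complete when each entry is stored as structured data and checked automatically. The sketch below is one possible way to do that. The property names, the `validateRegisterEntry` helper, and the example values are assumptions made for illustration; they mirror the template above and are not prescribed by ISO 27018.

```javascript
// Hypothetical property names, one per line of the register template.
const REQUIRED_KEYS = [
  "dataField", "piiCategory", "sensitivityLevel", "collectionPoint",
  "purpose", "necessityJustification", "alternativesConsidered",
  "retentionPeriod", "accessControls",
];
const SENSITIVITY_LEVELS = ["Low", "Medium", "High", "Critical"];

// Returns a list of problems; an empty list means the entry is complete.
function validateRegisterEntry(entry) {
  const problems = [];
  for (const key of REQUIRED_KEYS) {
    const value = entry[key];
    if (typeof value !== "string" || value.trim() === "") {
      problems.push(`Missing value for "${key}"`);
    }
  }
  if (entry.sensitivityLevel && !SENSITIVITY_LEVELS.includes(entry.sensitivityLevel)) {
    problems.push(`Unknown sensitivity level "${entry.sensitivityLevel}"`);
  }
  return problems;
}

// Example entry with illustrative values only.
const phoneEntry = {
  dataField: "phone",
  piiCategory: "Phone number",
  sensitivityLevel: "Medium",
  collectionPoint: "Account settings (optional)",
  purpose: "SMS notification if email delivery fails",
  necessityJustification: "Fallback contact channel chosen by the user",
  alternativesConsidered: "Email only; in-app notification",
  retentionPeriod: "Until removed by the user or account closure",
  accessControls: "Support and notification service only",
};

console.log(validateRegisterEntry(phoneEntry)); // []
```

An entry with a blank Necessity Justification or Alternatives Considered is reported instead of being silently accepted, which gives reviewers a concrete list to work through.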

Step 2: Minimize Collection Points

Registration Form Analysis:

Over-Collection Example:

Registration Form:
- First Name*
- Middle Name*
- Last Name*
- Email*
- Phone*
- Mobile Phone*
- Home Address*
- Work Address*
- Date of Birth*
- Gender*
- Occupation*
- Annual Income*
- Social Security Number*
- Emergency Contact*

Minimized Collection:

Registration Form:
- Full Name*
- Email*
- Phone (optional)
- Country* (for data residency compliance)

Step 3: Progressive Collection

Collect Data When Needed:

At Registration:

  • Email (authentication)
  • Name (personalization)
  • Country (compliance)

At First Use:

  • Payment info (if paid service)
  • Additional profile (if user opts in)

At Feature Use:

  • Location (if using location features)
  • Contacts (if using sharing features)

Never:

  • Speculative data collection
  • "We might need this later"
  • Data for undefined purposes
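
The stages above can be sketched as a small lookup that decides which fields to request at each point. The stage keys, field names, and context flags below are hypothetical names chosen for this example, not part of any standard.

```javascript
// Hypothetical stage keys and field names, taken from the stages listed above.
const FIELDS_BY_STAGE = {
  registration: ["email", "name", "country"],
  firstUse: ["paymentInfo", "additionalProfile"],
  featureUse: ["location", "contacts"],
};

// Conditions under which a later-stage field is actually requested.
// Fields without a condition are always collected at their stage.
const CONDITIONS = {
  paymentInfo: (ctx) => ctx.paidService === true,
  additionalProfile: (ctx) => ctx.optedIntoProfile === true,
  location: (ctx) => ctx.usesLocationFeatures === true,
  contacts: (ctx) => ctx.usesSharingFeatures === true,
};

// Returns the fields to request for a stage, given what the user is doing.
function fieldsToCollect(stage, ctx = {}) {
  const fields = FIELDS_BY_STAGE[stage];
  if (!fields) throw new Error(`Unknown stage: ${stage}`);
  return fields.filter((field) => !CONDITIONS[field] || CONDITIONS[field](ctx));
}

console.log(fieldsToCollect("registration"));
// [ 'email', 'name', 'country' ]
console.log(fieldsToCollect("featureUse", { usesLocationFeatures: true }));
// [ 'location' ]
```

Because every conditional field defaults to "not collected", a field with no matching trigger is never requested speculatively.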

Technical Implementation

Form Design Principles

1. Required vs. Optional Fields

<form>
  <!-- Required for service -->
  <input type="email" name="email" required aria-label="Email (required)" />
  <input type="text" name="fullName" required aria-label="Full Name (required)" />

  <!-- Optional for enhanced experience -->
  <input type="tel" name="phone" aria-label="Phone (optional)" />
  <input type="text" name="company" aria-label="Company (optional)" />
</form>

2. Clear Explanations

Field: Phone Number [Optional]
Why we ask: To send SMS notifications if email delivery fails
Your choice: You can use the service without providing this

3. Conditional Collection

// Only collect shipping address if physical product
if (orderType === "physical") {
  collectShippingAddress();
} else {
  // Digital delivery - no address needed
  skipShippingAddress();
}

API Design for Minimal Collection

RESTful API Example:

// ❌ Wrong: Collecting everything
POST /api/users
{
  "firstName": "John",
  "middleName": "Robert",
  "lastName": "Doe",
  "email": "john@example.com",
  "phone": "+1234567890",
  "mobilePhone": "+1987654321",
  "homeAddress": {...},
  "workAddress": {...},
  "dob": "1990-01-01",
  "ssn": "123-45-6789",
  "occupation": "Engineer",
  "income": "100000"
}

// ✓ Right: Minimal necessary data
POST /api/users
{
  "name": "John Doe",
  "email": "john@example.com",
  "country": "US"  // For data residency
}

// Additional data collected only when needed
POST /api/users/{id}/payment-methods
{
  "cardToken": "tok_xxx"  // Tokenized, not raw card data
}
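
On the server side, minimal collection can be enforced with an allow-list so that extra fields are never stored, even if a client sends them. The following is a minimal sketch under that assumption; `ALLOWED_USER_FIELDS` and `filterUserPayload` are hypothetical names, and the allowed fields simply match the minimal payload above.

```javascript
// Hypothetical allow-list for POST /api/users.
const ALLOWED_USER_FIELDS = new Set(["name", "email", "country"]);

// Splits an incoming payload into accepted fields and rejected field names.
function filterUserPayload(payload) {
  const accepted = {};
  const rejected = [];
  for (const [key, value] of Object.entries(payload)) {
    if (ALLOWED_USER_FIELDS.has(key)) {
      accepted[key] = value;
    } else {
      rejected.push(key);
    }
  }
  return { accepted, rejected };
}

const result = filterUserPayload({
  name: "John Doe",
  email: "john@example.com",
  country: "US",
  ssn: "123-45-6789",
});
console.log(result.rejected); // [ 'ssn' ]
```

Whether the handler drops rejected fields silently or returns an error listing them is a design choice; returning an error makes over-collection by a client visible early.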

Sensitivity-Based Collection

PII Sensitivity Levels

Level 1: Low Sensitivity

  • First name only
  • Country
  • Language preference
  • General interests

Level 2: Medium Sensitivity

  • Full name
  • Email address
  • Phone number
  • Job title
  • Company name

Level 3: High Sensitivity

  • Precise location
  • Financial information
  • Government ID numbers
  • Authentication credentials
  • Private communications

Level 4: Critical Sensitivity (Special Category)

  • Health/medical information
  • Biometric data
  • Racial/ethnic origin
  • Religious beliefs
  • Sexual orientation
  • Trade union membership
  • Criminal history

Collection Rules by Sensitivity

Level    | Necessity           | Legal Basis                     | Protections
Low      | Convenience         | Legitimate interest             | Standard encryption
Medium   | Operational need    | Contract or consent             | Encryption + access controls
High     | Essential           | Contract + strong justification | Encryption + strict access + audit logs
Critical | Absolutely critical | Explicit consent + legal basis  | Maximum security + segregation + audit
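
The rules above can also be encoded as data so that a design review can check a proposed field against its sensitivity level. This is an illustrative sketch; `COLLECTION_RULES` and `missingProtections` are hypothetical names, and the protection labels are copied from the rules above rather than from any standard.

```javascript
// One entry per sensitivity level, mirroring the collection rules above.
const COLLECTION_RULES = {
  Low: {
    necessity: "Convenience",
    legalBasis: "Legitimate interest",
    protections: ["Standard encryption"],
  },
  Medium: {
    necessity: "Operational need",
    legalBasis: "Contract or consent",
    protections: ["Encryption", "Access controls"],
  },
  High: {
    necessity: "Essential",
    legalBasis: "Contract + strong justification",
    protections: ["Encryption", "Strict access", "Audit logs"],
  },
  Critical: {
    necessity: "Absolutely critical",
    legalBasis: "Explicit consent + legal basis",
    protections: ["Maximum security", "Segregation", "Audit"],
  },
};

// Returns the protections required for a level that are not yet implemented.
function missingProtections(level, implemented) {
  const rule = COLLECTION_RULES[level];
  if (!rule) throw new Error(`Unknown sensitivity level: ${level}`);
  const have = new Set(implemented.map((p) => p.toLowerCase()));
  return rule.protections.filter((p) => !have.has(p.toLowerCase()));
}

console.log(missingProtections("High", ["encryption"]));
// [ 'Strict access', 'Audit logs' ]
```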

Alternative Collection Methods

Data Minimization Techniques

1. Anonymization
Instead of: Collecting full user profiles
Use: Anonymous usage analytics

// ❌ Identifiable
{
  userId: "12345",
  name: "John Doe",
  email: "john@example.com",
  pageVisited: "/pricing",
  timestamp: "2025-12-08T10:00:00Z"
}

// ✓ Anonymous
{
  sessionId: "random_uuid",
  pageVisited: "/pricing",
  timestamp: "2025-12-08T10:00:00Z"
}
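
One way to produce the anonymous record is to copy only allow-listed properties and attach a random session ID, as in the sketch below. `ANALYTICS_FIELDS` and `toAnonymousEvent` are hypothetical names. The approach assumes the session ID is never stored alongside the account; if it can be linked back to a user, the data is pseudonymous rather than anonymous.

```javascript
const { randomUUID } = require("node:crypto");

// Hypothetical allow-list: the only properties the analytics pipeline may keep.
const ANALYTICS_FIELDS = ["pageVisited", "timestamp"];

// Copies allow-listed fields and attaches a random session ID that is
// not derived from the user ID, name, or email.
function toAnonymousEvent(event, sessionId = randomUUID()) {
  const anonymous = { sessionId };
  for (const field of ANALYTICS_FIELDS) {
    if (field in event) anonymous[field] = event[field];
  }
  return anonymous;
}

const anonymousEvent = toAnonymousEvent({
  userId: "12345",
  name: "John Doe",
  email: "john@example.com",
  pageVisited: "/pricing",
  timestamp: "2025-12-08T10:00:00Z",
});
console.log(Object.keys(anonymousEvent));
// [ 'sessionId', 'pageVisited', 'timestamp' ]
```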

2. Pseudonymization
Instead of: Using real names in all systems
Use: Internal IDs with limited linkage

// Customer-facing system
{
  internalId: "usr_a7b3c9d1",
  email: "john@example.com"  // Only where necessary
}

// Analytics system
{
  internalId: "usr_a7b3c9d1",  // Not linkable to an identity without the customer-facing system
  actions: [...]
}
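
The lesson does not say how internal IDs are generated. One common option, shown here as an assumption rather than the course's method, is a keyed hash (HMAC) of the real user ID: the same user always maps to the same pseudonym, and re-linking requires the secret key. The `pseudonymize` helper and the key value are hypothetical; in practice the key would live in a secrets manager, outside the analytics system.

```javascript
const { createHmac } = require("node:crypto");

// Derives a stable pseudonym from a user ID using HMAC-SHA256.
// Without the secret key, the pseudonym cannot be recomputed from the user ID.
function pseudonymize(userId, secretKey) {
  const digest = createHmac("sha256", secretKey)
    .update(String(userId))
    .digest("hex");
  return `usr_${digest.slice(0, 16)}`; // truncated for readability
}

// Example key for illustration only; never hard-code a real key.
const exampleKey = "example-key-for-illustration-only";
const pseudonymA = pseudonymize("12345", exampleKey);
const pseudonymB = pseudonymize("12345", exampleKey);
console.log(pseudonymA === pseudonymB); // true
```

Rotating the key breaks linkage between old and new pseudonyms, which can be useful when analytics data should not be joined across retention periods.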

3. Aggregation
Instead of: Individual records
Use: Aggregated statistics

// ❌ Individual data
const users = [
  { age: 25, location: "New York", income: 75000 },
  { age: 30, location: "Boston", income: 85000 },
  ...
]

// ✓ Aggregated
const statistics = {
  ageGroups: { "25-30": 45, "31-35": 67, ... },
  locationCounts: { "Northeast": 112, "West": 89, ... },
  incomeRanges: { "70-80k": 34, "80-90k": 28, ... }
}
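
A minimal sketch of the aggregation step for one attribute follows. The `aggregateAges` helper, the five-year bucket width, and the minimum group size are assumptions chosen for illustration; suppressing very small groups is a common safeguard because a group of one still describes an individual.

```javascript
// Counts users per age bucket and drops buckets smaller than minCount.
function aggregateAges(users, width = 5, minCount = 2) {
  const counts = {};
  for (const { age } of users) {
    const start = Math.floor(age / width) * width;
    const label = `${start}-${start + width - 1}`;
    counts[label] = (counts[label] || 0) + 1;
  }
  // Keep only groups large enough that no single person stands out.
  return Object.fromEntries(
    Object.entries(counts).filter(([, count]) => count >= minCount)
  );
}

const sampleUsers = [{ age: 25 }, { age: 27 }, { age: 30 }];
console.log(aggregateAges(sampleUsers)); // { '25-29': 2 }
```

Only the aggregated object needs to be stored; the individual records can be discarded once the counts are computed.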

4. Tokenization
Instead of: Storing raw sensitive data
Use: Tokens that can only be resolved through a secured vault

// Payment processing
{
  customerId: "12345",
  paymentToken: "tok_1abc2def3ghi",  // Reference to vault
  // Raw card number never stored
}
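
To show the idea behind the token reference, here is a deliberately simplified in-memory vault. `TokenVault` is a hypothetical class written for this lesson; real card data would normally be tokenized by a payment provider's vault rather than by application code, and nothing here should be read as that provider's API.

```javascript
const { randomBytes } = require("node:crypto");

// Illustrative in-memory vault: maps random tokens to the original values.
class TokenVault {
  #store = new Map();

  // Returns a random token that carries no information about the value.
  tokenize(value) {
    const token = `tok_${randomBytes(12).toString("hex")}`;
    this.#store.set(token, value);
    return token;
  }

  // Only code with access to the vault can recover the original value.
  detokenize(token) {
    if (!this.#store.has(token)) throw new Error("Unknown token");
    return this.#store.get(token);
  }
}

const vault = new TokenVault();
const cardToken = vault.tokenize("4242424242424242");
const customerRecord = { customerId: "12345", paymentToken: cardToken };
console.log(customerRecord.paymentToken.startsWith("tok_")); // true
```

The customer record holds only the token, so a breach of the application database alone does not expose the underlying value.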

Industry-Specific Guidelines

SaaS Applications

Minimal Collection:

  • Email (authentication)
  • Name (personalization)
  • Organization name (if B2B)
  • Payment method (tokenized)

Avoid Collecting:

  • Personal phone numbers
  • Home addresses
  • Demographic data (unless product-relevant)
  • Social security numbers

E-Commerce

Minimal Collection:

  • Email (order confirmation)
  • Shipping address (if physical goods)
  • Payment info (tokenized)

Avoid Collecting:

  • Date of birth (unless age-restricted)
  • Phone (unless delivery requires)
  • Gender (unless product-relevant)

Healthcare Cloud Services

Minimal Collection:

  • Patient identifiers
  • Medical data relevant to service
  • Provider credentials

Strict Limitations:

  • Collect only data necessary for treatment/care
  • No marketing data collection
  • Segregate administrative vs. medical data

HR/Payroll Systems

Minimal Collection:

  • Employee identity
  • Tax information (as required by law)
  • Payment details
  • Work history (job-relevant)

Avoid Collecting:

  • Social media profiles
  • Personal financial information
  • Non-work relationships
  • Irrelevant medical information

Collection Limitation Checklist

Before Collecting Any PII:

  • Documented purpose exists for this data
  • Data is necessary to fulfill purpose
  • No less invasive alternative available
  • Appropriate legal basis exists
  • Retention period defined
  • Security measures adequate for sensitivity
  • Access controls defined
  • Privacy notice covers this collection
  • Consent obtained if required
  • Data subjects informed of collection

Regular Reviews:

  • Quarterly review of all collected fields
  • Remove fields no longer necessary
  • Update forms to minimize collection
  • Check for "scope creep" in data collection
  • Verify alternatives weren't overlooked
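
The quarterly review can be supported by a small script that lists fields whose last review is overdue. The sketch below assumes each inventory entry records a `lastReviewed` date, which is a hypothetical addition to the register, and treats a quarter as 90 days.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the names of fields not reviewed within the last intervalDays.
function fieldsDueForReview(entries, today, intervalDays = 90) {
  return entries
    .filter((entry) => {
      const ageInDays = (today - new Date(entry.lastReviewed)) / MS_PER_DAY;
      return ageInDays > intervalDays;
    })
    .map((entry) => entry.dataField);
}

// Illustrative inventory entries.
const inventory = [
  { dataField: "email", lastReviewed: "2025-05-01" },
  { dataField: "phone", lastReviewed: "2025-01-01" },
];
console.log(fieldsDueForReview(inventory, new Date("2025-06-01")));
// [ 'phone' ]
```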

Common Pitfalls

1. "We Might Need It Later"

❌ Wrong Thinking: Collect everything in case it's useful someday
✓ Right Approach: Collect only what's needed now; add collection points when new needs arise

2. Copy-Paste Forms

❌ Wrong Practice: Using the same detailed form for all services
✓ Right Approach: Custom forms tailored to specific service needs

3. Hidden Collection

❌ Wrong Practice: Collecting data without clear disclosure
✓ Right Approach: Transparent collection with clear explanations

4. Default to Required

❌ Wrong Practice: Making all fields required unless questioned
✓ Right Approach: Default to optional; require only essential fields

Audit Evidence

What Auditors Look For:

  • Data collection inventory
  • Necessity justification for each field
  • Evidence of alternatives considered
  • Form designs showing minimal collection
  • Privacy notices explaining collection
  • Regular review documentation
  • Examples of progressive collection
  • Removal of unnecessary fields

Self-Assessment Questions

  1. Can you justify every PII field you collect?
  2. Have you considered less invasive alternatives?
  3. Do you collect data "just in case"?
  4. Are all fields marked as required actually necessary?
  5. Do you collect sensitive data unnecessarily?
  6. Have you removed fields that are no longer needed?
  7. Do you explain why you collect each piece of data?
  8. Do you use progressive collection instead of upfront collection?

Case Study: SaaS Project Management Tool

Initial Over-Collection (Before ISO 27018):

Registration Required:
- First, Middle, Last Name
- Personal Email
- Work Email
- Personal Phone
- Work Phone
- Home Address
- Work Address
- Date of Birth
- Gender
- Occupation
- Company Size
- Annual Revenue
- Social Media Profiles
- Emergency Contact

After Collection Limitation Review:

Registration Required:
- Full Name
- Email
- Organization Name (for workspace creation)

Optional (Collected When Relevant):
- Phone (for 2FA, if user enables)
- Billing Address (when purchasing paid plan)
- Payment Method (tokenized, when upgrading)

Never Collected:
- Personal addresses
- Date of birth
- Demographic data
- Social media profiles
- Emergency contacts

Results:

  • About 80% fewer fields required at registration (14 entries reduced to 3)
  • Improved signup conversion rate
  • Reduced storage and security costs
  • Easier GDPR/privacy compliance
  • Lower data breach risk

Next Lesson: Data minimization techniques and implementation.
