Data governance in a small serverless application is often treated either as not applicable ("we're too small to worry about GDPR") or as a compliance theatre exercise ("we added a cookie banner, we're done"). Both approaches are wrong, and both carry real risk.

The practical reality is that GDPR compliance for a simple serverless app is not complicated. It requires knowing what personal data you hold, where it lives, how long it's retained, and how to delete it on request. That's it. This article covers the patterns that make these four requirements easy to fulfil.

Data classification: the foundation

Before you can protect data, you need to know what you have. A simple classification taxonomy for serverless apps:

Class              | Examples                                     | Requirements
Personal           | Email addresses, IP addresses, user IDs      | GDPR Article 6 lawful basis; deletion on request; retention limits
Sensitive personal | Health data, biometrics, political opinions  | Explicit consent; separate storage; enhanced audit logging
Internal           | Scan results, configuration, logs            | Access control; retention policy; no external sharing
Public             | Published articles, scan summaries           | No special handling required

For ticketyboo.dev, we hold two categories of personal data: email addresses (newsletter subscribers) and IP addresses (rate limiting records). Both are documented, both have retention limits, and both have deletion mechanisms.
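
One way to keep that documentation honest is to hold the inventory as data next to the code. The sketch below is illustrative only: the `DataAsset` type, its field names, and the `undocumented` helper are assumptions, and the two entries simply restate the retention and deletion details given in this article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataAsset:
    """One entry in a personal-data inventory (hypothetical structure)."""
    name: str
    data_class: str                 # e.g. "personal", "internal"
    purpose: str
    retention_days: Optional[int]   # None means "kept until the user asks for deletion"
    deletion: str


INVENTORY = [
    DataAsset("email address", "personal", "newsletter", None,
              "hard delete on unsubscribe"),
    DataAsset("IP address", "personal", "rate limiting", 1,
              "DynamoDB TTL"),
]


def undocumented(assets):
    """Return the names of assets missing a purpose or a deletion mechanism."""
    return [a.name for a in assets if not a.purpose or not a.deletion]


print(undocumented(INVENTORY))
```

A check like `undocumented` could run in CI so that a new kind of personal data cannot be added without a stated purpose and deletion path.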

Lawful basis: know why you hold what you hold

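Under GDPR Article 6, every use of personal data must have a documented lawful basis. The most relevant bases for a developer tool are consent (Article 6(1)(a)), which covers newsletter email addresses, and legitimate interests (Article 6(1)(f)), which covers IP addresses held for rate limiting.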

The hard delete rule: When a user unsubscribes, delete their record entirely. Do not soft-delete, do not mark as "unsubscribed", do not retain for "analytics". A soft-delete that keeps the email address is not compliant with the right to erasure (Article 17). The DynamoDB pattern is a single call: delete_item(Key={"PK": f"SUB#{email}", "SK": "META"}). Done. No archive table, no audit log of the email itself.
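
The hard delete can be sketched as a small helper. This is a minimal sketch, assuming `table` is a boto3 DynamoDB `Table` resource (whose `delete_item` takes a `Key` dictionary); the function names are hypothetical, and the key layout is the `SUB#{email}` / `META` pattern described above.

```python
def subscriber_key(email: str) -> dict:
    """Build the primary key of a newsletter subscriber item."""
    return {"PK": f"SUB#{email}", "SK": "META"}


def hard_delete_subscriber(table, email: str) -> None:
    """Delete the subscriber item outright; nothing is archived or flagged."""
    # `table` is assumed to behave like a boto3 DynamoDB Table resource.
    table.delete_item(Key=subscriber_key(email))
```

Because the helper only needs an object with a `delete_item` method, it can be exercised in unit tests with a stub instead of a live table.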

Data residency in serverless

In a traditional multi-region architecture, data residency (keeping EU personal data in the EU) requires careful Lambda → database routing. In a single-region serverless architecture, it's simple: choose an EU region for all personal data storage, and document that choice.

ticketyboo.dev uses eu-north-1 (Stockholm) for all data storage. The ACM certificate is in us-east-1 (required for CloudFront), but the certificate stores no personal data: it is public key material. CloudFront caches static assets at edge locations globally, but personal data only passes through those edge nodes in transit; it is stored only in eu-north-1.
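
A region choice is easier to keep than to remember, so it can help to encode it as a guard in configuration code. The following is a sketch under stated assumptions: the `EU_REGIONS` allow-list and `require_eu_region` are hypothetical names, and the list reflects AWS regions located in EU member states as far as we know, so check it against current AWS documentation before relying on it.

```python
# AWS regions located in EU member states (assumed list, verify before use).
# An "eu-" prefix alone is not enough: eu-west-2 (London) and
# eu-central-2 (Zurich) are outside the EU.
EU_REGIONS = frozenset({
    "eu-north-1",    # Stockholm
    "eu-west-1",     # Ireland
    "eu-west-3",     # Paris
    "eu-central-1",  # Frankfurt
    "eu-south-1",    # Milan
    "eu-south-2",    # Spain
})


def require_eu_region(region: str) -> str:
    """Return the region unchanged, or raise if it is not on the EU allow-list."""
    if region not in EU_REGIONS:
        raise ValueError(f"{region!r} is not an approved EU region for personal data")
    return region


DATA_REGION = require_eu_region("eu-north-1")
```

Passing `DATA_REGION` wherever a storage client is created means a mistyped or copied region fails at start-up rather than silently storing data elsewhere.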

Document this in your privacy notice. Users deserve to know where their data lives.

TTL-based retention

DynamoDB's TTL feature (automatic item deletion based on a Unix timestamp attribute) is the cleanest way to enforce retention limits in a serverless architecture. No cron jobs, no batch delete scripts, no forgetting to run the cleanup.

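import time

SECONDS_PER_DAY = 86400

def ttl_days(days: int) -> int:
    """Return the Unix timestamp (seconds) N days from now, for DynamoDB TTL."""
    return int(time.time()) + days * SECONDS_PER_DAY

# Illustrative placeholder values; in a real handler these come from the request.
scan_id = 'example-scan-id'
ip = '203.0.113.7'
timestamp = int(time.time())
email = 'user@example.com'

# Scan records: retained for 90 days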
item = {
    'PK': f'SCAN#{scan_id}',
    'SK': 'META',
    'ttl': ttl_days(90),
    # ...
}

# Rate limit records: retained for 24 hours
rate_item = {
    'PK': f'RATELIMIT#{ip}',
    'SK': f'REQ#{timestamp}',
    'ttl': ttl_days(1),
    # ...
}

# Newsletter subscriptions: NO TTL (retained until unsubscribe)
subscription = {
    'PK': f'SUB#{email}',
    'SK': 'META',
    # no 'ttl' attribute
}
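
One caveat with TTL: DynamoDB removes expired items with a background process rather than at the exact expiry time, and AWS documents that this typically happens within a few days. Until then, expired items can still appear in reads. A minimal sketch of a read-side filter follows, assuming the TTL attribute is named `ttl` as in the items above; `is_live` is a hypothetical helper name.

```python
import time
from typing import Optional


def is_live(item: dict, now: Optional[int] = None) -> bool:
    """Return True if the item has no 'ttl' or its 'ttl' is still in the future."""
    if now is None:
        now = int(time.time())
    ttl = item.get("ttl")
    # int() also handles the Decimal values boto3 returns for numbers.
    return ttl is None or int(ttl) > now


items = [
    {"PK": "RATELIMIT#203.0.113.7", "SK": "REQ#1", "ttl": 1_000},  # already expired
    {"PK": "SUB#user@example.com", "SK": "META"},                  # no TTL attribute
]
live = [i for i in items if is_live(i, now=2_000)]
```

Filtering this way keeps application behaviour aligned with the stated retention period even when physical deletion lags behind.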

IP address handling

IP addresses are personal data under GDPR (they can identify a natural person, especially combined with a timestamp). Rate limiting requires tracking IPs, but the retention period should be minimised.

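Our approach: store IP addresses with a 24-hour TTL in DynamoDB. We don't hash them: hashing a 4-byte IPv4 address is not anonymisation, because the input space is too small (an attacker can hash all 2^32 possible addresses and match the result). We store the IP in plaintext with a short TTL and keep no copies in logs beyond the CloudWatch retention period. CloudWatch log groups never expire by default, so that retention period has to be set explicitly.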
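
To see why hashing does not anonymise an IPv4 address, consider the brute-force sketch below. It assumes an unsalted SHA-256 hash and, so that it runs instantly, searches only a /24 documentation range; the function names are hypothetical. Searching the full IPv4 space is the same loop over about 4.3 billion candidates.

```python
import hashlib
import ipaddress
from typing import Optional


def sha256_hex(ip: str) -> str:
    """Hash an IP address string with unsalted SHA-256."""
    return hashlib.sha256(ip.encode("ascii")).hexdigest()


def reverse_hash(target: str, network: str) -> Optional[str]:
    """Recover a hashed IPv4 address by hashing every address in `network`."""
    for addr in ipaddress.ip_network(network):
        if sha256_hex(str(addr)) == target:
            return str(addr)
    return None


leaked = sha256_hex("203.0.113.77")
recovered = reverse_hash(leaked, "203.0.113.0/24")
```

Here `recovered` is the original address, which is why a hashed IP should still be treated as personal data.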

Scan results are stored with the requester IP for rate limiting correlation, but the IP is redacted from any externally visible scan report.
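
The redaction step can be sketched as a filter applied before a record leaves the backend. This is an illustration only: the attribute name `requester_ip` and the function `public_report` are assumptions, not the scanner's actual schema.

```python
# Attributes that must never appear in an externally visible report.
# "requester_ip" is a hypothetical attribute name.
INTERNAL_FIELDS = frozenset({"requester_ip"})


def public_report(scan_record: dict) -> dict:
    """Return a copy of the scan record without internal-only fields."""
    return {k: v for k, v in scan_record.items() if k not in INTERNAL_FIELDS}


record = {"PK": "SCAN#example-scan-id", "SK": "META",
          "requester_ip": "203.0.113.7", "findings": 3}
report = public_report(record)
```

Building the public view from an explicit deny-list (or, more strictly, an allow-list) keeps the stored record intact while making the redaction testable.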

Privacy notice: what to include

A GDPR-compliant privacy notice for a simple serverless app doesn't need to be long. It needs to answer five questions:

  1. What personal data do you collect? (email, IP)
  2. Why? (newsletter consent, rate limiting as legitimate interest)
  3. Where is it stored? (AWS DynamoDB, eu-north-1 / Stockholm)
  4. How long is it retained? (email: until unsubscribe; IP: 24 hours; scan results: 90 days)
  5. How can users delete it? (unsubscribe link = hard delete; scan results expire automatically)

The ticketyboo.dev footer includes all five answers in under 100 words. That's all a simple data controller needs.

Data governance in the scanner

The scanner checks repositories for common data governance violations:
