
1. Architecture Proposal

Executive summary

This document presents a pragmatic approach to protecting customer Personally Identifiable Information (PII) for a retail/e-commerce business with over one million customer records. It focuses on two goals: preventing data leaks and tracing their source when incidents occur.

Building a full PII Vault is costly, high-risk, and slow; buying a commercial product brings license costs and vendor lock-in. The hybrid approach instead uses the database’s built-in security features (encryption, masking, access control) and adds a centralized audit and access-control layer. That layer addresses the core pain point: today nobody knows who read which customer’s data, when, and why.

The problem

Three observed symptoms share one root cause:

  • Exposed and scattered data: PII (name, phone, email, address) is stored in plaintext and spread across the CRM, the order DB, logs, manual Excel exports, backups, and partner services.
  • No access log: When a leak occurs, there is no evidence to identify the source, so every investigation starts blind.
  • No purpose-based access control: Too many people and services can read almost everything, and nothing records why a record was accessed.

Legal implication: Decree 13/2023/ND-CP on personal data protection requires processing data for the correct purpose, with control and traceability. The lack of logs and access control makes compliance hard to demonstrate during audits.

Hybrid architecture overview

The solution has two complementary pillars.

Pillar A — DB-native + Masking (data layer)

  • TDE (Transparent Data Encryption): encrypts all data at rest.
  • Column / Field-level Encryption: encrypts sensitive columns with separately managed keys.
  • Dynamic Data Masking: masks data by role at query time.
  • Row-Level Security (RLS): restricts each role to rows in its scope.
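
To make the last two items concrete, here is a minimal Python sketch of the behaviour they are meant to provide: rows are filtered by the caller's scope, and sensitive fields are masked unless the role is allowed to see them. In the proposed architecture the database's own masking and RLS features would enforce this, not application code. The role names, the `region` scoping column, and the masking patterns are all hypothetical and chosen only for illustration.

```python
# Illustrative only: emulates role-based masking and row scoping in plain
# Python. Role names, regions, and masking patterns are assumptions.

ROLE_REGIONS = {
    "support_hn": {"HN"},
    "support_hcm": {"HCM"},
    "dpo": {"HN", "HCM"},
}
UNMASKED_ROLES = {"dpo"}  # roles allowed to see raw values


def mask_phone(phone: str) -> str:
    """Keep the last 3 characters; mask everything if the value is too short."""
    value = phone.strip()
    if len(value) <= 3:
        return "*" * len(value)
    return "*" * (len(value) - 3) + value[-3:]


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the full domain."""
    local, sep, domain = email.partition("@")
    if not sep:
        return "*" * len(email)
    return local[:1] + "*" * max(len(local) - 1, 0) + "@" + domain


def query_customers(rows: list[dict], role: str) -> list[dict]:
    """Return only rows in the role's scope, masked unless the role is exempt."""
    allowed = ROLE_REGIONS.get(role, set())  # unknown role: no rows (default-deny)
    result = []
    for row in rows:
        if row["region"] not in allowed:
            continue  # row scoping, as RLS would do
        if role in UNMASKED_ROLES:
            result.append(dict(row))
        else:
            result.append({
                **row,
                "phone": mask_phone(row["phone"]),
                "email": mask_email(row["email"]),
            })
    return result
```

With this sketch, a caller with the hypothetical role `support_hn` sees only HN customers, with masked phone and email values, while an unknown role sees nothing.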

Pillar B — Centralized Audit & Access-Control layer

This pillar is the main differentiator. Every query that touches PII goes through a central layer that records it, leaving an immutable trail:

  • Access gateway / Data Access Layer: apps read PII via a shared service instead of querying sensitive tables directly.
  • Purpose binding: every PII read request must include a reason.
  • Immutable audit log (append-only, hash-chained): each access records who · what · when · why · result.
  • RBAC + least-privilege, default-deny: sensitive operations require “four-eyes” approval.
  • Real-time anomaly detection & alerting.
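
The first four items can be sketched together in a few lines of Python. This is an illustration under stated assumptions, not the proposed implementation: the class names (`AuditLog`, `PiiGateway`), the `ALLOWED` grant table, and the in-memory store are hypothetical. An in-memory list is also not truly immutable; the hash chain only makes tampering detectable, and a real deployment would add write-once storage. Four-eyes approval and alerting are left out.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log in which each entry commits to the previous entry's hash."""

    def __init__(self):
        self._entries = []

    @property
    def entries(self):
        return list(self._entries)

    def append(self, who, what, why, result):
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        body = {
            "who": who, "what": what, "why": why, "result": result,
            "when": datetime.now(timezone.utc).isoformat(),
            "prev_hash": prev_hash,
        }
        entry = {**body, "hash": _digest(body)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS_HASH
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True


# Hypothetical grants: (role, purpose) pairs allowed to read PII.
# Anything not listed is denied (default-deny).
ALLOWED = {("support", "order_support"), ("fraud", "fraud_investigation")}


class PiiGateway:
    """Single entry point for PII reads: checks purpose, then logs the outcome."""

    def __init__(self, store: dict, log: AuditLog):
        self._store = store
        self._log = log

    def read(self, actor, role, customer_id, purpose):
        if not purpose or (role, purpose) not in ALLOWED:
            self._log.append(actor, customer_id, purpose, "denied")
            raise PermissionError(f"{actor} may not read {customer_id} for {purpose!r}")
        record = self._store.get(customer_id)
        self._log.append(actor, customer_id, purpose, "ok" if record else "not_found")
        return record
```

Denied requests are logged as well as successful ones, so the trail shows attempted access and not only granted access. Calling `verify()` periodically, or before an investigation, checks that no entry has been altered since it was written.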

Why this hybrid approach

Criterion                 | Build Vault         | Buy product        | Hybrid
--------------------------+---------------------+--------------------+------------------
Upfront cost              | High                | Medium–high        | Low
Time to deploy            | Slow (12+ months)   | Medium             | Fast (per quarter)
Crypto error risk         | High (self-borne)   | Low                | Low
Leak prevention & tracing | Yes                 | Yes                | Yes (focus)
Data-in-use protection    | Yes (if done right) | Strong             | Limited
Vendor lock-in            | No                  | High               | Low
Data residency (VN)       | Self-controlled     | Needs confirmation | Self-controlled

Positioning conclusion: for the top goal of leak prevention and tracing, the hybrid delivers most of the value at the lowest cost and risk while keeping data fully under domestic control. The trade-off is limited data-in-use protection.

Phased roadmap summary

Phase | Focus                               | Key outcome
------+-------------------------------------+-------------------------------------------
P1    | Survey & data mapping               | PII map, classification matrix
P2    | Centralized audit (top priority)    | Every PII access leaves an immutable trail
P3    | Encryption (TDE + column) & masking | Leaked DB/backup no longer leaks real data
P4    | Tighten access control              | Least-privilege, default-deny
P5    | Monitoring & anomaly detection      | Early detection; minutes to investigate
P6    | Compliance, audit, operations       | Decree 13/2023 compliance evidence
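
As an illustration of what phase P5 could start from, the sketch below flags an actor whose PII reads exceed a limit within a sliding time window. The proposal does not specify a detection method, so the rule, the class name `ReadRateMonitor`, and the threshold values are all assumptions. In practice the input would be the audit trail produced in P2.

```python
from collections import defaultdict, deque


class ReadRateMonitor:
    """Flags actors whose PII reads exceed max_reads within window_seconds."""

    def __init__(self, max_reads: int, window_seconds: float):
        self.max_reads = max_reads
        self.window_seconds = window_seconds
        self._reads = defaultdict(deque)  # actor -> timestamps of recent reads

    def record(self, actor: str, timestamp: float) -> bool:
        """Record one read; return True if the actor is now over the limit."""
        window = self._reads[actor]
        window.append(timestamp)
        # Drop reads that have fallen out of the sliding window.
        while window and window[0] <= timestamp - self.window_seconds:
            window.popleft()
        return len(window) > self.max_reads


# Hypothetical threshold: more than 3 reads per 60 seconds raises a flag.
monitor = ReadRateMonitor(max_reads=3, window_seconds=60)
flags = [monitor.record("alice", t) for t in (0, 10, 20, 30)]
```

A fixed per-actor threshold is the simplest possible rule. Per-role baselines or time-of-day rules could replace it later without changing how reads are fed in.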