Healthcare · Digital Health Platform

Data Discovery Found 23 Unknown S3 Buckets Containing Patient PII

At a Glance

Company Type: Digital Health Platform
Industry: Healthcare Technology
Cloud Providers: 3 (AWS, Azure, GCP)
Unknown Buckets Found: 23
Assets Scanned: 847+

The Challenge

You can't protect data you don't know exists

A digital health platform had grown rapidly through a combination of organic development and two acquisitions. Over five years, the engineering organization had accumulated cloud infrastructure across AWS, Azure, and GCP — managed by different teams, documented inconsistently, and never comprehensively inventoried.

After a competitor in the healthcare space suffered a significant data breach traced to an unmonitored legacy S3 bucket, leadership mandated a full data inventory within 30 days. The problem: nobody knew how many cloud storage assets actually existed. Manual enumeration across three cloud providers, multiple regions, and dozens of engineering teams would take months.

HIPAA requirements made this an existential compliance issue, not just a security exercise. PHI stored in unmonitored buckets was more than a breach risk: it was a regulatory liability that could trigger an investigation by the HHS Office for Civil Rights (OCR) whether or not a breach occurred.

"After the acquisition, we inherited two cloud environments we barely understood. We needed to know exactly what data we had and where it lived before regulators asked us the same question."

Chief Information Security Officer

The Solution

How Theodolite transformed their workflow

Day 1–3

Multi-Cloud Connection

Connected all three cloud environments (AWS, Azure, GCP) to Theodolite using read-only IAM roles. The platform began enumerating storage assets across all regions within minutes of initial connection.
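To make "read-only" concrete, here is a minimal sketch of what a discovery-only permission set could look like on the AWS side. The list of actions and the helper `build_read_only_policy` are illustrative assumptions, not Theodolite's documented requirements; Azure and GCP would need their own equivalent read-only role assignments.

```python
import json

# Hypothetical set of read-only S3 actions a discovery scanner might need.
# This is an illustrative assumption, not a vendor-documented policy.
READ_ONLY_S3_ACTIONS = [
    "s3:ListAllMyBuckets",
    "s3:GetBucketLocation",
    "s3:ListBucket",
    "s3:GetObject",
    "s3:GetEncryptionConfiguration",
    "s3:GetBucketPublicAccessBlock",
]

# Action prefixes that would let the scanner modify or delete data.
WRITE_PREFIXES = ("s3:Put", "s3:Delete", "s3:Create")


def build_read_only_policy(actions):
    """Return an IAM policy document allowing only the given read actions."""
    for action in actions:
        if action.startswith(WRITE_PREFIXES):
            raise ValueError(f"write action not allowed: {action}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DiscoveryReadOnly",
                "Effect": "Allow",
                "Action": sorted(actions),
                "Resource": "*",
            }
        ],
    }


policy = build_read_only_policy(READ_ONLY_S3_ACTIONS)
print(json.dumps(policy, indent=2))
```

The guard against write-style actions is the point of the sketch: a scanner that can only list and read cannot alter or delete the data it inventories.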

Week 1

Automated Data Discovery Scan

Theodolite's data discovery engine scanned 847 storage assets across three cloud providers simultaneously. The scan identified storage objects, sampled content for PII classification, and flagged assets with sensitive data exposure risks.
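The idea of sampling content for PII classification can be sketched roughly as follows. This is a simplified illustration using regular expressions on synthetic data; the pattern set, the `sample_lines` and `classify_sample` helpers, and the every-other-line sampling rate are all assumptions, and a production classifier would likely use much richer detection than this.

```python
import re

# Illustrative patterns only; real classifiers use far richer detection.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def sample_lines(lines, every_nth=2):
    """Take every nth line so large objects are not read in full."""
    return lines[::every_nth]


def classify_sample(lines):
    """Count matches per PII type across the sampled lines."""
    counts = {name: 0 for name in PII_PATTERNS}
    for line in lines:
        for name, pattern in PII_PATTERNS.items():
            counts[name] += len(pattern.findall(line))
    return counts


# Synthetic example content (not real data).
object_lines = [
    "patient_id,name,ssn,email",
    "1001,Jane Doe,123-45-6789,jane@example.com",
    "1002,John Roe,987-65-4321,john@example.com",
    "1003,Ann Poe,555-12-3456,ann@example.com",
]

sampled = sample_lines(object_lines, every_nth=2)
findings = classify_sample(sampled)
flagged = {name for name, count in findings.items() if count > 0}
print(findings, flagged)
```

Sampling trades completeness for speed: a single hit in the sample is enough to flag an object for closer review, while a clean sample does not prove the object is free of sensitive data.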

Week 2

PII Classification & Risk Ranking

The platform classified all discovered sensitive data by type (PHI, PCI, PII) and assigned risk scores based on exposure level, encryption status, and access controls. The scan flagged 23 previously unknown S3 buckets containing patient data as critical findings; none appeared in the existing asset inventory.
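A rough sketch of how risk ranking and the comparison against an existing inventory might fit together is shown below. The `StorageAsset` fields, the weights, and the bucket names are hypothetical and chosen only for illustration; the platform's actual scoring model is not described in this story.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageAsset:
    name: str
    publicly_accessible: bool
    encrypted_at_rest: bool
    data_types: frozenset  # e.g. frozenset({"PHI", "PII"})


# Hypothetical weights chosen for illustration only.
SENSITIVITY_WEIGHT = {"PHI": 5, "PCI": 4, "PII": 3}


def risk_score(asset):
    """Combine data sensitivity, exposure, and encryption into one score."""
    sensitivity = max(
        (SENSITIVITY_WEIGHT.get(t, 0) for t in asset.data_types), default=0
    )
    if sensitivity == 0:
        return 0  # no sensitive data found, so nothing to rank
    score = sensitivity
    if asset.publicly_accessible:
        score += 3
    if not asset.encrypted_at_rest:
        score += 2
    return score


def unknown_assets(discovered, inventory_names):
    """Return discovered assets that are missing from the known inventory."""
    return [a for a in discovered if a.name not in inventory_names]


# Synthetic example data (not from the customer environment).
discovered = [
    StorageAsset("billing-exports", False, True, frozenset({"PCI"})),
    StorageAsset("legacy-intake-uploads", True, False, frozenset({"PHI", "PII"})),
    StorageAsset("static-site-assets", True, True, frozenset()),
]
inventory = {"billing-exports", "static-site-assets"}

unknown = unknown_assets(discovered, inventory)
ranked = sorted(discovered, key=risk_score, reverse=True)
for asset in ranked:
    print(asset.name, risk_score(asset))
```

In this toy data set the unencrypted, publicly accessible bucket holding PHI ranks first and is also the only asset absent from the inventory, which mirrors the kind of finding described above.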

Platform pillars used

Data Discovery · PII Classification · Multi-Cloud Scanning · HIPAA Compliance · Asset Inventory

The Results

Measurable outcomes, not promises

Before Theodolite

  • No comprehensive inventory of cloud storage assets
  • Unknown PHI exposure across three cloud providers
  • Manual enumeration would take months
  • Regulatory liability from unmonitored sensitive data

After Theodolite

  • 23 unknown S3 buckets with patient PII discovered and remediated
  • 100% of storage assets classified by data type and sensitivity
  • Complete cloud data inventory delivered in 2 weeks
  • HIPAA-ready evidence package for OCR audit readiness

23

unknown S3 buckets found

100%

PII classified

3 clouds

scanned simultaneously

"Data discovery found PII in 23 S3 buckets we didn't even know existed. That single scan justified the entire platform."

Chief Information Security Officer

Digital Health Platform

See how Theodolite can transform your security posture.

Start with a demo and see your own risk quantified in dollars within your first session.
