Skip to content

Shadow AI Discovery Log Schema

Purpose

This schema defines a vendor-neutral format for logs that document the detection, inventory, and remediation of unapproved AI usage (Shadow AI). It enables organizations to:

  • Maintain an auditable record of Shadow AI detection events
  • Normalize logs from various sources (CASB, proxy, IdP, EDR, SaaS audit logs) into a consistent format
  • Support evidence submission for compliance and audit purposes

Normalization principles

Principle Description
Vendor-neutral No dependency on specific vendor log formats; applicable to Netskope, Zscaler, Microsoft Defender, and others
Minimal required fields Only essential fields are MUST; organizations can omit optional fields
Extensible additionalProperties: true allows vendor-specific or organization-specific extensions
Privacy-aware Fields are designed to reference (not embed) sensitive content

Required fields (MUST)

Field Type Description Example
event_time string (ISO8601) Timestamp of the event 2026-01-15T09:30:00Z
actor_id string User or service identifier user@example.com
actor_type string Type of actor user or service
source_system string System that detected the event proxy, casb, idp, edr, saas_audit
ai_service string AI product or domain accessed chat.openai.com, claude.ai
action string Action performed chat, upload, download, tool_execute, api_call
data_classification string Data classification level public, internal, confidential, restricted
decision string Policy decision applied allow, block, needs_review, unknown
evidence_ref string Reference to related evidence sha256:abc123... or urn:evidence:...
record_id string Unique identifier for this record evt-20260115-001

Optional fields (SHOULD/MAY)

Field Type Description
session_id string Session identifier
device_id string Device identifier
ip string IP address
user_agent string User agent string
department string Organizational department
project_id string Project identifier
prompt_category string Category of the prompt/query
model_family string AI model family (e.g., GPT-4, Claude)
destination string Destination URL or endpoint
policy_id string Policy that triggered the decision
remediation_ticket string Remediation ticket reference

Privacy/Security notes

!!! warning "Data handling" - Do not embed PII, credentials, or prompt content directly in log fields. - Use evidence_ref to reference separately stored sensitive content. - Apply appropriate access controls to log storage. - Consider data retention policies aligned with Minimum Evidence Requirements.

JSON Schema

Download: shadow-ai-discovery.schema.json

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": [
    "event_time", "actor_id", "actor_type", "source_system",
    "ai_service", "action", "data_classification", "decision",
    "evidence_ref", "record_id"
  ],
  "properties": {
    "event_time": { "type": "string", "format": "date-time" },
    "actor_id": { "type": "string", "minLength": 1 },
    "actor_type": { "type": "string", "enum": ["user", "service"] },
    "source_system": { "type": "string", "minLength": 1 },
    "ai_service": { "type": "string", "minLength": 1 },
    "action": { "type": "string", "minLength": 1 },
    "data_classification": { "type": "string", "minLength": 1 },
    "decision": { "type": "string", "enum": ["allow", "block", "needs_review", "unknown"] },
    "evidence_ref": { "type": "string", "minLength": 1 },
    "record_id": { "type": "string", "minLength": 1 }
  },
  "additionalProperties": true
}