The Black Box Audit: What Big Four AI Tools Are Doing Inside Your Audit — and What Your Audit Committee Hasn't Asked
KPMG Clara runs analytics across 100% of journal entries, not a statistical sample. EY Astra drafts audit memo language from flagged conditions and synthesizes risk assessment narratives. Deloitte Omnia runs automated anomaly detection before the engagement team reviews the findings. PwC Halo processes contracts, board minutes, and communications with GenAI. All of the Big Four firms have deployed AI audit platforms at enterprise scale, and all four have announced Microsoft Azure partnerships for their AI infrastructure. Your engagement letter may predate these tools. Your audit committee has almost certainly not added AI tool disclosure as a standing agenda item. And the PCAOB has signaled that AI-assisted audit procedures carry the same documentation requirements as human-performed procedures, including a documented professional judgment rationale for every AI flag the engagement team dismisses without full investigation.
Background
Audit AI arrived earlier than most audit committees recognize. KPMG launched Clara in 2017 — nine years ago — initially focused on data analytics and journal entry review. The 2023–2025 generation of Clara added GenAI capabilities for document extraction and memo drafting, but the AI foundation was established years before most corporate governance frameworks had any concept of "AI in the audit." When audit committee chairs reference AI as an emerging concern in audit, they are often describing a technology that has been operationally standard in their engagements for multiple years.
The 2024–2025 platform upgrades changed the nature of the AI involvement materially. Earlier-generation audit AI focused on structured data analytics: running statistical tests on journal entry populations, identifying outliers, flagging unusual patterns for engagement team review. The current generation adds GenAI for unstructured content — contracts, board minutes, management representations, regulatory correspondence — and for synthesis tasks: drafting audit memo language from flagged conditions, generating risk assessment narratives, summarizing engagement team findings in formats that accelerate workpaper preparation. The engagement team reviews and approves the AI-generated content, but the starting point is now model output rather than engagement team composition.
The shift from sampling to population analytics represents the most structurally significant change to audit evidence standards in decades. Traditional audit sampling — reviewed in engagement planning and documented through statistical frameworks — covers 2–5% of journal entry populations in most large-company audits. Clara Analytics, Deloitte Omnia, and PwC Halo now run analytics on 100% of entries. This means the AI has more complete information about the financial statement population than any prior audit did — but it also means the AI generates more flags than prior audit sampling produced, and the engagement team's triage decisions about which flags to investigate and which to dismiss are now a material part of the audit evidence chain. The documentation of those triage decisions — what the flag was, why the engagement team concluded it did not require further investigation, what professional judgment supported the conclusion — is the governance gap that the PCAOB has signaled it will examine.
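The difference between sampling and population coverage can be made concrete with a toy example. The sketch below is illustrative only: the entries are invented, the fixed-amount threshold is a stand-in for whatever tests a real platform applies, and the function names are hypothetical rather than taken from Clara, Omnia, or Halo. It compares a 5% systematic sample with a scan of every entry.

```python
# Hypothetical illustration; no function or value here comes from an audit platform.

def systematic_sample(entries, rate):
    """Return every k-th entry, where k = round(1 / rate)."""
    step = round(1 / rate)
    return entries[::step]


def flag_unusual(entries, threshold):
    """Flag entries whose amount exceeds a fixed threshold (stand-in for a real test)."""
    return [e for e in entries if e["amount"] > threshold]


# 1,000 routine journal entries of 100.00, with three large entries planted.
entries = [{"id": i, "amount": 100.0} for i in range(1000)]
for i in (7, 333, 941):
    entries[i]["amount"] = 250_000.0

sample = systematic_sample(entries, rate=0.05)            # every 20th entry
sample_flags = flag_unusual(sample, threshold=10_000.0)
population_flags = flag_unusual(entries, threshold=10_000.0)

print(len(sample), len(sample_flags), len(population_flags))
```

In this contrived case the sample happens to miss all three planted entries while the full scan surfaces each one. The same property that improves coverage also raises flag volume, which is why the triage record matters.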
The data access expansion has proceeded with less attention from audit committees than the AI capabilities have received. Early audit AI required structured financial data export from the client's ERP system — general ledger transactions, trial balance, accounts receivable and payable sub-ledgers. The current generation of platforms has expanded to contracts and agreements (required for lease accounting, revenue recognition, and significant transaction review), board and committee minutes (required for governance and related-party transaction procedures), and in some engagements, management and executive communications (relevant to tone-at-the-top and fraud risk assessment procedures). Each expansion happened incrementally, driven by the audit procedure it supports — and each increment was probably not specifically communicated to the audit committee or reflected in an engagement letter update.
The Microsoft Azure concentration across all four Big Four firms is a systemic risk that the audit profession has not fully surfaced. KPMG, EY, Deloitte, and PwC have each announced multi-billion dollar Microsoft partnerships that place their AI audit infrastructure on Azure OpenAI. The strategic rationale is clear: Microsoft's enterprise security and compliance infrastructure makes Azure a credible foundation for audit-grade AI, and the partnerships give each firm preferred access to model capabilities and infrastructure investment. The systemic implication is that a material Azure disruption — a platform outage, a regulatory action, a model update that produces different anomaly detection results — affects all four firms' audit platforms simultaneously. Enterprise issuers that have relied on Big Four independence as a safeguard against single-vendor audit concentration have not modeled what it means when all four use the same underlying AI infrastructure.
The PCAOB's 2024 Staff Guidance on AI in audit established the regulatory baseline: AI-assisted audit procedures are subject to the same quality control, documentation, and supervision requirements as human-performed procedures. AS 2301 (the auditor's response to assessed risks of material misstatement) and AS 2315 (audit sampling) both have direct implications for AI-assisted evidence gathering. What the PCAOB has not yet done is issue a specific auditing standard for AI-assisted procedures — which means engagement teams are applying judgment about how to document AI procedures under frameworks that were designed for human-performed ones. The inspection cycle beginning in 2025–2026 will produce the first PCAOB findings specific to AI-assisted audit documentation quality, and the results will clarify how wide the gap is between current practice and the standard the PCAOB expects.
Decision Required
Your Big Four engagement team is using AI tools for journal entry analysis, document extraction, and risk assessment synthesis that your engagement letter may not specifically describe and that your audit committee agenda does not include as a standing oversight item. What oversight posture does your committee take — and does it reflect the PCAOB's stated expectations for AI-assisted audit governance?
The practical governance questions most audit committees have not formally answered: Does your engagement letter identify the AI platforms in use and the data categories they access? Has your committee received a briefing from the engagement partner on AI tool usage in the past 12 months? If the PCAOB inspects your last audit and asks why a specific anomaly flagged by the AI was dismissed without investigation, what documentation exists, and where in the workpapers? When were the data handling terms in your engagement letter last updated to reflect the actual data categories flowing to the auditor's AI infrastructure?
These are oversight questions, not audit quality questions. They do not imply that your engagement team is doing something wrong. They reflect the structural reality that audit AI has expanded faster than the governance frameworks that surround it — and that the audit committee's oversight responsibility extends to the AI-assisted procedures that are now a standard part of the engagement.
Options
Continuing the current posture is defensible if your engagement letter explicitly addresses AI tool usage, the data categories those tools access, and the PCAOB documentation standard the engagement team applies to AI-assisted procedures. It is not defensible if your engagement letter was last updated before 2023 and uses general confidentiality language that predates the current generation of AI audit platforms. For most audit committees, the honest assessment is that neither the engagement letter nor the agenda has kept pace with how AI is actually being used in the engagement, and staying the course is a decision to close that gap only when something forces the issue.
The AI disclosure addendum asks the engagement team to specify: which AI platforms are active in your engagement (by name, not by category), what procedures each platform supports, what data categories are transmitted to auditor infrastructure, and what data retention and deletion terms apply at engagement end. The addendum does not limit auditor flexibility in selecting and applying tools; it creates the audit committee visibility that the oversight function requires and that the PCAOB's guidance on audit quality indicates it expects. Most engagement partners will agree to an addendum if asked; the market gap is that most audit committees do not ask. Requiring an annual update ensures the addendum reflects current practice as AI tool capabilities evolve.
This option goes beyond disclosure to requiring that AI-assisted procedures in the workpapers be documented with the same specificity as human-performed procedures: what the AI tool did, what parameters it operated under, what it flagged, and — for each flag that the engagement team dismissed rather than investigated — what professional judgment supported the conclusion. This is the documentation standard the PCAOB's 2024 guidance implies, and requesting it proactively positions the audit committee ahead of the PCAOB inspection cycle that will begin surfacing AI documentation quality findings. Appropriate for PCAOB-regulated issuers, companies with prior PCAOB findings or comment letters, and any issuer where inspection risk is a material governance consideration.
An independent review — conducted by internal audit or a third party without a direct relationship to the engagement firm — evaluates whether the AI tools in use, the procedures they support, and the workpaper documentation of AI-assisted procedures meet the standard the PCAOB has signaled it expects. This is the highest governance posture and requires the engagement team's cooperation in providing documentation of AI procedures that may not be routinely surfaced to audit committee review. Right for issuers with a history of PCAOB findings, companies in industries where audit AI is particularly consequential (financial services, healthcare, highly regulated sectors), and audit committees that cannot independently evaluate an AI tool disclosure without external expertise.
Recommendation
Add a standing AI oversight item to the audit committee agenda: what AI tools did the engagement team use in this period, what procedures did they support, what data categories flowed to auditor infrastructure, and what was the team's review and override protocol for AI output? This is a 10-minute briefing from the engagement partner — not a technical deep dive — and it closes the structural gap between what AI is doing in your audit and what your audit committee knows about it. Schedule it for the same meeting where the engagement team presents the audit plan.
Request an AI tool disclosure addendum to the engagement letter and require an annual update. The addendum should name the platforms (Clara, Astra, Omnia, or Halo, whichever are active), describe the procedures they support, identify the data categories transmitted to auditor infrastructure, and specify data retention and deletion terms at engagement end. This is not adversarial; it is the governance documentation your oversight function requires. Most engagement partners will provide it if asked. The ones who push back on specific disclosure are telling you something about how they think about audit committee oversight.
Ask the engagement partner one specific question at the next committee meeting: if the PCAOB inspects this audit and asks why a specific AI-flagged anomaly was not investigated, what documentation exists in the workpapers explaining the professional judgment rationale? The answer tells you whether AI-assisted procedures are being documented to the standard the PCAOB has signaled it expects, or whether the documentation is lagging the AI capability. You do not need to understand the AI technology to evaluate whether the answer is specific and satisfying or general and evasive.
Audit your data handling terms. Pull the current engagement letter and identify when the data access provisions were last updated. If the answer is before 2023, the data categories that the AI platforms now access — contracts, communications, board materials — almost certainly exceed what the current terms describe. Request an updated addendum that specifically addresses AI-related data access, the geographic location of processing infrastructure, and data deletion obligations. This is a data governance issue as much as an audit governance issue, and it belongs on both the audit committee's agenda and the general counsel's.
Build AI governance into your next audit RFP as an explicit evaluation criterion. Ask each firm to describe, in writing, the AI platforms active in their audit methodology, the procedures those platforms support, how AI-assisted procedures are documented in workpapers, what the engagement team's override protocol is when the team disagrees with an AI flag, and how AI tool disclosure is provided to the audit committee. The specificity and completeness of the response is itself a quality signal — firms that have thought carefully about AI governance will give more specific answers than firms that have not.
Risks
The PCAOB's 2024 Staff Guidance established that AI-assisted procedures carry the same documentation requirements as human-performed procedures. The inspection cycle beginning in 2025–2026 is the first cycle where PCAOB inspectors are specifically examining AI-assisted procedure documentation quality. An engagement where AI tools made consequential scope decisions — what anomalies to surface, which document terms to flag — without workpaper documentation of the professional judgment rationale for each triage decision creates inspection exposure that did not exist when audit evidence was gathered through human-performed procedures alone. The documentation obligation does not diminish because the procedure was faster or covered more data.
AI audit platforms have incrementally expanded the categories of client data they access: structured financial data (established practice), contract and agreement terms (2022–2023 expansion, required for ASC 842 lease accounting and ASC 606 revenue recognition procedures), unstructured management communications and board materials (2024–2025 expansion, relevant for fraud risk and governance procedures). Each expansion happened because it supported a legitimate audit procedure — and each expansion probably was not communicated to the audit committee as a material change in data access scope. The practical test: ask your engagement partner to list every data category that flows from your organization to auditor infrastructure during the engagement. If the list is longer or more specific than what your engagement letter describes, the gap is a governance issue.
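The practical test described above amounts to a set comparison. A minimal sketch, assuming the committee can reduce both the engagement letter and the partner's list to plain category labels; the function name and the example labels are hypothetical.

```python
def data_access_gap(letter_categories, actual_categories):
    """Return categories that flow to the auditor but are not named in the letter."""
    return sorted(set(actual_categories) - set(letter_categories))


# Invented example: what the engagement letter describes...
letter = {"general ledger", "trial balance", "sub-ledgers"}

# ...versus what the engagement partner lists as actually transmitted.
actual = {
    "general ledger",
    "trial balance",
    "sub-ledgers",
    "contracts",
    "board minutes",
    "management communications",
}

gap = data_access_gap(letter, actual)
print(gap)  # ['board minutes', 'contracts', 'management communications']
```

A non-empty result is the governance issue: data is flowing under terms that do not describe it.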
Population-scale AI analytics generate flags at a volume that engagement teams triage rather than investigate at equal depth. The professional judgment decision about which flags are material enough to pursue and which can be dismissed with a lower-depth review must be documented in a way that supports PCAOB review. An engagement workpaper that records "AI analytics performed; no significant findings" without documenting how many flags were generated, what the engagement team reviewed, and what professional judgment was applied to the dismissed flags does not meet the documentation standard the PCAOB's guidance implies. The risk is not that the dismissals were wrong — it is that they cannot be defended if examined.
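One way to picture the documentation this paragraph calls for is a disposition log with a completeness check. The sketch below is an assumption about what such a record could contain, not a PCAOB-prescribed format or any firm's workpaper schema; the field names, flag descriptions, and reviewers are invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FlagDisposition:
    """One AI-generated flag and what the team did with it (hypothetical schema)."""
    flag_id: str
    description: str
    disposition: str                 # "investigated" or "dismissed"
    rationale: Optional[str] = None  # professional judgment supporting the call
    reviewer: Optional[str] = None   # who made the call


def undocumented_dismissals(log: List[FlagDisposition]) -> List[FlagDisposition]:
    """Return dismissed flags that lack a rationale or a named reviewer."""
    def is_blank(value: Optional[str]) -> bool:
        return value is None or not value.strip()

    return [
        f for f in log
        if f.disposition == "dismissed"
        and (is_blank(f.rationale) or is_blank(f.reviewer))
    ]


def triage_summary(log: List[FlagDisposition]) -> dict:
    """Count flags by disposition and count dismissals with incomplete support."""
    return {
        "flags_generated": len(log),
        "investigated": sum(1 for f in log if f.disposition == "investigated"),
        "dismissed": sum(1 for f in log if f.disposition == "dismissed"),
        "dismissed_without_rationale": len(undocumented_dismissals(log)),
    }


# Invented sample log.
log = [
    FlagDisposition("F-001", "Round-dollar entry posted after period close",
                    "investigated", "Traced to supporting invoice", "Manager A"),
    FlagDisposition("F-002", "Weekend posting by infrequent user",
                    "dismissed", "Scheduled batch job run by a service account",
                    "Senior B"),
    FlagDisposition("F-003", "Unusual account pairing", "dismissed"),
]

summary = triage_summary(log)
gaps = undocumented_dismissals(log)
print(summary)
print([f.flag_id for f in gaps])
```

A workpaper that can produce these counts, and an empty list of undocumented dismissals, is in a very different position from one that records only "no significant findings."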
All four Big Four firms have announced multi-billion dollar Microsoft Azure partnerships for AI audit infrastructure. From an audit independence and quality perspective, the relevant risk is not competition — it is correlation. If a material Azure disruption affects all four firms' audit AI platforms simultaneously during peak audit season (Q1 for calendar-year-end issuers), the impact on audit delivery timelines is not mitigable by switching to an alternative firm, because all alternatives face the same infrastructure disruption. Enterprise issuers and their audit committees have not assessed this as a systemic audit continuity risk, because prior audit resilience planning assumed that the Big Four had materially independent technology infrastructure.
AI audit tools are deployed firmwide, but the capability to evaluate, challenge, and override AI output varies materially across engagement teams. A senior partner who understands Clara's anomaly detection logic and its known limitations applies different professional judgment to AI flags than a first-year manager who treats the platform's output as a checklist. The quality variance attributable to differential AI fluency is not yet reflected in PCAOB inspection findings — because inspectors have not yet developed engagement-specific AI fluency assessment frameworks — but it is structurally present in every large-firm audit where AI is deployed firmwide but AI training and capability vary by team. Audit committees cannot evaluate this risk from public data; the engagement partner briefing on AI tool usage is the most accessible diagnostic.
Questions Your Team Should Be Answering
These are the questions that distinguish organizations that get this right from those that do not. If your team cannot answer them, that is your first deliverable.
1. Does your current engagement letter identify the AI platforms in use in your audit, the specific procedures they support, and the data categories transmitted to your auditor's cloud infrastructure, or does it use general confidentiality language written before AI audit tools were in production?
2. Has your audit committee received a briefing from the engagement partner on AI tool usage in your audit in the past 12 months? If not, when was the last time your committee formally reviewed what AI the engagement team uses and how?
3. If PCAOB inspectors examined your last audit and asked for documentation explaining why a specific AI-flagged anomaly was dismissed without full investigation, what workpaper entry would the engagement team point to, and does it record a specific professional judgment rationale or a general approval?
4. When did your organization last update the data handling terms in the engagement letter, and do those terms specifically address AI-related data access, the geographic location of processing infrastructure, and data deletion obligations at engagement end?
5. Did your most recent audit RFP include AI governance as an evaluation criterion, and do you know which AI platforms each shortlisted firm uses, what procedures they support, and how they document AI-assisted procedures to PCAOB standards?
6. Who on your audit committee has sufficient understanding of AI audit platforms to evaluate an AI tool disclosure from the engagement partner? If the honest answer is no one, what is your committee's plan to develop that capability before the PCAOB inspection cycle produces its first AI-specific findings?