Your AI System Inventory: Model, Dataset, Interface and Agent Cards
We move on from mapping the users, use cases and capabilities of AI systems to dive into the models, datasets, interfaces and agents inside them.
In my last article1, I went through an approach for mapping the business landscape of your AI systems - understanding use cases, capabilities, and stakeholder relationships. This was a first step in building your AI System Inventory. But doing AI governance right means we also have to understand the key elements within each AI system: the models that generate outputs, the datasets used to train those models and the data flowing through our systems, the interfaces where humans and machines meet, and the increasingly common autonomous agents that take action.
In this article, I'll walk through how to map out these four fundamental components in a way that lets us govern effectively without getting lost in technical detail. I use a simple but powerful "card" system that captures essential information pertinent to governance about each component, and the relationships between them. This approach reveals the critical connections and dependencies that could affect multiple use cases simultaneously.
I’ll build on our talent management example, comprising two AI systems, TalentMatch and PathFinder, and we'll see how these components interact in ways that matter for governance - from shared training data to overlapping interfaces. You might want to refer to the first article for the map of those systems.
Models, Datasets, Interfaces and Agents
Let's begin with an overview of what each of these four components are and how they fit together within an AI System:
📊A Model is a trained algorithmic system that processes inputs to generate specific outputs. It's the engine that powers AI capabilities, encoding patterns learned from historical data to make predictions, classifications, or generate content. What matters most for governance purposes isn't the detailed architecture or algorithm, but understanding its purpose, limitations, behaviours and key characteristics that could affect outcomes or cause harm.
📄A Dataset represents the information enabling or flowing through our AI system - both static and dynamic. This includes training data used to develop models, operational data processed during use, and output data generated by the system. For governance purposes, we care about data origins, quality, currency, potential biases, privacy implications, and how it's used throughout the system lifecycle.
🌐An Interface is any point where the AI system interacts with the outside world - whether with human users or other systems. Interfaces control how information flows in and out of the system, defining what actions are possible and how outputs are presented. Governance focuses on how these interfaces shape interaction, what controls they provide, what surface they present to attackers, and what data they collect or expose.
⚡An Agent is a component that can take autonomous or semi-autonomous actions based on the AI system's outputs. Agents implement the system's decisions in the real world, whether through simple automated responses or complex chains of actions. From a governance perspective, we need to understand their scope of authority to act autonomously, the nature and range of their cyber-physical actions, oversight mechanisms, and potential impacts.
Now, let me illustrate these definitions through our talent management example from the previous article. I’ve included the use case map below as a refresher. Recall that the TalentMatch system has a job fit prediction feature. At its core is a Model that has learned what successful hires look like across different roles. This model processes structured information about candidates and produces a predicted success score. It's not just a simple matching algorithm - it has been trained on historical hiring data to recognise subtle patterns that indicate potential fit. For governance, we need to understand not just that it makes predictions, but key aspects like how it handles candidates with non-traditional backgrounds or what assumptions it makes about career progression.
The TalentMatch system works with Datasets of historical hiring records for training, structured information extracted from resumes, standardised assessments, and feedback from hiring managers. Every successful or unsuccessful hire generates new data that could be used for future training. Some of this data is highly sensitive - like internal performance reviews or salary history - while other elements are publicly available.
TalentMatch's Interfaces shape how different users interact with the system. Recruiters might use a detailed dashboard that shows full candidate profiles and prediction confidence scores, while hiring managers see a streamlined interface focused on key qualifications and fit metrics. Part of the hiring managers' interface is a chatbot that permits free-form questions and answers on one or more candidates. There's also an API that allows the system to exchange data with other HR tools. Each interface controls what information is visible and what actions users can take.
The Agents in TalentMatch automate key parts of the recruitment workflow. When the system identifies promising candidates, an agent might automatically schedule initial screenings based on calendar availability. Another agent continuously monitors job boards for new postings that match specified criteria, feeding fresh opportunities into the system. These agents operate with varying degrees of autonomy - some take action automatically while others require human approval.
PathFinder has similar components working together but configured differently for internal career development. Its core Model focuses on predicting career trajectories and skill development needs. It works with Datasets that span employee performance history, learning and development activities, and internal job transitions. Its Interfaces emphasise employee self-service and career exploration, while its Agents might automate tasks like suggesting relevant training opportunities or connecting employees with mentors.
The governance implications become particularly interesting where these systems intersect. For example, both systems might use similar models for skills assessment but trained on different data sets - external candidates versus internal employees. An interface designed for recruitment might inadvertently collect data that affects career development recommendations. Understanding these connections is crucial for effective governance.
Model Cards for Governance
Models broadly fall into two categories in most organisations: those developed internally and those accessed as external services. Internal models, like PathFinder's career trajectory prediction, are built and trained in-house, giving us deep visibility into their development and complete control over their operation. External models may be commercial services (like a third-party resume parser) or open-source models (perhaps a language model from HuggingFace for processing job descriptions).
When documenting models for governance purposes, several established frameworks offer valuable guidance. The AI industry has been working to standardize how we describe and document models, recognising that good governance requires consistent, comprehensive documentation.
Model Cards, first proposed by Google researchers in 20192, have become a common standard for documenting machine learning models. Their framework emphasises intended use, performance characteristics, and ethical considerations. You'll find Model Cards on HuggingFace's model hub3, in major cloud AI services, and across open-source projects. They provide a structured way to capture both technical capabilities and ethical implications.
Model Cards and System Cards from Meta are another option4. These focus on documenting how models behave when combined with other technologies in a system. Their approach to behavioural documentation is particularly relevant for governance, as it helps identify potential misuse scenarios and unintended consequences. For more regulated industries, frameworks like the US FDA's Good Machine Learning Practice5 emphasise documentation that demonstrates safety and efficacy. Their approach requires detailed records of model development, testing, and monitoring - valuable when AI systems make high-stakes decisions about people's lives or livelihoods. ISO 42001 itself doesn't prescribe a specific documentation format but emphasises the need for traceability and accountability in AI systems.
You could certainly use one of these approaches very effectively, although initially I prefer to capture an even narrower set of information. That’s because when documenting models for governance purposes, I need to strike a balance between the depth and technicality of the information and its accessibility. For governance purposes, we need business managers, lawyers, auditors, scientists and engineers to all come to a common understanding, so sometimes we sacrifice technical, architectural and scientific detail to focus on communicating a model’s purpose, limitations, and potential impacts. Here’s what I think are the essential elements to capture:
The first thing we need to reflect is purpose and scope, including what it should not be used for. TalentMatch's bias detection model, for instance, might be great at flagging potentially discriminatory language in job descriptions but completely unsuitable for evaluating actual hiring decisions.
Next comes training provenance - understanding where the model came from. For internal models, this means documenting the training data sources, key development decisions, and validation process. For external models, it means recording the provider, version, and any customisation or fine-tuning that you might have done. This history becomes invaluable when investigating unexpected behaviours or biases.
Performance characteristics are next. We need to understand where the model performs well and, more importantly, where it struggles. Does PathFinder's skill assessment work equally well for technical and non-technical roles? Does its accuracy vary across career stages or industries?
Dependencies and relationships have important governance implications. If multiple capabilities share the same underlying model, changes or issues can ripple through the system in unexpected ways. Perhaps TalentMatch's resume analysis and PathFinder's skill assessment both rely on the same language model for processing text.
Control mechanisms document how we manage the model in operation. What parameters can be adjusted? Who has authority to make changes? What monitoring is in place? When TalentMatch's job fit model drifts from its baseline performance, who gets notified and what actions can they take?
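To make that last question concrete, here's a minimal sketch of the kind of drift check a card's monitoring entry might point to. The 84% baseline and 5% threshold mirror the example card further below; the notification hook and the rest of the logic are purely illustrative:

```python
# A minimal drift check, assuming accuracy is the monitored metric.
# Baseline and threshold mirror the example Model Card; notify() is a placeholder.

BASELINE_ACCURACY = 0.84   # accuracy recorded at the last validation
DRIFT_THRESHOLD = 0.05     # alert if accuracy drops more than 5% from baseline


def check_drift(current_accuracy: float, notify) -> bool:
    """Notify the model's update authority if drift exceeds the documented threshold."""
    drift = BASELINE_ACCURACY - current_accuracy
    if drift > DRIFT_THRESHOLD:
        notify(
            f"Job fit model drift alert: accuracy {current_accuracy:.0%} is "
            f"{drift:.1%} below the {BASELINE_ACCURACY:.0%} baseline"
        )
        return True
    return False


# Example: a scheduled monitoring job could call this weekly
check_drift(0.78, notify=print)
```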
Finally, and perhaps most crucially for governance, we need to document known limitations and risks. Every model has them, and pretending otherwise is dangerous. A model trained primarily on tech industry data might struggle with other sectors. One optimised for speed might sacrifice some accuracy.
All this information should be captured in a way that's accessible to different stakeholders - from technical teams to compliance officers to business leaders. That's where our card system comes in. The following model card structure captures the key elements needed for governance while remaining accessible across those audiences. Technical teams can find the operational details they need, while compliance officers can quickly understand risks and controls. You’ll develop a model card that makes sense for your own organisation and systems - consider this just an example:
Example Model Card for TalentMatch Job Fit Prediction
1️⃣ PURPOSE
Primary: Predict candidate success likelihood for specific roles
Intended Use: Initial screening and prioritisation
Explicitly Not For: Final hiring decisions or compensation determination
2️⃣ OWNERSHIP & CONTROL
Owner: Recruitment Analytics Team
Update Authority: Senior Data Scientists
Last Retrained: 2024-01-15
Current Version: 3.4.1
3️⃣ CHARACTERISTICS
Type: Supervised learning (gradient-boosted decision tree)
Provider/Origin: Internally developed
Base Framework: XGBoost v1.5.2
Training Data Sources: Historical hiring outcomes 2018-2023 (Dataset DS-223)
Performance: 84% accuracy overall (see detailed metrics doc)
Demographic Testing: Validated across 5 protected categories
4️⃣ DEPENDENCIES
Requires: Resume Parser Model v2.1
Shared Components: Skills Taxonomy with PathFinder
Input Requirements: Structured candidate profile, role requirements
5️⃣ LIMITATIONS & RISKS
Known Biases: Lower accuracy for career changers
Edge Cases: May undervalue non-traditional qualifications
Performance Bounds: Requires minimum 6 months outcome data
Monitoring Alerts: Drift detection >5% from baseline
6️⃣ GOVERNANCE CONTROLS
Human Oversight: Required for confidence scores <75%
Audit Trail: All predictions logged with rationale
Override Process: Documented in SOP-235
Required Reviews: Quarterly bias assessment
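If you also want the card to exist in machine-readable form alongside the printed version, a lightweight structure is usually enough. The sketch below simply mirrors the fields of the card above in Python - it's illustrative, not a formal schema:

```python
from dataclasses import dataclass, field


@dataclass
class ModelCard:
    """A minimal machine-readable mirror of the one-page governance card."""
    name: str
    version: str
    owner: str
    purpose: str
    not_for: list[str] = field(default_factory=list)
    dependencies: list[str] = field(default_factory=list)
    known_limitations: list[str] = field(default_factory=list)
    controls: dict[str, str] = field(default_factory=dict)


job_fit_card = ModelCard(
    name="TalentMatch Job Fit Prediction",
    version="3.4.1",
    owner="Recruitment Analytics Team",
    purpose="Predict candidate success likelihood for specific roles",
    not_for=["Final hiring decisions", "Compensation determination"],
    dependencies=["Resume Parser Model v2.1", "Skills Taxonomy (shared with PathFinder)"],
    known_limitations=["Lower accuracy for career changers",
                       "Requires minimum 6 months outcome data"],
    controls={"human_oversight": "Required for confidence scores <75%",
              "review_cadence": "Quarterly bias assessment"},
)
```

A structured copy like this makes it easier to search across cards or to check dependencies automatically, without replacing the printed artefact on the wall.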
Dataset Cards for Governance
Moving on to the Datasets of AI systems, we have to recognise that they are constantly evolving - being enriched with new information, and potentially picking up biases or quality issues along the way. This makes their documentation particularly crucial (and challenging) for governance.
Let's return to TalentMatch as our example. At first glance, it might seem to work with a single dataset of candidate information. But when we look closer, we discover a web of interconnected data: historical hiring records used for training, real-time candidate profiles, assessment results, feedback from hiring managers, and performance data from successful hires. Each of these datasets has its own origins, quality considerations, and governance implications.
When it comes to documenting datasets, the AI community has developed a number of sophisticated approaches. Data scientists often employ detailed statistical descriptions - from basic metrics like mean and standard deviation to complex analyses of feature distributions, correlations, and statistical power. Frameworks like Google's Data Cards for ML6 or the Datasheets for Datasets approach developed by Timnit Gebru and colleagues7 provide some really comprehensive templates for dataset documentation. There are also simpler dataset structures in open source, such as HuggingFace Datasets8, as well as specialised tools for dataset versioning, lineage tracking, and quality monitoring.
But for governance purposes, I start with a much narrower focus (at least to begin with). While a data scientist might care deeply about the kurtosis of feature distributions or the intricacies of their data cleaning pipeline, for governance we need to focus on aspects that directly impact risk, reliability, and responsible use. So we need information captured in Dataset Cards that focuses on aspects like lineage, sensitivity, usage constraints, and relationships, because these are the factors that determine whether we can use data responsibly and effectively in our AI systems. The cards need to tell the story of our data - where it came from, how it's being used, what sensitivities it might contain, and how it's protected.
You may want to come up with the minimum set of information that you think is relevant for your organisation and context, but here's what I think are the most important pieces:
Lineage and provenance form the foundation. We need to understand not just where data originated, but how it's been processed, combined, and transformed. When TalentMatch uses historical hiring data, for instance, we need to know which hiring decisions were included, what cleaning was applied, and how the data was anonymised. This history becomes invaluable when investigating potential biases or unexpected behaviours. If data was purchased from third parties (as it very commonly is), we need to understand and verify the assurances they provide - it's not enough to simply assume the third party delivered high-quality, clean, unbiased data (even if the purchase contract says so - buyer beware).
Quality characteristics can be difficult to condense to simple metrics. Yes, we need to know about completeness and accuracy, but we also need to understand representativeness. Does our hiring data reflect all departments equally? Are certain time periods over-represented? These patterns can subtly influence model behaviour in ways that matter for governance.
Privacy and sensitivity classifications are crucial. Some data elements, like candidate names or contact information, are obviously sensitive. Others, like aggregated performance metrics or skill assessments, occupy a greyer area. We need to keep track of what sensitive information exists in each dataset and how it's protected.
Usage constraints and permissions shape how data can be responsibly used. Perhaps some historical data was collected under privacy policies that didn't anticipate AI analysis. Maybe certain datasets can only be used for specific purposes due to contractual obligations, consent processes, or even the citizenship of the subject (think GDPR). These constraints need to be captured and surfaced to prevent misuse.
We also need to be aware of refresh and update mechanisms so that we can have some confidence that the dataset is current. Some datasets, like skills taxonomies, might be updated quarterly. Others, like candidate profiles, update in real-time. Understanding these patterns is crucial for monitoring data quality and detecting drift.
Relationships and dependencies between datasets are often overlooked but critically important. If TalentMatch and PathFinder share a common skills dataset, changes to that dataset could affect both systems simultaneously.
So, here’s what that looks like in practice:
Example Dataset Card for TalentMatch Historical Hiring Outcomes
1️⃣ OVERVIEW
Name: Historical Hiring Outcomes DS-223
Description: Comprehensive record of hiring decisions and outcomes from 2018-2023
Primary Use: Training job fit prediction models
Current Size: 125,000 records
Last Major Update: 2024-01-10
Owner: Jim Davidson
2️⃣ LINEAGE
Source Systems: ATS records (2018-2021); HR Management System (2021-2023); Performance Review Database
Processing Steps: Automated PII removal; Standardisation of job titles; Validation against employee records
Quality Assurance: Monthly completeness checks
3️⃣ SENSITIVITY
Classification: Highly Sensitive
PII Elements: Names/emails removed, demographic data aggregated
Access Controls: Role-based, requires HR clearance
Retention Policy: 5 years from collection
Regulatory Requirements: GDPR, CCPA
4️⃣ QUALITY CHARACTERISTICS
Completeness: 94% required fields populated
Temporal Coverage: 2018-2023 (gap in Q2 2020 due to system migration)
Known Biases: Overrepresentation of technical roles (65% vs 45% org-wide); Limited data from international offices
Data Quality Score: 87/100 (methodology in DQ-892)
5️⃣ USAGE PARAMETERS
Approved Uses: Model training, aggregate analysis
Prohibited Uses: Individual employee evaluation
Required Controls: Aggregation of sensitive fields
Minimum Sample Sizes: 50 records for demographic analysis
6️⃣ RELATIONSHIPS
Dependent Models: Job Fit Prediction (primary); Skills Gap Analysis (secondary)
Linked Datasets: Current Employee Profiles (DS-224); Skills Taxonomy (DS-225)
Impact Scope: High - affects multiple core predictions
7️⃣ MAINTENANCE
Update Frequency: Monthly append of new records
Quality Monitoring: Weekly automated checks
Drift Detection: Monitored against 2023 baseline
Review Schedule: Quarterly validation by HR Analytics
This simplified, structured card serves multiple audiences. Data scientists can quickly understand or confirm what they're working with, assurance professionals can verify appropriate controls are in place, and governance committees can assess risks and dependencies. Even though it’s a very narrow set of information on a Dataset Card, it can still help prevent issues before they arise. We can imagine that when TalentMatch's team wanted to use historical hiring data to train a new salary prediction feature, the Dataset Card immediately flagged this as a prohibited use. When they noticed performance differences between technical and non-technical roles, the documented bias in representation helped explain why.
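That kind of flag doesn't have to rely on someone remembering to re-read the card. As a sketch (the structure and function names here are illustrative, not a prescribed tool), the usage parameters from a Dataset Card can be checked programmatically before a new use goes ahead:

```python
# Illustrative check of a proposed use against a Dataset Card's usage parameters.
# The entries mirror DS-223's card and the salary-prediction scenario above.

DS_223_USAGE = {
    "approved": {"model training", "aggregate analysis"},
    "prohibited": {"individual employee evaluation", "salary prediction"},
}


def check_proposed_use(proposed_use: str, usage: dict) -> str:
    proposed_use = proposed_use.lower().strip()
    if proposed_use in usage["prohibited"]:
        return f"BLOCKED: '{proposed_use}' is a prohibited use - escalate to the dataset owner"
    if proposed_use in usage["approved"]:
        return f"OK: '{proposed_use}' is an approved use"
    return f"REVIEW: '{proposed_use}' is not listed - needs governance review before proceeding"


print(check_proposed_use("Salary prediction", DS_223_USAGE))
```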
The key is keeping this documentation living and relevant. Dataset Cards are ideally kept to just one page, so they’re relatively easy to keep up to date. They need regular reviews and updates as data evolves, uses change, and new relationships emerge. This ongoing maintenance is crucial for effective governance, ensuring our understanding of our data foundation remains current and accurate. There are sophisticated tools for doing this, but in my experience, nothing beats a one-page printed card, stuck on a wall. Just remember, the goal isn't to document everything about the data - it's to illuminate the aspects that matter for governance.
Interface Cards for Governance
Now let's turn to Interfaces - the boundary points where AI systems connect with users and other systems. These touchpoints shape not just how people interact with AI, but often determine whether a system will be used safely and effectively in practice. Interfaces aren’t just the dashboards and chatbot screens where humans interact with AI, but also the APIs, service endpoints, and system integrations that connect our AI capabilities to the broader technical ecosystem. Each interface, whether human or machine-facing, represents both an opportunity for value creation and a potential point of failure or attack.
Think of interfaces as the doors and windows of your AI system's house. Just as a house needs a welcoming entrance for residents and secure barriers against intruders, our AI interfaces must balance accessibility with protection. When TalentMatch exposes its job fit predictions, it does so through multiple interfaces - each with its own security considerations. The recruiter's dashboard needs strong authentication and audit trails. The hiring manager's mobile app requires secure data transmission and session management. The candidate portal must protect sensitive personal information. And the APIs integrating with other HR systems need robust access controls and input validation to prevent manipulation.
These technical interfaces bring their own governance challenges. An API endpoint exposing model predictions could be probed systematically to reverse-engineer the underlying model. A poorly secured integration point might leak sensitive training data. Even seemingly innocent status updates could reveal confidential information through careful pattern analysis. The governance approach and architecture of our interfaces must account for these threats while still delivering their purpose.
Let me share what has proven most crucial to document about interfaces, bearing in mind that the information you need here is probably more technical in nature, and much more security focused than any of the other types of cards:
How authentication and authorisation are performed is foundational. Beyond simple user permissions, we need to document how different types of access are verified and managed. How are API keys rotated? How are session tokens handled? What encryption protects data in transit? These technical controls determine whether our interfaces can resist sophisticated attacks.
Input validation becomes a critical security boundary. Every interface that accepts data - whether from a human user or another system - represents a potential attack vector. When a hiring manager uploads a job description, we need more than just bias checking - we need protection against malicious file uploads, SQL injection, and other technical attacks. The same applies to API endpoints - what rate limiting prevents denial of service? What schema validation prevents malformed requests?
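To illustrate just one of those questions, here's a minimal sketch of a per-user rate limiter. The 100-requests-per-minute figure is taken from the example Interface Card below; the implementation itself is illustrative rather than a production recommendation:

```python
import time
from collections import defaultdict, deque

# Sliding-window rate limiter illustrating "100 requests/minute per user".
WINDOW_SECONDS = 60
MAX_REQUESTS = 100

_request_log: dict[str, deque] = defaultdict(deque)


def allow_request(user_id: str) -> bool:
    """Return True if the user is still within their per-minute request budget."""
    now = time.monotonic()
    window = _request_log[user_id]
    # Discard timestamps that have aged out of the window
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_REQUESTS:
        return False  # a real deployment would also log and alert on this
    window.append(now)
    return True
```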
To ensure privacy, we need careful design of both human and technical interfaces. It's not enough to mask sensitive data on screens - we need to consider what information might be inferred from API responses, what metadata might leak through system integrations, and how multiple legitimate requests could be combined to compromise confidentiality. Our interface documentation needs to make these privacy implications explicit.
At the same time, we can't lose sight of the human factors that determine whether interfaces will be used safely and effectively. The most secure system in the world fails if users resort to dangerous workarounds because the intended interface is too cumbersome. This is why our interface documentation needs to address both technical security and human usability.
So like I said, these are the most technical cards, but here’s what I consider important to capture:
Example Interface Card for TalentMatch Recruiter Dashboard
1️⃣ OVERVIEW
Name: Recruiter Primary Dashboard (ID: IF-101)
Purpose: Enable recruiters to review and act on candidate recommendations
Interface Type: Web application + REST API
User Base: Internal recruiters, HR systems
Risk Level: High - processes sensitive candidate data
2️⃣ TECHNICAL ARCHITECTURE
Authentication: SAML2 SSO with MFA requirement
Session Management: JWT tokens, 30-minute timeout
API Security: OAuth 2.0 with PKI certificates
Data Transport: TLS 1.3 enforced
Rate Limiting: 100 requests/minute per user
Deployment: Load-balanced across three regions
3️⃣ SECURITY CONTROLS
Access Management: Role-based access with quarterly review; IP range restrictions for API access; Automated deprovisioning on role change
Input Validation: Schema validation on all API endpoints; Sanitisation of user-provided content; File upload scanning and quarantine
Monitoring: Real-time threat detection; Anomaly detection for usage patterns; Automated blocking of suspicious activity
4️⃣ PRIVACY PROTECTION
Data Minimisation: PII redaction in logs and displays; Aggregate-only view for sensitive metrics; Time-limited data retention
Access Controls: Geographic data restrictions; Department-level data segregation; Audit logging of sensitive field access
5️⃣ USER FUNCTIONALITY
Key Features: Candidate fit scoring visualisation; Detailed factor breakdown; Batch candidate comparison; Action triggers (schedule interview, request additional info)
Required Training: Basic dashboard orientation (TR-201)
6️⃣ USER INTERFACE CONTROLS
Display Elements: Primary fit score (0-100 scale); Confidence indicator (High/Medium/Low); Key factor visualisation (top 3 positive/negative); Low confidence score alerts; Bias detection warnings; Data quality notifications
7️⃣ PERMISSIONS & WORKFLOW
Action Authorities: View candidates - All recruiters; Initial contact - Junior recruiters; Override scores - Senior recruiters only; Bulk actions - Team leads only
Required Approvals: Two-person review for high-impact decisions; Manager approval for unusual patterns; Ethics review for new feature deployment
8️⃣ MONITORING & AUDIT
Security Monitoring: Real-time intrusion detection; Failed authentication tracking; Data access pattern analysis
Usage Analytics: Feature utilisation rates; Error frequencies; Override patterns; Response times
Audit Trail: Comprehensive action logging; User session recording; Change history preservation
9️⃣ INCIDENT RESPONSE
Security Incidents: Automated threat blocking; Incident escalation workflow; Forensic data preservation
Operational Issues: Degraded service procedures; Backup system activation; Communication templates
🔟 DEPENDENCIES & INTEGRATIONS
Connected Systems: Job Fit Prediction Model (M-101); Candidate Database (DS-224)
Security Dependencies: Identity Provider; Certificate Authority; Security Information and Event Management (SIEM); Data Loss Prevention (DLP) system
The aim with this Interface Card is to capture just the most essential information for both the human and technical dimensions of interface governance. Again, just one page of documentation is particularly valuable during system changes. It may not provide all the details - those will be in engineering designs and specifications - but it provides the pointers to make sure important considerations and requirements are not missed. And when it comes to identifying threats and vulnerabilities, these kinds of one-page Interface Cards are incredibly valuable for threat modelling workshops.
Agent Cards for Governance
Let me complete our mapping of AI system components by examining Agents - the newest and possibly the most challenging element to govern. Agents are the automated or semi-automated actors that implement decisions and take actions based on our model outputs. Not every AI system will have Agents, but in a cyber-physical system (one that interacts with the real world), these are perhaps the most crucial elements to document from a governance perspective, as they represent the bridge between AI decision-making and real-world impact.
While models make predictions and interfaces present information, agents are what actually make things happen - sending emails, scheduling interviews, updating records, or triggering other systems. In our hypothetical TalentMatch, for instance, when the system identifies a promising candidate, it's not just displaying that information - agents might automatically schedule initial screenings, send personalised welcome messages, and update candidate status across connected systems.
The power of agents to take autonomous action makes their governance particularly difficult. An agent that automatically rejects candidates based on certain criteria isn't just making a recommendation - it's making consequential decisions that directly impact people's lives. An agent that proactively reaches out to passive candidates needs to carefully balance engagement with privacy concerns. An agent that can make decisions with financial consequences has to be able to confidently assess the potential loss. These autonomous capabilities demand rigorous documentation and oversight.
Although we’re very early in the adoption of agents, never mind formalised ways to govern them, here's what I believe are the most essential factors to document about agents, in a simplified form:
The scope of authority defines what actions an agent can take without human intervention. This goes beyond a simple list of permitted actions - we need to understand the full extent and limits of the agent's autonomy. Can it merely suggest actions, or can it execute them? What thresholds trigger the need for human approval? Consider our TalentMatch scheduler: it might have authority to schedule initial screenings but require human approval for final-round interviews. These boundaries need to be explicitly documented and regularly reviewed as capabilities evolve.
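As a sketch of how that boundary might be enforced in code (the stage names come from the scheduler example; the rest is illustrative):

```python
# Illustrative authority check: the scheduler may book initial screenings on its
# own, but later stages need a human, and anything undocumented is refused outright.
AUTONOMOUS_STAGES = {"initial_screening"}
APPROVAL_REQUIRED_STAGES = {"second_round", "final_round"}


def can_act_autonomously(stage: str) -> bool:
    if stage in AUTONOMOUS_STAGES:
        return True
    if stage in APPROVAL_REQUIRED_STAGES:
        return False  # route to a recruiter for explicit approval
    # Fail closed: an undocumented action is outside the agent's scope of authority
    raise ValueError(f"Stage '{stage}' is not in the agent's documented scope")
```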
The impact radius helps us understand how far-reaching an agent's actions can be. Some agents might affect a single user or transaction, while others could impact entire groups or systems simultaneously. A TalentMatch agent that updates a single candidate's status has a narrow impact radius, but one that implements a new hiring freeze policy could affect thousands of candidates at once. Understanding this reach is crucial for appropriate oversight and control mechanisms.
The failure modes and fallbacks document how we handle things when they go wrong - because they will. What happens if an agent loses connection to a critical service? How do we ensure graceful degradation rather than catastrophic failure? For each autonomous capability, we need clear documentation of what could go wrong and how the system maintains safety. Our interview scheduler might need fallback procedures for when calendar systems are unavailable or when urgent changes are needed outside business hours.
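A minimal sketch of that kind of graceful degradation, assuming a hypothetical calendar client and a queue for human follow-up:

```python
# Illustrative fallback: if the calendar service is unreachable, the agent hands
# the task to a recruiter queue instead of failing silently or retrying forever.
def schedule_with_fallback(candidate_id: str, slot: str, calendar_client, human_queue):
    try:
        return calendar_client.book(candidate_id, slot)
    except ConnectionError:
        human_queue.put({
            "action": "manual_schedule",
            "candidate": candidate_id,
            "slot": slot,
            "reason": "calendar service unavailable",
        })
        return None
```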
Resource consumption and constraints tell us what limits we place on an agent's activities. This might include both technical resources (API calls, processing time) and business resources (budget authority, time commitments). An agent scheduling interviews needs clear limits on how many slots it can book, how far in advance it can schedule, and what resources it can commit. These constraints are crucial guardrails to prevent unintended consequences from automated actions.
The oversight mechanisms detail how we maintain meaningful human control over agent behaviour. This goes beyond simple approvals to include monitoring patterns, detecting anomalies, and ensuring accountability. How do we track an agent's decisions over time? Who reviews its performance and how often? What triggers elevated scrutiny? In TalentMatch, we might need special oversight when an agent's rejection rate spikes or when its scheduling patterns start to deviate from historical norms.
Let me show this through a concrete example:
Example Agent Card for TalentMatch Interview Scheduler
1️⃣ OVERVIEW
Name: Automated Interview Scheduler (ID: AG-301)
Purpose: Streamline interview scheduling for qualified candidates
Activation: Triggered by candidate fit score >85
Operating Hours: 24/7 with quiet hours 22:00-06:00 local time
2️⃣ AUTHORITY SCOPE
Autonomous Actions: Schedule initial phone screenings; Send calendar invites; Update candidate status
Required Approvals: Final round scheduling; Any schedule changes <24h notice; Multiple reschedule requests
3️⃣ DECISION LOGIC
Primary Triggers: High fit score (>85); Complete candidate profile; Available interviewer slots
Blocking Conditions: Incomplete requirements; Previous rejection within 6 months; Scheduling conflicts
Exception Handling: Escalation to human recruiter; Automatic hold on edge cases
4️⃣ SAFETY CONTROLS
Rate Limits: Max 3 attempts per candidate; Max 20 schedules per day; Cooldown period between attempts
Validation Checks: Time zone verification; Double-booking prevention; Communication template compliance
5️⃣ MONITORING & OVERSIGHT
Activity Logging: All scheduling attempts; Success/failure rates; Response patterns
Alert Conditions: Unusual activity spikes; High failure rates; Pattern anomalies
Review Schedule: Weekly activity summary
6️⃣ DEPENDENCIES
Required Services: Calendar integration; Email system; Candidate database
Connected Agents: Notification Manager (AG-302); Status Updater (AG-303)
7️⃣ COMMUNICATION
Message Templates: Initial invitation; Confirmation; Rescheduling; Cancellation
Language Support: English, Spanish
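To show how a card like this translates into enforceable checks, here's a sketch applying the activation trigger, blocking conditions and rate limits documented above. The thresholds come straight from the card; the code structure itself is illustrative:

```python
# Illustrative enforcement of AG-301's triggers, blocking conditions and safety limits.
FIT_SCORE_THRESHOLD = 85          # "Triggered by candidate fit score >85"
MAX_ATTEMPTS_PER_CANDIDATE = 3    # "Max 3 attempts per candidate"
MAX_SCHEDULES_PER_DAY = 20        # "Max 20 schedules per day"


def should_schedule(candidate: dict, attempts_for_candidate: int, schedules_today: int) -> bool:
    """Apply the Agent Card's documented rules before any autonomous scheduling."""
    if candidate["fit_score"] <= FIT_SCORE_THRESHOLD:
        return False                                  # activation trigger not met
    if not candidate.get("profile_complete", False):
        return False                                  # blocking: incomplete requirements
    if candidate.get("rejected_within_6_months", False):
        return False                                  # blocking: recent rejection
    if attempts_for_candidate >= MAX_ATTEMPTS_PER_CANDIDATE:
        return False                                  # safety: per-candidate limit
    if schedules_today >= MAX_SCHEDULES_PER_DAY:
        return False                                  # safety: daily limit
    return True
```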
The technical capabilities of agents are rapidly changing and we’re only beginning to learn how to best govern them, so this kind of Agent Card is likely to change substantially as those capabilities and governance evolve. New features might expand an agent's authority, changes in organisational policy might require adjusted decision criteria, and learned patterns from monitoring might suggest needed safeguards.
Together with our Model Cards, Dataset Cards, and Interface Cards, these Agent Cards complete our documentation of AI systems' core components. Each card type illuminates different aspects of governance, but they work together to create a complete picture of how our systems operate, what risks they present, and how we manage them responsibly.
The power of this mapping approach lies in its ability to make complex systems comprehensible without losing crucial detail. By breaking down AI systems into these fundamental components and documenting each in a structured but accessible way, we create a foundation for effective governance that can evolve alongside our AI capabilities. I prefer to have a single page card for each element, printed on laminated cards, or posted on a wall so they become artifacts of communication and team-work.
In these two articles, we've built a comprehensive picture of our AI landscape - from high-level business capabilities down to the intricate interplay of models, datasets, interfaces, and agents. We now have an approach for structured documentation that shows not just what our systems do, but how they work together, what they are composed of and where risks might emerge. It’s an excellent starting point for your AI Management System, but before moving forward, we have one last discovery step to take.
You see, your company likely already has established practices for managing technology, handling data, ensuring compliance, and mitigating risks. Some of these will provide strong foundations you can build upon. Others might need careful adaptation to handle AI's unique challenges. And in some cases, you'll need to bridge gaps between traditional governance and AI's novel demands. What you cannot do is pretend these existing structures don't exist or attempt to build a parallel governance universe. The art of successful AI governance lies not just in understanding your AI systems, but in weaving their oversight seamlessly into your organisation's broader governance fabric. That's the challenge we'll tackle next - turning from an inventory of our AI Systems to an inventory of our existing governance foundations.
https://www.ethos-ai.org/p/mapping-your-ai-landscape-part-1
https://modelcards.withgoogle.com/about
https://huggingface.co/docs/hub/models-the-hub
https://ai.meta.com/blog/system-cards-a-new-resource-for-understanding-how-ai-systems-work/
https://www.fda.gov/medical-devices/software-medical-device-samd/transparency-machine-learning-enabled-medical-devices-guiding-principles
https://sites.research.google/datacardsplaybook/
https://arxiv.org/pdf/1803.09010
https://huggingface.co/docs/hub/en/datasets-overview