To use AI on company data without leaking it, take six steps: keep each source's existing permissions instead of copying data into a new index; govern access at the point of retrieval; redact the sensitive part of a file rather than the whole file; record every access in a content-blind log; put guardrails on autonomous agents; and keep a tamper-evident audit trail you can show to security.
1. Connect sources without copying the data
The safest place for your knowledge is where it already lives, with the permissions it already has. Connect AI to your sources (Slack, Drive, GitHub, Notion and the rest) rather than bulk-copying everything into a new index that loses those permissions.
2. Govern at the point of retrieval
Before the AI answers, confirm who is asking and check their real permissions against each source. Return only what that specific person or agent is cleared to see, every time, in real time. This is the single most important step.
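As a rough illustration of the retrieval-time check described above, here is a minimal Python sketch. Everything in it is hypothetical: the Document type, the SOURCE_ACL table and the can_read and retrieve helpers are stand-ins invented for this example. A real deployment would ask each source's own permission API at request time rather than read a local table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    source: str
    text: str


# Hypothetical stand-in for each source's live permissions. In practice this
# lookup would be a call to the source itself, made at request time.
SOURCE_ACL = {
    ("drive", "q3-plan"): {"alice", "bob"},
    ("slack", "eng-thread"): {"alice"},
    ("github", "infra-readme"): {"alice", "bob", "build-agent"},
}


def can_read(principal: str, doc: Document) -> bool:
    """Check the principal against the source's current permissions."""
    return principal in SOURCE_ACL.get((doc.source, doc.doc_id), set())


def retrieve(principal: str, candidates: list[Document]) -> list[Document]:
    """Return only the candidates this principal is cleared to see."""
    return [doc for doc in candidates if can_read(principal, doc)]


docs = [
    Document("q3-plan", "drive", "Q3 plan"),
    Document("eng-thread", "slack", "Engineering thread"),
    Document("infra-readme", "github", "Infra readme"),
]

# Bob is not on the Slack thread, so it never reaches the model's context.
visible_to_bob = [doc.doc_id for doc in retrieve("bob", docs)]
```

The filter runs before anything is handed to the model, so a document the asker cannot open is never part of the answer. Unknown principals and unknown documents both fall through to an empty set, which means the default is to deny.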
3. Redact the sensitive part, not the whole file
Blocking entire documents makes AI useless and trains people to work around it. It is better to withhold just the sensitive field, such as one salary column or one unreleased number, and let the rest stay useful. People get answers; secrets stay secret.
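Field-level redaction of the kind described above might look like the following Python sketch. The FIELD_POLICY table, the group names and the redact_record helper are assumptions made up for illustration, not a specific product's API.

```python
REDACTED = "[REDACTED]"

# Hypothetical policy: field name -> groups allowed to see that field.
# Fields not listed here are visible to everyone who can see the record.
FIELD_POLICY = {"salary": {"hr"}}


def redact_record(record: dict, viewer_groups: set[str]) -> dict:
    """Return a copy of the record with restricted fields masked."""
    redacted = {}
    for field_name, value in record.items():
        allowed_groups = FIELD_POLICY.get(field_name)
        if allowed_groups is not None and not (allowed_groups & viewer_groups):
            redacted[field_name] = REDACTED
        else:
            redacted[field_name] = value
    return redacted


row = {"name": "Dana", "team": "Platform", "salary": 150000}

engineer_view = redact_record(row, {"engineering"})
hr_view = redact_record(row, {"hr"})
```

An engineer still learns who is on the Platform team; only the salary value is replaced. The original record is left unmodified, so the same row can be served to different viewers with different masks.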
4. Record every access, content-blind
Write every access to a tamper-evident log that proves what happened without storing the content itself. Content-blindness matters: the record is then safe to share with auditors, and no vendor can read your data through the log.
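One possible way to make an access record content-blind is to store a keyed digest of the content instead of the content. The Python sketch below uses HMAC-SHA256 from the standard library. The record fields, the helper names and the demo key are assumptions for illustration. The keyed digest is this example's choice: a plain hash of short or predictable content can be guessed by hashing candidates, whereas an HMAC cannot be recomputed without the key.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def content_fingerprint(content: bytes, key: bytes) -> str:
    """Keyed digest that identifies the content without revealing it."""
    return hmac.new(key, content, hashlib.sha256).hexdigest()


def make_access_record(principal: str, resource_id: str,
                       content: bytes, key: bytes) -> dict:
    """Build a log entry holding who, what and when, but not the content."""
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "principal": principal,
        "resource": resource_id,
        "content_hmac": content_fingerprint(content, key),
    }


# Hypothetical key, assumed to be held by the data owner and not the vendor.
key = b"demo-key-held-by-the-data-owner"
secret_text = b"Revenue target: 12M"

record = make_access_record("alice", "drive:q3-plan", secret_text, key)
log_line = json.dumps(record, sort_keys=True)
```

Anyone holding the key and the original content can recompute the digest and confirm that this exact content was accessed. Anyone holding only the log line sees a principal, a resource identifier, a timestamp and 64 hexadecimal characters.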
5. Put guardrails on agents
Autonomous agents act faster than people and at larger scale. Give them explicit limits, human-in-the-loop approval for sensitive actions, and a kill switch. The same governance that covers people should cover agents.
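The three guardrails named above (explicit limits, human approval for sensitive actions, a kill switch) can be sketched as a small gate that every agent action passes through. The AgentGuard class, its fields and the action names below are hypothetical, chosen only to show the order of the checks.

```python
from dataclasses import dataclass, field


@dataclass
class AgentGuard:
    max_actions: int
    sensitive_actions: set[str] = field(default_factory=set)
    killed: bool = False
    actions_taken: int = 0

    def kill(self) -> None:
        """Kill switch: block every further action."""
        self.killed = True

    def authorize(self, action: str, human_approved: bool = False) -> str:
        """Decide whether the agent may perform the action right now."""
        if self.killed:
            return "blocked: kill switch engaged"
        if self.actions_taken >= self.max_actions:
            return "blocked: action limit reached"
        if action in self.sensitive_actions and not human_approved:
            return "pending: needs human approval"
        self.actions_taken += 1
        return "allowed"


guard = AgentGuard(max_actions=2, sensitive_actions={"delete_repo"})

results = [
    guard.authorize("read_doc"),
    guard.authorize("delete_repo"),
    guard.authorize("delete_repo", human_approved=True),
    guard.authorize("read_doc"),
]

guard.kill()
after_kill = guard.authorize("read_doc")
```

The kill switch is checked first so that it overrides everything else, including a human approval. A request that is pending approval does not count against the action limit, since nothing was executed.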
6. Be able to prove it
When security or legal asks whether anything leaked, the answer should come with receipts. A tamper-evident, independently verifiable audit, optionally anchored on-chain, turns "can you prove it?" into "yes, here."
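A common way to make a log tamper-evident is a hash chain, in which each entry's hash covers the previous entry's hash. The Python sketch below shows that general technique under assumed names (GENESIS, entry_hash, append, verify). It is an illustration of the idea, not a description of any particular product's audit format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the payload together with the previous entry's hash."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list[dict], payload: dict) -> None:
    """Add an entry that is chained to the current end of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev": prev,
                "hash": entry_hash(prev, payload)})


def verify(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log: list[dict] = []
append(log, {"principal": "alice", "resource": "drive:q3-plan"})
append(log, {"principal": "build-agent", "resource": "github:infra-readme"})

intact = verify(log)

# Simulate someone rewriting history after the fact.
log[0]["payload"]["principal"] = "mallory"
tamper_detected = not verify(log)
```

Verification needs only the log itself and SHA-256, so an auditor can run it without trusting whoever produced the log. The chain alone cannot reveal that trailing entries were dropped or that the whole log was rebuilt from scratch; publishing the latest hash somewhere outside the operator's control, which is the role the optional on-chain anchor would play, closes that gap.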