Copresence

Copresence is one of the constraints on communication identified in Clark and Brennan's foundational work on grounding, the collaborative process by which communicators establish mutual understanding. Copresence means that participants share the same environment, so each can perceive and act on what the others are attending to as it happens. In a face-to-face conversation, copresence is literal: both parties are physically present in the same place and can see, hear, and act on the same objects and events at the same time. This is what allows a speaker to point at something and trust that the listener will see exactly what is being pointed to, without describing the referent in words.

In digital communication, copresence must be technologically mediated. A video call provides visual copresence (both parties see each other and their surroundings on video); a shared document provides textual copresence (both parties see the same text); a screen-sharing session can provide application-level copresence (both parties see and interact with the same software). The richer the copresence channel, the more efficiently complex information can be communicated, because participants can rely on direct mutual awareness instead of putting everything in the shared context into words.

The relevance to AI interaction models is that copresence is a key enabler of efficient human-AI collaboration. An AI that can see what the user sees (visual copresence), hear what the user hears (audio copresence), and observe what the user does (behavioral copresence) does not need the user to describe their context in words. The interaction model's continuous processing of audio and video streams is what gives it copresence with the user's perceptual environment. A turn-based system, by contrast, receives only explicit user inputs and has no independent access to the shared context.
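The contrast between a copresent and a turn-based system can be illustrated with a minimal Python sketch. Everything in it is a hypothetical illustration and not a description of any particular system: the names Observation, SharedContext, and resolve_reference, the channel label "video", and the rule of resolving the word "this" to the most recent visual observation are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Observation:
    channel: str      # hypothetical channel label, e.g. "video" or "audio"
    timestamp: float  # seconds since the session started
    content: str      # what was perceived, e.g. "cursor on cell B7"


@dataclass
class SharedContext:
    """Rolling record of what the AI perceived alongside the user."""
    observations: list = field(default_factory=list)

    def observe(self, obs: Observation) -> None:
        self.observations.append(obs)

    def latest(self, channel: str, before: float) -> Optional[Observation]:
        """Most recent observation on `channel` at or before time `before`."""
        candidates = [o for o in self.observations
                      if o.channel == channel and o.timestamp <= before]
        return max(candidates, key=lambda o: o.timestamp) if candidates else None


def resolve_reference(utterance: str, spoken_at: float,
                      context: Optional[SharedContext]) -> str:
    """Resolve the word 'this' from shared context, if any exists.

    Passing context=None models a turn-based system that only
    receives the explicit utterance.
    """
    if "this" not in utterance.lower().split():
        return utterance
    referent = context.latest("video", spoken_at) if context else None
    if referent is None:
        return "clarification needed: what does 'this' refer to?"
    return f"{utterance} [this = {referent.content}]"
```

With a shared context, an utterance such as "Why is this wrong?" can be tied to whatever the user was looking at when they spoke. Without one, the same utterance forces a clarification request, which is the extra verbal work that copresence removes.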

The limitation is that copresence requires the user to be comfortable with the AI having access to their perceptual environment. There is a fundamental tension between the collaborative benefits of copresence (richer shared context, more efficient communication) and the privacy implications of continuous audio and video access. How to design systems that provide the collaborative benefits of copresence while respecting user privacy and agency is an open design question.
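One possible direction, offered only as a sketch and not as an answer to the open question, is to let the user grant and revoke access per channel, so that the AI's shared context contains only what the user has agreed to share. The ConsentPolicy class, its method names, and the channel labels below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentPolicy:
    """Tracks which perceptual channels the user currently shares."""
    granted: set = field(default_factory=set)

    def grant(self, channel: str) -> None:
        self.granted.add(channel)

    def revoke(self, channel: str) -> None:
        self.granted.discard(channel)

    def filter(self, observations: list) -> list:
        """Keep only (channel, content) pairs from channels the user shares."""
        return [(channel, content) for channel, content in observations
                if channel in self.granted]


# Hypothetical usage: the user shares their screen but not their microphone.
policy = ConsentPolicy()
policy.grant("screen")
incoming = [("screen", "spreadsheet open"), ("audio", "background conversation")]
visible_to_ai = policy.filter(incoming)
```

A gate like this trades some copresence for control: each channel the user withholds is context they may later have to describe in words.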