What makes a good human interface?

3 August 2024
SoHo, New York
8 mins

When I discuss interfaces on this blog, I’m most often referring to software interfaces: intermediating mechanisms from our human intentions to computers and the knowledge within them. But the concept of a human interface extends far before it and beyond it. I’ve been trying to build myself a coherent mental framework for how to think about human interfaces to knowledge and tools in general, even beyond computers.

This is the second of a pair of pieces on this topic. The other is Instrumental interfaces, engaged interfaces.


What makes a user interface good?

What are the qualities we want in something that mediates our relationship to our knowledge and tools?

When I visited Berlin earlier this year for a small conference on AI and interfaces, I spent my last free night in the city wandering and pondering whether there could be a general answer to this expansive question. My focus — both then and now — is on engaged interfaces, interfaces people use to deeply understand or explore some creative medium or knowledge domain, rather than to complete a specific well-defined task. (In this post, when I write interface, I specifically mean this type.) The question of what makes an interface compelling is particularly interesting for this type because, as I noted in the other post in this series, inventing good primitives for engaged interfaces demands broad, open-ended exploration. I was hopeful that foundational principles could guide our exploration process and make our search more efficient.

I returned home from that trip with a hazy sense of those principles, which have since crystallized through many conversations, research, and experiments.

What makes a good human interface?

A good engaged interface lets us do two things. It lets us

  1. see information clearly from the right perspectives, and

  2. express our intent as naturally and precisely as we desire.

To see and to express. This is what all great engaged interfaces — creative and exploratory tools — are about.

To see

A good engaged interface makes visible what is latent. In that way, it is like a great map. Good interfaces and maps enable us to explore some domain of information more effectively by visualizing, and letting us see, the right slices of a more complex underlying reality.

Data visualizations and notations, the backbone of many kinds of graphical interfaces, are maps for seeing better. Primitives like charts, canvases, (reverse-)chronological timelines, and calendars are all based on taking some meaningful dimension of information, like time or importance, and mapping it onto some space.

If we take some liberties with the definition of a data visualization, we can consider interface patterns like the “timeline” in an audio or video editing app to be maps as well. In fact, the more capable a video editing tool, the greater the variety of maps that tool offers users, enabling them to see different dimensions of the underlying project. An experienced video editor doesn’t just work with video clips on a timeline, but also has a “scope” for visualizing the distribution of color in a frame, color histograms and curves for higher-level tuning, audio waveforms, and even complex filtered and categorized views for navigating their vast library of source footage. These are all maps for seeing information clearly from diverse perspectives.

Straying even further, a table of contents is also a kind of data visualization, a map of a longer document that helps the reader see its structure at a glance. A zoomed-out thumbnail grid of a long paged document is yet another map in disguise, giving the reader a different, more scannable perspective on the underlying information.

Even when there isn’t an explicit construction of space in the interface, there is often a hidden metaphor gesturing at one. When we open a folder in a file browser, for example, we imagine hierarchies of folders above and below to which we can navigate. In a web browser, we imagine pages of history coming before and after the current page. When editing a document, the undo/redo “stack” gestures at a hidden chronological list of edits. Sometimes, these hidden metaphors are worth reifying into concrete visuals, like a list of changes in a file history view or a file tree in the sidebar of a code editor. But over time these inherently cartographic metaphors get collapsed into our imagination as we become more adept at seeing them in our minds.
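To make the undo/redo example concrete, here is a minimal sketch of that hidden chronological list reified into an inspectable object. The class and method names are illustrative assumptions, not any real editor’s API:

```python
from dataclasses import dataclass, field

@dataclass
class History:
    """The undo/redo "stack" made concrete: a chronological list of edits
    split into those applied (past) and those undone (future)."""
    past: list = field(default_factory=list)
    future: list = field(default_factory=list)

    def apply(self, edit):
        self.past.append(edit)
        self.future.clear()  # a fresh edit discards the redo branch

    def undo(self):
        if self.past:
            self.future.append(self.past.pop())

    def redo(self):
        if self.future:
            self.past.append(self.future.pop())

    def timeline(self):
        # The "map" a file-history sidebar could render: all edits in order,
        # with the current position at the boundary between past and future.
        return list(self.past), list(reversed(self.future))
```

Rendering `timeline()` in a sidebar is exactly the act of reifying the cartographic metaphor: the same structure users already imagine, drawn on screen.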

To express

Once we’ve seen what is in front of us, we need to act on that understanding. Often that comes in the form of manipulating the thing being visualized — the thing we see in the interface. A good engaged interface also helps us here by transparently translating natural human interactions into precise intents in the domain of the tool.

Simple applications accomplish this by letting the user directly manipulate the element of interest. Consider the way map applications allow the user to explore places by dragging and zooming with natural gestures, or how the modern WIMP desktop interface lets users directly arrange windows that logically correspond to applications. When possible, directly manipulating the underlying information or objects of concern, the domain objects, minimizes cognitive load and learning curve.

Sometimes, tools can give users much more capability by inventing a new abstraction. Such an abstraction represents latent aspects of a domain object that couldn’t be individually manipulated before. In one type of implementation, a new abstraction shows individual attributes of some underlying object that can now be manipulated independently. We often see this in creative applications like Photoshop, Figma, or drag-and-drop website builders, where a sidebar or attribute panel shows independent attributes of a selected object. By interacting directly with a color picker, font selector, or layout menu in the panel — the surrogate objects — the user indirectly manipulates the actual object of concern. To make this kind of interaction more powerful, many of these tools also have a sophisticated notion of selection. “Layers” in image editing apps are a new abstraction that makes both selection and indirect attribute manipulation more useful.

A second type of surrogate object is focused not on showing individual attributes, but on revealing intermediate states that otherwise wouldn’t have been amenable to direct manipulation, because they weren’t concrete. Spreadsheet applications are full of UI abstractions that make intermediate states of calculation concrete. A typical spreadsheet will contain many cells that store some intermediate result, not to mention the concept of a formula itself, which is all about making the computation itself directly editable. Version control systems take previously inaccessible objects — past versions of a document, or a single change, a “diff” — and allow the user to directly manipulate them to undo or reorder edits.
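The spreadsheet idea can be sketched minimally: each cell is either a value or a formula over other cells, so every intermediate result is a concrete object the user can inspect and edit independently. The cell names, the 8% tax rate, and the recomputation scheme here are illustrative assumptions, not any real spreadsheet’s model:

```python
# A toy spreadsheet: formulas make the computation itself directly editable,
# and intermediate results (like the subtotal in A3) are concrete cells.
cells = {
    "A1": 120,                                # unit price
    "A2": 3,                                  # quantity
    "A3": lambda get: get("A1") * get("A2"),  # subtotal: an intermediate state
    "A4": lambda get: get("A3") * 1.08,       # total with an assumed 8% tax
}

def get(name):
    """Evaluate a cell, recursively resolving formula references."""
    cell = cells[name]
    return cell(get) if callable(cell) else cell
```

Because the subtotal lives in its own cell rather than buried inside one opaque expression, the user can point at it, chart it, or redefine it — the essence of making a latent intermediate state manipulable.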

Direct manipulation

All of the interfaces I mention above are examples of direct manipulation, a term dating back at least to 1983 for interfaces that:

  1. Make key objects for some task visible to the user, and
  2. Allow rapid, reversible, incremental action on the objects.

This kind of interface lets us re-use our intuition for physical objects, movement, and space to see and express ideas in more abstract domains. An underrated benefit of direct manipulation is that it enables low-friction iteration and exploration of an idea space. Indeed, I think it’s fair to say that direct manipulation is itself merely a means to achieve this more fundamental goal: let the user easily iterate and explore possibilities, which leads to better decisions.

In the forty years since, direct manipulation has spread into nearly every corner of the landscape of knowledge tools. But despite its ubiquity, the most interesting and important part of creative knowledge work — understanding, coming up with ideas, and exploring options — still mostly takes place in our minds, with paper and screens serving as scratchpads and memory more than true thinking aids. There are very few direct manipulation interfaces to ideas and thoughts themselves, except in specific constrained domains like programming, finance, and statistics where mathematical statements can be neatly reified into UI elements.

Of course, we have information tools that use direct manipulation principles, like graphical word processors and mind mapping software. But even when using these tools, a user has to read and interpret information on screen, transform and manipulate it in their mind, and then relay their conclusions back into the computer. The intermediate states of thinking are completely latent. In the best thinking tools today, we still can’t play with thoughts, only words.

We are in the pre-direct manipulation, program-by-command-line age of thinking tools, where we cannot touch and shape our thoughts like clay, where our tools let us see and manipulate words on a page, but not the concepts and ideas behind them.

This realization underlies all of my technical research and interface explorations, though I’m certainly neither early nor unique in pursuing this vision. To me, solving this problem means freeing our most nuanced and ineffable ideas from our individual heads. It would give us a way to translate those thoughts into something we can hold in our hands and manipulate in the same way we break down an algebra problem with pencil and paper or graphs on a grid.

What could we accomplish if, instead of learning to hold the ever more complex problems in our world within our minds, we could break down and collaborate on them with tools that let us see them in front of us in full fidelity and bring our full senses and dexterity to bear on understanding and exploring the possibilities?

