On the Uses of AI in Creative Work

Jul 25, 2024

Note: this is a draft that I am actively working on.

Last night I attended a demo night at South Park Commons where some folks working on AI and malleable software gave demos of their recent creations.

The prevailing sentiment: AI (specifically LLMs) enables malleability right now.

Computers Did It Backwards

Linus Lee gave an excellent talk proposing the synthesis of thought using LLM interfaces. He remarked on the history of arts becoming sciences. Consider music. We started by creating sounds using materials from nature, and through trial and error we learned how to create these sounds consistently and how to manufacture instruments to deliver a desired sound. We had sophisticated notions of timbre, pitch, and timing. Then we began to understand the physics of sound, created a mathematical model of sound waves, and used this model to create synthesizers which could generate any sound electronically. Thus art became science and then engineering, and craft became industrialized. The same process occurred with visual art and color. What about computing?

Computers began as rational instruments for mathematical calculation. It took a century of work before we could imagine creative uses for them, and another half-century before artificial intelligence progressed to the point of enabling general-purpose fuzzy computation using LLMs. Now we must follow the trend and discover a mathematical model of what LLMs do in order to use them effectively as a tool for thought.¹

The Two Uses of AI

When we leverage LLMs in the creative process (be it programming, writing, or even non-text mediums like design and visual art), a familiar tension arises: the use of AI enables and augments us, but diminishes our agency and threatens to reduce us from an active to a passive role. This is familiar because every technological advancement in art (made possible by the aforementioned mathematical models) goes this way. For example: from painting to photography, then digital photography, then AI editing, then AI image generation…

This tension can be resolved by teasing apart the two ways one wants to use AI in creative work: as spectacle and as assistant.

TODO

AI as Spectacle

For inspiration, and for end users who want to delegate the creative process to an oracle

Also occasionally useful to the skilled artist

TODO

Qualities:

defy the user’s expectations
expand the user’s abilities
surprise the user
do things the user can’t
produce a finished product
general-purpose, which means it can help with uses the designers of the software did not intend
- e.g. Kuwaiti dentist using websim

AI as Assistant

Automate the boring stuff. This lets an experienced craftsman focus on their overall design while speeding up the work they already know how to do but that isn’t interesting, new, or deep, and it still leaves the manual method available if they need it.

TODO

Qualities:

augment the user’s existing abilities
defer to the user in all decisions
be explainable, interpretable, changeable
follow a process, not just random inspiration
only do things the user can already do, but faster
operate on a small, structured piece of a whole
- see Geoffrey’s structured AI diffs in Patchwork
specialized, situated, opinionated

Improvements to AI

TODO

what AI will be able to do soon:
- LLMs will become:
  - faster
  - cheaper
  - better at humor, creativity
  - better at scientific reasoning
  - highly personalized
  - highly situated
  - multimodal in both input and output
  - able to act as agents (can perform operations on their own, subject to user approval)
- what they won’t be able to do: be robust, reliable, consistent. That’s not their purview

AI Maximalism

TODO

cost will go down, but AI will still be costly
vs AI minimalism
AI companies vs companies that happen to use AI as a tool
- former is usually AI as spectacle, latter is usually AI as assistant

Explainable AI

TODO

Concepts/Components
LLMs are good at JavaScript web apps because they’re trained on a huge sample of code for that…
- admit that this is the underlying structure (the “IR” of LLMs) and exploit it

¹ There is a mathematical model for how LLMs work, but not for what they do. To make a rough analogy: suppose we had 3D printers capable of creating perfect wooden instruments, but absolutely no idea how sound waves work. We would be totally empowered to make the best instruments, but totally blind to what makes an instrument produce the sounds we want. This is what prompting is like.