The visual mode refers to the images and characters that people see.
It is sometimes possible to find compositions that almost, if not completely, rely on a single mode. For instance, the “No Guns” symbol has no alphabetic text and no sound. Like many signs, it relies for its meaning on visual information. However, we might be able to say that the sign uses the spatial mode as well, since the gun appears behind the red bar that signals “no” or “not allowed.” So while the visual dominates in signs, even this composition is not “purely” visual.
The aural mode focuses on sound, including, but not limited to, music, sound effects, ambient noises, silence, tone of voice in spoken language, volume of sound, emphasis, and accent.
An example of the aural mode, one that depends almost exclusively on sound, might be the recording of a public speech that was delivered orally to a live audience, such as William Howard Taft’s 1908 speech “The Farmer and the Republican Party.” Delivered before radio and recorded on a 10-inch phonograph record, a speech like this one represents one of the early examples of hearing a speech without being in the same time and place as the speaker.
The gestural mode “refers to the way movement is interpreted. Facial expressions, hand gestures, body language, and interaction between people are all gestural modes. This has always been important in face-to-face conversations and in theater, but it has become more apparent on the web lately with the wide use of YouTube and other video players. The gestural mode works with linguistic, visual, aural, and sometimes even spatial modes in order to create more detail and convey it better to the consumer.”
Linguistic (or Alphabetic)
The linguistic mode refers to written or spoken words. The mode includes word choice, the delivery of written or spoken text, the organization of words into sentences and paragraphs, and the development and coherence of words and ideas. The linguistic mode is not always the most important one; this depends on the other modes at play in the text, the type of text, and other factors. It is probably the most widely used mode because it can be both read and heard, whether on paper or in audio. The linguistic mode is well suited to expressing details and lists.
The spatial mode, as the name implies, refers to the arrangement of elements in space. It involves the organization of items and the physical closeness between people and objects.
A good example of the spatial mode might be the different ways in which chairs and desks are arranged in a classroom.
Here is a “traditional” classroom: Individual desks are arranged in orderly rows, facing the front of the room, to make the teacher, who would stand before the chalkboard, the center of attention. The teacher also stands at a distance from the students; the students who sit in the back can hardly even see the board!
By contrast, in this advertisement for “collaborative classrooms,” we see the chairs and desks clustered in small groups so that students can work together on projects. The classroom is also de-centered, which suggests that the teacher and students are working together as partners rather than in a hierarchical manner. All of the people are in close proximity to one another.
- Kristin L. Arola, Jennifer Sheppard, and Cheryl E. Ball. Writer/Designer: A Guide to Making Multimodal Projects. Bedford/St. Martin's, 2014.