In Plain Language: a new ISO standard for a generative age?

Andrew Joscelyne
Aug 13, 2023


After many years of promoting “plain language” (PL), mainly in the English-speaking world, the organizations behind “official” PL development have been successful in establishing plain language as an ISO standard. It received the label ISO 24495–1 in February, 2023.

What could this mean for language accessibility across cultures? What sort of issues does it raise in a generative AI mindscape?

ISO 24495–1:2023 is intended for anyone who creates or helps create content for “public” documents. It is particularly useful for government forms, contracts, and information services, as well as commercial documentation, terms of service (e.g. privacy policies for online apps!), and product manuals. The standard is also applicable to technical writing, legislative drafting, and even using “controlled languages”.

The aim is to reduce the misunderstanding of information, whether by hasty readers who fail to grasp written content properly or by variously “disabled” readers (neural-, lingual-, etc.) who do not have native mastery of a given tongue, writing style, or communication medium. Applying a standard makes it possible to agree on what constitutes “language understandability” as a social criterion within a community.

The kind of questions that a PL user should therefore try to answer are:

- What do I want to communicate?

- How do I want it done — fast, slowly, with humor, explicitly, in shorthand, and so on?

- Who is likely to read it — and also who is my ideal reader (or “a list of optional target readers” apart from a lawyer…)?

Note how news reports and commentator articles right across the web are currently changing from the old headline + a set of paragraphs to a headline title plus a small set of numbered, titled paragraphs such as these:

The news: …

Details:

What else:

Why it matters:

This shows how a PL approach can model the logic of a news item as the sort of questions a reader might raise about that item. Seeing everything as the answer to a typical reader’s questions is a classic strategy for PL.

How global is PL?

The new ISO standard has been developed by three mainly Anglophone organizations working in the PL sphere — Clarity (covering North America, Australia, and the UK), PLAIN (30 countries involved), and the US Center for Plain Language. In general, it looks as if PL is a mainly Anglo + European idea. It is hard to find examples of an official PL trend elsewhere.

The long road towards economic and cultural integration undertaken by European countries since WW2 has naturally involved considerable heated discussion about how to handle language and communication enhancement issues.

The EU opted for a policy of language equity and non-discrimination by encouraging positive multilingualism across the bloc. This covers a huge syllabus of applications, ranging from education options to designing translation technology. And it has presumably examined the PL question in general, even though each member state is responsible for implementing its own policies and methods.

For example, one PL product that includes six major EU languages has gone a step further and built software solutions to encourage PL uptake. This outfit has developed an app with a decidedly un-plain name called Glanupegalain — an e-learning package for writing content in nine PLs, with interactive lessons and instruction videos. It’s hard to get information on how effective it has been or how many people have actually trained on it. PL is clearly not a communication tooling priority in most countries.

In the USA, the Plain Writing Act of 2010 is a federal law that requires federal agencies to use PL in certain communications with the public, such as publications, forms, and letters. The law applies to documents that are necessary for obtaining federal government benefits and services, as well as for filing taxes.

Switzerland too has a Simplified Language bureau, largely associated with the media-handicapped community. Article 8 of the Federal Constitution states that everyone is equal before the law. Paragraph 2 states that “no one shall be discriminated against on the grounds of origin, race, sex, age or language…”

PL — an example

Here’s a typical example of what is meant by “plain language” as opposed to harder-to-read or “complicated” official text:

Before PL: “A commitment to eliminating contamination in reservoirs for human consumption is part of the Committee on Infrastructure and Planning’s prioritization agenda. Water service restoration plans are being evaluated by our senior hydro-mechanical engineers and technicians have been dispatched to areas affected by the service outage.” (47 words)

Using PL: “Providing clean drinking water is an essential city service. We are working to get your water service restored as quickly as possible.” (22 words)

This is the kind of radical communication tweak that most of us would support in the public sphere, and there is an increasing awareness that more directions, instructions, and similar informative language acts targeting the public can use this style.
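The word counts quoted for the two versions above can be reproduced with a few lines of Python. This is only a rough sketch: the function names are illustrative, the tokenizer is deliberately crude (every run of letters or digits counts as a word, so hyphenated and possessive forms count twice), and nothing here is a metric defined by the ISO standard.

```python
import re

BEFORE = (
    "A commitment to eliminating contamination in reservoirs for human "
    "consumption is part of the Committee on Infrastructure and Planning's "
    "prioritization agenda. Water service restoration plans are being "
    "evaluated by our senior hydro-mechanical engineers and technicians have "
    "been dispatched to areas affected by the service outage."
)
AFTER = (
    "Providing clean drinking water is an essential city service. We are "
    "working to get your water service restored as quickly as possible."
)


def word_count(text: str) -> int:
    """Count runs of word characters (a crude tokenizer, see the note above)."""
    return len(re.findall(r"\w+", text))


def sentences(text: str) -> list[str]:
    """Split on whitespace that follows a full stop, '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def average_sentence_length(text: str) -> float:
    """Mean number of words per sentence; 0.0 for empty input."""
    parts = sentences(text)
    return word_count(text) / len(parts) if parts else 0.0


for label, text in (("Before PL", BEFORE), ("Using PL", AFTER)):
    print(label, word_count(text), "words,",
          average_sentence_length(text), "words per sentence")
```

Shorter text is not automatically plainer, of course. A count like this is a quick sanity check, not a substitute for asking whether the intended reader can find, understand, and use the information.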

Note that document layout for citizens in general can also be redesigned to simplify and enhance the interface to actionable information. The rapid growth of “word processing” starting in the 1980s and the subsequent tsunami of apps on mobile phones have obviously transformed interfaces into a key design focus for textual presentation in general.

The context for all this is clear: the worldwide rise in local legislation, migration, travel, and population upheavals generally has meant that more humans are being confronted by more unfamiliar official linguistic information more often than ever before. PL is designed to ease the mental work needed to access and respond to vital knowledge from official sources.

However, the broader question now is this: can we automate PL production in line with this new standard so that everyone can benefit at the lowest cost? And what happens when we integrate this kind of language “knowledge/skill” standard into the fundamental generative act of prompting text and speech using emerging generative AI technology?

But first, a few plain facts…

Varieties of Plain

We are all familiar with the concepts of language registers, standards, or levels such as “plain”, “literate”, “colloquial”, “dialectal”, and so on. They appear to make sense but are rarely applied as explicit guides to practice. How often do we specifically request a register or “standard” of a given language when discussing content? We simply tend to imitate what we have already seen.

“Standard English”, for example, is a vague term for some imaginary corpus of core grammatical usage and lexical choice. But we sort of know what is meant by it. Above all, it avoids thousands of terms and expressions from the best-known dialects, idiolects, slang, and other types of geographically, historically, socially, or technically conditioned local speech/writing.

Another way to talk about standard vs. other language varieties is to suggest that written language should not attempt to imitate typical spoken forms in certain cases — e.g. where language is compressed in some way or uses rhyme and other sound features that do not communicate clearly in official writing.

By making PL appear to emphasize widely-shared usage (as opposed to special vocab-driven registers such as medical, legal, scientific, financial, literary/poetic, etc. writing), it certainly hits the “equality of understanding” button for many social groups accessing information.

But PL cannot really ensure constant understandability for everyone from every background over time. We know that usage and fields of reference are constantly changing, as are the contents of official or legally binding documents. What might be plain language today may be a local puzzle tomorrow, or mystify a specific group of “second-language” reader/users who eventually come on-stream.

So there is probably no definition of PL that satisfies everyone working in the communication sector. For many English speakers, “plain” typically means avoiding Latinate lexical items — e.g. don’t say contradict — say disagree, or argue against.

Yet PL goes much further than controlling the scale of lexical difficulty or user vocabulary size. It is a form of language that encourages ease of understanding whatever level of literacy a reader or listener might have, and aids that individual to grasp significant content with universal implications — especially official, legal, and religious content.

Cultural translation

Take the Christian Bible. At least in English, there have been various easier-to-read/understand (hence PL?) versions of the Bible (e.g. EasyEnglish), given that this text has been translated into English several times over the past 1,500 years in an effort either to reflect the “true” meaning where other translations apparently fail, or to bring the language up to date and make it more accessible for a new generation of readers.

These include versions for blind (Braille) and otherwise disabled users (e.g. signers), for “young people”, and for some socio-medical categories. These all use appropriate vocabulary (restricted range) and “simple” syntax, as in PL. A Braille version of any text will follow the lexis and syntax of the language in question, as only the formal structure of letters is at stake. So there could easily be a PL version of a Braille Bible or any other document.

However, a signed version of the same (i.e. visual signing for deaf users via some medium) is rather different. It is not clear that PL has any influence on signing, given that signers use a specific system for encoding meanings that is not based on lexically varied alphabetic text.

You might, though, like an ear-plug PL version of, say, a Shakespeare play, so that you quickly grasp what is being said in the more obscure language uttered on stage or screen…

PL and generative textuality

As suggested, this new PL standard happens to arrive just as generative AI is all the textual rage. Some types of written language output can now be produced automatically in bulk by deploying large language model (LLM)-driven technology. Basically these devices will deliver a written/spoken response to any input question or suggestions — aka prompts — framed in words (but presumably visual signing could also be built into the technology).

Note that these language models do not have automatic access to ground truths or human reasoning, but simply echo their monumental memories of human norms of speaking/writing content (derived from vast data resources) in a few dozen languages.

Given certain caveats, an LLM therefore outputs normative (not original or creative) content at great speed in a readable fashion, making it a useful but obviously imperfect tool for anyone needing non-creative summaries, proposals, lists of facts about large text collections, clarity about a piece of content, and so on.

As an example of its possible linkage to PL, open your favorite generative AI app and try using “Put this into plain language” as a prompt together with a reasonable-sized piece of official text of some kind. It will probably output a list of summarizing sentences in bland English.
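If you want to repeat that experiment more systematically, the prompt can be assembled in code before being pasted into (or sent to) whichever generative AI app you use. The sketch below only builds the prompt text. The function name, the `audience` parameter, and the extra instruction lines are illustrative assumptions, and no particular vendor's API is implied.

```python
def build_plain_language_prompt(text: str,
                                audience: str = "the general public") -> str:
    """Wrap an official text in a plain-language rewriting prompt.

    Everything beyond the first instruction is an assumption about what
    might help; adjust or delete lines to suit your own documents.
    """
    instructions = [
        "Put this into plain language.",
        f"Write for {audience}.",
        "Keep every fact, number, and deadline from the original.",
        "Use short sentences and everyday words.",
    ]
    return "\n".join(instructions) + "\n\nText:\n" + text.strip()


official_text = """
Water service restoration plans are being evaluated by our senior
hydro-mechanical engineers.
"""

prompt = build_plain_language_prompt(official_text, audience="local residents")
print(prompt)
```

Spelling out the audience and asking the model to keep all facts is one way to push the output beyond a bland summary, though the result still needs a human check against the original.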

Yet if ever there was a language operation that merited automation, it is surely PL production: high-volume, rule-bound, repetitive, low-creativity-quotient writing is precisely what humans need but can’t produce or read/digest easily in long documents. PL versions or summaries of such documents in a number of languages could soon be just a few seconds away from your prompt…

Customized coding

Another critical area for PL expansion could be the democratization of software coding. Generative AI will almost certainly be applied to helping individuals write the code for their own customized solutions to machine tasks in text, visuals, accounting, gaming, inventing, planning… you name it. The most obvious use case is writing code that enables you to produce a better, personalized yet automated solution for your work or entertainment needs.

For example, Agentflow is a user-friendly tool for creating and executing complex workflows using LLMs. Using the interface, you can write down your workflow in “plain English” using JSON files, and Agentflow helps develop custom functions and generate autonomous outputs. This opens up what we still call “software development” to a far broader population of users.
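To give a feel for what “plain English in a JSON file” might look like, here is a small sketch. The layout (a `name` plus a list of `steps`) is a hypothetical schema invented for illustration. It is not Agentflow's actual file format, which you should take from that project's own documentation.

```python
import json

# Hypothetical workflow layout for illustration only (not Agentflow's schema).
WORKFLOW = {
    "name": "plain_language_summary",
    "steps": [
        "Read the attached privacy policy.",
        "List the five points a user most needs to know.",
        "Rewrite each point in plain English, one sentence each.",
    ],
}


def to_json(workflow: dict) -> str:
    """Serialize the workflow as it might be stored in a .json file."""
    return json.dumps(workflow, indent=2)


def numbered_steps(workflow_json: str) -> list[str]:
    """Parse the JSON text and return its steps as a numbered list."""
    workflow = json.loads(workflow_json)
    return [f"{i}. {step}" for i, step in enumerate(workflow["steps"], start=1)]


for line in numbered_steps(to_json(WORKFLOW)):
    print(line)
```

The point of such a format is that the instructions themselves stay readable by a non-programmer, while the surrounding structure gives the tool something predictable to execute.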

We can therefore imagine a future in which familiar language can be used as a suitable prompt for designing personal software applications for knowledge management or other text applications, just as the democratization of writing/reading (“any child can learn the alphabet”) led to the massive structuring, sharing, comparing, and expanding of knowledge and literary experiments in earlier ages.

Beyond plain towards “my” language

In a generative-AI text ecology, therefore, we can imagine many more variants of the basic PL model of transforming complex text into a standardized easier-to-read/understand format, both linguistically and possibly visually.

However, if an application can deliver a PL version, it can also do many variants of PL, tailored in each case to a number of factors specified by the user — or rather pre-specified to adapt all kinds of language understanding to a specific end-user. The key virtue of this “co-pilot” (the machine helps you) version of knowledge production and consumption — aka writing and reading — is surely that anything you access can ultimately be customized to your end-user needs, whether linguistic, intellectual, or social.

If so, then some types of reading could change their nature: instead of being a typical human problem of “grappling with a text”, machine-assisted reading can focus on expressing the realistic outcomes of gathering and engaging with information in “my” language, not simply that of the original text. In other words, we can expect more and more textual content to be “translated” into my natural idiom over time, making it easier for me to rapidly understand, personalize, and evaluate larger quantities of information.

This is the exact reverse of humanity’s basic multi-millennial education project: rather than exploring the world by opening outwards, we shall bring all that world out there back home. Instead of assuming different personas as modes of existence in the world, we could reduce otherness to a feature that we can rewrite in our own terms!

Logically, if this is possible, what I write as me will in turn be slightly remodeled by you into your idiom when you read stuff sent from me to you. Instead of reading just your words, I shall be reading my information filtered from your language.

In a sense, this process of “personalizing” content began in Europe in the late 19th century with the ability for most people to look up words they don’t understand in a dictionary — the very first stage in appropriating meanings by asking a remote source to adapt to your needs. Soon we might be able to use technology to ask for an automatic conversion of any information into our own familiar idiom whenever possible.

If so, what would this mean for e-duc-ation (i.e. being led out of our personal ignorance into the light of collective knowledge), for “learning to read and write” in a given society?

As generative tech will inevitably filter down to operate inside children’s devices, this process of having a machine learn what you prefer and understand best will begin to influence language acquisition and management at an early age. Hey, Johnny, is that too hard to read/understand? Just ask the app for your very own plain précis. But if you prefer the taste and texture of the real text, start reading the original from scratch again…

And on the flip side, you can rapidly enter your amateur, plain “my language” version of your content into the machine and have it output in a few seconds a more polished, clearer text, better geared to different shades of public consumption. But give it a careful reread first, just in case.

At the end of the day, you will be able to rapidly read in your plain language the results of a complex argument in a technical paper, write a rebuttal in your own idiom, and then have the machine clean it up instantly for publication on your blog, complete with quotes from original documents.

Mmmm. This all sounds a bit like the current trend of rewriting children’s books in a more woke language, or changing the way female characters have been portrayed in older works of fiction. Editing is all. Maybe our Maker didn’t actually create the world — she simply rewrote it in a plainer version.
