Building a multilingual digital twin Part 2
As noted in Part 1 of this post, the implicit timeline for big and small language projects, together with the global endangered language program, adds up to a multilinguality workload likely to last for decades before we achieve satisfying progress. In an exponentially growing world, we owe it to everyone to move faster.
I suggest here that we should abandon thinking solely in terms of current real-world exploration to map and engage with the world’s languages. In keeping with the exponential age, we should embrace digital transformation as the appropriate agenda and start by asking the question: Can we create a virtual smart space for resourcing, examining, and innovating around multilinguality in action by collectively constructing a digital language twin?
Our goal would be to redesign the language barrier as a virtual “open gate.” The internet as we find it is not sufficient for our purposes. As an instrument of knowledge exploration it remains too text-based, too focused on search-and-find, on lists and linear practices from the print age, too piecemeal, and not “intelligent” enough.
Of course, all that information is extremely useful, just as books, pens and radios still are. But humans experience language as performative as well as informative: we use languages to achieve personal or community-wide outcomes, to change our worlds…and the world. This means operating beyond the content of interpretable texts and pivoting towards more direct verbal interaction in a multimedia environment. The purpose of this post, therefore, is to promote 3D multilinguality as a living virtual experience for everyone. It should be transformative, not just an academic experiment or metaverse game: a model for investigating a new dimension of how we will communicate tomorrow.
A word about the concept of digital twins. Today, these are virtual devices (using compute, screens, visuals, and sometimes special glasses to model the real world) that devote vast quantities of data to re-create a real-world process or complex, such as a factory or a huge machine. By virtualizing these large constructs into a database and then representing them as manipulable 3D images, you can inspect or test an industrial part in a plant or optimize a machine process to reduce risk, improve return on investment, or accelerate a supply chain. All without having to move things around at great expense in the real world until you are sure of what you can achieve.
I suggest that we could borrow this general concept of digital twins and apply it semi-metaphorically to the real geo-sociology of language behavior and its real-world outcomes.
Multilinguality in this approach can be modeled in a 3D world as a global mosaic of language smart spaces (including all their internal narrative and structural properties, from speaker numbers to inter-relationships with other languages, history, and more) involving a huge cloud of random intersecting behavioral properties. By collecting as much of these data as possible in the right multimedia format, we could create a new way of modeling, understanding, sharing information about, and engaging visually and audibly with human language in an AI/tech age.
Certainly we would have to link up virtually with the communities speaking many of these languages to enable them to celebrate their tongues. Key to such a twinship approach is the fact that it brings human beings together in a new way around an exploration of language and human behavior. This is not simply a question of enhancing language technology, but an attempt to use technology to advance an equitable investment in language as a domain of interest. So play, fantasy, and exploration would be primary activities enabled by such a digital twin, and humans might use avatars to circulate and perform inside this space.
Obviously this requires sufficient infrastructure around the globe, especially communications bandwidth in all geographies, which currently poses a challenge in many countries. Today’s gigabit access in the richer world is now being enabled by 5G, F5G, and Wi-Fi 6 for homes, individuals, and organizations. This is set to evolve toward 10-gigabit capacity enabled by 6G, F6G, and Wi-Fi 8. Hopefully this can spread to the rest of the planet as soon as possible…
The digital language twin, then, looks like the most appropriate vehicle for global language immersion via a virtual world. Why? Because global multilinguality on Earth involves up to seven thousand tongues, which we have so far perceived as colors on maps or lists on pages. No one is going to travel the planet in search of them all. Sure, we can read about all these languages in Wikipedia articles, but we can’t easily experience in a direct way the acoustic buzz or semantic intimacy or collective use of many hundreds of tongues.
A virtual playground — no doubt an instance of what will become the emerging metaverse — therefore looks like the only vehicle on the horizon for transforming our experience of multilinguality from slow, fact-sharing scholarly inquiry to rapid and more immersive human encounters. Provided of course that this solution can overcome all the ethical issues around bias, privacy, data protection, and the human right to self-determination.
Such a digital twin could also attempt to model visually, acoustically, and even “semantically” the parallels and differences between languages from different geographies. It could engender multidimensional conversation between (at least some of) the world’s language speakers or their avatars, raising the stakes from factual knowledge to experiential insight and then to the overall improvement of tech solutions for different kinds of human interactions. And it could raise, discuss and advise on legal issues around using specific languages as a human right.
This could be achieved via virtual journeys, workshops, games, performances, classes, one-on-ones, and more. Ideally, language’s digital twin would attract artists, designers, and imagination-driven engineers to create mini-worlds within the twin that could model, experiment with, and play with the different types of relationships that people, especially children, might have with languages. With no geographical limits. And aided by access to constant real-time translation where necessary.
This platform should always be geared to respectfully demystifying the “other”, creating virtual communities around language issues that any speaker might encounter. As a twin world of human language, it would constantly gather and circulate data in the background, and apply machine learning to them, so as to deliver parallels, translations, and connections that can lead to better learning and understanding at the human interface. These same data would also be relevant for most other language activities out in the “real” world.
Here is a checklist of the possible benefits of this sort of digital-only environment. Tomorrow’s metaverse tech is likely to advance and change very quickly, so there will be plenty of room for improvements at the interface that cannot yet be anticipated.
- The digital twin is virtual and immersive: this means anyone can join in without overcrowding the room. This is vital for a multilingual population ultimately covering thousands of the world’s languages in parallel. Nor should this kind of construct be excessively tech-focused. The ideal would be for all the complex compute, AI and other tech driving the twin to disappear completely behind a foreground experience of contact-making, sharing, learning, entertaining, and playing.
- It is multimedia: this means we do not have to rely on writing as a priority communicative medium. Language is spoken and sung, requiring a sound and music dimension via an acoustic channel. And our mouths are physically close to our eyes: so language also evokes or is stimulated by (moving) images of all kinds. We could invent a 3D version of the dictionary…
- It is naturally suited to human encounters through games and playing, one of the obvious attractions of the upcoming metaverse as a whole. Language exchange or learning through play opens up vast horizons for smart designers to attract, amaze and educate new types of participant. It could also include access to a simpler augmented reality alternative that enables users wearing special glasses to explore multilinguality through playful apps designed to overlay the real world rather than replace it.
- It gives access to a unique research lab by easily creating spontaneous special interest groups that can use the digital twin to explore vital multilingual issues in areas ranging from policy, rights, sociolinguistic diversity, learning/teaching, and translation to signing (there are 200 different signed languages and up to 70 million deaf people across the planet), tech (TinyML, etc.), history, gender, endangerment, child-speak, aging, brain science, and more. All these communities would also have access to the foundational data, modeling, compute, and audio/video resources (open, rather than licensed at prohibitive prices) to network in real time on any issue.
- In addition, as a digital space, it can grow in many different directions without encountering physical constraints. It can therefore help speed up the research-to-product process to improve speech and text translation services, and offer a test-pad for similar local technologies right across the language waveband. It could enable R&D and other communities to undertake very large-scale tests or collect data (under acceptable “open” conditions) that might make massively multilingual translation far more feasible far more quickly than present linear developments can.
- It would be able to provide a more global forum for specifically human communication problems such as language handicaps, language-related impairments, and childhood and third-age language usage issues, where the large-scale sharing and discussing of problems might lead to better solutions for certain conditions.
- It would naturally enable endangered languages (ELs) to enjoy a full seat at the table. Some ELs may not want to play in this sort of space, due to its excessive technology pivot and association with big-power corporate ideologies, as well as the fact that any language content would be surrounded by potential polluters and influencers. But in fact, a carefully managed presence in the twin would ideally remove that pejorative “endangered” label from a given tongue: an EL that is speaking and interacting with others at least has a sporting chance of survival in a changed world.
- On a planet facing a climate crisis, raw cultural and political divisions, poverty, and potential migration tragedies, the possibility of building scenarios and of exploring and modeling alternative communication exercises for communities within the twin might help inspire real-world solutions to crisis-driven language questions more effectively, rapidly, and globally than existing methods.
- The twin could crucially help dissect ethical issues about tech and language under a single lens: subjects could also include the benefits, limits, and dangers of creating data twins of conversations, stories, and personal dialogs across multiple languages. The aim would be to confront views and ideas about the problems and potential of AI-inspired data collection as an aid in building a world of more positive language engagement.
- Finally, and perhaps more ominously, a global language space might also encourage us to raise difficult questions about language. Are some languages more effective than others at certain intellectual or communicative or delicate tasks (not just noting that language X has a special word for abc)? Can we ensure that all languages are equally expressive for all genders? Should we force change onto a language (e.g. using tech interference in the human brain) to improve its capacity to express a cultural or human ideal? How far are mood and sentiment encoded similarly across languages? Does a larger speaker population of a language contribute to a wider range of expressivity, or is it the opposite? And so on.
So how do we ensure that people will come and populate this sort of virtual space, share their languages, ideas, and emotions? Does the concept really meet an emerging need, or is it just wishful thinking? What about the danger of being invaded by crazies and wreckers if you make the twin into an open-access space? And what if the Internet of Things — a web of tiny computers inserted into everyday objects — began to listen to all our conversations in the twin and manipulate them in some illegal way?
This post sketches a small blueprinting effort to get the digital twin idea rolling. Obviously, the lack of global digital equity means that the world’s language populations won’t all share the same baseline technology access to a metaverse facility such as a digital language twin for some time. This and many other financial, technical, social, legal, and ethical issues cannot be treated here, but will naturally influence any debate.
The key move is to dare to build a community that can stay focused on desirable futures for the next exponential stage in the evolution of a multilingual planet. It could at least work out the most appropriate models. And then find the resources to explore possible futures. Any ideas on the general viability of this proposal are welcome!