There has been a considerable amount of buzz surrounding ChatGPT and GPT-4, with most discussions revolving around either 1) using prompts to generate text or 2) implementing GPT-4 via the OpenAI API (find the documentation here) to integrate it into applications. However, these points often miss the more significant underlying trend - natural language is becoming the new user interface (UI).
In this article, I have referred to this type of interface as NLUI (Natural Language User Interfaces). However, Lorcan Dempsey proposed a much better name: ChUI (Chat User Interfaces), which is pronounced "chewy." ChUI also better captures the fact that a two-way, iterative interaction is involved. I highly recommend checking out Lorcan's blog.
A Brief History of UI in Software
Over the decades, we've seen various stages in the evolution of UI design. We started with the command-line interface (CLI), which required users to input text-based commands to interact with a computer. This eventually gave way to the graphical user interface (GUI), where users interacted with on-screen elements like buttons and icons.
The development of web-based user interfaces and touch-based interfaces, as seen on smartphones and tablets, also represents an important step in UI design. Additionally, virtual reality (VR) and augmented reality (AR) interfaces are considered by many as an opportunity to further contribute to the ongoing evolution of UI design, offering immersive and interactive experiences.
It's essential to recognize that each of these UI paradigms has its own constraints. Building apps to work for users today means working within the limitations of each UI paradigm and finding ways to meet users' needs within these boundaries.
However, with the public release of ChatGPT, we are now witnessing the dawn of a new era in UI design: the Large Language Model powered natural language user interface (NLUI). NLUIs empower users to interact with applications using natural, conversational language, effectively eliminating the dependence on traditional UIs.
It's important to note that NLUIs have been in existence for some time; however, the recent advancements in AI, as demonstrated by ChatGPT, have cast this approach in a new light. The remarkable progress in natural language understanding and generation now allows for more sophisticated and intuitive interactions, making NLUIs a more viable and attractive option for a wide range of applications.
The introduction of Large Language Model NLUIs has the potential to revolutionize the way users interact with systems. By building an AI-enabled text or audio interface, such as a chat interface, it becomes possible to map user requests to specific operations. Instead of navigating around a GUI, users can simply input the desired operations 'directly' into the system.
This doesn't just change the way we interact with apps (e.g., buttons vs. text); it fundamentally alters how we work with them. Exposing the operations to the chat prompt makes the entire workflow more malleable, allowing users to fluidly request various operations, using the language they are comfortable with, without being constrained by the logic of the GUI.
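To make the idea of mapping requests to operations concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a language model has already turned the user's sentence into a small JSON object naming an operation and its arguments. The operation names (`invite_reviewer`, `set_status`), the JSON shape, and the `dispatch` helper are all hypothetical and do not come from any real publishing system or API.

```python
import json
from typing import Callable, Dict

# Hypothetical registry of operations the application exposes to the chat prompt.
OPERATIONS: Dict[str, Callable[..., str]] = {}


def operation(name: str):
    """Register a function under an operation name."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        OPERATIONS[name] = fn
        return fn
    return register


@operation("invite_reviewer")
def invite_reviewer(manuscript_id: int, reviewer: str) -> str:
    # A real system would update a database and notify the reviewer.
    return f"Invited {reviewer} to review manuscript {manuscript_id}"


@operation("set_status")
def set_status(manuscript_id: int, status: str) -> str:
    return f"Manuscript {manuscript_id} marked as {status}"


def dispatch(model_output: str) -> str:
    """Route a structured request (assumed to come from the model) to an operation."""
    request = json.loads(model_output)
    name = request.get("operation")
    handler = OPERATIONS.get(name)
    if handler is None:
        return f"Unknown operation: {name}"
    return handler(**request.get("arguments", {}))


# Assumed model output for a request such as "Mark manuscript 42 as accepted".
model_output = (
    '{"operation": "set_status", '
    '"arguments": {"manuscript_id": 42, "status": "accepted"}}'
)
print(dispatch(model_output))
```

The point of the sketch is that the set of operations, rather than the arrangement of screens and buttons, becomes the surface the user works against. Adding a new capability means registering another operation, not designing another screen.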
The Impact of NLUI on Workflow: A Case Study in Publishing Systems
One of the most compelling use cases for NLUI can be found in the world of publishing systems. The challenges of building apps to manage workflows for journals are well-documented, particularly when it comes to catering to a diverse range of publishers. As I have discussed in my other articles and demonstrated at Coko, it is possible to design workflow systems that meet these divergent needs within the one system, but there are inherent limitations.
From a bird's-eye perspective, all journal workflows may appear similar, encompassing submission, review, improvement, and sharing stages. However, when examined more closely, it becomes apparent that numerous small differences in workflows are informed by the culture of production within each organization. As publishers progress through these four stages, their processes increasingly diverge, even down to the specific terms used for various operations.
Moreover, there are always ad-hoc operations that take the workflow "off" the happy path and towards bespoke, nuanced eddies. Currently, these challenges are managed with GUIs, meaning there are specific things users can do from a graphical user interface to manage these issues. Some applications (the vast number of journal platforms out there) manage these problems badly, but there are some that manage a portion of these issues well. The challenge for users of any of these solutions is to learn and work within the application's workflow paradigm, and to go to the right interfaces to do the right things.
Often, to get the application to conform better to the workflow, users resort to various 'hacks': some are within the system (using the system in ways other than how it was designed to work), and some are external to the system (augmenting the main tool with various other GUI tools like wikis, spreadsheets, etc.).
Integrating NLUI into Publishing Workflows
Incorporating NLUI into publishing workflows has the potential to greatly enhance the user experience and streamline processes. By allowing users to request specific actions using natural language, an NLUI can make it easier for users to execute tasks that align with their desired workflow, without having to rely on complex GUI navigation or "hacks" to achieve their goals.
For instance, a user working with an NLUI-enabled publishing system could simply type or say, "Assign reviewer John Smith to manuscript ID 1234," and the system would process the request accordingly. Similarly, a user could request the system to generate a report on the progress of all manuscripts under review, by stating, "Show me the review status of all current manuscripts."
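As a rough sketch of what "processing the request" might involve, the Python below turns the two example sentences into structured intents. The regular expressions are a deliberately simple stand-in for the language model, which would handle far more varied phrasing. The `Intent` class, the `parse_request` function, and the operation names `assign_reviewer` and `report_review_status` are hypothetical names chosen for this illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Intent:
    """A structured request: which operation to run, and with what arguments."""
    operation: str
    arguments: Dict[str, str] = field(default_factory=dict)


# Stand-in for the language model: one pattern per example sentence.
PATTERNS = [
    (
        re.compile(
            r"assign reviewer (?P<reviewer>.+?) to manuscript(?: id)? "
            r"(?P<manuscript_id>\d+)",
            re.IGNORECASE,
        ),
        "assign_reviewer",
    ),
    (
        re.compile(
            r"show me the review status of all current manuscripts",
            re.IGNORECASE,
        ),
        "report_review_status",
    ),
]


def parse_request(text: str) -> Optional[Intent]:
    """Return the first matching intent, or None if the request is not understood."""
    cleaned = text.strip().rstrip(".")
    for pattern, operation in PATTERNS:
        match = pattern.search(cleaned)
        if match:
            # Named groups become the operation's arguments (as strings).
            return Intent(operation, match.groupdict())
    return None


print(parse_request("Assign reviewer John Smith to manuscript ID 1234"))
print(parse_request("Show me the review status of all current manuscripts."))
print(parse_request("Make me a sandwich"))
```

A request that matches nothing returns `None`, which is where a real system would ask the user a clarifying question rather than guess.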
These natural language interactions enable users to work more intuitively with the system, as the NLUI adapts to the user's language and specific requests, rather than forcing the user to conform to the limitations of a GUI. Additionally, with an NLUI, users can more effortlessly integrate ad-hoc operations into their workflows, since the system can be engineered to comprehend and execute a wide array of natural language requests that would typically fall outside the predetermined order imposed by the GUI.
By introducing an NLUI into publishing workflows, users can benefit from a more seamless, efficient, and customizable experience, allowing them to focus on their primary tasks without being hindered by the constraints of traditional GUIs or resorting to external tools and workarounds.
The rapid advancement in AI technologies, such as ChatGPT, offers a tremendous opportunity for publishing tool builders to rethink and redesign the way users interact with their systems. By embracing these technologies, publishers can not only improve the user experience but also unlock new possibilities in terms of adaptability, efficiency, and customization in their workflows. It is essential for tool builders to recognize the potential of ChatGPT and similar tools not as a paradigm that adds functionality to the GUI, but as an interface that can replace the GUI.
As we look to the future of publishing systems (and leaving aside further AI-UI innovations for the minute), it is clear that the integration of NLUIs will play a pivotal role in shaping the industry. By enabling users to work more intuitively, engage in complex and ad-hoc operations with ease, and interact with systems in a more human-like manner, we can create a new generation of publishing tools that cater to a wide range of needs and preferences. By considering natural language as the new UI and embracing the transformative potential of AI technologies like ChatGPT, we can revolutionize the way we think about, develop, and utilize publishing tools in the years to come.
© Robots Cooking, 2023, CC-BY-SA
Image ('NLUI') public domain, created by MidJourney from prompts by RC.