DESOSA 2021

The Building Blocks of NVDA

In our previous blog post, NVDA’s vision, we gave an overview of the screen reading application NVDA, which aims to help visually impaired people access their computers. In this post, we dive into the source code of NVDA and analyze the project from a developer’s point of view.

Architectural View

In his book Software Architecture: Visual Lecture Notes [1], Cesare Pautasso describes two viewpoints for analyzing software architecture: the descriptive architectural view and the prescriptive architectural view. The descriptive view is based on the actual code, while the prescriptive view depicts a blueprint for a project. Since we analyze the existing source code of NVDA, this section presents a descriptive architectural view.

Improving the accessibility of computers for blind and visually impaired individuals is the ultimate goal of NVDA. The interaction between each individual user and the software is therefore one of the most important parts of NVDA. For this reason, NVDA uses the MVC-U (Model-View-Controller-User) architecture in its software development. MVC-U separates information representation from user interaction. In general, the model manages the data, logic and rules of the application, the view renders the model in a particular format, and the controller accepts input and converts it into commands for the model or view [2]. However, the view in NVDA differs slightly from the usual scenario: since NVDA is built for blind and visually impaired users, the “view” takes the form of speech or braille rather than a GUI.
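
To make the idea of a non-visual “view” concrete, here is a minimal sketch of the pattern in Python. It is not NVDA’s code: the names FocusModel, SpeechView, BrailleView and Controller are hypothetical and only illustrate how the same model can be rendered as speech or braille instead of pixels.

```python
from dataclasses import dataclass, field


@dataclass
class FocusModel:
    """Model: holds the state the user cares about (here, the focused item)."""
    focused_label: str = ""


@dataclass
class SpeechView:
    """View: renders the model as text to be spoken (hypothetical stand-in)."""
    spoken: list = field(default_factory=list)

    def render(self, model: FocusModel) -> None:
        self.spoken.append(f"speak: {model.focused_label}")


@dataclass
class BrailleView:
    """View: renders the same model for a braille display (hypothetical)."""
    cells: list = field(default_factory=list)

    def render(self, model: FocusModel) -> None:
        self.cells.append(f"braille: {model.focused_label}")


class Controller:
    """Controller: turns user input into model updates, then refreshes views."""

    def __init__(self, model, views):
        self.model = model
        self.views = views

    def on_focus_changed(self, label: str) -> None:
        self.model.focused_label = label
        for view in self.views:
            view.render(self.model)


model = FocusModel()
speech, braille = SpeechView(), BrailleView()
controller = Controller(model, [speech, braille])
controller.on_focus_changed("Firefox icon")
```

Because the views only depend on the model, adding another output method in this sketch means adding another view class, without touching the controller.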

Figure: Architectural View of NVDA

The picture above presents the MVC-U view of NVDA. The following sections describe the organization of NVDA from five different viewpoints: the containers view, components view, connectors view, development view and runtime view.

Containers View

To get a clear picture of the main logical execution environments in which NVDA runs, this section presents a containers view of NVDA. In the C4 model [3], a container represents an application or a data store: something that needs to be running in order for the overall system to work.

Figure: Containers View of NVDA

The picture above shows the containers view of NVDA. The main container in which NVDA runs is the Windows application. This container communicates with third-party applications through accessibility APIs and native APIs. Meanwhile, it makes use of the NVDA controller client API to output speech or braille through speech and braille systems. In the next section, we illustrate the different components of these containers.

Components view

NVDA consists of many different components that combine into a working system. Not every user needs every component, as this depends on their hardware and software setup.

Figure: The components and internal connections in NVDA

The main loop of NVDA uses the following components [4]:

  • The Core handles startup and shutdown of all other components.
  • The Input manager covers multiple input types: keyboard, mouse, touch.
  • There are multiple Output methods, each with its own manager.
  • Localisation manages the language of all output.
  • The Event manager keeps track of all the events that other components request. The event queue is emptied by the core loop.

In order to work with external applications, there are several components supporting different connections:

  • Information from other applications will in general go through the Accessibility APIs, which then give information to the different handlers.
  • App modules cover all application-specific code, to make NVDA work with different Windows applications.
  • Add-ons allow developers to add specific functionality, such as support for a program or braille display.
  • Plug-ins provide global functionalities, such as extra commands.

Furthermore, Text navigation and access is a utility component used by many other classes, most notably for formatting the output.

The Config module keeps track of the different configuration profiles and tells the rest of the components what configuration is currently active.
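
As an illustration of how a component might ask “what configuration is currently active?”, here is a small sketch. It assumes that a profile only overrides some settings and falls back to a base configuration for the rest; that layering, the ConfigManager class and all setting names are assumptions for this example, not NVDA’s actual Config module.

```python
from collections import ChainMap

# Hypothetical settings; real NVDA keys and values may differ.
BASE = {"synth": "espeak", "rate": 50, "braille": False}
PROFILES = {
    "reading": {"rate": 35},
    "coding": {"rate": 70, "braille": True},
}


class ConfigManager:
    """Tracks the profiles and exposes the currently active configuration."""

    def __init__(self, base, profiles):
        self._base = base
        self._profiles = profiles
        self._active = None  # None means: base configuration only

    def activate(self, name):
        if name is not None and name not in self._profiles:
            raise KeyError(name)
        self._active = name

    @property
    def current(self):
        # The active profile is looked up first, then the base configuration.
        layers = [self._base]
        if self._active is not None:
            layers.insert(0, self._profiles[self._active])
        return ChainMap(*layers)


config = ConfigManager(BASE, PROFILES)
config.activate("coding")
coding_rate = config.current["rate"]    # overridden by the profile
coding_synth = config.current["synth"]  # inherited from the base layer
```

Other components in this sketch never need to know which profile is active; they simply read from config.current.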

Finally, NVDA helper connects NVDA to the outside world: it contains the C++ part of NVDA and handles communication with external programs. The C++ part of NVDA helper creates a virtual buffer, a flattened representation of an application’s visual interface. For a website, for example, this means creating a list representation of the page instead of the usual 2D layout. Let’s take a closer look at the connections between these components.
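
The flattening idea can be sketched in a few lines of Python. This is only a toy model: the real virtual buffers are built in C++ inside the target process, and the Node class and flatten function below are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """A node in a (hypothetical) tree describing a page's visual structure."""
    role: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def flatten(node: Node) -> Iterator[str]:
    """Yield one line per node that has text, in depth-first document order."""
    if node.text:
        yield f"{node.role}: {node.text}"
    for child in node.children:
        yield from flatten(child)


# A tiny page: a heading, two links laid out side by side, and a paragraph.
page = Node("document", children=[
    Node("heading", "News"),
    Node("row", children=[
        Node("link", "Sports"),
        Node("link", "Weather"),
    ]),
    Node("paragraph", "Welcome to the site."),
])

buffer_lines = list(flatten(page))
```

The two links that sit next to each other on screen end up as consecutive entries in buffer_lines, which is what makes linear navigation with a keyboard possible.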

Connections

Most internal connections are function calls, which are highly interconnected in the code. Because of this, it is not straightforward to separate the different components in the codebase. Let’s have a look at the most interesting connections between components.

Going back to the NVDA helper: its C++ part injects code into other running programs in order to build a local buffer there, which the Python part of NVDA helper can then access.

The Synthesizer component makes calls to the different linked synthesizers.

Add-ons are installed through the add-on handler, after which the related Plug-ins can overwrite and extend default NVDA behaviour.

Finally, the Accessibility API support components encapsulate their respective APIs. We will be taking a closer look at this in the API design principles section.

Development view

So, you have just read about the different components and connections in NVDA, but how did this structure come about? This section will briefly describe everything from the development view and explain some decisions that have been made when building NVDA, starting with the programming language.

NVDA consists of roughly 85% Python code and 15% C++ [5]. Python was picked because, among other benefits, it allows for rapid development. C++ is used for code that needs to be injected into other processes, and was picked for its high performance.

Finally, it is worth mentioning that NVDA relies heavily on external accessibility APIs to gather information. This choice was made because it is not feasible for the NVDA developers to build such APIs themselves, and because it lets NVDA reuse as many existing systems as possible. It also allows for easy portability between Windows versions. The main accessibility APIs used are Microsoft Active Accessibility (also known as IAccessible), IAccessible2, Java Access Bridge and Microsoft UI Automation.

Runtime view

To illustrate how the different components interact at runtime, we describe the common scenario of a user opening a browser to surf the web with the help of NVDA. In this section, the various components are denoted as component [6].

The user starts by executing the Launcher. This performs some basic initialisation and starts the Core. First of all, Core loads all configurations such as language, synth and speech settings. After that, it initialises all other components, such as IAccessible, GUI and Speech. Basically, everything NVDA needs to run properly is loaded. Finally, it enters NVDA’s main loop, which remains active as long as NVDA is running. The user can now use NVDA to open their browser.

In each iteration of the main loop, Core pumps the API handlers, the input handlers and the main queue. When the user hovers over a desktop icon, the browser’s icon in this case, mouseHandler (or the handler for whichever input method is used) and queueHandler insert an event into the main queue. Core pumps this event in the next iteration, at which point NVDA announces the name of the icon. This way, the user knows whether to click it or to continue looking for the one they want. When the desktop is clicked, all available icons will be listed.
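
The pump-and-drain cycle described above can be sketched as follows. This is a simplified model under our own assumptions: the Core and MouseHandler classes here are hypothetical stand-ins, and the method names pump, queue_function and pump_once are invented for this example rather than taken from NVDA.

```python
import queue
from collections import deque


class MouseHandler:
    """Hypothetical input handler that collects raw hover events."""

    def __init__(self):
        self.pending = deque()

    def hover(self, icon_name):
        # Called when the pointer moves over an icon.
        self.pending.append(icon_name)

    def pump(self, core):
        # Turn each raw input event into a queued call on the main queue.
        while self.pending:
            core.queue_function(core.announce, self.pending.popleft())


class Core:
    """Hypothetical core that owns the main queue and the main loop."""

    def __init__(self, handlers):
        self.handlers = handlers
        self.main_queue = queue.SimpleQueue()
        self.spoken = []

    def queue_function(self, func, *args):
        self.main_queue.put((func, args))

    def announce(self, text):
        # Stand-in for sending text to the speech or braille output.
        self.spoken.append(text)

    def pump_once(self):
        """One main-loop iteration: pump the handlers, then drain the queue."""
        for handler in self.handlers:
            handler.pump(self)
        while not self.main_queue.empty():
            func, args = self.main_queue.get()
            func(*args)


mouse = MouseHandler()
core = Core([mouse])
mouse.hover("Firefox")
spoken_before_pump = list(core.spoken)  # nothing is announced yet
core.pump_once()                        # the next iteration announces it
```

Note that the hover itself announces nothing; the announcement only happens once the core reaches the event in its next iteration, which mirrors the description above.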

Different API handlers listen for events from specific accessibility and native APIs, for example an event for highlighting the address bar. After an event is received for an application, in this case the browser, an NVDA object is fetched or constructed and the event is added to the main queue. API handler modules include IAccessibleHandler, JABHandler and UIAHandler. Each event that NVDA announces helps visually impaired people understand what is happening on their screen. They can now browse the web!
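
The “fetched or constructed” step can be pictured with a small sketch. We assume here that a handler keeps a cache of wrapped elements keyed by an element id; the cache, the ApiHandler and UIObject classes and the event names are all hypothetical and only illustrate the flow, not NVDA’s actual handler modules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UIObject:
    """Hypothetical wrapper around one UI element of an application."""
    api: str
    element_id: int
    name: str


class ApiHandler:
    """Hypothetical handler for one accessibility API."""

    def __init__(self, api_name, main_queue):
        self.api_name = api_name
        self.main_queue = main_queue
        self._cache = {}  # element id -> UIObject (an assumption of this sketch)

    def on_native_event(self, event_name, element_id, name):
        # Fetch the wrapper if we have seen this element, else construct it.
        obj = self._cache.get(element_id)
        if obj is None:
            obj = UIObject(self.api_name, element_id, name)
            self._cache[element_id] = obj
        # Hand the event over to the main queue for the core to process.
        self.main_queue.append((event_name, obj))
        return obj


main_queue = []
uia = ApiHandler("UIA", main_queue)
first = uia.on_native_event("gainFocus", 7, "Address bar")
second = uia.on_native_event("valueChange", 7, "Address bar")
```

Both events refer to the same wrapper object, so later stages can treat them as two things happening to one element.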

The runtime scenario has several dependencies, a list of which can be found here. Most notably, at runtime NVDA depends on eSpeak NG (with Sonic) for speech synthesis, IAccessible2 for some accessibility features, and liblouis for braille integration.

Key quality attributes

In our previous essay, we identified accessibility, internationalization, configurability and extensibility as key quality attributes. We will now discuss how NVDA tackles these attributes and potential trade-offs between them.

Accessibility & Configurability

The NVDA user interface is simple and easily navigable with NVDA’s screen reading capabilities and shortcuts. NVDA is also accessible to users who use different speech synthesizers and input and output devices, such as refreshable braille displays, which are particularly important for deafblind users. All of these capabilities can be configured from the UI. Additionally, users can configure the behavior of speech synthesizers: a symbols.dic file defines how and when symbols and complex expressions are pronounced, and a characterDescriptions.dic file helps distinguish between characters with similar pronunciation. Finally, it is also easy to install and toggle add-ons as needed.

There is a potential trade-off between configurability and accessibility: too many configuration options can make the software harder to use. Thankfully, NVDA tackles this by providing sensible defaults (e.g. a default speech synthesizer).

Extensibility & Internationalization

The most important way in which NVDA offers extensibility is by providing multiple abstractions to handle UI elements. To explain this, let’s zoom in on the components view:

Figure: The accessibility APIs that NVDA uses

In NVDA, depending on which of the shown accessibility APIs the current application supports, the appropriate API is used to encapsulate each UI element into a single NVDAObject, which can then be used to retrieve information and manipulate UI elements. To handle application-specific or nonstandard widgets, it is easy to extend this class with so-called overlay classes or subclasses. This makes it incredibly easy to extend the system and introduce new behavior, as well as create new add-ons, which can add functionality and UI support for additional applications. We will be taking a closer look at these add-ons in the next section.
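
To give a feeling for how overlay classes can change the behaviour of a wrapped element, here is a simplified sketch. It is our own toy model: UIObject stands in for an NVDAObject, and choose_overlay_classes, apply_overlays and TerseButton are hypothetical names, so the real mechanism in NVDA may differ in its details.

```python
class UIObject:
    """Stand-in for a generic wrapped UI element."""

    def __init__(self, role, name):
        self.role = role
        self.name = name

    def describe(self):
        return f"{self.name} {self.role}"


class TerseButton:
    """Overlay with application-specific behaviour for buttons."""

    def describe(self):
        return f"{self.name} (press Enter)"


def choose_overlay_classes(obj, cls_list):
    """Hook where application-specific code can add overlay classes."""
    if obj.role == "button":
        cls_list.insert(0, TerseButton)  # earlier classes take precedence


def apply_overlays(obj):
    """Rebuild the object's class from the chosen overlays, if there are any."""
    cls_list = [type(obj)]
    choose_overlay_classes(obj, cls_list)
    if len(cls_list) > 1:
        name = "Dynamic_" + "_".join(cls.__name__ for cls in cls_list)
        obj.__class__ = type(name, tuple(cls_list), {})
    return obj


button = apply_overlays(UIObject("button", "OK"))
link = apply_overlays(UIObject("link", "Home"))
```

In this sketch the button picks up the overlay’s describe method while still being a UIObject, and the link is left untouched.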

This extensibility comes with a possible trade-off against the internationalization of NVDA. NVDA supports over 55 languages and can automatically detect UI in a different language. However, add-ons risk being usable in only one language. Thankfully, NVDA makes it easy to add locale-specific information to add-ons, which is loaded automatically based on the user’s language.
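
Loading locale-specific information based on the user’s language usually involves some kind of fallback. The sketch below shows one plausible scheme, assuming a chain from the full locale to its base language to a default; the functions locale_chain and translate and the catalog contents are hypothetical and do not describe NVDA’s actual lookup rules.

```python
def locale_chain(locale, default="en"):
    """Return the locales to try, from most to least specific."""
    chain = [locale]
    if "_" in locale:
        chain.append(locale.split("_", 1)[0])  # e.g. "nl_BE" -> "nl"
    if default not in chain:
        chain.append(default)
    return chain


def translate(message, locale, catalogs):
    """Look up a message, falling back through the locale chain."""
    for candidate in locale_chain(locale):
        text = catalogs.get(candidate, {}).get(message)
        if text is not None:
            return text
    return message  # no translation found: return the untranslated key


# Hypothetical message catalogs shipped with an add-on.
CATALOGS = {
    "en": {"greeting": "Hello"},
    "nl": {"greeting": "Hallo"},
}

belgian_dutch = translate("greeting", "nl_BE", CATALOGS)
french = translate("greeting", "fr", CATALOGS)
```

With a scheme like this, an add-on that ships only a few catalogs still remains usable for users of other languages, because every lookup eventually reaches the default.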

API design principles

A discussion on API design principles is complicated by the fact that NVDA does not have a clear API. There are certain helper functions and encapsulations which are denoted as APIs, but are these really APIs? Pautasso defines the rule of three to identify an API [1]:

you know you have designed a reusable API only after at least three applications have been built on top of it

One interface which clearly satisfies this rule, and would even satisfy a rule of fifty if it existed, is the AppModuleHandler. This can be considered an API for interacting with the underlying NVDAObjects and is the main way that contributors can develop new add-ons. It is an event-driven API, allowing developers to subscribe to different UI events for the NVDAObjects.

The most important principle the API adheres to is the small interfaces principle. When interacting with the API, the developer needs to exchange almost no information: they simply subscribe to the desired events. When the previously discussed Core fires an event, the relevant NVDAObject is passed to the developer’s code, which can then perform the desired operations on it. This also provides a healthy balance between reusability and usability: the event system is easy to use for different developers, while the NVDAObjects provide a powerful way to manipulate elements for different use cases.
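
How small such an interface can be is easiest to see in code. The sketch below is a toy event dispatcher, loosely modelled on the idea of subscribing by defining a method named after the event. The execute_event function, the event_gainFocus method name and the next_handler callback are assumptions of this example and should not be read as the exact NVDA API.

```python
class BrowserModule:
    """Hypothetical application-specific module that subscribes to one event."""

    def __init__(self, log):
        self.log = log

    def event_gainFocus(self, obj, next_handler):
        self.log.append(f"browser saw {obj}")
        next_handler()  # let the remaining handlers run as well


class DefaultBehaviour:
    """Hypothetical fallback that announces the focused object."""

    def __init__(self, log):
        self.log = log

    def event_gainFocus(self, obj, next_handler):
        self.log.append(f"announce {obj}")


def execute_event(name, obj, modules):
    """Call event_<name> on each module that defines it, as a chain."""
    handlers = [
        getattr(module, "event_" + name)
        for module in modules
        if hasattr(module, "event_" + name)
    ]

    def run(index):
        if index < len(handlers):
            handlers[index](obj, lambda: run(index + 1))

    run(0)


log = []
modules = [BrowserModule(log), DefaultBehaviour(log)]
execute_event("gainFocus", "address bar", modules)
execute_event("valueChange", "address bar", modules)  # nobody subscribed
```

The only contract between the dispatcher and a module in this sketch is the method name and its two arguments, which is the kind of narrow surface the small interfaces principle asks for.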

Conclusion

Hopefully, this blog post provides some insights into the inner workings of NVDA and the way that key concerns are built into the system. In our next essay, we will talk about the way that NVDA ensures the quality of its product, and how this impacts the evolution of the system.


  1. Pautasso, Cesare. Software Architecture: Visual Lecture Notes (2020).

  2. https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller

  3. https://c4model.com/

  4. The NVDA community wiki: https://github.com/nvaccess/nvda-community/wiki/internals

  5. The analysis by GitHub on NVDA’s GitHub page.

  6. Format taken from last year’s essay The architecture of architecting Rollercoasters.

NVDA
Authors
Gedeon d' Abreu de Paulo
Robin Cromjongh
Ricardo Jongerius
Hang Ji