NetworkX, a Python package, allows users to perform a variety of network analysis tasks.
The open-source package attracts a large and varied community. In this final blog post of the series, we analyse NetworkX’s community and highlight its relationship to the project’s architecture.
One can draw a bidirectional relationship between NetworkX and its community. NetworkX and its maintainers foster an environment for the community to thrive, and the community influences the architecture of the project to facilitate its feature-rich and simple interface.
1. Community Overview
A large community is one of the essential elements of a successful open-source project [1] because it encourages:
- Discussions
- Collaboration between people with different backgrounds
- Bug discovery
- Attracting new, diverse users
- Functional improvements
- The project’s drive to stay relevant.
Only a handful of NetworkX’s contributors are core developers; the rest are regular developers, researchers and everyday users. Communication happens over four channels: GitHub, Twitter, Reddit and Stack Overflow.
GitHub is used for direct interaction with the core developers, whether to report issues or to suggest improvements. This channel is both a way to affect the project directly and a medium for collaboration.
Twitter and Reddit, on the other hand, are platforms where everyday users and researchers interact and share their creations. These channels also carry announcements. Lastly, Stack Overflow focuses on helping users with technical problems and on sharing the knowledge needed to use NetworkX effectively.
2. People
To see the relationship between the project and the community clearly, we need to decompose it into the underlying social groups. This directly relates to the different relationship dynamics that can arise between a project and a community [2][3].
2.1 Contributors
Core developers are responsible for the management of the project, act as the final authority on architectural decisions, and handle change management by auditing issues and incoming merge requests. Additionally, core developers manage the financial aspect of the project.
Contributing developers are people who pick up an issue and develop a solution while collaborating with core developers through GitHub. Their role ends there: contributing developers have no authority or deciding power. At most, they can express an opinion on an issue or decision, but they cannot influence it directly. Even so, core developers are mindful of contributing developers and actively involve themselves in the discussions that arise.
As mentioned before, the community contributes primarily through GitHub [4]. GitHub enables trackable issues and suggestions, lets people track, reference and comment on code changes, and supports dynamic and rich discussions. During the initial discussion, the core developers actively participate, giving suggestions and providing helpful examples. Changes go through a rigorous review process and are rarely accepted without additional improvements. Such a collaboration process exemplifies the effort to uphold the key quality attributes.
2.2 Users
Many users are either researchers working on various projects (geneticists, social scientists, urban planners, epidemiologists, etc.) or enthusiasts creating interesting applications and graphs. Many of them use Twitter to share their creations or the publications that use the package. Some users are companies that use NetworkX to visualize their data, to demonstrate their findings, or simply as an example in their Python learning platform. Such wide use of NetworkX in scientific and commercial fields is a good example of the project succeeding in its vision.
Developers are increasingly open-sourcing their work as GitHub repositories. By simply querying “NetworkX” in GitHub you can find 2,044 repositories either using or expanding NetworkX. This indicates trust and wide usage of the project. From a software architecture perspective, it is paramount for the core team to uphold the key quality attributes and be careful when introducing new features and changes in the project.
3. Diversity in Skills
The NetworkX community has a diversity of skills from different domains. We list some examples of how people in different fields combine their unique skills to use NetworkX.
- System Engineers: people studying systems care about quickly locating the security vulnerabilities in a system. They put more emphasis on using different network ontologies to build small networks and investigate test processes through network simulations [5].
- Data Analysts: graphs are an important data structure that data scientists use frequently in their daily work. Common skills include using graphs to visualize output, spot abnormal patterns [6] and build models.
- Mathematicians: mathematicians are good at constructing and optimizing algorithms, so they may pay more attention to the implementation details of the algorithms module than to visualization.
- Social Scientists: research on interpersonal networks, behaviour patterns, knowledge structures and similar topics in social science often needs to build and visualise graphs. NetworkX provides flexible node types, which help in analysing unstructured data such as interview texts.
- Biologists: gene maps, chemical reactions and other complex applications demand a graph library that processes data conveniently and stably. NetworkX lets biologists apply their skills to discovering useful information in complex graphs.
As a graph library attracting people from different domains, NetworkX is popular enough to be acknowledged first [7]. The stability when operating on complex networks, the flexibility of defining graphs, the Python-based scientific community and word-of-mouth are all plus points for using NetworkX in combination with multiple skill sets.
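To make the "flexible node types" point concrete, here is a minimal sketch. It relies on the fact that NetworkX accepts any hashable Python object as a node; the node names and attributes below are invented for illustration and do not come from any real study.

```python
import networkx as nx

# Any hashable Python object can serve as a node, so one graph can mix
# people, coded topics and numeric identifiers (all hypothetical here).
G = nx.Graph()
G.add_node("interviewee_A", role="teacher")   # string node with an attribute
G.add_node(("topic", "housing"))              # tuple node
G.add_node(17)                                # integer node, e.g. a transcript id

# Edges can carry free-form attributes as well.
G.add_edge("interviewee_A", ("topic", "housing"), source_transcript=17)

node_kinds = sorted(type(n).__name__ for n in G.nodes)
print(node_kinds)
```

A social scientist could therefore model people and interview topics in one graph without first forcing everything into integer identifiers.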
4. Knowledge Transfer
NetworkX’s knowledge transfer is based largely on documentation and examples rather than presentations and events. These examples can be readily modified by practitioners of various disciplines. While this form is less interactive, it is longer-lasting. Researchers and developers also document their own uses of NetworkX, which further enriches the knowledge transfer.
All these usages are well-documented and try to include varied and useful examples. The documentation and examples can be categorised into the following:
- General-purpose: examples containing instructions for getting started with sample code-snippets.
- Specific: explanations or examples explaining specific algorithms, their usage, pros and cons.
- Complex: examples that usually integrate other libraries, along with documentation of the usage and of the decisions made.
5. Impact by the Community
What in NetworkX bridges it to different domains? The answer is the modular components [8] in the architecture, e.g. how the Algorithms and Drawing components work with the Core. Users need only be concerned with the specific functionalities of NetworkX that they use.
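The modularity described here can be sketched in a few lines: a graph object from the Core is built once and then handed, unchanged, to functions from the Algorithms component. The toy graph is an assumption made for illustration. The Drawing component is mentioned only in a comment because it needs an optional plotting backend.

```python
import networkx as nx

# Core: build the data structure once.
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("c", "d")])

# Algorithms: independent functions that accept the same graph object.
path = nx.shortest_path(G, "a", "d")
centrality = nx.degree_centrality(G)

# Drawing: nx.draw(G) would take the same object, but it requires
# matplotlib, so it is left out of this sketch.
print(path)
print(centrality)
```

A user interested only in centrality scores never has to touch the drawing code, and vice versa.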
5.1 Academia
The statistics of a Scopus [9] search show that NetworkX supports multiple academic fields. Most of the use cases are in computer science, engineering and mathematics, since these fields deal with graph problems and have the programming know-how.
NetworkX aims for usability by reducing the effort required to manipulate and study networks, which helps researchers build their scientific network graphs [10] and conduct analyses [11] conveniently.
The wide academic use pushes NetworkX in a scientific direction. This entails more graph algorithms along with robust testing. Academics report new issues and provide good pull requests for specific academic requirements. One example is Issue #4056, in which a user calls for adding graph learning algorithms such as node2vec; a core developer added the “Enhancement” label to it. Another example, Issue #4463, shows a clear direction towards graph-summarization algorithms for academic ventures [12]. Explicit tests and variations of current graph objects are added to the library, which allows the existing graph structure to be combined more flexibly with new algorithms.
5.2 Industry
Because NetworkX actively depends on other external libraries, the project also devotes effort to resolving issues caused by these dependencies. For instance, issues caused by GraphML (a format used to describe the structural properties of a graph) are labelled by maintainers as “Enhancement” and “fixes”. This forces developers to be more aware of the external environment when developing their code.
NetworkX is working towards better explanations of how it integrates with other packages in the community. There are pull requests for improving the external gallery examples for other popular graph libraries, such as igraph, pygraphviz and jit, shown in PRs #4422, #4427 and #4261. Such integration strengthens the connection between NetworkX and the industry. The greater support for other packages increases user-stickiness, since users need not switch between different packages to find suitable methods.
A Stack Overflow user [13] provides a great answer on how to export a NetworkX graph to JSON and then display it via the D3.js library. The answer notes in particular that NetworkX does not force one to use any specific library, which shows that the integration work has been recognized in the community.
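As a small illustration of the GraphML support mentioned above, the sketch below serialises a graph to GraphML text and parses it back using `nx.generate_graphml` and `nx.parse_graphml`. The graph and its attributes are made up for the example, and it is not tied to any of the issues referred to in this section.

```python
import networkx as nx

# A tiny graph with one node attribute and one edge attribute (hypothetical data).
G = nx.Graph()
G.add_node("n1", label="source")
G.add_edge("n1", "n2", weight=1.5)

# Serialise to GraphML text, then parse the text back into a graph.
graphml_text = "\n".join(nx.generate_graphml(G))
H = nx.parse_graphml(graphml_text)

print(sorted(H.nodes))
print(H["n1"]["n2"]["weight"])
```

Round trips like this are where format-related issues tend to surface, since attribute types have to survive the conversion in both directions.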
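The JSON export described above can be sketched with NetworkX's node-link helpers. This is one possible approach and not necessarily the one used in the cited answer; the graph is invented. The name of the key holding the edges in the exported dictionary differs between NetworkX versions, so the sketch avoids depending on it.

```python
import json

import networkx as nx
from networkx.readwrite import json_graph

# Hypothetical two-node graph.
G = nx.Graph()
G.add_edge("alice", "bob", weight=3)

# Convert to a plain dictionary in node-link form, then to a JSON string
# that a D3.js page (or any other consumer) could load.
data = json_graph.node_link_data(G)
payload = json.dumps(data)

# The same helpers rebuild a graph from the JSON, closing the round trip.
restored = json_graph.node_link_graph(json.loads(payload))
print(sorted(restored.nodes))
```

Because the output is ordinary JSON, the choice of visualisation library is left entirely to the user.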
5.3 General Public
Here we look at the use of NetworkX by practitioners for non-academic purposes. We explore two interesting projects, posted on a blog and under the Twitter hashtag #networkx, in which members of the general public use NetworkX to build their applications. Dima Goldenberg implements a network view of the Eurovision 2018 votes by building a directed graph from the edge-list using NetworkX [14]. Will Gregory shows a network of sea surface temperature anomalies [15], using differently coloured links to show covariance. These projects show that NetworkX helps convey powerful, vivid messages. While NetworkX affects the real world, the real world pushes the API to be simpler. Several GitHub PRs (#1680, #1688) aim to remove redundant code, enforce polymorphism and simplify the use of the API.
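Building a directed graph from an edge-list, as in the voting example, might look like the sketch below. The countries and points are placeholders and not real Eurovision data, and the attribute name `points` is an arbitrary choice for this illustration.

```python
import networkx as nx

# Hypothetical (voter, recipient, points) triples.
votes = [
    ("A", "B", 12),
    ("C", "B", 10),
    ("B", "A", 8),
    ("C", "A", 12),
]

# Each triple becomes a directed edge carrying its points as an attribute.
G = nx.DiGraph()
G.add_weighted_edges_from(votes, weight="points")

# Weighted in-degree gives the total points each country received.
received = dict(G.in_degree(weight="points"))
winner = max(received, key=received.get)
print(received, winner)
```

The same graph could then be passed to a drawing or export function to produce the kind of network view the project describes.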
6. Effect of Community on Architecture
We have looked in depth at who makes up the community. Now we look at how its members affect different elements of NetworkX’s architecture:
- Simplicity: users suggest simplicity changes that encourage abstraction and polymorphism. (PR#1685)
- Gentle learning-curve: the startup process needs to be simple for everybody. What is “gentle” for a programmer might not be for a biologist. (PR#1684)
- API: The API structure of NetworkX is affected by community suggestions and improvements to existing implementations. (PR#4395)
- Robustness: certain edge cases may occur in one application domain but not in another. For example, contributors can improve how the library copes with accidental changes to the supplied data. (PR#4372)
- Feature-set: network algorithms or drawing methods added by the community expand the feature-set. (PR#4240)
- Interoperability: additions or changes to how certain data is converted and managed in order to support other libraries and structures. (PR#4319)
- Software Variability: this is usually effected through contributions introducing or changing available parameters or contexts in which a function can be applied. (PR#4384)
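Two of the elements above, polymorphism in the API and variability through parameters, can be illustrated with a short sketch. It is not taken from the cited PRs. It only shows that one function accepts both undirected and directed graphs, and that an optional parameter changes the context in which it is applied. The edge data is invented.

```python
import networkx as nx

# The same weighted edges, loaded into an undirected and a directed graph.
edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 5)]
graphs = {"Graph": nx.Graph(), "DiGraph": nx.DiGraph()}
for g in graphs.values():
    g.add_weighted_edges_from(edges)

# Polymorphism: one function call works on both graph classes.
by_hops = {name: nx.shortest_path(g, "a", "c") for name, g in graphs.items()}

# Variability: the optional `weight` parameter switches from counting hops
# to minimising the summed edge weights.
by_weight = {
    name: nx.shortest_path(g, "a", "c", weight="weight")
    for name, g in graphs.items()
}
print(by_hops)
print(by_weight)
```

With hop counting the direct edge wins, while with weights the two-step route is cheaper, and the calling code is identical for both graph types.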
7. Conclusion
NetworkX has to accommodate various requirements to remain useful, relevant and current. The community benefits from the package, and the community-experience influences the architecture and the future direction for most of the functional elements of NetworkX.
The diversity in the community is a unique attribute. It enables the core-developers to identify non-intuitive approaches; approaches that have a real-world impact such as combating climate change, curing diseases, communicating with the masses… simply building a better world.
References
- Discovering community patterns in open-source: a systematic approach and its evaluation.
- NetworkX, Contribution Guide.
- Holtschulte, Neal, and Melanie Moses. “Diversity and resistance in a model network with adaptive software.” Security Informatics 1.1 (2012): 1-11.
- NetworkX, Architectural Style.
- Goyanes, Manuel, and Luís De-Marcos. “Academic influence and invisible colleges through editorial board interlocking in communication sciences: a social network analysis of leading journals.” Scientometrics 123.2 (2020): 791-811.
- Vargas, David L., et al. “Correlation between student collaboration network centrality and academic performance.” Physical Review Physics Education Research 14.2 (2018): 020112.
- Stack Overflow, NetworkX visualization using external packages.
- Network of sea surface temperatures depicting correlations.