Title: RFCartography - Visualizing relations between RFCs
Date: 2023-02-28 19:50
Author: Error
Slug: rfcartography
Summary: RFCs make up a nice set of structured data with many relations. I built a small(-ish) website that analyzes these relationships and visualizes them as graphs.
License: CC-BY-NC https://creativecommons.org/licenses/by-nc/4.0/

Setting Sails

RFCs are nice documents which explain how the internet works. They define protocols, describe best practices and discuss the organisation of internet infrastructure. Unlike some standards from other organizations, they are publicly available for free, usually written in a comprehensible manner and structured in a logical way.

Looking for a new project to do in my spare time at the end of 2022, I ended up thinking about how to visualize relations between RFCs. I always liked the way RFCs are connected to each other: Information about which RFC updates which, which RFC is obsoleted by which, and so on, all displayed at the top of the document, easy to find. So I decided to create RFCartography[°portmanteau word built out of "RFC" and "Cartography"; If you were under the impression that it has something to do with radio frequencies, you were misled, sorry.], a small tool that draws graphs of these relations.

A graph of the relations of RFC791 (Internet Protocol) generated by RFCartography

Parsing all the RFCs in order to extract the required information would have been a lot of effort and probably error-prone. Fortunately, rfc-editor.org has a nice, machine-readable RFC index file in XML format with all the metadata for each RFC. 1 This includes, among other data, all the relations between RFCs:

obsoletes
points to older RFCs which are superseded by this document 2, 3
obsoleted-by
points to newer RFCs which supersede this document 2, 3
updates
points to older RFCs which are modified and/or extended by this document 2, 3
updated-by
points to newer RFCs which modify and/or extend this document 2, 3
is-also
shows which other IDs this document is known as 2, 3
see-also
references other relevant documents 2
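
To illustrate how these relations could be read from the index, here is a minimal sketch using only Python's standard library. It is not RFCartography's actual IndexParser. The embedded sample is a trimmed, hand-written stand-in for an index entry, and its element layout and namespace are assumptions about the file's structure. The names `SAMPLE`, `local_name` and `extract_relations` are made up for this example.

```python
import xml.etree.ElementTree as ET

RELATIONS = ("obsoletes", "obsoleted-by", "updates",
             "updated-by", "is-also", "see-also")

# Hand-written sample; layout and namespace are assumptions, not a verbatim copy.
SAMPLE = """<rfc-index xmlns="https://www.rfc-editor.org/rfc-index">
  <rfc-entry>
    <doc-id>RFC0791</doc-id>
    <obsoletes><doc-id>RFC0760</doc-id></obsoletes>
    <updated-by><doc-id>RFC1349</doc-id><doc-id>RFC2474</doc-id></updated-by>
    <is-also><doc-id>STD0005</doc-id></is-also>
  </rfc-entry>
</rfc-index>"""


def local_name(tag):
    """Strip a '{namespace}' prefix so the code works with any namespace."""
    return tag.rsplit("}", 1)[-1]


def extract_relations(xml_text):
    """Return {doc_id: {relation: [target_id, ...]}} for every rfc-entry."""
    result = {}
    for entry in ET.fromstring(xml_text):
        if local_name(entry.tag) != "rfc-entry":
            continue
        doc_id = None
        relations = {}
        for child in entry:
            name = local_name(child.tag)
            if name == "doc-id":
                doc_id = (child.text or "").strip()
            elif name in RELATIONS:
                # Each relation element wraps one or more doc-id elements.
                relations[name] = [(d.text or "").strip() for d in child
                                   if local_name(d.tag) == "doc-id"]
        if doc_id:
            result[doc_id] = relations
    return result


relations = extract_relations(SAMPLE)
```

Stripping the namespace from tag names keeps the sketch independent of the exact namespace URI used by the real index file.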

How the Cartographer works

RFCartography consists of the IndexParser component, which parses the index file and makes its information available, and the RFCartographer component, which builds the requested graphs and generates SVGs from them. These are held together by a Flask application that initially calls the IndexParser and passes the data it returns to the RFCartographer, which it then repeatedly consults to generate graphs and render SVGs. The Flask application also implements the web frontend. The application is served by a web server and a WSGI server.

Component diagram of RFCartography

When a request is received, the Flask application calls the RFCartographer to generate the subgraph belonging to the requested RFC. To do this, the initial RFC is added to a queue. Every RFC in the queue is added to the graph, and all of its references are checked and added as edges to the graph. If a referenced document is not in the queue and hasn't been analyzed yet, it is added to the queue. To prevent long waiting times, a maximum depth can be specified. Once the queue is empty, the subgraph generation is complete and the graph can be returned.[°I profoundly apologise for this atrociously unreadable piece of code. In my defence: it seems to be working]

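# Excerpt: core, nodes, edges, node_types and max_depth come from the
# surrounding code (not shown). MultiDiGraph is assumed to be networkx's class.
from networkx import MultiDiGraph

todo:      list[tuple[Document, int]]  = [(core, 0)]   # queue of (document, depth)
done:      list[Document]              = []            # documents already expanded
graph:     MultiDiGraph                = MultiDiGraph()
graph.add_node(core)
nodes[core.type].append(core)

while todo:
    node: tuple[Document, int] = todo.pop(0)
    if node[0] not in done:
        done.append(node[0])
        # a max_depth of 0 or less disables the depth limit
        if node[1] < max_depth or max_depth <= 0:
            # each reference is a (reference type, target document) pair
            for neighbor in node[0].get_references():
                if neighbor[1].type not in node_types:
                    continue
                if not graph.has_node(neighbor[1]):
                    graph.add_node(neighbor[1])
                    nodes[neighbor[1].type].append(neighbor[1])
                graph.add_edge(node[0], neighbor[1], reftype=neighbor[0])
                edges[neighbor[0]].append((node[0], neighbor[1]))
                # duplicates in the queue are skipped by the "done" check above
                todo.append((neighbor[1], node[1]+1))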
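
For readers who want to experiment with the traversal on its own, the same idea can be sketched without networkx or the Document class. This is a simplified, self-contained variant and not the code RFCartography uses. Documents are plain ID strings, and `build_subgraph` and `REFERENCES` are hypothetical names. The reference data is a trimmed illustration and not the full set of relations for these RFCs.

```python
from collections import deque

# Trimmed illustrative data: document ID -> list of (relation, target ID).
REFERENCES = {
    "RFC0791": [("obsoletes", "RFC0760"), ("updated-by", "RFC1349")],
    "RFC1349": [("updates", "RFC0791"), ("obsoleted-by", "RFC2474")],
    "RFC2474": [("obsoletes", "RFC1349")],
}


def build_subgraph(core, references, max_depth=0):
    """Breadth-first collection of the nodes and edges reachable from core.

    A max_depth of 0 or less means "no depth limit", as in the excerpt above.
    """
    nodes = [core]
    edges = []
    seen = {core}                 # every document is queued at most once
    todo = deque([(core, 0)])     # queue of (document, depth)
    while todo:
        doc, depth = todo.popleft()
        if max_depth > 0 and depth >= max_depth:
            continue              # keep the node, but do not expand it
        for relation, target in references.get(doc, []):
            edges.append((doc, target, relation))
            if target not in seen:
                seen.add(target)
                nodes.append(target)
                todo.append((target, depth + 1))
    return nodes, edges


shallow_nodes, shallow_edges = build_subgraph("RFC0791", REFERENCES, max_depth=1)
all_nodes, all_edges = build_subgraph("RFC0791", REFERENCES)
```

Using a `deque` and a `seen` set avoids the linear-time `pop(0)` and list membership checks, which may matter for RFCs with many transitive relations.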

Afterwards, the subgraph can be rendered into an SVG with the help of pyplot.

Here be Dragons

Of course it does not work without problems. First of all, RFCartography is really slow. Generating and rendering graphs takes a while, leading to long response times. This can be mitigated to a degree by caching responses, but to really improve on this issue, the application should probably use a database backend.[°I'll add this to RFCartography at some point]
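
As a rough illustration of the caching idea, rendered output can be memoized per request parameters with the standard library. This is only a sketch, not how RFCartography caches responses. `render_svg` is a hypothetical stand-in for the expensive graph generation and rendering step, and the counter exists only to show when the cache is hit.

```python
from functools import lru_cache

render_calls = 0  # counts how often the expensive step actually runs


def render_svg(doc_id, max_depth):
    """Hypothetical stand-in for graph generation plus SVG rendering."""
    global render_calls
    render_calls += 1
    return f"<svg><!-- {doc_id}, depth {max_depth} --></svg>"


@lru_cache(maxsize=256)
def cached_svg(doc_id, max_depth):
    """Return the SVG for these parameters, rendering it at most once."""
    return render_svg(doc_id, max_depth)


first = cached_svg("RFC0791", 2)
second = cached_svg("RFC0791", 2)   # served from the cache
other = cached_svg("RFC0791", 3)    # different parameters, rendered again
```

An in-process cache like this is lost on restart and is not shared between worker processes, which is one reason a database backend could be the better long-term fix.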

Another issue seems to be due to a bug in pyplot. SVGs generated by RFCartography are supposed to have hyperlinks for each node that point to a page with details about that document. However, this does not work in some edge cases. If, for example, a subgraph consists of only one node, pyplot does not add a link to it. Further, it does not add links to the nodes if different shapes are used for different node types.[°Using different shapes would have been nice from an accessibility point of view, but since it doesn't work at the moment, RFCartography has to rely on colors only.]

A graph of the relations of RFC2322 (Management of IP numbers by peg-dhcp) generated by RFCartography

Draw your own Maps

You can try RFCartography yourself on rfcartography.net. The source code can be found here. I'm not sure whether this project is actually useful to anyone, but it was fun to build.


  1. https://www.rfc-editor.org/rfc-index.xml

  2. https://www.rfc-editor.org/rfc-index.xsd

  3. J. Halpern, L. Daigle and O. Kolkman. (2016, May) RFC 7841: RFC Streams, Headers, and Boilerplates. https://www.rfc-editor.org/rfc/rfc7841