Title: RFCartography - Visualizing relations between RFCs
Date: 2023-02-28 19:50
Author: Error
Slug: rfcartography
Summary: Analyzing relationships between RFCs and drawing graphs of them
License: CC-BY-NC
https://creativecommons.org/licenses/by-nc/4.0/
RFCs are nice documents which explain how the internet works.
They define protocols, describe best practices and discuss the organisation of internet infrastructure.
Unlike some standards from other organizations, they are publicly available for free, usually written in a comprehensible manner and structured in a logical way.
Looking for a new project to do in my spare time at the end of 2022, I ended up thinking about how to visualize relations between RFCs.
I always liked the way RFCs are connected to each other:
Information about which RFC updates which, which RFC is obsoleted by which, and so on, all displayed at the top of the document, easy to find.
So I decided to create RFCartography[°portmanteau word built out of "RFC" and "Cartography"; if you were under the impression that it has something to do with radio frequencies, you were misled, sorry.], a small tool that draws graphs of these relations.
![A graph of the relations of RFC791 (Internet Protocol) generated by RFCartography]({static}/blog/rfcartography/791.svg "A graph of the relations of RFC791 (Internet Protocol) generated by RFCartography")
Parsing all the RFCs in order to extract the required information would have been a lot of effort and probably prone to errors.
Fortunately, [rfc-editor.org](https://rfc-editor.org) has a nice, machine-readable RFC index file in XML format with all the metadata for each RFC. [^1]
This includes, among other data, all the relations between RFCs:
obsoletes
: points to older RFCs which are superseded by this document [^2], [^3]
obsoleted-by
: points to newer RFCs which supersede this document [^2], [^3]
updates
: points to older RFCs which are modified and/or extended by this document [^2], [^3]
updated-by
: points to newer RFCs which modify and/or extend this document [^2], [^3]
is-also
: shows which other IDs this document is known as [^2], [^3]
see-also
: references other relevant documents [^2]
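
As a rough illustration of how these relations could be pulled out of the index, here is a minimal sketch using only the Python standard library. It is not the IndexParser from RFCartography: the function name `extract_relations`, the namespace URI and the sample entry are assumptions made for this example and should be checked against the schema [^2] before relying on them.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from the published index; verify against rfc-index.xsd.
NS = {"idx": "http://www.rfc-editor.org/rfc-index"}
RELATIONS = ("obsoletes", "obsoleted-by", "updates",
             "updated-by", "is-also", "see-also")

# Abridged, hand-written sample entry for illustration only.
SAMPLE = """<rfc-index xmlns="http://www.rfc-editor.org/rfc-index">
  <rfc-entry>
    <doc-id>RFC0791</doc-id>
    <title>Internet Protocol</title>
    <obsoletes><doc-id>RFC0760</doc-id></obsoletes>
    <updated-by><doc-id>RFC1349</doc-id><doc-id>RFC2474</doc-id></updated-by>
    <is-also><doc-id>STD0005</doc-id></is-also>
  </rfc-entry>
</rfc-index>"""


def extract_relations(xml_text: str) -> dict[str, dict[str, list[str]]]:
    """Map each RFC's doc-id to its relations, keyed by relation name."""
    root = ET.fromstring(xml_text)
    result: dict[str, dict[str, list[str]]] = {}
    for entry in root.findall("idx:rfc-entry", NS):
        doc_id = entry.findtext("idx:doc-id", namespaces=NS)
        relations: dict[str, list[str]] = {}
        for name in RELATIONS:
            # Each relation element wraps one or more <doc-id> children.
            targets = [t.text for t in entry.findall(f"idx:{name}/idx:doc-id", NS)]
            if targets:
                relations[name] = targets
        result[doc_id] = relations
    return result


relations = extract_relations(SAMPLE)
print(relations["RFC0791"])
```

Relations that do not occur in an entry are simply left out of the resulting dictionary, so a lookup only ever returns the edges that actually exist.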
RFCartography consists of the IndexParser component, which parses the index file and makes its information available, and the RFCartographer component, which builds the requested graphs and generates SVGs from them.
This is held together by a Flask application that initially calls the IndexParser and provides the data returned by the parser to the RFCartographer, which it then repeatedly consults to generate graphs and render SVGs.
Further, the Flask application implements the web frontend.
The application is made available by a web server and a WSGI server.
![Component diagram of RFCartography]({static}/blog/rfcartography/components.svg "Component diagram of RFCartography")
When a request is received, the Flask application calls the RFCartographer to generate the subgraph belonging to the requested RFC.
In order to do this, the initial RFC is added to a queue.
Every RFC in the queue is added to the graph and all of its references are checked and added as edges to the graph.
If a referenced document is not in the queue and hasn't been analyzed yet, it is added to the queue.
To prevent long waiting times, a maximum depth can be specified.
Once the queue is empty, the subgraph generation is completed and the graph can be returned.[°I profoundly apologise for this atrociously unreadable piece of code. In my defence: it seems to be working]

```python
# Queue of (document, depth) pairs still to be visited, starting with the requested RFC
todo: list[tuple[Document, int]] = [(core, 0)]
done: list[Document] = []
graph: MultiDiGraph = MultiDiGraph()
graph.add_node(core)
nodes[core.type].append(core)
while len(todo) > 0:
    node: tuple[Document, int] = todo.pop(0)
    if node[0] not in done:
        done.append(node[0])
        # max_depth <= 0 disables the depth limit
        if node[1] < max_depth or max_depth <= 0:
            # each reference is a (reference type, referenced document) pair
            for neighbor in node[0].get_references():
                if neighbor[1].type not in node_types:
                    continue
                if not graph.has_node(neighbor[1]):
                    graph.add_node(neighbor[1])
                    nodes[neighbor[1].type].append(neighbor[1])
                graph.add_edge(node[0], neighbor[1], reftype=neighbor[0])
                edges[neighbor[0]].append((node[0], neighbor[1]))
                todo.append((neighbor[1], node[1]+1))
```

Afterwards, the subgraph can be rendered into an SVG with the help of pyplot.
Of course this does not work without problems.
First of all, RFCartography is really slow.
Generating and rendering graphs takes a while, leading to long response times.
This can be mitigated to a degree by caching responses, but to really improve on this issue, the application should probably use a database backend.[°I'll add this to RFCartography at some point]
Another issue seems to be due to a bug in pyplot.
SVGs generated by RFCartography are supposed to have hyperlinks for each node that point to a page with details about that document.
However, this does not work in some edge cases.
If, for example, a subgraph consists of only one node, pyplot does not add links to it.
Further, it does not add links to the nodes if different shapes are used for different node types.[°Using different shapes would have been nice from an accessibility point of view, but since it doesn't work at the moment, RFCartography has to rely on colors only.]
![A graph of the relations of RFC2322 (Management of IP numbers by peg-dhcp) generated by RFCartography]({static}/blog/rfcartography/2322.svg "A graph of the relations of RFC2322 (Management of IP numbers by peg-dhcp) generated by RFCartography")
You can try RFCartography yourself on [rfcartography.undefinedbehavior.de](https://rfcartography.undefinedbehavior.de/ "RFCartography").
The source code can be found [here](https://git.undefinedbehavior.de/undef/RFCartography "RFCartography - undefined git server").
I'm not sure whether this project is actually useful for anyone, but it was fun to build it.
[^1]: [https://www.rfc-editor.org/rfc-index.xml](https://www.rfc-editor.org/rfc-index.xml "RFC Index")
[^2]: [https://www.rfc-editor.org/rfc-index.xsd](https://www.rfc-editor.org/rfc-index.xsd "RFC Index Schema")
[^3]: J. Halpern, L. Daigle and O. Kolkman. (2016, May) RFC 7841: RFC Streams, Headers, and Boilerplates. [https://www.rfc-editor.org/rfc/rfc7841](https://www.rfc-editor.org/rfc/rfc7841 "RFC7841: RFC Streams, Headers, and Boilerplates")