Graphing the connections between my blog posts
https://shkspr.mobi/blog/2025/01/graphing-the-connections-between-my-blog-posts/
I love ripping off good ideas from other people's blogs. I was reading Alvaro Graves-Fuenzalida's blog when I saw this nifty little force-directed graph:
When zoomed in, it shows the relation between posts and tags.
In this case, I can see that the posts about Small Gods and Pyramids both share the tags of Discworld, Fantasy, and Book Review. But only Small Gods has the tag of Religion.
Isn't that cool! It is a native feature of Quartz's GraphView. How can I build something like that for my WordPress blog?
Aim
Create an interactive graph which shows the relationship between a post, its links, and their tags.
It will end up looking something like this:
You can get the code or follow along to see how it works.
This is a multi-stage process. Let's begin!
What We Need
When on a single Post, we need the following:
- The tags assigned to that Post.
- Internal links back to that Post.
- Internal links from that Post.
- The tags assigned to links to and from that Post.
Tags assigned to that Post.
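This is pretty easy! Using the get_the_tag_list() function we can, unsurprisingly, get all the tags associated with a post.
// get_the_tag_list() takes ( $before, $sep, $after, $post_id ) and returns linked tags,
// so pass the ID as the fourth argument and strip the HTML to leave the names
$post_tags_text  = strip_tags( get_the_tag_list( "", ",", "", $ID ) );
$post_tags_array = explode( ",", $post_tags_text );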
That just gets the list of tag names. If we want the tag IDs as well, we need to use the get_the_tags() function.
$post_tags = get_the_tags( $ID );
$tags = array();
// get_the_tags() returns false when a post has no tags
if ( is_array( $post_tags ) ) {
	foreach ( $post_tags as $tag ) {
		$tags[$tag->term_id] = $tag->name;
	}
}
Backlinks
Finding internal links back to the Post is slightly trickier. WordPress doesn't save relational information like that. Instead, we get the Post's URL and search for that in the database. Then we get the post IDs of all the posts which contain that string.
// The Post's URL is the string to search for
$search_url = get_permalink( $ID );

// Get all the posts which link to this one, oldest first
$the_query = new WP_Query(
	array(
		"s"              => $search_url,
		"post_type"      => "post",
		"posts_per_page" => "-1",
		"order"          => "ASC"
	)
);

// Nothing to do if there are no inbound links
if ( !$the_query->have_posts() ) {
	return;
}
Backlinks' Tags
Once we have an array of posts which link back here, we can get their tags as above:
// Loop through the posts
while ( $the_query->have_posts() ) {
	// Set it up
	$the_query->the_post();
	$id    = get_the_ID();
	$title = esc_html( get_the_title() );
	$url   = get_the_permalink();
	// Use $id (this backlink), not $ID (the original Post), as the fourth argument
	$backlink_tags_text  = strip_tags( get_the_tag_list( "", ",", "", $id ) );
	$backlink_tags_array = explode( ",", $backlink_tags_text );
}
Links from the Post
Again, WordPress's lack of relational links is a weakness. In order to get internal links, we need to:
- Render the HTML using all the filters
- Search for all <a href="…">
- Extract the ones which start with the blog's domain
- Get those posts' IDs.
Rendering the content into HTML is done with:
$content = apply_filters( "the_content", get_the_content( null, false, $ID ) );
Searching for links is slightly more complex. The easiest way is to load the HTML into a DOMDocument, then extract all the anchors. All my blog posts' URLs start with /blog/YYYY so I can avoid selecting links to tags, uploaded files, or other things. Your blog may be different.
$dom = new DOMDocument();
libxml_use_internal_errors( true ); // Suppress warnings from malformed HTML
$dom->loadHTML( $content );
libxml_clear_errors();

$links = [];
foreach ( $dom->getElementsByTagName( "a" ) as $anchor ) {
	$href = $anchor->getAttribute( "href" );
	// Match the start of a post URL: the domain, /blog/, then a four-digit year.
	// There is no trailing $ because the rest of the permalink follows the year.
	if ( preg_match( '/^https:\/\/shkspr\.mobi\/blog\/\d{4}\//', $href ) ) {
		$links[] = $href;
	}
}
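It is worth checking the pattern against a handful of URLs before letting it loose on a whole blog. Here's a rough JavaScript sketch of the same filtering step. The function name filterInternalLinks and the sample URLs are made up for illustration, and it assumes post URLs are the domain, then /blog/, a four-digit year, and a slash.

```javascript
// Hypothetical helper: keep only the hrefs which look like blog posts.
// Assumes post URLs look like https://shkspr.mobi/blog/YYYY/...
function filterInternalLinks( hrefs ) {
	const postPattern = /^https:\/\/shkspr\.mobi\/blog\/\d{4}\//;
	return hrefs.filter( href => postPattern.test( href ) );
}

// Made-up sample links, purely for illustration
const sampleHrefs = [
	"https://shkspr.mobi/blog/2025/01/some-post/",
	"https://shkspr.mobi/blog/tag/discworld/",
	"https://example.com/blog/2025/01/elsewhere/",
	"/blog/2024/12/relative-link/",
];

// Only the first sample survives the filter
console.log( filterInternalLinks( sampleHrefs ) );
```

Note that relative links are skipped by this pattern. If your posts link to each other without the domain, you would need to loosen it.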
The ID of each post can be found with the url_to_postid() function. That means we can re-use the earlier code to see what tags those posts have.
Building a graph
OK, so we have all our constituent parts. Let's build a graph!
Graphs consist of nodes (posts and tags) and edges (links between them). The exact format of the graph is going to depend on the graph library we use.
I've decided to use D3.js's Force Graph as it is relatively simple and produces a reasonably good looking interactive SVG.
Imagine there are two blog posts and two hashtags.
const nodes = [
	{ id: 1, label: "Blog Post 1",    url: "https://example.com/post/1", group: "post" },
	{ id: 2, label: "Blog Post 2",    url: "https://example.com/post/2", group: "post" },
	{ id: 3, label: "hashtag",        url: "https://example.com/tag/3",  group: "tag"  },
	{ id: 4, label: "anotherHashtag", url: "https://example.com/tag/4",  group: "tag"  },
];
- Blog Post 1 links to Blog Post 2.
- Blog Post 1 has a #hashtag.
- Both 1 & 2 share #anotherHashtag.
const links = [
	{ source: 1, target: 2 },
	{ source: 3, target: 1 },
	{ source: 4, target: 1 },
	{ source: 4, target: 2 },
];
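D3's link force looks nodes up by their id, so a link pointing at an id which isn't in the list of nodes will cause trouble (as far as I know, it throws a "node not found" error). Here's a small sanity check, sketched under the assumption that the data looks like the example above. The helper name findDanglingLinks is made up.

```javascript
// The example data from above, trimmed to the fields this check needs
const exampleNodes = [
	{ id: 1, label: "Blog Post 1",    group: "post" },
	{ id: 2, label: "Blog Post 2",    group: "post" },
	{ id: 3, label: "hashtag",        group: "tag"  },
	{ id: 4, label: "anotherHashtag", group: "tag"  },
];

const exampleLinks = [
	{ source: 1, target: 2 },
	{ source: 3, target: 1 },
	{ source: 4, target: 1 },
	{ source: 4, target: 2 },
];

// Hypothetical helper: return every link whose source or target
// does not match the id of a known node
function findDanglingLinks( nodes, links ) {
	const ids = new Set( nodes.map( node => node.id ) );
	return links.filter( link => !ids.has( link.source ) || !ids.has( link.target ) );
}

// An empty array means every link joins two known nodes
console.log( findDanglingLinks( exampleNodes, exampleLinks ) );
```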
Here's how to create a list of nodes and their links. You will need to edit it for your own blog's peculiarities.
<?php
// Load WordPress environment
require_once( "wp-load.php" );

// Set up arrays for nodes and links
$nodes = array();
$links = array();

// ID of the Post
$main_post_id = 12345;

// Get the Post's details
$main_post_url   = get_permalink( $main_post_id );
$main_post_title = get_the_title( $main_post_id );

// Function to add new nodes
function add_item_to_nodes( &$nodes, $id, $label, $url, $group ) {
	$nodes[] = [ "id" => $id, "label" => $label, "url" => $url, "group" => $group ];
}

// Function to add new relationships
function add_relationship( &$links, $source, $target ) {
	$links[] = [ "source" => $source, "target" => $target ];
}

// Add Post to the nodes
add_item_to_nodes( $nodes, $main_post_id, $main_post_title, $main_post_url, "post" );

// Get the tags of the Post (get_the_tags() returns false if there are none)
$main_post_tags = get_the_tags( $main_post_id ) ?: array();

// Add the tags as nodes, and create links to main Post
foreach ( $main_post_tags as $tag ) {
	$id   = $tag->term_id;
	$name = $tag->name;
	// Add the node
	add_item_to_nodes( $nodes, $id, $name, "https://shkspr.mobi/blog/tag/" . $name, "tag" );
	// Add the relationship
	add_relationship( $links, $id, $main_post_id );
}

// Get all the posts which link to this one, oldest first
$the_query = new WP_Query(
	array(
		"s"              => $main_post_url,
		"post_type"      => "post",
		"posts_per_page" => "-1",
		"order"          => "ASC"
	)
);

// Only loop if there are inbound links
if ( $the_query->have_posts() ) {
	// Loop through the posts
	while ( $the_query->have_posts() ) {
		// Set up the query
		$the_query->the_post();
		$post_id = get_the_ID();
		$title   = esc_html( get_the_title() );
		$url     = get_the_permalink();
		// Add the node
		add_item_to_nodes( $nodes, $post_id, $title, $url, "post" );
		// Add the relationship
		add_relationship( $links, $post_id, $main_post_id );
		// Get the tags of the linking post
		$post_tags = get_the_tags( $post_id ) ?: array();
		// Add the tags as nodes, and create links to the linking post
		foreach ( $post_tags as $tag ) {
			$id   = $tag->term_id;
			$name = $tag->name;
			// Add the node
			add_item_to_nodes( $nodes, $id, $name, "https://shkspr.mobi/blog/tag/" . $name, "tag" );
			// Add the relationship
			add_relationship( $links, $id, $post_id );
		}
	}
	// Restore the global post data after the custom loop
	wp_reset_postdata();
}

// Get all the internal links from this post
// Render the post as HTML
$content = apply_filters( "the_content", get_the_content( null, false, $main_post_id ) );

// Load it into a DOM
$dom = new DOMDocument();
libxml_use_internal_errors( true );
$dom->loadHTML( $content );
libxml_clear_errors();

// Get any <a href="…" which starts with https://shkspr.mobi/blog/YYYY/
$internal_links = [];
foreach ( $dom->getElementsByTagName( "a" ) as $anchor ) {
	$href = $anchor->getAttribute( "href" );
	if ( preg_match( '/^https:\/\/shkspr\.mobi\/blog\/\d{4}\//', $href ) ) {
		$internal_links[] = $href;
	}
}

// Loop through the internal links, get their hashtags
foreach ( $internal_links as $url ) {
	$post_id = url_to_postid( $url );
	// url_to_postid() returns 0 if the URL doesn't match a post
	if ( !$post_id ) {
		continue;
	}
	// Get the linked post's details
	$post_title = get_the_title( $post_id );
	// Add the node
	add_item_to_nodes( $nodes, $post_id, $post_title, $url, "post" );
	// Add the relationship
	add_relationship( $links, $main_post_id, $post_id );
	// Get the tags of the linked post
	$post_tags = get_the_tags( $post_id ) ?: array();
	// Add the tags as nodes, and create links to the linked post
	foreach ( $post_tags as $tag ) {
		$id   = $tag->term_id;
		$name = $tag->name;
		// Add the node
		add_item_to_nodes( $nodes, $id, $name, "https://shkspr.mobi/blog/tag/" . $name, "tag" );
		// Add the relationship
		add_relationship( $links, $id, $post_id );
	}
}

// Deduplicate the nodes and links, then re-index them
// into the keyless format that D3 expects
$nodes_output = array_values( array_unique( $nodes, SORT_REGULAR ) );
$links_output = array_values( array_unique( $links, SORT_REGULAR ) );

// Return the JSON
echo json_encode( $nodes_output, JSON_PRETTY_PRINT );
echo "\n";
echo json_encode( $links_output, JSON_PRETTY_PRINT );
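One thing to watch out for: WordPress numbers posts and tags independently, so a post and a tag can end up with the same ID, and D3 would then treat them as a single node. Here's a rough sketch of one way around that, which prefixes every ID with its group and then removes duplicates. This is not what the PHP above does. The helper names namespacedId and dedupeBy, and the sample data, are all made up for illustration.

```javascript
// Hypothetical helper: build an id which is unique across posts and tags
function namespacedId( group, id ) {
	return `${group}-${id}`;
}

// Hypothetical helper: keep the first item seen for each key
function dedupeBy( items, keyOf ) {
	const seen = new Map();
	for ( const item of items ) {
		const key = keyOf( item );
		if ( !seen.has( key ) ) {
			seen.set( key, item );
		}
	}
	return [ ...seen.values() ];
}

// Made-up data: post 7 and tag 7 would clash without a prefix
const rawNodes = [
	{ id: 7, label: "A post", group: "post" },
	{ id: 7, label: "aTag",   group: "tag"  },
	{ id: 7, label: "aTag",   group: "tag"  }, // A duplicate
];

const rawLinks = [
	{ source: "tag-7", target: "post-7" },
	{ source: "tag-7", target: "post-7" }, // A duplicate
];

// Prefix the ids, then drop the repeats
const uniqueNodes = dedupeBy(
	rawNodes.map( node => ( { ...node, id: namespacedId( node.group, node.id ) } ) ),
	node => node.id
);
const uniqueLinks = dedupeBy( rawLinks, link => `${link.source}>${link.target}` );

console.log( uniqueNodes, uniqueLinks );
```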
Creating a Force Directed SVG
Once the data are spat out, you can include them in a web-page. Here's a basic example:
<!DOCTYPE html>
<html lang="en">
	<head>
		<meta charset="UTF-8">
		<meta name="viewport" content="width=device-width, initial-scale=1.0">
		<title>Force Directed Graph</title>
		<script src="https://d3js.org/d3.v7.min.js"></script>
	</head>
	<body>
		<svg width="800" height="600">
			<defs>
				<marker id="arrowhead" markerWidth="10" markerHeight="7" refX="10" refY="3.5" orient="auto" fill="#999">
					<path d="M0,0 L10,3.5 L0,7 Z"></path>
				</marker>
			</defs>
		</svg>
		<script>
			// Replace these empty arrays with the JSON from the PHP script
			const nodes = [];
			const links = [];

			const width  = 800;
			const height = 600;

			const svg = d3.select( "svg" )
				.attr( "width", width )
				.attr( "height", height );

			const simulation = d3.forceSimulation( nodes )
				.force( "link", d3.forceLink( links ).id( d => d.id ).distance( 100 ) )
				.force( "charge", d3.forceManyBody().strength( -300 ) )
				.force( "center", d3.forceCenter( width / 2, height / 2 ) );

			// Draw links
			const link = svg.selectAll( ".link" )
				.data( links )
				.enter().append( "line" )
				.attr( "stroke", "#999" )
				.attr( "stroke-width", 2 )
				.attr( "x1", d => d.source.x )
				.attr( "y1", d => d.source.y )
				.attr( "x2", d => d.target.x )
				.attr( "y2", d => d.target.y )
				.attr( "marker-end", "url(#arrowhead)" );

			// Draw nodes
			const node = svg.selectAll( ".node" )
				.data( nodes )
				.enter().append( "g" )
				.attr( "class", "node" )
				.attr( "transform", d => `translate(${d.x},${d.y})` )
				.call( d3.drag() // Make nodes draggable
					.on( "start", dragStarted )
					.on( "drag", dragged )
					.on( "end", dragEnded )
				);

			// Add hyperlink
			node.append( "a" )
				.attr( "xlink:href", d => d.url ) // Link to the node's URL
				.attr( "target", "_blank" ) // Open in a new tab
				.each( function ( d ) {
					const a = d3.select( this );
					// Different shapes for posts and tags
					if ( d.group === "post" ) {
						a.append( "circle" )
							.attr( "r", 10 )
							.attr( "fill", "blue" );
					} else if ( d.group === "tag" ) {
						// White background rectangle
						a.append( "rect" )
							.attr( "width", 20 )
							.attr( "height", 20 )
							.attr( "x", -10 )
							.attr( "y", -10 )
							.attr( "fill", "white" );
						// Red octothorpe
						a.append( "path" )
							.attr( "d", "M-10,-5 H10 M-10,5 H10 M-5,-10 V10 M5,-10 V10" )
							.attr( "stroke", "red" )
							.attr( "stroke-width", 2 )
							.attr( "fill", "none" );
					}
					// Text label
					a.append( "text" )
						.attr( "dy", 4 )
						.attr( "x", d => ( d.group === "post" ? 12 : 14 ) )
						.attr( "fill", "black" )
						.style( "font-size", "12px" )
						.text( d.label );
				} );

			// Run simulation with simple animation
			simulation.on( "tick", () => {
				link
					.attr( "x1", d => d.source.x )
					.attr( "y1", d => d.source.y )
					.attr( "x2", d => d.target.x )
					.attr( "y2", d => d.target.y );
				node
					.attr( "transform", d => `translate(${d.x},${d.y})` );
			} );

			// Standard helper functions to make nodes draggable
			function dragStarted( event, d ) {
				if ( !event.active ) simulation.alphaTarget( 0.3 ).restart();
				d.fx = d.x;
				d.fy = d.y;
			}

			function dragged( event, d ) {
				d.fx = event.x;
				d.fy = event.y;
			}

			function dragEnded( event, d ) {
				if ( !event.active ) simulation.alphaTarget( 0 );
				d.fx = null;
				d.fy = null;
			}
		</script>
	</body>
</html>
Next Steps
It needs a bit of cleaning up if I want to turn it into a WordPress plugin. It might be nice to make it a static SVG rather than relying on JavaScript. And the general æsthetic needs a bit of work.
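On the static SVG idea: assuming each node's position has already been worked out somewhere (for example, by running the simulation once and saving every node's x and y), the rest is mostly string building. Here's a very rough sketch which draws every node as a circle. The helper names escapeXml and renderStaticSvg, and the positions in the sample data, are made up for illustration.

```javascript
// Hypothetical helper: escape text for use inside SVG markup
function escapeXml( text ) {
	return String( text )
		.replace( /&/g, "&amp;" )
		.replace( /</g, "&lt;" )
		.replace( />/g, "&gt;" )
		.replace( /"/g, "&quot;" );
}

// Hypothetical helper: every node must already carry x and y positions
function renderStaticSvg( nodes, links, width = 800, height = 600 ) {
	const byId = new Map( nodes.map( node => [ node.id, node ] ) );

	// One line per relationship
	const lines = links.map( link => {
		const s = byId.get( link.source );
		const t = byId.get( link.target );
		return `<line x1="${s.x}" y1="${s.y}" x2="${t.x}" y2="${t.y}" stroke="#999" stroke-width="2"/>`;
	} );

	// One linked circle and label per node: blue for posts, red for tags
	const shapes = nodes.map( node =>
		`<a href="${escapeXml( node.url )}">` +
		`<circle cx="${node.x}" cy="${node.y}" r="10" fill="${node.group === "post" ? "blue" : "red"}"/>` +
		`<text x="${node.x + 12}" y="${node.y + 4}" font-size="12">${escapeXml( node.label )}</text>` +
		`</a>`
	);

	return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
		lines.join( "" ) + shapes.join( "" ) + `</svg>`;
}

// Made-up positions, purely for illustration
const positionedNodes = [
	{ id: 1, label: "Blog Post 1", url: "https://example.com/post/1", group: "post", x: 100, y: 100 },
	{ id: 3, label: "cats & dogs", url: "https://example.com/tag/3",  group: "tag",  x: 200, y: 150 },
];

const svgText = renderStaticSvg( positionedNodes, [ { source: 3, target: 1 } ] );
console.log( svgText );
```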
Perhaps I could make it 3D like my MSc Dissertation?
But I'm pretty happy with that for an afternoon hack!
You can get the code if you want to play.