Once again I will use the data made available by Andrew Beveridge, this time for two purposes. First, I will demonstrate categorical pageRank and break pageRank down by the sequence of the books, which will help us find the winners of the game of thrones. Second, I will show some of the visualization options the Neo4j community has to offer.

For more details about the dataset, check Analyzing the Graph of Thrones by William Lyon or my Neo4j GoT social graph analysis.

## Import

Let's first define the schema of our graph.

We only need one constraint, on the Person label. It will speed up both our import and later queries.

CREATE CONSTRAINT ON (p:Person) ASSERT p.id IS UNIQUE;

As the data for all five books is available on GitHub, we can import all of them with a single Cypher query that combines `LOAD CSV` with APOC.

We will differentiate the networks of the individual books by using separate relationship types. We need `apoc.merge.relationship` because Cypher does not allow parameters for relationship types. The network from the first book will be stored with relationship type INTERACTS_1, the second with INTERACTS_2, and so on. The fourth and fifth books come as a single combined file, so their network is stored as INTERACTS_45.

UNWIND ['1','2','3','45'] AS book
LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/mathbeveridge/asoiaf/master/data/asoiaf-book' + book + '-edges.csv' AS value
MERGE (source:Person{id:value.Source})
MERGE (target:Person{id:value.Target})
WITH source, target, value.weight AS weight, book
CALL apoc.merge.relationship(source, 'INTERACTS_' + book, {}, {weight:toFloat(weight)}, target) YIELD rel
RETURN distinct 'done'
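To make the naming scheme concrete, here is a minimal Python sketch of what the import does conceptually: it reads edge rows and tags each one with a relationship type built from the book identifier. The sample rows, the helper name `edges_for_book` and the tuple layout are my own assumptions for illustration; only the column names and the `INTERACTS_` prefix come from the query above.

```python
import csv
import io

# Hypothetical sample rows using the same columns the import query reads
# (Source, Target, weight). The names and weights are made up.
SAMPLE_CSV = """Source,Target,weight
A,B,3
B,C,5
"""


def edges_for_book(book, csv_text):
    """Return (source, relationship_type, target, weight) tuples for one book."""
    rel_type = "INTERACTS_" + book  # same naming scheme as the import query
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["Source"], rel_type, row["Target"], float(row["weight"]))
        for row in reader
    ]


# Reuse the same sample text for every book, just to show the type names.
edges = []
for book in ["1", "2", "3", "45"]:
    edges.extend(edges_for_book(book, SAMPLE_CSV))
```

Because the type name is assembled from a string at run time, plain Cypher cannot express it directly, which is why the import goes through APOC.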

## Categorical pageRank

As described in my previous blog post, categorical pageRank is a concept where we break down the global pageRank into categories and run pageRank on each category subset of the graph separately to get a better understanding of the global pageRank.

Here we will use books as categories, so that we get a breakdown of each character's importance by the sequence of the books.

UNWIND ['1','2','3','45'] AS sequence
MERGE (book:Book{sequence:sequence})
WITH book, sequence
CALL algo.pageRank.stream(
  'MATCH (p:Person) WHERE (p)-[:INTERACTS_' + sequence + ']-() RETURN id(p) AS id',
  'MATCH (p1:Person)-[:INTERACTS_' + sequence + ']-(p2:Person) RETURN id(p1) AS source, id(p2) AS target',
  {graph:'cypher'})
YIELD node, score
// filter out nodes with the default pageRank score,
// which is assigned to nodes with no incoming relationships
WITH node, score, book WHERE score > 0.16
MERGE (node)<-[p:PAGERANK]-(book)
SET p.score = score
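If you want to see the idea without a database, the following is a rough Python sketch of categorical pageRank on a toy graph: pageRank is simply run once per category on that category's edges only. The update rule (an unnormalized score of `(1 - d) + d * sum(...)` over undirected neighbors), the fixed iteration count and the toy edges are assumptions on my part, not a description of how the Neo4j procedure is implemented.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=20):
    """Unnormalized pageRank on an undirected edge list (assumed variant)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        # Treat every interaction as undirected.
        neighbors[a].add(b)
        neighbors[b].add(a)
    scores = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        # The comprehension reads the previous scores, so all nodes
        # are updated at the same time.
        scores = {
            node: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in nbrs)
            for node, nbrs in neighbors.items()
        }
    return scores


def categorical_pagerank(edges_by_category):
    """Run pageRank separately on the edges of each category."""
    return {
        category: pagerank(edges)
        for category, edges in edges_by_category.items()
    }


# Hypothetical toy data: two "books" with different interaction networks.
edges_by_book = {
    "1": [("A", "B"), ("A", "C")],
    "2": [("B", "C")],
}
result = categorical_pagerank(edges_by_book)
```

In the toy data, A is the hub of the first book and does not appear at all in the second, which is exactly the kind of per-category difference a single global pageRank would hide.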

### Biggest winner of the game of thrones by books so far

We will order each character's pageRank values by the sequence of the books, compare the first score with the last one, and return the ten characters with the highest positive change in pageRank.

MATCH (person:Person)<-[pagerank:PAGERANK]-(book:Book)
// order by book sequence
WITH person, pagerank, book ORDER BY book.sequence ASC
WITH person, collect(pagerank) AS scores
RETURN person.id AS person,
       scores[0].score AS first_score,
       scores[-1].score AS last_score,
       size(scores) AS number_of_books
ORDER BY last_score - first_score DESC
LIMIT 10
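The same first-versus-last comparison can be sketched in a few lines of Python. The function name `biggest_winners` and the second character are hypothetical; the two Victarion Greyjoy scores are the ones reported in this post.

```python
BOOK_ORDER = ["1", "2", "3", "45"]


def biggest_winners(scores_by_person, limit=10):
    """Rank characters by (last score - first score), largest gain first."""
    rows = []
    for person, by_book in scores_by_person.items():
        # Keep only the books the character has a score for, in book order.
        ordered = [by_book[b] for b in BOOK_ORDER if b in by_book]
        if not ordered:
            continue
        rows.append((person, ordered[0], ordered[-1], len(ordered)))
    rows.sort(key=lambda row: row[2] - row[1], reverse=True)
    return rows[:limit]


scores = {
    # Scores for books 2 and 4+5 as reported in this post.
    "Victarion Greyjoy": {"2": 0.59, "45": 4.43},
    # Made-up character, only here for contrast.
    "Hypothetical Character": {"1": 1.0, "2": 1.2, "3": 0.9, "45": 1.1},
}
winners = biggest_winners(scores)
```

Note that a character who is missing from a book is simply skipped, so the comparison is always between the first and the last book in which the character has a score.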

While Jon Snow leads by absolute positive difference in pageRank, Victarion Greyjoy is very interesting. He had a pageRank score of 0.59 in the second book, was missing from the third, and jumped to 4.43 in the fourth and fifth books.

Stannis Baratheon is probably at the peak of his career judging by the show and is, surprisingly, in second place. Other than that, the list is made up of the usual suspects.

I also checked the characters with the biggest negative change, but it turns out that they are mostly dead so it’s not all that interesting.

## Spoonjs

Thanks to Michael Hunger, spoonJS is back. With it we can visualize charts directly in the Neo4j Browser.

You can set it up in a few clicks by following the guide.

:play spoon.html

In our example we will visualize characters sorted by pageRank in the last two books combined.

MATCH (p:Person)<-[r:PAGERANK]-(:Book{sequence:'45'}) RETURN p.id as person,r.score as score ORDER BY score DESC LIMIT 15

Three out of the first four places belong to the Lannisters, with the most important being Tyrion. Looking at GoT from this perspective, you might think it's really just a family crisis of the Lannisters with huge collateral damage 🙂

## 3d force graph

Another cool visualization project by Michael is called 3d-force-graph. It lets us visualize and explore graphs.

We will use pageRank to define the size of the nodes, so that the most important nodes will be the biggest. To represent communities in the graph we use the color of the nodes.

We need to run the label propagation or Louvain algorithm to find communities within our graph and store them as a property of the nodes.

We run label propagation using only the network of characters from the last two books.

CALL algo.labelPropagation('Person','INTERACTS_45','BOTH',{partitionProperty:'lpa',iterations:10})
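For intuition, here is a small Python sketch of label propagation on a toy graph: every node starts with its own label and repeatedly adopts the label with the largest total edge weight among its neighbors. To keep the result reproducible I made two assumptions that may well differ from the Neo4j procedure: nodes are visited in sorted order, and ties go to the alphabetically smallest label. The graph itself is made up.

```python
from collections import defaultdict


def label_propagation(weighted_edges, iterations=10):
    """Weighted label propagation with deterministic tie-breaking (assumed)."""
    neighbors = defaultdict(dict)
    for a, b, weight in weighted_edges:
        neighbors[a][b] = weight
        neighbors[b][a] = weight
    labels = {node: node for node in neighbors}  # each node starts alone
    for _ in range(iterations):
        changed = False
        for node in sorted(neighbors):
            # Sum the edge weights behind each neighboring label.
            votes = defaultdict(float)
            for nbr, weight in neighbors[node].items():
                votes[labels[nbr]] += weight
            best = max(votes.values())
            new_label = min(l for l, v in votes.items() if v == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break  # labels are stable, stop early
    return labels


# Two tightly connected triangles joined by one weak bridge (c - x).
toy_edges = [
    ("a", "b", 5.0), ("a", "c", 5.0), ("b", "c", 5.0),
    ("x", "y", 5.0), ("x", "z", 5.0), ("y", "z", 5.0),
    ("c", "x", 1.0),
]
labels = label_propagation(toy_edges)
```

On this toy graph the two triangles end up with different labels, because the weak bridge never outvotes the strong links inside each triangle.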

I like this visualization because it is 3D: you can approach the graph from different angles and zoom in and out while exploring it.
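As a rough sketch of the data preparation behind such a picture, the Python below combines a centrality score and a community label per node into a nodes-and-links JSON document. The field names (`size`, `community`, `source`, `target`), the function name and the sample values are all hypothetical; check the visualization library you use for the property names it actually expects.

```python
import json


def to_graph_json(scores, communities, edges):
    """Build a nodes/links JSON string; all field names are hypothetical."""
    nodes = [
        {"id": node, "size": scores[node], "community": communities[node]}
        for node in sorted(scores)
    ]
    links = [{"source": a, "target": b} for a, b in edges]
    return json.dumps({"nodes": nodes, "links": links})


# Made-up scores and community labels for three characters.
graph_json = to_graph_json(
    scores={"A": 2.5, "B": 1.0, "C": 0.7},
    communities={"A": 1, "B": 1, "C": 2},
    edges=[("A", "B"), ("B", "C")],
)
```

The point is only the mapping: centrality drives node size, and the community label drives node color.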

## Neovisjs

We can also use neovis.js, developed by William Lyon, to visualize graphs. As before, we use the label propagation results to color the nodes. To mix it up a bit, we will use the betweenness centrality of the nodes, instead of pageRank, to determine their size in the graph.

Run the betweenness centrality algorithm.

CALL algo.betweenness('Person','INTERACTS_45',{direction:'BOTH',writeProperty:'betweenness'})
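If betweenness centrality is new to you, this Python sketch computes it on a toy graph using Brandes' algorithm for unweighted, undirected graphs. It counts, for every node, the shortest paths between other pairs of nodes that pass through it. Ignoring edge weights and halving the totals so that each pair is counted once are my assumptions; absolute values from the Neo4j procedure may be scaled differently.

```python
from collections import defaultdict, deque


def betweenness(edges):
    """Brandes' betweenness for an unweighted, undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    scores = {node: 0.0 for node in neighbors}
    for source in neighbors:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        preds = {node: [] for node in neighbors}
        paths = {node: 0 for node in neighbors}
        dist = {node: -1 for node in neighbors}
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbors[v]:
                if dist[w] < 0:  # first time we reach w
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes and accumulate dependencies.
        dependency = {node: 0.0 for node in neighbors}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Every undirected pair was visited from both ends, so halve the totals.
    return {node: score / 2 for node, score in scores.items()}


# Toy path graph: a - b - c - d
centrality = betweenness([("a", "b"), ("b", "c"), ("c", "d")])
```

In the path graph, the two inner nodes sit on shortest paths between other nodes while the two end nodes sit on none, which is why betweenness highlights bridges between parts of a network rather than well-connected hubs.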

In the visualization we also defined the size of the relationships based on the weight and colored them according to the community of the pair of nodes they are connecting.

Code for Neovis and 3d-force-graph visualization used in the post can be found on github. Have fun!