Foreign Keys as an Incidence Matrix: A High Level Look

In a previous post, we took a multigraph of foreign keys in the AdventureWorks database in SQL Server 2008 and manipulated it into a simple graph. This yielded a high level view of the foreign keys in a database. A perfect supplementary diagram to any database diagram. We were also able to derive some interesting and potentially useful facts regarding foreign keys in the database such as the graph center and graph periphery.

The graph from our previous post is below:


It really is a pretty constellation of vertices and edges. It’s possible to go a few steps further and manipulate this captured information in other ways. For example, one could create an incidence matrix of the graph. An incidence matrix where the columns are the edges and the rows are the vertices of a graph. If a vertex i is connected to an edge j, then the i,j entry of the incidence matrix is 1, otherwise, it’s marked 0.

In our previous example, we created a graph named fkgraph by executing:

fkgraph = Graph[querygraph]


To retrieve an incidence matrix from this graph, we may use the IncidenceMatrix function in Mathematica. Once executed, we may execute:

IncidenceMatrix[fkgraph]//MatrixForm.

If we do not utilize the PostFix function (//) to format IncidenceMatrix function with the function MatrixForm, we will get a sparse array like below.


A partial output of the desired output is captured below:


If we look above at the SparseArray output, we can immediately see that the number of rows of the incidence matrix is sixty-seven. That is, there are sixty-seven points in the graph. Consequentially, we know that there are sixty-seven tables with foreign key relationships in the AdventureWorks database as each point represents a table. There are simpler ways of deriving that information but as one proceeds to more abstract layers, it’s good to remember what one is actually working with.

What about the matrix itself? What is the rank of the matrix? To understand the rank of this matrix, we will use the Mathematica function MatrixRank.

MatrixRank[IncidenceMatrix[fkgraph]]

The output is:

Sixty-seven. Who knew? It’s a well known fact that the column rank and the row rank of a matrix are equal. Sixty-seven is the number of rows (points) in the matrix and eighty-six is the number of columns (edges). This would imply that the difference between total columns and the size of the column space is the number of dependent columns. Dependent columns are, in this representation, different choices one could make to JOIN one table with another.

The matrix itself is rather overwhelming. To get a better at a glance view of what the matrix looks like, we can use the MatrixPlot function.


The orange squares represent the value of one and the white of zero. In this sense, we have a rough understanding of the dispersion of ones and zeroes throughout the matrix.

I hope I’ve introduced enough tools in Mathematica to get one started with manipulating datasets and beginning to derive a deeper understanding regarding their properties. The topic is very deep and there’s much to learn and discover.