The AdventureWorks Database as a Transversal Matroid: Part I – Generating a Bipartite Graph

In previous posts, we took the foreign keys present in the AdventureWorks2008R2 database and used them to represent the database as a multigraph, a simple graph, and an incidence matrix. We even calculated properties such as graph radius and periphery. Another way to investigate relationships between objects is to present them as a matroid (or combinatorial pre-geometry).

To start, the simple graph of the finite key relationships present in the AdventureWorks database is already a cycle matroid. This is a very well known result as all finite graphs are cycle matroids. The independent sets in a cycle matroid are the edges that do not form a closed path.  Arguably, the cycle matroid derived from the AdventureWorks database graph may have some interesting properties.


Instead of investigating the properties of this cycle matroid, we will investigate the AdventureWorks database as a transversal matroid. The definition of a transversal matroid in “Matroid Theory” by James Oxley is as follows:

A way to visualize transversals is via bipartite graphs:


The AdventureWorks database actually works as a great candidate for a transversal matroid. The family of subsets of the AdventureWorks database is the schemas. The AdventureWorks2008R2 database has the following schemas: dbo, HumanResources, Person, Production, Purchasing, and Sales. It’s easy to query the database for its schemas:

SELECT DISTINCT SCHEMA_NAME(schema_id) FROM sys.tables

The expected output is shown below.


To visualize the relationship between tables and schemas, we will need to create a bipartite graph of the relationship. We will create a connection to the database the usual way. Once that is accomplished, we will need to run the following command:


The reason we use CASE in the T-SQL statement is because there’s a table named Person too. So to take care of the fact that there’s a schema and a table of the same name, we just rename it. An partial listing of the output is show below:

We will need to combine the two columns into something Mathematica will like to work with for graphing.

I tried having the SQL query seamlessly put out columns of the form “SpecialOffer” -> “Sales”. While the query executed the way one would expect in SSMS, the parser Mathematica uses choked on it. So here we are, inelegantly combining the results. It is what it is. Perhaps I’ll find the trick.

We apply the custom function Combine to the query result set.


Finally, it’s time to see something worth looking at. The command below:


Returns:


To see the entire graph, click on the image. It’s very large.

While we’ve generated a great looking image of the relationship between schemas and tables in the AdventureWorks database, we have not yet described the matroid yet. In the second part of this series, we will generate the maximal independent sets to describe this matroid. I’m looking forward to it!