By James Schintz, Senior Consultant

I was recently introduced to http://parkscore.tpl.org, a great site for comparing metropolitan areas. What really excited me about its visuals is how they combine sophisticated geospatial analysis with traditional descriptive visualizations, not only to tell a story but also to encourage exploratory analysis.

This analysis was made possible by shapefiles. Given that QlikView, Spotfire, and Tableau users are all clamoring for geospatial support, I thought it would be good to explore this widely used data format. The shapefile format stores simple feature geometry as vectors, allowing users to define custom regions. At its core, a shapefile is like any other tabular data, only with an additional geometry column, which for regions contains polygon shapes.
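To make the "table plus geometry column" idea concrete, here is a minimal sketch in plain Python. It does not read a real shapefile; the `Feature` class, the `region_id` and `name` attributes, and the coordinates are all made up for illustration. Each row carries ordinary attributes alongside a polygon outline, and the outline can be measured like any other column value.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """One row of a shapefile-like table: attributes plus a geometry."""
    attributes: dict  # ordinary tabular columns
    ring: list        # polygon outline as (x, y) vertices (hypothetical data)


def ring_area(ring):
    """Area enclosed by a simple polygon outline, via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(ring):
        x2, y2 = ring[(i + 1) % len(ring)]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Two illustrative regions: a 4 x 3 rectangle and a right triangle.
features = [
    Feature({"region_id": "R1", "name": "North"}, [(0, 0), (4, 0), (4, 3), (0, 3)]),
    Feature({"region_id": "R2", "name": "South"}, [(0, 0), (2, 0), (0, 2)]),
]

for feature in features:
    print(feature.attributes["name"], ring_area(feature.ring))
```

In practice a GIS library or your BI tool's map extension handles the file parsing; the point here is only that the geometry travels with the row.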

Deciding whether a shapefile is the best option for your BI solution is largely a matter of ease of use. The shapefiles discussed here represent areas on a map, so they will not provide address-specific XY coordinate points. The easiest way to get started is to use shapefiles that are publicly available. TIGER shapefiles are the most widely known. Bardess can help you create custom shapefiles that represent your company’s sales regions, for example, or even the floor plan of a building. Behind the scenes in most maps, multiple shapefiles are layered, giving users different views depending on the zoom level.
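If you do have address-level XY points from another source, a common way to relate them to region polygons is a point-in-polygon test. The sketch below uses the standard ray-casting approach on simple outlines. The `regions` dictionary and its coordinates are hypothetical, and a real implementation would also need to handle holes and multi-part shapes.

```python
def point_in_ring(x, y, ring):
    """Ray casting: count how many edges a ray from (x, y) toward +x crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def locate(x, y, regions):
    """Return the name of the first region containing the point, or None."""
    for name, ring in regions.items():
        if point_in_ring(x, y, ring):
            return name
    return None


# Hypothetical sales regions as simple square outlines.
regions = {
    "West": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "East": [(10, 0), (20, 0), (20, 10), (10, 10)],
}

print(locate(5, 5, regions))
print(locate(15, 5, regions))
print(locate(25, 5, regions))
```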

Once you have the shapefiles you need, the next step is to integrate them with your original data set. Ideally, both your map and your descriptive visualizations should update automatically as users make selections. Associating the underlying data between your shapefile and your original data set is what makes that interactivity possible. In many cases, this amounts to a simple join between the shapefile and the proprietary data on a shared key, such as a region identifier.
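As a rough illustration of that join, the sketch below aggregates a proprietary sales table by region and attaches the totals to the shapefile's attribute rows. The column names (`region_id`, `amount`, `total_sales`) and the data are assumptions made up for this example; your BI tool would normally perform the association for you once the key fields match.

```python
from collections import defaultdict

# Attribute rows as they might come out of a shapefile (hypothetical).
shape_rows = [
    {"region_id": "R1", "name": "North"},
    {"region_id": "R2", "name": "South"},
    {"region_id": "R3", "name": "West"},
]

# Proprietary data keyed by the same region identifier (hypothetical).
sales = [
    {"region_id": "R1", "amount": 120.0},
    {"region_id": "R1", "amount": 80.0},
    {"region_id": "R2", "amount": 50.0},
]


def join_sales_to_shapes(shape_rows, sales, key="region_id"):
    """Left join: keep every shape, attach its summed sales (0.0 if none)."""
    totals = defaultdict(float)
    for row in sales:
        totals[row[key]] += row["amount"]
    return [
        {**shape, "total_sales": totals.get(shape[key], 0.0)}
        for shape in shape_rows
    ]


joined = join_sales_to_shapes(shape_rows, sales)
for row in joined:
    print(row["name"], row["total_sales"])
```

A left join is used so that regions with no sales still appear on the map rather than silently disappearing.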

As a final warning for those just beginning to use shapefiles: the first step with any shapefile is always to determine its coordinate system. This information, available in the metadata or published by the source’s author, ensures that the vectors or points from multiple layers align properly.
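A shapefile's coordinate system is usually described in well-known text (WKT) in a `.prj` file that sits next to it. The sketch below is a rough first check that several layers declare the same system by comparing the leading keyword and name in each WKT string. The layer names and sample strings are illustrative, written in the style of Esri `.prj` files, and comparing names is only a heuristic: a GIS library is the right tool for a rigorous comparison or for reprojecting.

```python
import re


def crs_name(wkt):
    """Return (kind, name) from the head of a WKT coordinate system string."""
    match = re.match(r'\s*([A-Z_]+)\s*\[\s*"([^"]+)"', wkt)
    if match is None:
        raise ValueError("not a recognizable WKT coordinate system definition")
    return match.group(1), match.group(2)


def layers_share_crs(prj_texts):
    """True if every layer's .prj text declares the same kind and name."""
    names = {crs_name(text) for text in prj_texts.values()}
    return len(names) == 1


# Illustrative .prj contents for two hypothetical layers.
NAD83 = ('GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",'
         'SPHEROID["GRS_1980",6378137,298.257222101]],'
         'PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]')
WGS84 = ('GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",'
         'SPHEROID["WGS_1984",6378137.0,298.257223563]],'
         'PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]')

layers = {"counties": NAD83, "sales_regions": WGS84}

print(crs_name(NAD83))
print(layers_share_crs(layers))
print(layers_share_crs({"counties": NAD83, "tracts": NAD83}))
```

When the check fails, one of the layers needs to be reprojected before the two are overlaid.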

 

About the Author

James Schintz is a Senior Consultant at Bardess Group Ltd. With experience in both data analysis and data management, he has worked in the financial, marketing, and public sectors. Drawing on rigorous statistical techniques, machine learning, and geospatial analysis, he incorporates advanced analytics to promote better decision-making processes.