At the heart of data analysis lies the need to understand the real-world entities represented in the data. Every data set we encounter is an attempt to capture a slice of our complex world and communicate something about it in a way that has the potential to inform humans, machines, or both. Moving from basic analyses to advanced analytics requires the ability to imagine multiple ways of conceptualizing the composition of the entities and the relationships present in our data. It also requires the realization that different levels of aggregation, disaggregation, and transformation can open up new pathways to understanding our data and identifying the valuable insights it contains.
In this talk, we’ll discuss several ways to think about the composition and representation of our data. We’ll also demonstrate a series of methods that use networks, hierarchical aggregations, and unsupervised clustering to visually explore our data, transform it to discover new insights, frame analytical problems and questions, and even improve machine learning model performance. In exploring these approaches with the help of Python libraries such as Pandas, Scikit-Learn, Seaborn, and Yellowbrick, we will provide a practical framework for thinking creatively and visually about your data and unlocking the latent value beneath its surface.
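
To give a flavor of what changing the level of aggregation can look like, here is a minimal sketch using Pandas and Scikit-Learn. It is not taken from the talk itself: the `transactions` table, its column names, and the choice of k-means with two clusters are all assumptions made for illustration. The idea is simply that the same rows can describe purchases or, once aggregated, customers, and that the aggregated view can then be clustered.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical purchase-level data: each row is one transaction.
transactions = pd.DataFrame({
    "customer": ["a", "a", "b", "b", "c", "c", "d", "d"],
    "category": ["food", "tech", "food", "food", "tech", "tech", "food", "tech"],
    "amount": [10.0, 200.0, 12.0, 8.0, 250.0, 300.0, 15.0, 220.0],
})

# Aggregate one level up: the entity is now the customer, not the purchase.
customers = transactions.groupby("customer").agg(
    total_spend=("amount", "sum"),
    tech_share=("category", lambda s: (s == "tech").mean()),
)

# Standardize the customer-level features, then group similar customers.
# k-means with two clusters is an illustrative choice, not a recommendation.
features = StandardScaler().fit_transform(customers)
customers["segment"] = KMeans(
    n_clusters=2, n_init=10, random_state=0
).fit_predict(features)

print(customers)
```

The resulting `segment` column could in turn be plotted with Seaborn or fed back into a supervised model as an extra feature, which is one way an aggregated view of the data might improve model performance.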