Unstructured data is everywhere: in posts, status updates, bloglets, and news feeds on social media, and in customer interactions recorded in call center CRM systems. Many organizations monitor social media to track brand value and target specific customer segments. In our experience, however, blending unstructured data with structured data to supplement data science models has been far more effective than working with the unstructured data on its own.
In this talk we will showcase an end-to-end topic and sentiment analysis pipeline that we built on the Pivotal Greenplum Database platform for Twitter feeds from GNIP, using open source tools such as MADlib and PL/Python. We have used this pipeline to build regression models that predict commodity futures from tweets, and to enhance churn models for telecom through topic and sentiment analysis of call center transcripts. All of this was possible because of the flexibility and extensibility of the platform we worked with.
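To make the idea of blending unstructured and structured data concrete, here is a minimal Python sketch. It is not the pipeline from the talk: the toy word lists, the function names (`sentiment_score`, `blend_features`), and the sample customer fields are all assumptions made up for illustration. It scores each text with a simple lexicon count and attaches the average score to a structured record as an extra model feature.

```python
import re
from statistics import mean

# Hypothetical toy lexicon, for illustration only; a real pipeline would
# rely on a curated dictionary or a trained sentiment model.
POSITIVE = {"good", "great", "love", "happy", "bullish"}
NEGATIVE = {"bad", "terrible", "hate", "angry", "bearish"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return (positive hits - negative hits) / token count, or 0.0 if empty."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def blend_features(structured_row, texts):
    """Copy a structured record and add text-derived features to it."""
    scores = [sentiment_score(t) for t in texts]
    blended = dict(structured_row)  # leave the caller's record untouched
    blended["avg_sentiment"] = mean(scores) if scores else 0.0
    blended["n_texts"] = len(texts)
    return blended


if __name__ == "__main__":
    # Hypothetical structured record plus two call transcripts.
    customer = {"customer_id": 1, "tenure_months": 12}
    transcripts = ["great service today", "terrible wait, I hate this"]
    print(blend_features(customer, transcripts))
```

The blended record could then be fed to a regression or churn model alongside the original structured columns. In a database setting, similar scoring logic might sit inside a PL/Python function so that it runs next to the data, but that wiring is not shown here.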
Why do we care about apps as well as data? By ‘apps’ we mean enterprise and cloud applications and how they are built. Pivotal has said a lot about data in public, but we care about apps just as much. Building on the strengths of vFabric and Spring, Pivotal will continue to enable customers to build the applications they need. Applications are how our customers offer many new products and services today. Apps can accelerate customer interactions and realize value from data by presenting it to users in a meaningful way. With tools like Spring and Cloud Foundry, we can make ‘big data’ comprehensible and ‘easy’ for developers, and hence for enterprises. And of course users generate data, sensors generate data, and phones generate data, but much of this data comes from some sort of application.