big data data science machine learning hadoop cascading spark mesos scalding cascalog nlp python jupyter scala use cases enterprise data workflows ai textrank streaming twitter cluster computing open data pmml aws cloud computing text analytics r active learning graph algorithms approximation algorithms case studies ipython notebook functional programming management human-in-the-loop learning docker mesosphere clojure o'reilly media publishing real-time analytics sql knime advanced math distributed systems google predictive modeling java disambiguation ontology open source scikit-learn chicago history apache hadoop analytics networkx datasketch spacy deep learning content discovery media video computable content inverted classroom education graphx community certification mooc graph queries abstract algebra datacenter computing marathon linux low latency graph theory airbnb linux containers isolation borg mathematics statistics portland sas ansi sql palo alto mapreduce algorithms enterprise redis gephi business strategy social media knowledge graph search learning experiences nike nginx kaltura best practices literate programming summarization standards pfa accountability governance avro recommender systems social context kubernetes learning curve continuous learning computational thinking philosophy parquet thebe json oscon notebooks brazil sao paulo qcon iot paco nathan pagerank probabilistic data structures system architecture business stanford functio cluster scheduling quasar probabilistic programming chronos cgroups omega mbrace augustus julia mlbase summingbird titan genetic programming metascale sears chug virtualization university of chicago ensembles kdd hadoop summit windows azure texas pattern language predictive models optimization tdd optiq application layer enterprise architecture splunk bigdata tf-idf data analysis pentaho imvu continuous deployment emr enron infochimps datameer