This deck covers the goal of building an NLP text recommender that surfaces relevant answers for customer service agents; the approach taken, including the features behind an ML ranking model and the architecture for serving recommendations and training models; and the system's evolution across versions toward multi-tenancy, dynamic training, and rollbacks.
6. Business Metrics
● Agent Time to Resolution
● Agent Time spent per case
● Case-Article Attach Rate
● # of recommendations served
● MAO, MAU
● Serving Latency
10. Data Prep & Feature Engineering
● Multi-tenant data ingestion pipeline
● Data Cleansing and Sanity checks
● Precompute TDF, Corpus Statistics
● Feature Vectors computation
● 100+ NLP features across different statistical feature categories
● Serving/Training drift checks
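The "precompute corpus statistics, then compute feature vectors" step above can be sketched roughly as follows. This is an illustrative reconstruction, not the talk's actual pipeline: document frequencies and IDF weights are computed offline over the corpus, so that per-request feature vectors only need cheap lookups at serving time. All function names and the smoothing scheme are assumptions.

```python
import math
from collections import Counter

def precompute_idf(corpus: list[list[str]]) -> dict[str, float]:
    """Offline pass: document frequency -> smoothed IDF for every term."""
    n_docs = len(corpus)
    df = Counter(term for doc in corpus for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_vector(doc: list[str], idf: dict[str, float]) -> dict[str, float]:
    """Online pass: sparse TF-IDF vector; unseen terms get the max IDF."""
    tf = Counter(doc)
    default_idf = max(idf.values(), default=1.0)
    return {t: (c / len(doc)) * idf.get(t, default_idf) for t, c in tf.items()}

# Toy corpus of tokenized support articles (illustrative only).
corpus = [["reset", "password"], ["password", "expired"], ["billing", "issue"]]
idf = precompute_idf(corpus)
vec = tfidf_vector(["reset", "password"], idf)
```

Precomputing the IDF table is what makes feature-vector computation cheap enough for serving-time latency budgets.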
11. Model Training
● Ranking Model
● Auto-tuned hyperparameters
● Automated model comparison
● Metrics
○ AUC
○ F-Measure
○ Precision, Recall
○ Hit Rate @K
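Of the metrics listed, Hit Rate @K is the most recommendation-specific: it asks whether at least one relevant article appears in the top K results. A minimal sketch (case data and IDs are invented for the example):

```python
def hit_rate_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int = 3) -> float:
    """1.0 if any relevant article appears in the top-k recommendations."""
    return 1.0 if any(a in relevant_ids for a in ranked_ids[:k]) else 0.0

# Averaged over evaluation cases, this yields the offline Hit Rate @K.
cases = [
    (["a1", "a2", "a3", "a4"], {"a3"}),  # hit at rank 3
    (["a5", "a6", "a7", "a8"], {"a8"}),  # relevant article outside top 3
]
score = sum(hit_rate_at_k(r, rel, k=3) for r, rel in cases) / len(cases)
```

Unlike AUC or precision/recall, Hit Rate @K directly mirrors the agent experience: only the handful of recommendations actually shown matter.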
13. ML System Evolution
version 0
● Heuristic based answer recommendations POC. First pilot sign up.
● Communities use case: community-selected bestAnswer as the positive label
● Generic model trained on the open-source Stanford SQuAD dataset
version 1
● Ranking model: <question, answer> pairwise probability
● Notebook-based on-demand training
● Statically configured data filtering
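The version-1 ranker's "<question, answer> pairwise probability" can be sketched as a logistic model that scores each candidate pair and sorts answers by predicted relevance. This is an illustrative simplification; the weights, feature names, and article IDs below are made up.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def pair_probability(features: dict[str, float], weights: dict[str, float]) -> float:
    """P(answer is relevant | question), from pairwise features."""
    z = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return sigmoid(z)

# Hypothetical learned weights over a few pairwise features.
weights = {"token_overlap": 2.0, "tfidf_cosine": 3.0, "length_ratio": -0.5}
candidates = {
    "kb-101": {"token_overlap": 0.6, "tfidf_cosine": 0.8, "length_ratio": 1.2},
    "kb-202": {"token_overlap": 0.1, "tfidf_cosine": 0.2, "length_ratio": 0.9},
}
ranked = sorted(candidates,
                key=lambda a: pair_probability(candidates[a], weights),
                reverse=True)
```

Scoring pairs independently keeps serving simple: candidates can be featurized and ranked in one pass per incoming question.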
14. ML System Evolution
version 2
● Dynamically configured training dataset attributes
● Model retraining
● Multilingual Support
● Multitenant Auto-trained models
● Observability
● Trained Model Deployments & Rollbacks
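The jump from version 1's static filters to "dynamically configured training dataset attributes" can be sketched as config-driven row selection: each tenant's training job reads a runtime config describing which records qualify, instead of hard-coded rules. Field and class names here are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class TrainingDataConfig:
    """Per-tenant, runtime-configurable dataset attributes (hypothetical)."""
    min_question_tokens: int = 3
    languages: tuple[str, ...] = ("en",)
    require_attached_article: bool = True

def select_training_rows(rows: list[dict], cfg: TrainingDataConfig) -> list[dict]:
    """Apply the configured filters to raw case data before training."""
    out = []
    for row in rows:
        if len(row["question"].split()) < cfg.min_question_tokens:
            continue
        if row["lang"] not in cfg.languages:
            continue
        if cfg.require_attached_article and not row.get("article_id"):
            continue
        out.append(row)
    return out

rows = [
    {"question": "how do I reset my password", "lang": "en", "article_id": "kb-1"},
    {"question": "help", "lang": "en", "article_id": "kb-2"},            # too short
    {"question": "wie setze ich mein passwort zurück", "lang": "de",
     "article_id": "kb-3"},                                              # non-configured language
]
kept = select_training_rows(rows, TrainingDataConfig())
```

The same mechanism supports the multilingual and multi-tenant bullets: widening `languages` per tenant changes the training set without a code change.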
17. Challenges
● Data
○ Privacy and sharing compliance – GDPR, HIPAA, Accessibility
○ Freshness / Hydration
○ Handling encrypted data at rest and in motion
○ Too sparse, not meeting thresholds
○ Too dense, training performance SLA not met
● Custom, non-standard fields and datatypes
● Building ML Infrastructure along the way
● Training Serving Skew
● Cold start problem
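Training/serving skew, one of the challenges above, is typically caught by comparing feature statistics from the training pipeline against those logged at serving time. A minimal drift check might look like this; the tolerance and the sample values are invented for illustration.

```python
import statistics

def skew_alert(train_values: list[float], serve_values: list[float],
               rel_tol: float = 0.10) -> bool:
    """True if the serving-time mean of a feature drifts more than
    rel_tol (relative) from its training-time mean."""
    train_mean = statistics.fmean(train_values)
    serve_mean = statistics.fmean(serve_values)
    return abs(serve_mean - train_mean) > rel_tol * abs(train_mean)

# Hypothetical values of one feature (e.g. a tfidf_cosine score).
train = [0.42, 0.40, 0.38, 0.41]
serve_ok = [0.43, 0.39, 0.40]
serve_bad = [0.60, 0.58, 0.62]  # e.g. a tokenizer mismatch at serving time
```

Real systems compare full distributions rather than means, but even this coarse check can flag the common failure mode where serving-side feature code silently diverges from the training pipeline.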
18. Takeaways
● Start small, Ship and Iterate
● Prioritize ML infrastructure
● Start with simple interpretable models
● Scale model learning to the size of your data
● Prioritize Observability
● Prioritize Data privacy over model quality