Jenkins is an open-source automation server. It is generally used for automated builds, tests, and deployments. This, together with its ability to trigger actions on various types of events, makes it a good CI/CD platform. Because builds can also be triggered on a schedule, it may work as a scheduling tool as well.
Accompanying Blog: https://blog.anant.us/data-engineers-lunch-15-introduction-to-jenkins/
Accompanying YouTube: https://youtu.be/3joAU4PJmhs
2. Jenkins Overview
● Jenkins is …
○ An open source automation server
■ Automated builds
■ Automated tests
■ Automated deployment
○ A CI/CD platform
■ Build triggers on code base update
○ A scheduling tool?
■ Build triggers on schedule
● Works on Windows, Linux, Mac, Docker, Kubernetes
4. Jenkins Plugins
● Added functionality
○ Can write and upload custom plugins
○ Option to install the most commonly used plugins during initial setup
○ Can install, remove, and upgrade plugins from the UI
6. Jenkins Pipeline (2)
● Define a series of stages
○ Executed one after another
■ Potentially using a different build agent for each stage
■ Can execute linear workflows
○ Defined by name in a Jenkinsfile
■ Split into steps (a series of commands)
● Can run command-line commands
○ Ex. sh 'python --version'
● Includes 'wrapper' steps like timeout or retry
■ Includes a 'post' section where steps can be run based on the build outcome
8. Jenkins Build Triggers - Schedule
● Uses cron notation to schedule recurring builds
○ Supports common schedule aliases
■ @hourly, @daily, @midnight, @weekly, @monthly
○ Can define multiple schedules for irregular builds
○ Every <time-period> vs on <time-period>
■ */<number> or H/<number> builds every <time period>
● Ex. H/20 * * * * = every 20 minutes (H picks a per-job offset to spread load)
■ <number> builds at that time
● Ex. 20 * * * * = 20 minutes into every hour
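
The "every" vs "on" distinction can be illustrated with a minimal Python sketch. It is not Jenkins code: `minutes_matched` is a hypothetical helper that handles only `*`, `*/N`, and a single number in the minute field. Jenkins' `H` hashing is left out, since the offset it picks depends on the job.

```python
from typing import List


def minutes_matched(field: str) -> List[int]:
    """Return the minutes (0-59) matched by a simplified cron minute field.

    Supports only '*', '*/N', and a single number. The Jenkins-specific
    'H' token is intentionally not modeled in this sketch.
    """
    if field == "*":
        return list(range(60))
    if field.startswith("*/"):
        step = int(field[2:])
        # A step matches every N-th minute, starting from minute 0.
        return list(range(0, 60, step))
    # A bare number matches exactly that minute of each hour.
    return [int(field)]


print(minutes_matched("*/20"))  # [0, 20, 40] -> builds every 20 minutes
print(minutes_matched("20"))    # [20]        -> builds once, 20 minutes into the hour
```

With `H/20`, the interval stays at 20 minutes but the starting minute is shifted by a per-job offset, so the matched minutes would not necessarily be 0, 20, and 40.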
9. Jenkins Build Triggers - Schedule (2)
● Lists and ranges
○ Can list numbers to define multiple specific times to build
■ Ex. 20,40 * * * * = build at the 20 and 40 minute marks every hour
○ Can use ranges to build at every value between two numbers (inclusive)
■ Ex. 0 0 * * 1-5 = build at midnight every Monday-Friday
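
As a rough illustration of how lists and ranges expand, here is a small Python sketch. `expand_field` is a hypothetical helper, not part of Jenkins, and it covers only `*`, comma-separated lists, and `a-b` ranges. It assumes the usual cron convention that day-of-week 0 is Sunday.

```python
from typing import List


def expand_field(field: str, low: int, high: int) -> List[int]:
    """Expand a simplified cron field into the sorted values it matches.

    Supports '*', comma-separated lists, and inclusive 'a-b' ranges.
    Steps ('/N') and the Jenkins 'H' token are not handled here.
    """
    values = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(low, high + 1))
        elif "-" in part:
            start, end = part.split("-")
            # Ranges are inclusive on both ends.
            values.update(range(int(start), int(end) + 1))
        else:
            values.add(int(part))
    return sorted(values)


# Assumed convention: 0 = Sunday through 6 = Saturday.
WEEKDAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]

print(expand_field("20,40", 0, 59))
# [20, 40]
print([WEEKDAYS[d] for d in expand_field("1-5", 0, 6)])
# ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
```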
10. Jenkins Build Triggers - Other Projects
● Can set projects to build after other projects
○ Means we aren’t really limited to linear pipelines
■ Pros
● Can define DAGs of stages, steps
○ One build can trigger only after a number of other builds have completed
○ Multiple builds can trigger after a particular build has completed
■ Cons
● Must split the work into separate projects (at least separate Jenkinsfiles)
● Must define all branches manually through build triggers
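
The DAG that "build after other projects" triggers produce can be sketched in Python. This only models the dependency structure and does not call Jenkins. The project names and the `UPSTREAMS` mapping are hypothetical, and `build_order` is an illustrative helper that lists one order in which the projects could build.

```python
from collections import deque
from typing import Dict, List

# Hypothetical projects: each key builds only after all of its upstream projects.
UPSTREAMS: Dict[str, List[str]] = {
    "extract": [],
    "clean": ["extract"],            # several builds trigger after one build
    "enrich": ["extract"],
    "load": ["clean", "enrich"],     # one build waits for several builds
}


def build_order(upstreams: Dict[str, List[str]]) -> List[str]:
    """Return one valid build order for the trigger graph (Kahn's algorithm)."""
    remaining = {job: set(deps) for job, deps in upstreams.items()}
    ready = deque(sorted(job for job, deps in remaining.items() if not deps))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        # Completing this job may unblock projects that were waiting on it.
        for other in sorted(remaining):
            deps = remaining[other]
            if job in deps:
                deps.remove(job)
                if not deps:
                    ready.append(other)
    if len(order) != len(upstreams):
        raise ValueError("cycle detected in build triggers")
    return order


print(build_order(UPSTREAMS))
# ['extract', 'clean', 'enrich', 'load']
```

In Jenkins itself, every edge in such a graph has to be set up as a build trigger on the downstream project, which is the manual effort listed under Cons.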
11. Comparison to Airflow
● Can both define DAGs of tasks
● Jenkins can’t dynamically define DAGs / Pipelines in code (unless code modifies Jenkinsfiles)
● Airflow builds a DAG from dependencies defined in code; in Jenkins, dependencies must be configured through build triggers
● Both support manual triggering
● Jenkins can’t trigger specific steps, while Airflow API allows running of specific tasks alone
● Jenkins makes it easier to execute arbitrary code; Airflow requires a specific Python import to use shell functionality