Go concurrency
1. Writing Concurrent Programs
robustly and productively with Go and zeromq
18 November 2014
Loh Siu Yin
Technology Consultant, Beyond Broadcast LLP
2. Traditional v. Concurrent
sequential (executed one after the other)
concurrent (executed at the same time)
concurrency -- reflects the way we interact with the real world
4. Concurrent languages
C with pthreads lib
Java with concurrent lib
Scala with actors lib
Scala with Akka lib
Go
Any difference between Go and the other languages?
6. Why Go for concurrent programs?
Go does not need an external concurrency library
Avoids locks, semaphores, critical sections...
Instead uses goroutines and channels
7. Software I use to write concurrent programs:
Go -- Go has concurrency baked into the language.
zeromq -- zeromq is a networking library with multitasking functions.
For more info:
golang.org (http://golang.org)
zeromq.org (http://zeromq.org)
8. Go Hello World program
package main
import "fmt"
func main() {
    fmt.Println("Hello Go!")
}
Why capital P in Println?
Println is not a class (as a capitalized name would suggest in Java).
It is a regular function that is exported from (visible outside) the fmt package.
In Go, a name is exported if it begins with a capital letter.
The "fmt" package provides formatting functions like Println and Printf.
9. Package naming in Go
package main
import "fmt"
func main() {
    fmt.Println("Hello Go!")
}
If you import "a/b/c/d" instead of "fmt", and package "a/b/c/d" exports a function
Println, that function is called as d.Println and not a.b.c.d.Println. (By convention,
the package name matches the last element of the import path.)
If you import "abcd", what is the package qualifier name?
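A small sketch of the naming rule, using only the standard library. The import path "path/filepath" has two elements, yet its functions are called with the qualifier filepath alone. The helper qualifier is a hypothetical name introduced here for illustration, and it relies on the convention that the package name matches the last path element (the name really comes from the package clause in the source).

```go
package main

import (
	"fmt"
	"path/filepath" // import path has two elements...
)

// qualifier returns the last element of an import path, which by
// convention is the name used to qualify that package's exported names.
// (Hypothetical helper for illustration; not part of the original slides.)
func qualifier(importPath string) string {
	return filepath.Base(importPath) // ...but the qualifier is just "filepath"
}

func main() {
	fmt.Println(qualifier("a/b/c/d")) // d
	fmt.Println(qualifier("fmt"))     // fmt
}
```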
10. Go Concurrent Hello World
package main
import (
    "fmt"
    "time"
)
func a() {
    for {
        fmt.Print(".")
        time.Sleep(time.Second)
    }
}
func b() {
    for {
        fmt.Print("+")
        time.Sleep(2 * time.Second)
    }
}
func main() {
    go b() // Change my order
    go a()
    //time.Sleep(4 * time.Second) // uncomment me!
}
11. goroutine lifecycle
goroutines are garbage collected when they end. This means the Go runtime reclaims the
resources used by a goroutine when it ends.
When func main() returns, the program exits and any goroutines still running are
terminated with it.
goroutines can be started from functions other than main. They are not killed when that
function returns. They are killed only when main() returns or the program exits.
The GOMAXPROCS variable limits the number of operating system threads that can
execute user-level Go code simultaneously. [default = 1 in Go 1.4 and earlier; since
Go 1.5 the default is the number of available CPU cores]
With a value of 1 there is no parallel execution;
increase it to 2 or higher for parallel execution if you have 2 or more cores.
golang.org/pkg/runtime (http://golang.org/pkg/runtime/)
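Because goroutines die with main(), a program normally has to wait for them explicitly rather than sleep. The sketch below shows one common way to do that with sync.WaitGroup from the standard library, and also reads the current GOMAXPROCS value. The function name runWorkers is an assumption made for this example and does not appear in the talk.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runWorkers starts n goroutines, blocks until all of them have
// finished, and returns how many completed.
func runWorkers(n int) int {
	var wg sync.WaitGroup
	results := make(chan int, n) // buffered so workers never block on send
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- 1
		}()
	}
	wg.Wait() // without this, returning early could abandon the workers
	close(results)
	done := 0
	for v := range results {
		done += v
	}
	return done
}

func main() {
	// GOMAXPROCS(0) only reports the current setting; it changes nothing.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
	fmt.Println("workers finished:", runWorkers(5))
}
```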
12. Go channels
Go channels provide a type-safe means of communication between:
the main function and a goroutine, or
two goroutines
What is:
a goroutine?
the main function?
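As a minimal illustration of the first case (the main function talking to a goroutine), the sketch below hands work to a goroutine and receives the result over a typed channel. The names sum and sumInBackground are assumptions made for this example.

```go
package main

import "fmt"

// sum adds the numbers and sends the total on out.
func sum(nums []int, out chan<- int) {
	total := 0
	for _, n := range nums {
		total += n
	}
	out <- total
}

// sumInBackground runs sum in a goroutine and waits for its result.
func sumInBackground(nums []int) int {
	out := make(chan int) // unbuffered channel that carries only ints
	go sum(nums, out)
	return <-out // blocks until the goroutine sends
}

func main() {
	fmt.Println(sumInBackground([]int{1, 2, 3})) // 6
}
```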
13. func x()
func x() returns a channel of integers that can only be read from.
Internally it runs a goroutine that emits an integer every 500ms.
type MyInt int
func x() <-chan int {
    ch := make(chan int)
    go func(ch chan<- int) {
        // var i MyInt // Make i a MyInt
        var i int
        for i = 0; ; i++ {
            ch <- i // Send int into ch
            time.Sleep(500 * time.Millisecond)
        }
    }(ch)
    return ch
}
Demo type safety. Use MyInt for i.
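A sketch of what the type-safety demo shows: although MyInt is defined as an int, a MyInt value cannot be sent on a chan int without an explicit conversion. The helpers firstN and collect are assumptions made for this example; unlike x() above, firstN stops after n values and closes its channel so the program terminates.

```go
package main

import "fmt"

type MyInt int

// firstN returns a receive-only channel that yields 0..n-1 and is then closed.
func firstN(n int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch)
		var i MyInt
		for i = 0; i < MyInt(n); i++ {
			// ch <- i   // rejected by the compiler: i is a MyInt, ch carries int
			ch <- int(i) // an explicit conversion is required
		}
	}()
	return ch
}

// collect drains ch into a slice.
func collect(ch <-chan int) []int {
	var got []int
	for v := range ch {
		got = append(got, v)
	}
	return got
}

func main() {
	fmt.Println(collect(firstN(3))) // [0 1 2]
}
```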
14. func y()
func y() returns a channel of integers that can only be written to.
All it does is run a goroutine to print out the integers it receives.
func y() chan<- int {
    ch := make(chan int)
    go func(ch <-chan int) {
        for {
            i := <-ch
            fmt.Print(i, " ")
        }
    }(ch)
    return ch
}
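The direction markers used by x() and y() (<-chan for receive-only, chan<- for send-only) are checked at compile time. The sketch below, with the hypothetical names double and doubleAll, puts both directions in one function signature and closes the channels so that the pipeline ends cleanly.

```go
package main

import "fmt"

// double reads from in (receive-only) and writes to out (send-only).
// Sending on in or receiving from out would be a compile-time error.
func double(in <-chan int, out chan<- int) {
	for n := range in {
		out <- 2 * n
	}
	close(out)
}

// doubleAll feeds nums through double and collects the results in order.
func doubleAll(nums []int) []int {
	in := make(chan int)
	out := make(chan int)
	go double(in, out)
	go func() {
		for _, n := range nums {
			in <- n
		}
		close(in)
	}()
	var got []int
	for v := range out {
		got = append(got, v)
	}
	return got
}

func main() {
	fmt.Println(doubleAll([]int{1, 2, 3})) // [2 4 6]
}
```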
15. Go channels 1
package main
import (
    "fmt"
    "time"
)
type MyInt int
func x() <-chan int {
    ch := make(chan int)
    go func(ch chan<- int) {
        // var i MyInt // Make i a MyInt
        var i int
        for i = 0; ; i++ {
            ch <- i // Send int into ch
            time.Sleep(500 * time.Millisecond)
        }
    }(ch)
    return ch
}
16. Go channels 2
func y() chan<- int {
    ch := make(chan int)
    go func(ch <-chan int) {
        for {
            i := <-ch
            fmt.Print(i, " ")
        }
    }(ch)
    return ch
}
func main() {
    xch := x() // emit int every 500 ms
    ych := y() // print the int
    for {
        select {
        case n := <-xch:
            ych <- n // send it to ych for display
        case <-time.After(501 * time.Millisecond): // Change me
            fmt.Print("x")
        }
    }
}
17. Synchronizing goroutines
n gets an integer from xch and pushes it to ych to be displayed.
func main() {
    xch := x() // emit int every 500 ms
    ych := y() // print the int
    for {
        select {
        case n := <-xch:
            ych <- n // send it to ych for display
        case <-time.After(501 * time.Millisecond): // Change me
            fmt.Print("x")
        }
    }
}
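The select in main() either receives from xch or gives up when the timer fires, whichever happens first. The same idea can be isolated in a small helper; recvOrTimeout is a hypothetical name used for this sketch only.

```go
package main

import (
	"fmt"
	"time"
)

// recvOrTimeout waits up to d for a value on ch.
// ok is false if the timeout fired before a value arrived.
func recvOrTimeout(ch <-chan int, d time.Duration) (n int, ok bool) {
	select {
	case n = <-ch:
		return n, true
	case <-time.After(d):
		return 0, false
	}
}

func main() {
	ready := make(chan int, 1)
	ready <- 7
	fmt.Println(recvOrTimeout(ready, 50*time.Millisecond)) // 7 true

	empty := make(chan int)
	fmt.Println(recvOrTimeout(empty, 50*time.Millisecond)) // 0 false
}
```

In the slide's loop a fresh timer is created on every pass, so "x" is printed only when no integer arrives within the timeout. That is why changing 501 to a value below 500 makes the x marks appear.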
What if the source of data, xch, is on a different machine?
Go's standard library can't help here. There is no longer a network channel package
(the experimental netchan package was removed).
Rob Pike (one of the Go authors) said that he didn't quite know what he was doing...
18. zeromq Networking Patterns
Pub/Sub: Many programs can pub (publish) to a network endpoint. Many other programs can
sub (subscribe) from that endpoint. All subscribers get messages from multiple publishers.
Req/Rep: Many clients can req (request) services from a server endpoint, which reps
(replies) to each client.
Push/Pull: Many programs can push to a network endpoint. Many other programs
can pull from that endpoint. Messages are round-robin routed to an available
puller.
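For readers who know Go channels better than zeromq, the Push/Pull pattern has a rough in-process analogy: several goroutines receiving from one shared channel. The sketch below is only that analogy and makes no zeromq calls. Each message reaches exactly one puller, as with Push/Pull, but Go hands it to whichever receiver is ready instead of rotating strictly round-robin. The name distribute is an assumption made for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// distribute sends msgs into one shared channel read by several pullers
// and returns how many messages each puller handled.
func distribute(msgs []string, pullers int) []int {
	work := make(chan string)
	counts := make([]int, pullers)
	var wg sync.WaitGroup
	for p := 0; p < pullers; p++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for range work {
				counts[id]++ // each puller writes only its own slot
			}
		}(p)
	}
	for _, m := range msgs {
		work <- m // delivered to exactly one puller
	}
	close(work)
	wg.Wait()
	return counts
}

func main() {
	counts := distribute([]string{"a", "b", "c", "d", "e", "f"}, 3)
	fmt.Println("messages handled per puller:", counts)
}
```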
19. Using zeromq in a Go program
Import a package that implements zeromq
20. zeromq Pusher
package main
import (
    "fmt"
    zmq "github.com/pebbe/zmq2"
)
func main() {
    fmt.Println("Starting pusher.")
    ctx, _ := zmq.NewContext(1)
    defer ctx.Term()
    push, _ := ctx.NewSocket(zmq.PUSH)
    defer push.Close()
    push.Connect("ipc://pushpull.ipc")
    // push.Connect("tcp://12.34.56.78:5555")
    for i := 0; i < 3; i++ {
        msg := fmt.Sprintf("Hello zeromq %d", i)
        push.Send(msg, 0)
        fmt.Println(msg)
    } // Watch for Program Exit
}
21. zeromq Puller
func main() {
    fmt.Println("Starting puller")
    ctx, _ := zmq.NewContext(1)
    defer ctx.Term()
    pull, _ := ctx.NewSocket(zmq.PULL)
    defer pull.Close()
    pull.Bind("ipc://pushpull.ipc")
    for {
        msg, _ := pull.Recv(0)
        time.Sleep(2 * time.Second) // Doing time consuming work
        fmt.Println(msg)            // work all done
    }
}
This puller may be a data-mover moving gigabytes of data around. It has to be
rock-solid with the program running as a daemon (service) and never shut down.
Go fits this description perfectly! Why?
In addition, Go has a built-in defer statement, which helps to avoid resource leaks
(for example, sockets that are never closed).
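A minimal sketch of how defer guarantees cleanup. The resource type and the open and useResource functions are hypothetical stand-ins for a socket or file, introduced only to make the order of events visible.

```go
package main

import "fmt"

// resource is a stand-in for a socket or file (hypothetical type).
type resource struct{ log *[]string }

// Close records that the resource was released.
func (r resource) Close() { *r.log = append(*r.log, "close") }

// open records that the resource was acquired.
func open(log *[]string) resource {
	*log = append(*log, "open")
	return resource{log}
}

// useResource returns the order in which the steps ran.
func useResource() []string {
	var log []string
	func() {
		r := open(&log)
		defer r.Close() // runs when this function returns, however it returns
		log = append(log, "work")
	}()
	return log
}

func main() {
	fmt.Println(useResource()) // [open work close]
}
```

Placing the defer directly after the open keeps acquisition and release side by side in the source, which is the pattern the pusher and puller follow with ctx.Term() and Close().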
22. msg, _ := pull.Recv(0)
Functions in Go can return multiple values.
The _ above takes the place of what is usually an err variable, e.g. msg, err := pull.Recv(0)
A nil err value means no error.
If you write msg := pull.Recv(0) [note: no _ or err var], the compiler will fail the
compile with an error message (not a warning).
Typing _ forces the programmer to think about error handling.
msg, err := pull.Recv(0)
if err != nil {
    fmt.Println("zmq pull:", err)
}
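The same (value, error) convention runs through the standard library, so it can be tried without zeromq installed. The sketch below uses strconv.Atoi; parsePort is a hypothetical helper written for this example.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort converts s to an int and passes the error on instead of ignoring it.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %v", s, err)
	}
	return n, nil
}

func main() {
	if n, err := parsePort("5555"); err == nil {
		fmt.Println("port:", n)
	}
	if _, err := parsePort("zmq"); err != nil {
		fmt.Println("error:", err)
	}
}
```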
23. zeromq Controller/Pusher in ruby
#!/usr/bin/env ruby
require 'ffi-rzmq'
puts "Starting ruby pusher"
ctx = ZMQ::Context.new(1)
push = ctx.socket(ZMQ::PUSH)
push.connect("ipc://pushpull.ipc")
# push.connect("tcp://12.34.56.78:5555")
(0..2).each do |i|
  msg = "Hello %d" % i
  push.send_string(msg)
  puts(msg)
end
push.close()
ctx.terminate()
The pusher may be a controller that is in active development -- requiring frequent
code updates and restarts. With zeromq, we can decouple the stable long-running
process from the unstable code being developed. What is the advantage of this?
24. Putting it all together
email_mover (puller) has two slow tasks: email and move big data.
25. email_mover
func main() {
    fmt.Println("Starting email_mover")
    z := zmqRecv()
    e := emailer()
    m := mover()
    for {
        s := <-z
        e <- s
        m <- s
        // report when done to 0.1ms resolution
        fmt.Println(time.Now().Format("05.0000"), "done:", s)
    }
}
27. emailer goroutine
func emailer() chan<- string {
    ch := make(chan string, 100) // buffered chan
    go func(ch <-chan string) {
        for {
            s := <-ch
            time.Sleep(1 * time.Second)
            fmt.Println("email:", s)
        }
    }(ch)
    return ch
}
28. mover goroutine
func mover() chan<- string {
    ch := make(chan string, 100) // buffered chan
    go func(ch <-chan string) { // name the parameter so the receive-only view is used
        for {
            s := <-ch
            time.Sleep(3 * time.Second)
            fmt.Println("move:", s)
        }
    }(ch)
    return ch
}
29. email_mover main()
Why do we need buffered channels in emailer() and mover() and not for zmqRecv()?
func main() {
    fmt.Println("Starting email_mover")
    z := zmqRecv()
    e := emailer()
    m := mover()
    for {
        s := <-z
        e <- s
        m <- s
        // report when done to 0.1ms resolution
        fmt.Println(time.Now().Format("05.0000"), "done:", s)
    }
}
email_mover is a stable service in production written in Go.
This service should never stop, leak memory or lose data.
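One way to explore the buffered-channel question is to count how many sends succeed immediately when nobody is receiving. In the sketch below (sendsWithoutBlocking is a hypothetical name), a select with a default branch turns a send that would block into a no-op, so the result shows how much slack a given buffer size provides.

```go
package main

import "fmt"

// sendsWithoutBlocking tries n sends on a channel with the given capacity
// while nobody is receiving, and reports how many succeed immediately.
func sendsWithoutBlocking(capacity, n int) int {
	ch := make(chan string, capacity)
	sent := 0
	for i := 0; i < n; i++ {
		select {
		case ch <- "msg":
			sent++
		default:
			// The send would block: the buffer is full, or the channel
			// is unbuffered and no receiver is waiting.
		}
	}
	return sent
}

func main() {
	fmt.Println(sendsWithoutBlocking(0, 5))   // 0
	fmt.Println(sendsWithoutBlocking(3, 5))   // 3
	fmt.Println(sendsWithoutBlocking(100, 5)) // 5
}
```

With a buffer of 100, main() can hand a message to the slow emailer and mover goroutines and go straight back to receiving; with no buffer, each send would wait until the worker was ready for it.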
30. Go Pusher (not changed)
The Go pusher is used exactly as shown on slide 20.
31. ruby Pusher (not changed)
The ruby pusher is used exactly as shown on slide 23.
32. email_mover maintenance
email_mover sends emails and moves files. These are well understood, stable
functions. This service should never be shut down.
However, hardware needs maintenance. How do we swap in a new email_mover
without losing data?
zeromq to the rescue!
"ØMQ ensures atomic delivery of messages; peers shall receive either all message parts
of a message or none at all."
33. Maintenance Procedure:
Disconnect the network cable from the first email_mover host. zeromq messages
will now begin to queue up at the pushers. Because zeromq message delivery is
atomic, no data is lost.
Connect the network cable to the new email_mover host. zeromq messages begin
to flow again.
Job done!
34. Why is data not lost?
Receive
The half-received message / data packet at the old host is deemed not received by
zeromq and is discarded.
Send
The half-sent message / data packet that was interrupted when the connection was
broken was detected by zeromq as not delivered and will be sent again. This time to
the new host.
35. Thank you
Loh Siu Yin
Technology Consultant, Beyond Broadcast LLP
siuyin@beyondbroadcast.com (mailto:siuyin@beyondbroadcast.com)