Shai Rosenfeld - Such Blocking, Very Concurrency, Wow
Summary
TLDR: The speaker discusses various models of concurrency and parallelism, highlighting their similarities and differences. They delve into concepts such as threads, mutexes, message passing, and shared memory, as well as different scheduling methods like preemptive and cooperative. The talk emphasizes the importance of understanding these models to effectively handle the complexity of modern computing systems, where doing more things at once is crucial for scaling and performance.
Takeaways
- The talk focuses on various models of concurrency and parallelism, explaining their similarities and differences.
- Concurrency is about dealing with multiple things at once, while parallelism is about doing multiple things at once.
- Rob Pike differentiates the two: concurrency is about structuring programs to do more things, while parallelism is about executing them simultaneously.
- Joe Armstrong's analogy explains concurrency as one coffee machine with two queues, and parallelism as two coffee machines with two queues.
- Operating systems use threads and processes as the primary mechanisms for executing work.
- Scheduling can be either preemptive, where the system can interrupt tasks at any time, or cooperative, where tasks explicitly yield control to others.
- Concurrency models need to coordinate execution, often dealing with atomicity and ensuring data consistency.
- Mutexes and locks are commonly used to manage access to shared resources and prevent data corruption.
- Message passing and shared memory are the two primary methods of communication between executing entities in concurrent systems.
- Each concurrency model has advantages and disadvantages; the right choice depends on the specific requirements and constraints of the problem at hand.
- The subject of concurrency is vast and complex, with each model deserving deep exploration and understanding.
Q & A
What is the main topic of the discussion?
-The main topic of the discussion is concurrency and scaling, with a focus on different models and their applications in handling multiple tasks simultaneously.
What is the difference between concurrency and parallelism as explained in the transcript?
-Concurrency is about dealing with a lot of things at once, structuring a program to do more things, whereas parallelism is about doing a lot of things at once. Parallelism is a condition that arises when two things are happening at literally the same instant, such as two processes or two threads executing simultaneously on a multi-core computer.
What are the two primary ways of scheduling tasks as mentioned in the transcript?
-The two primary ways of scheduling tasks are preemptive scheduling, where the system reserves the right to interrupt any task at any time, and cooperative scheduling, where tasks voluntarily yield control to other tasks.
What is the significance of atomicity in the context of concurrency?
-Atomicity is significant in concurrency because it is the fundamental reason why concurrency is challenging. Non-atomic operations can lead to issues like race conditions and data corruption when multiple threads or processes try to modify shared data simultaneously.
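To make the lost-update problem concrete, here is a minimal sketch in Go (Go is used purely for illustration, since the talk draws on Rob Pike's work; the iteration counts are arbitrary). Two goroutines increment a shared counter without synchronization, and increments are silently lost; `go run -race` will also flag the conflict:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var counter int
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 100000; j++ {
				counter++ // read, add one, write back: three steps, not one atomic step
			}
		}()
	}
	wg.Wait()
	fmt.Println(counter) // usually prints less than 200000: updates were lost
}
```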
How does the use of mutexes address the problem of atomicity?
-Mutexes, or mutual exclusion locks, are used to ensure that only one thread can access a shared resource at a time, thus preventing other threads from interfering and ensuring atomicity of operations.
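A minimal sketch of the fix, continuing the Go counter example above (counts are arbitrary): wrapping the increment in a `sync.Mutex` makes the read-modify-write sequence exclusive, so no update is lost.

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var (
		mu      sync.Mutex
		counter int
		wg      sync.WaitGroup
	)
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 100000; j++ {
				mu.Lock()
				counter++ // only one goroutine can execute this at a time
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	fmt.Println(counter) // always 200000
}
```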
What are the advantages and disadvantages of using threads with mutexes?
-Advantages include widespread availability and familiarity among programmers. Disadvantages include potential deadlocks and livelocks, as well as the overhead of managing locks.
How does message passing differ from shared memory communication?
-Message passing involves sending and receiving messages between processes or threads, which can avoid the complexity of shared memory management. Shared memory involves writing and reading from shared memory locations, which can lead to atomicity issues if not properly managed.
What is the actor model as explained in the transcript?
-The actor model is a concurrency model that uses message-passing to manage state and communication between entities called actors. Each actor has its own state and mailbox for receiving messages, and they communicate by sending messages to each other's mailboxes.
What are the benefits of using the actor model?
-Benefits include the ability to express concurrency in a concise and object-oriented manner, the avoidance of shared state which reduces the risk of corruption, and the natural fit for distributed systems due to the message-passing nature of actors.
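As a rough sketch of the idea in Go (the names `deposit` and `bankActor` are invented for illustration, not from the talk): an actor is a goroutine that owns its state and processes messages from a mailbox channel one at a time.

```go
package main

import "fmt"

// deposit is one kind of message; done is a reply channel.
type deposit struct {
	amount int
	done   chan int
}

// bankActor owns its balance; no other goroutine touches it,
// so no locks are needed.
func bankActor(mailbox <-chan deposit) {
	balance := 0
	for msg := range mailbox {
		balance += msg.amount // messages are handled sequentially
		msg.done <- balance
	}
}

func main() {
	mailbox := make(chan deposit)
	go bankActor(mailbox)

	reply := make(chan int)
	mailbox <- deposit{amount: 10, done: reply}
	fmt.Println(<-reply) // 10
	mailbox <- deposit{amount: 5, done: reply}
	fmt.Println(<-reply) // 15
}
```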
How does event-driven programming differ from other concurrency models?
-Event-driven programming is single-threaded and relies on callbacks and events to handle asynchronous input, without the need for multi-threading or multi-processing. It's typically used in scenarios where resource management is crucial, such as handling a large number of concurrent connections.
What are the challenges of implementing event-driven programming?
-Challenges include the difficulty in debugging due to the loss of context and call stack, the potential for callback hell, and the reliance on state management to navigate the program flow.
Outlines
Introduction to Concurrency and Scaling
The speaker introduces the dense topic of concurrency in computing, encouraging the audience to follow along with detailed slides available online. The discussion is set to explore scaling and concurrency, crucial for performing multiple operations simultaneously. The speaker works at Engine Yard but will focus on general scaling rather than specific services. Cats are humorously included in the presentation as a nod to typical tech talks. The overall aim is to map out various concurrency models and their practical applications.
Understanding Concurrency and Parallelism
This section dives deeper into the definitions and distinctions between concurrency and parallelism. The speaker clarifies these concepts through various authoritative views, including Rob Pike's differentiation: concurrency as handling many things at once and parallelism as doing many things at the same time. The segment includes analogies like queues at coffee machines to make these concepts more relatable and easier to grasp for the audience.
Models of Concurrency: Threads and Mutexes
The discussion moves to practical models of concurrency, starting with the basic 'threads and mutexes'. This model involves threads (units of execution) and mutexes (locks that prevent simultaneous execution of critical sections of code) to manage atomicity: ensuring operations complete in whole steps without interruption. The speaker explains how mutexes help prevent data corruption by controlling the sequence of thread operations, highlighting both the advantages and limitations of this approach.
Advanced Concurrency: Threads and Transactions
Exploring more sophisticated concurrency mechanisms, the speaker introduces 'threads and transactions', which uses a transactional memory approach to manage shared data. This model provides an 'optimistic' strategy, where operations assume no data conflicts and only rollback if conflicts occur, similar to database transactions. This section discusses the benefits of increased concurrency and better composability, along with the drawbacks of potential rollback complexities.
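Go has no software transactional memory, but the optimistic flavor can be loosely sketched with a compare-and-swap retry loop: compute a new value assuming no interference, commit only if nothing has changed, and retry (the analog of a rollback) on conflict. This is an analogy for the idea, not real STM:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	var balance int64
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				for {
					old := atomic.LoadInt64(&balance)
					// optimistic: assume no conflict, commit only if the
					// value is still what we read; otherwise retry
					if atomic.CompareAndSwapInt64(&balance, old, old+1) {
						break
					}
				}
			}
		}()
	}
	wg.Wait()
	fmt.Println(balance) // 4000
}
```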
Futures and Promises: Asynchronous Programming
The talk progresses to 'futures and promises', a model that allows for asynchronous programming. This approach involves using 'futures' to represent pending results and 'promises' as guarantees of future values. The model facilitates non-blocking operations, where tasks can proceed without waiting for other tasks to complete, thus enhancing efficiency. The speaker covers the advantages of abstraction from direct locking mechanisms and the challenges of potential delays in execution.
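A minimal sketch of the idea in Go, where a channel stands in for the future and the goroutine's eventual send plays the role of the promise (the `future` helper is invented for illustration):

```go
package main

import (
	"fmt"
	"time"
)

// future starts work in the background and immediately returns a
// channel that will eventually carry the result.
func future(work func() int) <-chan int {
	result := make(chan int, 1)
	go func() { result <- work() }()
	return result
}

func main() {
	f := future(func() int {
		time.Sleep(100 * time.Millisecond) // stand-in for slow work
		return 42
	})
	fmt.Println("doing other work while the result is pending")
	fmt.Println(<-f) // block only at the point the value is needed
}
```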
Processes and Inter-process Communication (IPC)
Focusing on 'processes and inter-process communication (IPC)', this segment discusses the use of separate processes to perform concurrent tasks, communicating via messages rather than shared memory. This model is especially useful in systems where processes operate independently, reducing the risks of data corruption and deadlock. The speaker explains the use of IPC for scalability and the potential high costs associated with process management.
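A minimal sketch of the idea in Go, assuming a Unix-like system with an `echo` binary: the parent spawns a separate process and receives its output over a pipe rather than through shared memory.

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// The child runs in its own address space; the only communication
	// is the byte stream flowing back over the stdout pipe.
	out, err := exec.Command("echo", "hello from another process").Output()
	if err != nil {
		panic(err)
	}
	fmt.Printf("parent received: %s", out)
}
```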
Communicating Sequential Processes (CSP) and Actors
The presentation examines CSP and the actor model, both of which manage concurrency through message passing between independent agents or 'actors'. Each actor processes messages sequentially, ensuring that operations are isolated and reducing the chances of data conflict. This model is highlighted for its scalability and robustness in systems like telecoms where reliability is critical. The speaker emphasizes the flexibility and potential complexity of implementing these models.
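A small Go sketch of the CSP style the talk associates with Go: workers share no state and coordinate purely by sending values over channels (the squaring workload is arbitrary).

```go
package main

import "fmt"

func main() {
	jobs := make(chan int)
	results := make(chan int)

	// two independent workers; they never share state, only messages
	for w := 0; w < 2; w++ {
		go func() {
			for n := range jobs {
				results <- n * n
			}
		}()
	}

	// producer: send work, then close the channel to signal "done"
	go func() {
		for n := 1; n <= 5; n++ {
			jobs <- n
		}
		close(jobs)
	}()

	sum := 0
	for i := 0; i < 5; i++ {
		sum += <-results
	}
	fmt.Println(sum) // 55, the sum of the first five squares
}
```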
Coroutines and Event-driven Programming
The speaker discusses 'coroutines' and 'event-driven programming', emphasizing their utility in scenarios where non-blocking operations are critical, such as user interfaces and network servers. Coroutines allow for cooperative task management, while event-driven architectures handle tasks as they become actionable, leading to efficient resource use. The segment covers the strengths and weaknesses of these models, particularly in contexts requiring high concurrency with minimal overhead.
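A toy single-threaded event loop in Go (illustrative only; real event-driven systems such as Nginx multiplex I/O, which is omitted here): callbacks run one at a time from a queue, so no locks are needed, but any blocking callback stalls everything.

```go
package main

import "fmt"

func main() {
	queue := make(chan func(), 16) // pending callbacks

	// two "events" arrive; the second schedules a follow-up callback,
	// the shape that can grow into "callback hell"
	queue <- func() { fmt.Println("connection opened") }
	queue <- func() {
		fmt.Println("data received, scheduling follow-up")
		queue <- func() { fmt.Println("follow-up callback ran") }
	}

	// the loop: run callbacks until none are left
	for {
		select {
		case cb := <-queue:
			cb()
		default:
			return // queue drained; a real loop would block on I/O here
		}
	}
}
```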
Recap and Future Directions in Concurrency
Concluding the presentation, the speaker reiterates that no single concurrency model is universally best, urging the audience to consider the specific needs and constraints of their projects. The talk aims to expand the listeners' understanding of concurrency, providing a foundation for further exploration and innovation in this complex field. The session ends with a Q&A, emphasizing interactive learning and the ongoing development of concurrency technologies.
Keywords
Concurrency
Parallelism
Mutex
Race Condition
Event Loop
Callback
Actor Model
Message Passing
Shared Memory
Preemptive Scheduling
Cooperative Scheduling
Highlights
Concurrency and parallelism are distinct concepts; concurrency is about dealing with many things at once, while parallelism is about doing many things at once.
Rob Pike, creator of Go, differentiates concurrency as structuring programs to do more things, and parallelism as the occurrence of many things happening at once.
Joe Armstrong, creator of Erlang, explains concurrency and parallelism with the analogy of one coffee machine with two queues versus two coffee machines with two queues.
The fundamental challenge in concurrency is atomicity; ensuring operations are executed in a single, uninterruptible run is crucial for data integrity.
Operating systems execute work through threads and processes, with threads being either native OS threads or green threads within a virtual machine.
Scheduling mechanisms for executing tasks include preemptive scheduling, where the OS can interrupt tasks, and cooperative scheduling, where tasks voluntarily yield control.
Concurrency models can be broadly categorized by their communication methods: shared memory, message passing, and channels.
The use of mutexes (locks) is a common method to handle atomicity in concurrent programming, ensuring that shared data is accessed by only one thread at a time.
Software transactional memory (STM) provides an optimistic approach to concurrency, allowing for rollbacks in case of conflicts, similar to database transactions.
Message passing through channels or futures and promises is a concurrency model that abstracts away the need for locks, handling communication between executing entities.
Inter-process communication (IPC) is a method for separate processes to communicate, often avoiding shared memory to prevent corruption and simplify atomicity management.
Communicating Sequential Processes (CSP) is a model that uses channels for message passing, influencing languages like Go, which heavily relies on channels for concurrency.
The Actor model is similar to CSP but uses actors with individual mailboxes for message passing, making it akin to object-oriented threads.
Coroutines are single-threaded, cooperative concurrency models that allow for pausing and resuming execution, useful for managing local state without locks.
Event-driven programming is a concurrency model that uses a single execution thread, relying on callbacks, event handlers, and an event loop to manage non-blocking operations.
Event-driven programming can scale well in terms of resource usage, as seen in web servers like Nginx, which handle many connections with fewer threads than traditional models.
The choice of concurrency model should be influenced by the specific needs of the application, considering factors like simplicity, scalability, and the potential for parallelism.
Transcripts
All right, so "Such Blocking, Very Concurrency, Wow": that's what I'm going to talk about. The slides are going to be pretty dense today and I'm going to cover a lot of material, so if you want to pull this up on your laptop or iPad and follow along, you can. Yesterday when I was sitting here I had internet, and the link is basically my name and then "sbvc", for "such blocking very concurrency".

Like Jonathan said, I work at Engine Yard. It's a company that basically provisions stuff like Windows Azure and Amazon and provides you with a ready stack to use for your application, similar to Heroku and other platforms as a service, and similar to Google App Engine as well: you just push your application and it kind of works. But I'm not going to talk about Engine Yard today. I'm going to talk about cats, because every tech talk needs a cat. Actually, the other day when we were at that beer tap place, Jonathan said that we should put a cat in every tech talk, so I put some images here for you to enjoy. There's a cat. There's another cat.
All right, what I really want to talk about is scaling, because this is ScaleConf, so it's actually not going to be related to Engine Yard. What I want to talk about is scaling and concurrency.

So what is scaling, really? We have a certain set amount of time to do something, we have these things that we need to do, and then we need to do more of them. That's what scaling is: doing more things at once.

When I was invited to come here I decided, okay, I'm going to talk about all the ways you can do more things. I'm going to talk about all the things you can do to accomplish this goal, and I'm going to map them all out. I decided to talk about concurrency and parallelism, because apparently they're different. I kept looking into it and realized there are a ton of models. That wasn't going to put me off, I was going to talk about all the things, but there's a ton of calculi and all these mathematical models, and it just keeps on going and going. So I decided maybe I'm not going to talk about everything in depth, but I do want to talk about concurrency models.

What I ended up deciding on is a tour of the different models that exist. I'm going to talk about some general concepts that all these models share, then go through some of the models, talk about them, and iterate over some advantages and disadvantages that each one has. Then I'm going to let you sit with it in your own head and figure it out, because this is a huge subject. I'm surprised myself that I put all this information in one presentation, because you could spend a whole lifetime on even one specific model. It's a really big subject, so I'm going to leave it to you to dig in further.

Let's start with the general concepts, the common ingredients that all these models share. What I've mapped them out to is the following list of things, which I'm going to start diving into now.
So, concurrency versus parallelism: apparently they're different. What is the difference? I found this on a Haskell blog: "Not all programmers agree on the meaning of the terms parallelism and concurrency. They may define them in different ways or do not distinguish them at all." So I decided to ask the authority figures what's going on.

Rob Pike, who created Go and is a pretty good authority figure for concurrency, says that concurrency is about dealing with a lot of things at once, and parallelism is about doing a lot of things at once. They're similar, but a little different. In a talk he gave in San Francisco (it's a really good talk, and you'll find resources and links to check out on the slides), he basically says that concurrency is structuring your program to do more things, whereas parallelism is the fact that a lot of it happens at once.

The other authority figure I decided to ask was the Monty Python figures, and actually it was Joe Armstrong. Erlang is another highly concurrent language, and he says concurrency is two queues and one coffee machine, and parallelism is two queues and two coffee machines. That's how he decided to explain it to a five-year-old. It's a similar concept: you can see parallelism is happening at the same time, whereas concurrency is just the way you structure things to have two lines happening at the same time, while you still only have the one coffee machine.

The way it made sense to me is that parallelism is really a condition that arises when two things are happening at literally the same instant. For example, if you run two processes or two threads on a multi-core computer and they're executing at the same moment, that's parallel. Concurrency is almost like a superset of parallelism: it doesn't have to be parallel, but it can be. That's not entirely true, because I've also read that some people say there are certain things that are parallel and not concurrent, but I think that's more of a philosophical programming debate.
So, operating system mechanisms. We have things that we need to execute to do work, so how do we execute these things? Really, it's just threads and processes; there's not much else to it. You have a regular operating system process, a Linux task. We have threads, native operating system threads, which are the same kind of thing except they share memory instead of being self-contained. And then there are green threads and virtual machine processes, which are basically user-level threads run within the virtual machine or interpreter you're using; they're similar to regular threads except they're not at the operating system level. For the purposes of this talk, I'm just going to split it into processes and threads. The reason I put Leonardo DiCaprio there is that a virtual machine process is a process within a process, so it's kind of Inception.
So, scheduling. We have these executing things (processes and threads, which is what I'm going to refer to from now on as "executing things"), and these executing things need to be scheduled. How do they run? When do they run? There are really only two ways this is usually done: the preemptive way and the cooperative way.

The preemptive way is basically saying, "we reserve the right to interrupt anyone at any given time," so basically anything can happen: you have one thing running, and then, who knows how much time has passed, something else gets invoked immediately. Interrupts are preemptive, and the way operating systems schedule threads is preemptive. That's preemption.

The cooperative way of scheduling is kind of like handing off a baton. Fred is running, Fred is doing all this stuff, and then he stops, he pauses, and says, "okay, I'm done, it's now your turn to do something." Then Joe picks up the baton and starts, you know, making hot potatoes or something, stops when he's done, and passes on to Harry, and the circle goes around, but they're all working together. That's the cooperative way of scheduling. Really what this means is that it relies on the program, the executing thing that's running, to say: I'm done, it's now your turn.
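The baton hand-off can be sketched with Go channels (a loose illustration of the cooperative idea, not of any particular scheduler; the Go runtime still manages the goroutines underneath, and the names are invented). Each runner works only while holding the baton and must explicitly pass it on:

```go
package main

import "fmt"

func main() {
	baton := make(chan int)
	done := make(chan struct{})

	runner := func(name string) {
		for lap := range baton {
			fmt.Printf("%s runs lap %d\n", name, lap) // works only while holding the baton
			if lap == 4 {
				close(done)
				return
			}
			baton <- lap + 1 // explicit yield: hand control to whoever is waiting
		}
	}
	go runner("fred")
	go runner("joe")

	baton <- 1 // hand the baton to the first runner
	<-done
}
```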
This is probably the most important common ingredient that all these models share. We have all these things executing, running either preemptively or cooperatively or whatever, but they need to coordinate between them what's happening, because they're all doing work with the same stuff. How do we make sure these things don't trip over each other? This leads me into the concept of atomicity.

Atomicity is basically the fundamental reason why concurrency is hard. This is the reason people do PhDs on this stuff: things are not atomic. The reason that can create havoc is shown in the example right there. We want two threads to increment a variable. The first thread reads the variable from RAM, pushes it into a CPU register, makes the calculation to bump it by one, and saves it on its local stack. Then the other thread gets preemptively scheduled and reads the variable from RAM, and since the first thread hasn't written it back to memory yet, it reads zero. Then the first thread writes its value back to RAM, the second thread picks up from where it left off, increments its copy to one, and pushes that down to RAM. The result is one, whereas in reality it should have been two. That's a serious problem, because you can't trust that your program is doing what you wrote. It's bad. It's really bad; people can die.
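One way to restore atomicity for this exact read-increment-write case is a hardware atomic instruction, sketched here with Go's sync/atomic package (an illustration of the concept, not something from the talk):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	var counter int64
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 100000; j++ {
				// one uninterruptible instruction replaces the racy
				// read-from-RAM, add-in-register, write-back sequence
				atomic.AddInt64(&counter, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(counter) // always 200000
}
```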
So how do these things communicate with each other? There are really two main ways to do it. One is shared memory. Shared memory is basically writing some state or some value to a variable in RAM; the other executing thing looks at that and decides what it needs to do according to that value. Shared memory is a pretty efficient means of passing data between these executing things.

The other way is through message passing, and we'll get to it later, but message passing is essentially this: instead of invoking a certain function that needs to be done, you pass a message describing what needs to be done to some kind of proxy, and then that thing decides; it could be invoked immediately, or it could be done later, or whatever. The idea of message passing is that you don't directly invoke something, you just tell something to happen.

And channels, to me, are a subset of message passing, except that your interface is a stream. Instead of having to send a certain message and deal with that, your API is basically like a file, and that's pretty cool, because if you think about the Unix paradigm of everything being a file, having a file-stream API for message passing is nice.
Now I'm going to talk about the models. Those were the general concepts that we're going to use to try and categorize all these different things, in a way that will hopefully be easier to grasp so you can then go deeper on your own time. This is the link again, because the slides are going to be pretty dense; if you haven't pulled it up, you're welcome to pull it up now.

This is the table that took me quite a while to put together. It tries to take all these things that we've talked about and classify all of the models that we have. I'm going to iterate through every one of these, and you'll see the table row on the top of each slide, so you can refer to that as I continue going along.
So, threads and mutexes. This is the pretty fundamental kind of concurrency thing that you do, and I'm pretty sure most of you know it. Yesterday I noticed that most of you are backend developers, so I'm pretty sure you're familiar with most of these things, and threads and mutexes is probably something you're particularly familiar with.

So again, how do we deal with the atomicity problem? How do we make sure that data doesn't corrupt? Well, use a mutex; use a lock. Looking at the table on the top right there, you can see that "threads and mutexes" is the name of this pattern. It uses threads, it's preemptively scheduled because operating system threads are preemptively scheduled, and it uses shared memory as the means to communicate. It does it with locks, but shared memory is the method of communication, and it's