This Server CANNOT Lose Data…

Linus Tech Tips
21 Mar 2024 · 27:57

Summary

TL;DR: The script details the process of upgrading a video production team's storage and editing server, named 'Wanic,' to improve reliability and performance. The team transitions from a single server setup to a high-availability configuration using Super Micro Grand Twin A+ servers, each housing four independent compute nodes with AMD EPYC Genoa processors, 384GB of memory, and NVMe drives. The new system is designed to handle the team's extensive video projects with minimal downtime, leveraging software like WCA for file distribution and Axle AI for media asset management, showcasing impressive read speeds and IOPS. The video also explores the potential of AI in video editing and the challenges of integrating new hardware into an existing infrastructure.

Takeaways

  • 🚀 The team has upgraded their storage and editing server, named 'Wanic 10', to improve efficiency and reduce downtime costs.
  • 🔄 The new server setup includes redundant drives and high availability features to ensure minimal disruption to the workflow.
  • 💻 Super Micro Grand Twin A+ servers power the new system, each containing four independent computers within a 2U chassis.
  • 🌟 Each server node boasts 384GB of memory, an AMD EPYC Genoa processor with 64 cores, and six front NVMe bays, for up to 24 NVMe drives per chassis.
  • 💡 The servers are equipped with 2200W 80 PLUS Titanium power supplies to handle the high-performance components.
  • 🔌 The system supports live drive upgrading and swapping without downtime, thanks to the flexibility of WCA (WekaIO Matrix).
  • 📊 The new storage solution demonstrated impressive read latency of 131 microseconds, achieving 4 million IOPS (Input/Output Operations Per Second).
  • 🔄 The team tested the high availability by simulating server failures, which were handled smoothly without affecting ongoing editing tasks.
  • 🎥 The setup is designed to support video editing with Adobe Premiere, which is sensitive to latency and benefits from the system's low-latency storage.
  • 🤖 AI integration allows for efficient search and organization of the extensive video archive, enabling quick retrieval of specific clips.
  • 🔗 The infrastructure relies on a combination of CPU and GPU resources for different AI tasks, with a dedicated GPU workstation for object and scene analysis.

Q & A

  • What is the main issue with the current storage setup mentioned in the script?

    -The main issue with the current storage setup is that it is all housed in a single server, which poses a significant risk of downtime and data loss if that server fails.

  • What is the term used in the script to describe the new server setup?

    -The term used to describe the new server setup is 'high availability', which aims to minimize downtime and data loss by incorporating redundancy and fault tolerance.

  • Which company sponsored the new server setup with their servers?

    -Super Micro sponsored the new server setup with their servers.

  • What are the key components of the Super Micro Grand Twin A+ server mentioned in the script?

    -The key components of the Super Micro Grand Twin A+ server include four independent computers, each with its own motherboard, 384 GB of memory, an AMD EPYC Genoa processor with 64 cores, dual M.2 slots for redundant boot drives, six PCIe Gen 5 2.5-inch NVMe bays up front, and PCIe Gen 5 I/O at the rear.

  • How many NVMe drives are installed in each node of the server setup?

    -Two NVMe drives are installed in each node of the server setup, with one being a 7 TB drive and the other a 15 TB drive.

  • What is the purpose of the OCP 3.0 small form factor mezzanine slots in the server setup?

    -The OCP 3.0 small form factor mezzanine slots are used to install the ConnectX-6 200 Gbit cards from Nvidia, which provide high-speed network connectivity for the servers.

  • What is the significance of the 2200 Watts 80 Plus Titanium power supplies in the server setup?

    -The 2200 Watts 80 Plus Titanium power supplies are necessary to handle the high power requirements of the server setup, which includes four 400 Watt EPYC Genoa CPUs, a large amount of RAM, up to 24 NVMe drives, and eight network cards.

  • How does the new server setup handle the potential failure of a machine?

    -The new server setup is designed to be highly available, meaning it should be able to continue operating uninterrupted even if one or two machines fail, thanks to its redundancy and the fact that each system has its own drives.

  • What is the role of the WCA (WekaIO) file system in the server setup?

    -The WCA file system is a high-performance, distributed file system designed specifically for NVMe drives. It is used to handle the distribution of terabytes of data to multiple machines and provides low-latency, high-throughput storage for video editing and other demanding applications.

  • What is the AI technology being used for in the script?

    -The AI technology is being used for media asset management, which includes facial recognition, object detection, and scene understanding. This allows for efficient search and retrieval of specific video clips and content from a large archive of footage.

  • How does the new server setup affect the editing process?

    -The new server setup provides a highly available and fast storage solution that minimizes downtime and improves the editing process by ensuring smooth access to large video files and projects, reducing the risk of software crashes due to latency.

Outlines

00:00

🚀 Introduction to High Availability Storage Solution

The paragraph introduces the need for reliable, fast storage given the team's high volume of video production work. The main editing server, 'Wanic,' has been reliable for years, but as the team grows, even a single minute of downtime now costs over $50 in payroll. The solution is to add redundancy through a new setup called 'Wanic 10,' which emphasizes high availability (HA). The setup consists of two Grand Twin boxes, each containing four independent servers provided by Super Micro, the sponsor of the project.
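
To put the later "how many nines do you want" exchange into numbers, here is a quick back-of-the-envelope sketch (not from the video) that converts an availability target into allowed downtime per year and prices it at the roughly $50-per-minute payroll figure quoted above.

# Downtime allowed per year for a given number of "nines", priced at the
# ~$50/minute payroll cost cited in the video.
MINUTES_PER_YEAR = 365 * 24 * 60
COST_PER_MINUTE = 50  # USD, payroll only, per the video

for nines in (1, 2, 3, 4, 5):
    availability = 1 - 10 ** (-nines)               # e.g. 3 nines -> 99.9%
    downtime_min = MINUTES_PER_YEAR * (1 - availability)
    print(f"{nines} nine(s): {downtime_min:>9,.1f} min/yr down, "
          f"~${downtime_min * COST_PER_MINUTE:,.0f}/yr in payroll")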

05:01

🛠️ Detailed Overview of the Super Micro Grand Twin A+ Server

This paragraph delves into the specifics of the Super Micro Grand Twin A+ server, highlighting its capabilities and components. Each 2U server contains four independent computers, each with its own motherboard, 384GB of memory, an AMD EPYC Genoa processor with 64 cores, dual M.2 slots for redundant boot drives, and six PCIe Gen 5 2.5-inch NVMe bays up front. The server's power supply is discussed, with each unit providing 2200 Watts and being 80 PLUS Titanium certified. The paragraph also touches on the high availability aspect, mentioning the need for redundant network cards and switches, and the goal of having the new setup withstand the failure of individual components without affecting operations.
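
As a sanity check on those power supplies, the sketch below adds up a worst-case load for one fully loaded chassis. Only the "four 400 W EPYC Genoa CPUs, up to 24 NVMe drives and eight network cards" breakdown comes from the video; the per-device wattages for DIMMs, drives, NICs and overhead are illustrative assumptions.

# Rough worst-case power budget for one Grand Twin chassis (4 nodes).
cpus     = 4 * 400        # W, per the video's worst-case figure
dimms    = 4 * 12 * 12    # 12 DIMMs per node at ~12 W each (assumed)
nvme     = 24 * 25        # up to 24 drives at ~25 W each (assumed)
nics     = 8 * 25         # 8 OCP 3.0 NICs at ~25 W each (assumed)
overhead = 200            # fans, BMCs, conversion losses (assumed)

total = cpus + dimms + nvme + nics + overhead
print(f"Estimated peak chassis draw: ~{total} W")
print(f"Covered by one 2200 W PSU alone: {total <= 2200}")
print(f"Covered by both PSUs together:  {total <= 2 * 2200}")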

10:03

🔧 Assembly and Configuration of the High Availability System

The paragraph describes the assembly process of the new high availability system. It discusses the installation of the CPUs, the application of thermal paste, and the populating of all twelve memory channels per node with 32GB DIMMs of DDR5 ECC RAM. The paragraph also covers the installation of boot drives and the storage configuration, which pairs a 7 TB and a 15 TB NVMe drive in each node. The setup's dashboard and the way it allocates resources for different tasks, such as drive containers and compute cores, are also explained. The paragraph concludes with a test of the system's resilience by simulating a server failure.
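
The memory configuration works out as below (DDR5-4800 speed per the video). The 64-bit data width per channel used for the bandwidth estimate is the standard DDR5 figure and is an assumption on my part rather than something stated in the video.

# Memory math for one node: 12 channels x 32 GB DDR5-4800 ECC DIMMs.
channels = 12
dimm_gb = 32
transfers_per_s = 4.8e9        # DDR5-4800
bytes_per_transfer = 8         # 64-bit channel width (assumed, standard DDR5)

capacity_gb = channels * dimm_gb
bandwidth_gb_s = channels * transfers_per_s * bytes_per_transfer / 1e9

print(f"Per node: {capacity_gb} GB of RAM, ~{bandwidth_gb_s:.0f} GB/s theoretical bandwidth")
print(f"Across all 8 nodes: {capacity_gb * 8 / 1024:.0f} TB of RAM")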

15:04

📊 Impressive Performance Metrics and Future Plans

This paragraph focuses on the performance metrics of the new storage system, highlighting the impressive read latency and IOPS (Input/Output Operations Per Second) achieved. It discusses the system's ability to handle high throughput and the potential for future upgrades, such as increasing the number of drives per node. The paragraph also talks about the use of the system for various types of workloads, including video editing and AI development, and mentions the capabilities of the WCA file system. The potential for using the system as a file server for Windows machines is also discussed.
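
The quoted latency and IOPS figures can be cross-checked with Little's law (in-flight requests = throughput × latency). The sketch below is not from the video and treats the 4-million-IOPS and 131-microsecond figures as if they came from the same test, purely to show the relationship.

# Little's law: outstanding I/Os = IOPS x latency.
read_iops = 4_000_000          # peak read IOPS quoted in the video
read_latency_s = 131e-6        # read latency quoted in the video

outstanding = read_iops * read_latency_s
print(f"Implied I/Os in flight across the cluster: ~{outstanding:.0f}")

# If those were 4 KiB reads, the small-block traffic alone would be:
print(f"4 KiB x {read_iops:,} IOPS = {read_iops * 4096 / 1e9:.1f} GB/s")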

20:05

🎥 Utilizing AI for Media Asset Management and Searchability

The paragraph discusses the integration of AI for media asset management, allowing for the search and retrieval of specific clips based on content. It describes the process of generating proxies for the vast archive of footage and the use of a GPU workstation for AI analysis. The paragraph outlines the capabilities of the AI in identifying objects, scenes, and even specific people within video clips. It also touches on the potential for improving searchability and the challenges of managing the cables in the server rack.
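
The video does not show Axle AI's actual API, but the searchability it describes boils down to an inverted index from detected labels to clips. The sketch below is a generic, hypothetical illustration of that idea; the clip names and tags are made up.

# Hypothetical tag-to-clip index (illustration only, not Axle AI's API).
from collections import defaultdict

detections = {                       # pretend output of the AI analysis pass
    "srv_upgrade_cam_a.mov": ["server", "rack", "linus"],
    "srv_upgrade_cam_b.mov": ["server", "screwdriver"],
    "lake_broll_01.mov":     ["water", "trees"],
}

index = defaultdict(set)             # label -> clips containing that label
for clip, labels in detections.items():
    for label in labels:
        index[label].add(clip)

def search(*labels):
    """Return clips that contain every requested label."""
    hits = [index.get(label, set()) for label in labels]
    return set.intersection(*hits) if hits else set()

print(search("server", "linus"))     # -> {'srv_upgrade_cam_a.mov'}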

25:09

💻 Testing the Resilience of the High Availability System

The final paragraph demonstrates the resilience of the high availability system by intentionally removing servers from the network to simulate a catastrophic failure. The system continues to function smoothly despite the removal of critical components, showcasing its robustness. The paragraph concludes with acknowledgments to Super Micro for the servers, WCA for the software, and Axle for the AI detection, as well as a thank you to the viewers for their support.

Keywords

💡High Availability

High Availability refers to the design and maintenance of systems to ensure they are accessible and operational as much as possible. In the context of the video, it is crucial for the team's workflow as downtime can be costly. The video discusses the implementation of a server setup that aims to minimize downtime and maintain high availability, even in the event of hardware failure.

💡Wanic Final Form

Wanic Final Form is the name given to the upgraded server system described in the video, also referred to as 'Wanic 10'. It represents the culmination of efforts to improve the team's storage and editing capabilities, and the name signals that this is intended to be the final and most advanced version of the Wanic server, emphasizing its high performance and reliability.

💡Redundant NVMe

Redundant NVMe refers to the use of non-volatile memory express (NVMe) drives in a redundant configuration to ensure data reliability and system performance. In the video, this term is used to describe the storage solution that allows for high-speed data access and the ability to withstand the failure of individual drives without data loss or significant performance degradation.
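
The video states the cluster should survive two of its eight servers dropping out, but the exact protection layout isn't shown on screen. Assuming a 6+2-style stripe across the eight nodes and the 7 TB + 15 TB drives mentioned elsewhere in this summary, the usable-capacity trade-off looks roughly like this:

# Illustrative capacity math under an assumed 6+2 node-level protection
# scheme; the actual WEKA layout used in the video isn't specified.
nodes = 8
protection_nodes = 2                  # node failures to tolerate (assumed 6+2)
per_node_tb = 7 + 15                  # one 7 TB + one 15 TB drive per node

raw_tb = nodes * per_node_tb
usable_tb = raw_tb * (nodes - protection_nodes) / nodes

print(f"Raw flash across the cluster: {raw_tb} TB")
print(f"Usable after {protection_nodes}-node protection: ~{usable_tb:.0f} TB")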

💡Supermicro Grand Twin A+ Server

Supermicro Grand Twin A+ Server is a specific model of server hardware described in the video. It is notable for housing multiple independent compute nodes within a single 2U chassis, each with its own motherboard, memory, and processing power. This server is integral to the video's narrative as it forms the backbone of the new high availability system.

💡WCA (WekaIO)

WCA, or WekaIO, is a high-performance, scalable file system designed for data-intensive workloads. In the video, it is used as the distributed storage solution that provides the necessary speed and reliability for the team's video editing and storage needs. WekaIO is highlighted for its ability to handle large amounts of data with low latency, making it ideal for high-speed storage requirements.

💡ConnectX-6 200Gb Cards

ConnectX-6 200Gb Cards are high-speed network interface cards (NICs) used to connect servers to the network. These cards, provided by Mellanox (now part of Nvidia), are capable of delivering 200 gigabit per second (Gbps) of bandwidth. In the context of the video, these cards are essential for the server's ability to handle the data transfer demands of the video editing workflow.
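
The transcript's remark that the older Gen 4 cards are "limited by the slot speed of around 250 Gbit per second" follows directly from PCIe arithmetic. A rough sketch using standard PCIe 4.0 figures (not numbers from the video):

# Why a Gen 4 ConnectX-6 is slot-limited: PCIe 4.0 runs 16 GT/s per lane
# with 128b/130b encoding, so a x16 link carries roughly 252 Gbit/s.
lanes = 16
gt_per_s = 16e9                 # PCIe 4.0 per-lane transfer rate
encoding = 128 / 130            # 128b/130b line code

slot_gbit = lanes * gt_per_s * encoding / 1e9
card_gbit = 2 * 200             # dual-port 200 GbE card, line rate

print(f"PCIe Gen 4 x16 slot: ~{slot_gbit:.0f} Gbit/s usable")
print(f"Dual 200 GbE ports:   {card_gbit} Gbit/s line rate")
print(f"Card is slot-limited: {card_gbit > slot_gbit}")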

💡Mother Vault

Mother Vault is the name given to the team's archival storage system, which is described as holding years' worth of footage and data. It is a critical component of their infrastructure, allowing them to maintain access to a vast library of content. The video discusses the integration of the Mother Vault with the new server setup to improve efficiency and accessibility of the stored content.

💡AI Detection

AI Detection in the context of the video refers to the use of artificial intelligence to analyze and identify objects, scenes, and faces within video footage. The video describes the use of AI for content analysis within their storage system, which can significantly improve the organization and retrieval of media assets.

💡Latency

Latency in the context of the video refers to the delay in data transfer or the time it takes for a request to be processed and responded to. Low latency is crucial for video editing, as it affects the responsiveness of the editing software and the overall workflow efficiency. The video highlights the importance of minimizing latency in their storage and server setup.

💡Scalability

Scalability refers to the ability of a system to handle growth, either in the number of users or the volume of data, by adding resources or components as needed. In the video, scalability is a key feature of the new server setup, allowing the team to accommodate future growth in their storage and processing needs without significant disruptions to their workflow.

💡Redundant Power Supplies

Redundant Power Supplies indicate that a system has backup power sources to ensure continuous operation in the event of a power failure. In the video, this is a critical component of the high availability setup, ensuring that if one power supply fails, the system can continue to function without interruption.

Highlights

The team has reached a point where a single minute of downtime costs over $50 in payroll, emphasizing the need for high availability storage solutions.

The main editing server, Wanic, has been reliable for years, but as the team grows, the need for redundancy becomes more critical.

Wanic 10 is introduced as the final form of the server, designed for high availability so that a node can be unplugged without any noticeable impact on the team.

The new server setup includes two Grand Twin boxes, each containing four entire servers provided by Super Micro, the sponsor of the project.

Each server inside the Grand Twin boxes has 384GB of memory, an AMD EPYC Genoa processor with 64 cores, and dual M.2 slots for redundant boot drives.

Each node features six PCIe Gen 5 2.5-inch NVMe bays up front and two PCIe Gen 5 x16 connections at the rear for I/O.

The servers are equipped with 2200 Watts 80 Plus Titanium power supplies, capable of handling high-performance components.

The network cards installed are ConnectX-6 200 Gbit cards from Nvidia, providing high-speed connectivity, though these older Gen 4 cards are limited by the slot speed to roughly 250 Gbit per second.

The server design allows for high availability, with the system able to continue operating even if one of the servers dies.

The new server setup is tested by moving the entire team onto it without notice during a busy workday, demonstrating its reliability and performance.

The software used for distributing terabytes of video projects and other data is WCA, a redundant, NVMe-first file system.

The CPU chosen for the servers is the AMD EPYC Genoa 9534, a 64-core, 128-thread processor with a quarter gigabyte (256 MB) of L3 cache and a 300W TDP.

The memory installed is 32GB DIMMs of DDR5 ECC, totaling 384GB per node across all twelve memory channels.

The storage solution includes two Kioxia CD6 Gen 4 NVMe drives in each node, with plans to switch to larger drives in the future.

WCA supports live upgrading and downgrading of drives, allowing for easy maintenance and expansion of the storage system.

The WCA dashboard provides a clear overview of the cluster servers, showing the allocation of cores for specific tasks and the overall performance.
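
The dedicated-core model behind that dashboard can be summarised as a per-node allocation plan. Only the two drive cores (one per NVMe drive) are stated in the video; the compute and front-end core counts below are placeholder assumptions.

# Illustrative per-node core plan in the dedicated-core style WEKA uses.
node_cores = 64                # EPYC 9534 cores per node

allocation = {
    "drive":    2,             # one core per NVMe drive, per the video
    "compute":  4,             # parity calc + inter-cluster traffic (assumed)
    "frontend": 2,             # SMB / file-system access on this node (assumed)
}

reserved = sum(allocation.values())
print(f"Cores fenced off for storage duties: {reserved}")
print(f"Cores left for the OS or future VMs: {node_cores - reserved}")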

The system achieved 4 million read IOPS with an average latency of 1 millisecond, demonstrating exceptional performance for a file system over a network.

The server setup is designed to avoid single points of failure, with each machine in the cluster being part of the SMB cluster for uninterrupted operation.
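
Because every node runs the SMB front end, a client only needs some node in the cluster to answer. The sketch below is a hypothetical client-side illustration of that idea using plain TCP reachability checks on the SMB port; the node addresses are made up, and in practice failover would be handled by DNS or the SMB clustering itself rather than by a script like this.

# Hypothetical illustration: probe each node's SMB port (445) and use the
# first one that answers.  Addresses are invented for the example.
import socket

NODES = [f"10.0.0.{i}" for i in range(11, 19)]      # 8 hypothetical node IPs

def first_reachable(nodes, port=445, timeout=1.0):
    for host in nodes:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return host
        except OSError:
            continue                                 # node down, try the next
    return None

print("Mount the share from:", first_reachable(NODES))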

The use of AI for media asset management allows for efficient searching and organization of the vast amount of footage, enhancing the usability of the storage system.

The server's cooling system includes four fans, with a unique counter-rotating design in the IO module for efficient heat dissipation.

Transcripts

00:00

when you make as many videos as we do

00:01

you need a lot of fast reliable storage

00:04

and our main editing server wanic has

00:07

checked all of those boxes for years

00:09

it's a great little server it's built

00:11

out of high quality components and it

00:13

even looks cool but as our team is grown

00:16

we've reached the point where even a

00:18

minute one single minute of downtime

00:21

costs over $50 and that's just in

00:25

payroll now practically speaking the way

00:27

to mitigate that is by adding redundant

00:30

now our drives are already redundant

00:32

we've got 20 drives in there with data

00:34

striping but the problem is they all sit

00:37

in one single server I'm sure you can

00:40

see where this is going it's been over a

00:43

year in the making but it's finally here

00:45

wck final form and I'm calling it wanic

00:49

10 because it's the last wever avability

00:52

W told you this like 10 times nobody

00:56

even knows what high availability means

00:58

it means it's lus just go ahead unplug

01:00

one do it go for it well okay I should

01:03

probably tell you the stakes before you

01:05

do that each of these two grand twin

01:07

boxes has four entire servers inside of

01:09

them that were provided by super micro

01:11

who sponsored this whole thing and

01:12

they're set up with WCA a redundant nvme

01:15

first file system in this config it

01:17

should sustain two entire servers

01:20

dropping out without anyone even

01:21

noticing except that we moved the entire

01:24

team onto it last night without telling

01:25

anyone and it's the middle of the work

01:27

day with a ton of high priority videos

01:29

in progress do you really want to test

01:31

it right now I like I haven't tried that

01:33

all right here we go okay what could go

01:37

wrong I mean a

01:39

[Applause]

01:47

lot naturally a huge part of a project

01:50

like this is the software the stuff

01:52

that's going to handle Distributing all

01:54

of ourish terabytes of video projects

01:58

Word documents and Linux isos to the

02:01

multiple machines that we just showed

02:02

you but we can't install any software

02:05

until we have some Hardware so why don't

02:08

we start there meet the super micro

02:10

Grand Twin A+ Server AS -2115GT-

02:15

HNTR despite its sort of ordinary

02:18

looking appearance and unexciting

02:20

sounding name it

02:21

is anything but ordinary and it is very

02:25

exciting because inside this 2u is four

02:28

independent Compu computers but for what

02:31

we're doing four nodes please we want

02:36

eight inside each of these is a

02:39

completely independent motherboard 384

02:43

gigs of memory an AMD epic Genoa

02:45

processor with 64 cores dual m.2 slots

02:49

for redundant boot drives six pcie Gen 5

02:54

2 and 1/2 in nvme slots up front and

02:57

we've got IO in the rear now this bit

03:01

here could be a little confusing at

03:03

first glance but that is because not

03:06

only do we have USB but we have two full

03:10

gen 5x6 pcie connections back here along

03:13

with display output and power for the

03:16

entire server this whole thing slides

03:20

into the chassis which holds a really

03:22

cool modular backplane assembly that

03:24

we'll take a look at in a minute and

03:26

then passes through thank you Jake ah to

03:29

the back at the server where you've got

03:31

a Management Port a single USB port for

03:34

each server nope it's two and they're

03:36

shared what the I was about to ask cuz

03:39

we've also got a single VGA you see the

03:42

button for two servers there no way this

03:45

button toggles

03:47

yeah and okay before we talk about that

03:50

a little bit more look at these power

03:54

supplies each of these is

03:57

2200 Watts 80 Plus Titanium which

04:00

sounds like a lot but when you're

04:02

potentially handling four 400 wat epic

04:05

Genoa CPUs along with a bunch of ram up

04:07

to 24 nvme drives and eight network

04:10

cards well it seems downright reasonable

04:12

doesn't it is it 24 drives can't be 6

04:15

yes 6 * 4 is

04:17

24 and of course that's just one of them

04:21

we've got two of those and that means

04:23

that in the event that one of these dies

04:25

the system should be able to continue to

04:28

operate uninterrupted which is a big

04:30

part of the high availability goal that

04:33

we have for this deployment speaking of

04:36

high availability let's move on to our

04:38

network cards each of those pcie gen 5x6

04:43

slots I showed you guys before

04:44

terminates in one of these ocp 3.0 small

04:47

form factor mezzanine slots and what

04:50

we're putting in them is these connectx

04:53

6 200 gbit cards from

04:56

melanox excuse me from Nvidia that okay

05:00

these are the older Gen 4 ones so

05:03

they're going to be limited by the slot

05:05

speed of around 250 gabit per second but

05:08

if we had newer cards that means that

05:10

each of these nodes could do 200 plus

05:14

another 200 400 up to

05:16

800 gigabit which would of course be a

05:19

complete waste for us a because our

05:21

workload can't take advantage of it and

05:23

B because our switch is only 100 gbit

05:28

sorry of course the two ports are still

05:30

helpful we do have redundant

05:33

switches except there's kind of a

05:35

problem here that's still a single point

05:37

of failure in a perfect world we would

05:39

have two single port Nicks so if a Nick

05:42

were to die it would still be okay but

05:45

because we have so many nodes we're not

05:47

really worried about an individual node

05:49

you know they could have one boot drive

05:51

and it die or one Nick and it die we

05:54

still have an extra backup how many

05:56

nines do you want I mean I don't know

05:59

like one would would be good 9% which

06:02

Jokes Aside is a really good point if we

06:04

were architecting this properly there

06:06

are so many more considerations that we

06:09

would need to make like the power coming

06:11

into the rack would have to come from

06:13

two independent backed up sources the

06:16

connectivity to our clients would have

06:18

to be redundant as well the connectivity

06:20

between all of the systems would have to

06:22

be architected in such a way that no

06:23

matter what fails everything will stay

06:25

up and realistically for us we're not

06:28

going to get that deep into it because

06:30

our goal is better than we had before

06:33

which was a single machine with its own

06:35

built-in redundancies but other than

06:37

that nothing now at least we should be

06:39

able to lose a full machine out of these

06:41

eight we can restart one of our core

06:43

switches totally fine two machines out

06:45

of these eight and we can still be

06:48

limping along I mean limping is a bit of

06:50

a stretch it's going to be very fast now

06:53

normally if you buy a super micro

06:54

machine they're going to pre-build it

06:55

for you they're going to validate it for

06:57

you you can even have them pre-build an

06:59

entire Rack or racks of these things and

07:02

then validate your application on it

07:04

before it ships to you in fact we've got

07:07

a whole video that we did about that

07:08

that was sponsored by super micro a

07:09

little while back of course this is LT

07:13

my friends so we will be assembling this

07:16

one ourselves do you like that spin of

07:18

the screwdriver above the server don't

07:20

worry I won't miss I'll never miss see I

07:23

could do this a hundred times and I

07:24

would never miss why no it's fine it's

07:26

good it's okay we have seven more any

07:29

who for our CPU we've gone with an epic

07:31

Genova

07:32

9534 this is a 64 core

07:36

128 thread monster of a CPU it'll do 3.7

07:40

GHz Max boost it has A4 gigabyte of

07:44

level three cache a 300 wat TDP it

07:47

supports ddr5 memory up to 12 channels

07:51

and it supports a whopping 128 Lanes of

07:55

pcie Gen 5 originally we were intending

07:58

to go with 32 core chips but they were

08:01

out of stock so free upgrade lucky us

08:04

compared to previous generation AMD epic

08:06

CPUs dooa is a big step up in terms of

08:09

IO performance which makes it perfect

08:12

for this application and in the long

08:15

term I mean if we've got all the extra

08:16

CPU cores and a whole bunch of ram

08:19

anyway why run WCA on the bare metal

08:21

when we could install prox Mox and then

08:23

use the other cores for I don't know

08:26

High

08:27

availability Plex server yeah Linux isos

08:31

more realistically it would be something

08:33

like active directory yeah which we

08:35

don't really want to do right now

08:36

because if you run active directory on

08:38

one server and it goes down you're going

08:40

to have a really really bad time but if

08:42

you run it on a bunch of servers yeah

08:45

it's good great so normally server CPU

08:48

coolers would come with their own

08:50

thermal paste pre-applied but since

08:51

we're doing this ourselves and uh if you

08:53

look carefully it's not the first time

08:55

that it's been installed we are going to

08:57

be using okay thank you for that a piece

09:00

of Honeywell PTM 7950 this stuff is

09:04

freaking awesome it has great thermal

09:07

transfer properties and it can handle

09:09

varying temperatures like seriously I

09:12

don't remember many not even just

09:13

varying but like a lot of huge cycles

09:16

for a very very long time now available

09:19

LTD store.com is that big enough does

09:21

that cover all of the ccds and

09:23

cxs oh there's a second piece of PL am I

09:26

stupid is there a second piece of

09:28

plastic no there isn't should I put one

09:29

in the fridge no no no it's totally fine

09:31

I've done this like a bunch of times

09:32

yeah oh she's Min look at that see all

09:35

right easy I would recommend putting it

09:36

in the fridge before you use it all

09:38

right to ensure we're making the

09:40

absolute most of our CPU especially in

09:42

this High throughput storage workload

09:45

we're going to be populating all 12 of

09:47

our memory Channels with 32 gig dims of

09:50

ddr5 ECC running at 4,800 megga

09:53

transitors per second that's a total

09:57

of 384 three terabytes of memory what

10:02

across all eight

10:05

oh each of the cables Jake removing

10:07

right now is a pcie by8 cable that feeds

10:11

two of the drive bays in the front but

10:13

the reason he's taking them out is that

10:15

we can install our boot drives these are

10:18

consumer grade each system is getting

10:20

two Sabrent 512 gig Gen 3 Rocket drives and

10:24

it's not because they're particularly

10:26

special in any meaningful way they're

10:28

not even that fast by modern standards

10:30

but what they are is from our experience

10:32

reliable enough and they are fast enough

10:35

for what we're going to be doing which

10:36

is just booting our operating system off

10:39

of them movie Magic all of the other

10:41

nodes are already built so what do you

10:43

mean movie Magic super micro built them

10:45

Oh I thought you buil them super micro

10:46

builds them for you I took it apart okay

10:49

fine I took that one apart no secrets

10:51

left anymore yep no Intrigue no mystery

10:53

you know what is still mysterious is

10:55

inside of here I've actually never

10:56

opened this before Oh okay let's have a

10:57

look woo holy oh that's power supplies

11:01

yeah this is so cool so the whole

11:02

computer is cooled by four fans no way

11:05

there's the two power supply fans and

11:07

then these fans in their what do they

11:08

call this like IO module I think is what

11:10

they call it look at the blades on this

11:12

thing counter rotating you're serious

11:14

that's what you're looking at not this

11:16

the most delicate of spaghet oh my God

11:19

there's not even connectors every one of

11:22

these wires is soldered directly to the

11:24

back of the ocp 3.0 what yeah for

11:28

storage we're installing ing two of

11:29

Kioxia CD6 Gen 4 NVMe drives in

11:34

each node so we've got one that is 7

11:37

tabt and another one that is 15

11:40

terabytes they're kind of placeholders

11:42

for now and in the long term we're going

11:44

to switch to Something in the

11:45

neighborhood of about 4 15 tab drives

11:48

per node but the drives we want to use

11:50

are currently occupied by oh that

11:52

project by a top secret pastry related

11:55

project so that's going to have to wait

11:57

the good news is that when those drives

11:59

become available WCA supports live

12:02

upgrading and downgrading so we can just

12:04

pull these drives swap in the new ones

12:06

pull swap pull swap pull swap as long as

12:08

we uh don't do it all at once are we

12:10

ready to fire these things up okay

12:12

there's a lot going on here what is that

12:13

is that a switch y hey look you can see

12:15

the button now oh that's

12:17

cool what you're hearing so far is just

12:21

the Nvidia SN 3700 32 Port 200 gig

12:25

switch oh my God it even says melanox on

12:28

the front I know maybe it's an old like

12:30

review sample demo univ we got it with

12:32

the $1 million PC and I'm pretty sure

12:34

that that was already in video at that

12:35

point can you hear that you hear it

12:36

getting louder yeah

12:39

who well that one's just excited to see

12:42

this is the WKA dashboard maybe if I go

12:44

over here cluster servers we can see all

12:46

of our servers we have two drives per

12:50

and then course this is a very

12:52

interesting part of how wo works it's

12:54

not like trass let's say where it just

12:56

uses the whole CPU for whatever you're

12:58

trying to do they dedicate and like

13:01

fence off specific cores for specific

13:04

tasks for instance each Drive gets a

13:06

core so we've got two Drive containers

13:09

that means two a full core per Drive

13:13

yeah damn yeah you also have compute

13:16

cores which do like the par calculation

13:19

and intercluster communication and then

13:21

there's front end which you don't

13:22

necessarily always have frontend cores

13:25

managed connecting to a file system so

13:27

if you just had drives and Compu compute

13:29

you wouldn't be able to access the files

13:31

on this machine so you would have your

13:32

backend servers right those would run

13:34

drives and compute which is the cluster

13:37

and then on your like GPU box you would

13:39

run just the front end and that would

13:41

allow the GPU box to connect to the

13:43

backend cluster servers oh the back-end

13:46

cluster servers don't need to run a

13:48

front end unless you want to be able to

13:50

access the files on that machine or from

13:54

that machine which we want to cuz we're

13:56

using SMB we're using it as a a file

13:59

server stupid NZ for our stupid windows

14:02

machines yeah you can also have a

14:05

dedicated front end machine yes so if

14:07

you had like a 100 backend servers but

14:09

then that's adding a single point of

14:10

failure which is what we're trying to

14:11

avoid you could have multiple of them

14:13

okay you thought they thought of that

14:15

yeah I set it up so every single machine

14:18

in the cluster all eight of them are

14:20

part of our SMB cluster which means it

14:23

cannot go down realistically there are a

14:26

ton of other file systems out there that

14:28

you could use for something like this

14:30

traz has their scale out setup for

14:32

clustered ZFS which only requires three

14:35

nodes and is something we'd be quite

14:37

interested in trying out or if you're

14:39

looking for object storage there's a

14:40

million options but the main open-

14:42

source one Min iio requires only four

14:45

nodes though when we saw how nuts WCA

14:48

was when we set up the million dooll

14:49

server cluster I mean we had to try it

14:52

out for ourselves and try it out we did

14:57

so this is each not node holy

15:01

sh look okay the crazy thing is look at

15:04

the read latency now guys look look hold

15:06

on hold on hold on at 70 gabt a second

15:09

we've seen numbers like this before but

15:12

we're talking with in some cases double

15:15

the number of drives and no file system

15:17

without a file system like raw to each

15:19

drive this is with a file system with a

15:22

file system over a network and we're

15:25

only using 100 Gig ports like usually

15:29

with a WCA setup like this you'd

15:30

probably use 200 yeah cuz we oh my God

15:33

we didn't know cuz we didn't even have

15:36

networking as a factor last time all the

15:39

drives were in one box I know this is

15:41

networking too and the crazy part is

15:43

we're not using RDMA this is like um

15:45

some fancy uh what's it called dpdk I

15:48

think is the library this is wild yeah

15:52

look at that so read latency 131 microc

15:55

seconds that's 4 million read iops with

15:59

a latency of 1 millisecond average are

16:02

are we able to keep using W FS like this

16:04

is a trial okay this software is quite

16:07

expensive this is unreal 4 million iops

16:09

this is like it is unreal it's way more

16:12

than we could possibly ever need but

16:15

it's cool it's so cool don't they

16:17

support tearing and everything oh yeah

16:19

here I'll show you actually what that

16:20

looks like this is on mother vault which

16:22

I think right now has 400 Tippy bytes

16:25

left so let's say Max Capacity is 400

16:27

terabytes now once we run out of the 100

16:31

terab of SSD capacity which you can see

16:33

here it'll just it'll tear I mean it

16:35

automatically tear anyways and you do

16:38

need to make sure that your object store

16:39

is at least the same size as the flash

16:42

or bigger because they're going to

16:44

automatically tear everything to it that

16:46

makes sense so in theory we

16:48

move manually copy everything from Vault

16:53

one time to wo one time because it