Complete Pytorch Tensor Tutorial (Initializing Tensors, Math, Indexing, Reshaping)

Aladdin Persson
28 Jun 2020 · 55:33

Summary

TLDR: This video takes an in-depth look at the basic tensor operations in PyTorch, including tensor initialization, math operations, indexing, and reshaping. Through worked examples it shows how to create tensors and perform element-wise operations, matrix multiplication, broadcasting, indexing, and advanced indexing tricks. It also covers the different ways to reshape tensors, and how to concatenate and transpose them. These fundamentals lay a solid foundation for deeper work in deep learning.

Takeaways

  • 📘 Learning the basic tensor operations is the foundation for a deeper understanding of PyTorch and deep learning.
  • 🔢 Tensors can be initialized in many ways, for example from nested lists or by filling with specific values.
  • 📈 Tensors can be given a data type (such as float32) and a device (such as CPU or CUDA).
  • 🔧 A tensor's shape, data type, device, and other attributes can be queried and set through dedicated properties and methods.
  • 🔄 Tensors support many math and comparison operations, such as addition, subtraction, division, and exponentiation.
  • 🔍 Tensor indexing lets you access and manipulate specific elements or sub-tensors.
  • 🔧 Tensors can be reshaped with the reshape and view methods.
  • 🔄 Broadcasting automatically expands dimensions so that element-wise operations can run on tensors of different shapes.
  • 🔍 Tensors support advanced indexing, such as selecting elements by condition or indexing with index tensors.
  • 📊 Tensor operations include matrix multiplication, matrix powers, and element-wise multiplication.
  • 🔧 Tensors can be manipulated with dedicated functions such as torch.where, torch.unique, and torch.numel.
  • 🔄 Dimensions can be rearranged and axes swapped with methods such as concatenate (torch.cat) and permute.

Q & A

  • How do you initialize a PyTorch tensor?

    - Tensors can be initialized in many ways, for example from a list, with torch.empty for uninitialized data, with torch.zeros for an all-zero tensor, or with torch.rand for uniformly distributed random values.

  • How do you set a PyTorch tensor's data type and device?

    - The data type is set with the dtype argument, such as torch.float32. The device is set with the device argument, such as CUDA or CPU.

  • What are a tensor's basic attributes?

    - A tensor's basic attributes include its device, data type (dtype), shape, and whether it requires gradients (requires_grad).

  • How do you do tensor math in PyTorch?

    - PyTorch provides a rich set of tensor operations, such as addition (torch.add), subtraction, division (torch.true_divide), element-wise multiplication (torch.mul), and matrix multiplication (torch.matmul).

  • What is tensor indexing?

    - Tensor indexing lets us access and manipulate specific elements or sub-tensors. It can be done with explicit indices, slices, boolean masks, and so on.

  • How does broadcasting work in tensor operations?

    - Broadcasting allows operations between tensors whose shapes do not match exactly. PyTorch automatically expands the smaller tensor to match the larger tensor's shape and then applies the operation.

  • What does reshaping a tensor mean?

    - Reshaping changes a tensor's shape without changing its data. It can be done with the view or reshape methods, where view requires the tensor to be stored contiguously in memory.

  • How do you concatenate tensors with torch.cat?

    - torch.cat concatenates multiple tensors along a specified dimension. The tensors are passed in as a tuple, together with the dimension to concatenate along.

  • How is a tensor transpose done?

    - A transpose can be done with the permute method by specifying the new dimension order. For a two-dimensional tensor you can use the T attribute or torch.transpose directly.

  • How do you do conditional selection with torch.where?

    - Given a condition and two tensors, torch.where picks elements from the first tensor where the condition holds and from the second elsewhere. It can be used to build new tensors based on a condition or to modify existing ones.

  • How do you get the unique values of a tensor?

    - Use torch.unique, which returns a sorted tensor of the distinct values.

Outlines

00:00

📚 Deep learning fundamentals: getting started with tensor operations

Introduces the importance of tensor operations in deep learning and stresses that learning them is the foundation for going deeper. The video is split into four parts: tensor initialization, tensor math, tensor indexing, and tensor reshaping. Viewers are encouraged to watch to the end to get a grip on these basic operations; even if you cannot memorize them all, knowing that they exist will save a lot of time later.

05:00

🔢 Tensor initialization and attributes

Explains in detail how to initialize tensors, including from lists, with a specified data type, on a chosen device (CPU or CUDA), and with requires_grad set. Also covers how to query a tensor's device, data type, shape, and whether it requires gradients.

10:00

📈 Tensor math and comparisons

Covers basic tensor math such as addition, subtraction, division, and element-wise exponentiation, as well as matrix multiplication, matrix powers, element-wise comparisons, and broadcasting.

15:01

🔄 Tensor indexing and manipulation

Explains how to access and modify specific elements of a tensor through indexing, including basic indexing, advanced indexing tricks, and conditional indexing. Also introduces `torch.where` for conditional assignment and `torch.unique` for getting the unique values in a tensor.

20:05

📊 Tensor reshaping

Shows how to change a tensor's shape with the `view` and `reshape` methods, including transposing, flattening, and reordering dimensions. Emphasizes that `view` requires the tensor to be stored contiguously in memory while `reshape` does not. Also covers concatenating tensors with `torch.cat`.

25:05

🎉 Summary and wrap-up

Wraps up the video, stresses the importance of learning tensor operations, and invites viewers to ask questions in the comments. Mastering these basics makes later deep learning work much easier.

Keywords

💡Tensor

A tensor is the basic data structure used to represent data in deep learning. Creating, operating on, and transforming tensors is the core content of the video, for example initializing tensors to store and process data, doing math with tensors, and using indexing to access specific data.

💡PyTorch

PyTorch is an open-source machine learning library widely used in fields such as computer vision and natural language processing. The video shows how to use PyTorch for tensor operations such as tensor math, comparison operations, indexing, and reshaping.

💡Initialize

In programming, initialization means assigning an initial value to a variable. The video introduces several ways to initialize tensors, such as from a list or with a specified data type and device, for example creating and initializing a tensor with `torch.tensor`.

💡Math Operations

Math operations are the basis of tensor manipulation and include addition, subtraction, multiplication, and more. The video shows how to perform them in PyTorch, for example addition with `torch.add` and subtraction with `torch.sub`.

💡Indexing

Indexing is how specific elements of a tensor are accessed. The video explains how to get or modify elements through indexing, for example `X[0]` for the first row or `X[:, 0]` for the first column across all rows.

💡Reshaping

Reshaping a tensor means changing its shape without changing its data. In the video, the `view` and `reshape` methods are used to change a tensor's shape, such as turning a two-dimensional tensor into a one-dimensional one or changing its number of rows and columns.

💡Broadcasting

Broadcasting is a mechanism for operating on tensors of different shapes. When the tensors involved do not match in some dimension, PyTorch automatically expands the smaller tensor to match the larger one. The video demonstrates broadcasting with subtraction and exponentiation examples.

💡Device

In deep learning, the device usually refers to the compute resource, such as the CPU or a GPU. The video shows how to place a tensor on a specific device, for example using `.cuda()` to move a tensor to the GPU for faster computation.

💡Gradient

Gradients are a key concept for optimizing models in machine learning; they describe the rate of change of the loss with respect to the model parameters. The video mentions the `requires_grad` attribute, which indicates whether gradients should be computed for a tensor.

💡Matrix Multiplication

Matrix multiplication is a basic operation in linear algebra and a common one in deep learning. The video shows how to multiply two tensors with `torch.mm` or the `@` operator.

💡Batch Matrix Multiplication

Batch matrix multiplication performs many matrix multiplications at once, multiplying a whole batch of matrices in a single call. The video uses `torch.bmm` to do matrix multiplication over batched data.

Highlights

Introduced the basic tensor operations and stressed how important it is to learn them for deep learning.

Showed how to initialize tensors, including from lists and with a specified data type.

Explained how to run a tensor on CUDA or on the CPU.

Covered tensor attributes such as device, data type, and whether gradients are required.

Presented several ways to create tensors, such as torch.empty, torch.zeros, and torch.rand.

Discussed how to move tensors between devices and how to set a tensor's device.

Showed basic tensor math, including addition, subtraction, and division.

Introduced broadcasting and how operations work across mismatched dimensions.

Explained tensor indexing, including basic indexing and advanced indexing tricks.

Discussed reshaping tensors with the view and reshape methods.

Showed matrix multiplication and matrix powers.

Covered element-wise operations such as element-wise multiplication and the dot product.

Explained conditional indexing and assignment with torch.where.

Showed how to get the unique values of a tensor with torch.unique.

Discussed concatenating tensors with torch.cat.

Introduced dimension manipulation with torch.permute and torch.squeeze.

Emphasized that learning tensor operations now saves a lot of time later.

The video aims to give viewers a solid grounding in the basics of tensor operations.

Transcripts

00:00

Learning the basic tensor operations is an essential part of PyTorch, and it's worth spending some time to learn them. It's probably the first thing you should do before you do anything related to deep learning.

00:21

What is going on guys, hope you're doing awesome. In this video we're gonna go through four parts. We're gonna start with how to initialize a tensor, and there are many ways of doing that, so we're gonna go through a lot of them. Then we're gonna do some tensor math and comparison operations, we're gonna go through tensor indexing, and lastly we're gonna go through tensor reshaping. I just want to say that I really encourage you to watch this video to the end so you get a grasp on these tensor operations. Even if you don't memorize them after this video, and there's probably no way you can memorize all of them, you will at least know that they exist, and that will save you a lot of time in the future. So in this video I will cover the basics, but I will also go a bit beyond that and show you a lot of useful things or operations for different scenarios. There's no way I'm able to cover all of the tensor operations, and there are a bunch of more advanced ones that I don't even know about yet, but these are definitely enough to give you a solid foundation.
01:26

So I guess we'll just get started, and the first thing I'm going to show you is how to create a tensor. What we're gonna do is my_tensor = torch.tensor(...), and the first thing we pass is a list, and inside that list we're gonna do one list with 1, 2, 3 and then another list with 4, 5, 6. What this means is that we're gonna have two rows, so this is the first row and this is the second row, and we're gonna have three columns. What we can do then is print(my_tensor) and run that, and we're gonna get it in a nice format: 1 and 4 for the first column, then 2 and 5, and 3 and 6. Now what we can do as well is set the type of this tensor, so we can do dtype=torch.float32, and then if we print it again we're gonna see that those are float values. Another thing we can do is set the device that this tensor should be on, either CUDA or the CPU. If you have a CUDA-enabled GPU you should almost always have the tensor on CUDA, otherwise you're gonna have to use the CPU, but we can specify this using device, so we can do device="cuda" if you have that available, and then if we print my_tensor again we can see that the device says cuda right here. If you do not have a CUDA-enabled GPU then you're gonna have to write "cpu" right here; I think that CPU is also the default so you don't have to write it, but it can help to just be specific. Now if we run this, the device doesn't show, which means that it's on the CPU.

We can also set other arguments like requires_grad, which is important for autograd, which I'm not going to cover in this video, but essentially it's for computing the gradients that are used when we do the backward propagation through the computational graph to update our parameters through gradient descent. Anyways, I'm not gonna go in depth on that. One thing we can do as well, and you're gonna see this a lot in my videos and if you read PyTorch code, is that people often write device = "cuda" if torch.cuda.is_available() else "cpu". In this case what happens is that if you have CUDA enabled, the device is going to be set to CUDA, and otherwise it's going to be set to the CPU, kind of a priority: if you have it enabled you should use it, otherwise you're gonna be stuck with the CPU, but that's all you have. So instead of writing the string here we can just write device=device like this, and the great thing about this is that two people can run it, and if one has CUDA it's going to run on the GPU, and if they don't have it it's gonna run on the CPU, but the code works no matter if you have it or not.
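For reference, the snippet being narrated in this part looks roughly like the following. It is a reconstruction from the audio rather than a copy of the author's editor, so the variable name my_tensor and the exact values are simply the ones read out in the video:

```python
import torch

# Use CUDA when it is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# A 2x3 tensor with an explicit dtype, device, and requires_grad flag
my_tensor = torch.tensor(
    [[1, 2, 3], [4, 5, 6]],
    dtype=torch.float32,
    device=device,
    requires_grad=True,
)
print(my_tensor)
```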
04:56

Now let's look at some attributes of tensors. As I said, we can print my_tensor and we just get some information about the tensor, like what device it's on and if it requires gradient. What we can also do is my_tensor.dtype, which in this case will just print torch.float32. We can also do print(my_tensor.device), which is gonna show us what device the tensor is on, so cuda, and then you're gonna get an index after it which, if you have multiple GPUs, says which GPU it's on; in this case I only have one GPU so it's gonna say 0, and 0 I think is the default one if you don't specify. Then we can also do print(my_tensor.shape), which is pretty straightforward, it's just gonna print the shape, which is a 2 by 3. We can also do print(my_tensor.requires_grad), which is gonna tell us if that tensor requires gradient or not, which in this case we've set to True. All right, so let's move on to some other common initialization methods.
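A minimal sketch of those attribute checks, continuing from the my_tensor defined above:

```python
print(my_tensor.dtype)          # torch.float32
print(my_tensor.device)         # e.g. cuda:0 (GPU index) or cpu
print(my_tensor.shape)          # torch.Size([2, 3])
print(my_tensor.requires_grad)  # True
```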
06:24

What we can do, if we don't have the exact values that we want to write, like in this case where we had 1, 2, 3 and 4, 5, 6, is x = torch.empty(size=(3, 3)). What this is gonna do is create a 3 by 3 tensor, or matrix I guess, and it's gonna be empty in that it's going to be uninitialized data. The values that are gonna be in this tensor are just whatever is in memory at that moment, so the values can really be anything; don't think that this should be zeros or anything like that, it's just gonna be uninitialized data. Now if you would want zeros you can do torch.zeros, and you don't have to specify size= since it's the first argument, we can just write (3, 3) like that. We can actually print x right here and see what values it gets, and in this case it actually got zeros, but that's not what's gonna happen in general, and if you print x after the zeros call it's of course going to be zeros. What we can also do is x = torch.rand((3, 3)), and what this is gonna do is initialize a 3 by 3 matrix with values from a uniform distribution on the interval 0 to 1. Another thing we could do is x = torch.ones((3, 3)), and this is just gonna be a 3 by 3 matrix with all values 1. Another thing we can do is torch.eye, and we're gonna send in 5, 5 or something like that, and this is gonna create an identity matrix, so we're gonna have ones on the diagonal and the rest will be zeros. If you're curious why it's called eye, it's because I is how you write the identity matrix in mathematics, and if you say "eye" it kind of sounds like "I", so that makes sense.

So anyways, one more thing we can do is something like torch.arange, where we can give a start, an end, and a step. Basically the arange function is exactly like the range function in Python, so this should be nothing weird. One thing I forgot to do is just print them so we can see what they look like. If we print the eye one, as I said, we're gonna have ones on the diagonal and the rest will be 0. If we print x after the arange, it starts at 0, it has a step of 1, and the end is a non-inclusive value of 5, so we're going to have 0, 1, 2, 3, 4; if we print x we're gonna see exactly that, 0 and then up to 4 inclusive. Another thing we can do is x = torch.linspace, and we can specify where it should start, so start=0.1, we can do end=1, and we can also do steps=10. What this is gonna do is start at 0.1, end at 1, and have 10 values in between, so what's going to happen in this case is it's gonna take the first value 0.1, then the next one is going to be 0.2, 0.3, 0.4, etc., up to 1. Just to make sure, we can print x and we see that that's exactly what happens, and if we count the number of points, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, we're gonna have a number of points equal to steps.

Then we can also do x = torch.empty as we did for the first one, and we can set the size to, I don't know, (1, 5) or something like that, and what we can do then is .normal_ with mean=0 and standard deviation std=1. Essentially what this is gonna do is create uninitialized data of size 1 by 5 and then make those values normally distributed with a mean of 0 and a standard deviation of 1. We could also do this with the uniform distribution, so we can do .uniform_ and then 0 and 1, which would be similar to what we did up here with torch.rand, but of course here you can specify exactly what you want for the lower and the upper bound of the uniform distribution. Another thing we can do is torch.diag of torch.ones of some size, let's say 3. It's going to create a diagonal matrix with the ones on the diagonal, of shape 3, so essentially this is gonna create a 3 by 3 identity matrix, and we could just as well use eye, but this diag function can be used on any matrix so that we preserve the values across its diagonal; in this case it's just simple to use torch.ones.
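Collected in one place, the initialization calls covered in this stretch look roughly like this (assembled from the narration, so the exact sizes are just the ones mentioned):

```python
import torch

x = torch.empty(size=(3, 3))                      # uninitialized values (whatever is in memory)
x = torch.zeros((3, 3))                           # all zeros
x = torch.rand((3, 3))                            # uniform random values on [0, 1)
x = torch.ones((3, 3))                            # all ones
x = torch.eye(5, 5)                               # identity matrix
x = torch.arange(start=0, end=5, step=1)          # 0, 1, 2, 3, 4
x = torch.linspace(start=0.1, end=1, steps=10)    # 0.1, 0.2, ..., 1.0
x = torch.empty(size=(1, 5)).normal_(mean=0, std=1)  # normally distributed values
x = torch.empty(size=(1, 5)).uniform_(0, 1)          # uniform values with chosen bounds
x = torch.diag(torch.ones(3))                     # 3x3 matrix with ones on the diagonal
```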
12:34

Now one thing I want to show as well is how to initialize tensors of different types and how to convert them to different types. Let's say we have some tensor, and we're just going to do torch.arange(4), so we have 0, 1, 2, 3. Here I set the start, the end and the step; similarly to Python, the step will be 1 by default and the start will be 0 by default, so if you do arange(4), that 4 is just the end value. Now, I think this is initialized as int64 by default, but let's say we want to convert this into booleans, so True or False. What we can do is tensor.bool(), and that will just create False, True, True, True: the first one is 0 so that's gonna be False and the rest will be True. What's great about these, as I'm showing you now with .bool() and a couple more, is that they work no matter if you're on CUDA or the CPU, so whichever one you're on, these are great to remember because they will always work. The next thing is print(tensor.short()), and what this is gonna do is convert it to int16. I think both of these two are not used that often, but they're good to know about. Then we can also do tensor.long(), and what this is gonna do is convert it to int64, and this one is very important because it's used almost all the time. Then we can print tensor.half(), and this is gonna make it float16. This one is not used that often either, but if you have the newer GPUs in the 2000 series you can actually train your networks on float16, and that's when this is used quite often; if you don't have such a GPU, and I don't have that new of a GPU, then it's not possible to train networks using float16. What's more common is to use tensor.float(), so this will just be a 32-bit float, and this one is also super important, it's used super often, so it's good to remember this one. And then we also have tensor.double(), and this is gonna be float64.

15:15

Now the next thing I'm gonna show you is how to convert between a tensor and, let's say, a NumPy array. We'll say that we import numpy as np, and let's say that we have some NumPy array, np.zeros((5, 5)), so a 5 by 5 matrix, and let's say we want to convert this to a tensor. This is quite easy: we can do tensor = torch.from_numpy(np_array), sending in that NumPy array, and that's how we get it to a tensor. Now if you want to convert it back, so you have the NumPy array back, we can just do np_array_back = tensor.numpy(), and this is gonna bring back the NumPy array. Perhaps there might be some numerical round-off errors, but otherwise they will be exactly identical. So that was how to initialize a tensor and some other useful things, like converting between types such as float and double, and also how to convert between NumPy arrays and tensors.
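The type and NumPy conversions, as a runnable sketch (the variable names follow the narration):

```python
import numpy as np
import torch

tensor = torch.arange(4)     # int64 by default: [0, 1, 2, 3]

print(tensor.bool())         # [False, True, True, True]
print(tensor.short())        # int16
print(tensor.long())         # int64 (used very often)
print(tensor.half())         # float16
print(tensor.float())        # float32 (used very often)
print(tensor.double())       # float64

# NumPy array <-> tensor
np_array = np.zeros((5, 5))
tensor = torch.from_numpy(np_array)
np_array_back = tensor.numpy()
```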
16:32

Now we're going to jump to tensor math and comparison operations. We're gonna first initialize two tensors, which we know exactly how to do at this point: x = torch.tensor([1, 2, 3]) and y = torch.tensor([9, 8, 7]). We're going to start real easy, so we're just going to start with addition. There are multiple ways of doing addition, and I'm going to show you a couple of them. We can do something like z1 = torch.empty(3), then we can do torch.add(x, y, out=z1). Now if we print z1 we're going to get 10, 10 and 10, because we've added these together, and as we can see 1 plus 9 is 10, 2 plus 8, and 3 plus 7. So this is one way. Another way is to just do z = torch.add(x, y), and we're gonna get exactly the same result. Another way, and this is my preferred way, is just to do z = x + y, so real simple and real clean, and these are all identical, they will do exactly the same operations, so in my opinion there's really no reason not to use just the normal addition. For subtraction there are again other ways to do it as well, but I recommend doing it like this: z = x - y.

Now, for division this is a little bit more clunky in my opinion, but I think they are doing some changes in future versions of PyTorch. We can do z = torch.true_divide(x, y), and what's going to happen here is that it's going to do element-wise division if they are of equal shape, so in this case it's going to do 1/9 as its first element, 2 divided by 8, and 3 divided by 7. Let's say that y is just an integer, so y is, I don't know, 2; then what's gonna happen is it's gonna divide every element in x by 2, so it would be 1/2, 2/2 and 3/2 if y would be an integer.

Now another thing I'm gonna cover is in-place operations. Let's say that we have t = torch.zeros(3), and let's say we want to add x but we want to do it in place, and what that means is it will mutate the tensor in place, so it doesn't create a copy. We can do that with t.add_(x), and whenever you see an operation followed by an underscore, that's when you know that the operation is done in place. Doing these operations in place is oftentimes more computationally efficient. Another way to do it in place is by doing t += x, so this will also do an in-place addition, similarly to add_, although, perhaps confusingly, if you do t = t + x it's not going to be in place, that's going to create a copy first. And yeah, I'm no expert on this particular subject, so that's just what I know to be the case.

To move along, let's look at exponentiation. Let's say that we want to do element-wise exponentiation; we can do that by doing z = x.pow(2). What that means is, since x in this case is 1, 2 and 3 and we're doing a power of two, this is going to be an element-wise power of two, so it's going to become 1, 4 and 9, and we can print that just to make sure, and we get 1, 4 and 9. Another way to do this, which is my preferred way of doing it, is z = x ** 2, so this is going to do exactly the same operation just without .pow. Let's do some simple comparison. Let's say we want to know which values of x are greater than 0 and less than 0. We can do that by doing z = x > 0, and we can just do print(z), and this is going to again be an element-wise comparison, so this is just going to be True for every element since all of them are greater than zero. And if we do something like z = x < 0, those are all going to be False, because all of the elements are greater than zero. So that's just how you do simple comparisons.
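A compact sketch of the basic math, in-place, and comparison operations described above:

```python
import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([9, 8, 7])

# Addition, three equivalent ways
z1 = torch.empty(3)
torch.add(x, y, out=z1)
z = torch.add(x, y)
z = x + y                     # tensor([10, 10, 10])

z = x - y                     # subtraction
z = torch.true_divide(x, y)   # element-wise division: 1/9, 2/8, 3/7

# In-place operations (the trailing underscore mutates the tensor)
t = torch.zeros(3)
t.add_(x)
t += x                        # also in place; t = t + x would create a copy

z = x.pow(2)                  # element-wise power: tensor([1, 4, 9])
z = x ** 2                    # same thing

z = x > 0                     # tensor([True, True, True])
z = x < 0                     # tensor([False, False, False])
```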
21:43

Let's look at matrix multiplication. We can do that if we initialize two matrices, so we have x1 = torch.rand((2, 5)) and then x2 = torch.rand((5, 3)). How we do the matrix multiplication is something like x3 = torch.mm(x1, x2), and then the output shape of this will just be 2 by 3. An equivalent way of doing the matrix multiplication is x3 = x1.mm(x2), so these are really equivalent: either you write out the torch.mm, or you just take the tensor, which has this operation as a method on it. Now let's say that we want to do some matrix exponentiation, meaning we don't want to do element-wise exponentiation but rather we want to take the matrix and raise the entire matrix to a power. Let's do matrix_exp and just initialize it to torch.rand(5, 5), and then we can do matrix_exp.matrix_power(3), where we send in the number of times to raise that matrix. This would of course be equivalent to taking matrix_exp, matrix-multiplying it by itself, and then matrix-multiplying it again by itself. If we're curious what it looks like we can just print it, and it's going to be the same shape, a 5 by 5, and these values don't really mean anything because the values are all random, but at least we can see that it sort of seems to make sense: if we do the matrix multiplication three times we will probably get something that looks like that.

Now let's look at how to do element-wise multiplication. We have x and we have y, so we have 1, 2, 3 and 9, 8, 7. What we can do is z = x * y, so that would just be an element-wise multiplication, and if we print z that should be 1 times 9, then 2 times 8, then 3 times 7, so it should be 9, 16 and 21, and if we print it we see that we get exactly 9, 16 and 21, as we expect. Another thing we can do is the dot product, so essentially that means taking the element-wise multiplication and then taking the sum of that, but we can do that directly by doing torch.dot(x, y), so that would just be the sum of 9, 16 and 21. We can just print that, and we see that the sum is 46.

Something a little bit more advanced is batch matrix multiplication. I'm just going to initialize the matrices, or tensors I guess, first: batch = 32, n = 10, m = 20, p = 30. This doesn't make any sense yet, but we're gonna do tensor1 = torch.rand((batch, n, m)), so we have three dimensions for this tensor, and essentially that's what we mean by batch matrix multiplication: if we just have two dimensions of the tensor then we can just do normal matrix multiplication, but if we have this additional dimension for the batch then we're gonna have to use batch matrix multiplication. Let's define the second tensor, which is gonna be tensor2 = torch.rand((batch, m, p)). If we have the tensors structured in this shape, then the output after batch matrix multiplication is just going to be torch.bmm(tensor1, tensor2), and what happens here is that the dimensions that match, that are equal, are the m ones, so it's gonna do matrix multiplication across that dimension, and the resulting shape in this case is going to be (batch, n, p).
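The multiplication examples from this part, reconstructed:

```python
import torch

# Matrix multiplication: (2, 5) @ (5, 3) -> (2, 3)
x1 = torch.rand((2, 5))
x2 = torch.rand((5, 3))
x3 = torch.mm(x1, x2)
x3 = x1.mm(x2)                          # equivalent method form

# Matrix power: multiply a square matrix by itself, here three times
matrix_exp = torch.rand(5, 5)
print(matrix_exp.matrix_power(3))

# Element-wise multiplication and dot product
x = torch.tensor([1, 2, 3])
y = torch.tensor([9, 8, 7])
print(x * y)                            # tensor([ 9, 16, 21])
print(torch.dot(x, y))                  # tensor(46)

# Batch matrix multiplication: (batch, n, m) @ (batch, m, p) -> (batch, n, p)
batch, n, m, p = 32, 10, 20, 30
tensor1 = torch.rand((batch, n, m))
tensor2 = torch.rand((batch, m, p))
out_bmm = torch.bmm(tensor1, tensor2)   # shape (32, 10, 30)
```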
26:44

Next I want to cover something else, which is some examples of a concept called broadcasting that you're gonna encounter a lot of times in both NumPy and PyTorch. Let's say we have two tensors: we have x1, which is just a uniformly random 5 by 5 matrix, and then we have x2 = torch.rand((1, 5)). Now let's say that we do z = x1 - x2. Mathematically that wouldn't make sense, right, we can't subtract a vector from a matrix, but this makes sense in PyTorch and NumPy. Why does this make sense, or how does it make sense? It makes sense in that what's going to happen is that this row is going to be expanded so that it matches the rows of the first one; in this case this one is going to be expanded so we have five rows that are identical to each other, and then the subtraction works. In other words, this vector here is going to be subtracted from each row of this matrix. That's what we refer to as broadcasting: it automatically expands one dimension to match the other one so that it can actually do the operation that we're asking it to do. We can also do something like x1 ** x2, element-wise exponentiation. Again, mathematically this doesn't make too much sense, we can't raise this matrix element-wise by something that doesn't match it in shape, but it's going to be the same thing here in that it's gonna copy across all of those rows that we wanted to element-wise raise it to, so in this case it's again gonna be a 5 by 5, and then it can element-wise raise those elements.
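A small sketch of the broadcasting examples, with the shapes used in the narration:

```python
import torch

x1 = torch.rand((5, 5))
x2 = torch.rand((1, 5))

# x2 is broadcast (repeated) across the five rows of x1 before the operation runs
z = x1 - x2     # the row vector is subtracted from every row of x1
z = x1 ** x2    # element-wise power, again with x2 broadcast to (5, 5)
```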
28:42

Now let's look at some other useful tensor operations. We can do something like sum_x = torch.sum(x), and we can also specify the dimension that it should do the summation over; in this case x is just a single vector, so we can just do dim=0, although if you have something like we did with tensor1, where we have three dimensions in the tensor, you can specify which dimension it should sum over. Another thing we can do is torch.max, and torch.max is gonna return the values and the indices of the maximum values, so we're gonna do values, indices = torch.max(x, dim=0), where we also specify the dimension over which we want to take the maximum; again, in this case we just have a vector, so the only thing that makes sense here is dimension 0. You can do the same thing, values and indices, but the opposite, so torch.min instead of torch.max, and then we can again do x with dim=0. We can compute the absolute value by doing torch.abs(x), and that's gonna take the absolute value element-wise for each element in x. We could also do something like torch.argmax(x, dim=0), where we specify the dimension; this would do the same thing as torch.max except it only returns the index of the one that is the maximum, so I guess this would be a special case of the max function. We could also do the opposite, torch.argmin(x) across some dimension. We could also compute, and I know these are a lot of operations, the mean, so torch.mean, but to compute the mean PyTorch requires it to be a float, so what we have to do is x.float(), and then we specify the dimension, in this case again we only have dimension 0. If we would, for example, want to element-wise compare two vectors or matrices, we can use torch.eq(x, y), and this is going to check which elements are equal: those are going to be True, otherwise it's going to be False. In this case, if we scroll up, x and y are not equal in any element, so they are all going to be False, so if we print z this is just gonna be False, False, False, because none of them are equal. Another thing we can do is torch.sort, and here we can send in y for example, we can specify the dimension to be sorted, dim=0, and then we can specify descending=False. Essentially we're gonna sort, in this case, the only dimension that y has, and we're gonna sort it in ascending order; this is the default, meaning it's going to sort in ascending order, so the first element is going to be the smallest one and then it's just going to be in increasing order. What this returns is two things: it returns the sorted y, but it's also going to return the indices that we need to swap so that it becomes sorted.
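In code, the aggregation, comparison, and sorting helpers listed above look roughly like this:

```python
import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([9, 8, 7])

sum_x = torch.sum(x, dim=0)
values, indices = torch.max(x, dim=0)    # also available as x.max(dim=0)
values, indices = torch.min(x, dim=0)
abs_x = torch.abs(x)
z = torch.argmax(x, dim=0)               # index of the maximum only
z = torch.argmin(x, dim=0)
mean_x = torch.mean(x.float(), dim=0)    # mean needs a float tensor
z = torch.eq(x, y)                       # element-wise equality: all False here
sorted_y, indices = torch.sort(y, dim=0, descending=False)
```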
32:22

All right, these are a lot of tensor operations, but there are not that many left for this part. One more thing we can do is torch.clamp, and we can send in x for example, and then we can do min=0. What this is gonna do is check all elements of x that are less than zero and set them to zero; in this case there is no max value, but we could also send in max=10, say, meaning if any value is greater than 10 it's going to set it to 10, so it's gonna clamp it to 10, but if we don't send in any value for the max then there is no max value that it clamps to. And if you recognize this, if it's going to clamp every value less than 0 to 0 and not touch any value greater than 0, then that's exactly the ReLU function, so torch.clamp is the general case and ReLU is a special case of clamp.

Let's say that we initialize some tensor with, I don't know, 1, 0, 1, 1, 1, and we do this with dtype=torch.bool, so we have True or False values. Now let's say that we want to check if any of these values are True; we can do torch.any(x), and this is of course going to be True because we have most of them being True, which means that at least one is True, so this will be True. But if we instead do z = torch.all(x), this means that all of the values need to be one, meaning there cannot be any value that's False, so in this case this is going to be False since we have one value right here that's zero. I also want to add that right here where we do torch.max, you could also just directly do x.max(dim=0), and you can do that for a lot of these different operations: you can do that for the absolute value, for the minimum, for the sum, argmax, sort, etc. Here I'm being very explicit in writing torch everywhere, which you don't need to do unless you really want to.

So that was all for math and comparison operations. As I said in the beginning, these are a lot of different operations, so there's no way you're gonna memorize all of them, but at least you can sort of understand what they do, and then you know that they exist, so if you encounter a problem you can think "I want to do this", and then you know what to search for, essentially.
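And the clamp, any, and all examples. The values passed to clamp below are made up for illustration, since clamping the all-positive x from the video would not change anything:

```python
import torch

x = torch.tensor([-1, 0, 5, 12])          # hypothetical values so the clamping is visible
print(torch.clamp(x, min=0))              # negatives become 0 (exactly the ReLU function)
print(torch.clamp(x, min=0, max=10))      # values above 10 are capped at 10

x = torch.tensor([1, 0, 1, 1, 1], dtype=torch.bool)
print(torch.any(x))                       # True: at least one element is True
print(torch.all(x))                       # False: one element is 0 / False

# Most of these also exist as tensor methods, e.g. x.max(dim=0), x.sum(dim=0), x.abs()
```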
35:17

So let's now move on to indexing into the tensor. For tensor indexing, let's say we have a batch size of 10 and we have 25 features for every example in our batch. We can initialize our input x with torch.rand((batch_size, features)). Then let's say we want to get the first example, so we want to get the features of the first example. We can do that with x[0], and that would get us all of the 25 features; I'm just gonna write .shape after it, and this is equivalent to doing x[0, :] like this. What this would mean is that we want the first example in our batch and then we want everything in that specific dimension, so we want all the features, but we could also just do x[0] directly. Now let's say that we want the opposite, so we want to get the first feature for all of our examples. We can do that with x[:, 0], which means that we will get the first feature over all of the examples, so if we do .shape on that we're just gonna get 10 right there, since we have 10 examples. Now let's say that we want to do something a little bit more tricky, so we want to get the third example in the batch and we want to get the first ten features. We can do that with x[2, 0:10], so 2 is the third example in the batch, and 0:10 is essentially going to create a list of 0, 1, up to 9, so what this means is that PyTorch is gonna know that we want the third row and then we want all of the elements from 0 up to 10. We could also use this to assign to our tensor, so we could use x[0, 0] and just set this to 100 or something like that, and of course this also works for this example right here and the previous ones.
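The basic indexing calls from this part, sketched out:

```python
import torch

batch_size, features = 10, 25
x = torch.rand((batch_size, features))

print(x[0].shape)         # features of the first example: torch.Size([25]); same as x[0, :]
print(x[:, 0].shape)      # first feature of every example: torch.Size([10])
print(x[2, 0:10].shape)   # first ten features of the third example: torch.Size([10])

x[0, 0] = 100             # indexing also works for assignment
```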
37:49

Now let's say that we want to do something more fancy, so we're gonna call this fancy indexing. We can do something like x = torch.arange(10), and then we can do indices, which is going to be a list of, let's say, [2, 5, 8]. What we can do then is print(x[indices]). What this is gonna do is only pick out, I guess in this case, the third example in our batch, the sixth example in our batch, and the ninth example in our batch. Since here we just did torch.arange(10), this is going to pick out exactly the same values, so it's gonna pick out three elements from this list right here, and it's gonna pick out the values 2, 5 and 8, exactly matching the indices, so if we print that we see that we get exactly 2, 5 and 8. What we can also do is, let's say we do x = torch.rand((3, 5)), and then let's say that we want some specific rows, so we do rows = torch.tensor([1, 0]), and we specify the columns as well, so we do cols = torch.tensor([4, 0]). Then we can do print(x[rows, cols]). What this is gonna do is first pick out the second row and the fifth column, and then it's gonna pick out the first row and the first column, so it's gonna pick out two elements. We can check that real quick: we print the shape and we see that it picks out two elements.

39:45

Now let's look at some more advanced indexing. Let's say again that we have x = torch.arange(10), and let's say that we want to pick out the elements that are strictly smaller than 2 or greater than 8. We can do that with print(x[(x < 2) | (x > 8)]). What this means is that it's going to pick out all the elements that are less than 2, or it's gonna pick out the elements if they are greater than 8, so essentially in this case it's going to pick out the 0 and the 1, and it's going to pick out the 9. If we just print that real quick we'll see that it picks out 0, 1 and 9. What you could also do is replace this with an ampersand, so then it would need to satisfy that it's both smaller than 2 and greater than 8, which of course wouldn't be possible, so if we run that it would just be an empty tensor right there. I want to show you another example of this: we could do something like print(x[x.remainder(2) == 0]), so this says, if the remainder of x modulo 2 is 0, then those are the elements we're gonna pick out; essentially these are all the even elements, so we're gonna pick out the even elements, which are 0, 2, 4, 6 and 8, and if we print that real quick we again see that we have 0, 2, 4, 6 and 8. So you can imagine that you can do some pretty complicated indexing using stuff like that, so this is quite useful stuff.
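The fancy-indexing and mask-indexing examples in code form:

```python
import torch

x = torch.arange(10)
indices = [2, 5, 8]
print(x[indices])                  # tensor([2, 5, 8])

x = torch.rand((3, 5))
rows = torch.tensor([1, 0])
cols = torch.tensor([4, 0])
print(x[rows, cols].shape)         # picks x[1, 4] and x[0, 0] -> torch.Size([2])

x = torch.arange(10)
print(x[(x < 2) | (x > 8)])        # tensor([0, 1, 9])
print(x[(x < 2) & (x > 8)])        # impossible condition -> empty tensor
print(x[x.remainder(2) == 0])      # even elements: tensor([0, 2, 4, 6, 8])
```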
42:00

Also, some other useful operations are things like torch.where. We can do print(torch.where(...)) and give it a condition; we're gonna set x > 5, and if x is greater than 5 then all we're gonna do is just use the value of x, and if it's not greater than 5 then we're gonna return x * 2. So what this is gonna do is, if the value is 1 for example, then this condition is not satisfied, and then we're going to change it to x times 2. If you print this we're gonna get the following: 0 just stays 0 because 0 times 2 is still 0, 1 times 2 is gonna be 2, 2 times 2 is 4, 3 times 2 is 6, 4 times 2 is 8, and then 5 times 2 is 10 because 5 is not strictly greater than 5, and then moving on, the condition is satisfied and we just return the x value, which was 6, 7, 8 and 9 for the last values.

Another useful operation: let's say we have a tensor with 0, 0, 1, 2, 2, 3, 4. To just get the unique values of that tensor, which would be 0, 1, 2, 3, 4, we can do something pretty self-explanatory: we can do .unique(), and if we print that we're gonna get exactly what we expect, 0, 1, 2, 3, 4. Another thing we can do is x.ndimension(). What this is gonna do is check how many dimensions x has, so in this case we have a single vector, which is a single dimension, so if you just run that it's gonna give 1 because it's just a single dimension; but let's say that we had, I don't know, a three-dimensional tensor, something like 5 by 5 by 5, then if we run it on something with that shape, ndimension will be 3 in that case. Another thing you can also do is print(x.numel()), and that will just count the number of elements in x. This is quite easy in this scenario since we just have a vector, but if this was something larger with more dimensions and more complicated numbers, then this can come in useful. So that was a pretty in-depth look at how to do indexing into a tensor.
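And the torch.where, unique, ndimension, and numel examples, roughly:

```python
import torch

x = torch.arange(10)

# Keep x where x > 5, otherwise use x * 2
print(torch.where(x > 5, x, x * 2))   # tensor([0, 2, 4, 6, 8, 10, 6, 7, 8, 9])

print(torch.tensor([0, 0, 1, 2, 2, 3, 4]).unique())   # tensor([0, 1, 2, 3, 4])
print(x.ndimension())                 # 1 (would be 3 for a 5x5x5 tensor)
print(x.numel())                      # 10 elements
```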
45:03

Let's move on to the final thing, which is how we reshape a tensor. Again, let's pick a favorite example: we have x = torch.arange(9), so we have 9 elements, and let's say we want to make this a 3 by 3 matrix. We can do x_3x3 = x.view(3, 3), and if we print x_3x3.shape we're going to get that the shape is 3 by 3, so that's one way. Another way we can do that is x_3x3 = x.reshape(3, 3), and this is also going to work. So view and reshape are really very similar, and the differences can be a bit complicated, but in simple terms view acts on something called contiguous tensors, meaning that the tensor is stored contiguously in memory. If we have a matrix, really that's a contiguous block of memory with pointers to every element that form this matrix, and for view this memory block needs to be contiguous in memory; for reshape it doesn't really matter if it's not, it's just gonna make a copy. So I guess in very simple terms, reshape is the safe bet, it's always going to work, but you can have some performance loss, and if you know how to use view, that's going to be superior in many situations.

46:51

I'm going to show you an example of that. This is going to be a bit more advanced, so if you don't follow this completely that's fine, but if we for example do y = x_3x3.t(), that is going to transpose the 3 by 3, and then we do view on this one. If we print x_3x3 we get 0, 1, 2, 3, 4, 5, 6, 7, 8; if we print y, which is the transpose of that, we're going to get 0, 3, 6, 1, 4, 7, 2, 5, 8. Essentially, if that were one long vector, it would be 0, 3, 6, 1, 4, 7, 2, 5, 8, but originally it was constructed as 0, 1, 2, 3, up to 8, and so if we look right here, for a one-element jump in y there's a three-element jump in the original x that we constructed or initialized. So compared to the original, we're jumping steps in this memory, at least this is how I think about it, and again I'm no expert on this, but my idea is that in this original memory block we're jumping different amounts of steps, so this now transposed version is not a contiguous block of memory. So then if we do something like y.view(9) to get back those nine elements, we're going to get an error which says that at least one dimension spans across two contiguous subspaces, use .reshape() instead. So what you can do is use reshape, and that's the safe bet, but you can also do y.contiguous().view(9), and that would work. So again, this is a bit complicated, and even I need to explore this in more detail, but at least you know this is a problem to be cautious of, and a solution is to do this contiguous before you do the view, and again the safe bet is to just use .reshape.
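The view/reshape/contiguous demonstration, reconstructed (x_3x3 is shorthand for the variable typed on screen):

```python
import torch

x = torch.arange(9)

x_3x3 = x.view(3, 3)        # view needs a contiguous tensor
x_3x3 = x.reshape(3, 3)     # the "safe bet": copies if it has to

y = x_3x3.t()               # the transpose is no longer contiguous in memory
# y.view(9)                 # would raise: "... spans across two contiguous subspaces"
print(y.contiguous().view(9))   # works: tensor([0, 3, 6, 1, 4, 7, 2, 5, 8])
print(y.reshape(9))             # also works
```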
49:30

So, moving on to another operation: let's say that we have x1, which we have initialized to some random 2 by 5 matrix, and we have x2 = torch.rand((2, 5)), so also 2 by 5. Then, if we want to sort of put these two tensors together, we should use torch.cat, for concatenate, and we can concatenate x1 and x2. It's important right here that we send them in together as a tuple, and then we specify the dimension that we want them to be concatenated along; dimension 0, for example, would add them along that dimension, and if we just check the shape of that we're gonna get 4 by 5, which makes sense. If we instead do something like print(torch.cat((x1, x2), dim=1)), then we're gonna add along the second dimension, so we're gonna get 2 by 10.

50:31

Now let's say that we have this x1 tensor right here, and what we want to do is unroll it so that we have 10 elements instead of this 2 by 5. We can do that with something like z = x1.view(-1), and PyTorch is gonna magically know that you want to just flatten the entire thing when you send in this -1, so if we do print(z.shape) then we're gonna get 10 elements, and that's exactly what we wanted. But let's make this a little bit more complicated: let's say that we also have a batch of 64, so we have x = torch.rand((batch, 2, 5)). What we want to do is keep this batch dimension exactly the same, but we're okay with putting together the rest of them, and we could even have another dimension right here, it's still going to work, but let's just say we have three in this case. We can do that with z = x.view(batch, -1), so this is going to keep this dimension and do -1 on the rest, and if we then print z.shape, that would be 64 by 10.

51:56

Now let's say that you instead wanted to do something like switching the axes, so you would still want to keep the batch, but you would want to switch these two, so you want this one to show 5 and you want this one to show 2. You can do that with z = x.permute(0, 2, 1), where you specify the dimension that you want them to be at: we want to keep dimension 0 at 0, for the second one we want the second dimension of the original to be on the first dimension, so we're gonna put 2 right there, and then we want dimension 1 to be on dimension 2, so we're gonna do 0, 2 and 1. And let's say that you wanted to transpose a matrix: that would just be a special case of the permute function. I think we used .t() up here, and that is very convenient since we had a matrix right there, but if you have more dimensions you would use .permute, and you can also do .permute on a matrix, so the transpose is really just a special case of permute. So if we print z.shape now, we're gonna get 64, 5 and 2.
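Concatenation, flattening, and permute as sketched from this part:

```python
import torch

x1 = torch.rand((2, 5))
x2 = torch.rand((2, 5))

print(torch.cat((x1, x2), dim=0).shape)   # torch.Size([4, 5])
print(torch.cat((x1, x2), dim=1).shape)   # torch.Size([2, 10])

z = x1.view(-1)                           # flatten: torch.Size([10])

batch = 64
x = torch.rand((batch, 2, 5))
z = x.view(batch, -1)                     # keep the batch, flatten the rest: (64, 10)
z = x.permute(0, 2, 1)                    # swap the last two axes: (64, 5, 2)
```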
53:24

I'm gonna do another example. Let's say we have x = torch.arange(10), so this would be 10 in size, and let's say that we want to add a 1 to it, so we want to make it a 1 by 10 vector. We can do that with x.unsqueeze, and let's say we want to add the 1 at the front, the first dimension: we would do x.unsqueeze(0), and if we print that shape we're gonna get 1 by 10. If you instead want to add it along the other one, say you want to have it as a 10 by 1, you would do x.unsqueeze(1), and if we just print that shape then we're gonna get 10 by 1. Now let's say we have something like x = torch.arange(10) and then .unsqueeze(0) and then .unsqueeze(1), so this shape is 1 by 1 by 10, and then let's say that we want to remove one of these so that we just have 1 by 10. Perhaps this is a bit unsurprising: we can just do z = x.squeeze of either 0 or 1, so we can just do x.squeeze(1), and if we now print that shape we're going to get a 1 by 10 right there.
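And the unsqueeze/squeeze examples:

```python
import torch

x = torch.arange(10)                    # shape (10,)
print(x.unsqueeze(0).shape)             # torch.Size([1, 10])
print(x.unsqueeze(1).shape)             # torch.Size([10, 1])

x = torch.arange(10).unsqueeze(0).unsqueeze(1)   # shape (1, 1, 10)
z = x.squeeze(1)                        # remove the middle singleton dimension
print(z.shape)                          # torch.Size([1, 10])
```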
54:50

I think this was quite a long video, but hopefully you were able to follow all of these tensor operations, and hopefully you found it useful. I know that there's a lot to take in, a lot of information, but if you really get the foundation correct then everything is going to become a lot easier for you when you actually start to do some more deep learning related tasks. This is stuff that's boring, that you kind of just need to learn, and when you've got that, everything else becomes easier. So with that said, thank you so much for watching the video. If you have any questions, leave them in the comments below, and yeah, hope to see you in the next video.

55:28

[Music]

Related Tags
PyTorch, tensor operations, deep learning, basics tutorial, math operations, tensor reshaping, indexing operations, broadcasting, performance optimization, data processing, machine learning