Complete PyTorch Tensor Tutorial (Initializing Tensors, Math, Indexing, Reshaping)
Summary
TLDR This video takes a deep dive into the basic tensor operations in PyTorch, including tensor initialization, math operations, indexing, and reshaping. Through examples it demonstrates how to create tensors and how to perform element-wise operations, matrix multiplication, broadcasting, indexing, and advanced indexing tricks. It also covers the different methods for reshaping tensors, as well as tensor concatenation and transposition. These fundamentals lay a solid groundwork for going deeper into deep learning.
Takeaways
- 📘 Learning basic tensor operations is the foundation for a deeper understanding of PyTorch and deep learning.
- 🔢 Tensors can be initialized in many ways, such as from nested lists or filled with specific values.
- 📈 Tensors can be given a data type (e.g., float32) and a device (e.g., CPU or CUDA).
- 🔧 A tensor's shape, data type, device, and other attributes can be queried and set through dedicated methods.
- 🔄 Tensors support many math and comparison operations, such as addition, subtraction, division, and exponentiation.
- 🔍 Tensor indexing lets you access and manipulate specific elements or sub-tensors.
- 🔧 Tensors can be reshaped with the reshape and view methods.
- 🔄 Broadcasting automatically expands tensor dimensions so element-wise operations can be performed.
- 🔍 Tensors offer advanced indexing features, such as selecting elements by condition or using advanced indexers.
- 📊 Tensor operations include matrix multiplication, matrix exponentiation, and element-wise multiplication.
- 🔧 Tensors can be manipulated with dedicated functions such as torch.where, torch.unique, and torch.numel.
- 🔄 Tensors support dimension operations and axis swaps via methods like concatenate and permute.
Q & A
How do you initialize a PyTorch tensor?
-Tensors can be initialized in many ways, for example from a list, with torch.empty for uninitialized data, with torch.zeros for an all-zero tensor, or with torch.rand for a tensor of random values drawn from a uniform distribution.
How do you set a PyTorch tensor's data type and device?
-The data type is specified via the dtype argument, e.g. torch.float32. The device is specified via the device argument, e.g. CUDA or CPU.
What are a tensor's basic attributes?
-A tensor's basic attributes include its device, data type (dtype), shape, and whether it requires gradients (requires_grad).
How do you do tensor math in PyTorch?
-PyTorch provides a rich set of tensor operations, such as addition (torch.add), subtraction, division (torch.true_divide), element-wise multiplication (torch.mul), and matrix multiplication (torch.matmul).
What is tensor indexing?
-Tensor indexing lets us access and manipulate specific elements or sub-tensors of a tensor, using explicit indices, slices, boolean masks, and so on.
How do you use broadcasting in tensor operations?
-Broadcasting allows operations between tensors whose shapes don't match exactly. PyTorch automatically expands the smaller tensor to match the shape of the larger one and then performs the operation.
What does reshaping a tensor mean?
-Reshaping changes a tensor's shape without changing its data. It can be done with the view or reshape methods, where view requires the tensor to be stored contiguously in memory.
How do you concatenate tensors with torch.cat?
-torch.cat concatenates multiple tensors along a specified dimension. The tensors are passed in as a tuple, along with the dimension to concatenate over.
How is a tensor transposed?
-Transposition can be done with the permute method by specifying the new dimension order. For a 2-D tensor you can use the T attribute or torch.transpose directly.
How do you use torch.where for conditional selection?
-torch.where builds a tensor by choosing element-wise between two values based on a condition, so it can be used to create condition-based tensors or to modify existing ones.
How do you get a tensor's unique values?
-Use torch.unique to get all distinct values in a tensor. It returns the unique values as a sorted tensor.
Outlines
📚 Deep Learning Fundamentals: An Introduction to Tensor Operations
Introduces the importance of tensor operations in deep learning, stressing that learning them is the foundation for understanding deep learning in depth. The video is split into four parts: tensor initialization, tensor math, tensor indexing, and tensor reshaping. Viewers are encouraged to watch the full video to get a grasp of these basic operations; even if you can't memorize them all, at least knowing they exist will save a lot of time later.
🔢 Tensor Initialization and Attributes
Explains in detail how to initialize tensors, including from lists, specifying the data type, setting the device (CPU or CUDA), and setting the gradient requirement. Also shows how to inspect attributes such as a tensor's device, data type, shape, and whether it requires gradients.
📈 Tensor Math and Comparison
Covers basic tensor math such as addition, subtraction, division, and element-wise exponentiation. Also explains matrix multiplication, matrix exponentiation, element-wise comparison, and how to use broadcasting.
🔄 Tensor Indexing and Operations
Explains how to access and modify specific elements of a tensor through indexing, including basic indexing, advanced indexing tricks, and conditional indexing. Also covers `torch.where` for conditional assignment and `torch.unique` for retrieving unique values.
📊 Tensor Reshaping
Shows how to change a tensor's shape with `view` and `reshape`, including transposing, flattening, and reordering dimensions. Emphasizes that `view` requires the tensor to be stored contiguously in memory, while `reshape` does not. Also covers concatenating tensors with `torch.cat`.
🎉 Summary and Closing
Wraps up the video, stresses the importance of learning tensor operations, and invites viewers to ask questions in the comments. Mastering these basics will make later deep learning tasks much easier.
Keywords
💡Tensor
💡PyTorch
💡Initialization
💡Math Operations
💡Indexing
💡Reshaping
💡Broadcasting
💡Device
💡Gradient
💡Matrix Multiplication
💡Batch Matrix Multiplication
Highlights
Introduces the basic tensor operations and stresses the importance of learning them for deep learning.
Shows how to initialize tensors, including from lists and with a specified data type.
Explains how to place a tensor on CUDA or the CPU.
Covers tensor attributes such as device, data type, and whether gradients are required.
Presents several ways to create tensors, such as torch.empty, torch.zeros, and torch.rand.
Discusses moving tensors between devices and how to set a tensor's device.
Shows basic tensor math, including addition, subtraction, and division.
Introduces the concept of broadcasting and how operations work across different dimensions.
Explains tensor indexing, from basic indexing to advanced tricks.
Discusses reshaping tensors with the view and reshape methods.
Shows matrix multiplication and matrix exponentiation.
Covers element-wise operations such as element-wise multiplication and the dot product.
Explains conditional indexing and assignment with torch.where.
Shows how to get the unique values of a tensor with torch.unique.
Discusses concatenating tensors with torch.cat.
Introduces dimension operations with torch.permute and torch.squeeze.
Stresses that learning tensor operations will save a lot of time in the future.
The video aims to give viewers a solid grasp of the fundamentals of tensor operations.
Transcripts
Learning the basic tensor operations is an essential part of PyTorch, and it's worth spending some time on; it's probably the first thing you should do before you do anything related to deep learning. What is going on guys, hope you're doing awesome. In this video we're gonna go through four parts: we're gonna start with how to initialize a tensor, and there are many ways of doing that, we're gonna go through a lot of them; then we're gonna do some tensor math and comparison operations; we're gonna go through tensor indexing; and lastly we're gonna go through tensor reshaping. I just want to say that I really encourage you to watch this video to the end so you get a grasp of these tensor operations. Even if you don't memorize them after this video, and there's probably no way you can memorize all of them, you would at least know that they exist, and that will save you a lot of time in the future. In this video I will cover the basics, but I will also go a bit beyond that and show you a lot of useful operations for different scenarios. There's no way I'm able to cover all of the tensor operations; there are a bunch of more advanced ones that I don't even know about yet, but these are definitely enough to give you a solid foundation.
So I guess we'll just get started, and the first thing I'm going to show you is how to create a tensor. What we're gonna do is my_tensor = torch.tensor, and the first thing we pass is a list; inside that list we put one list with 1, 2, 3 and another list with 4, 5, 6. What this means is that we get two rows, so the first list is the first row and the second list is the second row, and we have three columns. We can then do print(my_tensor) and run that, and we get it in a nice format: 1 and 4 in the first column, then 2 and 5, and 3 and 6. Now what we can do as well is set the type of this tensor, so we can do dtype=torch.float32, and if we print it again we see that those are float values.

Another thing we can do is set the device that this tensor should be on, either CUDA or the CPU. If you have a CUDA-enabled GPU you should almost always have the tensor on CUDA; otherwise you're gonna have to use the CPU. We can specify this with the device argument, so we can do device="cuda" if you have that available, and if we print my_tensor again we see that the device says cuda right there. If you do not have a CUDA-enabled GPU then you're gonna have to write "cpu" here. I think CPU is also the default so you don't have to write it, but it can help to be explicit. If we run this, the device doesn't show, which means it's on the CPU. We can also set other arguments like requires_grad, which is important for autograd, which I'm not going to cover in this video, but essentially it's for computing the gradients that are used when we do backpropagation through the computational graph to update our parameters with gradient descent. Anyways, I'm not gonna go in depth on that.

One thing you're gonna see a lot in my videos, and if you read PyTorch code in general, is that people often write device = "cuda" if torch.cuda.is_available() else "cpu". What happens here is that if you have CUDA available, the device is set to cuda, and otherwise it's set to cpu; kind of the priority that if you have CUDA you should use it, and otherwise you're gonna be stuck with the CPU, but that's all you have. So instead of writing the string here we can just write device=device, and the great thing about this is that two people can run the same code: if one has CUDA it's going to run on the GPU, and if they don't it's gonna run on the CPU, but the code works no matter if you have it or not.
Now let's look at some attributes of tensors. As I said, we can print my_tensor and we just get some information about the tensor, like what device it's on and whether it requires gradients. What we can also do is my_tensor.dtype, which in this case prints torch.float32, and print(my_tensor.device), which shows us what device the tensor is on, so cuda, followed by an index: if you have multiple GPUs, it says which GPU it's on. In this case I only have one GPU so it says 0, and 0 is the default if you don't specify. Then we can also do print(my_tensor.shape), which is pretty straightforward: it just prints the shape, which is 2 by 3. And we can do print(my_tensor.requires_grad), which tells us whether the tensor requires gradients or not; in this case we've set it to True.
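Tying the creation and attributes parts together, here is a minimal sketch of the setup described above (variable names follow the video; the device-fallback idiom is the one quoted in the transcript):

```python
import torch

# Pick CUDA if available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# A 2x3 tensor built from nested lists, with dtype and device set explicitly
my_tensor = torch.tensor([[1, 2, 3], [4, 5, 6]],
                         dtype=torch.float32,
                         device=device,
                         requires_grad=True)

print(my_tensor)                # values, plus device/requires_grad info
print(my_tensor.dtype)          # torch.float32
print(my_tensor.device)         # cuda:0 or cpu
print(my_tensor.shape)          # torch.Size([2, 3])
print(my_tensor.requires_grad)  # True
```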
All right, so let's move on to some other common initialization methods. If we don't have the exact values we want to write out, like the 1, 2, 3 and 4, 5, 6 in this case, we can do x = torch.empty(size=(3, 3)). What this is gonna do is create a 3 by 3 tensor, or matrix I guess, and it's gonna be empty in the sense that it's uninitialized data: the values are just whatever happens to be in memory at that moment, so they can really be anything. Don't assume this will be zeros or anything like that; it's just uninitialized data. Now, if you do want zeros, you can do torch.zeros, and you don't have to spell out size= since it's the first argument; we can just write torch.zeros((3, 3)). What we can do actually is print x right here and see what values it gets, and in this case the torch.empty one actually got zeros, but that's not what's gonna happen in general. And yeah, if you print x after the torch.zeros, it's also going to be zeros.
What we can also do is x = torch.rand((3, 3)), and what this is gonna do is initialize a 3 by 3 matrix with values from a uniform distribution on the interval 0 to 1. Another thing we could do is x = torch.ones((3, 3)), and this is just gonna be a 3 by 3 matrix with all values set to 1. Another thing we can do is torch.eye, where we send in 5, 5 or something like that, and this is gonna create an identity matrix: ones on the diagonal and zeros everywhere else. If you're curious why it's called eye, it's because "I" is how you write the identity matrix in mathematics, and if you say "I" it kind of sounds like "eye", so that makes sense.

Anyways, one more thing we can do is torch.arange, where we can give a start, an end, and a step; basically the arange function is exactly like the range function in Python, so this should be nothing weird. One thing I forgot to do is print these so we can see what they look like. If we print the torch.eye one, as I said, we get ones on the diagonal and the rest will be 0, and if we print x after the arange, it starts at 0, has a step of 1, and the end value of 5 is non-inclusive, so we get 0, 1, 2, 3, 4; if we print x we see exactly that, 0 and then up to and including 4.
Another thing we can do is x = torch.linspace, and we can specify where it should start, so start=0.1, and we can do end=1, and we can also do steps=10 (I wrote step at first, but this should be steps). What this is gonna do is start at 0.1, end at 1, and have 10 values in between: it takes the first value 0.1, then the next is gonna be 0.2, 0.3, 0.4, etc., up to 1. Just to make sure, we can print x and see that that's exactly what happens, and if we count the number of points, we have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, so we get a number of points equal to steps.
Then we can also do x = torch.empty, as we did for the first one, and set the size to, I don't know, (1, 5) or something like that, and what we can do then is call .normal_(mean=0, std=1) on it. Essentially what this is gonna do is take the uninitialized data of size 1 by 5 and make those values normally distributed with a mean of 0 and a standard deviation of 1. We can also do something similar with the uniform distribution: we can call .uniform_(0, 1), which would be similar to what we did up here with torch.rand, but of course here you can specify exactly what you want for the lower and upper bounds of the uniform distribution.
Another thing we can do is torch.diag, and then pass in torch.ones of some size, let's say 3. This is going to create a diagonal matrix with ones on the diagonal, of size 3, so essentially this creates a 3 by 3 identity matrix, and we could just as well have used torch.eye. But the diag function can also be used on any matrix, to pull out the values along its diagonal; in this particular case it's just simpler to use torch.ones.
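Here is a compact sketch collecting the initialization methods covered so far, with shapes matching the examples in the video:

```python
import torch

x = torch.empty(size=(3, 3))         # uninitialized memory, values are arbitrary
x = torch.zeros((3, 3))              # all zeros
x = torch.rand((3, 3))               # uniform random values on [0, 1)
x = torch.ones((3, 3))               # all ones
x = torch.eye(5, 5)                  # identity matrix
x = torch.arange(start=0, end=5, step=1)        # tensor([0, 1, 2, 3, 4])
x = torch.linspace(start=0.1, end=1, steps=10)  # 10 evenly spaced points
x = torch.empty(size=(1, 5)).normal_(mean=0, std=1)  # standard normal values
x = torch.empty(size=(1, 5)).uniform_(0, 1)          # uniform values
x = torch.diag(torch.ones(3))        # 3x3 identity, built from a diagonal vector
```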
Now, one thing I want to show as well is how to initialize tensors of different types and how to convert them between types. Let's say we have some tensor, tensor = torch.arange(4), so we have 0, 1, 2, 3. Here I skipped the start and the step: similarly to Python, the step is 1 by default and the start is 0 by default, so if you do arange(4) the 4 is just the end value. I think this is initialized as int64 by default, but let's say we want to convert it to booleans, so True or False. What we can do is tensor.bool(), and that will just give False, True, True, True: the first element is 0, so that becomes False, and the rest become True. What's great about these methods, as I'm showing you now with .bool() and a couple more, is that they work no matter if you're on CUDA or the CPU, so whichever one you're on, these are great to remember because they will always work.

The next thing is print(tensor.short()), and what this is gonna do is convert to int16; I think that one is not that often used, but it's good to know about. Then we can also do tensor.long(), which converts to int64, and this one is very important because it's used almost all the time. print(tensor.half()) makes it float16; this one is not used that often either, but if you have a newer GPU, in the 2000 series, you can actually train your networks on float16, and then it's used quite often. I don't have that new of a GPU, so for me it's not possible to train networks using float16. What's more common is tensor.float(), so this will just be a 32-bit float, and this one is also super important and used super often, so it's good to remember. And then we also have tensor.double(), and this is gonna be float64.
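The type conversions above as one runnable sketch; each method returns a new tensor of the target dtype and works whether the tensor lives on the CPU or on CUDA:

```python
import torch

tensor = torch.arange(4)    # int64 by default: tensor([0, 1, 2, 3])

print(tensor.bool())    # boolean: tensor([False, True, True, True])
print(tensor.short())   # int16
print(tensor.long())    # int64 (used very often)
print(tensor.half())    # float16 (mainly for mixed-precision training)
print(tensor.float())   # float32 (used very often)
print(tensor.double())  # float64
```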
The next thing I'm gonna show you is how to convert between a tensor and, let's say, a NumPy array. So we'll say import numpy as np, and let's say we have some NumPy array, np.zeros((5, 5)), so a 5 by 5 matrix, and we want to convert it to a tensor. Well, this is quite easy: we can do tensor = torch.from_numpy(np_array), just sending in that NumPy array, and that's how we get a tensor from it. Now, if you want to convert it back, so you get the NumPy array back, we can do np_array_back = tensor.numpy(), and this is gonna bring back the NumPy array. Perhaps there might be some numerical round-off errors, but otherwise they will be exactly identical. So that was how to initialize a tensor and some other useful things, like converting between types such as float and double, and how to convert between NumPy arrays and tensors.
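A minimal sketch of the NumPy round trip described above. One caveat worth knowing that the video doesn't mention: torch.from_numpy shares memory with the source array, so modifying one modifies the other:

```python
import numpy as np
import torch

np_array = np.zeros((5, 5))

tensor = torch.from_numpy(np_array)   # NumPy -> tensor (shares memory)
np_array_back = tensor.numpy()        # tensor -> NumPy
```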
Now we're going to jump to tensor math and comparison operations. We're gonna first initialize two tensors, which we know exactly how to do at this point: x = torch.tensor([1, 2, 3]) and y = torch.tensor([9, 8, 7]). We're going to start real easy, with addition, and there are multiple ways of doing addition; I'm going to show you a couple of them. We can do something like z1 = torch.empty(3), then torch.add(x, y, out=z1). If we print z1 we get 10, 10, and 10, because we've added these together, and as we can see 1 plus 9 is 10, 2 plus 8 is 10, and 3 plus 7 is 10; so that's one way. Another way is just z = torch.add(x, y), and we get exactly the same result. Now another way, and this is my preferred way, is just z = x + y, so real simple and real clean. These are all identical, they do exactly the same operation, so in my opinion there's really no reason not to use the normal addition. For subtraction there are again other ways to do it as well, but I recommend doing it like this: z = x - y.

Now for division, this is a little bit more clunky in my opinion, but I think they are making some changes in future versions of PyTorch. We can do z = torch.true_divide(x, y), and what's going to happen here is element-wise division if they are of equal shape: in this case it's gonna do 1/9 as the first element, 2 divided by 8, and 3 divided by 7. Let's say that y is just an integer instead, so y = 2; then what's gonna happen is that it divides every element of x by 2, so it would be 1/2, 2/2, and 3/2 if y were an integer.
Now another thing I'm gonna cover is in-place operations. Let's say we have t = torch.zeros(3) and we want to add x to it, but in place, meaning it will mutate the tensor directly and not create a copy. We can do that with t.add_(x), and whenever you see an operation followed by an underscore, that's how you know the operation is done in place. Doing these operations in place is oftentimes more computationally efficient. Another way to do it in place is t += x, so this will also do an in-place addition, similarly to add_. Although, perhaps confusingly, if you do t = t + x it's not going to be in place; that's going to create a copy first. And yeah, I'm no expert on this particular subject, so that's just what I know to be the case.
Moving along, let's look at exponentiation. Let's say we want to do element-wise exponentiation; we can do that with z = x.pow(2). What that means is, since x in this case is 1, 2, and 3, and we're doing a power of two element-wise, it's going to become 1, 4, and 9, and we can print it just to make sure: we get 1, 4, and 9. Another way to do this, which is my preferred way, is z = x ** 2, so this does exactly the same operation, just without .pow. Let's do some simple comparison: say we want to know which values of x are greater than zero and which are less than zero. We can do z = x > 0 and just print z, and this is again an element-wise comparison, so this is just going to be all True, since all of the elements are greater than zero. And if we do something like z = x < 0, those are all going to be False, because all of the elements are greater than zero. So that's just how you do simple comparison.
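The element-wise math and comparison operations from this section, as one runnable sketch:

```python
import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([9, 8, 7])

z = x + y                      # tensor([10, 10, 10])
z = x - y                      # element-wise subtraction
z = torch.true_divide(x, y)    # element-wise division: 1/9, 2/8, 3/7

t = torch.zeros(3)
t.add_(x)                      # in place: the trailing underscore mutates t
t += x                         # also in place; t = t + x would make a copy

z = x.pow(2)                   # tensor([1, 4, 9])
z = x ** 2                     # same thing
z = x > 0                      # tensor([True, True, True])
```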
Let's look at matrix multiplication. We initialize two matrices: x1 = torch.rand((2, 5)) and x2 = torch.rand((5, 3)). How we do the matrix multiplication is something like x3 = torch.mm(x1, x2), and then the output shape of this will just be 2 by 3. An equivalent way of doing the matrix multiplication is x3 = x1.mm(x2); these are really equivalent, you either write out torch.mm or you call mm directly on the tensor, since the operation is also available as a method on the tensor.

Now let's say we want to do some matrix exponentiation, meaning we don't want element-wise exponentiation, but rather we want to take the matrix and raise the entire matrix to a power. Let's do matrix_exp and initialize it to torch.rand(5, 5), and then we can do matrix_exp.matrix_power(3), sending in the number of times to raise that matrix. This would of course be equivalent to matrix_exp matrix-multiplied by itself, and then matrix-multiplied by itself again. If we're curious what it looks like we can just print it, and it's going to be the same shape, so 5 by 5. These values don't really mean anything, because they're all random, but at least we can see that it sort of seems to make sense: if we do the matrix multiplication three times, we'd probably get something that looks like that.
Now let's look at how to do element-wise multiplication. We have x and y, so 1, 2, 3 and 9, 8, 7. What we can do is z = x * y, and that would just be an element-wise multiplication, so if we print z that should be 1 times 9, then 2 times 8, and then 3 times 7, so it should be 9, 16, and 21; if we print it we see that we get exactly 9, 16, and 21, as we expect. Another thing we can do is the dot product; essentially that means taking the element-wise multiplication and then taking the sum of that, but we can do it in one go with torch.dot(x, y). That would just be the sum of 9, 16, and 21, so we can print that, and we see that the sum is 46.
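A sketch of the matrix and element-wise operations just covered, with the shapes used in the video:

```python
import torch

x1 = torch.rand((2, 5))
x2 = torch.rand((5, 3))
x3 = torch.mm(x1, x2)        # matrix multiply, shape (2, 3)
x3 = x1.mm(x2)               # same operation as a tensor method

matrix_exp = torch.rand(5, 5)
print(matrix_exp.matrix_power(3))   # matrix_exp @ matrix_exp @ matrix_exp

x = torch.tensor([1, 2, 3])
y = torch.tensor([9, 8, 7])
print(x * y)                 # element-wise: tensor([ 9, 16, 21])
print(torch.dot(x, y))       # dot product: tensor(46)
```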
Something a little bit more advanced is batch matrix multiplication. I'm just going to initialize the tensors first: batch = 32, n = 10, m = 20, p = 30. This doesn't make any sense yet, but then we do tensor1 = torch.rand((batch, n, m)), so we have three dimensions for this tensor. Essentially that's what batch matrix multiplication means: if we just have two dimensions we can do normal matrix multiplication, right, but if we have this additional dimension for the batch, then we're gonna have to use batch matrix multiplication. So let's define the second tensor: tensor2 = torch.rand((batch, m, p)). If we have the tensors structured in this shape, then the output of the batch matrix multiplication is just torch.bmm(tensor1, tensor2), and what happens here is that the dimensions that match, m in both tensors, are the ones the matrix multiplication runs across, so the resulting shape in this case is going to be batch, then n, and then p.
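The batch matrix multiplication example as code, using the dimensions from the video:

```python
import torch

batch, n, m, p = 32, 10, 20, 30
tensor1 = torch.rand((batch, n, m))
tensor2 = torch.rand((batch, m, p))

out_bmm = torch.bmm(tensor1, tensor2)  # multiplies across the shared m dimension
print(out_bmm.shape)                   # torch.Size([32, 10, 30]), i.e. (batch, n, p)
```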
Next I want to cover something else: some examples of a concept called broadcasting, which you're gonna encounter a lot of times in both NumPy and PyTorch. Let's say we have two tensors: we have x1, which is just a uniformly random 5 by 5 matrix, and then we have x2 = torch.rand((1, 5)). Now let's say we do z = x1 - x2. Mathematically that wouldn't make sense, right? We can't subtract a vector from a matrix. But this makes sense in PyTorch and NumPy, and how it makes sense is that this row is going to be expanded so that it matches the rows of the first one: x2 is expanded so we have five rows that are identical to each other, and then the subtraction works. In other words, this vector is subtracted from each row of the matrix. That's what we refer to as broadcasting: one tensor is automatically expanded along a dimension to match the other one, so that it can actually do the operation we're asking it to do. We can also do something like x1 ** x2, element-wise exponentiation. Again, mathematically this doesn't make too much sense, we can't element-wise raise this matrix by something that doesn't match it in shape, but it's going to be the same thing here: the exponents are copied across all of the rows we wanted to element-wise raise, so in this case it's again a 5 by 5, and then it can element-wise raise those elements.
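A short sketch of the broadcasting examples above:

```python
import torch

x1 = torch.rand((5, 5))
x2 = torch.rand((1, 5))

z = x1 - x2       # x2's single row is broadcast across all 5 rows of x1
z = x1 ** x2      # same broadcasting rule for element-wise exponentiation
print(z.shape)    # torch.Size([5, 5])
```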
Now let's look at some other useful tensor operations. We can do something like sum_x = torch.sum(x), and we can also specify the dimension that the summation should be done over; in this case x is just a single vector, so we can just do dim=0, although if you have something like tensor1 from before, where we have three dimensions, you can specify which dimension it should sum over. Another thing we can do is torch.max: torch.max is gonna return the values and the indices of the maximum values, so we do torch.max(x, dim=0), and we also specify the dimension we want to take the maximum over; again, in this case we just have a vector, so the only thing that makes sense here is dimension 0. You can also do the same thing, values and indices, for the opposite, so torch.min instead of torch.max; we can again do torch.min(x, dim=0). We can compute the absolute value with torch.abs(x), and that's gonna take the absolute value element-wise for each element of x. We could also do torch.argmax(x, dim=0), and this would do the same thing as torch.max except it only returns the index of the maximum; I guess this is a special case of the max function. We could also do the opposite, torch.argmin(x, dim=0).

We can also compute the mean, and I know these are a lot of operations, with torch.mean, but to compute the mean PyTorch requires the tensor to be a float, so what we have to do is x.float(), and then we specify the dimension; in this case, again, we only have dimension 0. If we want to, for example, element-wise compare two vectors or matrices, we can use torch.eq(x, y), and this is going to check essentially which elements are equal: equal elements give True, and otherwise it's going to be False. In this case, if we scroll up, x and y are not equal in any element, so they are all going to be False; essentially, if we print z this is just gonna be False, False, False, because none of them are equal.
Another thing we can do is torch.sort, and here we can send in y, for example, specify the dimension to be sorted, dim=0, and then specify descending=False. Essentially, in this case we're gonna sort the only dimension that y has, in ascending order; descending=False is the default, meaning it sorts in ascending order, so the first element is going to be the smallest one, and then they'll just be in increasing order. What this returns is two things: it returns the sorted y, but then it also returns the indices we'd need to swap by so that it becomes sorted.
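The reduction, comparison, and sorting operations from this section in one sketch, using the same x and y as before:

```python
import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([9, 8, 7])

sum_x = torch.sum(x, dim=0)             # tensor(6)
values, indices = torch.max(x, dim=0)   # max value and where it is
values, indices = torch.min(x, dim=0)
abs_x = torch.abs(x)
z = torch.argmax(x, dim=0)              # just the index of the max
z = torch.argmin(x, dim=0)
mean_x = torch.mean(x.float(), dim=0)   # mean needs a float dtype
z = torch.eq(x, y)                      # tensor([False, False, False])
sorted_y, swap_indices = torch.sort(y, dim=0, descending=False)
```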
All right, these are a lot of tensor operations, but there aren't that many left in this part. One more thing we can do is torch.clamp: we can send in x, for example, and then do min=0. What this is gonna do is check all elements of x that are less than zero and set them to zero; in this case there's no max value, but we could also send in max=10, say, meaning if any value is greater than 10 it gets set to 10, so it gets clamped to 10. But if we don't send in any value for the max, then there's no max value that it clamps to. And if you recognize this, clamping every value less than 0 to 0 and not touching any value greater than 0, that's exactly the ReLU function; so torch.clamp is the general case, and ReLU is a special case of clamp.
Now let's say we initialize some tensor, x = torch.tensor([1, 0, 1, 1, 1], dtype=torch.bool), so we have True and False values, and let's say we want to check if any of these values are True. We can do torch.any(x), and this is of course going to be True, because most of them are True, which means at least one is True. But if we instead do z = torch.all(x), this means that all of the values need to be True, meaning there cannot be any value that's False; so in this case this is going to be False, since we have one value right there that's zero. I also want to add that up where we did torch.max, you could also just do x.max(dim=0) directly, and you can do that for a lot of these different operations: for the absolute value, for the minimum, for the sum, argmax, sort, etc. Here I'm being very explicit in writing torch everywhere, which you don't need to do unless you really want to. So that was all for math and comparison operations. As I said in the beginning, these are a lot of different operations, so there's no way you're gonna memorize all of them, but at least you can sort of understand what they do, and then you know that they exist; so if you encounter a problem where you think, I want to do this, then you'll know what to search for, essentially.
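A sketch of clamp, any, and all; note the sample values in the clamp lines are mine (the video's x = [1, 2, 3] has nothing to clamp), chosen so the clamping is visible:

```python
import torch

x = torch.tensor([-2, -1, 0, 5, 20])
print(torch.clamp(x, min=0))          # tensor([ 0,  0,  0,  5, 20]) -- same as ReLU
print(torch.clamp(x, min=0, max=10))  # tensor([ 0,  0,  0,  5, 10])

x = torch.tensor([1, 0, 1, 1, 1], dtype=torch.bool)
print(torch.any(x))   # tensor(True): at least one element is True
print(torch.all(x))   # tensor(False): the 0 makes it fail
print(x.any())        # most of these also exist as tensor methods
```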
So let's now move on to indexing in the tensor. For tensor indexing, let's say we have a batch size of 10 and 25 features for every example in our batch, so we can initialize our input x = torch.rand((batch_size, features)). Then let's say we want to get the first example, so we want the features of the first example. How we can do that is x[0], and that gets us all of the 25 features; I'm just gonna write .shape after it, and this is equivalent to doing x[0, :], which means we want the first example in our batch and then everything in that specific dimension, so all the features. But we can also just do x[0] directly. Now let's say we want the opposite: we want to get the first feature for all of our examples. How we can do that is x[:, 0], which means we get the first feature across all of the examples; if we do .shape on that, we just get 10, since we have 10 examples.

Now let's say we want to do something a little bit more tricky: we want the third example in the batch, and we want its first ten features. We can do x[2, 0:10]: the 2 is the third example in the batch, and 0:10 essentially creates a list 0, 1, up to 9, so PyTorch knows that we want the third row and then the elements 0 up to (but not including) 10. We can also use indexing to assign to our tensor, so we could do x[0, 0] = 100 or something like that, and of course this also works for this example right here and the previous ones.
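The basic indexing patterns above as a runnable sketch:

```python
import torch

batch_size, features = 10, 25
x = torch.rand((batch_size, features))

print(x[0].shape)        # torch.Size([25]): all features of the first example
print(x[:, 0].shape)     # torch.Size([10]): first feature of every example
print(x[2, 0:10].shape)  # torch.Size([10]): first 10 features of the 3rd example
x[0, 0] = 100            # indexing also works for assignment
```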
Now let's say we want to do something more fancy, so we're gonna call this fancy indexing. We can do something like x = torch.arange(10), and then indices, which is going to be a list, let's say [2, 5, 8]. What we can do then is print(x[indices]), and what this is gonna do is only pick out, I guess in this case, the third example in the batch, the sixth example in our batch, and the ninth example in our batch. Since here we just did torch.arange(10), it's going to pick out exactly the same values: it picks out three elements from this list, the values 2, 5, and 8, exactly matching the indices, so if we print it we see that we get exactly 2, 5, and 8. What we can also do is torch.rand((3, 5)), and let's say we want some specific rows, so we do rows = torch.tensor([1, 0]), and we specify the columns as well, cols = torch.tensor([4, 0]); then we can do print(x[rows, cols]). What this is gonna do is first pick out the second row with its fifth column, and then the first row with its first column; so it's gonna pick out two elements, and we can check that real quick by printing the shape and seeing that it picks out two elements.
Now let's look at some more advanced indexing. Let's say again we have x = torch.arange(10), and let's say we want to pick out just the elements that are strictly smaller than 2 or greater than 8. How we can do that is print(x[(x < 2) | (x > 8)]): this is going to pick out all the elements that are less than 2, or the elements that are greater than 8. Essentially, in this case it's going to pick out the 0 and the 1, and it's going to pick out the 9, so if we print that real quick we see that it picks out 0, 1, and 9. What you could also do is replace the or with an and, &, which would mean an element needs to satisfy both being smaller than 2 and greater than 8, which of course wouldn't be possible; so if we run that, we just get an empty tensor right there. I want to show you another example of this: we could do print(x[x.remainder(2) == 0]), so if the remainder of x modulo 2 is 0, those are the elements we pick out. Essentially these are all the even elements, so we're gonna pick out the even elements, which are 0, 2, 4, 6, and 8, and if we print that real quick we again see that we have 0, 2, 4, 6, and 8. So you can imagine doing some pretty complicated indexing using stuff like that; this is quite useful stuff.
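Fancy and boolean indexing from this section in one sketch:

```python
import torch

x = torch.arange(10)
indices = [2, 5, 8]
print(x[indices])              # tensor([2, 5, 8])

m = torch.rand((3, 5))
rows = torch.tensor([1, 0])
cols = torch.tensor([4, 0])
print(m[rows, cols].shape)     # torch.Size([2]): picks m[1, 4] and m[0, 0]

print(x[(x < 2) | (x > 8)])    # tensor([0, 1, 9])
print(x[(x < 2) & (x > 8)])    # empty tensor: the condition is impossible
print(x[x.remainder(2) == 0])  # tensor([0, 2, 4, 6, 8]): the even elements
```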
Also, some other useful operations: there's torch.where, so we can do print(torch.where(x > 5, x, x * 2)). We give it a condition, x > 5, and if x is greater than 5 then we just set the value to x, and if it's not greater than 5 then we return x times 2. So what this is gonna do is: if the value is 1, for example, then the condition is not satisfied, and we go and change it to x times 2. If we print this, we get the following: 0 just stays 0, because 0 times 2 is still 0; 1 times 2 is gonna be 2; 2 times 2 is 4; 3 times 2 is 6; 4 times 2 is 8; and then 5 times 2 is 10, because 5 is not strictly greater than 5. Moving on, the rest satisfy the condition, so we just return the x values, which were 6, 7, 8, and 9 for the last values.

Another useful operation: let's say we have a tensor with 0, 0, 1, 2, 2, 3, 4, and we want to get just the unique values of that tensor, so that would be 0, 1, 2, 3, 4. Pretty self-explanatory: we can call .unique() on it, and if we print that we get exactly what we expect, 0, 1, 2, 3, 4. Another thing we can do is x.ndimension(), and what this is gonna do is check how many dimensions x has. In this case we have a single vector, which is a single dimension, so if we just run that we're gonna get 1; but let's say we had, I don't know, a three-dimensional tensor, something of shape 5 by 5 by 5, then ndimension would be 3 in that case. Another thing you could do is print(x.numel()), and that will just count the number of elements in x. This is quite easy in this scenario, since we just have a vector, but if this were something larger, with more dimensions and more complicated sizes, then this can come in useful. So that was a pretty in-depth look at how to do indexing in a tensor.
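The remaining indexing helpers as a short sketch:

```python
import torch

x = torch.arange(10)
print(torch.where(x > 5, x, x * 2))  # tensor([ 0,  2,  4,  6,  8, 10,  6,  7,  8,  9])

print(torch.tensor([0, 0, 1, 2, 2, 3, 4]).unique())  # tensor([0, 1, 2, 3, 4])
print(x.ndimension())   # 1 (a 5x5x5 tensor would give 3)
print(x.numel())        # 10 elements
```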
Let's move on to the final thing, which is how we reshape a tensor. Again, let's pick a favorite example: we have torch.arange(9), so we have 9 elements, and let's say we want to make this a 3 by 3 matrix. We can do x_3x3 = x.view(3, 3), and if we print x_3x3.shape we get that the shape is 3 by 3; so that's one way. Another way we can do it is x_3x3 = x.reshape(3, 3), and this is also going to work. View and reshape are really very similar, and the differences can be a bit complicated, but in simple terms, view acts on something called contiguous tensors, meaning the tensor is stored contiguously in memory: if we have a matrix, that's really a contiguous block of memory with pointers to every element that together form the matrix. So for view, this memory block needs to be contiguous; for reshape it doesn't really matter if it's not, it's just gonna make a copy. So I guess in very simple terms, reshape is the safe bet, it's always going to work, but you can have some performance loss, whereas if you know how to use view, that's going to be superior in many situations.

I'm going to show you an example of that, and this is going to be a bit more advanced, so if you don't follow it completely, that's fine. If we for example do y = x_3x3.t(), that's going to transpose our 3 by 3. If we print x_3x3 we get 0, 1, 2, 3, 4, 5, 6, 7, 8, and if we print y, which is the transpose of that, we get 0, 3, 6, 1, 4, 7, 2, 5, 8; essentially, if that were one long vector, it would be 0, 3, 6, 1, 4, 7, 2, 5, 8, while originally it was constructed as 0, 1, 2, 3, up to 8. So as we can see right here, for a one-element jump in y there's a three-element jump in the original x that we constructed, or initialized; compared to the original, we're jumping steps in this memory. At least this is how I think about it, and again I'm no expert on this, but my picture is that we're jumping different amounts of steps in the original memory block, so this transposed version is not a contiguous block of memory. So then, if we do something like y.view(9) to get back those nine elements, we're going to get an error which says that at least one dimension spans across two contiguous subspaces, use .reshape(...) instead. So what you can do is use reshape, and that's the safe bet, but you can also do y.contiguous().view(9), and that would work. Again, this is a bit complicated, and even I need to explore this in more detail, but at least you know this is a problem to be cautious of, and a solution is to call contiguous before you do the view; and again, the safe bet is to just use .reshape.
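The view/reshape/contiguous example from above, condensed:

```python
import torch

x = torch.arange(9)
x_3x3 = x.view(3, 3)       # needs contiguous memory
x_3x3 = x.reshape(3, 3)    # works either way (may copy)

y = x_3x3.t()              # transpose: no longer contiguous
# y.view(9) would raise a RuntimeError here
print(y.contiguous().view(9))  # tensor([0, 3, 6, 1, 4, 7, 2, 5, 8])
print(y.reshape(9))            # same result, the safe bet
```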
Moving on to another operation: let's say we have x1 initialized to some random 2 by 5 matrix, and we have x2 = torch.rand((2, 5)) as well. Then, if we want to sort of add these two tensors together, we should use torch.cat, for concatenate. We can concatenate x1 and x2, and it's important right here that we send them in together as a tuple, and then we specify the dimension we want them to be concatenated along. Dimension 0, for example, would add them along the first dimension, and if we take the shape of that we get 4 by 5, which makes sense. If we instead do print(torch.cat((x1, x2), dim=1).shape), then we're adding along the second dimension, so we're gonna get 2 by 10.
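The concatenation example as code:

```python
import torch

x1 = torch.rand((2, 5))
x2 = torch.rand((2, 5))

print(torch.cat((x1, x2), dim=0).shape)  # torch.Size([4, 5]): stacked rows
print(torch.cat((x1, x2), dim=1).shape)  # torch.Size([2, 10]): side by side
```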
Now let's say we have this x1 tensor right here, and what we want to do is unroll it so that we have 10 elements instead of this 2 by 5. How we can do that is z = x1.view(-1), and by sending in this -1 it's gonna magically know that you want to just flatten the entire thing; so if we do print(z.shape) we get 10 elements, which is exactly what we wanted. But let's make this a little bit more complicated: let's say we also have a batch of 64, so x = torch.rand((batch, 2, 5)). What we want to do is keep the batch dimension exactly the same, but we're okay with, you know, putting together the rest of them; we could even have another dimension right here and it would still work, but let's just say we have three in this case. How we can do that is z = x.view(batch, -1), so this is going to keep the batch dimension and do -1 on the rest; if we print z.shape on that, it would be 64 by 10.
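The flattening examples as code:

```python
import torch

batch = 64
x = torch.rand((batch, 2, 5))

z = x.view(-1)         # flatten everything: shape (640,)
z = x.view(batch, -1)  # keep the batch dimension: shape (64, 10)
print(z.shape)
```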
Now let's say you instead wanted to do something like switch the axes: you would still want to keep the batch, but you would want to switch those two other dimensions, so the one showing 2 should show 5 and the one showing 5 should show 2. How we do that is z = x.permute(0, 2, 1), where you specify the position you want each dimension to be at: we want to keep dimension 0 at position 0; for the second position we want the dimension at index 2 in the original, so we put 2 right there; and then we want dimension 1 at position 2, so we write 0, 2, 1. And let's say you wanted to transpose a matrix: that would just be a special case of the permute function. I think we used .t() up there, and that's very convenient since we had a matrix right there, but if you have more dimensions you would use permute; you can also use permute on a matrix, so the transpose is really just a special case of permute. If we print z.shape now, we get 64, 5, and 2.
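The permute example as code:

```python
import torch

x = torch.rand((64, 2, 5))
z = x.permute(0, 2, 1)   # keep the batch, swap the last two axes
print(z.shape)           # torch.Size([64, 5, 2])
# for a plain matrix, m.t() is the special case m.permute(1, 0)
```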
I'm gonna do another example: let's say we have x = torch.arange(10), so this would be 10 in size, and say we want to add a dimension to it, so we want to make it a 1 by 10 vector. How we do that is with unsqueeze: let's say we want to add the 1 at the front, the first position, then we do x.unsqueeze(0), and if we print the shape we get 1 by 10. If you instead want to add it along the other axis, so you want it as a 10 by 1, you would do x.unsqueeze(1), and if we print that shape we get 10 by 1. Now let's say we have x = torch.arange(10).unsqueeze(0).unsqueeze(1), so this shape is 1 by 1 by 10, and let's say we want to, you know, remove one of these ones so that we just have 1 by 10. Perhaps this is a bit unsurprising: we can just do z = x.squeeze of either 0 or 1, so let's do x.squeeze(1), and if we now print that shape, we get a 1 by 10 right there.
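The unsqueeze/squeeze examples as code:

```python
import torch

x = torch.arange(10)                            # shape (10,)
print(x.unsqueeze(0).shape)                     # torch.Size([1, 10])
print(x.unsqueeze(1).shape)                     # torch.Size([10, 1])

x = torch.arange(10).unsqueeze(0).unsqueeze(1)  # shape (1, 1, 10)
print(x.squeeze(1).shape)                       # torch.Size([1, 10])
```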
I think this was a quite long video, but hopefully you were able to follow all of these tensor operations, and hopefully you found it useful. I know there's a lot to take in, a lot of information, but if you really get the foundation correct, then everything is going to become a lot easier for you when you actually start to do some more deep learning related tasks. This is stuff that's boring, that you kind of just need to learn, and when you've got it, everything else becomes easier. So with that said, thank you so much for watching the video; if you have any questions, leave them in the comments below, and I hope to see you in the next video.