How to Optimize Performance in Unreal Engine 5
Summary
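TLDR: This video shows how to use Unreal Insights, Unreal Engine's main profiling tool, to diagnose and fix performance problems in a game. It first demonstrates how to capture a trace and analyze the data with the timing view and flame graphs. It then works through a real example, adding bookmarks and CPU profiler scopes to mark key sections of code and locate the bottleneck. In that example, the developer's time-rewinding project suffers a severe frame-rate drop, and profiling traces it to heavy use of the debug drawing API. The fix is a new visualization component derived from the instanced static mesh component, which raises the frame rate from about 20 FPS to 120 FPS. The video closes by pointing viewers to the code on GitHub and to an earlier video about the rewind project.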
Takeaways
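- 📈 **Introducing Unreal Insights**: Unreal Engine's built-in profiler, Unreal Insights, is powerful and can be used to diagnose and fix performance problems in a game.
- 🔍 **Bottleneck analysis**: A real example shows how to use Unreal Insights to locate and resolve a performance bottleneck.
- 🛠️ **Optimization in practice**: The author walks through a performance problem in his time-rewinding project and fixes the frame-rate drop.
- 📊 **Flame graphs and timing view**: Unreal Insights provides flame graphs and a timing view that help developers understand what the game is doing at runtime.
- ⏯️ **Live tracing and review**: Tracing can be started and stopped at any time, and the captured data can be reviewed afterward.
- 📸 **Bookmarks and screenshots**: While tracing, bookmarks and screenshots can mark important moments for later analysis.
- 📝 **Instrumenting code**: Adding bookmarks and CPU profiler scopes in code makes the performance impact easier to see in Unreal Insights.
- 🔧 **Root cause**: Profiling showed the cause was not what the author first expected; most of the time went to rendering work on background threads.
- 🚀 **A new approach**: The author refactored the code to use an instanced static mesh component instead of the debug drawing pipeline, which greatly improved performance.
- 🎨 **Custom visualization component**: A new visualization component, derived from the instanced static mesh component, handles the state snapshots more efficiently.
- 📦 **Code and community**: The author publishes the code on GitHub and invites others to use and modify it as needed.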
Q & A
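What is Unreal Engine's profiling tool?
-Unreal Engine's profiling tool is Unreal Insights. It is very powerful and helps you diagnose and fix performance problems in your game.
How do you start capturing a trace for Unreal Insights?
-Click the button next to the Trace menu to start tracing. The button turns red while a trace is being captured.
How do you view captured traces in Unreal Insights?
-The Session Browser lists all captured traces. Double-click an entry to launch Unreal Insights and view its performance data.
Which Unreal Insights view shows the frame timeline and performance data?
-The Timing view (timing pane) shows the frame timeline and performance data.
How do you quickly switch to the compact view in Unreal Insights?
-Press the 'C' key to toggle between full view and compact view.
How do you analyze a specific frame in detail in Unreal Insights?
-Left-click the frame you want to analyze, then press the 'F' key to focus the timing view on that frame.
What is a flame graph in Unreal Insights?
-A flame graph is a visualization of profiling data. Read from top to bottom, it shows which functions called which and how long each took, which helps developers find performance bottlenecks.
How do you use Unreal Insights to optimize game performance?
-Use the frame timeline, flame graphs, timers, and counters in Unreal Insights to find the bottleneck, then optimize the code responsible.
How do you capture performance data from the running game?
-Launch the game in Standalone mode (or run a packaged build) and pass a command-line argument such as '-trace=default'.
How do you record and revisit a performance problem with Unreal Insights?
-Use the Bookmark and Screenshot features to mark the moment a problem occurs, then return to that point when analyzing the trace.
How do you view and analyze different threads in Unreal Insights?
-In the Timing view each thread has its own track, including the GPU, the game thread, the render thread, and worker threads. The CPU/GPU menu turns individual tracks on and off.
How do you see the callers and callees of a specific function in Unreal Insights?
-Select the function in the Timers panel to see its inclusive and exclusive time, along with its callers and callees.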
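To make the inclusive versus exclusive distinction concrete, here is a minimal C++ sketch, independent of Unreal, that derives exclusive time from a small call tree. The `TimedScope` type, the function names, and all timings are made up for illustration; they are not values from the trace in the video.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One timed scope in a call tree. Hypothetical type for illustration only.
struct TimedScope {
    std::string name;
    double inclusiveMs;               // time in this scope, including its callees
    std::vector<TimedScope> callees;  // scopes called directly from this one
};

// Exclusive time is the inclusive time minus the inclusive time of direct callees.
double ExclusiveMs(const TimedScope& scope) {
    double calleeTotalMs = 0.0;
    for (const TimedScope& callee : scope.callees) {
        calleeTotalMs += callee.inclusiveMs;
    }
    assert(calleeTotalMs <= scope.inclusiveMs);  // callees cannot outlast their caller
    return scope.inclusiveMs - calleeTotalMs;
}

// Builds a small made-up tree: a 10 ms tick that calls two instrumented functions.
TimedScope MakeExampleTree() {
    return TimedScope{"Tick", 10.0, {
        TimedScope{"UpdateState", 4.0, {}},
        TimedScope{"DrawDebug", 2.5, {}},
    }};
}
```

In this example `Tick` has 10 ms of inclusive time but only 3.5 ms of exclusive time. A scope whose exclusive time is large while its instrumented callees are tiny is usually doing work that has not been instrumented yet.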
Outlines
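😀 Introduction to Unreal Engine's profiling tool
This section introduces Unreal Insights, Unreal Engine's main profiling tool, and shows how to use it to diagnose and fix performance problems in a game. The presenter starts a trace from the Trace control at the bottom of the editor and opens captured traces from the Session Browser. He then shows how to read the flame graph for each thread and how to check per-frame time on the timeline. He also covers the C hotkey for toggling between full and compact view, plus the log and timers panels.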
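Frame times on the timeline are shown in milliseconds, so it helps to be able to convert between frame time and frame rate. The following C++ sketch shows the arithmetic; the function names are illustrative and not part of any Unreal API.

```cpp
#include <cassert>

// Converts one frame's duration in milliseconds to the equivalent frames per second.
double FramesPerSecond(double frameTimeMs) {
    assert(frameTimeMs > 0.0);
    return 1000.0 / frameTimeMs;
}

// Returns the per-frame time budget in milliseconds for a target frame rate.
double FrameBudgetMs(double targetFps) {
    assert(targetFps > 0.0);
    return 1000.0 / targetFps;
}
```

For example, the 52.5 ms frame inspected in the transcript works out to about 19 FPS, and a 120 FPS target leaves roughly 8.3 ms per frame.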
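🔍 Bottleneck analysis and a worked example
Using a live example, the presenter shows how to analyze and resolve a bottleneck with Unreal Insights. In his time-rewinding project, the frame rate drops sharply when the debug visualization is turned on. He marks key operations with bookmarks and CPU profiler scopes and uses the flame graph to find the bottleneck. He then adds extra instrumentation to the engine source code to confirm the cause.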
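The idea behind a CPU profiler scope can be sketched in plain C++ as an RAII object that times the enclosing block and reports it when the block exits. This is a stand-alone illustration, not the author's code: `Recorder`, `ScopedEvent`, and the two instrumented functions are hypothetical stand-ins. In an Unreal project the equivalent would typically be the engine's `TRACE_CPUPROFILER_EVENT_SCOPE` and `TRACE_BOOKMARK` macros; the video does not show the exact calls used, so treat that mapping as an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// A finished scope: its name, nesting depth, and measured duration.
struct ScopeRecord {
    std::string name;
    int depth;
    double durationMs;
};

// Collects finished scopes. Hypothetical stand-in for a trace backend.
struct Recorder {
    std::vector<ScopeRecord> records;
    int currentDepth = 0;
};

// RAII scope: starts timing on construction, records itself on destruction.
class ScopedEvent {
public:
    ScopedEvent(Recorder& recorder, std::string name)
        : recorder_(recorder),
          name_(std::move(name)),
          depth_(recorder.currentDepth++),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedEvent() {
        const auto end = std::chrono::steady_clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(end - start_).count();
        assert(recorder_.currentDepth > 0);
        --recorder_.currentDepth;
        recorder_.records.push_back(ScopeRecord{name_, depth_, ms});
    }

    ScopedEvent(const ScopedEvent&) = delete;
    ScopedEvent& operator=(const ScopedEvent&) = delete;

private:
    Recorder& recorder_;
    std::string name_;
    int depth_;
    std::chrono::steady_clock::time_point start_;
};

// Hypothetical instrumented functions, named after the ones mentioned in the video.
void DebugDrawSnapshots(Recorder& recorder) {
    ScopedEvent scope(recorder, "DebugDrawSnapshots");
    // ... the debug drawing work would go here ...
}

void TickRewindComponent(Recorder& recorder) {
    ScopedEvent scope(recorder, "TickRewindComponent");
    DebugDrawSnapshots(recorder);
}
```

Calling `TickRewindComponent` records the inner scope first and the outer scope second, with nesting depths 1 and 0. That nesting is what a flame graph draws as stacked bars.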
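🛠️ Fixing the performance problem
With the bottleneck identified, the presenter takes a different approach. He abandons the debug drawing pipeline and creates a new component derived from the instanced static mesh component. With it, the frame rate stays at 120 FPS even with the timeline visualization turned on. He also uses Unreal's modeling tools to build the simple mesh used by the new visualization.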
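The transcript describes the new component as sampling the stored snapshots by elapsed time and movement, then rotating each sample to face the next one. A minimal stand-alone C++ sketch of that idea follows. Everything here is an assumption made for illustration: the `Vec3`, `Snapshot`, and `InstanceTransform` types, the thresholds, and the choice to keep a sample when either threshold is met. The author's actual component works with Unreal transforms on an instanced static mesh component and is not shown in the source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

struct Snapshot {
    double timeSeconds;
    Vec3 position;
};

// Position plus a unit forward vector; a stand-in for a full transform.
struct InstanceTransform {
    Vec3 position;
    Vec3 forward;
};

static double Distance(const Vec3& a, const Vec3& b) {
    const double dx = b.x - a.x, dy = b.y - a.y, dz = b.z - a.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Keeps a snapshot when enough time has passed OR the actor has moved far
// enough since the last kept snapshot (the OR is an assumption), then points
// each kept instance at the next one.
std::vector<InstanceTransform> BuildInstances(const std::vector<Snapshot>& snapshots,
                                              double minSeconds,
                                              double minDistance) {
    assert(minSeconds >= 0.0 && minDistance >= 0.0);
    std::vector<InstanceTransform> instances;
    const Vec3 defaultForward{1.0, 0.0, 0.0};
    const Snapshot* lastKept = nullptr;
    for (const Snapshot& s : snapshots) {
        if (lastKept != nullptr &&
            s.timeSeconds - lastKept->timeSeconds < minSeconds &&
            Distance(lastKept->position, s.position) < minDistance) {
            continue;  // too close in both time and space to be worth drawing
        }
        instances.push_back(InstanceTransform{s.position, defaultForward});
        lastKept = &s;
    }
    // Orient each instance toward its successor. The last instance, or one
    // whose successor sits at the same position, reuses the previous direction.
    for (std::size_t i = 0; i < instances.size(); ++i) {
        if (i + 1 < instances.size()) {
            const Vec3& a = instances[i].position;
            const Vec3& b = instances[i + 1].position;
            const double length = Distance(a, b);
            if (length > 1e-9) {
                instances[i].forward = Vec3{(b.x - a.x) / length,
                                            (b.y - a.y) / length,
                                            (b.z - a.z) / length};
                continue;
            }
        }
        if (i > 0) {
            instances[i].forward = instances[i - 1].forward;
        }
    }
    return instances;
}
```

The design point is that the per-frame cost becomes one pass over the snapshots plus one batch of instance transforms, instead of one debug draw call per point and per line.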
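📚 Code sharing and wrap-up
Finally, the presenter wraps up the tour of Unreal Insights and asks viewers to support the video through YouTube's like and notification features. He notes that the code is shared on GitHub for other developers to learn from and use, points to his earlier video on the time-rewinding project, and signs off with a friendly goodbye.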
Keywords
💡Unreal Engine
💡Performance problems
💡Unreal Insights
💡Frame rate (FPS)
💡Debugging
💡Flame graph
💡Time rewinding
💡Performance bottleneck
💡Instanced Static Mesh Component
💡Debug drawing API
💡Performance optimization
Highlights
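Unreal Insights, Unreal Engine's built-in profiling tool, is powerful and can be used to diagnose and fix game performance problems.
The Unreal Insights Session Browser lists all captured traces.
Flame graphs show the function calls on each thread and how long each one took.
The timing view can be zoomed with the mouse wheel and panned by clicking and dragging to inspect different time ranges.
The CPU/GPU menu provides filtering options for hiding threads you do not need to see.
The C hotkey toggles between full view and compact view.
The log view lists important events with timestamps; double-clicking an entry jumps to that position on the timeline.
Timers are a useful way to analyze what happened in a frame, showing inclusive and exclusive time.
The snapshot feature saves the editor's continuously running trace buffer as a trace, even if you were not actively tracing.
Bookmarks and screenshots can be added while tracing to record the state at a specific moment.
For more representative numbers, run the game in Standalone mode or as a packaged build and trace that with Unreal Insights.
Adding bookmarks and CPU profiler scopes marks specific parts of the code in the Unreal Insights timing view.
Profiling showed that the debug drawing API causes a large frame-rate drop when the visualization is enabled.
Adding extra instrumentation to the Unreal Engine source code gave a deeper look at the exact cause of the bottleneck.
Moving from the debug drawing pipeline to an instanced static mesh component greatly improved performance and fixed the frame-rate drop.
With the instanced static mesh solution, the frame rate stays at 120 FPS with the timeline visualization enabled.
The new visualization component goes through Unreal's standard mesh drawing pipeline, which greatly reduces per-frame draw time.
The author publishes all related code on GitHub for anyone interested to learn from and use.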
Transcripts
welcome friends I'm new all game
projects have to deal with performance
issues budgets get blown frame targets
are missed and FPS tanks today's video
has two parts first we're going to look
at unreal's Main profiling tool unreal
insights is super powerful I'll show you
how it works and I'll teach you how to
diagnose and fix perf problems in your
game second we're going to look at a
real world example I've got a nasty perf
bottleneck in my time rewinding project
why is my game running at 20 frames per
second we're going to profile it to find
out and then we're going to fix it let's
get started unreal insights and tracing
functionality now live down in the
bottom right of the editor if you're in
an older version of the engine you'll
need to look around in the tools menu to
launch unreal insights we click on trace
and then go to Unreal insights session
browser this will pull up the session
browser which shows all of the traces
that we've captured if we've never
captured a trace it's completely empty
to capture a trace we click the
button next to the trace menu to start
tracing that will turn red which shows
us that we're tracing and if we look at
the session browser we now see that we
have an entry its status is live and the
file size is rapidly increasing I'm
going to go ahead and stop tracing
before I look at my actual Trace file
with that done we can double-click on the
entry and this will launch Unreal
insights the insights application shows
us a ton of performance information
across a bunch of different panes and we will
go through everything but we're mostly
going to be looking at this timeline
view or this timing view here in the
center within the timing pane you can
zoom in and out by using the mouse wheel
and you'll notice right away that my
editor has been running for quite a
while because I went and got lunch and
left it up you can pan left and right by
clicking and dragging and you can zoom
back in by scrolling in within this view
we have a bunch of different tracks so
at the top here I have a GPU track I
have a track for my game thread track
for the rhi thread the render thread a
bunch of worker threads basically every
thread that is running in my editor
process and for each track we have a
flame graph so if you've never used a
flame graph before it basically goes top
to bottom and we can look here at the
top and we can see okay we have a frame
on the game thread within that frame we
ticked the world we drew the viewport
slate ticked that slate tick called
slate draw Windows which turns around
and called slate prepass so on and so
forth so we can basically drill into a
bunch of deeper calls and as long as
those calls are instrumented we can see
what they were doing and how much time
they spent in this top pane we have
timings for all the different frames in
our capture the bigger the bar the
longer the frame so if I wanted to look
at this one fat frame I can see it took
52 and a half milliseconds which is about 19
frames per second I can left click on
that to jump to it and then if I press
the F key it's going to focus my
timeline view to look at that frame
specifically so here I can see I
actually spent 26 milliseconds pumping
Windows messages this is probably well I
don't know what was going on there I
would need to drill in but this is a
really long frame along the top view of
timing we have a timeline which is the
time since the process started we have a
bunch of dropdowns which allow us to
turn tracks on and off so for some
reason I didn't want to look at say the
game thread I can just turn that off and
turn it back on note for something to
show up in this list I have to be
capturing it and we'll talk about
how you configure what you're capturing
a little bit later the CPU GPU menu also
gives you groupings to filter so if I
want to turn off all the background
workers I can make them disappear like
that other has some additional things
that we can look at plugins has some
other special things as well but
basically this is just allowing us to
turn things on and off for filtering
another hotkey that is super useful is
you can tap C to toggle between full
View and compact view so if I wanted to
get a compact picture of everything that
was going on I can tap C and you can
look for basically big chunks of color
that might be interesting to drill into and to
go back to the full view we just tap C
again in addition to the timing view
down at the bottom here we have a view
of logs and any important events and if
we double-click it will take us to that
position in the timeline so that we can
drill in over on the right we have
counters and timers I find timers are
generally the most useful if I want to
look at what happened in a frame I can
click on any of the tracks in my flame
graph
that will select it in the timer view I
can look at inclusive and exclusive time
and I can also get a full picture of who
called whatever my selection is so
that's callers we can see here that
engine Loop tick called frame which
called slate tick which called slate
draw Windows which called slate prepass
and I can also look at callees which is
everything that the thing I have
selected calls in this case slate
prepass has widget invalidate called
14 times but these timings are all
tiny 4 and a half microseconds so there's
a whole bunch of stuff going on in slate
prepass that isn't instrumented and
if I wanted to investigate this what I
would begin doing is instrumenting that
code path and seeing the details of
what is going on a couple other cool
things that we can do here I can take a
snapshot there is a continuous running
buffer in the editor and when we click
this button it will snap that buffer and
create a trace just with those details
so this is useful if like let's say I'm
playing the game I have a big perf hitch
but I wasn't actively tracing I can take
a snapshot to see what was there and
then you know open it up just like our
first trace and we could see what was
going on few other things tracing
there's a whole bunch of different
channels that we can choose what we want
to look at right now I am using the
default so CPU GPU some regions
bookmarks some other things while a
trace is
running we can capture bookmarks and
screenshots which I'll show you so first
let me do a bookmark which I can do from
this menu and I can also do a screenshot
from this menu or press Ctrl+F9
I'm going to stop my Trace I'm going to
come back to the session browser we're
going to look at our most recent
trace and on this timeline
view once we get back to where we were
you will see hey there's a bookmark here
with a timestamp and that will also show
up in our log view here so that was when
I pressed that bookmark button and that
gives me a clear visual indicator of
what was going on there I also have the
screenshot right here on this timeline
so if I hit a bug or a performance issue
and I wanted to take a screenshot so
that I could remember what was going on
at that point in time we can do that
with Ctrl+F9 while tracing and
similarly that is also down here in our
log view so this allows us to really
explore and get a full view of what
unreal was doing and we can look
retroactively we can make changes we can
measure things over
time other things that are useful to
know running in the editor we have a
whole bunch of stuff running that is not
our game if we want really
representative perf of just what
our game is doing we can either build a
build cook and package and then run
unreal insights against that or we can
run against Standalone mode in both
cases what we need to do is pass an
additional argument as a command line
parameter the structure of that command
line argument I believe is -trace=default
for what I was capturing
before but you can also provide a comma
separated list of arguments like CPU GPU
for the individual tracks that we want
to capture so I'm going to try
-trace=default and then I am going to switch
this to Standalone game we're going to
run that and with any luck we should see
that session pop up and we
do
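The argument format described here, either the default preset or a comma-separated channel list, can be sketched as a small C++ helper. This is only an illustration of the string format mentioned in the video: the helper name is hypothetical, and the channel names `cpu` and `gpu` are the ones the speaker mentions; check the names available in your engine version.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds the trace command-line argument from a list of channel names.
// An empty list falls back to the "default" preset mentioned in the video.
std::string BuildTraceArgument(const std::vector<std::string>& channels) {
    if (channels.empty()) {
        return "-trace=default";
    }
    std::string argument = "-trace=";
    for (std::size_t i = 0; i < channels.size(); ++i) {
        assert(!channels[i].empty());  // a blank name would leave a stray comma
        if (i > 0) {
            argument += ',';
        }
        argument += channels[i];
    }
    return argument;
}
```

Passing no channels yields `-trace=default`, and passing `cpu` and `gpu` yields `-trace=cpu,gpu`, which is the string to add to the Standalone or packaged game's launch parameters.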
so right here I've got unreal insights
running I also have the game coming up
my game is now up I've lost my mouse
but you can see as the game runs
we are capturing a trace just like we
did in the editor but now it is just my
game running it is not any of the editor
because we launched in Standalone mode
so that is pretty
cool when we close the game we will see
the trace stop now let's look at a real
world example I've been prototyping a
game with time rewinding mechanics the
way this works is each actor has a
component it keeps track of its
transforms and other state over time
when I rewind time I simply play that
state back in Reverse if you're
interested in the full details I have a
whole video on it but that's the gist of
what's going on I also built a debug
visualization feature that allows me to
see all of that state unfortunately when
I turn it on and a bunch of state has
accrued my frame rate absolutely tanks
I'm really abusing the debug drawing API
when I was looking at fixing this issue
I was actually really surprised about
why it's slow so it's a great example of
why profiling is important to figure out
what's going on I started by
instrumenting my code I added bookmarks
which again gives that little text
visualization in the timeline view of
unreal insights and by adding CPU
profiler Scopes which will show up on my
flame graph so I've got a bookmark for
every time I turn my Snapshot
visualization on and off which is that
debug feature and I've also got a
bookmark whenever I begin or end a
rewind I've also gone through and
instrumented all of the individual
functions in my component so I can
confirm that I understand why things are
slow I was expecting the slowest part to
be in debug draw snapshots because what
we're doing is we're going through all
of that state that we've captured we're
calling draw debug point over and over
and over every frame and draw debug line
connecting all of those points over and
over every frame here is what the
profile actually looks like exactly as
we expected the moment we turn on the
debug visualization our frame rate tanks
the density of the work that's happening
in our flame graph goes way down and on
our frame timings we can see that we go
from running around 200 frames per
second dropping all the way down to 10
frames per second if I select one of
these frames and press F we can drill in
and see what's going on the total time
on the game thread is about 110
milliseconds interestingly most of that
time is spent here in game thread wait
for tasks about 90 milliseconds of that
110 milliseconds the rest of the time is
being spent ticking our rewind
components and calling debug draw
snapshots which is what we expected to
be slow some of these calls are up to 2
milliseconds for a single actor this
would be completely unacceptable in a
real game but the 14 milliseconds we
spent ticking does not explain why is
the frame 110 milliseconds long my
original intuition was I'm going to come
in to debug draw snapshots I'm going to
you know Kick the Can down the road a
little bit not really optimize this very
hard use the lifetime properties on
those APIs to make fewer calls every
frame but this is a pretty good hint
that that's not going to work and the
reason that's not going to work is the
way draw debug point and draw debug line
actually work is they write down some
State and that state actually gets
consumed as part of rendering work and
in fact if we scroll down and look at
other threads we can see that
background worker number 14 in this case
is where all that work is happening 94
milliseconds of that 110 millisecond
frame is happening on a background
thread if I shift to the compact View
and zoom out a little bit what we will
see is there are these big blocks of
work happening on various background
threads and that's because the renderer
is doing a bunch of background work that
is moving around every single frame but
this is really what the bottleneck is
that is causing uh our frame rate to dip
so hard
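The numbers quoted for this frame show why optimizing only the game-thread tick could not have fixed the frame rate. The following C++ sketch does the arithmetic; the struct and function names are illustrative, and the bound assumes the amount of debug geometry being rendered stays the same.

```cpp
#include <cassert>

// Game-thread timings for one frame, in milliseconds.
struct FrameBreakdown {
    double totalMs;         // whole frame on the game thread
    double waitForTasksMs;  // time blocked waiting on other threads
    double tickMs;          // time spent ticking components
};

// Fraction of the frame the game thread spends waiting rather than working.
double WaitFraction(const FrameBreakdown& frame) {
    assert(frame.totalMs > 0.0);
    return frame.waitForTasksMs / frame.totalMs;
}

// Upper bound on frame rate if ticking cost nothing but the wait stayed the same.
double BestCaseFpsWithFreeTick(const FrameBreakdown& frame) {
    const double remainingMs = frame.totalMs - frame.tickMs;
    assert(remainingMs > 0.0);
    return 1000.0 / remainingMs;
}
```

With the figures from the transcript (about 110 ms total, 90 ms waiting, 14 ms ticking), roughly 82% of the frame is wait time, and even a free tick would leave the game near 10 FPS.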
unfortunately the details of what we are
doing there is not
instrumented my next step was to go into
the engine source code add some
instrumentation rebuild from Source
repoint my project on top of that and
this is probably overkill for the common
case but I really wanted to confirm
that I understood what was going on here
here's what a similar profile looks like
with that instrumentation added we can
see that all the time is spent in draw
debug Primitives and underneath that we
are calling some functions in FBatchedElements
so drawing points drawing
thick lines in the code if we just
scroll through there's an RDG event for
adding debug primitives that then calls
into the element batcher there
is a whole bunch of logic that is
running here in FBatchedElements draw
and this is where all of the actual work
for debug drawing is happening so our
current approach is totally doomed I had
to do something completely different
visualization V2 Grand reveal note we're
running at a locked 120 frames per
second when I turn the visualization on
we remain at a locked 120 frames per
second the way I solved this performance
problem is I moved off of the debug
drawing pipeline entirely I made a new
rewind visualization component that
derives from the instanced static mesh
component there's only one function here
that is really important it's set
instances from snapshots and I
refactored my debug draw snapshots
function on the rewind component to
Simply turn around and call set
instances from snapshots on the
visualization component the way set
instances from snapshots works is
instead of running through all of the
state snapshots and directly calling
debug drawing functions it instead runs
through those snapshots samples them
pulls out a bunch of transforms based on
how much time has passed and how much
movement has occurred since the last
snapshot it then does a little bit of
math to adjust the rotation of those
transforms so that they look at one
another it then updates all of the
instance transforms on that instanced
static mesh component parent at this
point all we're doing is using the
standard unreal mesh drawing pipeline
which is very good at drawing a lot of
meshes and for the mesh I just used
unreal's modeling tools to take a torus
and an arrow and uh make a very super
dumb simple mesh kind of looks like a
Sonic ring but it makes me happy here's
what the profile looks like with the new
solution the main thing to notice is
that our frame rate remains completely
locked at 120 FPS we don't have a
big degradation even when we turn on the
timeline visualization you can see
that's happening because our bookmark is
still here if I pick an arbitrary frame
and drill in what we're going to see is
that one that frame time is way way down
uh total frame time now is on the order
of about 9 milliseconds versus the 100
milliseconds that we saw before and the
time we spent ticking also went way down
previously we were ticking for about 13
milliseconds
and some of our biggest offenders in the
rewind component were taking 2
milliseconds each to tick now they're
running between about 10 and 20 microseconds
on a fat one most of this is
still in visualized timelines if I
wanted to have an absolute ton of actors
there are some pretty obvious things I
could do to improve this further but for
now our total time ticking for the
entire frame for all actors in my world
is less than a millisecond we have
plenty of frame budget left and we are
no longer spending 50 to 100
milliseconds on that debug draw
Primitives RDG pass so this is solved as
usual all my code will be up on GitHub
feel free to do whatever you want with
it that's all for today's Whirlwind tour
of unreal insights if you found this
useful you'd be doing me a solid by
doing the standard YouTube engagement
shenanigans slam that like button ring
that Bell and if you want to learn more
about my rewind project definitely check
out that video until next time be kind
to one another
Peace