What is Mass, And Why Should You Care? Part 1
- Category: Unrealengine
Introduction
One of the major new features to appear with the release of Unreal Engine 5.0 is Mass, which was showcased extensively in the Matrix/CitySample demo. Many people have asked what exactly Mass is, and why it was created. In this article I'm going to create a simple gameplay scenario and implement it first with the traditional GameFramework classes, using Actors and Components, and then in Mass.
The article is aimed at novice programmers who do not have experience of ECS patterns, or perhaps more experienced programmers who are interested to see how Epic have implemented their ECS in practical terms. Code samples here are not complete and are trimmed for brevity and ease of explanation. You can find the complete source and demo project here:
midgen/UEPerfDemos: (github.com)
The Scenario - Many Random Movers
We'll start with a simple AI-related gameplay scenario. Say we want to have 1000 entities in our game, which move around randomly.
Each entity will need the following data:
- FVector Location - The current location of the entity
- FVector MoveTarget - The location the entity is trying to move to
- FVector Velocity - The velocity of the entity
Each frame we perform the following logic:
- If an entity is within distance X of its move target, it selects a new random move target.
- The velocity is calculated as the unit vector from our current location to move target, multiplied by our speed.
- The location is updated by adding our velocity multiplied by the frame delta time.
To facilitate the comparisons, we'll put these functions into a helper namespace so we can easily reference the logic from both standard Unreal code, and Mass.
UEPerfFunction.h
namespace UEPerf_Functions
{
void UpdateMoveTarget(const FVector& InCurrentLocation, FVector& OutMoveTarget);
void UpdateVelocity(const FVector& InMoveTarget, const FVector& InLocation, FVector& OutVelocity);
void UpdateMovement(const FVector& InVelocity, const float InDeltaSeconds, FVector& OutLocation);
}
UEPerfFunctions.cpp
namespace PerfFunction_Private
{
static constexpr float MoveCompleteRange = 50.f;
static constexpr float MoveCompleteRange2 = MoveCompleteRange * MoveCompleteRange;
static constexpr float MoveTargetRange = 5000.f;
static constexpr float MoveSpeed = 1000.f;
}
void UEPerf_Functions::UpdateMoveTarget(const FVector& CurrentLocation, FVector& MoveTarget)
{
if(FVector::DistSquared2D(CurrentLocation, MoveTarget) < PerfFunction_Private::MoveCompleteRange2)
{
MoveTarget.X = FMath::RandRange(-PerfFunction_Private::MoveTargetRange, PerfFunction_Private::MoveTargetRange);
MoveTarget.Y = FMath::RandRange(-PerfFunction_Private::MoveTargetRange, PerfFunction_Private::MoveTargetRange);
}
}
void UEPerf_Functions::UpdateVelocity(const FVector& InMoveTarget, const FVector& InLocation, FVector& OutVelocity)
{
OutVelocity = (InMoveTarget - InLocation).GetSafeNormal2D() * PerfFunction_Private::MoveSpeed;
}
void UEPerf_Functions::UpdateMovement(const FVector& Velocity, const float DeltaSeconds, FVector& OutLocation)
{
OutLocation+= (Velocity * DeltaSeconds);
}
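To get a feel for the numbers, here is a minimal standalone sketch of the same three steps applied to a single entity with a fixed time step. It is plain C++ rather than engine code: `Vec2`, `ComputeVelocity` and `FramesToReach` are hypothetical stand-ins, and it deliberately leaves out the random retargeting. At 1000 units per second and 30fps, an entity covers roughly 33 units per frame.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for FVector, restricted to the XY plane.
struct Vec2
{
    double X = 0.0;
    double Y = 0.0;
};

// Same values as the constants in the listing above.
constexpr double MoveCompleteRange = 50.0;
constexpr double MoveSpeed = 1000.0;

// Unit vector from Location to Target scaled by MoveSpeed (zero if they coincide).
inline Vec2 ComputeVelocity(const Vec2& Target, const Vec2& Location)
{
    const double DX = Target.X - Location.X;
    const double DY = Target.Y - Location.Y;
    const double Length = std::sqrt(DX * DX + DY * DY);
    if (Length <= 1e-8)
    {
        return Vec2{};
    }
    return Vec2{DX / Length * MoveSpeed, DY / Length * MoveSpeed};
}

// Advances one entity with a fixed time step until it is within
// MoveCompleteRange of Target, and returns the number of frames taken.
inline int FramesToReach(Vec2 Location, const Vec2& Target, double DeltaSeconds, int MaxFrames)
{
    assert(DeltaSeconds > 0.0);
    for (int Frame = 0; Frame < MaxFrames; ++Frame)
    {
        const double DX = Target.X - Location.X;
        const double DY = Target.Y - Location.Y;
        if (DX * DX + DY * DY < MoveCompleteRange * MoveCompleteRange)
        {
            return Frame;
        }
        const Vec2 Velocity = ComputeVelocity(Target, Location);
        Location.X += Velocity.X * DeltaSeconds;
        Location.Y += Velocity.Y * DeltaSeconds;
    }
    return MaxFrames;
}
```

Calling `FramesToReach(Vec2{0.0, 0.0}, Vec2{1000.0, 0.0}, 0.1, 100)` steps the entity 100 units per frame towards a target 1000 units away.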
The Unreal GameFramework Implementation
So now we implement what I guess we have to call a naive implementation. We need 1000 entities in the world so we create a basic actor and then a component that will contain the required data and tick function.
PerfActorBase.h
UCLASS(Blueprintable)
class APerfActorBase : public AActor
{
GENERATED_BODY()
public:
explicit APerfActorBase(const FObjectInitializer& InInitialiser);
UPROPERTY(BlueprintReadOnly)
TObjectPtr<UPerfActorComponent> PerfActorComponent;
};
PerfActorBase.cpp
namespace PerfActor_Private
{
static const FName PerfActorComponentName = TEXT("PerfActorComponent");
}
APerfActorBase::APerfActorBase(const FObjectInitializer& InInitialiser)
: Super(InInitialiser)
, PerfActorComponent(CreateDefaultSubobject<UPerfActorComponent>(PerfActor_Private::PerfActorComponentName))
{
}
PerfActorComponent.h
UCLASS(ClassGroup=(Custom), meta=(BlueprintSpawnableComponent))
class UEPERFDEMOS_API UPerfActorComponent : public UActorComponent
{
GENERATED_BODY()
public:
UPerfActorComponent();
virtual void TickComponent(float DeltaTime, ELevelTick TickType, FActorComponentTickFunction* ThisTickFunction) override;
private:
virtual void BeginPlay() override;
FVector Location{FVector::ZeroVector};
FVector MoveTarget{FVector::ZeroVector};
FVector Velocity{FVector::ZeroVector};
};
PerfActorComponent.cpp
UPerfActorComponent::UPerfActorComponent()
{
PrimaryComponentTick.bCanEverTick = true;
}
void UPerfActorComponent::BeginPlay()
{
Super::BeginPlay();
// Make sure our location is synced to the parent actor
Location = GetOwner()->GetActorLocation();
}
void UPerfActorComponent::TickComponent(float DeltaTime, ELevelTick TickType,
FActorComponentTickFunction* ThisTickFunction)
{
Super::TickComponent(DeltaTime, TickType, ThisTickFunction);
SCOPE_CYCLE_COUNTER(STAT_PerfDemoUpdateMoveTarget)
{
UEPerf_Functions::UpdateMoveTarget(Location, MoveTarget);
}
SCOPE_CYCLE_COUNTER(STAT_PerfDemoUpdateVelocity)
{
UEPerf_Functions::UpdateVelocity(MoveTarget, Location, Velocity);
}
SCOPE_CYCLE_COUNTER(STAT_PerfDemoUpdateMovement)
{
UEPerf_Functions::UpdateMovement(Velocity, DeltaTime, Location);
}
GetOwner()->SetActorLocation(Location);
}
In the tick, we call our three logic functions, UpdateMoveTarget, UpdateVelocity, and UpdateMovement, then update the actor location. We have scope cycle counters around each function to facilitate profiling.
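If you haven't met scoped counters before, the general idea is an RAII timer: an object that notes the time when it is constructed and records the elapsed time when it is destroyed. The sketch below shows that idea in plain C++. It is not the engine's implementation, and `TimingStat`, `ScopedTimer` and `TimeCalls` are hypothetical names. A timer of this kind measures everything between its declaration and the end of the scope it is declared in.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical accumulator, standing in for one named stat.
struct TimingStat
{
    std::int64_t CallCount = 0;
    std::chrono::nanoseconds Total{0};
};

// RAII timer: starts timing on construction and records the elapsed time
// into Stat when it goes out of scope.
class ScopedTimer
{
public:
    explicit ScopedTimer(TimingStat& InStat)
        : Stat(InStat)
        , Start(std::chrono::steady_clock::now())
    {
    }

    ~ScopedTimer()
    {
        Stat.Total += std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now() - Start);
        ++Stat.CallCount;
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    TimingStat& Stat;
    std::chrono::steady_clock::time_point Start;
};

// Times Work() NumCalls times. Each timer lives only for its own loop
// iteration, so each call is recorded separately.
template <typename WorkFn>
void TimeCalls(TimingStat& Stat, int NumCalls, WorkFn Work)
{
    assert(NumCalls >= 0);
    for (int i = 0; i < NumCalls; ++i)
    {
        ScopedTimer Timer(Stat);
        Work();
    }
}
```

After `TimeCalls(Stat, 1000, Work)`, `Stat.CallCount` holds the number of calls and `Stat.Total` the accumulated time, which is the same kind of information the stat table reports.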
All good! Now we just have to put 1000 of these in our scene and set them going. If you look at the demo project, I've made a simple spawner that creates them automatically at runtime, rather than having to place them manually.
Let's run the game, and see it in action! Note that in the demo project, the actors are invisible to avoid the rendering bottlenecks getting in the way. You can check they are in fact moving around by inspecting the actor transforms in the outliner while the game is running. Bring up the console and enter Stat PerfDemo to see our timings. Here are the timings on my PC.
Counter | CallCount | Inclusive Avg |
UEPerfDemo UpdateMoveTarget | 1000 | 18ms |
UEPerfDemo UpdateVelocity | 1000 | 18ms |
UEPerfDemo UpdateMovement | 1000 | 18ms |
Uh oh, we've already blown 30fps, and we're not even rendering anything.
We can have a look at our functions, but there's no obvious room for optimisation in the logic itself. Let's reduce the number of entities until we hit 30fps. On my aging PC I need to bring it down to 500 entities, which is still nowhere near good enough as we still have the rest of the game to build.
Pretty disappointing. Let's try implementing this in Mass and see what it can offer us.
The Mass Implementation
In the interests of keeping this article size manageable, we'll not go into the details of Entity Component System architectures here, rather just show in practical terms how to take the logic we created using the GameFramework and Actors, and move it into the Mass framework.
First up, we need somewhere to put our data. In Mass, we don't use Components, we use Fragments, so let's create Fragments for our MoveTarget, Velocity, and Location data:
PerfDemoMassFragments.h
USTRUCT()
struct FPerfDemoMassFragment_Location : public FMassFragment
{
GENERATED_BODY()
FVector Location;
};
USTRUCT()
struct FPerfDemoMassFragment_MoveTarget : public FMassFragment
{
GENERATED_BODY()
FVector MoveTarget;
};
USTRUCT()
struct FPerfDemoMassFragment_Velocity : public FMassFragment
{
GENERATED_BODY()
FVector Velocity;
};
Now, Mass allows you to create preset collections of Fragments, known as Traits, to make configuring entities a bit easier, so let's make one:
PerfDemoMassTraits.h
UCLASS(meta=(DisplayName="PerfDemoRandomMovement"))
class UMassRandomMovementTrait : public UMassEntityTraitBase
{
GENERATED_BODY()
public:
virtual void BuildTemplate(FMassEntityTemplateBuildContext& BuildContext, const UWorld& World) const override;
};
PerfDemoMassTraits.cpp
void UMassRandomMovementTrait::BuildTemplate(FMassEntityTemplateBuildContext& BuildContext,
const UWorld& World) const
{
BuildContext.AddFragment<FPerfDemoMassFragment_Location>();
BuildContext.AddFragment<FPerfDemoMassFragment_MoveTarget>();
BuildContext.AddFragment<FPerfDemoMassFragment_Velocity>();
}
Next, we need to execute our logic. In Mass, rather than implementing tick functions on Actors, we create Processors, which act on the Fragments we just created.
PerfDemoMassProcessors.h
UCLASS()
class UPerfDemoMoveTargetProcessor : public UMassProcessor
{
GENERATED_BODY()
UPerfDemoMoveTargetProcessor();
protected:
virtual void ConfigureQueries() override;
virtual void Execute(FMassEntityManager& EntityManager, FMassExecutionContext& Context) override;
FMassEntityQuery EntityQuery;
};
UCLASS()
class UPerfDemoVelocityProcessor : public UMassProcessor
{
GENERATED_BODY()
UPerfDemoVelocityProcessor();
protected:
virtual void ConfigureQueries() override;
virtual void Execute(FMassEntityManager& EntityManager, FMassExecutionContext& Context) override;
FMassEntityQuery EntityQuery;
};
UCLASS()
class UPerfDemoMovementProcessor : public UMassProcessor
{
GENERATED_BODY()
UPerfDemoMovementProcessor();
protected:
virtual void ConfigureQueries() override;
virtual void Execute(FMassEntityManager& EntityManager, FMassExecutionContext& Context) override;
FMassEntityQuery EntityQuery;
};
PerfDemoMassProcessors.cpp
UPerfDemoMoveTargetProcessor::UPerfDemoMoveTargetProcessor()
: EntityQuery(*this)
{
ExecutionFlags = (int32)(EProcessorExecutionFlags::All);
ExecutionOrder.ExecuteInGroup = UE::Mass::ProcessorGroupNames::Tasks;
ExecutionOrder.ExecuteAfter.Add(UE::Mass::ProcessorGroupNames::Behavior);
bRequiresGameThreadExecution = true;
}
void UPerfDemoMoveTargetProcessor::ConfigureQueries()
{
EntityQuery.AddRequirement<FPerfDemoMassFragment_Location>(EMassFragmentAccess::ReadOnly);
EntityQuery.AddRequirement<FPerfDemoMassFragment_MoveTarget>(EMassFragmentAccess::ReadWrite);
}
void UPerfDemoMoveTargetProcessor::Execute(FMassEntityManager& EntityManager, FMassExecutionContext& Context)
{
EntityQuery.ForEachEntityChunk(EntityManager, Context, [](FMassExecutionContext& Context)
{
const int32 NumEntities = Context.GetNumEntities();
const TConstArrayView<FPerfDemoMassFragment_Location> LocationList = Context.GetFragmentView<FPerfDemoMassFragment_Location>();
const TArrayView<FPerfDemoMassFragment_MoveTarget> MoveTargetList = Context.GetMutableFragmentView<FPerfDemoMassFragment_MoveTarget>();
for (int32 i = 0; i < NumEntities; ++i)
{
const FPerfDemoMassFragment_Location& Location = LocationList[i];
FPerfDemoMassFragment_MoveTarget& MoveTarget = MoveTargetList[i];
SCOPE_CYCLE_COUNTER(STAT_PerfDemoUpdateMoveTarget_Mass)
{
UEPerf_Functions::UpdateMoveTarget(Location.Location, MoveTarget.MoveTarget);
}
}
});
}
//**********************************
UPerfDemoVelocityProcessor::UPerfDemoVelocityProcessor()
: EntityQuery(*this)
{
ExecutionFlags = (int32)(EProcessorExecutionFlags::All);
ExecutionOrder.ExecuteInGroup = UE::Mass::ProcessorGroupNames::Tasks;
ExecutionOrder.ExecuteAfter.Add(UE::Mass::ProcessorGroupNames::Behavior);
bRequiresGameThreadExecution = true;
}
void UPerfDemoVelocityProcessor::ConfigureQueries()
{
EntityQuery.AddRequirement<FPerfDemoMassFragment_MoveTarget>(EMassFragmentAccess::ReadOnly);
EntityQuery.AddRequirement<FPerfDemoMassFragment_Location>(EMassFragmentAccess::ReadOnly);
EntityQuery.AddRequirement<FPerfDemoMassFragment_Velocity>(EMassFragmentAccess::ReadWrite);
}
void UPerfDemoVelocityProcessor::Execute(FMassEntityManager& EntityManager, FMassExecutionContext& Context)
{
EntityQuery.ForEachEntityChunk(EntityManager, Context, [](FMassExecutionContext& Context)
{
const int32 NumEntities = Context.GetNumEntities();
const TConstArrayView<FPerfDemoMassFragment_MoveTarget> MoveTargetList = Context.GetFragmentView<FPerfDemoMassFragment_MoveTarget>();
const TConstArrayView<FPerfDemoMassFragment_Location> LocationList = Context.GetFragmentView<FPerfDemoMassFragment_Location>();
const TArrayView<FPerfDemoMassFragment_Velocity> VelocityList = Context.GetMutableFragmentView<FPerfDemoMassFragment_Velocity>();
for (int32 i = 0; i < NumEntities; ++i)
{
const FPerfDemoMassFragment_MoveTarget& MoveTarget = MoveTargetList[i];
const FPerfDemoMassFragment_Location& Location = LocationList[i];
FPerfDemoMassFragment_Velocity& Velocity = VelocityList[i];
SCOPE_CYCLE_COUNTER(STAT_PerfDemoUpdateVelocity_Mass)
{
UEPerf_Functions::UpdateVelocity( MoveTarget.MoveTarget, Location.Location, Velocity.Velocity);
}
}
});
}
//**********************************
UPerfDemoMovementProcessor::UPerfDemoMovementProcessor()
: EntityQuery(*this)
{
ExecutionFlags = (int32)(EProcessorExecutionFlags::All);
ExecutionOrder.ExecuteInGroup = UE::Mass::ProcessorGroupNames::Tasks;
ExecutionOrder.ExecuteAfter.Add(UE::Mass::ProcessorGroupNames::Behavior);
bRequiresGameThreadExecution = true;
}
void UPerfDemoMovementProcessor::ConfigureQueries()
{
EntityQuery.AddRequirement<FPerfDemoMassFragment_Velocity>(EMassFragmentAccess::ReadOnly);
EntityQuery.AddRequirement<FPerfDemoMassFragment_Location>(EMassFragmentAccess::ReadWrite);
}
void UPerfDemoMovementProcessor::Execute(FMassEntityManager& EntityManager, FMassExecutionContext& Context)
{
EntityQuery.ForEachEntityChunk(EntityManager, Context, [](FMassExecutionContext& Context)
{
const int32 NumEntities = Context.GetNumEntities();
const TConstArrayView<FPerfDemoMassFragment_Velocity> VelocityList = Context.GetFragmentView<FPerfDemoMassFragment_Velocity>();
const TArrayView<FPerfDemoMassFragment_Location> LocationList = Context.GetMutableFragmentView<FPerfDemoMassFragment_Location>();
const float WorldDeltaTime = Context.GetDeltaTimeSeconds();
for (int32 i = 0; i < NumEntities; ++i)
{
FPerfDemoMassFragment_Location& Location = LocationList[i];
const FPerfDemoMassFragment_Velocity& Velocity = VelocityList[i];
SCOPE_CYCLE_COUNTER(STAT_PerfDemoUpdateMovement_Mass)
{
UEPerf_Functions::UpdateMovement( Velocity.Velocity, WorldDeltaTime, Location.Location);
}
}
});
}
You will notice some interesting snippets of code in the processor constructors, which we'll cover in more detail in part 2 of this series. For now, just be aware that we've explicitly indicated that the processors must run on the game thread. This isn't necessary for the work we're doing in these processors, but is done to keep the comparison benchmarks simple for this article. Note that as of the release of Unreal 5.1, Mass Processors default to running off the game thread.
Now we're done with the code, we can fire up the editor and create the entity definition. We do this by creating a new MassEntityConfig data asset, and adding the PerfDemoRandomMovement trait we created earlier to it.
We add a Mass spawner to the level, and configure it to spawn 1000 entities. We'll use an EQS spawn generator to create a grid of spawn points. The config should look something like this.
Now, we're ready to hit play and see what using Mass has done for us.
Counter | CallCount | Inclusive Avg |
UEPerfDemo UpdateMoveTarget_Mass | 1000 | 0.06ms |
UEPerfDemo UpdateVelocity_Mass | 1000 | 0.06ms |
UEPerfDemo UpdateMovement_Mass | 1000 | 0.06ms |
We're executing *exactly* the same code, exactly the same number of times, but it's taking a fraction of the time. Doing a little testing, we can hit 30fps with 100,000 entities. We've gone from having 500 entities moving around randomly in our 33ms frame, to 100,000.
Where Does The Speed Come From?
In the GameFramework implementation, the data for each entity is stored in an ActorComponent, which, like all UObjects in Unreal, is simply new-ed onto the heap wherever the allocator sees fit. When we tick all the components, each time we have to stall the CPU to fetch the component data into cache, do the calculations, go look for the next component's data, stall the CPU again, and so on. In fact, our artificial test scenario probably runs a little faster than it would in a complete game, as we only have these entities in the scene, so the chances of a cache hit are fairly high, although I did try to make the scene more representative by putting a few dummy components on each actor.
In Mass, all Fragments are stored in contiguous arrays (as with most ECS implementations, some kind of chunked array, so you don't get hit with big copies as N grows). When Mass processors execute, they simply iterate over the container: as they're processing data for entity N, the data for entities N+1, N+2 and N+3 is either already in the same cache line, or the prefetcher can have it ready in another line. We largely eliminate the cache misses and the subsequent stalls while data is fetched, which is where we lose the time in the GameFramework implementation.
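The chunked-array idea can be sketched in a few lines of plain C++. This is a toy model under my own assumptions, not how Mass lays out its archetype chunks internally: `Chunk`, `ChunkedStorage` and the capacity of 256 are all hypothetical. The point is that each fragment type sits in its own contiguous array inside a chunk, the processing loop walks those arrays linearly, and growing the storage only adds a new chunk rather than copying existing entity data.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Vec2
{
    double X = 0.0;
    double Y = 0.0;
};

// Hypothetical chunk size, chosen only for illustration.
constexpr std::size_t ChunkCapacity = 256;

// One chunk holds a fixed-capacity contiguous array per fragment type.
struct Chunk
{
    std::size_t Num = 0;
    std::array<Vec2, ChunkCapacity> Locations{};
    std::array<Vec2, ChunkCapacity> Velocities{};
};

class ChunkedStorage
{
public:
    // Adds an entity and returns its global index. When the last chunk is
    // full a new chunk is allocated; existing entity data is never moved.
    std::size_t AddEntity(const Vec2& Location, const Vec2& Velocity)
    {
        if (Chunks.empty() || Chunks.back()->Num == ChunkCapacity)
        {
            Chunks.push_back(std::make_unique<Chunk>());
        }
        Chunk& Last = *Chunks.back();
        Last.Locations[Last.Num] = Location;
        Last.Velocities[Last.Num] = Velocity;
        ++Last.Num;
        return NumEntities++;
    }

    // The "processor": visits chunk after chunk and walks the contiguous
    // arrays inside each one from front to back.
    void Integrate(double DeltaSeconds)
    {
        for (const std::unique_ptr<Chunk>& ChunkPtr : Chunks)
        {
            Chunk& Current = *ChunkPtr;
            for (std::size_t i = 0; i < Current.Num; ++i)
            {
                Current.Locations[i].X += Current.Velocities[i].X * DeltaSeconds;
                Current.Locations[i].Y += Current.Velocities[i].Y * DeltaSeconds;
            }
        }
    }

    const Vec2& GetLocation(std::size_t Index) const
    {
        assert(Index < NumEntities);
        return Chunks[Index / ChunkCapacity]->Locations[Index % ChunkCapacity];
    }

    std::size_t GetNumChunks() const { return Chunks.size(); }

private:
    std::vector<std::unique_ptr<Chunk>> Chunks;
    std::size_t NumEntities = 0;
};
```

With a capacity of 256, storing 1000 entities takes four chunks, and `Integrate` touches memory in the same order it is laid out.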
There are also important multithreading benefits to using the ECS pattern, which will be covered in the next part of this series.
For more detailed information on where this performance comes from, have a search for Data Oriented Design, and Entity Component Systems. The PDF What Every Programmer Should Know About Memory is a great read if you want to go into detail.
The Trade-Off
You may now be thinking, why don't we pile all our code into Mass to see these performance gains? But alas, there is a trade-off here.
In our GameFramework implementation, all the entities are Actors that exist in our world. Mass entities do not exist in the world; they live in an alternate reality that is generally referred to in Mass as the Simulation. Data in the simulation is completely separate from the UObject world that Unreal levels are built around. To understand how we bridge the gap from the simulation to the level, keep an eye out for part 2.
Tip Of The Day - UProperty Chains
- Category: Unrealengine
Today's tip relates to another common, but particularly insidious error in Unreal code that can end up costing huge amounts of frustration and lost time.
Consider the code below:
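// Declared first so that ATestActor can hold it by value.
USTRUCT()
struct FMyStruct
{
    GENERATED_BODY()
    UPROPERTY()
    UMyObject* Object = nullptr;
};
UCLASS()
class ATestActor : public AActor
{
    GENERATED_BODY()
    // Reflected: the engine can see Data1.Object.
    UPROPERTY()
    FMyStruct Data1;
    // Not reflected: the engine cannot see Data2.Object.
    FMyStruct Data2;
};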
You have a struct which contains a UObject pointer (yes, this should be TObjectPtr<> in UE5), and an actor that has two member instances of that struct, however, only one of them is marked UProperty.
Let's say you spawn this actor in your game, create a UObject, and set Data1::Object and Data2::Object to point to it. Then, you destroy the UObject. What happens?
Answer : The engine will zero out Data1::Object, but Data2::Object will remain as a dangling pointer.
Why? Because UProperties need to exist in an unbroken chain from the top-level object (usually the world). Because ATestActor::Data2 doesn't have a UPROPERTY() specifier, that chain is broken, and Data2::Object does not exist as a property in the engine, so it can't deal with nulling it when the object is destroyed.
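The effect is easy to reproduce in a toy model. The sketch below is plain C++ and is not how the engine's reflection or garbage collection is implemented: `ToyReferenceRegistry`, `ToyStruct` and `ToyActor` are hypothetical, and registering a slot stands in for marking a member with UPROPERTY(). The registry can only clear the pointers it was told about.

```cpp
#include <cassert>
#include <vector>

struct ToyObject
{
    int Id = 0;
};

// Hypothetical registry standing in for reflection data: it only knows
// about pointer slots that were explicitly registered with it.
class ToyReferenceRegistry
{
public:
    void RegisterSlot(ToyObject** Slot)
    {
        assert(Slot != nullptr);
        Slots.push_back(Slot);
    }

    // Nulls every registered slot that points at Destroyed and returns
    // how many slots were cleared.
    int NullReferencesTo(const ToyObject* Destroyed)
    {
        int Cleared = 0;
        for (ToyObject** Slot : Slots)
        {
            if (*Slot == Destroyed)
            {
                *Slot = nullptr;
                ++Cleared;
            }
        }
        return Cleared;
    }

private:
    std::vector<ToyObject**> Slots;
};

struct ToyStruct
{
    ToyObject* Object = nullptr;
};

struct ToyActor
{
    ToyStruct Data1; // plays the role of the member marked UPROPERTY()
    ToyStruct Data2; // plays the role of the member without UPROPERTY()

    explicit ToyActor(ToyReferenceRegistry& Registry)
    {
        // Only Data1's pointer is made visible to the registry.
        Registry.RegisterSlot(&Data1.Object);
    }

    // The registry stores addresses of our members, so copying is disabled.
    ToyActor(const ToyActor&) = delete;
    ToyActor& operator=(const ToyActor&) = delete;
};
```

If both `Data1.Object` and `Data2.Object` point at the same `ToyObject` and `NullReferencesTo` is called for it, only `Data1.Object` is cleared. `Data2.Object` still holds the old address, which is exactly the dangling pointer described above.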
This leads to all sorts of potential issues: the worst kind of non-obvious, intermittent, inconsistent issues with object lifetimes, cooking, unexpected garbage collection, and potential memory stomps.
Best of all, you won't get warned that this is a problem! So make sure you keep an eye on your property chains, and if you have any weird intermittent property issues, double check them.
Tip Of The Day - GC Stalls & UObject Churn
- Category: Unrealengine
Common scenario: You're in a playtest session of your multiplayer Unreal title, it's all going well, but suddenly...."has the server died?", "I'm stuck!", "I'm lagging"....and then "Oh, no, it's fine again now".
You'll more than likely come across this scenario several times during the early phases of a project's development. Particularly working in AI, this is something I've had to fix more than a few times, as there's a common culprit that comes from Unreal's AI system.
If you're taking profiling snapshots of your server builds, it's easy enough to see the problem: You're churning UObjects, that is, something is creating new UObjects every frame, and every 30 seconds or so, the garbage collector is having to clear up the thousands of accumulated objects, which gives you this huge hitch on the server.
In order to find these issues, there's a useful option in Editor Preferences -> Show Frame Rate And Memory. This gives you a little extra information on the title bar of your editor window, showing framerate, memory usage, and critically, the number of UObjects.
Normally, when you're playing your game, you'll see this object count climbing rapidly as the game loads and everything initialises, and then fluctuate up and down, but settle in a steady range over time.
If you have a churn problem, however, the number will be flying upwards like the altimeter in a crashing plane, but the other way! It's very easy to spot, so having this editor option turned on by default and catching it locally will help prevent this ever being a submitted issue. Obviously there are other tools, such as Stat UObject, to find more detail.
To then find out what the offending UObject is, you can simply stick a function breakpoint in UObject::UObject(). Once you hit it, continue a few times to make sure you haven't picked up something that is legitimately being constructed that frame.
Chances are, if you have AI in your game, the culprit will be UAITask_MoveTo. This is the object created when the behaviour tree system (with Gameplay Tasks enabled) requests a new move. Early on in the development of your title, it's likely the AI configuration is pretty rough, the navmesh isn't always up to date, the world isn't marked up correctly, and for whatever reason, you have an AI agent that has got into a position where it can't move, and every time it requests a new move, it fails. Every frame.
Ideally, your AI never gets properly stuck like this, but even so, it's a good idea to put the MoveTo task into a Selector node with a Delay after it, so if a move does fail, the agent waits a second, before trying again. No more new UObjects every single frame!
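To see why the delay matters, here is a small standalone simulation of a permanently stuck agent. It is a sketch in plain C++, not behaviour tree code: `StuckAgent` and `CountMoveRequests` are hypothetical, and incrementing a counter stands in for the engine allocating a new move task object.

```cpp
#include <cassert>

// Hypothetical agent whose move requests always fail immediately.
struct StuckAgent
{
    double RetryDelaySeconds = 0.0; // 0 means retry on every frame
    double TimeUntilRetry = 0.0;
    int MoveRequestsCreated = 0;

    void Tick(double DeltaSeconds)
    {
        assert(DeltaSeconds > 0.0);
        TimeUntilRetry -= DeltaSeconds;
        if (TimeUntilRetry > 0.0)
        {
            return; // still waiting after the last failed move
        }
        // Stands in for creating a new move task object.
        ++MoveRequestsCreated;
        // The move fails straight away, so wait before trying again.
        TimeUntilRetry = RetryDelaySeconds;
    }
};

// Runs a stuck agent for NumFrames frames and returns how many move
// requests it created in that time.
inline int CountMoveRequests(double RetryDelaySeconds, int NumFrames, double DeltaSeconds)
{
    StuckAgent Agent;
    Agent.RetryDelaySeconds = RetryDelaySeconds;
    for (int Frame = 0; Frame < NumFrames; ++Frame)
    {
        Agent.Tick(DeltaSeconds);
    }
    return Agent.MoveRequestsCreated;
}
```

Over 40 frames of 0.25 seconds each, the agent with no delay creates a request on every frame, while the agent with a one second delay creates one request per second of simulated time.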
Obviously the issue won't always be AITasks. UI is a common culprit, as it's very easy to make interfaces that inadvertently create new UMG widget containers every frame. Fortunately though, whatever is causing UObject churn, it's dead easy to identify, track down, and fix!
A Fresh Coat Of Paint
- Category: Site
Well, it's not really, but this is as much time and effort as I can bring myself to spend wrestling with Joomla. I'm sure there are better CMS around these days but this has been going a long time and I don't fancy the thought of migrating it just now. Can't say I'm a huge fan of modern web design where everything is acres of dead space and all images with no text. Trying to find a new theme that focuses on text articles is a mission these days.
In other news, I didn't manage to make it to DevCom sadly due to Covid, but the online coverage was very good and I got to watch more talks than I would have been able to attend in person, so not all bad. A real shame I wasn't able to catch up with friends in Düsseldorf, but there'll be a next time.
In other news, I have a few draft articles I've scribbled down over the years but never published, so I'm spending a little time editing them and may publish a few. May be useful to some people!
Fun fact: the most popular articles on my site are all about the boids implementation I wrote. It's a popular little programming exercise, and I suspect a lot of the readers are students. I think my implementation is pretty optimal so all good!
My Unreal plugins have been updated to UE 5.0 a while ago, but having dabbled with 5.1 at work recently, I'm pretty sure I'm going to need to update them all in the next few weeks.