Just a quick update: I managed to track down the pathfinding issue mentioned in the previous article. It was down to a missing check for invalid navlinks during neighbour exploration. It wouldn't have been particularly noticeable, as there are plenty of valid navlinks that would be preferred during exploration, but when the path was completely blocked and the pathfinder was forced to keep exploring, it would eventually hit this missing check, which only happened at higher (non-leaf) node layers and allowed paths to jump over obstacles.
That's the value of creating debug tools! A visual demonstration of masses of exploration in the nav volume doesn't just look cool, it turns up issues like this.
I've still got a few small optimisations I want to make to the pathfinder, then I'll be turning my attention to a proper flying path follower that smooths and optimises agent movement even further than the already optimised paths do. Ultimately this will lead to work on swarming and avoidance features, which are building towards a tech demo I've had in mind for some time.
Keep an eye on Github for updates :)
So you're interested in using the power of a Large Language Model to help out while programming in Unreal, but due to the usual game industry NDAs and security concerns, you can't have your code shipped out to external services? Well, in this article I'm going to run you through the process of setting up your development environment so you can run an LLM locally and integrate it into Jetbrains Rider to provide code generation, analysis, recommendations, and autocomplete.
I'm using Jetbrains Rider for this for a few reasons. Visual Studio 2022 has a well-integrated solution for Github CoPilot, but the third-party extensions for local LLMs I looked at are suspiciously black-box and, frankly, I don't trust them. Visual Studio Code does have some good extensions available, but using VSCode with Unreal is... not great.
Jetbrains Rider has good extensions all round, is generally a nice IDE for working with Unreal, and is now free for non-commercial use.
Prerequisites
Firstly, you will need to have Jetbrains Rider installed. Download it here.
Secondly, you will need to install Ollama. Download it here.
Follow the instructions to get Rider and Ollama installed. You should be able to open your Unreal project solution in Rider and build and run the game/editor. You should have the Ollama server running (you'll see a llama icon in your system tray).
To check your Ollama setup, open a command prompt and type 'ollama'. You should see the usage information. Then type 'ollama list' to confirm that you have no models installed yet.
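On a fresh install you'll just see the header row, something like this (the exact formatting may vary slightly between Ollama versions):
C:\>ollama list
NAME    ID    SIZE    MODIFIED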
Continue.dev Install
Now you can install Continue.dev; download the plugin for Jetbrains IDEs here.
Once installed, you should have the Continue icon in your sidebar. Click it to open it, and you should see the chat window. Now we need to set up our local AI models.

Local LLM Install
There are three different models you need to set up in Continue. One is the chat model, which is the one you will interact with like ChatGPT/CoPilot, by asking questions. Another is the autocomplete model, which inspects the text before and after your cursor and suggests text to complete what you're typing. The last is an embedding provider, which, in layman's terms, is a model that parses the text of your code project and transforms it in a way that lets the main chat model reason about your code.
Ollama will try to run your LLMs on your GPU if there is sufficient VRAM, but will fall back to your system RAM. Models run faster on the GPU, as you'd expect. Which models you use is going to depend massively on your hardware resources. I am running an nVidia 4070 with 12GB of VRAM, which is quite restrictive. I can run a small autocomplete model (~3GB) and Unreal with no problems, but a larger 7 billion parameter model is going to eat ~9GB of VRAM. As such, you will need to think about which models you want running. To inspect what models you have running, and where they are resident, you can run ollama ps on the command line:
C:\>ollama ps
NAME                    ID              SIZE      PROCESSOR    UNTIL
qwen2.5-coder:1.5b      6d3abb8d2d53    3.3 GB    100% GPU     17 minutes from now
qwen2.5-coder:latest    2b0496514337    6.0 GB    100% GPU     13 minutes from now
As you can see here, I have a couple of models loaded, both on the GPU. This isn't leaving much VRAM for Unreal and Windows. If I'm not using the chat functionality and just want autocomplete, I might want to stop the larger model, which I can do like so:
C:\>ollama stop qwen2.5-coder:latest
C:\>ollama ps
NAME                    ID              SIZE      PROCESSOR    UNTIL
qwen2.5-coder:1.5b      6d3abb8d2d53    3.3 GB    100% GPU     14 minutes from now
As you can see, the 6GB model has been unloaded.
Ideally, you have a beefy GPU like a 4090/5090, with enough VRAM to happily accommodate several models and your game engine. Now, let's get into setting up the models we want to integrate into Rider.
Embedding Model
We will use the Nomic model for embeddings (parsing our codebase). To install it, open a command prompt and run:
C:\>ollama pull nomic-embed-text
pulling manifest
pulling 970aa74c0a90... 100% ▕████████████████████████████████████████████████████████▏ 274 MB
pulling c71d239df917... 100% ▕████████████████████████████████████████████████████████▏ 11 KB
pulling ce4a164fc046... 100% ▕████████████████████████████████████████████████████████▏ 17 B
pulling 31df23ea7daa... 100% ▕████████████████████████████████████████████████████████▏ 420 B
verifying sha256 digest
writing manifest
success
All done. You can check which models you have installed with:
C:\>ollama list
NAME                       ID              SIZE      MODIFIED
nomic-embed-text:latest    0a109f422b47    274 MB    About a minute ago
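If you'd like an optional sanity check that the embedding model responds before going any further, you can call Ollama's local REST API directly (this assumes Ollama's default port of 11434; the prompt text is arbitrary):
C:\>curl http://localhost:11434/api/embeddings -d "{\"model\": \"nomic-embed-text\", \"prompt\": \"hello world\"}"
If it's working, you'll get back a JSON object containing an 'embedding' array of floats.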
Chat Model
This is the model you will ask questions of. Ideally you want the biggest model your hardware can support, but your hardware also needs to run Rider and Unreal, so compromises will be required. For this example, I'm going to use the qwen2.5-coder model, with 7 billion parameters and 4-bit quantization. This will consume 8.9GB of memory. Install the model like this:
C:\>ollama pull qwen2.5-coder
pulling manifest
pulling 60e05f210007... 100% ▕████████████████████████████████████████████████████████▏ 4.7 GB
pulling 66b9ea09bd5b... 100% ▕████████████████████████████████████████████████████████▏ 68 B
pulling e94a8ecb9327... 100% ▕████████████████████████████████████████████████████████▏ 1.6 KB
pulling 832dd9e00a68... 100% ▕████████████████████████████████████████████████████████▏ 11 KB
pulling d9bb33f27869... 100% ▕████████████████████████████████████████████████████████▏ 487 B
verifying sha256 digest
writing manifest
success
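Before wiring it into Continue, you can give the chat model a quick smoke test straight from the command line (the prompt here is just an example; bear in mind this loads the full 7B model into memory, so you may want Unreal closed while you try it):
C:\>ollama run qwen2.5-coder "Write a C++ function that clamps a float between 0 and 1"
The model should stream a response and then return you to the prompt.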
Autocomplete Model
Now for the autocomplete model. I'm currently using a smaller version of qwen2.5-coder, the 1.5 billion parameter variant. It still provides good suggestions, but is more responsive, which is important for an autocomplete model. It uses 3.3GB of memory.
C:\>ollama pull qwen2.5-coder:1.5b
pulling manifest
pulling 29d8c98fa6b0... 100% ▕████████████████████████████████████████████████████████▏ 986 MB
pulling 66b9ea09bd5b... 100% ▕████████████████████████████████████████████████████████▏ 68 B
pulling e94a8ecb9327... 100% ▕████████████████████████████████████████████████████████▏ 1.6 KB
pulling 832dd9e00a68... 100% ▕████████████████████████████████████████████████████████▏ 11 KB
pulling 152cb442202b... 100% ▕████████████████████████████████████████████████████████▏ 487 B
verifying sha256 digest
writing manifest
success
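At this point, 'ollama list' should show all three models installed. Note that these are the on-disk sizes, which differ from the loaded sizes reported by 'ollama ps' (your timestamps will obviously differ):
C:\>ollama list
NAME                       ID              SIZE      MODIFIED
qwen2.5-coder:1.5b         6d3abb8d2d53    986 MB    2 minutes ago
qwen2.5-coder:latest       2b0496514337    4.7 GB    5 minutes ago
nomic-embed-text:latest    0a109f422b47    274 MB    10 minutes ago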
Continue.dev Configuration
Back in Rider, open the configuration panel in Continue.

Then click 'Open configuration file'. This will open Continue's config.json in Rider.

Embedding Provider Config
Add the following section to the config file. This tells Continue to use the nomic-embed-text model from the local Ollama install for embeddings.
"embeddingsProvider": {
"maxBatchSize": 32,
"provider": "ollama",
"model": "nomic-embed-text"
}
Chat Model Config
Next, set up your chat models. You can have multiple models configured here, which will be accessible from a dropdown in the chat window. Add an entry for our local qwen2.5-coder model under the models section.
"models": [
{
"title": "qwen2.5-coder:latest",
"provider": "ollama",
"model": "qwen2.5-coder:latest",
"apiBase": "http://localhost:11434",
"apiKey": ""
}
]
Autocomplete Model Config
Now configure our local qwen2.5-coder:1.5b model as the autocomplete model:
"tabAutocompleteModel": {
"title": "AutocompleteModel",
"provider": "ollama",
"model": "qwen2.5-coder:1.5b",
"apiBase": "http://localhost:11434",
"apiKey": ""
}
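Putting it all together, the relevant parts of config.json end up looking roughly like this (your file will likely contain other default entries, such as context providers and custom commands, which you can leave as they are):
{
  "models": [
    {
      "title": "qwen2.5-coder:latest",
      "provider": "ollama",
      "model": "qwen2.5-coder:latest",
      "apiBase": "http://localhost:11434",
      "apiKey": ""
    }
  ],
  "tabAutocompleteModel": {
    "title": "AutocompleteModel",
    "provider": "ollama",
    "model": "qwen2.5-coder:1.5b",
    "apiBase": "http://localhost:11434",
    "apiKey": ""
  },
  "embeddingsProvider": {
    "maxBatchSize": 32,
    "provider": "ollama",
    "model": "nomic-embed-text"
  }
}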
Test Codebase Indexing
Now, open the 'More (...)' panel in Continue. Click 'Click to re-index' to test the embedding setup. You should see it parse your codebase and say 'Indexing complete'.

Chat Usage
Now, let's try out the chat model. Go back to the chat window in Continue. I have a simple prototype project loaded, which includes my 3D pathfinding plugin, so let's ask about adjusting the pathfinding heuristics: I type 'Where can I adjust the pathfinding heuristics?' and then press Ctrl+Enter.
Ctrl+Enter is important here, as it includes the @codebase context. Essentially, this looks at your question, finds the most appropriate files in your workspace, and includes them in the chat context. I've included the output below.
Note that I've expanded the Context section here, so you can see that it has included the relevant pathfinding-related files from my plugin.
The LLM has correctly identified the function that implements the pathfinding heuristics, and has even suggested a new heuristic type and shown me how to implement it. Pretty neat!

Autocomplete Testing
Testing autocomplete is as simple as inserting your cursor anywhere in a code file. Here I'm in the GetLifetimeReplicatedProps function, and it correctly suggested a DOREPLIFETIME macro, which I can just accept by pressing Tab.
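To give a concrete idea of what that completion looks like, here's a rough sketch of the kind of function involved (the actor class and the replicated Health property are purely illustrative, not from my project):
void AMyActor::GetLifetimeReplicatedProps(TArray<FLifetimeProperty>& OutLifetimeProps) const
{
    // Requires #include "Net/UnrealNetwork.h" for the DOREPLIFETIME macro
    Super::GetLifetimeReplicatedProps(OutLifetimeProps);

    // With the cursor on the line below, the autocomplete model suggests the standard
    // replication macro for the property; pressing Tab accepts it
    DOREPLIFETIME(AMyActor, Health);
}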

Summary
So, that's the basic setup to get local LLMs integrated into your Unreal workflow with Jetbrains Rider. Once you start using the tools, you can experiment with different models and sizes, and find more ways to enhance your workflow.