The online racing simulator
Progress on Multithreading
(12 posts, closed, started )
Progress on Multithreading
This thread is for people who like technical insights into LFS development. It is probably too technical for most people, but I know that some do want to read this type of information.

As mentioned on the 20th Anniversary page I have been working on a multithreading system. As described on that page, we have been trying a 1000Hz physics update rate. It uses a bit more CPU than the old 100Hz physics rate (nothing like 10 times as much - but more information about that on another thread which I had to close).

The new graphics system also uses a lot more CPU power, primarily because of the shadow maps. People might think that graphics is all done by the GPU, but actually there is a lot of work for the CPU, sending instructions to the GPU (which objects to draw, using which textures and shaders, etc).

So with physics and graphics both consuming more CPU power, there is a great need for multithreading. LFS has always been a single threaded program, which is a good way to make a program when your CPU only has one core. But modern CPUs have more than one core, meaning they can run two or more threads simultaneously. Like two cooks building a complex cake, if they communicate a little, but not too much, they can get that cake made more quickly.

The physics and graphics have requirements that mean they need to run at a different rate. Graphics should run ideally at the refresh rate of your monitor, and physics should run at some high rate that eliminates stability issues. So for a typical monitor, graphics will run at 60Hz (it draws the world 60 times per second) and our experimental physics runs at 1000Hz (time steps of 1ms per physics update).

This suggests an approach where the graphics runs on one thread and the physics runs on another. This way, a CPU with at least two cores can get on with rendering graphical frames and processing physics updates at the same time. There are various advantages. For example, a few cars leaving the pits, but not seen, will not cause your graphical frame rate to drop below your monitor's refresh rate. There are other ways to do multithreading but I am aiming for this traditional approach.

That's great but it's actually a very complicated process to try to get the code separated onto two threads, and never looking at the same data at the same time. As a simple example, the graphics might be half way through drawing the cars, when the physics updates some of the cars, so they glitch forwards relative to other cars. Or worse, the graphics may be drawing an image from one car, the "view car" and that player goes to the pits, suddenly the "view car" is invalid data and the graphics code accesses the memory that is no longer allocated. Or maybe the user clicks a button and some data is added to the multiplayer packet system, at exactly the same moment while the other thread is writing to that system, causing data corruption. I can't describe all the examples but there are so many ways to go wrong unless great care is taken to avoid such issues.

The general principle for the coordination of graphics and physics threads, is that the 'game code' that does the physics, can freely continue most of the time. But when the graphics code wants to draw a frame, it requests a 'snapshot' of the current game state from the physics system. It may have to wait a small amount of time (something less than 1ms) for the physics to finish its current update, then the physics must stop for a tiny slice of time while the snapshot is obtained. That contains all the positions and rotations of the cars, wheels, suspension, physics objects, smoke, skids etc. Anything that can be changed by the game code. Let's call this a 'sync point'.

I made a few false starts on separating LFS as required. One of the earliest approaches was to split the entire program into a physics section and graphics section. But it got out of hand and I reverted around two days of coding. Then I tried something where only the game setup screen and in in-game code would be multithreaded. Running into too many problems, I reverted this attempt as well. There was some other attempt as well but it's a bit of a blur, I can't really remember. But I did learn things while making these false starts. In the end I have had success with an approach that only starts a physics thread when LFS enters the in-game mode. Even the game setup screen stays unchanged. This approach causes the minimum disruption, only changing the code that actually needs to be changed.

I started by preparing parts of the code for this separation. For example, the game timer used to be part of the main loop. I realised there were now different requirements for the graphics timer and the game timer, and that the game timer would need to be run on the game thread, so that code was duplicated and specialised into separate systems. Other code was moved around and restructured in preparation for the multithreading. The part of the code that actually processes multiplayer packets and does physics updates was moved into a separate file, along with associated variables. Various functions needed to be moved over into that separate game file. It's confusing work but the aim was to move all the "game and physics" code and variables into that new file, while leaving the graphics code in the original file.

With the separated code seeming to work as before, it was time to set an option to run that code on a thread. At first that thread was extremely protected from collisions with the graphics thread. It had to be protected so much that no graphics and physics would actually be running at the same time. However, game code could continue to run after the world was drawn, while post-processing was taking place and while LFS submitted the finished image to be applied to the screen. That was enough to cause some thread-related issues and cause me to develop some protection code, that can catch some bugs before they happen. For example, if the game code does some D3D work, for example deleting the mesh of a physics object that stops moving, or loading a car, while the graphics system is also asking D3D to do something, there is a problem and this will sometimes result in corruption. Not all the time, but randomly, when the timing happens to be wrong. So my protection code was designed to make sure the game code never does D3D work unless it has entered a "critical section" that allows it to do so (when the graphics code is in a moment when it is not drawing). Each time a critical section is used, there is a performance issue, because one thread has to wait for the other. So I devise ways to avoid their use when possible. For example when the game wants to delete a car or physics object, now it doesn't immediately delete it. Instead, it adds that object to a list of objects to be deleted. The graphics code will then delete that at the sync point that occurs every frame, just before getting the snapshot for the next frame.

With most issues sorted in that first, non-concurrent, version of actual multithreaded code, it was time to try to get the game code (multiplayer and physics) running simultaneously while the graphical images are drawn. It was very encouraging to see that this worked, to some extent, on the first attempt. But it displayed the expected thread-related issues. I remember the car flicking forward and backward a bit, and a strange effect in the mirror, where the displayed image in the mirror actually showed the back of the mirror, as the mirror view was somehow detached from the actual location of the car. As I say, entirely expected and so I set to work on the "snapshot" code.

The first stage of the snapshot code was to create a list of all the car objects visible at the sync point that occurs every graphical frame. And in all these car objects, store the position and rotation of the main car body. Then make all the graphical code refer only to that list of cars, and only use the "snapshot copy" of position and rotation, and never to the "physics version" of those values. This had the desired effect of stopping the cars jerking around and stabilising the mirrors. Then I saw that there was a tear in the wheel of the car I was viewing. This must be as some values of that wheel were changing, while it was being drawn. To me as a programmer, these bugs are actually quite exciting as they show the threads are really happening. Well I did know that something was going to go wrong with the wheels as they had not been included in the snapshot, and they have their own rotation, position, suspension compression, contact point, contact patch deflection, etc. In fact much more than the actual car body. I started to extract the necessary wheel variables into the snapshot, but reverted my code when that all got out of hand. The following morning I simply included a snapshot of the entire wheel structures, in the car's snapshot, and the wheel problem disappeared.

It seemed that the next step is to do the same for the physics objects, but then I started to encounter some rarer bugs that came up occasionally when I entered the garage screen. Most of the time things worked, but when a bug comes up it's best to sort it out immediately, as with these thread-related issues the bug is not guaranteed to come up every time. The trick is also to recognise a bug as an entire "category" of bugs, and go through the program removing all of those same types of bugs. Another one was when I tried to exit using the X button on the window, and the game froze. I thought it might be a deadlock but actually graphics and physics threads were still running. After some looking around I decided it was probably something to do with adding an "exit game" packet while other packets were being added, or removed, or something... I don't really know as it happened earlier and obviously wasn't something that would be caught in the debugger. So I started work on another protection system which will catch bugs of that type, when the graphics or UI code adds multiplayer packets, that now can only be done in a relevant critical section.

So that's where I am at the moment, the next thing is to take care of the "view car" which is set by the graphics code but can might be changed mid-draw with disastrous consequences. Presumably the solution is to store the view car pointer as part of the snapshot.

As for results, I have not done any performance testing yet. I'm more interested in improving the code structure and removing all the known ways things can go wrong, and also the unknown ways, each time they come up. I got some idea that good things are happening when my frame rate was very limited by being in the debug version (nowhere near monitor refresh rate) and I did a couple of /ai calls. Then overall CPU usage went up but the frame rate stayed constant, as the AI cars were not on screen. It was encouraging to see that the physics processing had increased without affecting the frame rate.

I'm interested in detailed performance monitoring. I'd like to do a timeline for the two threads with some accurate timings, so there are two lines across the screen, and you can see the length of time the graphics thread spent drawing, and for the physics thread, how much time it spent on all those small physics steps and audio updates (which are currently done on the physics thread). And the sync point. So we can start to see how things are really happening, how steady the physics steps are and so on.

That's all for now. I'll keep the thread closed. Partly because some users chose to use previous progress threads as an outlet for their personal issues. But more positively, the next progress post can be added directly after this one and the thread will be kept clean. It's a new approach that may have good results. I don't really need to discuss anything or receive ideas and input, because there is plenty to be getting on with.

Thanks for reading! Smile
I've been keeping a diary of progress that some might be interested in, just for a little insight into the way the work goes. As development has continued I have gradually learned more systematic ways to track down where issues might come up. With increased understanding, my direction changes from haphazard to more directed and systematic. The thing with multithreaded code is that things can't be left to chance, because the type of bugs that can come up from race conditions are are too mysterious.

The main gist of the last week of development was to continue with the snapshot, including the physics objects and the smoke and skids. As that is nearly complete I've been sorting out a snapshot for the various lists that can be displayed, like the race position list, connections list and results. Always making notes about the things I notice during my travels through the code, and thinking of strategies to sort out the next thing.

Yesterday I noticed a possibility of deadlock which cannot be allowed. No matter how rare it might be, if it's possible then it will happen at some point. So I've noted a couple of things to do a slightly different way and then I'll be looking at the input side of things.

Here's the diary of updates since the previous post:

18-20 Sept
ViewCar (pointer to currently viewed car) separated into two copies (Game and Snapshot)
Changed Game::ViewCar to Snapshot::ViewCar wherever it is used for graphical purposes
More functions moved from "World" to "Game" along with associated variables

21 Sept
Continued with CarSnapshot implementation, added smoothed camera position
Temporary diversion: improved accuracy of camera direction vector storage
More use of CarSnapshot and included player name for display purposes

22 Sept
Added Player UniqueID to CarSnapshot and made ray checking thread safe (click on car)
Improved efficiency of CarSnapshot (moved SSPos/SSBodyMat to main Car structure)
Implemented snapshot copy of the list of Solid objects (moving physics objects)
Fixed new bug that could cause an object to disappear for the first frame when moved

23 Sept
Implemented snapshot copy of Smoke/Dust puffs (necessary as they move)
Implemented critical section to avoid adding/removing skid nodes while drawing skids
Fixed free view follow glitches by storing elapsed frame turns as part of the snapshot
Made turbines, flags and trees use the new elapsed frame turns
Fixed crash bug related to map squares when exiting track editor
Added some player-related values to CarSnapshot to enable thread-safe text displays
Implemented snapshot copy of live race position list and race results list
Updated car arrows on small map display using snapshot race position list

24-25 Sept
Implemented snapshot of connection info for the list of names (take over, send setup)
Worked through several info display screens to check thread safety and use of snapshot
Noticed a possibility of deadlock by acquiring game code lock while d3d lock is held
- this is because there is also some game code that could try to acquire a d3d lock
- for example when a car is created or layout objects are added a d3d lock is required
- apparent solution: avoid possibility of a game lock by main thread during a d3d lock
This week I've been continuing to make the program thread safe and efficient, restructuring and reordering as needed. I have reduced the number of critical sections, noticed and eliminated a possibility of deadlock, implemented code to detect future issues.

It's interesting to me that multithreading is already in place and working but there is a continuing investigation into all the code that is not yet thread-safe. This week has been increasing my confidence in understanding the issues and the scope of those issues and how to find them and solve them.

I still have lists of info screens to check and known code that is not yet thread-safe. The type of thing that could go wrong is usually when things change, such as objects or cars appearing / disappearing / regenerating, or when the user moves from one screen to another. In the stable state it should be robust already. I think I may take a pause from thread safety today to try some performance analysis including a visual display to show what is happening on the two threads.

Here's the diary of updates since the previous post:

26 Sept
Working towards removing any game code locks from the main thread during a d3d lock
(note that the 'main' thread is the one that processes windows messages and graphics)
Moved sync point (game lock and snapshot) to just after the windows message loop
- this allows frame sync point and windows messages to share a single critical section
- there are good reasons to process keyboard and mouse input while game code is locked
Included a debug check to show red warning if game code is locked during a d3d lock

27 Sept
Removing the few instances where a game code lock could have caused a deadlock
Establishing new rules about how the program works, what can be called from where
As the sync point is united with message loop, kb and mouse input is now thread safe
InSim is another form of input so InSim processing is also moved to the sync point
Converted Escape Menu to be a mode of World2d rather than a separate subprocess
- looks the same but avoids old workarounds that were causing new problems

28 Sept
Additional debug check to prevent main thread code calling game functions at wrong time
Sorting out parts of code that were marked as needing attention regarding thread safety
Small diversion: added a pointer to Player object in AI drivers for better efficiency
- also added a similar pointer to Player in the Car object to avoid player searches
Included setup & fuel load in snapshot for thread safe live settings / pit instructions
Noticed a possible crash in car draw code, used the snapshot setup which is thread safe

29 Sept
Added a new debug mode to detect (crash) if setup or colour pointers are used in render
- testing showed issue with colour config pointer, generating meshes when joining race
- restructured so a local copy of the colour config is stored in the car (no pointer)
- noticed that this is more efficient and noted that car setups use a similar system
- for consistency, efficiency and thread safety decided to do the same with car setups
- this turned out quite a long operation but cleaned up garage and vehicle editor a bit
- yesterday's copy of setup stored in the snapshot is no longer needed so deleted that
Added copy of Steer variable to CarSnapshot (rotates steering wheel / animates driver)
OK, something graphical at last!

The attached image shows the first version of the thread performance analysis graph.

You can see 10 double lines, representing the two threads. These 10 lines really form one continuous timeline representing 1 second of time. Each horizontal line is 1/10 of a second. This is with vertical sync enabled (at 60Hz refresh rate).

The top line of each pair is the main (graphics) thread:

Grey - waiting to be able to do the sync (physics frame must finish)
Red - the sync point (windows messages and the snapshot)
Blue - the graphical render (drawing the world)

The bottom line of each pair is the game (physics) thread:

Green - each of the 1000 Hz physics updates (alternating shades of green)
Yellow - audio (car sound updates, approximately 100 times per second)

I think it's not bad for a first look. I can learn a lot from this and try to improve the timings. One thing would be to improve the timing and maybe frequency of the audio updates. At the moment they can sometimes cause the physics to bunch up and that can then delay the sync point, so the graphical render starts late. In an ideal world the green physics updates would be perfectly spaced, so that's something to have a look at. It's nice to be able to see how the physics and graphics run at the same time, instead of just having to imagine it.
Attached images
threadgraph1.png
More gradual progress to report. I don't think this is very interesting reading to most people but at least it shows the sort of thing I have to look into each day.

Attached an image of the updated timeline graph. It has some new colours. The reddish brown colour on the graphics thread is while it has called D3D 'Present' to apply the fully rendered image to the screen. It seems to take a long time to do that each frame because it is with Vertical Sync enabled, so it has to wait for the screen refresh (avoids tearing, and there is no point in having a higher frame rate than the monitor's refresh rate). But even during that time, the graphics card may still be rendering. The end of the blue part is only where LFS has finished sending draw instructions. The nice thing is that, with multithreading, that wait is no problem now, the physics updates keep going on as you can see.

Some gaps in the game/physics thread I believe are because my computer is so old it only has two cores (although it is a good, smooth running PC). I guess the operating system or another program wants to do something, so Windows gives a time slice to that other process instead of continuing with the LFS physics. I believe this might not happen much on a quad core.

Another thing is the audio code (yellow). Although better timed now, seems to take some time and bunch the physics updates. But this is the debug version of the code, it's a lot slower than the release build. So it might not be a problem at all.

Here's the diary of updates for any who are interested. Shrug

30 Sept
Graphical timeline showing what happens on the two threads over one second
- displays concurrency and shows where one thread can hold up the other
- reveals processing time differences between physics updates
- shows the bunching of physics updates by audio updates
- suggests better timing of audio updates

1 Oct
Timing investigation guided by the new graphical timeline
Improved accuracy of the system timer using undocumented ntdll functions
Improved the timeline a bit by adding new elements and adjusting exiting ones
Adjusted the timing of audio updates to avoid interference with the sync point

2 Oct
Adjusted game code to allow interruption of bunched physics updates for a sync point
Noticed that downloaded mods and skins are not reloaded in the best place (noted)
Made wait for sync point *always* interrupt overloaded physics (avoids visual hangs)
Noticed that game reset while physics objects moving calls D3D code 'illegally' (noted)

3 Oct
Investigating a known fault related to timer resetting when no guests are connected
Noticed some sync check values had not been updated for the 1000Hz physics update
Made some general improvements, simplified the timer reset code and fixed the bug
Moved reload of downloaded mods and cars with updated skins to the sync point
Fixed an already noted issue with auto restart on exiting the pits in hotlap mode

4 Oct
Spent some time looking at ways to delay the loading of textures when loading a car
- it would be better not do so any D3D work while loading a car but it's complicated
- some of the physics code (aero positions) depends on the 3D model being built
- possible solution is for the model to be built by game code but delay texture loading
Put that on the back burner for a while and looked at VR in the multithreading system
- found some issues including time appearing to slow down
- realised this is due to an oversight in the changes made on 2 August
- corrected that, continued VR testing, fixed a couple of bugs
Continued VR testing, looking at timing graph, experimenting

5 Oct
Another VR day, did research, testing, tried a different approach from yesterday
Ended up with good smooth results on tracks where my computer is fast enough
Did a substantial test on VR and FF wheel, driving at Bancroft and Fairfield
Force feedback can also be output at 1000 Hz with the physics updates
Seemed to feel higher resolution on the G27, I guess it could be good on DD wheel
Read through lists of things to do, would like to get a few crossed off tomorrow
Attached images
threadgraph2.png
Hi, here's some more progress to report. Again, I'll give a short summary and post the diary of updates.

Apart from continuing to make things thread-safe, I think the most interesting thing is a new system to make sure that a good part of the game code and data (the "Race" structure which holds Players and a lot of other data, for the game code) is not accessed by the graphics code, and also a smaller system to make sure the "Snapshot" (which is for rendering 3D and text) is not accessed by the game code. Considering that it would be quite impossible to find every line of code that is accessed in the wrong way, it's a great help to have these systems that can point out such issues whenever I come across them. It also helps me to devise more ways of restructuring the code to enforce thread safety.

Another interesting thing, at least to me Big grin is the live profiling graphs, which you can see in the attached images. LFS has always had one and it helps me to see which code uses a lot of CPU, (either because it's not well optimised or maybe there's a bug there). But it wasn't working recently due to the multithreading. I did have a CPU usage issue that forced me to sort out the multithreading version of the live profiler so I could track down the bug. Now I can select a profiler for the physics or graphics thread. The nice thing is these graphs are hierarchical, so I can click on the + buttons to see inside any section. I guess you'll know what I'm talking about if you have a look at the images. The first image shows the graphics thread CPU usage and the second image shows the physics thread CPU usage.

Here's the diary of updates:

6 Oct
Most of the day on home things (not LFS)
Afternoon / evening looked into the sound updates which can be improved
- in the most recent time graph you can see different length sound updates
- they are roughly in a 2:2:1 pattern, due to sounds being updated in 0.01s blocks
- it would be better to have variable length updates that can match frame lengths

7 Oct
Looked into sound improvements, changed the code to support variable length updates
- encountered various bugs, audio code is sometimes tricky to debug (due to timing)
- final bug was with the echo, realised there is a limit to the size of sound updates
- another obstacle I found is that the audio write point only moves in steps of 0.01s
- so it turns out I can't easily achieve the aim of variable length sound updates
- improvements have allowed me to split the sound updates over multiple physics frames
- another solution could be separate audio thread but it's too much work at the moment
Looked at FF rate - although 1000Hz is possible the options display still read 100Hz
Encountered (and fixed) thread-related crash when refreshing controllers in options
Fixed bug 'Flicker on car reset' which was thread related (wrongly reset draw values)

8 Oct
Morning with family
Looked into an issue where garage screen was initialised by game code when pitting
- that is against the rules of thread-safe behaviour in LFS code
- easily solved using a message from game code to main code to enter garage when safe
Found some bugs in Escape Menu - fixed them and improved the code for future proofing
Fixed thread-related and other minor issues with the Vehicle Editor
Moved position of "AutoJoin" which is for restarting autocross in multiplayer mode
Encountered a crash when second player joins race - related to drawing car alpha

9 Oct
Morning with family
Car alpha crash was due to removal of alpha LOD reset when car reset (bug fix 7 Oct)
- fixed by resetting alpha LOD to "do not draw" when car is created (not on reset)
Came up with another check to find code that is not thread-safe
- access "Race" structure (players, results, etc) through an access function
- this function can check that Race is only accessed from game code (not graphics)
- only trouble is the Race is accessed from around 900 lines of code... get started!
- stopped for the day after updating 400 of these lines to the new style access
- quick test already showed up two bugs for a quick fix and one more to note

10 Oct
Fixed yesterday evening's noted issue and carried on with converting Race references
Actually doing the changes makes me encounter various issues to fix along the way
It occurs to me that the Snapshot should also have the same type of protection as Race
- Race is for Game/Physics code and Snapshot (recently added system) is for graphics
Went ahead with restructuring Snapshot to an access protected version (fairly quick)
- Snapshot now contains a straight copy of the Race structure, excluding Players
- simplifies snapshot creation and makes it easier to convert code to use Snapshot
Stopped updating Race references for the day with only 164 remaining to deal with

11 Oct
Remaining references to Race have been updated to the new style after 2 hours
That is excluding some code related to skin resolution on the Misc Options screen
A race condition is revealed by the new check on switching from game setup to in-game
Also accesses to Race structure while drawing text in-game are revealed by the check
Fixed the text and entering game issues / made some changes to best lap announcement
MPR testing showed up bugs on restarting replay: crashes then invisible world issue
Clicking time bar at later point in replay is sometimes slow, sometimes fast
- learned how to reproduce the slow / fast versions of replay time clicking
- difficulty finding what extra processing takes place in the slow version
- built-in CPU profiler is currently unusable as it is not thread-safe
- looks like time to sort that out and that'll be another thing off the list
- profiler now selectable graphics / physics - used it to find the time clicking bug
Stop for the day, will fix "disappearing world on MPR restart" tomorrow

12 Oct
Fixed disappearing world on MPR restart - probably old bug unrelated to recent changes
- if layout objects were there, the occlusion octree was cleared when replay restarted
- needed to repopulate octree in that case - bug probably there since octree introduced
- for more info about the occlusion Octree see our September 2020 progress report
Back to fixing code that is not thread-safe (some identified by the new systems)
Garage was using (Player in) Race to decide which buttons to draw - now uses Snapshot
Spectate/Join wrongly used Snapshot copy of selected view (should use Game original)
Check for "Join reject" (first 12 seconds of race) needed Game and Snapshot versions
Something related to tyre warmers was called whenever a new Car was created in memory
Fixed that then spent a while cleaning up Car and Engine constructors for efficiency
Attached images
lfs_00001136.jpg
lfs_00001137.jpg
One week later... where did that time go! Looking

I've been doing a variety of updates including a bit of off topic into shaders that is finished for now. Some cleaning up and organising. Sometimes I add a line of code in a convenient place. Then something conceptually similar comes along and I add it at the same place. Then that happens again and again and it's taking up a lot of space in the original source file. So it becomes a new system in itself, or at least something that should be packaged in its own function, and might end up with its own source code file. This time that happened for the "sync point" which was starting to get messy right at the end of the main loop. Some other clutter from Main.cpp went along into SyncPoint.cpp too. The Sync Point is a very useful place for LFS to do things, because for that tiny slice of time, only the main thread is running and the game/physics thread is waiting, so it's a safe time for things that need safety. As a random example, suppose the user refreshed controllers. It is done at the sync point, so the controller can't be updated half way through a game update. Anyway that was really just a minor change. Randomly I noticed that I wanted to use SHIFT+J or SHIFT+S sometimes in Garage and Free View but those keys weren't available. So veered off topic a while to sort that out.

I realised there were more ways to check thread-safety and did something for the "Physics" system which manages the interactions between and updates of cars and other moving objects. I gave it a protected access function like I had done for Race and Snapshot before.

Eric noticed an issue with something involving two layers of alpha textures. I ended up solving a newly introduced editor crash and also went full off-topic for a while adding a new shader which is really for a new type of transparent window on buildings that will be useful in a few places. It can use a single pass instead of two passes which is always preferable. But it was a bit confusing at first so I reorganised the shader definitions, to the point where I got a new understanding of them and was able to come up a with a way to reduce the number of shaders. It is to help avoid a phenomenon called the "Shader Permutation Explosion" which is where the number of shaders in use can get out of hand. Well LFS hadn't got to the point of an "explosion" yet but it was bugging me that some shaders were extremely similar and I found a way to use a few options (in the form of shader constants) to produce the different effects instead of increasing the number of shaders. The attached images show the reduction in calls to "SetPixelShader" so that is a slight saving (although I couldn't actually measure a change in frame rate with such small numbers).

Another quick change solved an audio problem and got something off a list of things to do. You can see in the attached image, the audio updates (shown in yellow) occur at approximately 100Hz and are not affected by graphical frame rate. This is the maximum update rate I can get with the current system and results in the most responsive sound, even if the graphical frame rate drops a bit.

From tomorrow I plan to try and get a few more things off lists as I'd like to get to the end of the multithreading subject as soon as possible.

Here's the diary of updates.

13 Oct
Wanted to remove the last remaining old-style reference to Race, in Misc options
- needed another "must do x" in the sync point in the main loop, but getting messy
- added a new source code file for things to do at the sync point, cleaned up a bit
- removed Race from external visibility, now only accessible through protected function
Off topic update: added SHIFT+J and SHIFT+S keys to garage screen and free view mode
- this also involved quite a bit of cleaning up and improving some packet functions
- also added SHIFT+P to free view mode (SHIFT+P/J no check for unsaved layout changes)

14 Oct
Some quick fixes around cars in garage and other screens accessing Race when loading
Went through a list of source code files to check if they accessed anything wrongly
Decided to update the 'Physics' system to use a thread-safe access check like Race
- the Physics system is the code that manages all the moving objects (Cars and Solids)
- like Race it must not be accessed from the main / graphics code at the wrong time
- not a big task in this case but again identified issues and improvements were made

15 Oct
Most of morning with family
Off topic: researched an issue for Eric - modeller object appeared different in game
- after the research my version crashed when exiting from Track Editor using X button
- it turned out this was due to a new 'thread-safe' method of exiting track editor
- but the correct exit function was not called when using the window's X button
- implemented a way to provide a special exit function for any screen, used on TrackEd
- found & fixed a bug exiting from the modeller in TrackEd, related to Map Squares
- continued to track down non-thread-safe code switching into and out of TrackEd

16 Oct
Morning with family
Off topic: tried a new shader to allow something like 'overlay' but transparent
- in current version of LFS an 'overlay' shader is available for black (solid) windows
- went through code to allow overlay with ALPHA materials and added a new shader for it

17 Oct
Fixed a bug reported by Eric and checked another bug that had already been fixed
Off topic: cleaned up the shader defines that control the creation of various shaders
- many of the shaders are from one hlsl file with various sections enabled or disabled
- the organisation of these switches was confusing and could be improved / simplified
- this was done carefully with plenty of testing so as not to break any of the shaders
- it is now a lot easier to understand the similarities and differences between shaders

18 Oct
Bit of leaf clearing from gutters in the morning - house below 3 large beech trees...
Some more renaming and organising of the shader names and specifications in game code
Investigation into possibility of using shader options to reduce number of shaders
- some of the shaders are so similar with just a couple of different multipliers

19 Oct
Change to sound/audio code, sound updates are done at maximum rate of 100 per second
Previous changes resulted in 1 sound update per graphical frame which solved an issue
- the original problem was that sound updates could delay the sync point and graphics
- intermediate solution was to only do sound on the next update after the sync point
- but that was less immediate and caused poor sound quality if the frame rate dropped
- new solution: update whenever possible but delay 1ms if main thread waiting for sync
- you can see the audio updates (yellow) in the attached thread timing graph
- I produce these timing graphs in the debug version to exaggerate the CPU time
- another thing to notice is that every 10th physics update is longer than usual
- this is due to only checking for contact collisions every 0.01 second
- checking against the environment was the main CPU cost of the 1000Hz physics
- so contact points that are not currently touching wait 10 updates before next check
A new shader update combines 6 groups of 3 pixel shaders into 6 shaders total
- the idea is to reduce the number of extremely similar shaders in memory
- changing from one shader to another involves a tiny time penalty so good to reduce
- the new version uses shader constants to create the different effects instead
- these minor differences were about shine level and diffuse colour depending on alpha
- as you can see in the attached images there are now fewer calls to "SetPixelShader"
- it's not a big change but means the new shader has not resulted in even more shaders
Attached images
lfs_00001141.jpg
SetPixelShader_OLD.jpg
SetPixelShader_NEW.jpg
Three days later and I've reached the end of a shader update so thought I'd post another of these lists and a few images. Sometimes when updating graphics code I need to be careful not to break something, and that is surely the case when changing the way shaders work. So I produce screenshots from exact locations as I go, to do direct comparison by flicking between them. I realised some more of the shaders should be combined so now 24 shadowable, in game pixel shaders have been narrowed down to 9 with a few options. Actually there are two versions of each (with and without use of shadow maps) so it's really 48 down to 18.

On a day where I was feeling a bit interested in some other things I'd like to try, I felt uneasy with the number of things still on lists so made myself grind through a few fixes on paper and TODO comments in the code. I felt a bit uninspired but got more motivated as I began to cross things off lists. Thumbs up

During that time, another shader bugged me a bit, as there was a certain inconsistency in the shaders. Previously just one type of alpha shader had been a special case. The shiny windows shader (for e.g. the marshall boxes at Rockingham, and now several South City buildings) had previously been the only one that needed a different alpha blending state, where the source output from the pixel shader is not multiplied by the alpha value. Otherwise because such a transparent material (glass) would have a low alpha value that would remove the shine on the window, which is nearly all you can see. So in that shader the source reduction is done within the shader. But that one special case was no longer the only one, as the new shader I reported in the previous update also needed that change. But the remaining shaders didn't need to be how they were and could also move to the same method. Through a series of updates I've been able to make that the same in all the alpha shaders, so now there are no changes of blend state when drawing the alpha pass. Well I guess there are, if headlights are shining on things, but I'm talking about the majority of objects. So that cleans up the code a bit and removes hundreds of state changes.

Again, it's not a big change but I sometimes need to take a bit of time to clean up the code after some new things are added over the years, otherwise it becomes unmanageable. In this case there was a slight change to the output of other of the shaders, which was due to a change that I interpret as a 'bug' in the old system. But as it caused a slight change to the shine on transparent objects, I had to confirm that with Eric.

So I hope that's a little more insight into the way things go. It's not really related to multithreading but I don't see any reason to be on shaders next week.

The attached images show the reduction in calls to SetBlendState, through first an increase, then a bug, then the final result. They are described in the "22 Oct" paragraph.

Here's the log of updates:

20 Oct
Some other shaders with only tiny differences came to mind, decided to combine them
- now the shadowable, main world pixel shaders have been reduced from 24 to 9 total
- also there were was some more cleaning up and documentation needed in the shaders

21 Oct
Would like to try some ideas but must finish and fix a few things even if boring
Did some TODOs and fixed a bug in editor 'new car' related to storing setup in car
Spotted and fixed a bug in game setup screen that caused car to constantly reload
Fixed an already noted bug causing single player to superspeed a while after unpause
Updated exit from path editor that referenced Race - now use new style exit function
Fixed crash in game after exiting from animation editor if any AI drivers on track
Something was still bugging me about one of the shaders which seems inconsistent
- the variable shine / alpha shader effectively multiplies diffuse by alpha twice
- possible to correct this with a few changes, including a change to alpha blend state
- struggled a while with an 'impossible' bug that turned out to be a compiler error :-/
- blend state now consistent with other alpha shaders which used that for other reasons
- there is a slight change to appearance, did comparisons and asked Eric to check if OK

22 Oct
Now that alpha + vshine shader is consistent with others, improvements are possible
- there should be no need to change the alpha blending state during the alpha passes
- as seen in recent screenshots, counters can display frame state changes in real time
- D3D11 does render state changes in blocks instead of individual states as in D3D9
- state counters 'SetRenderState' and 'SetSamplerState' were left from D3D9 version
- removed them and added SetBlendState, SetRasterState, SetDepthState, SetSamplerState
- screenshots show original number of states then 1st update *increased* SetBlendState
- that is because one shader, used for trees, SrcBlend is still D3D11_BLEND_SRC_ALPHA
- 3rd image now all alpha shaders updated, so no calls to SetBlendState in alpha pass
- but there is a bug, see the distant trees, fog appears on their transparent parts
- that is due to the alpha blending change, now fixed in shader - 4th image all good
- clean up: some state requests were no longer needed, now constant during alpha pass
Attached images
SO_old_states.jpg
SO_intermediate.jpg
SO_new_states_bug.jpg
SO_new_states_OK.jpg
This week I've been removing the possible conflicts between the game code and graphical code threads. In the case of cars I have prevented the creation of anything graphical until the start of a graphical frame. Previously car creation involved D3D calls as the graphical model was created at that point. So the game code used a critical section to avoid clashing with D3D calls during the main render (or any other D3D activity). That meant the game thread would stall for that whole graphical frame, which is not what we want. So now the physics representation of a car is created immediately but the mesh is built only at the start of the next graphical frame (actually, only when that car will be seen in your view). Sounds simple but a lot of code had to be reorganised. The picture is further complicated when you consider game packets arriving with damage nodes (when a remote car hits something) and the model needs to be dented and updated. More detail and other reasons to rebuild meshes are mentioned in the log below.

I aim to remove all graphical critical sections in the game code so it never has to wait for graphics. Today I've been puzzling about layout objects, which may be added or removed in the game / multiplayer code. It's complicated as without protection they might be deleted during the render which could cause a crash. Possibilities such as delaying their adding or removal could cause an OOS (Out Of Sync) because the layout objects may define the timing system (by use of checkpoints or autocross start) and with unlucky timing that might cause a difference in results. So I'm thinking around the possible solutions.

Here's the update log for Sunday through to Friday:

23 Oct
Mainly day off, part of afternoon looking into things that might cost a bit in graphics
- slight change of code to allow profiler and render counters in the same dev version
- considering a future update to flags to update their motion with a compute shader

24 Oct
It's autumn half term holiday - morning started with family leaf clearing session!
On the multithreading list: cars being loaded wait for graphical render to end
- we don't want waits but needed as the new car's model may load textures (D3D call)
- obvious solution is to delay the creation of the graphical mesh until about to draw
- moved some code around to allow this and removed the wait (use of critical section)
- aero physics also depended on wing positions derived from the graphical model
- changed wing, undertray, aero centre positions to come from 'editor model' instead
- notes for next day - number plate creation when car is loaded also needs to be moved
- must investigate similar critical sections in other parts of multiplayer / game code

25 Oct
Had a look at these critical sections, wondering if all "EnterD3DCode" could be removed
- ideally game code would never never need to wait for graphics code to finish drawing
- although these waits are not common, their existence requires other critical sections
- for example main thread also needs critical sections to avoid the graphical conflicts
- if all CS on game thread could be removed then they will not be needed on main thread
- noticed a possible conflict if damage is being added while a mesh is being rebuilt
- started separating the physical (wheels) and graphical (mesh) parts of car damage
- added tiny critical sections to avoid using damage nodes while adding a new node
- this type of critical section doesn't cause threads to wait any noticeable time

26 Oct
A bit of a struggle updating the car mesh regeneration and damage system
- so many reasons for cars to be rebuilt or regenerated, it's quite confusing
- e.g. damage / skin download / delayed texture load / number plate change / editors
- also delayed texture load was an added complication meaning multiple types of reload
- decided to remove that entire system (that caused 'white cars' until texture reload)
- I have a better plan for delayed texture loads (reloading without mesh regeneration)
- approach was to simplify and unify all the regeneration methods until understandable
- finally mesh regeneration is never done by game code and only in the graphical code

27 Oct
Morning with family
Continuing to work through car initialisation, moved number plate init to graphics code
Noticed tyre vertex buffer init was also at car creation time - moved to graphics code
Helmet regeneration now follows similar coding pattern to car regenerate (but simpler)
Noticed that number plate info being set in car (by multiplayer packet) not thread safe
Found crash when regenerating car - snapshot copy of wheel refers to deleted model

28 Oct
Moved graphical elements of wheel (mesh pointers and matrices) to special structure
- avoids the crash noted yesterday evening as invalid pointers are not left around
- separate storage reduces the size of wheel to be copied at snapshot (more efficient)
Only one graphics critical section remains in game code - when adding layout objects
- the reason is because the graphical mesh for a layout object is built at that point
- investigation into removing that will start now as the aim is to avoid them totally
- there are still a few more noted points where graphical code is called by game code
Noticed and fixed shader error causing skids to be too shiny (related to vertex alpha)
Fixed another shader error for names above cars (related to recent alpha blend change)
Fixed a bug about position of a car exiting vehicle editor / now uses good height check
Today I feel like a milestone has been reached. Nod

Until now there was a system in place so that if game code (such as multiplayer packets) needed to access D3D code, it could wait for the graphical frame to complete, then do whatever it had to do. That is because two threads must not do D3D calls at the same time as it could (would) cause corruption. One example of this system, when a player left the pits, the car's model was built. This was dealt with as described in the previous report. The car can now leave the pits immediately and the model is created when the car will be drawn. The final example was the creation or deletion of layout objects, when an incoming multiplayer packet was received. This was sorted out yesterday, so today I was able to remove that system completely (it only a small system, a couple of variables, a heavily used critical section and a warning message). The new rule is that game code is no longer allowed to access D3D code at all and therefore will never wait for a frame render to finish.

This morning I solved a related issue (where game code did access D3D) and then, after clearing leaves from gutters, I added "events" to the thread timing graph to illustrate the split car creation. As you can see in the attached image, the sync point every frame is now marked with a red triangle on the main thread. For this thread graph (which was in debug mode so things take a lot of CPU time compared with a release build) I timed one second, and called /ai to add an AI car half way through the second. You can see an orange triangle below the game thread, where the car is created. The orange block represent the processing of game packets, in this case it takes a while as the car is loaded and initialised (ready for physics).

Then, as mentioned before, the creation of the car's mesh (from editor model to game-ready mesh) takes place during the next graphical frame. The start of that event is marked by a cyan triangle above the main thread. That graphical frame takes a long time - you can see some missing sync points. But the interesting thing is how the game and physics code continue, fairly unaffected after a little catching up. I notice some longer physics frames during that long graphics frame. I'm not sure what the reason for that is but I'm not concerned about it right now.

I have crossed off quite a few more items off my note papers, so I'll start a fresh sheet with some of the last notes from other sheets and try to get through them. It feels like I'm getting near the end of the multithreading conversion.

Here's the log from Saturday to Monday afternoon:

29 Oct
Puzzling about the adding and removing of layout objects (layout editor is multiplayer)
- complication due to the fact they are changed in game code while drawing takes place
- seems easy to deal with adding objects: delay adding to draw lists until next frame
- deleting objects is more complicated as an object could be deleted while being drawn
- object buttons may be visible in layout editor (prefer not to copy all at sync point)
- one solution presents itself: save packets and delay add/delete until the sync point
- this prevents any issues with objects being created or deleted during the draw
- unfortunately it means objects may be changed out of sync with other game packets
- layout objects may define checkpoints or the existence of an autocross start point
- although the delay is only one graphical frame an OOS could arise from this
- seems quite rare, OOS would only occur with unlucky timing but it's not ideal
- possible solution for the game sync issue could be special packet for checkpoints
- in this scenario checkpoints would be worked out on host and broadcast to guests

30 Oct
Morning started with family leaf clearing session...
Implemented layout packet storage using expanding array - processed at sync point
- so this means layout objects are never added or removed during graphical render
- also solves any thread-related issues in the layout editor like selection buttons
- apparently rare multiplayer OOS potential issues will be dealt with by other means
- critical section covering graphics is no longer used by game code so can be removed
- as mentioned before there are still some calls to graphics code by game to be fixed

31 Oct
Removed system that allowed game thread to wait (using critical section) to call D3D
- it is no longer needed - the new rule is that game thread must never use D3D code
Removed issue: restart race while physics objects were moving called graphics code
- issue was deletion of graphical mesh - solution: delay deletion until sync point
- delayed delete system was already in use for moved objects returning to their spot
Cleared leaves from gutters - rain is forecast and we don't like overflowing gutters
Added 'events' to the thread timing graph - used to show sync point and car creation
Attached images
lfs_00001144.png
Six more days have elapsed and there is a little more progress to report. It's interesting that most of the work was really nothing to do with multithreading. I had to remove a system that used to clear out disused skins from RAM, because the way it worked was not thread-safe. As mentioned before, we needed a new system that works with all car textures (not just skins). So I've implemented that as described below. You can see it doing its thing in the attached screenshot.

Another milestone is that I sent a version to Eric with a cautious recommendation that he could try using it for development. It has been stable apart from one thread-related crash - an FF device was being uninitialised at the same time as it was being used on another thread.

One of the next steps is a pit-out glitch reduction system that is a lot better than the old one. To reduce a glitch when a remote player leaves the pits in the current version of LFS, while you are driving, no textures are loaded from HDD (or SSD, obviously Big grin). The car model is built but may appear white where there should be textures, if those textures were not already loaded for some other car. It has often been reported as a "white cars" bug although it's really not a bug at all, and their cars are rebuilt with textures when you stop at some point.

In the new system (as reported on Sat 29 Oct) a remote player's car model is not built as soon as they leave the pits, but only when they are within your view. But currently all their textures will be loaded immediately as I had to remove the white cars pit-out glitch reduction system.

My new plan is to make it that a car may appear white for a short time, but one by one over the coming frames, the textures will be loaded. The entire model will not have to be rebuilt at any later point, only the actual textures will be loaded, without the car's model even noticing that. This will also cover a similar situation when a remote player joins and their skin is downloaded. That skin will be applied to their car without needing a full rebuild. I expect this to take a few days. The principle is that when a texture is requested, if it is not already in RAM, all information that is required to load it is stored in a "Texture" object that doesn't have its own D3D texture (instead pointing to a small plain system texture, probably grey). The real D3D texture will be loaded at some point (up to a few a few seconds later) and replaced in the Texture object.

Now that multithreading is basically done, I'll stop reporting daily progress and go back to a more normal style. I hope to report more regularly than in the past, as I know the reports are interesting for some people and the use of a closed thread has allowed me to make unofficial reports without them becoming a distraction.

I hope the detailed daily calendar has been of some interest, and some insight into the general complication of the work done by a game programmer. We don't just do some super-slow magic, conjuring up cars and tracks. Instead, it's mostly a lot of strange and abstract stuff that you might not think of immediately. It has been interesting for me, looking back and seeing all this stuff and helps me to understand why my time estimates are always so far off.

Anyway, here's the last of the daily detail calendars, for now: Thumbs up

31 Oct (evening)
FIX: Escape from MPR click to point in replay continued visibly speeding after Esc
FIX: Exposure was not reset after exiting from game which could cause overexposure

1 Nov
FIX: Lights were not switched on when starting a night replay after daylight replay
Removed function "check_for_skin_space" which was accessing Race at the wrong time
- this function ran through texture and car lists to remove older skins from memory
- it is not thread-safe as it was accessing Race (game code) during graphical render
- a better way to identify unused textures (not only skins) is now needed due to mods
Gathered remaining things to do onto new sheets of papers to make them more readable
FIX: Crash when loading very old cars, then a memory leak when loading very old cars
Sent update to Eric with suggestion that it may now be safe to use for development

2 Nov
As mentioned we need a system to remove textures and skins from cars no longer in use
Started coding a method to detect which already loaded textures can be deleted from RAM
First step: to know exactly which textures are used by cars (and helmets) but not world
- imperative not to delete any textures that are still in use or there will be a crash
- change all texture and material initialisation functions to specify usage (car/world)
- this also applies to "re-use" functions, in case a car texture is later used by world
Eric reported a crash bug with ALT+TAB from full screen and provided the crash address
- crash address was in LFS code, setting a Force (on a FFDevice that still existed)
- thread-related: controllers were being shut down while game code applied a force

3 Nov
Examined yesterday's code and brushed up a bit, ready to start with the second step
The algorithm must identify all car textures that are not in use and can be deleted
- avoids an excessive build-up of textures in RAM as cars come and go over time
- run through all textures marked CAR and IN USE - set to DISUSED (and add time stamp)
- new 3D object and car functions to run through all textures and mark then as IN USE
- call car functions on every car in the race, game setup screen, garage (and helmet)
- whatever remains as DISUSED is no longer used by an existing car so can be deleted
Now the focus is on when to run the described algorithm and which textures to delete
- we don't want to delete textures every time it's possible as they may be needed soon
- for example a player who goes to the pits is likely to exit again with the same car
- another example: whole game exits to game setup screen - no point removing textures
- the "last used" time stamp could help to delete textures not in use for a while
Wrote a system to remove materials and textures after some time
- if any car is added or removed in game, a short countdown is started
- after 3 seconds, disused materials and textures are identified (algorithm above)
- then 1 disused material or 1 disused texture is deleted every graphical frame
- but only materials and textures set to unused at least 5 minutes ago are deleted

4 Nov (morning)
Yesterday's algorithm does the job and unused textures always get removed eventually
- avoids build-up in RAM but will sometimes remove textures that will be needed again
- algorithm can be adjusted later but now it's best to work on delayed texture loading
- cleaned up some code and added debug check for texture delete while material exists
Attached images
lfs_00001145.png
This thread is closed

Progress on Multithreading
(12 posts, closed, started )
FGED GREDG RDFGDR GSFDG