Very interesting project that you're doing! I've read the whole thread and there has been a lot of intelligent discussion and progress (even though it has been running for quite some time).
It interests me because I obviously like LFS and I study AI (not specifically game-related). Recently, before I ran into your post, I actually had the idea to see how well a neural network could drive the cars on a specific track. The input and output of the simulation would be similar to yours, although I would probably use the exact track position initially instead of reference points (the NN would sort of make the reference points by itself).
That said, I don't plan to start on that project now because of another project idea not related to racing, but I'll be sure to follow your project as you go forward.
As you're still somewhat undecided about how to control the AI, I'll share some views on it (don't take me for an expert) as an addition to what a few other posters have already said.
As I understand it, you have been reading a bit about neural networks. There are multiple different types, but I haven't looked into most of the advanced ones (like Echo State networks, which can simulate temporal events).
A Multi-layered Perceptron network (MLP) is probably suitable for a racing simulator. As input you could have the track position, speed, gear, RPM and all similar things that you're already using. Then you send this through the black box of 1 or 2 hidden layers to the output, which would contain the actions to control the car.
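To make the input -> hidden layer -> output idea a bit more concrete, here is a minimal sketch of a forward pass in plain Python. The layer sizes, the tanh activation, the normalised input values and the two outputs (steering, throttle) are my own assumptions for illustration, not something taken from your project.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Create one fully connected layer: per neuron, n_in weights plus a bias."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward(layers, inputs):
    """Feed the inputs through each layer in turn, using tanh as activation."""
    activations = list(inputs)
    for layer in layers:
        activations = [
            # Weighted sum of the previous layer plus the bias (last entry).
            math.tanh(sum(w * a for w, a in zip(neuron[:-1], activations)) + neuron[-1])
            for neuron in layer
        ]
    return activations

rng = random.Random(0)
# Hypothetical, already normalised inputs: track position, speed, gear, RPM.
car_state = [0.25, 0.60, 0.50, 0.80]
# 4 inputs -> 6 hidden neurons -> 2 outputs, all sizes chosen arbitrarily.
network = [make_layer(4, 6, rng), make_layer(6, 2, rng)]
steering, throttle = forward(network, car_state)
print(steering, throttle)  # both values lie between -1 and 1 because of tanh
```

With random weights the outputs are meaningless, of course; the point is only that driving the car comes down to a handful of weighted sums per frame.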
The nice thing about such a network is that you don't need to worry about programming the actions of the driver, as the network figures this out by itself by comparing to an example situation (a computed racing line for example). In the end it would be human-like, including unpredictability and a chance of errors, which seems to fit the project.
There are some disadvantages though.
You don't have a lot of control over the network other than through your training set and some parameters. Besides this, you don't exactly know what is going on in the hidden layer (you could say the behaviour in there is "encoded"), so you have to rely on the network to fix its errors by telling it that it's doing something wrong.
Constructing the training set well could be tricky. An MLP would generally learn through backpropagation, which means that at a certain point in time (one or more frames) you need to exactly specify the network's desired output (steering angle/throttle etc.) for the situation at that point in time. This can be tricky to specify, and you'll want many diverse desired output specifications so that it can learn the whole track and deal with as many situations as possible.
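As a rough illustration of what "specifying the desired output" means, here is a small sketch of backpropagation on a tiny training set. Everything in it is an assumption on my part: the three made-up examples, a single hidden layer of 5 tanh neurons, no separate bias terms (a constant 1.0 input stands in for one) and the learning rate.

```python
import math
import random

def train_step(w_hidden, w_out, x, target, lr=0.1):
    """One backpropagation step; returns the squared error before the update."""
    # Forward pass through the hidden layer and the output layer.
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    # Output deltas: error times the tanh derivative (1 - y^2).
    delta_out = [(o - t) * (1.0 - o * o) for o, t in zip(output, target)]
    # Hidden deltas: push the output deltas back through the output weights.
    delta_hidden = [
        (1.0 - h * h) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    # Gradient descent update of both weight matrices.
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] -= lr * delta_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= lr * delta_hidden[j] * x[i]
    return sum((o - t) ** 2 for o, t in zip(output, target))

rng = random.Random(1)
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(5)]
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(5)] for _ in range(2)]

# Hypothetical examples: (track position, speed, constant 1.0) -> (steering, throttle).
training_set = [
    ([0.10, 0.9, 1.0], [0.0, 0.8]),
    ([0.35, 0.6, 1.0], [0.5, -0.3]),
    ([0.60, 0.4, 1.0], [-0.4, 0.6]),
]

errors = []
for epoch in range(500):
    errors.append(sum(train_step(w_hidden, w_out, x, t) for x, t in training_set))
print(errors[0], errors[-1])  # total error at the start and at the end of training
```

A real training set would need thousands of such pairs per track, which is exactly where the effort goes.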
When you create very many output examples for the network to learn from, it may end up using only the output you have shown it to compute its actions (if there are enough neurons to memorise them). The network then has no need to generalise, meaning that it does not need to find an action for a situation it hasn't seen before (you taught it appropriate outputs for each), which sort of defeats the purpose of a NN. This is called overfitting, but it is not a huge problem here, as you would likely train on a whole track at a time, resulting in a different network for each track. So, a well-trained network would be more consistent and less prone to error on a given track. The main problem is generating/showing output examples for enough situations (and there are a lot of different input combinations on a single track!).
Lastly, people have tested NNs on simulated racing cars and while the results are impressive, they don't quite match up to the more standard pre-programmed behaviours for AI drivers as far as I've seen. There are some cool YouTube videos about this.
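One common way to see whether a network has merely memorised its examples is to hold some of them back and compare the errors. The sketch below is my own illustration, not part of your setup: the "driver" is a plain lookup table standing in for a network that memorised everything, and the examples (track position -> steering) are made up.

```python
import random

def split_examples(examples, validation_fraction, rng):
    """Shuffle the examples and hold back a fraction of them for validation."""
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]

def mean_squared_error(predict, examples):
    """Average squared difference between predicted and desired output."""
    return sum((predict(x) - y) ** 2 for x, y in examples) / len(examples)

rng = random.Random(2)
# Hypothetical examples: track position -> desired steering angle.
examples = [(i / 20.0, 0.5 * (i / 20.0)) for i in range(20)]
train, validation = split_examples(examples, 0.25, rng)

# Stand-in for an overfitted network: it only knows the examples it was shown.
memory = dict(train)

def memorising_driver(position):
    return memory.get(position, 0.0)  # unseen situation: no idea, steer straight

train_error = mean_squared_error(memorising_driver, train)
validation_error = mean_squared_error(memorising_driver, validation)
print(train_error, validation_error)
```

The training error is zero while the validation error is not, which is the signature of memorising instead of generalising.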
Evolutionary (Genetic) Algorithms
The great thing about these is that they are based on evolution, which means you let nature do all the work while you sit back and watch.
GAs tend to be pretty CPU-expensive as they are purely based on trial and error. You randomly create some individuals and let them randomly mutate, mate and die for a very long time. A GA uses a fitness function, describing how well each individual is doing, which comes from the Darwinian idea of survival of the fittest. So, individuals with a higher fitness survive longer and produce more offspring, slowly making the whole population better over time.
I described it in an abstract way, so I'll elaborate on how I would see it working for simulating race drivers.
In my view, it would best be combined with a multi-layered perceptron network, leaving out the training step. The training is replaced by evolution to let the network "evolve" into a better racing driver. The structure of the network (input, output and hidden layer) stays the same, and you let trial and error find the optimum weight values for the neurons.
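To show what "fixed structure, evolving weights" could look like, here is a minimal sketch where each driver is just a flat list of weights (its genome) that gets cut back into layers when needed. The layer sizes and the function names are hypothetical choices of mine.

```python
import random

LAYER_SIZES = [4, 6, 2]  # hypothetical: 4 inputs, 6 hidden neurons, 2 outputs

def genome_length(sizes):
    """Number of weights (one bias per neuron included) for a given structure."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))

def random_genome(sizes, rng):
    """A new individual: random weights for the fixed network structure."""
    return [rng.uniform(-1.0, 1.0) for _ in range(genome_length(sizes))]

def decode(genome, sizes):
    """Cut the flat genome back into layers of per-neuron weight rows."""
    layers, pos = [], 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        layer = []
        for _ in range(n_out):
            layer.append(genome[pos:pos + n_in + 1])
            pos += n_in + 1
        layers.append(layer)
    return layers

rng = random.Random(3)
genome = random_genome(LAYER_SIZES, rng)
layers = decode(genome, LAYER_SIZES)
print(len(genome), [len(layer) for layer in layers])
```

Because every driver has the same number of weights in the same order, mutating and mating become simple list operations.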
The drawback is that this would probably take a long time (a very rough estimate might be 40 cars running for 3000 iterations), although progress would be gradually visible along the way (likely with some spikes up and down). This is an issue, as I don't see a way to speed up the LFS driving tests if that is the platform you're going with. At each iteration you would check the fitness of each car and apply evolutionary actions such as mutating (randomly changing a network's values), mating (combining parts of the networks of 2 cars), and survival (keeping the population at a constant size by deleting the cars with the lowest fitness; children resulting from mating take their place).
The tricky thing is always to compute the fitness of an individual. You would probably give a bonus for the distance driven and the average velocity. Penalties would consist of minus points for going off-track, hitting obstacles, or blowing up your tyres or engine.
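The mutate/mate/survive actions for one iteration could be sketched like this. It is only one possible variant: the mutation rate, the single-point crossover, the number of replaced cars and the stand-in fitness values are all assumptions of mine; in practice the fitness would come from an actual driving test.

```python
import random

def mutate(genome, rng, rate=0.1, scale=0.3):
    """Randomly nudge roughly a `rate` fraction of the weights."""
    return [w + rng.gauss(0.0, scale) if rng.random() < rate else w for w in genome]

def mate(parent_a, parent_b, rng):
    """Single-point crossover: the start of one parent, the rest of the other."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]

def next_generation(population, fitnesses, rng, n_replace=10):
    """Delete the n_replace weakest cars and fill their places with children."""
    ranked = [g for _, g in sorted(zip(fitnesses, population),
                                   key=lambda pair: pair[0], reverse=True)]
    survivors = ranked[:len(ranked) - n_replace]
    children = []
    for _ in range(n_replace):
        parent_a, parent_b = rng.sample(survivors, 2)
        children.append(mutate(mate(parent_a, parent_b, rng), rng))
    return survivors + children  # population size stays the same

rng = random.Random(4)
# 40 cars, each a flat list of 44 network weights (sizes chosen arbitrarily).
population = [[rng.uniform(-1.0, 1.0) for _ in range(44)] for _ in range(40)]
# Stand-in fitness just to exercise the code; real values come from driving.
fitnesses = [sum(genome) for genome in population]
new_population = next_generation(population, fitnesses, rng)
print(len(new_population))
```

Repeating this step once per iteration, with the fitness measured on track each time, gives the whole evolutionary loop.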
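A fitness function along those lines might look like the sketch below. The `RunResult` fields and all the weights are hypothetical numbers I picked for illustration; they are not values from LFS, and finding good ones would take experimentation.

```python
from dataclasses import dataclass

@dataclass
class RunResult:
    """Hypothetical summary of one test drive (not an LFS data structure)."""
    distance_m: float
    average_speed_kmh: float
    seconds_off_track: float
    obstacle_hits: int
    tyre_blown: bool
    engine_blown: bool

def fitness(run):
    """Bonus for distance and average speed, minus points for mistakes."""
    score = run.distance_m + 10.0 * run.average_speed_kmh
    score -= 5.0 * run.seconds_off_track
    score -= 50.0 * run.obstacle_hits
    if run.tyre_blown:
        score -= 500.0
    if run.engine_blown:
        score -= 500.0
    return score

clean_lap = RunResult(3000.0, 120.0, 0.0, 0, False, False)
messy_lap = RunResult(3000.0, 125.0, 12.0, 3, True, False)
print(fitness(clean_lap))  # 4200.0
print(fitness(messy_lap))  # 3540.0
```

With these weights the slightly faster but messy lap scores lower than the clean one, which is the kind of trade-off the penalties are meant to enforce.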
When I said 40 cars, I meant that you have 40 networks running for each iteration and at the end you take the one with the highest fitness.
It might not result in an extremely fast driver, but it is certainly fun to just let them go and see what happens and how they improve.
I'm sorry for the long post, and I fully understand if you don't want to employ machine learning (it could certainly be a bigger challenge, with no guaranteed results); I simply wanted to present these as options to consider. If time allowed, I would work on it myself. But I am certainly open to discussion about anything that I can help with if you continue this project.