Use machine learning where telemetry can prove it

Generated from content/lms/telemetry-systems-engineering/08-emerging-technologies-and-future-trends/02-ai-and-machine-learning.md; edit the source file, not this page.

Course: Design and validate the telemetry system that feeds every decision

Module: Look ahead to the next generation of telemetry

Estimated duration: 55 minutes

Machine learning in telemetry is not a magic engineer in the laptop. Treat it as a pattern-finding tool that earns a place only when the data is measured correctly, stored consistently, and tied back to a decision you can verify on the car. That is the core skill for this lesson.

You already know the basic reason telemetry exists: the stopwatch measures the whole car, driver, tires, setup, and conditions as one result. Data logging splits that result into channels so you can ask why the car and driver performed that way at a particular place on the track. Machine learning does not change that purpose. It changes the scale. Instead of one engineer manually opening every trace, a model can help search a large history of logged laps for repeated patterns, unusual behavior, setup-response relationships, or driver-style differences. The reason this matters is simple: modern cars can carry many more sensors than a small team can fully inspect after every session. Tire-surface temperature, ride height, suspension movement, wheel speeds, brake pressure, accelerations, suspension loads, engine channels, and aerodynamic pressures can all become part of the record. The practical bottleneck becomes interpretation.

The first rule is to keep the model subordinate to the measurement. If a channel is not measured, measured too slowly, measured with the wrong range, or measured in conditions that do not match the question, machine learning cannot rescue it. A model cannot infer tire contact-patch load directly when there is no sensor for a rolling tire contact patch. It cannot make a windy aero straight-line test equivalent to a still-air test. It cannot turn scattered driver comments into controlled tire evidence if the control tire laps are inconsistent. It can only work with the available sensors, logging rate, resolution, circuit, lap context, weather, and test discipline. That limitation is not a footnote. It is the operating boundary of the whole skill.

So your job is not to ask whether AI can analyze telemetry. Your job is to decide which telemetry questions are learnable. A learnable question has four parts. First, the measured channels are close enough to the physical behavior you care about. Second, the dataset contains enough repeated examples under comparable conditions. Third, the outcome you want is labeled by something real, such as lap time, segment time, a known setup change, a control run, a component problem, or a driver comment that repeats. Fourth, the result can be checked against a later run, not merely admired on the screen.

That is why the strongest early uses of machine learning in club-racing and HPDE telemetry are not futuristic dashboards that invent setup from nowhere. They are triage systems. They help you choose where to look first. They can point to laps that do not resemble the rest of the dataset, highlight a channel relationship that changed after a setup adjustment, group drivers by how they use brake and throttle, or rank setup changes that have historically coincided with better segment performance. The model becomes useful when it reduces the amount of data the human must inspect without breaking the chain back to a measured cause.

Anomaly detection is the cleanest starting point. For this lesson, treat anomaly detection as disciplined comparison at scale. The model learns what normal looks like for a car, circuit section, session type, and sensor package, then flags traces that do not fit that normal pattern. The old manual version is familiar: you overlay laps, compare channels, and ask why one lap or one corner looks different. The machine-learning version repeats that work over more laps and more channels than you can review by hand.

The useful anomaly is not merely a strange number. It is a strange number in context. A high tire-surface temperature reading means little if the sensor is aimed differently, the lap is an out-lap, or the track temperature changed. A suspension movement pattern means little if the car crossed a different bump profile or the logger missed part of the event. A wheel-speed difference can mean tire behavior, driveline behavior, sensor noise, or cornering geometry depending on what the rest of the channels show. Your discipline is to make the model compare like with like: same circuit section, same run type, similar weather, known tire state, known setup, and enough related channels to avoid a single-sensor conclusion.
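The like-with-like rule can be sketched in a few lines. This is a minimal illustration, not a production detector. It assumes each lap has already been reduced to one summary value per segment. The field names (`lap`, `segment`, `run_type`, `peak_tire_temp_c`), the minimum group size of five, and the cutoff of 3.5 are assumptions chosen for illustration. Each value is scored against the median of laps that share the same context, using a modified z-score based on the median absolute deviation so that one odd lap does not distort the definition of normal.

```python
from collections import defaultdict
from statistics import median


def flag_anomalies(rows, value_key, context_keys, min_group=5, cutoff=3.5):
    """Return (row, score) pairs that sit far from their own context group."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in context_keys)].append(row)

    flagged = []
    for group in groups.values():
        if len(group) < min_group:
            continue  # too few comparable laps to define "normal"
        values = [r[value_key] for r in group]
        med = median(values)
        mad = median(abs(v - med) for v in values)
        if mad == 0:
            continue  # no spread at all, so the score is undefined
        for r in group:
            score = 0.6745 * (r[value_key] - med) / mad  # modified z-score
            if abs(score) > cutoff:
                flagged.append((r, round(score, 1)))
    return flagged


# Hypothetical data: peak tire-surface temperature in one corner.
temps = [78, 80, 79, 81, 80, 95]
laps = [
    {"lap": i + 1, "segment": "T3", "run_type": "push", "peak_tire_temp_c": t}
    for i, t in enumerate(temps)
]
# The out-lap sits in its own context group and is never compared with push laps.
laps.append({"lap": 0, "segment": "T3", "run_type": "out", "peak_tire_temp_c": 60})

flagged = flag_anomalies(laps, "peak_tire_temp_c", ["segment", "run_type"])
for row, score in flagged:
    print(f"lap {row['lap']} flagged, score {score}")
```

The out-lap is cooler than everything else, yet it is not flagged, because it has no comparable laps. The flag on the hot push lap is only a prompt to open the related channels, not a diagnosis.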

For reliability and predictive maintenance, stay even more modest. The source material supports the telemetry side of predictive maintenance: data can be used to examine causes of component failures, teams can add sensors to generate data on almost any component, and extended systems can log channels such as suspension loads, brake disc temperatures, tire pressures, propshaft torque, wheel speeds, and engine-specific data. It does not support a universal part-life formula or a specific bearing-failure neural-network signature. So do not teach yourself to say the model predicts a failure unless you can show the failure history and the channels that moved before it.

A better intermediate definition is this: predictive maintenance is trend-aware anomaly detection tied to a component decision. Instead of waiting for a part to fail, you watch whether a channel or channel relationship is drifting away from the car's known history. Brake pressure for the same deceleration demand, suspension load patterns over the same bump profile, wheel-speed behavior through the same corner, engine RPM and throttle relationships, or temperature behavior over similar run lengths can all become suspects. The model can help sort those suspects. The human still has to inspect the part, check the sensor, and decide whether the car continues, pits, or gets serviced.
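A drift check of this kind can be sketched as a comparison between recent sessions and the car's own history. The example is hypothetical: the channel relationship (brake line pressure per g of deceleration in one braking zone), the numbers, and the two-standard-deviation limit are assumptions, not values from the source material. It only sorts suspects; it does not predict a failure.

```python
from statistics import mean, stdev


def drift_report(baseline, recent, k=2.0):
    """Compare recent sessions with the car's own history for one channel ratio."""
    base_mean = mean(baseline)
    shift = mean(recent) - base_mean
    limit = k * stdev(baseline)  # sample standard deviation of the history
    return {
        "baseline_mean": round(base_mean, 2),
        "shift": round(shift, 2),
        "limit": round(limit, 2),
        "suspect": abs(shift) > limit,
    }


# Hypothetical per-session values: brake line pressure (bar) per g of
# deceleration, taken from the same braking zone each time.
baseline = [30.1, 29.8, 30.3, 30.0, 29.9, 30.2]
drifting = drift_report(baseline, [31.5, 31.8, 32.0])
stable = drift_report(baseline, [30.0, 30.1, 29.9])
print(drifting)
print(stable)
```

A `suspect` result means the relationship has moved outside its usual spread. The next step is still a human one: check the sensor, inspect the part, and decide what the car does next.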

Automated setup recommendation is more tempting and more dangerous. Setup is exactly where race teams want help: tire tests, suspension work, aero maps, simulation, and track tests all produce huge amounts of comparison data. But setup also exposes every weakness in the data. Lap time is not an absolute value when ambient conditions and driver variation are moving. Segment time helps, but only if the segment is defined consistently. A new tire, a ride-height change, a camber change, and driver adaptation can all appear together unless the test plan holds the other variables under control. A machine-learning model trained on messy setup history will find correlations. It will not know which correlations were caused by the setup and which were caused by the day.

The disciplined version is to use machine learning after you have done the old-school engineering correctly. Control runs come first. Known baselines come first. A test log comes first. If you are testing tires, the control tire checks whether the driver is repeatable enough for the test to mean anything. If you are testing aero on a straight, the same track section and the same markers keep the data comparable, and the gains or losses matter more than the absolute number. If you are using simulation, the logged results should feed validation of aero performance, bump profile, and tire model before you trust recommendations outside the range you physically tested. Only then should a model rank likely setup directions.

Think of automated setup recommendation as a shortlist generator, not a final authority. The model might suggest that a certain ride-height direction, tire-pressure range, or damper movement pattern has historically aligned with better segment performance. That is useful because it gives the engineer a better first test. It is not proof until the car repeats the gain with a controlled run. The proper output is not "install this because the model says so." The proper output is "test this next because the historical evidence says it is a promising direction, and here is the segment, condition range, and channel behavior that support the test."
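A shortlist generator can be as plain as a ranked table. The sketch below assumes each historical run has already been compared against its control run, so the stored value is a segment-time delta in seconds (negative means faster). The record fields, setup descriptions, and the three-run minimum are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def shortlist(runs, segment, min_runs=3):
    """Rank setup changes by mean segment delta versus control, fastest first."""
    by_change = defaultdict(list)
    for run in runs:
        if run["segment"] == segment:
            by_change[run["change"]].append(run["delta_vs_control_s"])

    ranked = []
    for change, deltas in by_change.items():
        if len(deltas) < min_runs:
            continue  # not enough repeats to call it a direction
        ranked.append({
            "change": change,
            "mean_delta_s": round(mean(deltas), 3),
            "runs": len(deltas),
            "runs_faster": sum(d < 0 for d in deltas),
        })
    return sorted(ranked, key=lambda r: r["mean_delta_s"])


# Hypothetical setup history for one segment.
history = (
    [{"segment": "T5", "change": "front ride height -2 mm", "delta_vs_control_s": d}
     for d in (-0.08, -0.05, -0.06)]
    + [{"segment": "T5", "change": "rear pressure +1 psi", "delta_vs_control_s": d}
       for d in (0.02, -0.01, 0.03)]
    + [{"segment": "T5", "change": "camber -0.3 deg", "delta_vs_control_s": -0.20}]
)

candidates = shortlist(history, "T5")
for c in candidates:
    print(c)
```

The single camber run shows the largest gain but is left off the list, because one run cannot separate the setup from the day. The top entry is the next thing to test, not the thing to install.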

Driver-style classification is the most driver-facing application. Data acquisition already supports analysis of driving style and performance variation. A model can group laps or drivers by how they use the car: early versus late throttle return, heavy versus progressive brake pressure, gear selection consistency, wheel-speed behavior, lateral and longitudinal acceleration shape, and segment performance. This is not meant to label a driver as good or bad. It is meant to make coaching and engineering more specific.

For an intermediate driver, the value is that style classification can separate repeatable habit from random lap noise. If your best laps all share a particular brake-release and throttle-return shape, that becomes a coaching clue. If your slower laps all show a delayed throttle pickup after a certain type of corner, that is a practice target. If a setup change only helps one driver style and hurts another, the team should know that before calling the setup universally better. The model helps describe the pattern; the instructor or engineer still translates the pattern into a change you can drive.
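Before any clustering, the simplest version of this comparison is to ask whether one measured habit differs between your fastest and slowest laps in a segment. The sketch below is a hedged illustration: `throttle_return_m` (distance past the apex where throttle is picked up) is a hypothetical feature, and comparing the fastest third with the slowest third is an arbitrary choice.

```python
from statistics import mean


def best_vs_worst(laps, feature, outcome="segment_time_s"):
    """Compare one feature between the fastest and slowest thirds of the laps."""
    ordered = sorted(laps, key=lambda lap: lap[outcome])
    n = max(1, len(ordered) // 3)
    best, worst = ordered[:n], ordered[-n:]
    best_mean = mean(lap[feature] for lap in best)
    worst_mean = mean(lap[feature] for lap in worst)
    return {
        "best_mean": best_mean,
        "worst_mean": worst_mean,
        "difference": worst_mean - best_mean,
    }


# Hypothetical laps through one corner family.
laps = [
    {"segment_time_s": 21.80, "throttle_return_m": 12},
    {"segment_time_s": 21.85, "throttle_return_m": 14},
    {"segment_time_s": 22.00, "throttle_return_m": 18},
    {"segment_time_s": 22.10, "throttle_return_m": 17},
    {"segment_time_s": 22.30, "throttle_return_m": 25},
    {"segment_time_s": 22.35, "throttle_return_m": 27},
]

result = best_vs_worst(laps, "throttle_return_m")
print(result)
```

A clear difference gives the instructor one concrete habit to look for in the trace. It does not show that the habit caused the time difference.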

The required sub-skill is feature thinking. A feature is the measured behavior you give the model. You do not need to become a data scientist to think this way. You need to stop feeding a model vague session files and start deciding what the model is allowed to compare. For an anomaly model, useful features may be channel averages or peaks inside a fixed segment, the shape of brake pressure over a braking zone, the relationship between throttle and RPM on exit, or tire-surface temperature behavior over a run. For setup recommendation, useful features may include the setup change itself, the segment-time gain or loss, the tire state, weather context, and the related suspension or temperature channels. For driver-style classification, useful features may be where brake pressure begins, how quickly it releases, when throttle returns, what gear is used, and how the acceleration trace changes by track section.
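Feature thinking becomes concrete once a raw trace is reduced to a few named numbers. The sketch below turns brake pressure samples from one braking zone into four features. The sample layout `(time_s, distance_m, brake_bar)`, the 2 bar activity threshold, and the feature names are assumptions for illustration. It also assumes the zone contains a single continuous brake application.

```python
def brake_features(samples, threshold_bar=2.0):
    """Summarize one braking zone. samples: (time_s, distance_m, brake_bar) tuples."""
    active = [s for s in samples if s[2] >= threshold_bar]
    if not active:
        return None  # the driver never braked in this zone
    peak = max(active, key=lambda s: s[2])
    return {
        "onset_distance_m": active[0][1],                      # where braking begins
        "peak_bar": peak[2],                                   # how hard
        "time_to_peak_s": round(peak[0] - active[0][0], 3),    # how fast it builds
        "release_time_s": round(active[-1][0] - peak[0], 3),   # how long the release takes
    }


# Hypothetical trace for one braking zone.
trace = [
    (0.0, 400, 0.0),
    (0.1, 406, 5.0),
    (0.2, 412, 40.0),
    (0.3, 417, 62.0),
    (0.4, 422, 55.0),
    (0.6, 430, 30.0),
    (0.8, 437, 10.0),
    (1.0, 443, 1.0),
]

features = brake_features(trace)
print(features)
```

Four numbers per braking zone are something a model, or a person with a spreadsheet, can compare across laps. A whole session file is not.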

The second sub-skill is segmentation. Full-lap averages are blunt instruments. The data acquisition literature emphasizes that the logger tells you how a car and driver are performing at a particular location on a racetrack. Machine learning should respect that. A car can gain time in one section and lose it in another. A tire can look better in high-load corners and worse in braking stability. A driver can improve exit throttle but still give away time on entry. A model trained only on full-lap outcomes may miss the useful reason. Split the lap into sections that match the question.
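The point about full-lap outcomes hiding the reason is easy to demonstrate. The sketch below splits two hypothetical laps at fixed distance markers and interpolates the time at each marker. The sample layout `(time_s, distance_m)` and the marker positions are assumptions, and the code expects distance to increase strictly from sample to sample.

```python
from bisect import bisect_left


def time_at(distance_m, samples):
    """Linearly interpolate the time at which the car passed distance_m."""
    distances = [d for _, d in samples]
    i = bisect_left(distances, distance_m)
    if i == 0:
        return samples[0][0]
    if i == len(samples):
        return samples[-1][0]
    (t0, d0), (t1, d1) = samples[i - 1], samples[i]
    return t0 + (t1 - t0) * (distance_m - d0) / (d1 - d0)


def segment_times(samples, markers):
    """Return the time spent between each pair of consecutive markers."""
    times = [time_at(m, samples) for m in markers]
    return [round(b - a, 3) for a, b in zip(times, times[1:])]


# Two hypothetical laps with the same total lap time.
lap_a = [(0.0, 0), (10.0, 400), (25.0, 800), (40.0, 1600)]
lap_b = [(0.0, 0), (11.0, 400), (24.0, 800), (40.0, 1600)]
markers = [0, 400, 800, 1600]  # fixed segment boundaries in meters

seg_a = segment_times(lap_a, markers)
seg_b = segment_times(lap_b, markers)
print(seg_a, seg_b)
```

Both laps take 40 seconds, yet the second lap loses time in the first segment, gains it back in the second, and loses again in the third. A model trained only on lap time would see two identical examples.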

The third sub-skill is control discipline. Tire testing shows the pattern clearly. The stopwatch is still the primary measuring device, but changing conditions and driver variation prevent using lap time as an absolute value. Control tires are used as a benchmark, and repeated control runs validate whether the test is still meaningful. Machine learning needs the same discipline. It should see baseline examples. It should see repeated conditions. It should not be asked to compare a cool morning control run against a hot afternoon experimental run as if the setup were the only difference.
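One way to encode control discipline is to refuse a comparison when the control runs on either side of a test run disagree. The sketch below is an illustration under stated assumptions: the 0.3 second tolerance between control runs is arbitrary, the lap times are invented, and averaging the two control runs as the benchmark is one reasonable choice rather than a method taken from the source material.

```python
from statistics import mean


def bracketed_delta(control_before, test, control_after, max_control_gap_s=0.3):
    """Return test-minus-control time, or None if the controls no longer agree."""
    before, after = mean(control_before), mean(control_after)
    if abs(after - before) > max_control_gap_s:
        return None  # conditions or driver moved too much to trust the test run
    benchmark = (before + after) / 2
    return round(mean(test) - benchmark, 3)


# Hypothetical lap times in seconds.
control_1 = [92.4, 92.5, 92.3]
test_run = [92.0, 91.9, 92.1]
control_2 = [92.5, 92.6, 92.4]
control_2_hot = [93.2, 93.3, 93.1]  # the track changed before the return run

usable = bracketed_delta(control_1, test_run, control_2)
rejected = bracketed_delta(control_1, test_run, control_2_hot)
print(usable, rejected)
```

A `None` result is the useful outcome here. It keeps a run out of the training data when the day, not the setup, explains the difference.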

The fourth sub-skill is sensor humility. Extended channels are attractive, and modern systems can log a lot: suspension movement, brake line pressure, tire temperatures, gear position, wheel speeds, accelerations, tire pressures, ride height, suspension loads, brake disc temperatures, yaw rate, propshaft torque, aerodynamic pressures, and engine channels such as RPM, throttle position, lambda, and air box pressure. But more channels do not automatically mean more truth. Teams are warned to ask what exactly they want to measure. Specific needs require specific channels. Most teams start with basic channels and extend step by step as they gain experience. Machine learning follows the same rule. Add channels because they answer a decision, not because the connector exists.

The fifth sub-skill is validation. Validation means the model's suggestion survives contact with a later controlled check. If it flags a reliability anomaly, the part inspection or related channels should support the concern. If it recommends a setup direction, the next test should show the expected gain or loss in the same section of track. If it classifies a driver style, the instructor should be able to see the same pattern in the trace and in the car. If the model's result cannot be tested, it is not a telemetry decision yet. It is a prompt for more investigation.

Use this five-step workflow whenever you consider machine learning for a telemetry question.

Step one: write the decision in plain language. Decide whether you are trying to find a failing component, reduce data-review time, choose the next setup test, group driver habits, or improve a simulation model. If the decision is vague, the model will be vague.

Step two: define the measurable evidence. List the channels that should move if your hypothesis is true. If the channel is not logged, not reliable, or too indirect, either add the sensor in a future system design or reduce the claim. Do not ask a model to prove what the logger did not measure.

Step three: bind the data to context. Store enough information about circuit, segment, weather, tire state, setup, run type, and driver to make comparisons fair. The source material makes clear that logged data is lap-, circuit-, and weather-dependent. Treat those dependencies as required columns in the dataset, not as loose notes you hope to remember.
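Treating context as required columns can be enforced rather than remembered. The sketch below uses a small record type whose fields mirror the list in step three. The class name `RunContext`, the helper `missing_context`, and all example values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RunContext:
    """Context every run must carry before it enters a comparison."""
    circuit: str
    segment: str
    weather: str
    tire_state: str
    setup_id: str
    run_type: str
    driver: str


def missing_context(record):
    """List required context fields that are absent or empty in a raw record."""
    return [f.name for f in fields(RunContext) if not record.get(f.name)]


# Hypothetical record as it might come out of session notes.
record = {
    "circuit": "Example Raceway",
    "segment": "T5",
    "weather": "",  # left blank in the notes
    "tire_state": "scrubbed, 2 heat cycles",
    "setup_id": "baseline-03",
    "run_type": "push",
    "driver": "A",
}

gaps = missing_context(record)
print("missing:", gaps)

complete = dict(record, weather="dry, 24 C air, light wind")
context = RunContext(**complete)
```

A run that fails the check stays out of the dataset until someone fills the gap. That is cheaper than discovering later that a model mixed conditions.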

Step four: train or tune on history, but judge on later evidence. Historical data can accelerate interpretation, especially when you have a large dataset and vehicle-specific metrics. But the proof is whether the next controlled run, inspection, or driver coaching session confirms the pattern.

Step five: log the recommendation and outcome. A good log of simulation runs and results should be kept for later reference. The same applies to machine-learning suggestions. Record what the model flagged, what the human decided, what was changed on the car, and what happened next. Without that loop, you are building a clever search tool that never learns whether it helped.
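The loop in step five can be kept with very little tooling. The sketch below appends each suggestion as one JSON line and lists the suggestions that never received an outcome. The function names, field names, and entries are hypothetical, and an in-memory stream stands in for a real log file.

```python
import io
import json


def log_entry(stream, flagged, decision, change, outcome=None):
    """Append one model suggestion and what happened to it as a JSON line."""
    entry = {
        "flagged": flagged,
        "decision": decision,
        "change": change,
        "outcome": outcome,  # None until the follow-up run or inspection is done
    }
    stream.write(json.dumps(entry) + "\n")
    return entry


def open_loops(stream):
    """Return entries whose outcome was never recorded."""
    stream.seek(0)
    entries = [json.loads(line) for line in stream if line.strip()]
    return [e for e in entries if e["outcome"] is None]


log = io.StringIO()  # stands in for a file opened for appending and reading
log_entry(log, "T5 exit: rear ride height trend", "test next session",
          "rear ride height +1 mm", outcome="T5 segment -0.06 s vs control")
log_entry(log, "brake pressure per g drifting", "inspect front pads", "none yet")

pending = open_loops(log)
print(len(pending), "suggestion(s) still waiting for an outcome")
```

Reviewing the open entries before each event shows which suggestions still need a check and which have already been confirmed or rejected.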

Calibration cues matter because ML can look convincing even when it is wrong. You are improving when the model changes your review order in useful ways. It points you to the right corner, the right run, or the right channel earlier than manual browsing. You are improving when its setup suggestions become testable hypotheses with clear expected segment behavior. You are improving when the same driver-style groups keep appearing across sessions and match what the instructor sees. You are improving when it refuses to generalize across conditions that should not be mixed: different weather, different tires, different drivers, different track sections, or insufficient sensors.

You are not improving when the output becomes more ornate but less testable. A colorful dashboard that says the car wants a setup change without showing the relevant laps, channels, and control comparisons is weaker than a plain overlay with a defensible next test. A failure prediction without sensor validation is weaker than a manual inspection triggered by a simple trend. A driver label that cannot be tied to brake, throttle, gear, speed, or acceleration behavior is just decoration.

This lesson also has clear boundaries with the sibling lessons. Smart sensors and CAN-FD matter because machine learning gets better when the right signals exist at the right quality and data rate. The regulatory and rulebook lessons matter because some series restrict live telemetry, automated advisory systems, cost, sensors, or data use. This lesson stops before those constraints. Here you are learning how to decide whether machine learning has a valid telemetry job at all.

The best mental model is assistant, not authority. Machine learning can help you inspect more laps, more channels, and more combinations than a small team can review manually. It can accelerate interpretation of large logged datasets. It can make setup history searchable. It can turn repeated driver habits into coaching categories. But it still sits on top of the same foundation as any racecar data analysis: measure the right thing, control the comparison, understand the circuit section, validate against the stopwatch and the car, and keep a log of what happened.

Worked example: tire-test data becomes setup recommendation fuel

A tire test is a clean example because the old method already contains the safeguards a machine-learning workflow needs. The team does not simply bolt on a new tire and believe the lap time. The test uses control tires of known specification, repeated short runs, pressure and temperature checks, lap and segment times, driver comments, and occasional returns to the control tire to check whether the driver and conditions are still stable.

If you wanted to ML-enable that test, you would not start by asking the model which tire is best. You would first give it the disciplined comparison structure. Each run would carry tire identity, control or test status, run order, segment times, pressures, tire temperatures, relevant acceleration channels, and driver comment tags. The model's first useful job would be quality control: flag control-tire runs where the driver is no longer producing consistent lap times or comments. That protects the test from false learning.

The second job would be section-specific comparison. A prototype tire may help one part of the lap and hurt another. Full-lap time can hide that. Segment data lets the model search for where the tire gained or lost. The third job would be setup sensitivity. Haney makes the important point that tire testing teaches what type of setup the tire liked. The model can help organize that history across many runs, but only because the team recorded the setup and preserved the control comparisons.

The success criterion is not whether the model gives a confident answer. The success criterion is whether the next controlled run behaves as predicted. If the model says a tire plus setup direction tends to improve a specific segment, you test that direction against a known benchmark. If the gain repeats under controlled conditions, the model helped. If it does not, the model gave you a hypothesis that failed quickly, which is still better than chasing an unlogged hunch.

Worked example: aero and ride-height data as an ML boundary case

Aero testing shows both the appeal and the limit of machine learning. The source material describes using beacons to split data into convenient track sections, averaging damper displacements between markers, incorporating pressure, temperature, and Pitot data, and using calculations to output downforce. It also warns that the extent of instrumentation depends on the precision and reliability you require, and that wind can spoil the value of the test.

A model could help here by comparing many straight-line test runs and identifying the combinations of ride height, pressure readings, damper displacement, and speed that align with downforce estimates. It could also flag runs that do not match the usual relationship, which may point to wind, sensor problems, inconsistent markers, or a setup state outside the usual map.
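Flagging runs that break the usual relationship does not need a large model. The sketch below assumes all runs share one ride-height setting and similar air density, so that downforce should scale roughly with the square of speed. That scaling is standard aerodynamics. The data, the field names, and the 5 percent tolerance are hypothetical. The median ratio is used as the reference so that a single bad run does not shift it.

```python
from statistics import median


def flag_aero_runs(runs, tolerance=0.05):
    """Flag runs whose downforce / speed^2 ratio departs from the group median."""
    ratios = {r["run"]: r["downforce_n"] / r["speed_ms"] ** 2 for r in runs}
    typical = median(ratios.values())
    return [
        (run, round(ratio / typical - 1, 3))  # fractional departure from typical
        for run, ratio in ratios.items()
        if abs(ratio / typical - 1) > tolerance
    ]


# Hypothetical straight-line runs at one ride-height setting.
runs = [
    {"run": 1, "speed_ms": 40, "downforce_n": 1610},
    {"run": 2, "speed_ms": 50, "downforce_n": 2490},
    {"run": 3, "speed_ms": 60, "downforce_n": 3590},
    {"run": 4, "speed_ms": 50, "downforce_n": 2510},
    {"run": 5, "speed_ms": 60, "downforce_n": 2900},
]

suspect_runs = flag_aero_runs(runs)
print(suspect_runs)
```

The flagged run could reflect wind, a sensor fault, inconsistent markers, or a real change in the car. The code cannot tell which, so the run goes on a review list.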

But this is exactly where you keep the claim narrow. The model is not discovering universal aerodynamics. It is organizing a specific car's measured behavior over a specific test procedure. If the data is dissected from the same section of track, the comparison becomes more credible. If the test mixes windy and calm runs, different markers, or different instrumentation quality, the model may learn the test noise. The right output is a confidence-ranked review list and a next controlled test, not an automatic aero map you trust without validation.

Worked example: driver-style classification without shaming the driver

Driver-style classification should feel like a coaching tool, not a ranking tool. The logger can record what the car and driver are doing, and the data can be used to examine driving style and performance variation. With channels such as brake pressure, throttle position, gear position, wheel speeds, accelerations, and segment timing, a model can group laps by repeatable behavior.

Imagine two drivers in the same car. One consistently carries brake pressure deeper into the corner but delays throttle return. The other releases the brake earlier, rotates less aggressively, and returns to throttle sooner. The model may group their laps into different styles. That is useful only if you connect it to the section of track and the result. Which style gained time in which segment? Which style worked on the current tire state? Which style created unstable tire temperatures or inconsistent exits?

For the driver, the coaching message should be concrete. You are not told that your style is wrong. You are told that your best laps share a measurable behavior and your slower laps lack it. You are told which corner family shows the pattern. You are given one change to test in the next session. That is the difference between useful classification and decorative labeling.

Common mistakes

The first mistake is feeding the model everything and asking for wisdom. More channels only help when they connect to a decision. A system can log many signals, but engineers are still told to ask what exactly they want to measure. Good looks like choosing channels because they support the failure, setup, or driver question in front of you.

The second mistake is treating lap time as a clean training label. Lap time is the final result, but it moves with ambient conditions, driver variation, tires, setup, traffic, and track state. Good looks like using segment times, control runs, and context fields so the model is not asked to explain a whole day with one number.

The third mistake is mixing unlike conditions. Logged data is dependent on lap, circuit, and weather. Tire tests use controls because conditions and driver variation matter. Good looks like filtering by comparable conditions or explicitly teaching the model which condition changed.

The fourth mistake is letting a model outrun the sensors. If there is no sensor for the physical quantity, the model may only be seeing a proxy. That can still be useful, but the conclusion must stay modest. Good looks like saying the model flagged a pattern in suspension movement or temperature, not that it directly measured an unmeasured contact-patch force.

The fifth mistake is accepting setup recommendations without a return test. Setup work needs controlled comparison. Good looks like using the model to choose the next test, then proving the gain or loss with a baseline, segment evidence, and a clear run log.

The sixth mistake is using driver-style classification as a verdict. Good looks like using classification to find repeatable habits, then translating one habit into one practice target. The model should help the driver improve, not create a label that replaces coaching.

Drill: build one ML-ready question from your next event

Do this over one event weekend or over the next three comparable sessions. The goal is not to train a production model. The goal is to learn the discipline that makes machine learning possible.

Before session one, choose one question only. Use one of these forms: find unusual reliability behavior, compare a setup change, or classify one driver habit. Write the decision you want to make after the event. For example, decide whether a brake-related channel is drifting, whether a tire-pressure direction helped a named segment, or whether your throttle-return timing differs between good and poor exits.

During each session, preserve context. Record setup, tire state, weather notes, run type, and any known traffic or abnormal lap. After each session, mark the laps you trust and the laps you do not. If you cannot explain why a lap belongs in the comparison, leave it out.

After the event, build a simple table. Each row is one lap or one segment. Include the outcome you care about, the relevant channels or summary values, and the context fields. Then do the manual version of the model's job: sort, group, and look for repeatable patterns. Which rows look normal? Which rows are outliers? Which setup state or driver behavior repeats with the better segment?
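The manual sort-and-group step can be done in a spreadsheet or in a few lines of code. The sketch below reads a small table, drops the laps you marked as untrusted, and groups the rest by setup state. The column names and values are hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical drill table: one row per lap through one segment.
TABLE = """lap,segment,setup,trusted,segment_time_s
3,T5,baseline,yes,14.82
4,T5,baseline,yes,14.79
5,T5,baseline,no,15.40
8,T5,rear_pressure_minus_1psi,yes,14.70
9,T5,rear_pressure_minus_1psi,yes,14.73
"""


def summarize(table_text):
    """Group trusted rows by setup state: (mean segment time, lap count)."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(table_text)):
        if row["trusted"] == "yes":  # untrusted laps stay out of the comparison
            groups[row["setup"]].append(float(row["segment_time_s"]))
    return {
        setup: (round(mean(times), 3), len(times))
        for setup, times in groups.items()
    }


summary = summarize(TABLE)
for setup, (avg, count) in sorted(summary.items(), key=lambda item: item[1][0]):
    print(f"{setup}: {avg} s over {count} trusted laps")
```

Two laps per setup state is not evidence of a gain. What the table gives you is a named segment, a named change, and a clear next controlled test.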

Success is not a trained neural network. Success is a dataset that another engineer could understand without you standing beside them. You should be able to point to the measured channels, explain the context filters, name the outcome, and state the next controlled test. If you can do that, you have created a question that machine learning may eventually help answer. If you cannot, the model would only make the confusion faster.

When this principle breaks down

This principle breaks down when the corpus of data is too thin, too mixed, or too far removed from the decision. A handful of laps from one changing day is usually not enough to support a broad model. A setup history without control runs is weak. A reliability model without known failures or inspections may only learn normal variation. A driver-style classifier without consistent segment boundaries may confuse track section effects with driver habits.

It also breaks down when the model's recommendation cannot be tested. If the car has no adjustment range left, the sensor package cannot see the claimed mechanism, or the next event will happen under completely different conditions, the output should be treated as a planning clue, not an instruction.

The recovery is to narrow the question. Instead of predicting component life, ask whether one channel relationship changed compared with the car's history. Instead of recommending a whole setup, ask which segment changed after one known adjustment. Instead of classifying an entire driver, ask whether throttle return or brake pressure shape differs between the driver's best and worst laps in one corner family. Smaller questions create cleaner evidence, and cleaner evidence is what makes machine learning useful.

Sources

#	Document	Chunk	Pages	Score	Collection
1	Analysis Techniques for Racecar Data Acquisition	4b3855e4-e741-85ea-9df4-e328a90484b6	5	1	uio_books_raw_v1
2	The Racing and High-Performance Tire (Paul Haney)	11880aec-933e-aa8f-4b04-34e8fbf40f0e	168	1	uio_books_raw_v1
3	Analysis Techniques for Racecar Data Acquisition	f6dc9cae-392f-1151-15c6-df8acd9a8ec5	5	1	uio_books_raw_v1
4	Analysis Techniques for Racecar Data Acquisition	f725bafe-8b10-b36a-5b91-3395a519319d	16	1	uio_books_raw_v1
5	Analysis Techniques for Racecar Data Acquisition	1d32f116-9b81-52c6-919d-dba1c542c011	5	1	uio_books_raw_v1
6	Analysis Techniques for Racecar Data Acquisition	2c2b79d6-8481-a249-415e-c9cfb1be1d8c	19	1	uio_books_raw_v1
7	Competition Car Aerodynamics, 3rd Edition (Simon McBeath)	60c37571-2161-a4a9-6363-698d635d7e59	355	1	uio_books_raw_v1
8	Analysis Techniques for Racecar Data Acquisition	d0db9128-dc9a-aec3-14a8-5f101654753f	3	1	uio_books_raw_v1
