Posts about: "FADEC" [Posts: 194 Pages: 10]

PBL
2025-06-22T12:19:00
Post: 11908494
Interesting and informative post from JustusW on 2025-06-21@ 1704 on the ins and outs of various implementations of digital logic (SW, FPGAs, ASICs) and how it has changed and is changing.

I am using my usual approach to trying to figure out what happened in this accident. Which is to perform a possibility analysis: ideally, to consider all possible scenarios and prune out ones that do not fit with the facts as we know them. Might sound easy but it's not trivial, and there aren't that many people who become really good at it (and I am not even sure that my colleagues who are good at it think that I am......).

Severe reduction of thrust, simultaneously, just after unstick is one of the "facts as we know them". The control systems for engines and fuel systems on the 787 are based on digital-logical electronics, including SW. Every digital-logical system may have bugs. In forty years of working with and around such systems I have never encountered one which didn't. Never. (Some eminent colleagues did try to do so with the "Tokeneer" project - and it took a year or two to find the bugs).

A bug in the digital-logical FADEC is a possibility. As far as I am concerned, it stays in the possibility analysis until it can be ruled out. Which it cannot be at this stage.

For this purpose, it does not matter what the logic is based on, or whether some SW-HW architectures can be less susceptible, for whatever reasons, than others.

4 users liked this post.

TURIN
2025-06-22T19:34:00
Post: 11908784
Originally Posted by PBL
Interesting and informative post from JustusW on 2025-06-21@ 1704 on the ins and outs of various implementations of digital logic (SW, FPGAs, ASICs) and how it has changed and is changing.

I am using my usual approach to trying to figure out what happened in this accident. Which is to perform a possibility analysis: ideally, to consider all possible scenarios and prune out ones that do not fit with the facts as we know them. Might sound easy but it's not trivial, and there aren't that many people who become really good at it (and I am not even sure that my colleagues who are good at it think that I am......).

Severe reduction of thrust, simultaneously, just after unstick is one of the "facts as we know them". The control systems for engines and fuel systems on the 787 are based on digital-logical electronics, including SW. Every digital-logical system may have bugs. In forty years of working with and around such systems I have never encountered one which didn't. Never. (Some eminent colleagues did try to do so with the "Tokeneer" project - and it took a year or two to find the bugs).

A bug in the digital-logical FADEC is a possibility. As far as I am concerned, it stays in the possibility analysis until it can be ruled out. Which it cannot be at this stage.

For this purpose, it does not matter what the logic is based on, or whether some SW-HW architectures can be less susceptible, for whatever reasons, than others.
How can that bug affect two independently controlled and powered engines, at almost exactly the same time?

2 users liked this post.

TryingToLearn
2025-06-22T20:21:00
Post: 11908803
Originally Posted by TURIN
How can that bug affect two independently controlled and powered engines, at almost exactly the same time?
Short introduction on faults:

There are systematic and random faults.
Mechanics and SW only know systematic faults; only electronic HW has random faults.
Why?
Mechanical parts are large enough to react in a reproducible manner, and so does SW (on working HW).
Worn-out part? -> systematic fault: you did not calculate wear correctly in your analysis, so maintenance cycles get added.
Broken nut due to too much torque? -> systematic fault: you did not analyse the severity and put QM measures in place within maintenance.
Broken fan blade due to corrosion? -> systematic fault: incorrect testing and environment assumptions; other fan blades may be corroded, too.
SW crash every 10 years? -> systematic fault: you did not make it robust against some nasty timing circumstance or race condition.
So systematic faults do not necessarily show up instantly; they may happen never or after dozens of years. And not on all parts, but e.g. only if there is a problem in production like a cavity in an aluminum casting.
But there are systematic ways to avoid them (FMEA, FTA...). This is why every accident is analyzed: if it was a systematic fault, others could be affected.
This also applies to SW: there are techniques to avoid errors at all levels of development. But the same SW with the same inputs and the exact same timing will give the same outcome (...on both engines).

Random faults are unique to electronic HW. The reason is simply miniaturization. One cannot exclude migration and degradation on such small scales, so there will be faulty transistors or other parts. If you scale them up, your processor runs at kHz instead of MHz, which is not an option. Those are random HW faults. There are calculation methods (FMEDA) to calculate the expected lifetime and error probability over the expected temperature profile. If you give the faulty part the same input, it will show the fault again. But no other part without the degraded transistor will give you the same error.
And there are soft faults (not SW faults, soft faults), those are not reproducible at all. Imagine some hard x-ray from the sky hitting a DRAM cell and changing its state. You will never be able to reproduce this hard x-ray which made its way through the atmosphere without interaction, hitting exactly the right µm³ to do the same bit-flip again. You can only have redundancy in information and processing (ECC, parity, lock-step etc.) to have enough probability to detect (and reset) or correct the fault.
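
To make the "redundancy in information" idea a bit more concrete, here is a minimal sketch of a textbook Hamming(7,4) code: 4 data bits get 3 parity bits, and any single flipped bit can be located and corrected. This is only an illustration of the ECC principle; the function names are made up for this example and nothing here is meant to reflect what any real memory controller or FADEC does.

```python
def encode(d):
    """Encode 4 data bits as a 7-bit Hamming(7,4) codeword.

    Layout (1-based positions): p1 p2 d1 p4 d2 d3 d4
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def decode(c):
    """Return (data bits, 1-based position of the corrected bit, 0 if none)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4   # points at the flipped position
    if syndrome:
        c[syndrome - 1] ^= 1          # flip it back
    return [c[2], c[4], c[5], c[6]], syndrome


word = [1, 0, 1, 1]
stored = encode(word)
stored[4] ^= 1                        # simulated soft fault: one bit flips
recovered, flipped_at = decode(stored)
print(recovered, flipped_at)          # [1, 0, 1, 1] 5
```

Note that this corrects one flipped bit per codeword; two flips in the same word would be miscorrected, which is why real systems usually add an extra overall parity bit or use stronger codes.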

Faults, no matter if systematic or random can be single, multiple, latent and dependent faults.
Single fault: a single error leads to a hazardous state
Multiple point fault: more than one (independent) error will lead to a hazardous state
Latent fault: a fault which remains undetected for some time (e.g. only detectable during startup / power-on) and not hazardous by itself. But together with a second fault it will lead to a hazardous event. But the probability of 2 errors within a power cycle is sufficiently low.
Dependent fault: A fault, by itself not hazardous which will influence the probability of a second fault with both in combination... exactly. (-> freedom from interference not given)

Having the same SW bug on both engines would fall within the dependent fault category. Redundancy only helps if freedom from interference is given.
So if there is a fault in SW and both FADEC instances get the same input and have the same timing, they will react the same way. If there is a hidden systematic fault, both will show it. This is why things are often programmed twice by independent teams or on different processors. They will not make the exact same mistake. Or there is a normal function and a supervisor like TCMA. One complex function but a tiny, easy to analyze supervisor just checking for the hazardous state.
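
As a toy illustration of the point above (and only that: the functions, the trigger value and the tolerance are invented for this sketch and have nothing to do with real FADEC code), here are two copies of the same buggy routine, one independently written routine, and a tiny supervisor that only checks for the hazardous state.

```python
def channel_v1(demand: float) -> float:
    """Toy control law with a latent systematic bug (hypothetical)."""
    if demand == 0.5:          # the one input nobody tested
        return 0.0             # wrong: output collapses
    return demand


def channel_v2(demand: float) -> float:
    """Independently written version of the same requirement (hypothetical)."""
    return min(max(demand, 0.0), 1.0)


def supervisor_ok(demand: float, output: float, tolerance: float = 0.1) -> bool:
    """Tiny supervisor: only checks that output is not far from demand."""
    return abs(output - demand) <= tolerance


demand = 0.5

# Identical redundancy: both "engines" run the same SW with the same input.
left, right = channel_v1(demand), channel_v1(demand)
print(left, right)                      # 0.0 0.0 -> same fault on both sides

# Diverse redundancy: the two versions disagree, so the fault is visible.
print(channel_v1(demand) == channel_v2(demand))   # False

# Supervisor: flags the hazardous output without knowing why it happened.
print(supervisor_ok(demand, left))      # False
```

The identical pair agrees with itself perfectly, which is exactly why comparing two copies of the same SW does not reveal a systematic fault.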

All this can be analyzed and is really lots of work but necessary. So first there is an analysis how severe a fault would be on a high level, this defines the necessary level of analysis in detail.

Last edited by TryingToLearn; 22nd Jun 2025 at 20:35.

13 users liked this post.

Semreh
2025-06-22T22:07:00
Post: 11908844
Originally Posted by JustusW
if ( happy == true ) {
print("I'm happy!"
}

Originally Posted by First_Principal
Just in case it wasn't obvious, JustusW managed to make an excellent point twice in their post about FPGA's, software and bugs.

The above code has a bug, which I assume was deliberate in order to see how many of us were really reading what they had to say, and to drive that home. Well done you, it is a simple illustration of just how easy it is for issues to slip through and not be noticed.

While this simple bug would have been picked up early on by a compiler there could well be other much more esoteric and subtle bugs that could take years in operation and require a very specific set of circumstances before they trip. I make no comment or assertion that's the case here for AI 171, but it is worth bearing in mind in any search for understanding and cause.

FP.
Mmm. Yes.

To be clear, this is 'happily' executed in Perl, and doesn't give the results you might expect.

if ( happy == false ) {
print("I'm happy!")
}

A FADEC might faithfully implement the logic of the design. Someone might even have spent a lot of resources to formally prove that the FADEC logic implements the specification. The trouble is, that does not tell you that the specification is correct.

Additionally, even if the specification is correct, it is possible to be implementing the code on flawed hardware.

It is not impossible that a common set of inputs* triggered an unusual FADEC response in separate FADECs: however, not impossible does not mean that it is likely.

*For example, the FADEC could be programmed to expect an input value to be a positive integer, e.g. for altitude. If the device that sends the altitude value to the FADEC sends a negative value, one of several things might happen: (1) It could be rejected as being out of range. (2) It could be replaced with the lowest possible positive value: +0. (3) The FADEC could interpret the signed integer it has been given as an unsigned integer and treat it as an unrealistically large positive value.

I am not saying this is what has happened. Getting disagreement in the format of values sent between differing systems is not an unknown problem. Software problems as a result of poor, or differing, interpretations of specifications have hit many high-profile projects, such as the Mars Climate Orbiter and the Ariane 5 - so I will not say that a system that includes FADECs cannot have problems: rather, that problems are unlikely.
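
For anyone who wants to see the three outcomes from the footnote side by side, here is a small sketch. The 16-bit width, the function names and the policy strings are assumptions made purely for illustration; I have no idea what width or encoding any real FADEC input uses.

```python
import struct


def reinterpret_as_unsigned(value: int) -> int:
    """Read the bit pattern of a signed 16-bit integer as unsigned 16-bit."""
    raw = struct.pack("<h", value)        # signed 16-bit, little-endian
    return struct.unpack("<H", raw)[0]    # same bytes, read as unsigned


def handle_altitude(value: int, policy: str):
    """Apply one of the three hypothetical policies to a received value."""
    if value >= 0:
        return value
    if policy == "reject":                # (1) out of range
        return None
    if policy == "clamp":                 # (2) lowest possible value
        return 0
    if policy == "reinterpret":           # (3) signed read as unsigned
        return reinterpret_as_unsigned(value)
    raise ValueError(f"unknown policy: {policy}")


for policy in ("reject", "clamp", "reinterpret"):
    print(policy, handle_altitude(-100, policy))
# reject None
# clamp 0
# reinterpret 65436
```

The third case is the nasty one: the receiving side sees a perfectly valid-looking number, so nothing flags it as an error.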

3 users liked this post.

MaybeItIs
2025-06-22T23:35:00
Post: 11908907
Originally Posted by FullWings
That’s the nature of a common mode bug. If the software was vulnerable to Mars being in the house of Uranus, the scent of lilacs and the DOW being less than 42,000 then you’d expect the failure to occur everywhere when these conjoined. Same when an aeroplane’s systems and/or the environment present data that triggers an unplanned/unforeseen response in something like an EEC/FADEC. The experts still appear to think that this is unlikely but we have been presented with an unlikely occurrence...
I have to both agree and disagree with both this and the next post by TryingToLearn.

Yes, there may be (let's assume is) "identical" FADEC/TCMA hardware and firmware on both engines, but if the Left Engine is subject to Mars in the house of Uranus (wink wink), then the Right Engine cannot be, maybe it's Venus in the same House. This is simply because the Left engine TCMA 'contraption', I'm going to call it, is monitoring Left Engine Conditions (Shaft Speed, T/L setting / position data - Right or Wrong, and calculating and comparing accordingly against its internal map) while the opposite TCMA "device" is monitoring and calculating etc, Right Engine Conditions. There are some things in common, but (I say) it's virtually impossible for the Engine Conditions being individually monitored to be identical in both engines.

The Thrust Levers are electro-mechanical devices, almost certainly at this stage pushed by a somewhat squishy human hand, likely with a slight offset. What is the probability that those two levers are in identical positions, and even if they are, that the calibration (e.g. "zero points") of both levers is identical, and that the values they output (response slopes/curves) match exactly at every corresponding point in their individual travels? That's just one aspect, but consider the engines. They are different ages. They have different amounts of wear. They have separate fuel metering valves (or other names) and separate HP fuel pumps (and, I guess, relief valves?), all also subject to wear, and each has a host of other, correspondingly paired, sensors (maybe of different makes and certainly of different ages and different calibrations and response curves) from which each FADEC, supposedly independently of the TCMA, adjusts the fuel metering device settings and resulting engine power, and shaft RPMs follow in some other slightly non-matching way.

Sure, I would completely agree that the two engines and their calculated Throttle Lever positions to Shaft RPMs are always going to be similar during normal, matched operation, and they will very likely dance with each other, maybe one 'always' (75% of the time, say) leading during one dance (TO, say) with the other leading in dancing to a different tune (descent, say).

To me, the fact that this appears to have been an almost simultaneous dual engine failure pretty much rules out a FADEC/TCMA firmware bug, especially as there don't seem to be any reports of even a single-engine mid-air TCMA shutdown.

HOWEVER, and I want to stress this, that doesn't rule out the possibility that both TCMAs shut down their respective engines simultaneously. Any lack of simultaneity observed would be due to those slight differences in other pieces of hardware, such as the time for one Shutoff valve to close versus the other.

As far as I know, there isn't enough information on what's actually inside those TCMA Black Boxes to say anything for sure, but here's a thought, which I think has been alluded to, or the question asked, here in one or other thread, earlier.

What does the TCMA firmware do when an engine is already running at a high power setting and TWO things occur in quick succession? I suspect this kind of event is a highly probable cause, but these two events have not occurred close enough together, or ever, before.

Imagine this: Plane taking off, Throttle Levers near Full Power, Engines performing correctly, also near Full Power, Rotation etc all normal, plane beginning to climb, positive rate achieved.

Pilot calls GEARUP. GearUp, activated.

The Gear Retract sequence begins. Due to some unforeseen or freshly occurring (maybe intermittent short or open circuit) linkage between the gear Up sequence and the WOW or Air/Ground System, the signal to both TCMAs suddenly switches to GROUND. All "good", so far, as the engine RPMs match the Throttle Lever settings and TCMA doesn't flinch. The plane could be in a Valid Takeoff sequence, so it had better not! But it does make a bit of sense. How is WOW / Air/Ground detected? Somewhere near the Landing Gear, I assume.

HOWEVER, now, a moment later, and perhaps due to a related system response, the Thrust Levers suddenly get pulled back to Idle, whether by man or Machine.

What would you expect the TCMA system to do? I would guess, fairly soon thereafter, two, independent, Fuel Cutoffs... Though I fully admit, I'm guessing based on a severe lack of knowledge of that Firmware.
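
To spell out the sequence I am guessing at, here it is as a toy sketch. Everything in it is hypothetical: the names, the 0..1 scaling and the margin are invented for illustration, and this is in no way the actual TCMA logic, which nobody in this thread has seen.

```python
from dataclasses import dataclass


@dataclass
class Inputs:
    on_ground: bool   # air/ground signal as seen by the monitor
    lever: float      # thrust lever position, 0.0 = idle, 1.0 = full
    n1: float         # engine speed as a fraction of maximum


def guessed_cutoff(inp: Inputs, margin: float = 0.3) -> bool:
    """Guessed rule: on ground AND engine far above what the lever asks for."""
    return inp.on_ground and (inp.n1 - inp.lever) > margin


sequence = [
    ("climbing, all normal",       Inputs(False, 0.95, 0.95)),
    ("false GROUND signal",        Inputs(True,  0.95, 0.95)),
    ("levers then pulled to idle", Inputs(True,  0.00, 0.95)),
]

for label, inp in sequence:
    print(label, "->", "CUTOFF" if guessed_cutoff(inp) else "no action")
# climbing, all normal -> no action
# false GROUND signal -> no action
# levers then pulled to idle -> CUTOFF
```

With a rule like this, neither event alone does anything; only the two together trip it, and the engine speed cannot follow the levers down instantly.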

Ok, no need for further explanation on that point, but I did refer to TCMA unflatteringly as a contraption, earlier. Last night (regrettably, before bed) I started looking at the TCMA Google Patent. Let's just say, so far, I'm aghast! My first impressions are bad ones. How did this patent even get approved? What I suspect here, now, is not a Firmware bug, but a serious Logic and Program Defect. But we'd have to see what's inside the firmware.

When I get more time, I'll dig deeper.

1 user liked this post.

Hot 'n' High
2025-06-28T21:21:00
Post: 11912668
Originally Posted by Machinbird
...... This might be a useful starting point for understanding what could have gone wrong.
Sadly, Machinbird, I think this has been covered, especially re the engines, in some depth already. Several who post here have extensive knowledge in FADEC and associated protections. It's all a mystery tbh and I guess we just need to wait now until some concrete evidence is produced by the AAIB.

Not wishing to dampen enthusiasm to find out a cause but, 2 Threads in, no-one is much the wiser. Just my opinion but that seems to be where we are. What a dreadful accident - I can't even begin to imagine what the crew went through in those final moments.

8 users liked this post.

tdracer
2025-06-29T18:07:00
Post: 11913157
Originally Posted by Kraftstoffvondesibel
This has also been touched upon earlier in the thread, but it rather seems the cut-off switches are in the same LRU, in close proximity, use the same connector and go through the same wiring harness. No one was able to say whether it works purely by digital signaling, and goes through any common software, or if it is duplicated by purely direct signaling. There might be numerous failure modes of the cut-off switch design; it is obviously very, very robust and overall sound, since dual failures here have never happened, but this is already an outlier event.
Again, with the disclaimer that my direct knowledge of the 787 specifics is limited: standard Boeing design practice is that all engine wiring is segregated between engines (and where practical, between FADEC channels).
The fuel switches are located adjacent to each other; however all the wiring would be separate.

7 users liked this post.

tdracer
2025-06-29T19:57:00
Post: 11913194
Originally Posted by Kraftstoffvondesibel
Separate would seem to be a relative term. Of course wires are separated in some way, but how separate? Do they share a quick connect? Are there 2 separate looms each side of the throttle installation, or are they in some twisted bundle together? Someone on this thread claimed the fuel cut-offs were inhibited if the throttles weren’t in idle. Is this true? If so, is this a software or mechanical system?
Can anything so closely placed together be considered separate when looking at an outlier event?

Everyone is looking for something that would shut off both engines at the exact same time. This installation could; it is the closest the 2 systems get in proximity, physically and electrically, at least, and it seems we don’t know a whole lot about it.
Engine isolation means just that. No common wire bundles, no common connectors. You can move the fuel levers at any time - there is no lockout of any kind with respect to thrust lever position (imagine dropping something into the lever linkage that jams the thrust lever at max power - then being unable to shut that engine down?)
Obviously, since the thrust levers are placed next to each other - the separation that's available in the center console is limited, but as soon as the wiring exits that constrained area, the separation increases. Furthermore, the same engine-to-engine wiring separation also applies to channel A/B FADEC channels, as well as the fuel switch/fire handle wiring.
All these requirements are documented in the Boeing DR&O (Design Requirements and Objectives) - and there is an audit done late in the design process to ensure compliance.
In short, you're barking up a tree stump - there is nothing there.

12 users liked this post.

The Brigadier
2025-06-30T08:28:00
Post: 11913431
We know that the right-hand GEnx-1B was removed for overhaul and re-installed in March 2025 so it was at “zero time” and zero cycles, meaning a performance asymmetry that the FADEC would have to manage every time maximum thrust is selected. If the old engine was still on the pre-2021 EEC build while the fresh engine carried the post-Service Bulletin software/hardware, a dual “commanded rollback” is plausible. A latent fault on one channel with the mid-life core can prompt the other engine to match thrust to maintain symmetry, leading to dual rollback.

Last edited by The Brigadier; 30th Jun 2025 at 11:43.

3 users liked this post.

skwdenyer
2025-06-30T12:29:00
Post: 11913592
Originally Posted by The Brigadier
We know that the right-hand GEnx-1B was removed for overhaul and re-installed in March 2025 so it was at “zero time” and zero cycles, meaning a performance asymmetry that the FADEC would have to manage every time maximum thrust is selected. If the old engine was still on the pre-2021 EEC build while the fresh engine carried the post-Service Bulletin software/hardware, a dual “commanded rollback” is plausible. A latent fault on one channel with the mid-life core can prompt the other engine to match thrust to maintain symmetry, leading to dual rollback.
You think the Thrust Asymmetry Protection could kick in and leave the aircraft with little to no thrust?

1 user liked this post.

Lonewolf_50
2025-06-30T13:08:00
Post: 11913613
Originally Posted by The Brigadier
We know that the right-hand GEnx-1B was removed for overhaul and re-installed in March 2025 so it was at “zero time” and zero cycles, meaning a performance asymmetry that the FADEC would have to manage every time maximum thrust is selected. If the old engine was still on the pre-2021 EEC build while the fresh engine carried the post-Service Bulletin software/hardware, a dual “commanded rollback” is plausible.
A latent fault on one channel with the mid-life core can prompt the other engine to match thrust to maintain symmetry, leading to dual rollback.
Then why didn't that happen on the previous flight from Delhi to Ahmedabad, or any of the previous flights since that engine install in March?
Originally Posted by silverelise
He also confirmed that all the data from the recorders has been downloaded and is being processed by the Indian AAIB, no boxes have been sent abroad.
The 30 day deadline for the preliminary report is July 12th.
Thanks for the update, and in particular that bolded bit.
Originally Posted by the linked article
Investigators still haven’t ruled out the possibility of sabotage being behind the Air India crash in Ahmedabad earlier this month that killed 274 people, according to India’s aviation minister. The Aircraft Accident Investigation Bureau (AAIB) has confirmed that the aircraft’s flight recorders – known as black boxes – will not be sent outside the country for assessment and will be analysed by the agency, said Murlidhar Mohol, the minister of state for civil aviation.
island_airphoto
2025-06-30T13:24:00
Post: 11913622
Originally Posted by Lonewolf_50
Thank you for that answer, edge cases do abound in complex systems, but would not moving the throttles forward by hand (as the thrust was beginning to reduce {for that strange reason}) overcome that and restore thrust?
(As I don't fly the 787, I may be missing something basic on how the systems work).
I too don't fly that plane or any other FADEC plane for that matter, so I'll leave that to others. I keep thinking of the DA-42 that killed both engines due to an odd circumstance they didn't think would happen. It was a gross example of a corner case, but you can have other ones much harder to root out.

1 user liked this post.

The Brigadier
2025-06-30T13:59:00
Post: 11913645
Originally Posted by skwdenyer
You think the Thrust Asymmetry Protection could kick in and leave the aircraft with little to no thrust?
This could be several issues aligning to cause the loss of thrust. If the new engine was installed and the synchronisation step was omitted by maintenance staff and the engines had different CPU/software versions then there could be an emergent failure mode when maximum thrust is applied resulting in a FADEC rollback. Almost impossible to anticipate and create that in a test scenario. Don't overlook that the pilot moving thrust levers to override rollback will be ignored by the software, if the FADEC has flipped into protective mode.

That said, the continued absence of the FAA issuing an Emergency Airworthiness Directive for the Dreamliner suggests to me the fault was something like contaminated fuel which was specific to that flight.
fdr
2025-06-30T23:39:00
Post: 11913950
Originally Posted by The Brigadier
We know that the right-hand GEnx-1B was removed for overhaul and re-installed in March 2025 so it was at “zero time” and zero cycles, meaning a performance asymmetry that the FADEC would have to manage every time maximum thrust is selected. If the old engine was still on the pre-2021 EEC build while the fresh engine carried the post-Service Bulletin software/hardware, a dual “commanded rollback” is plausible. A latent fault on one channel with the mid-life core can prompt the other engine to match thrust to maintain symmetry, leading to dual rollback.
However, a roll back on its own to idle would not give the evidenced gear behaviour nor the RAT (I happen to concur that the RAT was deployed and probably automatically). Given the gear tilt, it is safe to assume no engine is at idle, the normal electrical systems are not functioning at all.

3 users liked this post.