Posts by user "Feathers McGraw" [Posts: 8 Total up-votes: 27 Pages: 1]

Feathers McGraw
2025-06-16T22:33:00
permalink
Post: 11903844
I'd like to mention something that, while unrelated, does shed a bit of light on how computerised systems can misinterpret input data and take the wrong action. I spent 40-odd years as an electronics engineer working on complex systems, and it can be surprising just how careful one must be in systems that sample data and then process it in software for decision making.

On August 9th 2019, there was a significant grid failure in the UK leading to load shedding (removing supply from many consumers, including Newcastle Airport). It started when a series of lightning strikes in Hertfordshire caused generators at the Little Barford combined-cycle gas turbine plant to trip. This was followed by the shutdown of the power concentrator and grid connector from the Hornsea 1 offshore wind farm, and then by significant excursions of the grid frequency outside its acceptable limits, which is what triggered the further load shedding.

The reason I mention it is that Hornsea 1 going offline was due to the software operating the concentrator/connector sensing large voltage transients caused by the lightning strikes 120 miles away. However, these transients were only spikes of the order of 10 µs long, on a nominal 20 ms cycle at 50 Hz on the grid. In old, reliable grid equipment that had been in use for decades, such spikes would have been ignored because the inertia of the large rotating machines would make them irrelevant. Other systems went into various protective shutdown states: some of the Siemens Class 700 trains had to be reset by the train crew, while others required a Siemens engineer to drive to each train and reload its firmware. The trains' protection mode was triggered because the grid frequency on the 25 kV AC supply went below 49.8 Hz. That was a programmed default, and it turned out to be a very conservative one: the trains could have continued to operate normally at even lower frequencies, but someone wrote the software without actually testing what a sensible limit was. The very widespread problems this caused could have been avoided by not acting instantly on microsecond transients in a 50 Hz system.
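To make the point concrete, the sort of persistence (debounce) check I mean might look like the following Python sketch; the sample rate, threshold and hold time here are invented for illustration, not real grid or rolling-stock values.

# Minimal sketch of a persistence check for a sampled 50 Hz system.
# All numbers are illustrative assumptions, not real grid-code values.

SAMPLE_RATE_HZ = 10_000        # 0.1 ms between samples (assumed)
TRIP_THRESHOLD_HZ = 49.8       # under-frequency limit, as in the Class 700 case
HOLD_TIME_S = 0.5              # fault must persist this long before tripping

HOLD_SAMPLES = int(HOLD_TIME_S * SAMPLE_RATE_HZ)

def should_trip(frequency_samples):
    """Trip only if the frequency stays below the threshold for the whole
    hold time; a 10 us spike spans a single sample and is simply ignored."""
    consecutive = 0
    for f in frequency_samples:
        if f < TRIP_THRESHOLD_HZ:
            consecutive += 1
            if consecutive >= HOLD_SAMPLES:
                return True
        else:
            consecutive = 0    # any in-range sample resets the counter
    return False

A single 10 µs transient can never accumulate enough consecutive bad samples to trip, whereas a genuine sustained frequency excursion still does.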

Is it possible that the monitoring systems on a Boeing 787 also sample the electrical system voltages and currents at a relatively high frequency, and thus, in the event of a transient of some type, over-react by taking precipitate action instead of waiting a short time and re-sampling? I have seen a suggestion that an alternative explanation for the "bang" heard by the survivor in seat 11A might have been the sound of a Bus Tie Contactor closing, which in itself suggests something quite important relating to the electrical system. The 787 is unusual in that it uses variable frequency AC generators whose outputs are rectified and then inverted to other AC voltages and also to quite high DC voltages, some in the 250-300 V region.

I hope that some hard information is going to come out from the investigators soon. But given that the flap mis-selection idea is effectively debunked, and that we know the main undercarriage either started its retraction cycle with the bogies tilting forwards or that falling hydraulic pressure caused the same thing to happen, the only thing that fits the observed flight path is loss of thrust on both engines, which could have either preceded or followed an electrical failure. We also know that the RAT deployed, and that in the relatively undamaged tail cone the APU inlet was open or opening, indicating a likely auto-start of the APU because the parameters that trigger it had occurred.

I would like to know how many tests of the electrical/computer interactions in 787 development involved arcing/shorting voltage/current transient testing. Is this the sort of thing that is done at all? The EECs (FADECs) in the engines are self-powered via magnetos and self-controlling in many circumstances, but perhaps something caused them to think that the thrust levers had been retarded, and such a thing might have been down to the effect of electrical transients on the various signals received by the EECs.

I have read the original 65+ pages of the thread, but I have not seen any discussion of this particular idea. Maybe that is because the 787 is quite a significant departure from Boeing's previous design practice, with totally different electrical systems, higher-pressure hydraulics and no doubt other aspects as well.

What do you all think?

Subjects: APU  Electrical Failure  Generators/Alternators  Hydraulic Failure (All)  Hydraulic Pumps  Parameters  RAT (All)  RAT (Deployment)

15 users liked this post.

Feathers McGraw
2025-06-17T10:31:00
permalink
Post: 11904184
Presumably they must also supply that information to the FDR at least, but I don't know how.

Subjects: FDR

Feathers McGraw
2025-06-19T15:02:00
permalink
Post: 11906093
The more important recorder, the one from under the flight deck, which probably has better cockpit voice data, is highly likely to be much more damaged.

This is in reply to EDLB

Subjects: None

Feathers McGraw
2025-06-20T12:03:00
permalink
Post: 11906898
VT-ANB flew DEL-AMD, a sector of approximately 1 hr 15 min, before its fatal departure from AMD.

Does anyone know if fuel was added at AMD, or was the total fuel required for the flight to LGW already aboard on departure from DEL?

Subjects: None

3 users liked this post.

Feathers McGraw
2025-06-20T13:01:00
permalink
Post: 11906962
Originally Posted by Roo
Unlikely to have tankered in fuel & even if they did they would still have had to uplift more. It would be normal to load the onward fuel in AMD, having only flown in with what they needed for the short sector. Would most likely have departed DEL with wing tank fuel only and an empty centre tank. I won't speculate here, but I do wonder what works might have occurred in the 8 hour stop in DEL.
I don't disagree, but I wondered if there could be a plausible reason for filling up at DEL. I am sure the investigation will consider all fuel sources used by VT-ANB in the last day or more of its operation. Not that I am pointing the finger at fuel problems; I just don't know, that's all.

Subjects: Centre Tank

Feathers McGraw
2025-06-21T13:50:00
permalink
Post: 11907772
Originally Posted by Crossky
Hello, this is my first post on pprune; as a 787 pilot I’m also puzzled by this accident. All seem to agree that for some reason there was a complete electrical failure and RAT deployment. With a complete electrical failure all six main fuel pumps fail. Each engine also has two mechanically driven fuel pumps. On takeoff, if there is fuel in the center tank, it will be used first, pumped by the two center tank pumps.
My airline's manuals don't go into much detail, but I read on another site that if both the center tank pumps fail, the engine driven pumps aren't able to suction feed well enough from the center tanks to sustain engine operation. If there was fuel in the center tanks, a complete electrical failure would soon lead to center tank fuel pump failure (all fuel pumps failing, as stated previously) and fuel starvation of both engines. A rescue from this situation would be an immediate selection of both center tank fuel pumps OFF (not in my airline's non-normal checklists) and waiting for successful suction feed from the L and R main tanks to occur; this would take a number of seconds.
Is this something that you train for in your airline? Am I correct that to do this requires making the needed switch selections on the overhead panel?

Further up the thread, one of the posters mentions that it is very unlikely that any crew action (checklist, QRH) would have got anywhere near changing a fuel pump switch position.

Subjects: Centre Tank  Electrical Failure  Fuel (All)  Fuel Cutoff  Fuel Pumps  RAT (All)  RAT (Deployment)

Feathers McGraw
2025-06-21T19:24:00
permalink
Post: 11907998
Originally Posted by JustusW
This has been sitting a bit wrong with me for the entirety of the thread. When people say "software" they have an image in their head, and in this case the image is rather unhelpful if not outright wrong.
When we say "Software" we mean things like this:
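[The code snippet quoted at this point did not survive the archiving; judging from the "happy check" referred to below, it was a trivial sequential check, something like this Python stand-in:]

# Hypothetical stand-in for the lost snippet: everyday sequential software,
# evaluating one condition at a time.
user_is_happy = True

if user_is_happy:
    print("All is well")
else:
    print("Something needs attention")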

This is the type of software we are usually subject to in our everyday lives, basically everywhere: your phone, your fridge, your oven, your water heater, your car, and so on, ad nauseam.

In case of the Safran FADEC 3 this is not actually what we're dealing with. It uses something called an FPGA (Field Programmable Gate Array) which is a very different beast to what we are used to dealing with.

In a normal computer we simply translate code like the above into something less human-readable but interpretable by a microchip or the like. This "machine code" is then run sequentially and does whatever it is meant to do (hopefully). If we were to expand our little happy check above with another condition, our computer would check those conditions one after the other. This opens up a lot of hoopla about timing, especially when interacting with the real world.

Let's say you want to track N1 and N2 in an engine. You'd have to make a measurement; that value would have to go through an AD converter, which writes it to some (likely volatile) storage, where it is then accessed by some sort of (C)PU, transferred to yet another piece of volatile memory, and then used in the computation. This takes time, because you can't do all of those things within the same clock cycle.

Unsurprisingly this is rather inconvenient when dealing with the real world and especially when dealing with volatile physical processes that need monitoring. Like a modern turbine engine.

Enter the FPGA.

While it is programmable, what that actually means is that (at a very high level) you can build a thing called a truth table, that is, a definitive mathematical mapping of input states to output states. Unlike our sequential, CPU-driven system, an FPGA is able to perform its entire logic every time it is asked to do so. We don't have to wait for our happy check before performing any other check.

This is very useful for our turbine engine, because now we can verify that N2 is smaller than X without delaying our check that the Throttle Control Lever setting is within acceptable distance of N1 and N2, while also creating the appropriate outputs for all the different bypasses and bleed valves in a modern engine, and so on. The Safran FADEC 3 does this "up to 70 times per second", as per the vendor.
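To make that concrete, a truth table is nothing more than a total mapping from every possible input state to a predetermined output state. Here is a toy Python model; the input names and the rule are invented for illustration, not real FADEC logic:

# Toy model of a truth table: every possible input combination is mapped
# to a predetermined output. Inputs and rule are invented for illustration.
from itertools import product

truth_table = {}
for n2_over_limit, tla_mismatch, weight_on_wheels in product((False, True), repeat=3):
    cut_fuel = n2_over_limit and tla_mismatch and weight_on_wheels  # invented rule
    truth_table[(n2_over_limit, tla_mismatch, weight_on_wheels)] = cut_fuel

# Evaluation is a single lookup; every "condition" is decided at once,
# which is what the combinational logic in an FPGA does in hardware.
print(truth_table[(True, True, False)])   # -> False

In the FPGA the whole mapping is baked into logic gates, so all outputs settle in parallel rather than being checked one after another.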

In order to be manageable, the actual FADEC consists of multiple different pieces, many of which are FPGAs. At least two are used for so-called "input conditioning". This is where messy real-world values from sensors subject to real physics are converted into nice, clean numbers. Here our values from the Throttle Control Levers come in, the signals from our N1 and N2 sensors, the WOW signal, and on, and on, and on.

These are then turned into logic-level signals provided to a different set of FPGAs. Lacking an actual schematic, I suspect this is what is sometimes referred to as "channels". A channel could consist of both a signal conditioning stage and a logic stage, or (more likely) redundant signal conditioning stages feed separately into separate FPGA "channels", which are evaluated, and the end result is then likely put into yet another FPGA for control output generation.

Why is this important?

Because it is basically impossible for a "bug" to exist in this system. These systems are the epitome of a dumb computer. They do _precisely_ what they are meant to do. The TCMA is simply a set of conditions evaluated just like any other condition in those FPGAs. If there is an "error" then it is in the requirements that lead to the truth table used by these devices but never in the truth table itself. In fact these types of computation device are used precisely because of that very property.

So when people say "the FADEC software" (ie TCMA) has "failed" or "has a bug" what they're really saying is:
The conditions that the system experienced in the real world were incorrectly assessed in the system requirements, and led to an undesired output state when compared with the real-world result we would have preferred.
A bit of a mouthful, granted, but an important mouthful. This simply wouldn't be someone missing a semicolon somewhere in a thousand lines of code.

Regards,
Justus
I will just comment here that FPGAs contain many logic blocks with routing channels around them which allow the blocks to be connected up in various ways. Some of the logic is asynchronous; some is synchronous, and thus uses clock signals to advance the state of the logic according to a combination of input signals, asynchronous logic outputs and signals fed back from other logic blocks.

All of this is created by writing software in something like VHDL (VHSIC Hardware Description Language, where VHSIC stands for Very High Speed Integrated Circuit), the output of the compiler being an object file that contains the connection information for the on-FPGA routing. However, this process requires careful simulation, not just of the logical behaviour of the FPGA being designed, but also of the ability of the programmable routing on the FPGA to get the required signals to where they are needed within the necessary set-up times, so that no signal changes state during the period when it must be static and everything propagates correctly through the logic blocks before the next clock edge. With clock signals forming a clock tree across the device, the skew on the clocks must be carefully checked so that the tightest skew requirements are always met; there may be more margin in the less critical parts of the system.
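As a toy illustration of the kind of static timing check the design tools must perform, here is a back-of-envelope setup-slack calculation in Python; every number and path name is invented:

# Toy static timing check: does each register-to-register path settle
# before the next clock edge? All values are invented for illustration.
CLOCK_PERIOD_NS = 10.0
SETUP_TIME_NS = 0.5     # data must be stable this long before the clock edge
CLOCK_SKEW_NS = 0.3     # worst-case skew between launching and capturing flip-flops

paths_ns = {            # hypothetical combinational + routing delays
    "tla_decode -> n1_compare": 7.2,
    "n2_limit -> output_mux": 9.6,
}

for name, delay in paths_ns.items():
    slack = CLOCK_PERIOD_NS - CLOCK_SKEW_NS - SETUP_TIME_NS - delay
    verdict = "OK" if slack >= 0 else "VIOLATION: may glitch"
    print(f"{name}: slack {slack:+.1f} ns ({verdict})")

The second path fails by 0.4 ns, which is exactly the sort of marginal routing problem that gets worse as the device fills up.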

Now, I am sure that the state of the art has moved on a lot since I was heavily involved in this sort of engineering, but I can say that from watching my colleagues prototype ASICs (application-specific devices that, unlike an FPGA, are not reprogrammable) on FPGAs, precisely because an FPGA can be reprogrammed to cure bugs found in the code that created it, I am absolutely sure it is possible for a latent bug to exist in what has been built, despite extensive testing having been carried out. I well remember sitting down with an FPGA designer and a software engineer and showing them a situation where one of my radio devices was not being commanded to transmit. They refused to believe me until we captured their output signals in the failed state, and I was thus able to prove that it was their logic and code that was failing: my radio system was fully working, except that it couldn't possibly transmit while its enable line was inactive.

I would therefore never state that an FPGA-based FADEC is infallible. The devil will be in the detail of the functions contained within the devices, how they interact with each other, and how the original logical functions are described, specified, coded and checked for logical errors, before we even get to the physical reality of the FPGA manufacturer's design and development system that bolts onto the back of the source code. If the FPGA devices are being pushed anywhere near the edge of their performance envelope then, just like an aircraft, things can go wrong. If a design begins with a device that is only just large enough, in terms of available logic blocks and routing channels, for the logic required, it is almost a certainty that development will take it into this area of its performance, and a lot of tweaking may be needed, which means even more testing will be needed to be reasonably sure that it does what it is supposed to.

Subjects: Air Worthiness Directives  FADEC  TCMA (All)

9 users liked this post.

Feathers McGraw
2025-06-22T15:27:00
permalink
Post: 11908621
Originally Posted by za9ra22
While suspecting that mods may consider this subject outside the realm of this thread - and I think it was raised previously - I have to say that, to my mind, there would be significant questions over the integrity and reliability of data collected via third-party commercial businesses or agencies, which may or may not have vested interests, and over the vulnerability of any transmitted data to unauthorised and unknown outside access.

Just that last consideration, meaning the need to introduce information security 'experts' into the analysis of the data, might create far more problems than it solves.
If the data is encrypted and/or signed with a suitably validated certificate, then any tampering becomes immediately obvious, which itself makes the tampering pointless, especially if the data is sent via multiple routes.
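As a minimal sketch of the principle, using an HMAC as a stand-in for a full certificate-based signature (key distribution and certificate validation omitted):

# Minimal sketch: detecting tampering with a signed telemetry record.
# An HMAC stands in for a certificate-based signature; key handling omitted.
import hashlib
import hmac

key = b"shared-secret-key"                        # illustrative only
record = b"2025-06-12T08:08:51Z,N1=83.2,N2=97.4"  # invented telemetry record

signature = hmac.new(key, record, hashlib.sha256).hexdigest()

# Any modification in transit changes the digest, so tampering is obvious:
tampered = b"2025-06-12T08:08:51Z,N1=13.2,N2=97.4"
print(hmac.compare_digest(
    signature,
    hmac.new(key, tampered, hashlib.sha256).hexdigest()))  # -> False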

Subjects: None