CSL '0401' Program Binary Disassembly Notes


  • karter16
    replied
    Originally posted by bmwfnatic View Post

    Don’t have to do this by hand, if you can get a compiler going for this arch you can just write it in C, compile as a single function without runtimes and copy the assembly from the compiled program.
    I'd still need to map the various memory locations for the compiler somehow though right? (e.g. stack location, locations of the global variables, etc., plus tell it what values were already in what registers, etc.). Also if the compiler doesn't have an understanding of the memory map and where the function is going to go in memory space it's not necessarily going to get things like near and far jumps right, etc. I'd have thought?



  • bmwfnatic
    replied
    Originally posted by karter16 View Post
    Don't want to make it seem simpler than it is. That's still over 100 instructions to craft by hand (no assembler, no ability to use labels, etc.) and then debug by hand. It's a lot of work, but it does seem as though it should be achievable.
    Don’t have to do this by hand, if you can get a compiler going for this arch you can just write it in C, compile as a single function without runtimes and copy the assembly from the compiled program.



  • heinzboehmer
    replied
    Originally posted by karter16 View Post
    You have more confidence in my assembly (actually it's not even going to be assembly really, more like poking machine code directly into the ROM) programming skills than I do myself haha. But yes, I was thinking about this further and I think that the safety concept implementation probably does mitigate a fair amount of the risk actually. I think provided I only change the Master section of the ROM then we can be pretty confident that the Slave will identify scenarios where the Master is non-performant and reset it and log a code. It hadn't occurred to me last night that all critical safety functions are maintained in parallel by both processors. I'll still do some more investigation into this but from a safety perspective even if there was some edge case bug in the modified code the Slave should maintain the function of the DME from a safety perspective.
    I would 100% put my DME at risk to try this sort of thing out. Thing has been through WAY worse. There's a couple jumper wires hanging out in there cause I've destroyed some pads with so much soldering/desoldering.

    But I totally get it, distributing this kind of stuff is scary. Almost as scary as the assembly itself, ha.


    Also, I think Bryson covered everything. I'm essentially logging the exact same things.



  • karter16
    replied
    Okay cool - TKA it is.

    The more I look at this the simpler it seems to be getting. Slight change in approach to that proposed above.

    Turns out there actually IS some space to add a pointer in the array to another CAN function. It looks like it's full but actually repeats several times for ARBID 0x513 (which is the last one in the array) so I could slot an additional one in there. The advantage of that is that I could then use the existing CAN_Send_DME_ARBID() function to call the new one as well (less messing around making sure I've set the stack correctly before jumping to the new function, etc.).

    This then means that the actual coding boils down to the following:

    1: Create a copy of CAN_Send_DME_ARBIDs() in empty ROM space - ~ 30 instructions.
    2: Reference the new function in the 10ms task - 1 instruction.
    3: Create a new function that handles the new CAN message in empty ROM space - ~ 80 instructions (based on the function for ARBID 0x316).

    This would of course be for a simple version with fixed variables. Abstracting out the ability to configure what variables to push would be significantly more work.

    Don't want to make it seem simpler than it is. That's still over 100 instructions to craft by hand (no assembler, no ability to use labels, etc.) and then debug by hand. It's a lot of work, but it does seem as though it should be achievable.



  • Bry5on
    replied
    Originally posted by karter16 View Post

    Awesome thanks!

    Intake Air Temp = TAN byte on Master @ 0x00FFEDD9
    Radiator Temp - do you want coolant temp (TMOT byte on Master @ 0x00FFEDDA) or radiator outlet temp (TKA byte on Master @ 0x00FFEDDE)?
    EGT = TABG word on slave but available on DPR @ 0x00FF8088
    Relative Opening = AQ_REL word on slave but available on DPR @ 0x00FF80DE or AQ_REL_ALPHA_N word on Master @ 0x00FE97A depending on what you want to use it for.
    Lambda Integrator 1 = LA_F_REGLER1 word on Slave but available on DPR @ 0x00FF80CA
    Lambda Integrator 2 = LA_F_REGLER2 word on Slave but available on DPR @ 0x00FF80CC

    All of these are available either from the Master CPU or are already available on the Dual-Ported RAM which is convenient.





    You have more confidence in my assembly programming skills than I do myself haha. But yes, I was thinking about this further and I think that the safety concept implementation probably does mitigate a fair amount of the risk actually. I think provided I only change the Master section of the ROM then we can be pretty confident that the Slave will identify scenarios where the Master is non-performant and reset it and log a code. It hadn't occurred to me last night that all critical safety functions are maintained in parallel by both processors. I'll still do some more investigation into this but from a safety perspective even if there was some edge case bug in the modified code the Slave should maintain the function of the DME from a safety perspective.
    I think TMOT is already available on CAN ID 0x329 but radiator outlet temp is not. So, radiator outlet temp

    Relative opening would be for part throttle tuning, or IDing where in the map you are for any given condition. Nice that they're all generally available!

    For completeness, here are some of the other things I log:

    CAN ID 0x316:
    - Engine Speed
    - Torque command

    CAN ID 0x329:
    - Cruise control buttons
    - Coolant temp
    - Ambient pressure
    - TPS
    - Clutch/Neutral status

    CAN ID 0x545:
    - Cruise active status
    - Oil temp
    - Oil level

    CAN ID 0x153 (from DSC):
    - Speedo
    - Brake status (binary on/off, not pressure)
    - DSC status
    - DSC intervention status
    - DSC commanded throttle cuts

    CAN ID 0x615 (from cluster):
    - Ambient temp
    - Displayed Speed
    - Night lighting status (thanks Heinz)

    CAN ID 0x1F3 (from DSC):
    - Lateral acceleration
    - Longitudinal acceleration
    - Yaw rate

    CAN ID 0x1F5 (from steering angle sensor):
    - Steering angle
    Last edited by Bry5on; 06-09-2025, 10:09 PM.



  • karter16
    replied
    Originally posted by Bry5on View Post

    The list:
    Intake Air Temp
    Radiator Temp
    EGT
    Relative Opening
    Lambda Integrator 1
    Lambda Integrator 2
    heinzboehmer anything I'm missing? Maybe Engine Load?

    I think everything else needed is available over CAN already
    Awesome thanks!

    Intake Air Temp = TAN byte on Master @ 0x00FFEDD9
    Radiator Temp - do you want coolant temp (TMOT byte on Master @ 0x00FFEDDA) or radiator outlet temp (TKA byte on Master @ 0x00FFEDDE)?
    EGT = TABG word on slave but available on DPR @ 0x00FF8088
    Relative Opening = AQ_REL word on slave but available on DPR @ 0x00FF80DE or AQ_REL_ALPHA_N word on Master @ 0x00FE97A depending on what you want to use it for.
    Lambda Integrator 1 = LA_F_REGLER1 word on Slave but available on DPR @ 0x00FF80CA
    Lambda Integrator 2 = LA_F_REGLER2 word on Slave but available on DPR @ 0x00FF80CC

    All of these are available either from the Master CPU or are already available on the Dual-Ported RAM which is convenient.

    What is less convenient is that that's 10 bytes, and ideally we keep it to 8 bytes to fit in a single CAN message.
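    The byte count above can be checked with a small sketch. It assumes the radiator value is a single byte (TKA is used here, TMOT would be the same width) and that the message is a classic CAN frame with an 8-byte data field. The `LogValue` type and the helper names are hypothetical, introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Width in bytes of each requested value, taken from the list above. */
typedef struct {
    const char *name;
    uint8_t     width;   /* 1 = byte, 2 = word */
} LogValue;

static const LogValue requested[] = {
    { "TAN",          1 },
    { "TKA",          1 },   /* assumption: TKA chosen; TMOT is also a byte */
    { "TABG",         2 },
    { "AQ_REL",       2 },
    { "LA_F_REGLER1", 2 },
    { "LA_F_REGLER2", 2 },
};

#define CAN_MAX_PAYLOAD 8u   /* data field of a classic CAN frame */

/* Sum the widths of all requested values. */
static size_t payload_bytes(const LogValue *v, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i].width;
    }
    return total;
}

/* Bytes that do not fit in one frame (0 if everything fits). */
static size_t overflow_bytes(const LogValue *v, size_t n)
{
    size_t total = payload_bytes(v, n);
    return total > CAN_MAX_PAYLOAD ? total - CAN_MAX_PAYLOAD : 0;
}
```

    With the six values listed, the sum is 10 bytes, so 2 bytes would have to be dropped, narrowed, or moved to a second message.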



    Originally posted by Bry5on View Post
    And I think I can speak for Heinz when I say both of us are game to put DMEs at risk here - not much we can really do to damage the engine without it really not working at all I think.​
    You have more confidence in my assembly (actually it's not even going to be assembly really, more like poking machine code directly into the ROM) programming skills than I do myself haha. But yes, I was thinking about this further and I think that the safety concept implementation probably does mitigate a fair amount of the risk actually. I think provided I only change the Master section of the ROM then we can be pretty confident that the Slave will identify scenarios where the Master is non-performant and reset it and log a code. It hadn't occurred to me last night that all critical safety functions are maintained in parallel by both processors. I'll still do some more investigation into this but from a safety perspective even if there was some edge case bug in the modified code the Slave should maintain the function of the DME from a safety perspective.
    Last edited by karter16; 06-09-2025, 09:46 PM.



  • Bry5on
    replied
    Originally posted by karter16 View Post
    ​If you could dig out your ideal list of variables that would be super handy as a starting point. It will probably also give me the impetus to figure out what to do about values that originate from the slave CPU. I haven't looked but I'm guessing that there's probably room left in the DPR (given the carefree way BMW's engineers used it when adding on the 0401 specific functionality).
    The list:
    Intake Air Temp
    Radiator Temp
    EGT
    Relative Opening
    Lambda Integrator 1
    Lambda Integrator 2
    heinzboehmer anything I'm missing? Maybe Engine Load?

    I think everything else needed is available over CAN already

    And I think I can speak for Heinz when I say both of us are game to put DMEs at risk here - not much we can really do to damage the engine without it really not working at all I think.



  • karter16
    replied
    Originally posted by Bry5on View Post
    Holy crap this is amazing! I think I had an ideal list of variables at some point back. This is all above my pay grade to easily jump in, how can I help?
    ​If you could dig out your ideal list of variables that would be super handy as a starting point. It will probably also give me the impetus to figure out what to do about values that originate from the slave CPU. I haven't looked but I'm guessing that there's probably room left in the DPR (given the carefree way BMW's engineers used it when adding on the 0401 specific functionality).


    Originally posted by heinzboehmer View Post
    Instead of copying the logic in CAN_Send_DME_ARBIDs(), why not do some things like this:

    Code:
    void CAN_Send_All_ARBIDs(void) {
      CAN_Send_DME_ARBIDs();
      if (enough_time && enough_memory && etc) {
        CAN_Send_Extra_ARBIDs();
      }
    }
    Then CAN_Send_All_ARBIDs() can replace CAN_Send_DME_ARBIDs() in the 10ms task and CAN_Send_Extra_ARBIDs() can handle anything new.

    This makes it so that all the original logic is run with very little change in timing. Also gives you a lot more flexibility cause you can tell CAN_Send_Extra_ARBIDs() to do whatever you want and it won't interfere with the stock routines. Lastly, if the checks are comprehensive, you won't bog down the CPU.

    I like the idea of building out a new data structure for the extra data. No sense in trying to squeeze it in with the rest when you can just have your new function deal with the new data location.
    Yeah that's a good suggestion. I'm not so sure that it will be easy/possible to check how much time/memory is left, given the way the OS manages and calls the tasks. I would need to dig into this a bit more as part of this work. I know that the safety concept has safeguards built in to stop the DME from being overwhelmed by incoming CAN messages (given these are processed via interrupt), but I can't imagine anything similar exists for the outgoing messages, given they are triggered from the timed task. Regardless, I'll also look into this further.



    I'd be lying if I said I didn't have some concerns about the legal and moral liability of this. Making a modification of this scale to the program code without complete documentation and without a complete understanding of the entire operating system isn't without risk. Even if others completely absolved me of any responsibility in their eyes I'm not sure if I could stomach the risk of damaging someone else's car, or their life....

    I think next steps are as follows:

    1: Start getting together a list of values that are most in demand to be logged (this helps check whether there are any other gotchas by playing out real examples)
    2: Spend more time digging into OS safeguards around the timed tasks to understand worst-case scenarios
    3: Look into options around getting values from the slave CPU (probably via the DPR)



  • heinzboehmer
    replied
    Instead of copying the logic in CAN_Send_DME_ARBIDs(), why not do some things like this:

    Code:
    void CAN_Send_All_ARBIDs(void) {
      CAN_Send_DME_ARBIDs();
      if (enough_time && enough_memory && etc) {
        CAN_Send_Extra_ARBIDs();
      }
    }
    Then CAN_Send_All_ARBIDs() can replace CAN_Send_DME_ARBIDs() in the 10ms task and CAN_Send_Extra_ARBIDs() can handle anything new.

    This makes it so that all the original logic is run with very little change in timing. Also gives you a lot more flexibility cause you can tell CAN_Send_Extra_ARBIDs() to do whatever you want and it won't interfere with the stock routines. Lastly, if the checks are comprehensive, you won't bog down the CPU.

    I like the idea of building out a new data structure for the extra data. No sense in trying to squeeze it in with the rest when you can just have your new function deal with the new data location.



  • Bry5on
    replied
    Holy crap this is amazing! I think I had an ideal list of variables at some point back. This is all above my pay grade to easily jump in, how can I help?



  • karter16
    replied
    Okay so with the latest prodding from Bry5on I've spent a bit more time looking at options around sending values over CAN. Here's my thoughts on how it could be achieved.

    The DME sends 3 different CAN messages to the bus every 10ms (4 if SMG). It does this by calling a function in the 10ms task on the Master CPU.

    [Image: Screenshot 2025-06-09 at 8.33.31 PM.png]

    This function looks like this:

    [Image: Screenshot 2025-06-09 at 8.34.03 PM.png]

    and the CAN_Send_DME_ARBID() function looks like this:

    [Image: Screenshot 2025-06-09 at 8.36.04 PM.png]

    In turn the PTR points to an array of memory locations, which in turn point to the specific functions for each CAN message:

    [Image: Screenshot 2025-06-09 at 8.36.46 PM.png]

    [Image: Screenshot 2025-06-09 at 8.37.21 PM.png]
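
    Since the disassembly lives in screenshots, here is a rough C sketch of the structure described: a function called from the 10ms task that sends each message in turn by looking up its builder through an array of function pointers. The function names `CAN_Send_DME_ARBIDs()` and `CAN_Send_DME_ARBID()` come from the post; the slot-index parameter, the table contents and order, and the `record()` helper are assumptions made so the sketch can run on a PC. The real routines write to CAN hardware rather than a log.

```c
#include <stddef.h>
#include <stdint.h>

/* Each CAN message has its own builder function, reached via a pointer. */
typedef void (*CanMsgBuilder)(void);

/* Host-side stand-in for "message went out": remember which IDs were built. */
static unsigned sent_log[8];
static size_t   sent_count;

static void record(unsigned arbid)
{
    if (sent_count < sizeof(sent_log) / sizeof(sent_log[0])) {
        sent_log[sent_count++] = arbid;
    }
}

/* Illustrative builders; the IDs are ones mentioned in the thread. */
static void build_0x316(void) { record(0x316); }
static void build_0x329(void) { record(0x329); }
static void build_0x545(void) { record(0x545); }

/* The array the PTR refers to: one function address per message. */
static const CanMsgBuilder can_builders[] = {
    build_0x316,
    build_0x329,
    build_0x545,
};

#define CAN_BUILDER_COUNT (sizeof(can_builders) / sizeof(can_builders[0]))

/* Look up one slot in the table and call through the pointer. */
static void CAN_Send_DME_ARBID(size_t slot)
{
    if (slot < CAN_BUILDER_COUNT) {
        can_builders[slot]();
    }
}

/* Called from the 10ms task: send every message in table order. */
static void CAN_Send_DME_ARBIDs(void)
{
    for (size_t slot = 0; slot < CAN_BUILDER_COUNT; slot++) {
        CAN_Send_DME_ARBID(slot);
    }
}
```

    Seen this way, adding a message means either adding a table entry or calling an extra builder from a replacement for `CAN_Send_DME_ARBIDs()`.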


    So where does this leave us?

    I think the best approach is to add in a new additional CAN message. This would give us 8 additional bytes that we could poke our values into. And I think that the best way to do that is to replace CAN_Send_DME_ARBIDs() with a new replacement function.

    1: It's an easy change in the 10ms task to call a function at a different memory location. No change in length of code, just the memory address to jump to.
    2: We can then build out a copy of CAN_Send_DME_ARBIDs() but add in the extra code to call a new function for our new CAN message.
    3: We don't need to try to inject a value into the existing array of CAN messages (which there isn't room to do).
    4: We can then build out a new function to send our new CAN message.

    By doing it this way we maximize our flexibility and minimize any changes to existing code and data. It also leaves open a tantalizing opportunity.

    By adding another layer of abstraction to our new function that constructs our CAN message I believe it would be possible to store the memory addresses of the variables we want to poke into the new CAN message in the partial (tune) binary. This additional layer of abstraction would mean that it would be possible for the end user to control what 8 bytes of data they wanted to make available in the CAN message by editing only the partial binary. There would be some limitations and considerations (e.g. the aforementioned issue of variables not available over the DPR, dealing with 8/16 bit values and how those are mapped (this could be dealt with with some additional config bytes), some variables are produced by functions that run less frequently than every 10ms, so there would be limited value in sending them over CAN so often, etc.)
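
    A minimal sketch of that configurable layer, assuming a small table of (address, width) entries that would live in the partial binary. The `LogSlot` layout, `build_payload()`, and the `ReadByteFn` callback are hypothetical; on the DME the read would be a plain memory load, and the callback exists only so the sketch can run on a PC. Bytes are copied in memory order, so no byte-order conversion is implied.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CAN_PAYLOAD_LEN 8u

/* One configurable entry: where the variable lives and how wide it is. */
typedef struct {
    uint32_t address;
    uint8_t  width;     /* 1 = byte, 2 = word; anything else ends the table */
} LogSlot;

/* Hypothetical memory accessor, injected so the sketch is testable. */
typedef uint8_t (*ReadByteFn)(uint32_t address);

/* Stand-in reader for host testing: returns the low byte of the address. */
static uint8_t fake_read(uint32_t address)
{
    return (uint8_t)(address & 0xFFu);
}

/*
 * Fill an 8-byte payload from the config table.
 * Stops at the first invalid width or when the next value would not fit.
 * Returns the number of payload bytes used; unused bytes are left at 0.
 */
static size_t build_payload(const LogSlot *cfg, size_t n,
                            ReadByteFn read_byte,
                            uint8_t out[CAN_PAYLOAD_LEN])
{
    size_t used = 0;
    memset(out, 0, CAN_PAYLOAD_LEN);

    for (size_t i = 0; i < n; i++) {
        uint8_t width = cfg[i].width;
        if (width != 1 && width != 2) {
            break;                      /* terminator or unsupported size */
        }
        if (used + width > CAN_PAYLOAD_LEN) {
            break;                      /* would overflow the frame */
        }
        for (uint8_t b = 0; b < width; b++) {
            out[used++] = read_byte(cfg[i].address + b);
        }
    }
    return used;
}
```

    Changing what gets logged then means editing only the table, which is the property the post is after.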

    This assumes that the DME has enough headroom to send a 5th message in the 10ms task. I see nothing to suggest that the DME is being pushed that close to the limit already so I doubt that would be an issue in practice. If it is given a sufficiently low-priority ID it wouldn't matter too much I don't think.



  • S54B32
    replied
    Originally posted by Bry5on View Post
    Any chance you're able to increase the frequency/speed that the telegrams reply? I'm assuming it's on an interval routine. This would make the tuning process a lot more useful and quick as well.

    The more useful thing is to broadcast more values over CAN
    karter16 should get in contact with sda2 (robin)! He made similar improvements and changes to the DS2 routines on MS43, and also added CAN IDs for tuning/logging.

    I would love to see cylinder-specific ignition retard from knock control on DS2.
    DS2 itself is so easy (on the tester side): with a small display and an ESP32/Arduino or other MCU you can easily build a data acquisition display, I would say even with limited programming knowledge. I did that some years ago, but I was limited to the available DS2 commands.



  • karter16
    replied
    Originally posted by Bry5on View Post
    Any chance you're able to increase the frequency/speed that the telegrams reply? I'm assuming it's on an interval routine. This would make the tuning process a lot more useful and quick as well.

    The more useful thing is to broadcast more values over CAN
    I haven't managed to unpick that far back yet. The DS2 handler seems to be called via interrupt, so working out what it is that's calling it is the next bit to try work out.

    Definitely broadcasting over CAN would be ideal - I have a rough idea how that would work, it will just be a lot of effort to pull off. When I get a chance I'll do a bit of a write up on how I see that possibly working. I think adding an additional CAN ARBID isn't necessarily the hard bit. The challenge is the Master/Slave arrangement.

    With DS2 the master handles telegrams that have values that originate from Master and likewise the slave handles telegrams for values that originate from slave.

    With CAN the Master handles it all. Therefore broadcasting any values from the slave that aren't already exposed over the DPR (dual ported RAM) is harder.





  • Bry5on
    replied
    Any chance you're able to increase the frequency/speed that the telegrams reply? I'm assuming it's on an interval routine. This would make the tuning process a lot more useful and quick as well.

    The more useful thing is to broadcast more values over CAN



  • karter16
    replied
    So I don't really have any particular use case for it right now, but I'm pretty confident I've worked out enough about the DS2 telegrams and how they are constructed in the DME, and then interpreted in TestO to be able to replace an existing telegram with any set of variables I like from the DME. I'll try to prove this out in the next few weeks with a simple example (I'm thinking I could make AQ_REL_ALPHA_N available in TestO as a demonstration - although this wouldn't actually really be useful as a general solution for the VE tuning as it will be much easier to create a modified version of heinzboehmer's spreadsheet that applies the interpolation factor to AQ_REL).

    What are the limitations?
    - Replacing an existing value with another value of the same size (byte, word, etc.) is no problem as it's simply replacing one memory address with another in the code.
    - The DS2 telegrams tend to have room in them to insert additional values (array elements set to 0), but this is not as easy, because a move instruction is 6 bytes vs. 2 bytes for the clr instruction it would replace. The code would need greater modification, with a jump out to another location.
    - Doing any of this means flashing a custom program binary. It also means while the DME is running the custom binary other DS2 tools (like INPA) won't read the modified values correctly.
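
    The first limitation above (a same-size swap) amounts to overwriting one address operand in the program binary. A minimal sketch of that patch step on a PC-side copy of the ROM follows. It assumes a 4-byte address operand stored big-endian, which is not stated in the post; `patch_address()` and the offsets are hypothetical. The check against the expected old address guards against patching the wrong location.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian value (byte order is an assumption). */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit big-endian value. */
static void write_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Replace the 4-byte address operand at `offset` with `new_addr`,
 * but only if it currently holds `expected_old`.
 * Returns 1 on success, 0 if out of range or the old value does not match.
 */
static int patch_address(uint8_t *rom, size_t rom_len, size_t offset,
                         uint32_t expected_old, uint32_t new_addr)
{
    if (offset > rom_len || rom_len - offset < 4) {
        return 0;                       /* operand would run past the image */
    }
    if (read_be32(rom + offset) != expected_old) {
        return 0;                       /* not the instruction we expected */
    }
    write_be32(rom + offset, new_addr);
    return 1;
}
```

    The instruction length does not change, so nothing after the patched operand moves.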

    Is there actually a use case for this?
    I don't really know to be honest. Maybe other people can think of some useful examples? I'm more envisaging this as potentially being useful for testing/debugging/learning to expose variables that otherwise aren't exposed outside of the DME.

