Yes, I can add a "just for this crash" debugging parameter.My concern is that I've seen this type of issue -- there's a callback, it's bad in some way, we have no idea which callback it is -- before. Each time, we have to add "just for this crash" debugging code and then get it into a dev release and then wait until we get enough adoption of the new dev release before we can even start fixing the real bug, increasing the chances that we ship something buggy to the end user. Perhaps because I'm on a stability gardening rotation, I see more of the "we don't know who to assign this to" problems than other people do. Do other stability gardener types see similar issues?--On Wednesday, May 22, 2024 at 1:52:02 PM UTC-7 Joe Mason wrote:BindStateBase is VERY performance (and size) sensitive, since so many callbacks are created and passed around. If there's really common debug info missing, it might be worth adding more, but it'll be a heavy lift.For this particular crash, since you know that the callback is passed to SetOnAborted, you could save some debug info (eg. a base::debug::StackTrace and/or base::debug:TaskTrace) in a member variable alongside the callback, instead of augmenting the callback. Then use base::Alias on those member variables in the stack frame that calls "on_aborted_.Run()".On Wed, May 22, 2024 at 4:30 PM Joe Mason <[email protected]> wrote:+scheduler-dev, where the experts on task tracing hang out.On Wed, May 22, 2024 at 4:21 PM Ian Barkley-Yeung <[email protected]> wrote:We're trying to research a bug (https://g-issues.chromium.org/issues/335902543) where a OnceCallback has a bad pointer and is crashing.The issue is that the call site doesn't tell us who owns the callback. It's one of those common places that many pieces of code hook into, so it could be any one of dozens of callbacks that has the problem.I've tried examining the stack variables using lldb, but unfortunately, I don't get any useful information about which function the callback will eventually resolve to. Minidumps don't save heap memory, so the contents of the BindState are not recorded.First question: Does anyone have any ideas on how to debug this crash further? Is there a way I'm not seeing to figure out which piece of code has the issue with the invalid pointer?Second question: Assuming no one has better ideas on debugging the problem.... I'm thinking of trying to improve the debuggability of these types of issues. (This isn't the first one I've seen.)In particular, I'd like to move the RETURN_ADDRESS() macro from location.cc into a header, and then have base::BindOnce and base::BindRepeating store the results of RETURN_ADDRESS() into a const void * in BindStateBase, call it BindStateBase::creation_address_. Then, in base::OnceCallback::Run and base::RepeatingCallback::Run, copy the creation_address_ into a local variable and use base::debug::Alias to ensure it's visible on the stack in a minidump. It's annoying but I can then manually symbolize the creation address to the creation function.Thoughts?(Side note: Why not a full Location object? Because the normal trick of having a defaulted parameter Location loc = FROM_HERE at the end of a function's parameter list doesn't work with base::BindOnce / base::BindRepeating since they use variadic templates and I don't think you can have a defaulted parameter in a variadic template function. And I'm not willing to do an LSC to change every single call to base::BindOnce/base::BindRepeating to have a FROM_HERE at the beginning. If someone with more template knowledge than me knows of a clever way to accomplish this without requiring every call site to change, please let me know!)--
--
Chromium Developers mailing list: [email protected]
View archives, change email options, or unsubscribe:
https://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/chromium-dev/16e70588-f302-4607-9c0c-9030e452df59n%40chromium.org.
You received this message because you are subscribed to the Google Groups "scheduler-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/scheduler-dev/2628b2c4-a4c3-433d-a8b5-ada611d25687n%40chromium.org.
> The call stack shows AnimateGroupedChildExpandedCollapse:$_0 on the stack, and those pass raw_ptr... so you'll get the dangling ptr checks.I'm not sure I believe it though. go/crash marks those frames as having been code-folded (ICF). So any callback with a similar signature (a raw pointer and a weak pointer) will go through that code address. Is it possible that's the source of the problem? Yes. Is it certain? No. And I've had other bugs from previous stability gardening rotations where I believed those code-folded signatures and assigned the crash to the wrong team -- it wasted a week.
> For example, if we somehow knew the concrete type of the object, would that help? Could we base::debug::Alias the remains of the object on the stack? I'm guessing even the vtable would be gone by the (currently irreversible) zapping though...
My understanding is that vtable pointers point to read-only data sections in the ELF? So it might be able to understand a vtable pointer.I don't think BindState is copyable, but yes, one could memcpy the whole thing into a local char array of the correct size in one of the Invoker functions. I think it would make more sense to only copy the BindState::functor_ which gives you address the callback wanted to call. But that also does impose some cost on each call to Once/RepeatingCallback::Run. (And it's even more difficult to interpret.) Still, it should also work.On Wednesday, May 22, 2024 at 2:39:51 PM UTC-7 Dave Tapuska wrote:Have you looked at:The call stack shows AnimateGroupedChildExpandedCollapse:$_0 on the stack, and those pass raw_ptr... so you'll get the dangling ptr checks.dave.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/scheduler-dev/CAF3XrKoPvOytC%3Dor4pDxTTNmBASNvVEBDACix-i2pAett8MW0A%40mail.gmail.com.
Right but I thought the signature would need to include an AshNotificationView (so I'd expect the closures in this method (and the other 5 in this class to match)).. Either way this code is funky..The "if (parent)" would never execute because it owns collapsed_summary_view_ and main_view_ which would fail the dangling raw_ptr checks, I think those should actually be removed as args on the callbacks and just access them directly when "if (parent)" is true.dave.
On Wed, May 22, 2024 at 6:15 PM Ian Barkley-Yeung <[email protected]> wrote:
> The call stack shows AnimateGroupedChildExpandedCollapse:$_0 on the stack, and those pass raw_ptr... so you'll get the dangling ptr checks.I'm not sure I believe it though. go/crash marks those frames as having been code-folded (ICF). So any callback with a similar signature (a raw pointer and a weak pointer) will go through that code address. Is it possible that's the source of the problem? Yes. Is it certain? No. And I've had other bugs from previous stability gardening rotations where I believed those code-folded signatures and assigned the crash to the wrong team -- it wasted a week.
> For example, if we somehow knew the concrete type of the object, would that help? Could we base::debug::Alias the remains of the object on the stack? I'm guessing even the vtable would be gone by the (currently irreversible) zapping though...
My understanding is that vtable pointers point to read-only data sections in the ELF? So it might be able to understand a vtable pointer.I don't think BindState is copyable, but yes, one could memcpy the whole thing into a local char array of the correct size in one of the Invoker functions. I think it would make more sense to only copy the BindState::functor_ which gives you address the callback wanted to call. But that also does impose some cost on each call to Once/RepeatingCallback::Run. (And it's even more difficult to interpret.) Still, it should also work.
On Wednesday, May 22, 2024 at 2:39:51 PM UTC-7 Dave Tapuska wrote:
Have you looked at:The call stack shows AnimateGroupedChildExpandedCollapse:$_0 on the stack, and those pass raw_ptr... so you'll get the dangling ptr checks.dave.On Wed, May 22, 2024 at 5:13 PM Daniel Cheng <[email protected]> wrote:
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/scheduler-dev/CAF3XrKoPvOytC%3Dor4pDxTTNmBASNvVEBDACix-i2pAett8MW0A%40mail.gmail.com.
--
--
Chromium Developers mailing list: [email protected]
View archives, change email options, or unsubscribe:
https://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/chromium-dev/CAHgVhZWf3Xp1QGRXp8yOEqh7oq9OEXJc64NJb7aQ6bv3Y54GsQ%40mail.gmail.com.