Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spike] NullPointerException being thrown while trying to link an async token #2002

Closed
dhilpipre opened this issue Aug 8, 2024 · 12 comments
Closed
Assignees
Labels
8 Story Point Estimate jan-mar qtr Represents proposed work item for the Jan-Mar quarter

Comments

@dhilpipre
Copy link
Contributor

In certain cases where the link method is being called on the Token class, a NullPointerException is being thrown.

It appears that the problem arises when the code reaches the com.newrelic.agent.Transaction.linkTxOnThread(TokenImpl) method. In this code there is this call: oldTx.checkFinishTransaction(oldTxa, oldTxaId);

In checkFinishTransaction, it turns out that this line is called
transactionTime.markTxaFinishTime(txa.getRootTracer());

The transactionTime field is null and this results in an uncaught exception to the application.

At the least, there should be a null check here.

It would also be nice to find out why the transactionTime has not been initialized when it reaches this point.

I am not able to reproduce this error as it was pointed out by a customer using the Labs Ktor instrumentation. Note that Ktor and the underlying Kotlin Coroutines are Kotlin language based if that makes a difference.

Here is a stacktrace of the error
2024-08-08 09:41:59.153 [OperatorApi] [eventLoopGroupProxy-4-6 @call-handler#670] ERROR ktor.application - Unhandled exception caught for CoroutineName(call-handler)
kotlinx.coroutines.CoroutinesInternalError: Fatal exception in coroutines machinery for DispatchedContinuation[NettyDispatcher@55ab96eb, Continuation at io.sparetech.api.TypedApiKt$registerRoutes$2$1$1$1$1.invokeSuspend(TypedApi.kt:175)@468f50cd]. Please read KDoc to 'handleFatalException' method and report this incident to maintainers
at kotlinx.coroutines.DispatchedTask.handleFatalException$kotlinx_coroutines_core(DispatchedTask.kt:142)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:113)
at kotlinx.coroutines.EventLoop.processUnconfinedEvent(EventLoop.common.kt:65)
at kotlinx.coroutines.internal.DispatchedContinuation.resumeWith(DispatchedContinuation.kt:343)
at io.ktor.util.pipeline.SuspendFunctionGun.resumeRootWith(SuspendFunctionGun.kt:146)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:120)
at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(SuspendFunctionGun.kt:11)
at io.ktor.util.pipeline.SuspendFunctionGun$continuation$1.resumeWith(SuspendFunctionGun.kt:70)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.ktor.server.netty.EventLoopGroupProxy$Companion.create$lambda$1$lambda$0(NettyApplicationEngine.kt:296)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.NullPointerException: Cannot invoke "com.newrelic.agent.transaction.TransactionTimer.markTxaFinishTime(com.newrelic.agent.tracers.Tracer)" because "this.transactionTime" is null
at com.newrelic.agent.Transaction.checkFinishTransaction(Transaction.java:2451)
at com.newrelic.agent.Transaction.linkTxOnThread(Transaction.java:1497)
at com.newrelic.agent.TokenImpl.link(TokenImpl.java:91)
at com.newrelic.instrumentation.kotlin.coroutines_17.NRContinuationWrapper.resumeWith(NRContinuationWrapper.java:44)
at kotlinx.coroutines.UndispatchedCoroutine.afterResume(CoroutineContext.kt:266)
at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:99)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
at com.newrelic.instrumentation.kotlin.coroutines_17.NRContinuationWrapper.resumeWith(NRContinuationWrapper.java:47)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104)
... 17 common frames omitted
Caused by: java.lang.NullPointerException: Cannot invoke "com.newrelic.agent.transaction.TransactionTimer.markTxaFinishTime(com.newrelic.agent.tracers.Tracer)" because "this.transactionTime" is null

@dhilpipre dhilpipre added the bug Something isn't working as designed/intended label Aug 8, 2024
@workato-integration
Copy link

@jbedell-newrelic
Copy link
Contributor

@dhilpipre This is very curious. The transactionTime should never be null there. Is there any chance you can get us a repro app?

@dhilpipre
Copy link
Contributor Author

@jbedell-newrelic I do not and haven't been able to reproduce it myself. Let me reach out to the user to see if we can find some way to reproduce it.

@jbedell-newrelic jbedell-newrelic self-assigned this Aug 13, 2024
@dhilpipre
Copy link
Contributor Author

@jbedell-newrelic The customer did not have code that they were willing to share but they did provide some errors that were in the agent logs related to inconsistent tracer where the tracer finishing was not the same as the last tracer. That might have lead to the scenario causing this problem. I think it may come from the Kotlin Coroutine instrumentation where I am using a TracerFactory in order to track a certain type of method in Kotlin. I have also think I know how it might happen and have sent an update to the customer to test. If it gets rid of the problem then I could just close this issue.

@jbedell-newrelic
Copy link
Contributor

@dhilpipre Ah. Looks like this may be a duplicate of #1076 then?

@dhilpipre
Copy link
Contributor Author

@jbedell-newrelic
Yes, the two do seem to be very similar as there are a bunch of inconsistent tracer messages that show up prior to the NullPointerException

2024-07-02T11:48:08,622+0200 [482499 1] com.newrelic INFO: JRE vendor Amazon.com Inc. version 17.0.8.1
2024-07-02T11:48:08,622+0200 [482499 1] com.newrelic INFO: JVM vendor Amazon.com Inc. OpenJDK 64-Bit Server VM version 17.0.8.1+8-LTS
2024-07-02T11:48:08,622+0200 [482499 1] com.newrelic INFO: OS Linux version 6.6.32-1-MANJARO arch amd64
2024-07-02T11:48:08,622+0200 [482499 1] com.newrelic INFO: New Relic Agent v8.12.0 is initializing...

2024-07-02T11:50:01,146+0200 [482499 89] com.newrelic ERROR: Tracer Debug: Inconsistent state! tracer (actual tracer popped off stack) != lastTracer (pointer to top of stack) for com.newrelic.agent.TransactionActivity@7fffffff (com.newrelic.agent.tracers.OtherRootTracer@d75d02a != com.newrelic.agent.tracers.DefaultTracer@6888c6bb)
2024-07-02T11:50:01,154+0200 [482499 89] com.newrelic ERROR: Tracer Debug: An error occurred calling Transaction.tracerFinished() for class com.newrelic.instrumentation.kotlin.coroutines_17.NRContinuationWrapper : java.lang.NullPointerException: Cannot invoke "com.newrelic.agent.Transaction.activityFailedOrIgnored(com.newrelic.agent.TransactionActivity, int)" because "this.transaction" is null : this Tracer = com.newrelic.agent.tracers.OtherRootTracer@d75d02a
2024-07-02T11:51:01,293+0200 [482499 88] com.newrelic ERROR: Tracer Debug: Inconsistent state! tracer (actual tracer popped off stack) != lastTracer (pointer to top of stack) for com.newrelic.agent.TransactionActivity@7fffffff (com.newrelic.agent.tracers.OtherRootTracer@5481ebf0 != com.newrelic.agent.tracers.DefaultTracer@20fb9e6a)
2024-07-02T11:51:01,293+0200 [482499 60] com.newrelic ERROR: Tracer Debug: Inconsistent state! tracer (actual tracer popped off stack) != lastTracer (pointer to top of stack) for com.newrelic.agent.TransactionActivity@7fffffff (com.newrelic.agent.tracers.OtherRootTracer@3e6295a9 != com.newrelic.agent.tracers.DefaultTracer@c997983)
2024-07-02T11:51:01,294+0200 [482499 60] com.newrelic ERROR: Tracer Debug: An error occurred calling Transaction.tracerFinished() for class com.newrelic.instrumentation.kotlin.coroutines_17.NRContinuationWrapper : java.lang.NullPointerException: Cannot invoke "com.newrelic.agent.Transaction.activityFailedOrIgnored(com.newrelic.agent.TransactionActivity, int)" because "this.transaction" is null : this Tracer = com.newrelic.agent.tracers.OtherRootTracer@3e6295a9
2024-07-02T11:51:01,294+0200 [482499 88] com.newrelic ERROR: Tracer Debug: An error occurred calling Transaction.tracerFinished() for class kotlinx.coroutines.internal.LimitedDispatcher$Worker : java.lang.NullPointerException: Cannot invoke "com.newrelic.agent.Transaction.activityFailedOrIgnored(com.newrelic.agent.TransactionActivity, int)" because "this.transaction" is null : this Tracer = com.newrelic.agent.tracers.OtherRootTracer@5481ebf0
2024-07-02T11:51:01,294+0200 [482499 60] com.newrelic ERROR: Tracer Debug: Inconsistent state! tracer (actual tracer popped off stack) != lastTracer (pointer to top of stack) for com.newrelic.agent.TransactionActivity@7fffffff (com.newrelic.agent.tracers.OtherRootTracer@63214695 != com.newrelic.agent.tracers.DefaultTracer@394134b2)

@dhilpipre
Copy link
Contributor Author

One other thing that we have found in the logs are stackgtraces like this one where there is a TransactionActivity is gone error

2024-08-14T12:24:42,536+0200 [175109 147] com.newrelic.agent.instrumentation.ClassTransformerServiceImpl FINEST: createTracer(Continuation at io.ktor.client.engine.HttpClientEngine$DefaultImpls.executeWithinCallContext(HttpClientEngine.kt:100), 315, null)
java.lang.IllegalStateException: TransactionActivity is gone
at com.newrelic.agent.Transaction.getTransactionActivity(Transaction.java:924) ~[newrelic.jar:8.13.0]
at com.newrelic.agent.instrumentation.InstrumentationImpl.createTracer(InstrumentationImpl.java:120) [newrelic.jar:8.13.0]
at io.ktor.client.engine.HttpClientEngine$executeWithinCallContext$1.invokeSuspend(HttpClientEngine.kt) [ktor-client-core-jvm-2.3.11.jar:2.3.11]
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33) [kotlin-stdlib-2.0.10.jar:2.0.10-release-540]
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104) [kotlinx-coroutines-core-jvm-1.8.1.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:584) [kotlinx-coroutines-core-jvm-1.8.1.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:811) [kotlinx-coroutines-core-jvm-1.8.1.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:715) [kotlinx-coroutines-core-jvm-1.8.1.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:702) [kotlinx-coroutines-core-jvm-1.8.1.jar:?]

@jbedell-newrelic jbedell-newrelic removed their assignment Aug 29, 2024
@deleonenriqueta deleonenriqueta self-assigned this Sep 3, 2024
@dhilpipre
Copy link
Contributor Author

One thing that I found in the log that the customer sent me was an UnsupportedOperationException. This happens when the agent is attempting to create a tracer

2024-08-29T15:43:57,594+0200 [68739 101] com.newrelic.agent.instrumentation.ClassTransformerServiceImpl FINEST: createTracer(Continuation at io.ktor.server.netty.cio.NettyHttpResponsePipeline$respondWithBigBody$2.invokeSuspend(NettyHttpResponsePipeline.kt:328), 246, null)
java.lang.UnsupportedOperationException: null
at java.util.AbstractList.add(AbstractList.java:153) ~[?:?]
at java.util.AbstractList.add(AbstractList.java:111) ~[?:?]
at com.newrelic.agent.TransactionActivity.addTracer(TransactionActivity.java:491) ~[newrelic.jar:8.14.0]
at com.newrelic.agent.TransactionActivity.addTracerToStack(TransactionActivity.java:344) ~[newrelic.jar:8.14.0]
at com.newrelic.agent.TransactionActivity.tracerStarted(TransactionActivity.java:312) ~[newrelic.jar:8.14.0]
at com.newrelic.agent.TransactionStateImpl.tracerStarted(TransactionStateImpl.java:107) ~[newrelic.jar:8.14.0]
at com.newrelic.agent.TransactionStateImpl.getTracer(TransactionStateImpl.java:39) ~[newrelic.jar:8.14.0]
at com.newrelic.agent.TransactionStateImpl.getTracer(TransactionStateImpl.java:55) ~[newrelic.jar:8.14.0]
at com.newrelic.agent.instrumentation.InstrumentationImpl.createTracer(InstrumentationImpl.java:126) ~[newrelic.jar:8.14.0]
at io.ktor.server.netty.cio.NettyHttpResponsePipeline$respondWithBigBody$2.invokeSuspend(NettyHttpResponsePipeline.kt) ~[ktor-server-netty-jvm-2.3.11.jar:2.3.11]
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33) ~[kotlin-stdlib-2.0.20.jar:2.0.20-release-360]
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104) ~[kotlinx-coroutines-core-jvm-1.8.1.jar:?]
at com.newrelic.instrumentation.kotlin.coroutines_17.NRRunnable.run(NRRunnable.java:49) ~[newrelic.jar:?]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) ~[netty-common-4.1.112.Final.jar:4.1.112.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) ~[netty-common-4.1.112.Final.jar:4.1.112.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) ~[netty-common-4.1.112.Final.jar:4.1.112.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) ~[netty-transport-4.1.112.Final.jar:4.1.112.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:994) ~[netty-common-4.1.112.Final.jar:4.1.112.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.112.Final.jar:4.1.112.Final]
at io.ktor.server.netty.EventLoopGroupProxy$Companion.create$lambda$1$lambda$0(NettyApplicationEngine.kt:296) ~[ktor-server-netty-jvm-2.3.11.jar:2.3.11]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.112.Final.jar:4.1.112.Final]
at java.lang.Thread.run(Thread.java:840) [?:?]

The only way that I can see that happening is that the TransactionActivity startAsyncTracerLimitCleanup() method, it is setting the tracers field to Collections.emptyList(). The list returned by emptyList would throw an UnsupportedOperationException.

Not sure what the flow is that causes this but it seems to be some weird combination of Kotlin Coroutines and Ktor because I have plenty of customers using the Coroutines instrumentation without this problem. BTW Ktor is built on top of Coroutines.

@kford-newrelic
Copy link
Contributor

Unable to reproduce. Will re-engage on this later in the quarter to try and reproduce. Kotlin/Ktor related.

@kford-newrelic kford-newrelic removed the bug Something isn't working as designed/intended label Oct 7, 2024
@kford-newrelic kford-newrelic moved this from In Sprint to In Quarter in Java Engineering Board Oct 7, 2024
@kanderson250 kanderson250 self-assigned this Nov 18, 2024
@deleonenriqueta deleonenriqueta removed their assignment Nov 18, 2024
@kanderson250
Copy link
Contributor

Thread with ASM https://gitlab.ow2.org/asm/asm/-/issues/318024

@kanderson250
Copy link
Contributor

@kanderson250 kanderson250 changed the title NullPointerException being thrown while trying to link an async token [Spike] NullPointerException being thrown while trying to link an async token Jan 13, 2025
@kanderson250
Copy link
Contributor

Closing as a Spike, to be completed with #2201

@github-project-automation github-project-automation bot moved this from In Sprint to Code Complete/Done in Java Engineering Board Jan 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
8 Story Point Estimate jan-mar qtr Represents proposed work item for the Jan-Mar quarter
Projects
Status: Code Complete/Done
Development

No branches or pull requests

6 participants