Lighthouse under-subscribes attestation subnets for aggregation #6732

Open
sauliusgrigaitis opened this issue Dec 19, 2024 · 3 comments

Comments

@sauliusgrigaitis

Description

Lighthouse under-subscribes to attestation subnets for aggregation. Initially, I compared Grandine's and Lighthouse's networking behavior and found that Lighthouse stops subscribing to attestation subnets for aggregation completely. This was fixed by #6682. However, even after this fix, Lighthouse subscribes to significantly fewer attestation subnets.

Version

The problem appears in version 6.0.1.

Present Behaviour

Version 6.0.1 still subscribes to significantly fewer attestation subnets than 5.3. A validator is not directly penalized for this, but it leads to worse overall quality of aggregates in the network.

Expected Behaviour

It should subscribe to the correct list of attestation subnets.
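For reference, here is a minimal sketch of how the consensus specification maps a committee to an attestation subnet. It mirrors the spec's `compute_subnet_for_attestation` helper with mainnet constants; it is an illustration of the spec rule, not Lighthouse's actual code.

```python
# Mainnet constants from the Ethereum consensus specification.
SLOTS_PER_EPOCH = 32
ATTESTATION_SUBNET_COUNT = 64


def compute_subnet_for_attestation(committees_per_slot: int, slot: int, committee_index: int) -> int:
    """Return the subnet on which a committee's attestations are published."""
    slots_since_epoch_start = slot % SLOTS_PER_EPOCH
    committees_since_epoch_start = committees_per_slot * slots_since_epoch_start
    return (committees_since_epoch_start + committee_index) % ATTESTATION_SUBNET_COUNT
```

An aggregator for a given committee and slot needs to be subscribed to the subnet this function returns for that slot. With 64 committees per slot, the subnet is simply the committee index.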

Steps to resolve

The easiest way to debug and fix this would be to print the subscription events in both 5.3 and 6.0.1 Lighthouse BN nodes behind Vouch with a large number of validators (I used 30K), and then compare the behavior.

The image shows 6.0.1 in the first half and the reverted 5.3 in the second half:

Screenshot from 2024-12-19 11-06-36
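A rough sketch of how the comparison suggested above could be automated. It assumes each node prints one line per subscription event containing `slot=<n> subnet=<n>`; that format, and the helper names `subscriptions` and `missing`, are hypothetical and are not what Lighthouse logs today.

```python
import re
from collections import Counter

# Hypothetical log line format, e.g. "... slot=123 subnet=17 ..."
EVENT = re.compile(r"slot=(\d+)\s+subnet=(\d+)")


def subscriptions(lines):
    """Count subscription events per (slot, subnet) pair."""
    counts = Counter()
    for line in lines:
        match = EVENT.search(line)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts


def missing(reference, candidate):
    """Return (slot, subnet) pairs seen in reference but never in candidate."""
    return sorted(set(reference) - set(candidate))


# Made-up sample lines, only to show the shape of the comparison.
old = subscriptions(["slot=10 subnet=3", "slot=10 subnet=7", "slot=11 subnet=3"])
new = subscriptions(["slot=10 subnet=3", "noise", "slot=11 subnet=3"])
print(missing(old, new))
```

Running both versions against the same validator set and the same slots, then listing the pairs that only one of them subscribes to, should narrow down which duties are being dropped.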

@michaelsproul
Member

cc @AgeManning

@AgeManning
Member

> found that Lighthouse stops subscribing to attestation subnets for aggregation completely

This never happened. Lighthouse has always subscribed to subnets for aggregation. There was a drop that we noticed. The fix you mentioned corrected the case where we were aggregators for the same subnet multiple times in one epoch: 6.0.0 would subscribe just once, rather than multiple times.

We have revised the code a few times. The newer version is substantially simpler. When we tested, we did some quick math with high validator counts and the theoretical number of aggregate subscriptions checked out. It was hard for us to tell then which was the correct logic, the 5.3 potentially over-subscribing or 6.0.1 potentially under-subscribing.
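As an illustration of the kind of quick math mentioned above, here is a back-of-the-envelope sketch based on the consensus spec's aggregator selection rule (a validator aggregates with probability `1 / max(1, committee_size // TARGET_AGGREGATORS_PER_COMMITTEE)`). The total active validator count, the even spread of local validators across committees, and the function name `expected_aggregation` are assumptions made for the example; this is not the calculation the Lighthouse team ran.

```python
# Mainnet constants from the Ethereum consensus specification.
SLOTS_PER_EPOCH = 32
MAX_COMMITTEES_PER_SLOT = 64
TARGET_COMMITTEE_SIZE = 128
TARGET_AGGREGATORS_PER_COMMITTEE = 16


def committees_per_slot(active_validators: int) -> int:
    """Mirror of the spec's get_committee_count_per_slot."""
    return max(1, min(MAX_COMMITTEES_PER_SLOT,
                      active_validators // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE))


def expected_aggregation(active_validators: int, local_validators: int):
    """Estimate per-epoch aggregation duties and distinct committees to cover."""
    committees = SLOTS_PER_EPOCH * committees_per_slot(active_validators)
    committee_size = active_validators // committees  # approximation: ignores the remainder
    modulo = max(1, committee_size // TARGET_AGGREGATORS_PER_COMMITTEE)
    p_aggregator = 1 / modulo
    # Each validator attests once per epoch, so duties scale linearly.
    duties = local_validators * p_aggregator
    # Assumption: local validators are spread evenly over all committees.
    local_per_committee = local_validators / committees
    p_committee_needed = 1 - (1 - p_aggregator) ** local_per_committee
    distinct = committees * p_committee_needed
    return duties, distinct


# Assumed figures: about 1,000,000 active validators, 30,000 of them local.
duties, distinct = expected_aggregation(1_000_000, 30_000)
print(f"expected aggregator duties per epoch: {duties:.0f}")
print(f"expected distinct (slot, committee) pairs per epoch: {distinct:.0f}")
```

The second number is lower than the first because several local aggregators can land in the same committee and then share one subnet subscription for that slot. Comparing observed subscription counts against both estimates may help decide whether 5.3 over-subscribes or 6.0.1 under-subscribes.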

There might still be something here and it would be good to know which is correct, 5.3 or 6.0.1. I'll loop in @ackintosh and @jxs who might also be able to look into this.

@sauliusgrigaitis
Author

> This never happened.

Interesting, the reason why I requested Povi to check it was that I found there was no attestation traffic except for the 2 persistent subnets. Anyway, it's fixed now, so it doesn't matter.

> We have revised the code a few times. The newer version is substantially simpler. When we tested, we did some quick math with high validator counts and the theoretical number of aggregate subscriptions checked out. It was hard for us to tell then which was the correct logic, the 5.3 potentially over-subscribing or 6.0.1 potentially under-subscribing.

Simpler code is not proof that it works correctly; this has already been shown :) I checked Lighthouse 6.0.1 against Prysm 5.2.0 and Grandine 1.0.0.rc2, and Lighthouse underperforms significantly compared to the other clients in the Aggregate attestation strategy.
