Fix lrclib lyrics #5406

Open
wants to merge 7 commits into
base: master

snejus
Member

@snejus snejus commented Sep 4, 2024

Fixes #5102

LRCLib lyrics backend fixes

Bug Fixes

  • Fixed fetching lyrics from the lrclib source. If lyrics for a specific album, artist, and title combination are not found, the plugin now searches by artist and title and picks the most relevant result, scoring candidates by:
    1. Duration similarity to the target item
    2. Availability of synced lyrics
  • Updated the default sources configuration to prioritize lrclib over other sources for faster and more reliable results.
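
A minimal sketch of the ranking described above, for illustration only. The `Candidate` class, the `pick_best` function, and the 10-second `tolerance` are assumptions made for this example; they are not the `LRCLyrics` or `LRCLibItem` classes from this PR, and the actual scoring may weigh the two criteria differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """One search result. Field names are illustrative, not the plugin's."""

    duration: float        # track length in seconds
    synced: Optional[str]  # timestamped lyrics, if available
    plain: Optional[str]   # plain-text lyrics, if available


def pick_best(candidates, target_duration, tolerance=10.0):
    """Return the most relevant candidate, or None when there are none.

    Candidates within `tolerance` seconds of the target duration rank first.
    Among those, candidates with synced lyrics rank ahead of plain-only ones,
    and remaining ties go to the smallest duration difference.
    """

    def rank(candidate):
        diff = abs(candidate.duration - target_duration)
        # Tuples sort element by element, and False sorts before True.
        return (diff > tolerance, candidate.synced is None, diff)

    return min(candidates, key=rank, default=None)
```

Grouping by a tolerance first keeps a synced result that is a couple of seconds off ahead of a plain result that matches slightly better, while still rejecting synced lyrics for a clearly different recording.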

Code Improvements

  • Added type annotations to fetch method in all backends.
  • Introduced LRCLyrics and LRCLibItem classes to encapsulate lyrics data and improve code structure.
  • Enhanced error handling and logging in the LRCLib backend. These improvements will be added to the rest of the backends in a separate PR.

Tests

  • Added new tests to cover the updated functionality and error handling scenarios.

To Do

  • Documentation. (If you've added a new command-line flag, for example, find the appropriate page under docs/ to describe it.)
  • Changelog. (Add an entry to docs/changelog.rst to the bottom of one of the lists near the top of the document.)
  • Tests. (Very much encouraged but not strictly required.)

@snejus snejus self-assigned this Sep 4, 2024
@snejus snejus linked an issue Sep 4, 2024 that may be closed by this pull request
@snejus snejus requested a review from bal-e September 4, 2024 04:27
@snejus snejus force-pushed the fix-lrclib-lyrics branch 2 times, most recently from d4bed72 to 829192d Compare September 4, 2024 04:40
@snejus snejus requested a review from Serene-Arc September 4, 2024 04:47
@snejus snejus force-pushed the fix-lrclib-lyrics branch 5 times, most recently from cb8929f to c2807f0 Compare September 4, 2024 10:26
@snejus
Member Author

snejus commented Sep 4, 2024

The build on win32 is failing to install reflink because it's only supported until Python 3.7.

I will address this in a separate PR and rebase this one accordingly once the fix is merged.

Note: this issue popped up now because I added a new requests-mock dependency which invalidated cached dependencies.

Member

@bal-e bal-e left a comment

This is a much better implementation now, well done. I especially like that you've managed to remove the test resources -- I wasn't expecting 2000 lines of lyrics in there.

snejus added a commit that referenced this pull request Sep 8, 2024
…5407)

See my comment under #5406 for context

> The build on win32 is failing to install reflink because it's [only
supported until Python
3.7](https://gitlab.com/rubdos/pyreflink/-/blob/master/setup.py?ref_type=heads).
>
> I will address this in a separate PR and rebase this one accordingly
once the fix is merged.
>
> Note: this issue popped up now because I added a new requests-mock
dependency which invalidated cached dependencies.
@snejus snejus requested a review from bal-e September 11, 2024 09:28
@snejus snejus force-pushed the fix-lrclib-lyrics branch 3 times, most recently from dc02c94 to 622ed3c Compare September 11, 2024 11:21
@snejus snejus requested a review from JOJ0 September 11, 2024 11:40
@tranxuanthang

Hey, LRCLIB author here 👋. It would be great if you could make the /search API a fallback for the /get API (instead of replacing it entirely) when no result is found. The /get API is more performant on LRCLIB's side, whereas the /search API is much slower.

Also, the /get API has been updated recently, and the album_name parameter is no longer a hard requirement.

If you have any ideas for further improvements to LRCLIB's API, feel free to let me know!
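
The fallback order requested here could look roughly like the sketch below. It is an assumption-laden illustration, not the plugin's code: `fetch_lyrics_data` and the injected `fetch_json` callable are hypothetical, and the endpoint paths and parameter names follow a reading of https://lrclib.net/docs and should be checked against it.

```python
BASE_URL = "https://lrclib.net/api"  # assumed API root, see the LRCLIB docs


def fetch_lyrics_data(fetch_json, artist, title, album=None, duration=None):
    """Try the cheap /get endpoint first and fall back to /search.

    `fetch_json(url, params)` is a hypothetical helper that returns parsed
    JSON, or None when the server reports that nothing was found.
    Returns a list of candidate records (possibly empty).
    """
    params = {"artist_name": artist, "track_name": title}
    if album:
        # The album is optional, as noted in the comment above.
        params["album_name"] = album
    if duration:
        params["duration"] = round(duration)

    exact = fetch_json(f"{BASE_URL}/get", params)
    if exact:
        return [exact]

    # Only reach for the slower /search endpoint when /get found nothing.
    results = fetch_json(
        f"{BASE_URL}/search", {"artist_name": artist, "track_name": title}
    )
    return list(results or [])
```

Passing the HTTP helper in as an argument keeps the sketch free of network access, so the fallback order can be exercised with a stub.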

@snejus
Member Author

snejus commented Sep 26, 2024

> Hey, LRCLIB author here 👋. It would be great if you could make the /search API a fallback for the /get API (instead of replacing it entirely) when no result is found. The /get API is more performant on LRCLIB's side, whereas the /search API is much slower.
>
> Also, the /get API has been updated recently, and the album_name parameter is no longer a hard requirement.
>
> If you have any ideas for further improvements to LRCLIB's API, feel free to let me know!

Hi @tranxuanthang, thanks for popping in! Absolutely, that's no problem at all. This should make most lyrics queries even speedier on our side.

Now that we're on this topic, I think it may be a good idea to also add caching: for example, if we're getting lyrics for two separate files

  1. Artist - Title (Some Remix)
  2. Artist - Title

Ideally we should only ask for Artist - Title lyrics once when Artist - Title (Some Remix) is not found.

@tranxuanthang thanks for a reliable and performant API!
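
The caching idea from this comment can be sketched as follows. Everything here is hypothetical: `query_backend` stands in for a real network request, and `base_title` uses a simple regular expression to drop a trailing parenthesised qualifier, which is only one possible way to derive the fallback title.

```python
import re
from functools import lru_cache

REQUESTS = []  # records every simulated network request


def query_backend(artist, title):
    """Stand-in for a real lyrics request; only knows one song."""
    REQUESTS.append((artist, title))
    known = {("Artist", "Title"): "la la la"}
    return known.get((artist, title))


@lru_cache(maxsize=None)
def cached_lyrics(artist, title):
    """Memoise results, including misses, per (artist, title) pair."""
    return query_backend(artist, title)


def base_title(title):
    """Drop a trailing parenthesised qualifier such as '(Some Remix)'."""
    return re.sub(r"\s*\([^)]*\)\s*$", "", title)


def lyrics_for(artist, title):
    """Try the full title first, then the title without its qualifier."""
    return cached_lyrics(artist, title) or cached_lyrics(artist, base_title(title))
```

With this in place, looking up "Title (Some Remix)" and then "Title" for the same artist issues two requests in total rather than three, because the second lookup of "Title" is served from the cache.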

@snejus
Member Author

snejus commented Dec 8, 2024

> The issue I'm thinking of is more for when there are multiple synced lyric providers. Imagine there's LRCLib and bilRCL, both can provide synced lyrics. As it is now, if I understand it correctly, if LRCLib is first in the sources list and only has plain lyrics for a song, the plugin will apply the plain lyrics, despite the fact that bilRCL might have synced lyrics.

I see! You're correct, the plugin now picks any kind of lyrics returned by the first source that returns something.

> What I'm suggesting is to have the approach you have made here for LRCLib and expand it for all backends.

This makes sense - I am myself after synced lyrics if possible, so if there was another source that provides them, I would indeed try to engineer a way to retrieve them.

> • If you want synced lyrics, and the first source that supports synced lyrics has them, use them (in all cases - if they match the duration and other parameters).
> • If the first source only has plain lyrics, check the next source that supports lyrics.
> • If no synced source has synced lyrics, use the plain lyrics from the first source.
> • Sources are checked in order of preference as specified by the order of the sources config field.

This looks like a good approach. This of course depends on knowing in advance which sources provide synced lyrics, so that we can try them first.

> • If no synced source has synced lyrics, use the plain lyrics from the first source.

I'd say we may even want to have a separate source priority list for non-synced lyrics: for example, I found that I prefer lyrics from LRCLib only if they are synced. Otherwise, I'd rather get them from Genius where they've been reviewed by the community.

Alternatively, if users want synced lyrics, we first query the backends that provide them (respecting their order in the sources configuration), and then query the rest if synced lyrics weren't found. This way, if I have the following configuration:

lyrics:
  sources: [genius, lrclib, tekstowo, bilrcl]
  synced: yes

The plugin tries LRCLib and bilRCL first and returns the first valid synced lyrics, if found. If not, it tries Genius, then (cached) plain lyrics from LRCLib, then Tekstowo, and finally bilRCL.
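
The ordering proposed above can be expressed as a small planning function. This is a sketch under assumptions: `query_plan` and the `SYNCED_CAPABLE` set are hypothetical, and bilRCL is the imagined second synced provider from this discussion rather than an existing backend.

```python
# Assumption: which backends are able to return synced lyrics.
SYNCED_CAPABLE = {"lrclib", "bilrcl"}


def query_plan(sources, want_synced):
    """Return (source, kind) pairs in the order they would be tried.

    When synced lyrics are wanted, synced-capable sources are asked for
    synced lyrics first, in configured order. After that, every source is
    tried for plain lyrics, again in configured order.
    """
    plan = []
    if want_synced:
        plan += [(name, "synced") for name in sources if name in SYNCED_CAPABLE]
    plan += [(name, "plain") for name in sources]
    return plan
```

For the configuration shown above, the plan starts with synced requests to lrclib and bilrcl, then continues with plain lyrics from genius, lrclib, tekstowo, and bilrcl. The second visit to lrclib is where a cached response would avoid a repeated request.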

@edgars-supe
Contributor

I think your suggestions make perfect sense! If you were to implement it like that, I wouldn't be mad at all. Though this probably warrants a bigger discussion with more maintainers, I would imagine. At any rate, I'll leave this idea with you, since you're working a lot on this plugin and you're way better suited for the task, if you choose to accept it, than I am.

@snejus snejus force-pushed the lyrics-refactor-tests branch 2 times, most recently from a4111b8 to 44b4b46 Compare January 12, 2025 20:36
@snejus snejus force-pushed the fix-lrclib-lyrics branch from b879e9c to 645c12e Compare January 12, 2025 22:39
@snejus snejus force-pushed the lyrics-refactor-tests branch from a4b45e9 to 2b80596 Compare January 12, 2025 23:09
@snejus snejus force-pushed the fix-lrclib-lyrics branch from 645c12e to 121ae5f Compare January 12, 2025 23:12
@snejus snejus force-pushed the lyrics-refactor-tests branch from 2b80596 to 02386cc Compare January 13, 2025 22:55
@snejus snejus force-pushed the fix-lrclib-lyrics branch from 121ae5f to 788a1d7 Compare January 13, 2025 22:55
@snejus snejus force-pushed the lyrics-refactor-tests branch from 02386cc to cc0ac8d Compare January 19, 2025 01:06
@snejus snejus force-pushed the fix-lrclib-lyrics branch 2 times, most recently from 016ddfa to 41b49cd Compare January 19, 2025 01:16
@snejus snejus force-pushed the lyrics-refactor-tests branch from cc0ac8d to 78fd959 Compare January 19, 2025 01:38
@snejus snejus force-pushed the fix-lrclib-lyrics branch from 41b49cd to 4b802bd Compare January 19, 2025 01:38
@snejus snejus force-pushed the lyrics-refactor-tests branch 3 times, most recently from 652494d to e5c006d Compare January 19, 2025 01:55
@snejus snejus force-pushed the fix-lrclib-lyrics branch from 4b802bd to 6684596 Compare January 19, 2025 01:55
snejus added a commit that referenced this pull request Jan 19, 2025
## Description

Fixes #2635
Fixes #5133

I realised that #5406 has gotten too big, thus I'm splitting it into
several smaller PRs.

This PR refactors lyrics plugin tests and fixes an empty metadata issue
in the lyrics logic.

#### CI
- Added `--extras=lyrics` to the Poetry install command to include the
lyrics plugin dependencies.
- In the main task which measures coverage, set `LYRICS_UPDATED`
environment variable based on changes detected in the lyrics files.

#### Test setup
- Introduced `ConfigMixin` to centralize configuration setup for tests,
reducing redundancy. This can be used by tests based on `pytest`.

#### Lyrics logic
- Trimmed whitespace from `item.title`, `item.artist`, and
`item.artist_sort` in `search_pairs` function.
- Added checks to avoid searching for lyrics if either the artist or
title is missing.
- Improved `_scrape_strip_cruft` function to remove Google Ads tags and
unnecessary HTML tags.

#### Lyrics tests overhaul
- Migrated lyrics tests to use `pytest` for better isolation and
configuration management.
- Deleted redundant lyrics text files and some unused utils.
- Marked tests that should only run when lyrics source code is updated
(`LYRICS_UPDATED` is set from the CI) using the `on_lyrics_update`
marker.

#### Documentation and Dependencies
- Added `requests-mock` version `1.12.1` to `pyproject.toml` and
`poetry.lock` for mocking HTTP requests in tests.
- Updated `setup.cfg` to include a new marker `on_lyrics_update`.
Base automatically changed from lyrics-refactor-tests to master January 19, 2025 02:00
@snejus
Member Author

snejus commented Jan 19, 2025

@edgars-supe for now I will keep it as it is, but I noted this idea for the future when another source providing synchronised lyrics pops up!

Adjust the base URL to perform a '/search' instead of attempting to
'/get' specific lyrics where we're unlikely to find lyrics for the
specific combination of album, artist, track names and the duration (see
https://lrclib.net/docs).

Since we receive an array of matching lyrics candidates, rank them by
their duration similarity to the item's duration, and whether they
contain synced lyrics.
@snejus snejus force-pushed the fix-lrclib-lyrics branch from 6684596 to a9e069d Compare January 19, 2025 15:19
@snejus snejus force-pushed the fix-lrclib-lyrics branch from a9e069d to fcde4a6 Compare January 19, 2025 21:59
Development

Successfully merging this pull request may close these issues.

beets can't fetch lyrics from lrclib.net
4 participants