feat: search current working directory for config file #1464

Open
wants to merge 7 commits into
base: main

IndexSeek
Contributor

Resolves #1333

Adds the current working directory to the search path for the .pyiceberg.yaml file.

With this change, the file is searched for in the following order:

  1. the PYICEBERG_HOME environment variable
  2. ~/
  3. ./

I'm unsure if people would like to have 2 and 3 swapped. In either case, users can still override this with the environment variable.
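
The lookup order above can be sketched as follows. This is a minimal illustration, assuming the first directory that contains a `.pyiceberg.yaml` wins; `candidate_dirs` and `find_config_file` are hypothetical names for this sketch, not PyIceberg functions.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Optional

CONFIG_FILE = ".pyiceberg.yaml"


def candidate_dirs() -> list[Path]:
    """Return the directories to search, highest priority first."""
    dirs: list[Path] = []
    pyiceberg_home = os.environ.get("PYICEBERG_HOME")
    if pyiceberg_home:
        dirs.append(Path(pyiceberg_home))  # 1. PYICEBERG_HOME, if set
    dirs.append(Path.home())  # 2. ~/
    dirs.append(Path.cwd())  # 3. ./
    return dirs


def find_config_file(dirs: Optional[list[Path]] = None) -> Optional[Path]:
    """Return the first existing config file, or None if no directory has one."""
    for directory in candidate_dirs() if dirs is None else dirs:
        candidate = directory / CONFIG_FILE
        if candidate.is_file():
            return candidate
    return None
```

Swapping 2 and 3 would only mean reordering the two `dirs.append` calls; the environment variable would still take precedence either way.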

Contributor

@kevinjqliu left a comment

Thanks for the PR! Added a few nit comments

pyiceberg/utils/config.py
Contributor

There are a couple of other places where `.pyiceberg.yaml` is referenced in the docs:
https://grep.app/search?q=pyiceberg.yaml&filter[repo][0]=apache/iceberg-python&filter[path][0]=mkdocs/docs/

Contributor Author

Nice finds! That grep tool is pretty neat. I just made some corrections to these in 3ebbb8c.


Contributor Author

I tried adding a test here, but I wonder if there are opportunities to clean it up.

Contributor

@kevinjqliu left a comment

Thanks for the PR! I added some nit comments on testing

tests/utils/test_config.py (outdated)
tests/utils/test_config.py
tests/utils/test_config.py (outdated)
@IndexSeek
Contributor Author

There's another test that looks to do something similar to this newer test. Are both needed?

def test_from_configuration_files(tmp_path_factory: pytest.TempPathFactory) -> None:
    config_path = str(tmp_path_factory.mktemp("config"))
    with open(f"{config_path}/.pyiceberg.yaml", "w", encoding=UTF8) as file:
        yaml_str = as_document({"catalog": {"production": {"uri": "https://service.io/api"}}}).as_yaml()
        file.write(yaml_str)
    os.environ["PYICEBERG_HOME"] = config_path
    assert Config().get_catalog_config("production") == {"uri": "https://service.io/api"}

mkdocs/docs/cli.md (outdated)
Contributor

@Fokko left a comment

I've left one comment, but apart from that, it looks good to me 👍

Co-authored-by: Fokko Driesprong <fokko@apache.org>
@IndexSeek
Contributor Author

> I've left one comment, but apart from that, it looks good to me 👍

Thank you! I applied that change! ✅

Contributor

@kevinjqliu left a comment

LGTM! I have a few nit comments about testing and about adding a warning for storing the config file in the current directory.

@@ -28,7 +28,7 @@ hide:

There are three ways to pass in configuration:

-- Using the `~/.pyiceberg.yaml` configuration file
+- Using the `.pyiceberg.yaml` configuration file stored in either the directory specified by the `PYICEBERG_HOME` environment variable, the home directory, or current working directory.
Contributor

nit: move the extra info about where the file is located down to L37.
Also, I think it's valuable to include a warning about accidentally checking in secrets with git when using the current working directory.

@@ -49,7 +49,7 @@ catalog:

and loaded in python by calling `load_catalog(name="hive")` and `load_catalog(name="rest")`.

-This information must be placed inside a file called `.pyiceberg.yaml` located either in the `$HOME` or `%USERPROFILE%` directory (depending on whether the operating system is Unix-based or Windows-based, respectively) or in the `$PYICEBERG_HOME` directory (if the corresponding environment variable is set).
+This information must be placed inside a file called `.pyiceberg.yaml` located either in the `$HOME` or `%USERPROFILE%` directory (depending on whether the operating system is Unix-based or Windows-based, respectively), in the current working directory, or in the `$PYICEBERG_HOME` directory (if the corresponding environment variable is set).
Contributor

nit: include warning about accidentally checking in secrets with git when using the current working directory
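
To make the risk concrete, here is a rough sketch of a pre-commit style check for whether a `.pyiceberg.yaml` in a project directory is listed in that directory's `.gitignore`. The helper name `config_listed_in_gitignore` is hypothetical, and the check is deliberately naive: it only looks for an exact entry and does not evaluate glob patterns, negations, or ignore files in parent directories. Running `git check-ignore` on the file is the authoritative test.

```python
from pathlib import Path

CONFIG_NAME = ".pyiceberg.yaml"


def config_listed_in_gitignore(directory: Path) -> bool:
    """Return True if the directory's .gitignore has an exact entry for the config file."""
    gitignore = directory / ".gitignore"
    if not gitignore.is_file():
        return False
    # Naive: exact-line match only; real gitignore matching is pattern based.
    entries = {line.strip() for line in gitignore.read_text(encoding="utf-8").splitlines()}
    return CONFIG_NAME in entries or f"/{CONFIG_NAME}" in entries
```

If this returns `False` for a repository that keeps credentials in `./.pyiceberg.yaml`, adding the file name to `.gitignore` is the usual fix.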

    "config_setup, expected_result",
    [
        # Validate lookup works with: config > home > cwd
        (
Contributor

nit: for test readability, use "PYICEBERG_HOME", "HOME", and "CURRENT" as the setup values,
and replace "both" with a list ["HOME", "CURRENT"]

Contributor

Looks like the parametrized test is testing:

  1. PYICEBERG_HOME
  2. HOME
  3. CURRENT
  4. None
  5. "both" / ["HOME", "CURRENT"]

I'd add a test case with all 3 set.
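
The full case table, including the suggested all-three case, might look like the sketch below. It is a stand-in written against temporary directories only: `resolve`, `run_cases`, and the labels are hypothetical, and the expected winners assume the order PYICEBERG_HOME > HOME > CURRENT described in this PR. In the actual suite the tuples would presumably feed `pytest.mark.parametrize` and the assertion would go through `Config()`.

```python
from __future__ import annotations

import tempfile
from pathlib import Path
from typing import Optional

CONFIG_FILE = ".pyiceberg.yaml"
SEARCH_ORDER = ["PYICEBERG_HOME", "HOME", "CURRENT"]  # highest priority first

# (locations that contain a config file, location expected to win)
CASES: list[tuple[list[str], Optional[str]]] = [
    (["PYICEBERG_HOME"], "PYICEBERG_HOME"),
    (["HOME"], "HOME"),
    (["CURRENT"], "CURRENT"),
    ([], None),
    (["HOME", "CURRENT"], "HOME"),
    (["PYICEBERG_HOME", "HOME", "CURRENT"], "PYICEBERG_HOME"),
]


def resolve(root: Path, populated: list[str]) -> Optional[str]:
    """Create one directory per location, populate some, and return the winning label."""
    for label in SEARCH_ORDER:
        directory = root / label
        directory.mkdir()
        if label in populated:
            (directory / CONFIG_FILE).write_text(f"source: {label}\n", encoding="utf-8")
    # First location in search order that has a config file wins.
    for label in SEARCH_ORDER:
        if (root / label / CONFIG_FILE).is_file():
            return label
    return None


def run_cases() -> list[bool]:
    """Run every case in a fresh temporary directory and report pass/fail per case."""
    results = []
    for populated, expected in CASES:
        with tempfile.TemporaryDirectory() as tmp:
            results.append(resolve(Path(tmp), populated) == expected)
    return results
```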

@kevinjqliu added this to the PyIceberg 0.9.0 release milestone on Jan 8, 2025

Successfully merging this pull request may close these issues.

.pyiceberg.yaml config files should be loaded from current dir instead of home folder
3 participants