-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support config files #169
base: main
Are you sure you want to change the base?
Support config files #169
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution! I don't like multiple changes conflated in the same PR, so please submit all your unrelated changes as separate PRs as requested.
nbstripout/_utils.py
Outdated
|
||
# Traverse the file tree common to all files given as argument looking for | ||
# a configuration file | ||
config_path = os.path.commonpath([os.path.abspath(file) for file in args.files]) if args.files else os.getcwd() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use pathlib
wherever possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kynan Thanks for the quick review. I believe I addressed all your comments except this one. I'm not too familiar with the pathlib
module. Just did a quick search and couldn't find sth equivalent to os.path.commonpath()
. Could you be more specific here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, unfortunately there is no equivalent to commonpath
in pathlib
. I won't insist, so you can leave as-is :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a whole interesting thread on S/O on commonpath -
https://stackoverflow.com/a/43613742
Though not nessacarily helpful - they do end up still using commonpath, and just wrapping it with Path.
Please ensure your changes are compatible with Python 3.6 onwards. |
…n_file() in nbstripout/_utils.py
…rser.parse_args()
09716a9
to
e5cd304
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this approach is actually much better than what is implemented now. The approach is:
- Read config files and turn their defined keys and values into "pseudo" command line arguments and parse those with argparse
- Only then parse the actual command line arguments.
This way the order of precedence is correct and I think the implementation is actually much more elegant and readable!
nbstripout/_utils.py
Outdated
|
||
# Traverse the file tree common to all files given as argument looking for | ||
# a configuration file | ||
config_path = os.path.commonpath([os.path.abspath(file) for file in args.files]) if args.files else os.getcwd() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, unfortunately there is no equivalent to commonpath
in pathlib
. I won't insist, so you can leave as-is :)
if not tail: | ||
break | ||
|
||
if config: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't picked up on this in my first round review: The way you implemented this, config file settings take precedence over command line arguments, however it should be the other way around.
I realise this will be a bit harder to implement. You could turn the Namespace
returned by argparse
into a dict and merge that into the dict you get from reading the config files, where the argparse
dict overwrites.
What makes this even more annoying is that you don't know if a value has been set by the caller or is simply the default :( I may need to look around for an idiomatic way to implement this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about the delay. Have to prioritize something else atm. Hope to come back to this at some point but if anyone sees fit, feel free to take over.
@janosh are you still planning to pick this back up at some point? |
Not all tests working.
@arobrien I think the usual approach is just to make a new PR to upstream, i.e. to https://github.com/kynan/nbstripout, not my fork. If you continued from my branch, the earlier commits will still be associated with me. I'm not worried about credit though. More important to get the feature. 😄 Either way, happy to merge your PR to my fork. |
support config files contributions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is definitely going in the right direction :) Just a few small bits to fix.
@@ -351,7 +351,7 @@ def status(git_config, install_location=INSTALL_LOCATION_LOCAL, verbose=False): | |||
return 1 | |||
|
|||
|
|||
def main(): | |||
def setup_commandline() -> Namespace: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer returning a Namespace
but a parser, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably be called get_parser
or sth like that.
import sys | ||
from collections import defaultdict | ||
from functools import partial | ||
from typing import Any, Dict, Optional | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep imports sorted
for a in parser._actions: | ||
if a.dest in dict_config and isinstance(a, (_StoreTrueAction, _StoreFalseAction)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big fan of reaching into the internals of ArgumentParser
but also don't have a great idea what to do instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought that this would still be more robust than keeping an explicit list of Boolean arguments, although that wouldn't be impossible. I looked at a range of options, including shifting to an alternative arguments library, like Click, but that all got too complicated.
I think that this is a reasonable compromise if we keep on top of adding new Python versions to the test suite whenever they become available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, agree
for a in parser._actions: | ||
if a.dest in raw_config and isinstance(a, (_StoreTrueAction, _StoreFalseAction)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
# when it finds the file, or a .git directory, or a .hg directory, or the root of the file system, whichever | ||
# comes first. | ||
# if no files are given, start from cwd | ||
config_path = os.path.commonpath([os.path.abspath(file) for file in args.files]) if args.files else os.path.abspath(os.getcwd()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line gets a bit long with a ternary
if name not in valid_args: | ||
raise ValueError(f'{name} in the config file is not a valid option') | ||
|
||
# separate into default-overrides and special treatment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a brief comment why extra_keys
needs special treatment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure that it does. I left this with the same functionality that @janosh did.
This special treatment lets you specify a base set of extra keys in the config files, and then add to that in an ad-hoc manner, but does not let you remove extra keys....
Alternative treatment would be that whenever you specify extra keys you would have to list the complete set.
The more I think about it, the more I think that perhaps special treatment is NOT justified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that's a fair point and treating it specially may lead to surprising results.
@janosh can you explain your rationale? Do you agree that special treatment of this flag may be surprising?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember for sure at this point, but I think it might be left over from the original autoflake8 config file PR. I agree minimal user surprise is usually the way to go.
Co-authored-by: Florian Rathgeber <[email protected]>
Co-authored-by: Florian Rathgeber <[email protected]>
* Add command line argument keep-id, which maintiains randomly generated cell ids. Otherwise cell ids are assigned incrementally (after the removal of cells), which should keep them consistent across runs in version control * Modify test_cell and test_exception in test_keep_output_tags.py to use the new strip_output signature * Fix failed test_end_to_end_nbstripout with test_max_size by passing --keep-id for keeping the existing ids * Add tests for notebooks with and without the --keep-id flag. A new extension expected_id was added for expected output with ordered ids * Modify the readme to include the --include-id flag * Add keyword arguments for None inputs in test_keep_output_tags.py * Rename expected output files to make desired sequential ids more explicit Co-authored-by: Florian Rathgeber <[email protected]>
* Clarify that extra keys may _only_ specify notebook and cell metadata to be stripped. * Update metadata stripped by default. Addresses kynan#187
@@ -120,7 +120,7 @@ | |||
import sys | |||
import warnings | |||
|
|||
from nbstripout._utils import strip_output, strip_zeppelin_output | |||
from nbstripout._utils import strip_output, strip_zeppelin_output, merge_configuration_file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from nbstripout._utils import strip_output, strip_zeppelin_output, merge_configuration_file | |
from nbstripout._utils import strip_output, strip_zeppelin_output, merge_configuration_files |
|
||
def main(): | ||
parser = setup_commandline() | ||
args = merge_configuration_file(parser) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
args = merge_configuration_file(parser) | |
args = merge_configuration_files(parser) |
@janosh Are you still planning to address the outstanding review comments? :) |
Sorry, don't have time. Feel free to close. |
Thanks for the update! I'll leave this pending for now. Maybe I'll find time to pick this up myself over the holidays. |
Will this feature be available in pyproject.toml? In my use case, I need to filter notebooks that have a specific cell tag while leaving all others unfiltered. Will this feature support this use case? |
I think this PR was supposed to add pyproject.toml support for configuring
I don't think your feature request is related to this PR. Please file a new issue. @l-johnston would you be interested in working on this PR? |
Closes #166.
The code in this PR was adapted from PyCQA/autoflake#79. Big thank you and credit to @akeeman.