
Fix inline completion bug, add test cases #1002

Open · wants to merge 7 commits into main

Conversation


@hiCXK hiCXK commented Sep 17, 2024

Fixes #1001

@krassowski please check if the edits are correct

@dlqqq dlqqq added the bug (Something isn't working) label Sep 23, 2024
@dlqqq
Member

dlqqq commented Sep 25, 2024

Thank you for your contribution! It's been a week since we last heard from you, so I'm closing this PR due to inactivity. Please feel free to re-open this PR after addressing the review feedback.

@dlqqq dlqqq closed this Sep 25, 2024
@krassowski
Member

Well, to be fair, 3 days to address a review is not that much time if people are contributing in their free time. CC @ellisonbg

@hiCXK
Author

hiCXK commented Sep 25, 2024

Working on the issue... will open this again as soon as I resolve it 😃

@ellisonbg
Contributor

Yeah, I agree with @krassowski. Jupyter doesn't have any formal standards on this, so each repo and subproject tends to handle it differently. At the same time, I think waiting at least a month or more is pretty normal across Jupyter. At times, some repos have even given people much longer (>1 year), but that quickly makes it difficult to navigate the PR queue. To be clear: I am not making any decisions here, just reflecting how Jupyter has handled this in the past.

@ellisonbg
Contributor

Thanks @hiCXK for your contributions! We do want to go out of our way to welcome and encourage contributors. I will let others reopen the PR, but we look forward to seeing this PR reviewed and merged :-)

@krassowski
Member

krassowski commented Sep 25, 2024

Sorry for the direct ping. I just see that many PR reviews include words like "after discussing internally with the team", so when I saw a pattern of closing PRs based on a one-week threshold I wanted to raise this with the wider team, in case this was also discussed internally :) To be clear, I am not complaining about having internal reviews of features in addition to public reviews; I think it is awesome that you can do this!

I think it could make sense for this repo to decide on a short PR iteration time, but it would be great to have:

  • some consensus building in a public issue among contributors (like a show of hands between 1/2/3 weeks of inactivity)
  • it documented in CONTRIBUTING.md or in a section of the documentation

:)

@ellisonbg
Contributor

ellisonbg commented Sep 25, 2024

Thanks @krassowski for bringing this up, this is really important. Obviously, we (at AWS) do discuss things internally, but you are right that we shouldn't ever make decisions in that way or even appear to do so. All information and discussions that are used to make decisions about any Jupyter subproject or repo should be public and on GitHub.

In terms of decision making, we are bound to follow the Jupyter Decision Making Guide. However, for smaller repos (at least smaller than Lab, Notebook, Hub, etc.) like Jupyter AI it would be helpful to add information to the CONTRIBUTING.md with a summary of how that is meant to play out in this particular project, to help people understand the rough process and timelines for review, merging, etc.

I know @dlqqq is working hard to keep up with the fast pace of PRs in this repo and want to acknowledge his amazing work keeping the PR queue moving; this is an underappreciated part of open source work. I can't believe that this repo has already seen almost 500 PRs merged. For a relatively small effort, that is fantastic.

@hiCXK
Author

hiCXK commented Sep 25, 2024

@dlqqq please re-open the PR. I have corrected the test cases, which should now resolve the CI issue.

@dlqqq dlqqq reopened this Sep 26, 2024
@dlqqq
Member

dlqqq commented Sep 26, 2024

@krassowski Yes, I agree. 3 days is definitely too short, and this was a mistake on my end. I was looking at when the PR was opened, not when your review was given. Thank you for calling out my error and the need for community involvement in establishing contributing standards. 👍

(cc @hiCXK) Can open source contributors re-open PRs on their own, or do they require permission from a maintainer to do so? When I closed this PR, I had thought that this would be a harmless way of reducing the number of open PRs, which helps me identify the ones that require my review more quickly. I certainly didn't mean to "shut the door" on contributors who didn't respond within a certain time. Hope this clarifies my intention here. 🤗

@dlqqq
Member

dlqqq commented Sep 26, 2024

> Sorry for the direct ping. I just see that many PR reviews include words like "after discussing internally with the team", so when I saw a pattern of closing PRs based on a one-week threshold I wanted to raise this with the wider team, in case this was also discussed internally :) To be clear, I am not complaining about having internal reviews of features in addition to public reviews; I think it is awesome that you can do this!

@krassowski Thank you for addressing the "elephant in the room" here directly. I think it's really wonderful that we respect & trust each other enough to be able to discuss these hard questions directly & publicly. I fully acknowledge that we can do more to be transparent & earn trust with the open source community.

Given the project's substantial growth recently, there have been situations where I/we decided outcomes internally instead of seeking consensus in the interest of time, hoping that our conclusion would be generally accepted. However, this doesn't always work (as proven here 😁), and as @ellisonbg called out, this just isn't the right way to run an open source project. I will meet with Brian and apply his feedback on how to make Jupyter AI more open and transparent in its decision-making. This may take some time given that I'm balancing code review, code contribution, and project governance all at once, so please feel free to open an issue or ping me if you feel that progress isn't being made quickly enough.

Again, hope this clarifies things, and thank you for your feedback!

Member

@dlqqq dlqqq left a comment

@krassowski (regarding this PR) Can you review this since you're more familiar with the inline completion logic? You should have permissions to approve & merge as a member of the jupyterlab/ org, so feel free to do so if you're able.

@hiCXK
Author

hiCXK commented Sep 26, 2024

> @krassowski Yes, I agree. 3 days is definitely too short, and this was a mistake on my end. I was looking at when the PR was opened, not when your review was given. Thank you for calling out my error and the need for community involvement in establishing contributing standards. 👍
>
> (cc @hiCXK) Can open source contributors re-open PRs on their own, or do they require permission from a maintainer to do so? When I closed this PR, I had thought that this would be a harmless way of reducing the number of open PRs, which helps me identify the ones that require my review more quickly. I certainly didn't mean to "shut the door" on contributors who didn't respond within a certain time. Hope this clarifies my intention here. 🤗

The 'reopen' button wasn't visible to me; it might only be visible to members. As for closing this PR, I understood your intentions 🤗 I should have responded while working on it. Thank you.

Comment on lines 107 to 115
("```python\n\n\n# load json file\nimport\n```", "# load json file\nimport"),
("```python\n # load json file\nimport\n```", "# load json file\nimport"),
("```\n# load json file\nimport\n```", "# load json file\nimport"),
(" ```\n# load json file\nimport\n```", "# load json file\nimport"),
("``` \n# load json file\nimport\n```", "# load json file\nimport"),
(
"\n```python\n# load json file\nimport\n```",
"# load json file\nimport",
),
Member

These test cases are not representative of the example that I included in #1001. The point was that the user types in an instruction like:

# load json file

and the model should follow by responding with something like:

\nimport json
\nwith open('path-to-file', 'r') as f:
\n    content = json.load(f)

What we want to test is that the initial line has the \n preserved if it is included in the expected_suggestion. I do not think that these test cases test this; instead, they repeat the previous tests just with different text content.

@hiCXK does it make sense to you?
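
For context, a minimal sketch of the kind of post-processing under discussion. The name post_process_suggestion comes from this thread, but the signature and body below are assumptions for illustration, not the actual jupyter-ai implementation; it models only the behavior at issue (strip Markdown fences, drop an echoed prefix, keep the continuation's leading \n) and does not reproduce every case in the test table above.

import re

def post_process_suggestion(suggestion: str, prefix: str) -> str:
    # Drop an opening fence such as ``` or ```python (possibly indented),
    # keeping the newline after it: that newline may be part of the
    # genuine suggestion.
    suggestion = re.sub(r"^\s*```[^\n]*", "", suggestion)
    # Drop a closing fence, with or without a preceding newline.
    suggestion = re.sub(r"\n?```\s*$", "", suggestion)
    # If the model echoed the user's prefix (e.g. the instruction comment),
    # remove the echo but keep the leading "\n" of the real continuation.
    stripped = suggestion.lstrip("\n")
    core_prefix = prefix.strip("\n")
    if core_prefix and stripped.startswith(core_prefix):
        suggestion = stripped[len(core_prefix):]
    return suggestion

With this sketch, post_process_suggestion("```\n# load json file\nimport json\n```", "# load json file\n") returns "\nimport json": the echoed comment is gone, but the leading newline survives.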

Author

It seems I'm having trouble understanding the usage of response and expected_suggestion: response is the generated text from the mock LLM and expected_suggestion is the required correct text.
So, would this be a valid test case?
("```\nimport json\nwith open('path-to-file','r') as f:```", "\nimport json\nwith open('path-to-file','r') as f:")
This way we can check that the initial line's \n character is preserved.
If my approach is incorrect, please correct it and suggest a test case 🙏

Member

@krassowski krassowski Sep 30, 2024

Yes, something along these lines. Sometimes it will include the comment prefix and sometimes it won't, so two cases are needed here.
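
As an illustration (hypothetical responses, not taken from the test suite), for a user prefix of "# load json file\n" the two cases look like:

# 1. The model repeats the instruction comment before continuing:
response_with_echo = "```\n# load json file\nimport json\n```"

# 2. The model continues directly, without repeating it:
response_without_echo = "```\nimport json\n```"

# In both cases the suggestion inserted for the user should be just:
expected_suggestion = "\nimport json"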

Author

@hiCXK hiCXK Sep 30, 2024

Ok, so I am adding these test cases:

("```\n# load json file\nimport json\nwith open('path-to-file','r') as f:```", "\n# load json file\nimport json\nwith open('path-to-file','r') as f:"), 
("```\nimport json\nwith open('path-to-file','r') as f:```", "\nimport json\nwith open('path-to-file','r') as f:")

Are these correct? The comment prefix is included. We could also shorten them, since we only have to check the preservation of the first \n character:

  ("```\n# load json file\n```", "\n# load json file"),
  ("```\ndef function():\n    pass\n```", "\ndef function():\n    pass"),
 

Also, can you elaborate on the inclusion of the comment prefix if my solution is incorrect? I have gone through the community forum and documentation to find out more about this.

Member

The proposed shortened tests are fine here. Ideally we would have a separate test which checks that:

response = "```\n# load json file\nimport json\nwith open('path-to-file','r') as f:```"

generated for prefix:

# load json file

gives

result = "\nimport json\nwith open('path-to-file','r') as f:" 

This would check the code in line 46.
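
A sketch of that separate test, reusing the MockCompletionHandler, MockProvider, and InlineCompletionRequest names that appear later in this thread (the test name and exact values are illustrative assumptions):

async def test_prefix_removed_but_leading_newline_kept():
    # Model response that repeats the instruction comment inside a fence.
    response = "```\n# load json file\nimport json\nwith open('path-to-file','r') as f:```"
    inline_handler = MockCompletionHandler(
        lm_provider=MockProvider,
        lm_provider_params={"model_id": "model", "responses": [response]},
    )
    # The user-typed prefix that the suggestion should continue from.
    request = InlineCompletionRequest(
        number=1, prefix="# load json file\n", suffix="", mime="", stream=False
    )

    await inline_handler.handle_request(request)
    suggestion = inline_handler.messages[0].list.items[0].insertText
    # The echoed comment is stripped, but the leading newline is preserved.
    assert suggestion == "\nimport json\nwith open('path-to-file','r') as f:"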

Author

Although the test cases seem correct this time and the function post_process_suggestion is correct, the test cases are failing the unit tests again. There is an AssertionError for each test case I added; \n characters are still being stripped off despite changing the function. I must be missing something.

[image: screenshot of the failing unit tests with AssertionErrors]

Please suggest a way to resolve them.

Member

@krassowski krassowski Oct 8, 2024

You need to write a new test function rather than add cases to this one. This is because this test function defines prefix = "", but here we are concerned with the situation where the user gets a suggestion after some prefix (like a comment with an instruction). For the "" prefix, "any-text".startswith(prefix) will always be true.
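
The empty-prefix point can be checked directly in Python:

# With an empty prefix, startswith() is trivially true for any string,
# so the prefix-handling branch is never meaningfully exercised:
assert "any-text".startswith("")  # True
assert "".startswith("")          # also True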

Author

@hiCXK hiCXK Oct 9, 2024

@krassowski Please have a look at this function, which uses a parametrized test:

@pytest.mark.parametrize(
    "prefix,response,expected_result",
    [
        ("", "```\n# load json file\n```", "\n# load json file"),
        ("", "```\ndef function():\n    pass\n```", "\ndef function():\n    pass"),
        (
            "# load json file\n",
            "```\n# load json file\nimport json\nwith open('path-to-file','r') as f:```",
            "\nimport json\nwith open('path-to-file','r') as f:",
        ),
    ],
)
async def test_newline_preservation_and_prefix_handling(prefix, response, expected_result):
    # Mock handler whose LLM returns `response` verbatim.
    inline_handler = MockCompletionHandler(
        lm_provider=MockProvider,
        lm_provider_params={
            "model_id": "model",
            "responses": [response],
        },
    )
    dummy_request = InlineCompletionRequest(
        number=1, prefix=prefix, suffix="", mime="", stream=False
    )

    await inline_handler.handle_request(dummy_request)
    assert len(inline_handler.messages) == 1
    suggestions = inline_handler.messages[0].list.items
    assert len(suggestions) == 1

    result = suggestions[0].insertText
    assert result == expected_result

    if prefix:
        # The post-processed suggestion should not repeat the user's prefix.
        assert not result.startswith(prefix)

When I added this function to test_handlers.py, it still produced assertion errors. How are test_handlers.py and completion_utils.py connected (i.e., how is the post_process_suggestion function accessed)? Do we need to change some other file, like default.py or wherever the InlineCompletionRequest class is defined?
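
For what it's worth, a plausible but unverified picture of how the pieces connect; other than post_process_suggestion and InlineCompletionRequest, every name below (the import path, the handler class, its helpers) is a hypothetical stand-in, not the repository's actual code:

from completion_utils import post_process_suggestion  # assumed import path

class InlineCompletionHandler:  # hypothetical handler, e.g. living in default.py
    async def handle_request(self, request):
        raw_response = await self._invoke_model(request)  # hypothetical helper
        # test_handlers.py never calls completion_utils.py directly: it only
        # sees the effect of post_process_suggestion through a call like this
        # inside the handler.
        suggestion = post_process_suggestion(raw_response, request.prefix)
        return self._build_reply(suggestion)  # hypothetical helper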

Labels
bug (Something isn't working)
Projects
None yet
Development

Successfully merging this pull request may close these issues:

Inline completion strips out new line when it should be kept

4 participants