Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RSDK-9566: Monitor CPU/Memory usage of viam-server and modular processes with FTDC. #4642

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

dgottlieb
Copy link
Member

@dgottlieb dgottlieb commented Dec 18, 2024

In this patch, we just output raw "counters" for things like cpu/system time. I'm going to make a subsequent change to the ftdc parsing to treat those special to generate the actual cpu usage graphs.

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Dec 18, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 18, 2024
@dgottlieb dgottlieb requested a review from cheukt December 18, 2024 20:06
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 18, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 18, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 18, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 18, 2024
Copy link
Member

@cheukt cheukt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally looks good, q around behavior on other systems

ftdc/sys/sys.go Outdated
)

// On linux, getting the page size is a system call. Cache the page size for the entirety of the
// progarm lifetime. As opposed to calling it each time we wish to compute the resident memory a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// progarm lifetime. As opposed to calling it each time we wish to compute the resident memory a
// program lifetime. As opposed to calling it each time we wish to compute the resident memory a

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// Turn on process cpu/memory diagnostics for the module process. If there's an error, we should
// continue normally, just without FTDC. The control flow for that is a bit messy, so we use a
// lambda to leverage early-returns.
func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: could just factor out into a function then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

// Stats returns Stats.
func (sys *UsageStatser) Stats() any {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does Stats() return on a mac or possibly a windows machine?

Copy link
Member Author

@dgottlieb dgottlieb Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect the constructor to fail on both -- therefore these objects would never get registered with FTDC. And FTDC would never call Stats. But if the constructor did succeed, I'd expect the Stat call on line 80 to return an error and we'd just get 0 values out.

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants