-
Notifications
You must be signed in to change notification settings - Fork 369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dynamic parameter extraction #3143
base: main
Are you sure you want to change the base?
Add dynamic parameter extraction #3143
Conversation
9a9d21f
to
ce33ec7
Compare
✅ Deploy Preview for tetragon ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
ce33ec7
to
5ebb3e0
Compare
e3117d7
to
ba048b3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it, left some comments
thanks for splitting the code in multiple logical changes, it's rare ;-)
we were originally thinking of using C expression parsing (there's some go lib for that) but I think that can be done later if it's ever needed, the basic parsing you did should be fine
please add tests, would be great to have some framework where we could easily add various 'ExtractParam' expressions for testing
@@ -150,6 +150,19 @@ struct selector_arg_filters { | |||
__u32 argoff[5]; | |||
} __attribute__((packed)); | |||
|
|||
struct config_btf_arg_child { | |||
__u8 offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
u8 is not enough, is it? you can have bigger offset than 256
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I was doing the POC for this I could use u32 and the verifier does not complain, but when i put this in tetragon it does, so I changed it to u8. Though I've seen this function args_off
that format the int to 14 bits. So maybe the verifier let do math pointer until 1600 ?
I will make more tests with u16 to push the verifier to the limits to try.
i int, | ||
) (*btf.Type, error) { | ||
switch t := currentType.(type) { | ||
case *btf.Struct: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add here btf.Union as well
pkg/btf/btf.go
Outdated
if childOff > 255 { | ||
return nil, fmt.Errorf("Unable to reach type %v at offset %d", currentType.TypeName(), childOff) | ||
} | ||
(*btfArgChilds)[i].Offset = uint8(childOff) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
offset should be uint32 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verifier does not allow more than u8. But I should test if it is possible. This is why I put an exception if offset > 255.
bpf/process/generic_calls.h
Outdated
if (index <= 4) { | ||
btf_config = (&config->btf_arg_child0)[index]; | ||
struct extract_arg_data extract_data = { | ||
.config = config, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can drop extract_arg_data and just pass btf_config and arg directly to extract_arg_depth ?
I don't see struct extract_arg_data being used elsewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
loop(MAX_BTF_ARG_CHILD_DEPTH, extract_arg_depth, &extract_data, 0);
loop takes only 4 params. I do not see how we could do it ?
Another solution could be nested callback, but not sure it is better
Example :
loop(MAX_BTF_ARG_CHILD_DEPTH, extract_arg_depth(btf_config), (void **)&a, 0);
with extract_arg_depth
that take bpf_config
and return a function compatible with loop
. But this look ugly as well
@@ -89,6 +101,25 @@ generic_process_event(void *ctx, struct bpf_map_def *heap_map, | |||
a = (&e->a0)[index]; | |||
total = e->common.size; | |||
|
|||
if (index <= 4) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all the new extra processing should be optional, now we do that always.. maybe extra bool in event_config?
#else | ||
loop(MAX_BTF_ARG_CHILD_DEPTH, extract_arg_depth, &extract_data, 0); | ||
#endif /* __V61_BPF_PROG */ | ||
probe_read((void *)&a, sizeof(char *), (char *)a); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the purpose of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed because pointer are dereferenced on the next iteration when called, not when we do math ptr.
To give more context, let's assume we have linux_binprm and want to follow this linux_binprm.file.my_var
with my_var
as string and file
as struct file *
. The is_pointer
check in the loop is not done at file
iteration but on the next with my_var
. In order to get this pointer delay back, we make sure my_var
is safe ptr.
In the tetragon logs it look like this
With this final probe_read
{
"process_lsm": {
"args": [
{
"string_arg": "ls"
}
],
},
}
Without it
{
"process_lsm": {
"args": [
{
"string_arg": "xB�O�=�O�ls"
}
],
},
}
argType = gt.GenericTypeFromBTF(*lastChild) | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should put this in function and use it from all the generic probes config code, now we just duplicate the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes ! I was thinking about this because it is very ugly this way. But after that, I though maybe there are things that can be mutualized between generickprobe.go
and genericlsm.go
. So I put the ugly code like this and later we do refacto.
But if you mention this, I'll start just for this haha
This commit introduce the `struct config_btf_arg_depth`. It append `btf_argN` to `struct event_config` as an array of children to dig in any structure. For more details, any `btf_argN` can have a list of childrens as follow. For example, file->f_path is 152 bytes, so the array will look like [{ offset: 152, is_pointer: 0 }, { offset: 0, is_pointer: 0 } ...]. The max value `MAX_BTF_ARG_DEPTH` as been set arbitrary as the verifier need a fixed size. Signed-off-by: Tristan d'Audibert <[email protected]>
The aim of this commit is to enable the loop on the current hook arguments and move the buffer at config->btf_argN[i]->offset defined in userspace. The new offset is then written in the original variable to keep the same behaviour. The result of this is the rewrite of the current argument buffer with the required data. Signed-off-by: Tristan d'Audibert <[email protected]>
This commit make a closer relationship between the defined `GenericStringToType` arguments and the existing BTF types. For instance: `struct file` with btf is called `file` and also exists in `GenericStringToType` with the same alias. But `struct qstr d_name->name` has a returned type `unsigned char` wich does not exist yet in `GenericStringToType`. The same thing happened with `linux_binprm->filename` has the type is `char` Signed-off-by: Tristan d'Audibert <[email protected]>
The function recursively search in the structure for childrens matching `pathToFound`. The variable `i` is used here as the depth in the `pathToFound` array. For instance, if the search is in the linux_binprm structure and the path is as follow `file.f_path.dentry.d_name.name`, the following actions will be done. - Look for the variable name `file` inside `linux_binprm`. - If the matches, it stores the offset from linux_binprm where the `file` variable can be found. - Then it take the btf type `file` and search for parameter named `f_path`. Signed-off-by: Tristan d'Audibert <[email protected]>
This commit add 2 parameters to give the ability to the user to search for a specific variable following a "path" as follow: ```yml ... args: - index: 0 type: "linux_binprm" extractParam: "file.f_path.dentry.d_name.name" overwriteType: "string" ... ``` The above config can be used to extract a specific parameter from the structure at index 0. Signed-off-by: Tristan d'Audibert <[email protected]>
The OverwriteType parameter should be deleted if `argSelectorType` can use the `EventConfig` structure to search for the correct Type. Signed-off-by: Tristan d'Audibert <[email protected]>
Search if every types defined by the user with ExtractParam exists as BTF types and store their offsets in ConfigBtfArg. This function do a basic split on `ExtractParam` to obtain the "path" to the required data. Then, the array is gave to `btf.FindNextBTFType` to find the offsets of each elements until we reach the required data. The output is stored in EventConfig to keep the normal behaviour. For example, if the arg 0 is `struct linux_binprm` and ExtractParam is set to `file.f_path.dentry.d_name.name`, the output will give an array of all the offsets from there parents as such [{ offset: 96, is_pointer: 0 }, { offset: 152, is_pointer: 1 }, ...] From input = "linux_binprm -> file.f_path.dentry.d_name.name" Signed-off-by: Tristan d'Audibert <[email protected]>
This commit update `addLsm` function to use the parameters `ExtractParam` and `OverwriteType` in order to look for the child members in BTF structure. Signed-off-by: Tristan d'Audibert <[email protected]>
…probes Uprobe offsets can't be found in BTF file. Thus, if the user defines these specific parameters, they are ignored and a warning is displayed Signed-off-by: Tristan d'Audibert <[email protected]>
Add very similar code as in `genericlsm.go` file to handle those params. In a future commit, a refactoring should be acheived to avoid code redundant code with `genericlsm.go` Signed-off-by: Tristan d'Audibert <[email protected]>
ba048b3
to
1fb7dbc
Compare
The discussion for this PR can be found here #3142
This is currently a draft.
I wanted to start the discussion before submitting the final code, as I think, it is a big enough PR.
Take this PR in the today state as a proof of concept for dynamic parameter extraction. I will continue to work on this PR until the below checks are done.
At the current state, the PR is able to
Description
This PR introduce the dynamic parameter extraction
Comments
OverwriteType
parameter. But since the functionargSelectorType
does not receiveEventConfig
, it is not possible to search for the type using directly BTF types. So I suggest doing another PR before this one is merged to do so if possible. Then I'll remove the parameter. It is also not possible to add an if condition foruprobes
in this function. So if the user defines the parameter, it will overwrite the type at this moment.u8
to store the offset, as the verifier does not allow me to useu16
. At this moment, I don't know if it is possible to found offset > 255 in BTF structures. For my uses cases, usingu8
was enough.Test the PR
You can use the following config
If you want to test it with more arguments, you can use
bprm_creds_from_file
hook. It hasstruct linux_binprm
andstruct file
which are supported.