Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Display Refactor #27

Open
toolCHAINZ opened this issue Nov 30, 2024 · 0 comments
Open

Display Refactor #27

toolCHAINZ opened this issue Nov 30, 2024 · 0 comments
Labels
enhancement New feature or request jingle_sleigh Involves the sleigh FFI

Comments

@toolCHAINZ
Copy link
Owner

The Current State

In my first cut at the various structures for representing p-code in Rust, I had the goal of avoiding allocation if at all possible and avoiding "context" if at all possible.

A VarNode was represented as follows:

pub struct VarNode {
    /// The index at which the relevant space can be found in a [`SpaceManager`]
    pub space_index: usize,
    /// The offset into the given space
    pub offset: u64,
    /// The size in bytes of the given [`VarNode`]
    pub size: usize,
}

Advantages

  • VarNodes are cheap to create, move, and clone.
  • VarNodes can be trivially used with serde.
  • VarNodes can be trivially constructed for test cases.
  • VarNodes are conceptually simple: no type arguments or lifetimes.

Disadvantages

  • This is further from sleigh's representation of VarNodes: in sleigh, a VarNode has a raw pointer handle to the Space it is in. Presumably this decision was made as the sleigh developers found it necessary (or at least helpful) to have this connection in place.
  • It is easy to manually create a VarNode which is meaningless with respect to a given sleigh context by putting in an invalid space id. The sleigh context should provide a way to produce varnodes that guarantees they are "meaningful".
  • It is impossible to impl Display for a VarNode without adding in a .display(ctx) method producing a VarNodeDisplay structure, which contains the data of the VarNode as well as relevant data from the sleigh context (the space name). This same process must be done for VarNode, IndirectVarNode, GeneralizedVarNode, and ResolvedVarNode, and anything that uses any of the above, including PcodeOperations leading to a large amount of cruft.

Planned Refactor

Instead define a VarNode like this:

pub struct VarNode {
    /// The index at which the relevant space can be found in a [`SpaceManager`]
    pub space: SharedSpaceInfo,
    /// The offset into the given space
    pub offset: u64,
    /// The size in bytes of the given [`VarNode`]
    pub size: usize,
}

Where SharedSpaceInfo is:

pub struct SharedSpaceInfo(Rc<SpaceInfo>);

Advantages

  • Does not add generic arguments or lifetimes to VarNodes. Once you have one, it is still very easy to work with.
  • Using an Rc allows for the VarNode to have access to more information about its space without imposing a huge memory or performance penalty: there are only a few spaces in an architecture, and they will be allocated once; runtime penalty of increments, decrements, and uses of an Rc shouldn't be noticable.
  • Allows for trivially implementing formatters (e.g. Display & LowerHex) directly on all varieties of VarNode as well as PcodeOperation and Instruction.
  • Frees developers from having to cart a reference to the sleigh context around anywhere they might want to display a VarNode.

Disadvantages

  • It is no longer possible to just slap a space index into a VarNode to create it; this can be a pain for testing.
  • It is now harder to implement Serialize or Deserialize on VarNode and friends. While serde has an rc flag which would enable this, it would lead to massively inefficient representations of seralized and deserialized data as serde does not deduplicate Rcs. Going forward, would likely need to produce SerializedVarNode (etc) structures to support this. Similar cruft to display, but now we're exporting it to serde type code instead of display code. That said, this would only need to be done for VarNode and IndirectVarNode as GeneralizedVarNode is not stored in any structures and ResolvedVarNode is likely not seralizable anyway due to the z3 context, so it likely wouldn't be too bad.
  • Probably a tiny tiny performance hit, but not enough to be worried about.
  • Makes VarNode structures unsuitable for a nostd environment, but the sleigh engine is likely already unsuitable, and I don't foresee or plan to support nostd for this crate.
@toolCHAINZ toolCHAINZ added enhancement New feature or request jingle_sleigh Involves the sleigh FFI labels Nov 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request jingle_sleigh Involves the sleigh FFI
Projects
None yet
Development

When branches are created from issues, their pull requests are automatically linked.

1 participant