| Safe Haskell | None |
|---|---|
| Language | Haskell2010 |
Tokstyle.Analysis.PointsTo
Description
This module implements an inter-procedural, context-sensitive, summary-based points-to analysis for C code.
The analysis determines the set of abstract memory locations that each pointer variable could point to at any given program point. This is a foundational analysis required for many subsequent static analyses, such as taint tracking.
The core algorithm works as follows:
1. **AST Traversal**: It first traverses the Abstract Syntax Tree (AST) of the
entire program to find all function definitions and declarations.
2. **Worklist Algorithm**: It uses a worklist algorithm to iteratively analyze
functions until a fixed point is reached. The worklist contains pairs of
(FunctionName, Context), ensuring that functions are re-analyzed if their
calling context changes.
3. **Intra-procedural Analysis**: For each function, it performs a standard
forward dataflow analysis over its Control Flow Graph (CFG). The state
(or "facts") at each point is a PointsToMap, which maps abstract
locations to the set of locations they point to.
4. **Function Summaries**: After analyzing a function, it generates a
PointsToSummary. This summary captures the function's effect on pointers,
including what it returns and what side effects it has on pointers passed
as arguments. This is the key to scalability, as it avoids re-analyzing
a function's body at every call site.
5. **Context Sensitivity**: The analysis is context-sensitive, meaning it can
distinguish between different call sites of the same function. The Context
is a list of node IDs (hashes) representing the call stack. This allows for
more precise summaries and handling of recursion.
6. **Fixed-Point Iteration**: The worklist algorithm continues until the
summaries for all functions stabilize (i.e., no longer change). When a
function's summary changes, all of its callers are added back to the
worklist to be re-analyzed with the updated information. This process
guarantees that the analysis correctly handles complex interactions,
including mutual recursion.
Synopsis
- type PointsToMap = Map AbstractLocation (Set AbstractLocation)
- data PointsToContext l = PointsToContext {
- ptcCallGraph :: CallGraph
- ptcSummaries :: Map FunctionName PointsToSummary
- ptcFuncDefs :: FunctionDefs
- ptcFuncDecls :: FunctionDefs
- ptcStructDefs :: Map Text (Node (Lexeme l))
- ptcVarTypes :: Map Text (Node (Lexeme l))
- ptcCurrentContext :: Context
- ptcDynamicCallGraph :: Map (FunctionName, Context) (Set (FunctionName, Context))
- ptcAnalyzedCfgs :: Map (FunctionName, Context) (CFG Text PointsToState)
- ptcLocalVars :: Set Text
- ptcFileMacros :: MacroDefinitionMap
- data PointsToState = PointsToState {}
- type PointsToSummary = Map Context PointsToSummaryData
- data PointsToSummaryData = PointsToSummaryData {}
- type MacroDefinitionMap = Map FunctionName (Node (Lexeme Text))
- buildPointsToContext :: [(FilePath, [Node (Lexeme Text)])] -> CallGraph -> Map FunctionName PointsToSummary -> PointsToContext Text
- analyzeFunctionWithSummaries :: PointsToContext Text -> FunctionName -> PointsToMap
- toAbstractLocation :: HasCallStack => Node (Lexeme Text) -> AbstractLocation
- analyzeStatementForPointers :: MacroDefinitionMap -> PointsToContext Text -> Text -> PointsToMap -> Node (Lexeme Text) -> (PointsToMap, Set (FunctionName, Context))
- fixpointSummaries :: PointsToContext Text -> Worklist (FunctionName, Context) -> PointsToContext Text
- evalPointsToSet :: MacroDefinitionMap -> PointsToContext Text -> Set Text -> FunctionName -> PointsToMap -> Node (Lexeme Text) -> (Set AbstractLocation, Set (FunctionName, Context))
- transferPointsToState :: PointsToContext Text -> FunctionName -> PointsToState -> Node (Lexeme Text) -> (PointsToState, Set (FunctionName, Context))
Documentation
type PointsToMap = Map AbstractLocation (Set AbstractLocation) Source #
The PointsToMap is the data flow fact. It maps a pointer's abstract location to the set of abstract locations it can point to.
data PointsToContext l Source #
The context for the inter-procedural analysis. This data structure holds all the global information needed while analyzing the program.
Constructors
| PointsToContext | |
Fields
| |
Instances
| DataFlow PointsToContext Text PointsToState Source # | The PointsTo analysis is an instance of the generic DataFlow framework. |
Defined in Tokstyle.Analysis.PointsTo Methods emptyFacts :: PointsToContext Text -> PointsToState Source # transfer :: PointsToContext Text -> Text -> PointsToState -> Node (Lexeme Text) -> (PointsToState, Set (Text, [Int])) Source # join :: PointsToContext Text -> PointsToState -> PointsToState -> PointsToState Source # | |
data PointsToState Source #
The state for the points-to analysis, including the points-to map and the set of currently defined macros.
Constructors
| PointsToState | |
Fields | |
Instances
| Eq PointsToState Source # | |
Defined in Tokstyle.Analysis.PointsTo Methods (==) :: PointsToState -> PointsToState -> Bool # (/=) :: PointsToState -> PointsToState -> Bool # | |
| Show PointsToState Source # | |
Defined in Tokstyle.Analysis.PointsTo Methods showsPrec :: Int -> PointsToState -> ShowS # show :: PointsToState -> String # showList :: [PointsToState] -> ShowS # | |
| DataFlow PointsToContext Text PointsToState Source # | The PointsTo analysis is an instance of the generic DataFlow framework. |
Defined in Tokstyle.Analysis.PointsTo Methods emptyFacts :: PointsToContext Text -> PointsToState Source # transfer :: PointsToContext Text -> Text -> PointsToState -> Node (Lexeme Text) -> (PointsToState, Set (Text, [Int])) Source # join :: PointsToContext Text -> PointsToState -> PointsToState -> PointsToState Source # | |
type PointsToSummary = Map Context PointsToSummaryData Source #
The full, context-sensitive points-to summary for a function.
data PointsToSummaryData Source #
The summary for a function's points-to analysis in a specific context.
Constructors
| PointsToSummaryData | |
Instances
| Eq PointsToSummaryData Source # | |
Defined in Tokstyle.Analysis.Types Methods (==) :: PointsToSummaryData -> PointsToSummaryData -> Bool # (/=) :: PointsToSummaryData -> PointsToSummaryData -> Bool # | |
| Ord PointsToSummaryData Source # | |
Defined in Tokstyle.Analysis.Types Methods compare :: PointsToSummaryData -> PointsToSummaryData -> Ordering # (<) :: PointsToSummaryData -> PointsToSummaryData -> Bool # (<=) :: PointsToSummaryData -> PointsToSummaryData -> Bool # (>) :: PointsToSummaryData -> PointsToSummaryData -> Bool # (>=) :: PointsToSummaryData -> PointsToSummaryData -> Bool # max :: PointsToSummaryData -> PointsToSummaryData -> PointsToSummaryData # min :: PointsToSummaryData -> PointsToSummaryData -> PointsToSummaryData # | |
| Show PointsToSummaryData Source # | |
Defined in Tokstyle.Analysis.Types Methods showsPrec :: Int -> PointsToSummaryData -> ShowS # show :: PointsToSummaryData -> String # showList :: [PointsToSummaryData] -> ShowS # | |
type MacroDefinitionMap = Map FunctionName (Node (Lexeme Text)) Source #
A map from a macro's name to its AST definition node.
buildPointsToContext :: [(FilePath, [Node (Lexeme Text)])] -> CallGraph -> Map FunctionName PointsToSummary -> PointsToContext Text Source #
The main entry point for the inter-procedural points-to analysis. It sets up the initial context and kicks off the fixed-point iteration.
analyzeFunctionWithSummaries :: PointsToContext Text -> FunctionName -> PointsToMap Source #
Analyze a single function using the pre-computed summaries to get the final
points-to map at its exit points. This is useful for debugging or for
getting the final state of a specific function like main.
toAbstractLocation :: HasCallStack => Node (Lexeme Text) -> AbstractLocation Source #
Helper to convert an LHS expression AST node to an AbstractLocation.
analyzeStatementForPointers :: MacroDefinitionMap -> PointsToContext Text -> Text -> PointsToMap -> Node (Lexeme Text) -> (PointsToMap, Set (FunctionName, Context)) Source #
The transfer function for a single statement. It takes the current points-to state and a statement, and returns the new state and a set of new functions that need to be added to the worklist (due to being called).
fixpointSummaries :: PointsToContext Text -> Worklist (FunctionName, Context) -> PointsToContext Text Source #
The fixed-point iteration loop for computing function summaries. This is the heart of the inter-procedural analysis.
evalPointsToSet :: MacroDefinitionMap -> PointsToContext Text -> Set Text -> FunctionName -> PointsToMap -> Node (Lexeme Text) -> (Set AbstractLocation, Set (FunctionName, Context)) Source #
Evaluates an expression to determine the set of abstract locations it points to. This is a key part of the transfer function, used to resolve the RHS of assignments and the targets of function calls.
transferPointsToState :: PointsToContext Text -> FunctionName -> PointsToState -> Node (Lexeme Text) -> (PointsToState, Set (FunctionName, Context)) Source #
The new top-level transfer function. It inspects the current statement and either updates the macro map or delegates to the existing points-to logic.