Knowledge elicitation through "Context"

Robert Osazuwa Ness edited this page Sep 1, 2022 · 1 revision

Problem to be solved

A key differentiator for the PyWhy library is the departure from "data-first causal discovery," where users provide data as the primary input to a discovery algorithm. Causal inference theory requires the modeler to specify a formal representation of their domain assumptions; causal inference then reasons over those assumptions. "Data-first causal discovery" discourages novice users from thinking about and specifying their assumptions, and instead encourages them to see a discovery algorithm as a "philosopher's stone" that magically converts data into causal relationships. With this mindset, users tend to surrender to the algorithm the task of providing the domain-specific assumptions that enable identifiability.

The "data-first causal discovery" approach contrasts with DoWhy and other packages focused on causal effect inference and other queries downstream of having a graph. These libraries require users to specify domain assumptions up front, in the form of a DAG or related abstraction, before the data is added; they then address the identifiability of the target query given those assumptions and data. Causal discovery should follow this same pattern.

Proposed solution

We propose a Context data structure that stores assumptions, domain knowledge, include/exclude lists, priors, and other context that can constrain causal discovery. Such assumptions, knowledge, and constraints include but are not limited to the following:

  • Inference Constraints - including, but not limited to, assumptions that make a graph identifiable up to some equivalence class.
    • lists of edges that should be included or excluded in a causal graph
    • invariants describing the nature of features in the dataset (e.g. feature X is always an effect)
  • Feature Relationships
    • Mechanistic/directional/functional assumptions between variables in the model
  • Bayesian Priors on Graph Structure
  • Constraints on or biases towards latent structure

Context is passed in alongside the data.

context = Context(...)
model = learn_graph(context, data, discovery_algo)
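As a rough illustration of what such a data structure might hold, here is a minimal sketch. The field names (`nodes`, `included_edges`, `excluded_edges`) and the validation rules are assumptions made for this example, not the PyWhy API; the sketch covers only edge constraints and leaves out priors and latent-structure constraints.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Edge = Tuple[str, str]  # directed edge written as (cause, effect)


@dataclass(frozen=True)
class Context:
    """Hypothetical container for assumptions supplied before discovery."""

    nodes: FrozenSet[str] = frozenset()
    included_edges: FrozenSet[Edge] = frozenset()
    excluded_edges: FrozenSet[Edge] = frozenset()

    def __post_init__(self) -> None:
        # An edge cannot be both required and forbidden.
        conflict = self.included_edges & self.excluded_edges
        if conflict:
            raise ValueError(f"edges both included and excluded: {sorted(conflict)}")
        # Every constrained edge must refer to declared nodes.
        for cause, effect in self.included_edges | self.excluded_edges:
            if cause not in self.nodes or effect not in self.nodes:
                raise ValueError(f"edge ({cause!r}, {effect!r}) uses an undeclared node")


# Assumptions are stated before any data is involved.
context = Context(
    nodes=frozenset({"X", "Y", "Z"}),
    included_edges=frozenset({("X", "Y")}),
    excluded_edges=frozenset({("Y", "X")}),
)
```

Making the object immutable and validating it at construction time is one possible design choice: contradictory assumptions surface before a discovery algorithm ever runs.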

Base Case: User specifies target nodes

Causal discovery algorithms tend to define the scope of discovery by the nodes that are in the data. Using the data to define the scope of the graph encourages the anti-pattern of "data-first causal discovery."

So as a base case, we ask the user to specify a basic set of nodes they wish to target with causal discovery. These are the nodes about which the user has causal questions; the task of causal discovery is to learn a graph that can answer those questions. Specifically, the causal discovery algorithm attempts to learn a graph over the union of (a) a causally sufficient set of nodes for the input set and (b) the Markov blankets of the input nodes.

context = Context(
    base_nodes=("X", "Y", "Z")
)
model = learn_graph(context, data, discovery_algo)
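To make the Markov blanket part of that scope concrete, the sketch below computes it on a small DAG that is assumed to be known. In real discovery the graph is unknown and the blanket has to be estimated from data; the `dag` dictionary and the `markov_blanket` helper here are hypothetical and serve only to illustrate the definition (parents, children, and the children's other parents).

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], node: str) -> Set[str]:
    """Return the parents, children, and co-parents of `node` in a DAG.

    `parents` maps every node to the set of its direct causes.
    """
    children = {child for child, pa in parents.items() if node in pa}
    co_parents = {p for child in children for p in parents[child]}
    return (parents[node] | children | co_parents) - {node}


# Hypothetical DAG, given as child -> parents:
#   A -> X,  X -> Y,  B -> Y,  Y -> Z
dag = {"A": set(), "B": set(), "X": {"A"}, "Y": {"X", "B"}, "Z": {"Y"}}

# The discovery scope grows from the user's base nodes to their blankets.
base_nodes = {"X"}
scope = set(base_nodes)
for n in base_nodes:
    scope |= markov_blanket(dag, n)

print(sorted(scope))  # ['A', 'B', 'X', 'Y']
```

Here `Z` falls outside the scope: it is neither a parent, a child, nor a co-parent of the base node `X`.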

A user who wants to target every variable in the data must say so explicitly.

context = Context(
    base_nodes=data.columns
)

Similarly, the user could specify base nodes by their role in a downstream causal inference query, such as causal effect inference. Causal discovery would focus on learning the causal structure of a causally sufficient set of nodes, as well as adding nodes that could aid in causal effect inference, such as instruments, mediators, and effect modifiers.

context = Context(
    treatments={'A', 'B'},
    outcomes={'C'}
)

Examples of assumptions, constraints, and prior knowledge

Include and exclude lists

The simplest example of usage is specifying knowledge as include and exclude lists. Include and exclude lists are highly effective ways to elicit expert knowledge and reduce the size of the search space.

context = Context(
    nodes=...,
    forced_edges=include_df,
    excluded_edges=exclude_df
)
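The effect on the search space can be illustrated with a small count. The sketch below represents the lists as plain sets of `(cause, effect)` tuples, which is an assumption made for this example (the format of `include_df` and `exclude_df` is not specified here), and counts how many directed edges remain for an algorithm to decide.

```python
from itertools import permutations

# Hypothetical node set and expert-provided lists.
nodes = ["A", "B", "C", "D"]
include = {("A", "B")}  # edges the expert asserts exist
exclude = {("B", "A"), ("D", "A"), ("D", "B"), ("D", "C")}  # edges ruled out

# Every ordered pair of distinct nodes is a candidate directed edge.
all_pairs = set(permutations(nodes, 2))

# Edges already decided by the expert need no search.
undecided = all_pairs - include - exclude

print(len(all_pairs))  # 12
print(len(undecided))  # 7
```

Five expert statements settle five of the twelve candidate edges, so the algorithm only has to decide the remaining seven.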

More examples will be added as we add new discovery algorithms.

Builder Pattern Prototype

We are prototyping the use of the builder pattern to create context objects.

To illustrate, a context with include and exclude lists and a set of base nodes would look as follows:

# Specify the base nodes directly.
context = (
    make_context()
    .nodes(...)
    .forced_edges(include_df)
    .prevented_edges(exclude_df)
    .build()
)

# Or specify the nodes by their role in a downstream query.
context = (
    make_context()
    .treatments(...)
    .outcomes(...)
    .forced_edges(include_df)
    .prevented_edges(exclude_df)
    .build()
)
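For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of how such a builder could work. The `ContextBuilder` class, the shape of the built `Context`, and the use of tuple lists for edges are assumptions made for illustration; the prototype's actual implementation may differ.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set, Tuple

Edge = Tuple[str, str]  # directed edge written as (cause, effect)


@dataclass(frozen=True)
class Context:
    """Hypothetical immutable result of the builder."""

    nodes: FrozenSet[str]
    forced_edges: FrozenSet[Edge]
    prevented_edges: FrozenSet[Edge]


class ContextBuilder:
    """Collects assumptions step by step; each setter returns self for chaining."""

    def __init__(self) -> None:
        self._nodes: Set[str] = set()
        self._forced: Set[Edge] = set()
        self._prevented: Set[Edge] = set()

    def nodes(self, names: Iterable[str]) -> "ContextBuilder":
        self._nodes.update(names)
        return self

    def forced_edges(self, edges: Iterable[Edge]) -> "ContextBuilder":
        self._forced.update(edges)
        return self

    def prevented_edges(self, edges: Iterable[Edge]) -> "ContextBuilder":
        self._prevented.update(edges)
        return self

    def build(self) -> Context:
        # Freeze the collected assumptions into an immutable Context.
        return Context(
            frozenset(self._nodes),
            frozenset(self._forced),
            frozenset(self._prevented),
        )


def make_context() -> ContextBuilder:
    return ContextBuilder()


# The parentheses let the chained calls span several lines.
context = (
    make_context()
    .nodes(["X", "Y", "Z"])
    .forced_edges([("X", "Y")])
    .prevented_edges([("Z", "X")])
    .build()
)
```

One appeal of this design is that new kinds of assumptions can be added as new builder methods without changing the signature of a single large constructor.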