// CONDOR SNOWFLAKE NATIVE APP DOCS

Snowflake Native App quickstart and operator guide

Condor is a Snowflake Native App for customer identity resolution inside Snowflake. Use these docs to create a workspace, launch modeling, run graph refresh, and inspect run status from SQL.

Minimal quickstart

Use this flow after the app is installed from Snowflake Marketplace, privileges are approved, the source reference is bound, and runtime resources either already exist or can be created. Run the procedures in order, and launch the graph refresh only after the modeling run has completed.

CALL APP.apply_app_runtime_settings();

CALL APP.create_workspace('customer_workspace');

CALL APP.run_modeling('customer_workspace');

CALL APP.get_run_status('customer_workspace');

CALL APP.run_graph_full_refresh('customer_workspace');

CALL APP.get_run_status('customer_workspace');

Output views

View

entity_membership

Primary record-to-entity membership output. It shows which source records belong to each resolved identity cluster.

Use this view to

  • map input records to resolved entities
  • count records per resolved entity
  • join source records back to graph assignments
  • inspect which records were grouped together
  • use stable IDs downstream when stable ID generation is enabled

Column | Type | Description
cluster_id | VARCHAR | The current resolved identity cluster identifier assigned by the graph run. Records with the same cluster_id are currently in the same resolved entity cluster.
stable_id | VARCHAR | The durable stable identifier assigned to the cluster when stable IDs are enabled and available. This can be NULL if stable ID assignment is disabled or no stable ID was assigned.
namespace | VARCHAR | The source namespace for the input record. Namespaces distinguish source systems, feeds, or logical record groups.
record_id | VARCHAR | The record identifier from the source namespace. Together, namespace and record_id identify the source record.
updated_run_id | VARCHAR | The graph run that last published this membership row.

Example query

SELECT
  stable_id,
  cluster_id,
  namespace,
  record_id
FROM entity_membership
ORDER BY stable_id, cluster_id, namespace, record_id;
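
Because stable_id can be NULL when stable ID assignment is disabled or no stable ID was assigned, a quick coverage check may help before downstream jobs join on it. The following is a minimal sketch that uses only the columns documented for entity_membership; the output column aliases are illustrative.

```sql
-- COUNT(stable_id) skips NULLs, so the difference from COUNT(*) is the
-- number of membership rows that currently have no stable ID.
SELECT
  COUNT(*) AS total_rows,
  COUNT(stable_id) AS rows_with_stable_id,
  COUNT(*) - COUNT(stable_id) AS rows_missing_stable_id
FROM entity_membership;
```

If rows_missing_stable_id is greater than zero, cluster_id may be the safer grouping key until stable IDs are assigned.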

View

entity_matches

Published edges between canonical records inside resolved clusters. It explains which record-level relationships contributed to the graph output.

Use this view to

  • inspect why records are connected
  • distinguish deterministic edges from probabilistic edges
  • review match scores for probabilistic links
  • build graph visualizations or QA reports
  • debug unexpected entity membership

Column | Type | Description
edge_id | VARCHAR | Stable identifier for the published edge. It is derived from the cluster, endpoints, and edge type.
cluster_id | VARCHAR | The resolved identity cluster that contains both edge endpoints.
stable_id | VARCHAR | The stable identifier for the cluster when stable IDs are enabled and available.
left_namespace | VARCHAR | Source namespace for the left edge endpoint.
left_canonical_id | VARCHAR | Canonical record identifier for the left edge endpoint.
right_namespace | VARCHAR | Source namespace for the right edge endpoint.
right_canonical_id | VARCHAR | Canonical record identifier for the right edge endpoint.
edge_type | VARCHAR | The type of published edge. Current values are deterministic and probabilistic.
score | DOUBLE | Edge score. Deterministic edges publish 1.0; probabilistic edges publish the model probability that passed the publish threshold.
updated_run_id | VARCHAR | The graph run that last published this edge row.

Review probabilistic edges

SELECT
  stable_id,
  cluster_id,
  score,
  left_namespace,
  left_canonical_id,
  right_namespace,
  right_canonical_id
FROM entity_matches
WHERE edge_type = 'probabilistic'
ORDER BY score DESC;
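
To compare deterministic and probabilistic edges at a glance, a per-type summary can be a useful starting point for QA reports. This sketch relies only on the edge_type and score columns described above; the aliases are illustrative.

```sql
-- One row per edge type with its volume and score range.
SELECT
  edge_type,
  COUNT(*) AS edge_count,
  MIN(score) AS min_score,
  AVG(score) AS avg_score,
  MAX(score) AS max_score
FROM entity_matches
GROUP BY edge_type
ORDER BY edge_type;
```

Given the column description, deterministic rows should show 1.0 throughout, and min_score for probabilistic rows shows the lowest probability that was actually published.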

View

entity_golden_records

One published golden record per stable ID when golden record output is enabled.

Use this view to

  • consume entity-level customer records
  • join stable IDs to selected profile attributes
  • power downstream analytics with one row per resolved identity
  • inspect the selected values for configured golden record fields

Column | Type | Description
stable_id | VARCHAR | The stable identifier for the resolved identity. This is the primary key for the golden record output.
<configured_golden_record_field> | VARCHAR | One column is created for each field in golden_record_include_fields. Field columns are configurable by workspace graph settings.
updated_run_id | VARCHAR | The graph run that last published this golden record row.

Example query for default settings

SELECT
  stable_id,
  email,
  updated_run_id
FROM entity_golden_records
ORDER BY stable_id;
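
Golden record attributes can be attached to individual source records by joining on stable_id. The sketch below assumes stable IDs are enabled and that email is one of the configured golden record fields, as in the example above; substitute whichever fields your workspace publishes.

```sql
-- Attach the selected golden record email to every member record.
-- The inner join drops membership rows whose stable_id is NULL.
SELECT
  m.namespace,
  m.record_id,
  g.stable_id,
  g.email
FROM entity_membership AS m
JOIN entity_golden_records AS g
  ON g.stable_id = m.stable_id
ORDER BY g.stable_id, m.namespace, m.record_id;
```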

View

entity_golden_record_lineage

Explains where each golden record field value came from.

Use this view to

  • audit golden record value selection
  • trace a golden record field back to a source record
  • understand which bundle supplied a field
  • debug why a specific source value was chosen
  • support explainability for golden records

Column | Type | Description
stable_id | VARCHAR | The stable identifier for the resolved identity.
field_name | VARCHAR | The golden record field described by this lineage row.
bundle_name | VARCHAR | The golden record bundle that selected this field. For fields not explicitly grouped into a bundle, the system uses resolved singleton bundles.
source_namespace | VARCHAR | Namespace of the source record that supplied the field value.
source_record_id | VARCHAR | Record ID of the source record that supplied the field value.
updated_run_id | VARCHAR | The graph run that last published this lineage row.

Trace field sources

SELECT
  stable_id,
  field_name,
  bundle_name,
  source_namespace,
  source_record_id
FROM entity_golden_record_lineage
ORDER BY stable_id, field_name;
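
When auditing value selection across the whole output, it can help to see which namespaces supply each golden record field most often. This is a rough sketch over the documented lineage columns; the value_count alias is illustrative.

```sql
-- For each golden record field, count how many selected values
-- came from each source namespace.
SELECT
  field_name,
  source_namespace,
  COUNT(*) AS value_count
FROM entity_golden_record_lineage
GROUP BY field_name, source_namespace
ORDER BY field_name, value_count DESC;
```

A distribution that does not match the configured namespace priority may be worth a closer look with the per-stable-ID lineage queries.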

Recommended output queries

Count records per entity

SELECT
  stable_id,
  cluster_id,
  COUNT(*) AS record_count
FROM entity_membership
GROUP BY stable_id, cluster_id
ORDER BY record_count DESC;

Find entities with records from multiple namespaces

SELECT
  stable_id,
  cluster_id,
  COUNT(DISTINCT namespace) AS namespace_count,
  COUNT(*) AS record_count
FROM entity_membership
GROUP BY stable_id, cluster_id
HAVING COUNT(DISTINCT namespace) > 1
ORDER BY namespace_count DESC, record_count DESC;

Inspect match evidence for one stable ID

SELECT
  edge_type,
  score,
  left_namespace,
  left_canonical_id,
  right_namespace,
  right_canonical_id
FROM entity_matches
WHERE stable_id = '<stable_id>'
ORDER BY edge_type, score DESC;

Trace golden record sources for one stable ID

SELECT
  field_name,
  bundle_name,
  source_namespace,
  source_record_id
FROM entity_golden_record_lineage
WHERE stable_id = '<stable_id>'
ORDER BY field_name;
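
Join source records back to graph assignments

The query below is a hedged sketch only. The table my_db.my_schema.loyalty_customers and its customer_id column are hypothetical placeholders, and it assumes that the records in the loyalty namespace carry that column's value as record_id. How namespace and record_id are derived from your source depends on your workspace configuration, so adjust all three names accordingly.

```sql
-- Hypothetical source table and key column; replace with your own.
-- record_id is VARCHAR, so the source key is cast for the comparison.
SELECT
  m.stable_id,
  m.cluster_id,
  s.*
FROM my_db.my_schema.loyalty_customers AS s
JOIN entity_membership AS m
  ON m.record_id = CAST(s.customer_id AS VARCHAR)
 AND m.namespace = 'loyalty';
```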

Important notes

  • stable_id is usually the best downstream join key when stable IDs are enabled.
  • cluster_id identifies the current graph cluster.
  • updated_run_id is useful for freshness checks, audits, and troubleshooting. It is not a business identifier.
  • entity_matches uses canonical record IDs. Use entity_membership when mapping raw source records to resolved entities.
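
As a simple freshness check based on updated_run_id, the following sketch shows how many membership rows each graph run last published. It assumes nothing beyond the documented column, and it makes no assumption about how run identifiers sort.

```sql
-- Rows grouped by the graph run that last published them.
SELECT
  updated_run_id,
  COUNT(*) AS row_count
FROM entity_membership
GROUP BY updated_run_id
ORDER BY row_count DESC;
```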

Procedure details

Saving settings changes the effective workspace settings for future runs; it does not launch a run. The modeling and graph launch procedures submit jobs. Use APP.get_run_status to inspect progress and completion.

APP.apply_app_runtime_settings()

Purpose: Apply app-global runtime resources and runtime metadata for the installed app.

When to use it: Run after app installation and privilege approval, before launching modeling or graph jobs.

Result: Expected status: applied.

CALL APP.apply_app_runtime_settings();

APP.create_workspace(workspace_name)

Purpose: Create or update a named workspace for identity resolution runs.

When to use it: Run before saving workspace settings or launching workspace jobs.

Result: Expected status: ready when the shared source reference is bound, or pending_reference when the workspace is created before the source reference is ready.

CALL APP.create_workspace('customer_workspace');

APP.get_workspace_modeling_settings(workspace_name)

Purpose: Inspect saved and effective modeling settings for a workspace.

When to use it: Run before modeling to confirm the next-run effective settings.

Result: Returns workspace name, effective modeling fields, saved settings, and effective settings.

CALL APP.get_workspace_modeling_settings('customer_workspace');

APP.save_workspace_modeling_settings(workspace_name, settings)

Purpose: Save workspace-level modeling setting overrides.

When to use it: Use before APP.run_modeling when packaged modeling defaults are not enough.

Result: Expected status: saved.

CALL APP.save_workspace_modeling_settings(
  'customer_workspace',
  PARSE_JSON('{
    "model_quality": "low",
    "model_passes": "low",
    "model_instructions": "Prefer canonical identifiers",
    "ignore_input_columns": [
      "customer_id_ground_truth",
      "record_time",
      "condor_record_id"
    ]
  }')
);

APP.get_workspace_graph_settings(workspace_name)

Purpose: Inspect saved and effective graph settings for a workspace.

When to use it: Run before graph launch to confirm stable ID, golden record, namespace, and bundle settings.

Result: Returns workspace name, saved settings, and effective graph settings.

CALL APP.get_workspace_graph_settings('customer_workspace');

APP.save_workspace_graph_settings(workspace_name, settings)

Purpose: Save workspace-level graph setting overrides.

When to use it: Use before graph launch when default graph settings are not enough.

Result: Expected status: saved.

CALL APP.save_workspace_graph_settings(
  'customer_workspace',
  PARSE_JSON('{
    "stable_id_enabled": true,
    "stable_id_mint_unclaimed_ids": true,
    "golden_record_include_fields": [
      "email",
      "phone_number",
      "loyalty_id",
      "first_name",
      "last_name"
    ],
    "golden_record_record_time_strategy": "latest",
    "golden_record_prioritize_by_namespace": true,
    "golden_record_namespace_priority": [
      "loyalty",
      "email",
      "pos_orders",
      "ecomm_orders"
    ],
    "golden_record_record_time_null_policy": "NULLS_LOW",
    "golden_record_bundles": {}
  }')
);

APP.run_modeling(workspace_name)

Purpose: Submit a modeling job for a workspace.

When to use it: Run after runtime resources are applied, source reference is bound, workspace exists, and modeling settings are reviewed or saved.

Result: Expected status: submitted.

CALL APP.run_modeling('customer_workspace');

APP.run_graph(workspace_name)

Purpose: Submit an incremental graph job for a workspace.

When to use it: Use for graph runs after modeling has completed and runtime resources are ready.

Result: Expected status: submitted.

CALL APP.run_graph('customer_workspace');

APP.run_graph_full_refresh(workspace_name)

Purpose: Submit a full-refresh graph job for a workspace.

When to use it: Use for initial setup, demos, or reset/rebuild workflows after modeling has completed and graph settings are reviewed or saved.

Result: Expected status: submitted.

CALL APP.run_graph_full_refresh('customer_workspace');

APP.get_run_status(workspace_name)

Purpose: Inspect run status for workspace jobs.

When to use it: Run after modeling or graph launch to inspect job state, service name, logs locator, and completion.

Result: Returns run status rows for the workspace.

CALL APP.get_run_status('customer_workspace');
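
The returned status rows can be re-read with Snowflake's RESULT_SCAN table function if you want to filter or sort them. This sketch assumes the procedure returns a tabular result and that the second statement runs in the same session immediately after the CALL. The status column names are not listed in this guide, so the query selects every column rather than guessing at them.

```sql
CALL APP.get_run_status('customer_workspace');

-- Re-read the result set of the statement that just ran in this session.
SELECT *
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
```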

Operator sequence

  1. Approve required app privileges.
  2. Apply runtime settings.
  3. Bind the source reference.
  4. Create a workspace.
  5. Review or save modeling settings.
  6. Launch modeling and inspect run status.
  7. Review or save graph settings.
  8. Launch graph refresh and inspect run status.
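
The SQL-driven steps in the sequence above can be sketched as one script. Steps 1 and 3 are omitted because this guide lists no procedure for them. The workspace name customer_workspace is the same example value used elsewhere in these docs, and the script uses a full refresh for step 8 on the assumption that this is an initial setup.

```sql
-- Step 2: apply app-global runtime resources (expected status: applied).
CALL APP.apply_app_runtime_settings();

-- Step 4: create the workspace (ready, or pending_reference if the
-- source reference is not bound yet).
CALL APP.create_workspace('customer_workspace');

-- Step 5: review the effective modeling settings for the next run.
CALL APP.get_workspace_modeling_settings('customer_workspace');

-- Step 6: launch modeling, then re-run the status call until the run completes.
CALL APP.run_modeling('customer_workspace');
CALL APP.get_run_status('customer_workspace');

-- Step 7: review the effective graph settings.
CALL APP.get_workspace_graph_settings('customer_workspace');

-- Step 8: launch the graph refresh and inspect its status.
CALL APP.run_graph_full_refresh('customer_workspace');
CALL APP.get_run_status('customer_workspace');
```

The launch procedures only submit jobs, so wait for the status call to show that modeling has completed before moving on to step 8.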

Important boundaries

  • The app is designed to run inside the customer's Snowflake environment.
  • Modeling runs from Snowflake SQL.
  • The source binding model uses a single shared app-level source-table reference.
  • Full refresh is available as a reset or rebuild path.

FAQ

Can I run it with default settings?

Yes. If no workspace settings are saved, packaged modeling and graph defaults are used.

How do I know a run completed?

Call APP.get_run_status(workspace_name) after launching a modeling or graph job. The returned rows report the job state and whether the run has completed.

What is a golden record?

A published customer record assembled from configured fields, bundles, namespace priority, and record-time behavior.

Is full refresh the normal ongoing runtime path?

No. Full refresh is available as a reset or rebuild path.