Intercom API
The dt-intercom-api pipeline extracts data from the Intercom API and uploads it as JSON files to GCS. It is scheduled via Airflow.
Direction of flow
This page covers data flowing from Intercom into GCS. For the reverse direction — pushing user attributes and events to Intercom — see the Intercom data product.
Streams
| Stream | Endpoint | Method | Incremental | Notes |
|---|---|---|---|---|
| admins | /admins | GET | No | Full load each run |
| teams | /teams | GET | No | Full load each run |
| tags | /tags | GET | No | Full load each run |
| contacts | /contacts/search | POST | Yes | Cursor-based pagination |
| conversations | /conversations/search | POST | Yes | Also fetches conversation_parts |
| calls | /calls | GET | Yes | Page-based pagination; also fetches call_transcriptions |
| articles | /articles | GET | Yes | Page-based pagination |
| collections | /help_center/collections | GET | No | Full snapshot each run; the endpoint is not sorted by updated_at, so incremental sync is unsafe |
GCS Output
Files are stored with date partitioning:
State files for incremental streams:
Call Transcriptions
Transcripts are fetched via GET /calls/{id}/transcript for calls with a non-null transcription_url. Each record contains:
- call_id: the call ID
- call_updated_at: the call's updated_at timestamp
- transcript: array of utterances (start_time, end_time, speaker, content)
Running --stream calls fetches both calls and their transcriptions in a single pagination pass, uploading to separate GCS paths (calls/ and call_transcriptions/).
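The single-pass behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: `split_calls_page` and the `fetch_transcript` callable are hypothetical names, with `fetch_transcript` standing in for the GET /calls/{id}/transcript request. Only the record fields (call_id, call_updated_at, transcript) and the transcription_url check come from this page.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


def split_calls_page(
    calls: List[Record],
    fetch_transcript: Callable[[str], List[Record]],
) -> Tuple[List[Record], List[Record]]:
    """Turn one page of calls into call rows and transcription rows.

    `fetch_transcript` is a hypothetical stand-in for
    GET /calls/{id}/transcript; it returns the list of utterances.
    """
    call_rows: List[Record] = []
    transcript_rows: List[Record] = []
    for call in calls:
        call_rows.append(call)
        # Only calls with a non-null transcription_url have a transcript.
        if call.get("transcription_url") is not None:
            transcript_rows.append(
                {
                    "call_id": call["id"],
                    "call_updated_at": call["updated_at"],
                    "transcript": fetch_transcript(call["id"]),
                }
            )
    return call_rows, transcript_rows
```

In this sketch the caller would upload `call_rows` under calls/ and `transcript_rows` under call_transcriptions/ once per page, so both outputs come from the same pagination pass.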
Articles and Collections
articles is an incremental stream over GET /articles. It uses the same page-based pagination as calls, with results sorted by updated_at descending, but sets overlap_pages=3 (compared with the default of 40 that calls uses) because the article catalogue is small.
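This page does not spell out how overlap_pages is applied, so the sketch below shows only one plausible reading, stated as an assumption: pages are read newest first, and once a page contains nothing newer than the stored watermark, a further `overlap_pages` pages are read as a safety margin before stopping. `fetch_incremental`, `fetch_page`, and `watermark` are hypothetical names, not the pipeline's own.

```python
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]


def fetch_incremental(
    fetch_page: Callable[[int], List[Record]],
    watermark: int,
    overlap_pages: int = 3,
) -> List[Record]:
    """Collect records newer than `watermark` from a newest-first endpoint.

    Assumed behaviour (not confirmed by the source): after the first page
    with no record newer than the watermark, read `overlap_pages` more
    pages to catch stragglers, then stop.
    """
    fresh: List[Record] = []
    pages_left: Optional[int] = None  # None until the first stale page
    page = 1
    while pages_left is None or pages_left > 0:
        records = fetch_page(page)
        if not records:
            break  # no more pages
        new_on_page = [r for r in records if r["updated_at"] > watermark]
        fresh.extend(new_on_page)
        if pages_left is not None:
            pages_left -= 1  # inside the overlap window
        elif not new_on_page:
            pages_left = overlap_pages  # first stale page: open the window
        page += 1
    return fresh
```

Under this reading, a smaller overlap_pages means fewer requests per run at the cost of a narrower safety margin, which would be consistent with choosing 3 for a small catalogue.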
collections uses a stateless full-snapshot helper (_fetch_all_paginated_list_endpoint) because GET /help_center/collections is not sorted by updated_at — the maximum updated_at can appear on any page, which breaks the standard skip-ahead logic. Each run fetches all collections, dedupes by id, and writes a single GCS file. There is no state file for collections.
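A stateless full snapshot of this kind can be sketched as below. This is an illustration only and does not reproduce `_fetch_all_paginated_list_endpoint`: the function name `fetch_full_snapshot`, the `fetch_page` callable, and the assumed response shape (`data` and `total_pages` keys) are hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def fetch_full_snapshot(fetch_page: Callable[[int], Dict[str, Any]]) -> List[Record]:
    """Read every page of a list endpoint and dedupe records by id.

    `fetch_page` is a hypothetical callable taking a 1-based page number
    and returning {"data": [...], "total_pages": int}. No state is read
    or written, so every run starts from page 1.
    """
    by_id: Dict[str, Record] = {}
    page = 1
    while True:
        body = fetch_page(page)
        for record in body.get("data", []):
            # A record repeated on a later page replaces the earlier copy.
            by_id[record["id"]] = record
        if page >= body.get("total_pages", 1):
            break
        page += 1
    return list(by_id.values())
```

Because every page is always read, the position of the newest updated_at value does not matter, which is the property the collections stream relies on.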