Intercom API
The dt-intercom-api pipeline extracts data from the Intercom API and uploads it as JSON files to GCS. It is scheduled via Airflow.
Streams
| Stream | Endpoint | Method | Incremental | Notes |
|---|---|---|---|---|
| admins | /admins | GET | No | Full load each run |
| teams | /teams | GET | No | Full load each run |
| tags | /tags | GET | No | Full load each run |
| contacts | /contacts/search | POST | Yes | Cursor-based pagination |
| conversations | /conversations/search | POST | Yes | Also fetches conversation_parts |
| calls | /calls | GET | Yes | Page-based pagination, also fetches call_transcriptions |
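The cursor-based pagination used by the search streams can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function name `search_updated_since`, the injected `post` callable, and the `updated_at` filter are hypothetical, and the request and response shapes (`pagination.starting_after`, `pages.next.starting_after`, records under `data`) are assumptions about the search endpoints that should be checked against the Intercom API reference.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def search_updated_since(
    post: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    endpoint: str,
    updated_after: int,
    per_page: int = 50,
) -> Iterator[Dict[str, Any]]:
    """Yield records from a search endpoint, following the cursor until exhausted.

    `post` is a hypothetical helper that sends a JSON body to `endpoint`
    and returns the decoded JSON response.
    """
    cursor: Optional[str] = None
    while True:
        body: Dict[str, Any] = {
            # Assumed incremental filter: only records updated after the saved state.
            "query": {"field": "updated_at", "operator": ">", "value": updated_after},
            "pagination": {"per_page": per_page},
        }
        if cursor is not None:
            body["pagination"]["starting_after"] = cursor

        page = post(endpoint, body)
        # Assumed response shape: records live under "data".
        yield from page.get("data", [])

        # Assumed response shape: the next cursor lives under pages.next.starting_after.
        next_page = (page.get("pages") or {}).get("next") or {}
        cursor = next_page.get("starting_after")
        if not cursor:
            return
```

Injecting `post` keeps the loop independent of any HTTP client, so it can be exercised with a fake in tests.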
GCS Output
Files are stored with date partitioning:
State files for incremental streams:
Call Transcriptions
Transcripts are fetched via GET /calls/{id}/transcript for calls with a non-null transcription_url. Each record contains:
- call_id: the call ID
- call_updated_at: the call's updated_at timestamp
- transcript: array of utterances (start_time, end_time, speaker, content)
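The record layout above can be illustrated with a short sketch. The function name `build_transcription_record` and the injected `fetch_transcript` callable (standing in for GET /calls/{id}/transcript) are hypothetical, and the assumption that the transcript arrives as a list of utterance dictionaries is not confirmed by this document.

```python
from typing import Any, Callable, Dict, List, Optional

UTTERANCE_FIELDS = ("start_time", "end_time", "speaker", "content")


def build_transcription_record(
    call: Dict[str, Any],
    fetch_transcript: Callable[[str], List[Dict[str, Any]]],
) -> Optional[Dict[str, Any]]:
    """Return a transcription record for `call`, or None if it has no transcript."""
    # Only calls with a non-null transcription_url are fetched.
    if call.get("transcription_url") is None:
        return None

    utterances = fetch_transcript(call["id"])
    return {
        "call_id": call["id"],
        "call_updated_at": call["updated_at"],
        # Keep only the four documented utterance fields.
        "transcript": [
            {field: u.get(field) for field in UTTERANCE_FIELDS} for u in utterances
        ],
    }
```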
Running --stream calls fetches both calls and their transcriptions in a single pagination pass, uploading to separate GCS paths (calls/ and call_transcriptions/).
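A single pagination pass that feeds both outputs might look roughly like this. It is a sketch under stated assumptions: `run_calls_stream`, `get_page`, and `fetch_transcript` are hypothetical names, pages are assumed to be numbered from 1 and to end with an empty page, and the actual GCS upload is left out, with rows simply grouped by destination prefix.

```python
from typing import Any, Callable, Dict, List


def run_calls_stream(
    get_page: Callable[[int], List[Dict[str, Any]]],
    fetch_transcript: Callable[[str], Any],
) -> Dict[str, List[Dict[str, Any]]]:
    """Walk the calls pages once, collecting rows for both GCS prefixes."""
    outputs: Dict[str, List[Dict[str, Any]]] = {"calls/": [], "call_transcriptions/": []}
    page = 1
    while True:
        calls = get_page(page)  # hypothetical: returns [] once pages run out
        if not calls:
            break
        for call in calls:
            outputs["calls/"].append(call)
            # Transcripts are fetched in the same pass, only where one exists.
            if call.get("transcription_url") is not None:
                outputs["call_transcriptions/"].append(
                    {
                        "call_id": call["id"],
                        "call_updated_at": call.get("updated_at"),
                        "transcript": fetch_transcript(call["id"]),
                    }
                )
        page += 1
    return outputs
```

Fetching transcripts inside the same loop avoids paginating /calls a second time just to discover which calls have transcripts.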