adlibrary.com Logoadlibrary.com
Share
Guides & Tutorials,  Platforms & Tools

How to Pull Meta Ads Data Into BigQuery (2026 Guide)

Step-by-step guide to exporting Meta Ads data into BigQuery using Google's Data Transfer Service, the Marketing API, and ETL connectors. Updated for 2026.

Instagram ad campaign setup: three placements each with distinct creative layout

TL;DR: To pull Meta Ads data into BigQuery, you have three practical paths: Google's BigQuery Data Transfer Service (easiest, daily refresh), a direct Meta Marketing API pipeline (most flexible, requires code), or a third-party ETL connector like Fivetran or Stitch (best for teams needing hourly data without writing code). This guide covers all three with concrete steps, plus how to load competitor ad intelligence from the AdLibrary API into the same warehouse.

Meta Ads Manager is not a data warehouse. Its built-in reporting is session-scoped, its export is a CSV, and its attribution window settings shift what numbers you see depending on when you look. If you are running paid media at scale, those limits compound fast: you cannot join Meta spend data with CRM revenue or other ad platforms without first getting the data somewhere neutral.

Google's data warehouse is the neutral place for most teams — cheap for storage, fast for SQL analytical queries, and natively connected to every BI layer that matters (Looker Studio, Looker, Tableau, Hex, Metabase). Getting your Meta ads data into it is a one-time setup problem that pays off every week for as long as you run campaigns. The difficult attribution tracking problems that frustrate every performance team become tractable once the data lives in a warehouse you control.

Why Ads Manager Alone Does Not Scale

Three specific problems push teams toward a data warehouse:

Cross-channel attribution. Meta claims the conversion. Google claims the conversion. Your Meta dashboard says €4 CPA, your Google dashboard says €6 CPA. Neither is the truth — the truth is only visible when both datasets sit in a neutral warehouse where you control the attribution logic. See attribution tracking for how teams handle this in practice. The same problem applies to TikTok, YouTube, and every other paid channel you run in parallel.

Historical fidelity. Ads Manager shows data under the attribution window currently selected. Switch from 7-day click to 1-day click and your Q1 numbers change retroactively — not because campaign performance changed, but because the measurement frame changed. A properly built data warehouse table preserves reported values at time of ingestion. The numbers you used to make decisions stay preserved alongside the decisions you made, independent of any future setting changes.

Cohort and LTV analysis. Which Q1 campaigns generated customers who spent >€200 over 90 days? Ads Manager has no concept of customer lifetime value. That query needs your ad spend joined to your order data in the same system. This is the most common reason teams at ad performance tracking maturity invest in a pipeline — not faster reporting, but richer joins that the platform itself cannot produce.

The Three Methods: Decision Table

Pick the right method before touching a keyboard.

MethodSetup timeData freshnessTechnical requirementsBest for
BigQuery Data Transfer Service30 minDaily (24-48h lag)Google Cloud account onlyNon-technical teams, daily reporting
Meta Marketing API (direct)1-3 daysNear-real-time (15 min lag)Python/Node, GCS/BQ write accessTeams with engineering capacity
Third-party ETL (Fivetran, Stitch)2-4 hoursHourly (paid tier)No code, just authenticationNon-technical teams needing hourly data

For most marketing teams without an in-house data engineer, the BigQuery Data Transfer Service is the right starting point. It covers the core ad performance fields and costs nearly nothing to run. For creative-level metadata, placement breakdowns, or real-time data, the Marketing API direct path is worth the setup investment. For hourly refresh without code, Fivetran or Stitch are the practical choices.

Method 1: BigQuery Data Transfer Service

Google's BigQuery Data Transfer Service has a native Facebook Ads connector — the fastest path to getting Meta spend data into a warehouse.

Prerequisites: A Google Cloud project with billing enabled, BigQuery and BigQuery Data Transfer APIs enabled, a Meta Business account with advertiser access to the accounts you want to pull.

Setup steps:

  1. In BigQuery console, click "Data transfers" → "+ Create Transfer".
  2. Select "Facebook Ads" as the data source.
  3. Name the transfer (e.g., meta-ads-daily) and set schedule. Daily at 6-8am UTC works for most teams — Meta data is available approximately 12-24 hours after day end.
  4. Select your destination BigQuery project and dataset. The connector auto-creates four tables: ads, ad_sets, campaigns, and ad_performance (the main fact table with impressions, clicks, spend, and conversions by date).
  5. Click "Grant Access" and complete the OAuth flow with a Meta account that has advertiser access to the target ad accounts.
  6. Save and click "Schedule Backfill" for up to 90 days of historical data.

After the first run, you can query Facebook ads reporting data directly:

sql
SELECT date, campaign_name,
  SUM(spend) AS total_spend,
  SAFE_DIVIDE(SUM(spend), SUM(clicks)) AS cpc
FROM `your-project.your-dataset.ad_performance`
WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY 1, 2
ORDER BY total_spend DESC

What the Transfer Service does not cover: Creative-level metadata (ad copy text, image hash, video duration), placement breakdowns, audience segment data, and custom conversion event breakdowns. If you need those fields, use Method 2.

Method 2: Meta Marketing API Direct Pipeline

The Meta Marketing API gives you maximum control over fields and schedule. The core pipeline is 100-200 lines of Python.

Step 1: Create a Meta Developer app. Go to developers.facebook.com → create a Business-type app → add the Marketing API product. Request ads_read (required for performance data) and ads_management (required for creative metadata). For production, create a system user in Meta Business Manager (Settings > System Users) — system user tokens do not expire when a person's password changes.

Step 2: Query the Insights API.

python
from facebook_business.adobjects.adaccount import AdAccount
from facebook_business.api import FacebookAdsApi

FacebookAdsApi.init(access_token=ACCESS_TOKEN)
account = AdAccount('act_' + AD_ACCOUNT_ID)
insights = account.get_insights(
    params={
        'level': 'ad',
        'fields': ['ad_id','ad_name','campaign_name',
                   'impressions','clicks','spend','cpm','cpc','ctr',
                   'actions','date_start','date_stop'],
        'time_increment': 1,
        'date_preset': 'last_30d',
    }
)

The API paginates results via a cursor in the paging field. For large accounts with many ads (>500 rows per query), the Insights API returns results asynchronously — you get a job ID, poll until status is Job Completed, then fetch the report.

Meta's rate limit for Insights API calls is 200/hour per ad account (Business Use Case tier). For a daily pipeline on a mid-size account, this is not a constraint. The IAB programmatic framework is useful context for teams bridging this pipeline toward programmatic buying workflows.

Step 3: Write to BigQuery. Write API results to Google Cloud Storage as newline-delimited JSON, then load with a BigQuery load job:

python
from google.cloud import bigquery
client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    write_disposition="WRITE_TRUNCATE",  # idempotent per partition
    schema=[...]
)
load_job = client.load_table_from_json(rows, table_id, job_config=job_config)
load_job.result()

Use WRITE_TRUNCATE scoped to the date partition being loaded — this makes re-runs idempotent and prevents duplicate rows.

Step 4: Schedule. Google Cloud Scheduler → Cloud Function or Cloud Run container. For multi-source orchestration, Apache Airflow via Cloud Composer is standard. Pipeline runtime for a mid-size account is 3-8 minutes.

Method 3: Third-Party ETL Connectors

For teams that need more than daily refresh without writing pipeline code:

Fivetran — market-leading managed connector. Handles OAuth token refresh, API versioning, incremental syncs, and schema migrations automatically. Pricing based on monthly active rows. Supports hourly sync on paid plans.

Stitch — lower-cost alternative with similar coverage, less sophisticated schema management for custom conversion events.

Supermetrics — primarily a reporting layer (Looker Studio, Sheets) that also supports BigQuery as a destination. If your team already uses Supermetrics for Facebook ads dashboard work, this is a low-friction path to warehouse storage.

Before committing to any connector, verify: (a) does it expose placement and demographic breakdowns, or only summary metrics? (b) how does it handle custom conversion events? (c) what happens when Meta deprecates an API version — does it update automatically or break?

For Meta ads integrations that must survive Meta's deprecation cycles, managed connectors and the Transfer Service are more resilient than self-built pipelines.

Schema Decisions That Matter

The schema choices you make on day one affect how useful the data is for years. Four principles:

Preserve raw action data. Meta's actions field is a JSON array: [{"action_type":"purchase","value":"14"}]. Flattening to columns works until you add a new custom conversion API event and need to alter the schema. Store as REPEATED RECORD — you can always unnest at query time.

Partition by date. Make date_start the partition column. BigQuery charges for data scanned; a partitioned table makes 30-day queries cost ~30x less.

Preserve the attribution window. If you query with 7-day click today and 1-day click next month, the same date range returns different numbers. Add an attribution_window column to every row. This is the single change that most improves historical ad data analysis reliability.

Separate creative metadata from performance data. Creative fields change when ads are edited; performance data accumulates. Store them in separate tables with a timestamp column on the creatives table, joined via ad_id. This directly affects the quality of any downstream Facebook ads analytics work.

Adding Competitor Intelligence to the Same Warehouse

Your own performance data shows how your campaigns are doing. Competitor ad data shows whether the market has shifted — whether a competitor is scaling a format you have not tried, or whether CPMs are rising because the competitive landscape tightened.

Meta's free Ad Library is adequate for manual one-platform browsing. The moment you want structured competitor creative data queryable in BigQuery across Meta, TikTok, Google, YouTube, and LinkedIn in the same SQL, you need something else.

AdLibrary's Business plan API (€329/mo) returns competitor ad data in warehouse-friendly JSON: advertiser name, platform, creative format, first/last seen dates, engagement signals, and creative metadata. Load it with the same pipeline pattern as your own Meta data:

sql
SELECT
  own.campaign_name, own.spend, own.cpa,
  comp.advertiser_name AS top_competitor
FROM `your-project.meta_ads.performance` own
JOIN `your-project.adlibrary.competitor_snapshot` comp
  ON own.date = comp.snapshot_date AND own.category = comp.category
WHERE own.date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)

Your spend efficiency vs. competitor activity in the same SQL query — that is what turns a reporting pipeline into a strategy tool. For the API integration walkthrough, see AdLibrary API documentation and the API access feature page.

Validating the Pipeline Before Going Live

Four checks before calling the pipeline production-ready:

Row count reconciliation. Compare total spend and impressions in your BigQuery table against Ads Manager for the most recent completed day. Differences under 2% are normal. Over 5% means a field mapping error or a missed ad account.

Action event completeness. Pick three ads with known purchase conversions in Ads Manager. Verify your BigQuery table shows the same count. Mismatches usually mean the actions array is being parsed incorrectly.

Token expiry alert. User access tokens expire. System user tokens can be revoked. Build an alert that fires on authentication failure — a 3-day data gap is much harder to recover from than a same-day alert. This applies equally to secure Facebook ads API connections.

Rate limit headroom. Log the x-business-use-case-usage header from every response. If you are above 70% of your hourly allowance, increase page size to reduce the number of paginated calls, or request Advanced Access in your Meta app settings.

What to Build Once the Data Is In

A Meta Ads → BigQuery pipeline is infrastructure, not a destination. The value is in what runs on top. Teams that get the most from this setup typically start with one of these four uses and expand from there.

Cross-channel ROAS reporting. Join Meta spend with Google Ads (also available via Transfer Service), TikTok (via API), and Shopify orders. One dashboard showing total spend, total revenue, and blended ROAS across all channels — impossible to produce inside any single platform's interface. Use the ROAS calculator to set the target threshold before you build the dashboard so you have a clear pass/fail line for campaign evaluation.

Creative performance at scale. Query the top 50 ads by 7-day CTR over 90 days. What formats appear most? What copy patterns repeat in high performers? This analysis runs in seconds in BigQuery. It is not practical at all in Ads Manager beyond a handful of ads. The historical ad data analysis this enables is the most immediate payoff for creative teams building on top of the pipeline.

Attribution modeling. With ad spend in BigQuery alongside customer order data, you can build cohort analyses and custom attribution window models that account for the gaps created by App Tracking Transparency. The media mix modeling methodology requires exactly this kind of multi-touch data in a neutral warehouse — neither Meta's nor Google's dashboard can serve as the neutral party in their own attribution calculations.

Budget pacing alerts. A Cloud Function running on a 15-minute schedule queries BigQuery, compares actual daily spend against pacing targets by campaign, and sends a Slack or email alert when a campaign is pacing more than 20% under or over target. This "warehouse as ops brain" model is where teams at ad data for AI agents scale operate — automated signal detection rather than manual dashboard review. The ad budget planner is useful for setting the pacing targets that feed these alerts.

Frequently Asked Questions

Does Google's BigQuery Data Transfer Service support Meta Ads natively?

Yes. Google's BigQuery Data Transfer Service has a Facebook Ads connector that pulls campaign, ad set, and ad performance data on a daily schedule. It handles authentication, schema creation, and backfill automatically. Available in most Google Cloud regions; requires BigQuery and Data Transfer APIs enabled.

What Meta Ads data fields are available in BigQuery?

The Transfer Service connector imports account, campaign, ad set, and ad-level tables with core performance fields: impressions, clicks, spend, reach, frequency, CPM, CPC, CTR, conversions by action type, and date. Via Marketing API direct, you get all of those plus creative metadata, placement and demographic breakdowns, and custom conversion event data.

How fresh is the data pulled from Meta Ads into BigQuery?

Transfer Service runs daily — data is typically 24-48 hours behind. The Marketing API supports near-real-time queries (15-minute lag at the ad level), but API pipelines require building and scheduling the ingestion yourself. Third-party ETL tools like Fivetran offer hourly sync on paid tiers.

What permissions does the Meta Marketing API require for BigQuery export?

Your Meta app needs ads_read for performance data. For creative metadata, add ads_management. All permissions require Business verification. System user tokens (non-expiring) are the recommended approach for server-side pipelines — they do not break when a team member's password changes.

Can I pull competitor Meta Ads data into BigQuery?

Meta's Marketing API only exposes accounts you own or manage — competitor performance data is not accessible through it. For structured competitor creative intelligence across Meta, TikTok, Google, and more, the AdLibrary API on the Business plan (€329/mo) returns warehouse-ready JSON loadable via the same pipeline patterns described in this guide.

AdLibrary image

Pipeline Troubleshooting and Server-Side Attribution

Even a well-built pipeline hits errors. The four that come up most often, and the fix for each:

OAuthException: (#200) The user hasn't authorized the application — The access token is missing a required permission. Re-authenticate with the correct scope. For system user tokens, verify the user has been assigned ads_read in Business Manager settings (Settings > System Users > Assign Assets).

Transfer Service FAILED with no error detail — Navigate to the transfer run details and check the transfer_logs table BigQuery auto-creates in your destination dataset. Common causes: ad account ID mismatch, revoked OAuth token, or a deactivated ad account. Fix the root cause and re-run the backfill for affected dates — do not let data gaps accumulate silently.

BigQuery load job JSON parsing error — Usually means the actions array contained a new action type your schema does not accommodate. The permanent fix is using REPEATED RECORD or JSON type for actions rather than flattening it at load time. See conversion tracking patterns for schema examples that handle this gracefully.

Duplicate rows accumulating — Your pipeline is appending without deduplication. Use WRITE_TRUNCATE scoped to the date partition being loaded. Check for existing duplicates before deploying the fix:

sql
SELECT ad_id, date_start, COUNT(*) as row_count
FROM `your-project.meta_ads.performance`
GROUP BY 1, 2
HAVING row_count > 1

For rate limit errors: Meta's rate limiting documentation covers the BUC tier — 200 calls/hour per ad account is sufficient for most daily pipelines. If you are consistently above 70% utilization, increase page size (fewer paginated calls) or request Advanced Access in your Meta app settings.

The attribution gap. A BigQuery pipeline pulls reported data — subject to Meta's attribution logic, which shifted after iOS 14 introduced App Tracking Transparency. For accounts with significant iOS browser traffic, reported conversions may undercount by 20-40%. Modeled conversions fill some of that gap statistically, but not completely.

The complementary fix is server-side tracking via the Conversions API. CAPI sends conversion events from your server directly to Meta, bypassing browser limitations. Events sent via CAPI appear in Ads Manager and flow into your BigQuery pipeline — more complete conversion data in the same warehouse. With CAPI running alongside the pipeline, you have three sources in one place: server-measured events, Meta-attributed events, and your own order data. Cross-referencing them shows exactly where modeled conversions are filling measurement gaps and where they are not.

For CAPI implementation, see pixel and CAPI integration automation. Use the ROAS calculator to model the impact of better attribution coverage on reported campaign returns — the delta between pre- and post-CAPI reported ROAS is often 15-30% on iOS-heavy accounts.

Putting It All Together

Getting Meta Ads data into BigQuery is a solved problem. The question is which solution fits your situation and team.

For daily reporting without engineering work: BigQuery Data Transfer Service is a 30-minute setup covering 80% of use cases. Start there. Running costs are near-zero for most account sizes.

For creative-level metadata, placement breakdowns, or real-time data: Build a direct Marketing API pipeline. The core code is 100-200 lines of Python. Ongoing maintenance is low with system user tokens and API version monitoring via the deprecation schedule Meta publishes quarterly.

For hourly freshness without code: Fivetran or Stitch are the practical choices. The connector cost justifies itself quickly for teams spending more than €15,000/month on Meta — time saved on pipeline maintenance exceeds tool cost within two to three months. Use the Ad Spend Estimator to model whether your current spend volume supports the managed connector pricing tier.

The downstream investment that compounds most is adding competitive context to the same warehouse. AdLibrary's Business plan gives you API access to structured competitor ad data across Meta, TikTok, Google, and more — warehouse-ready JSON that loads into BigQuery with the same pipeline patterns you already run for your own data. Meta's free Ad Library is adequate for manual one-platform browsing. The moment you want multi-platform competitor data queryable alongside your own spend in the same SQL, you need something else. That is not a replacement for Meta's API — it is what you add when Meta's API stops being enough.

For further reading: meta ads integrations that matter, Facebook ads data analysis challenges and fixes, and the AdLibrary API documentation. For the programmatic competitive monitoring use case that feeds intelligence into this pipeline, see automate competitor ad monitoring and ad data for AI agents.