
Chapter 5: Your Customer Data

AI Flow

The Problem

You want to target "high-value customers who purchased recently." But where does that data come from? What fields can you actually use? Before you can build segments, you need to understand how customer data is organized in Treasure Data.

The Key Idea

Core concept

Your customer data lives in a "parent segment"—a unified view that combines customer records with their attributes and behaviors.

Think of it as the foundation. Every audience segment you build is a filtered view of this foundation.

The Three Layers

A parent segment has three components:

Master Table

The master table is your source of truth for customer identity. Each row is one customer. It contains:

  • A unique identifier (customer ID, email, etc.)
  • Core customer fields (name, email, signup date)

Example fields from a master table:

  • customer_id
  • email
  • first_name
  • signup_date
  • country

Attributes

Attributes are customer properties pulled from other tables and attached to each customer. They answer "what do we know about this customer?"

Example attributes:

  • lifetime_value — Total spend across all orders
  • loyalty_tier — Gold, Silver, Bronze
  • preferred_category — Most purchased product category
  • last_purchase_date — When they last bought something

Attributes are calculated or aggregated from your data warehouse and joined to each customer.
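To make the aggregate-then-join step concrete, here is a minimal sketch in Python. The sample rows and the `build_attributes` and `attach_attributes` helpers are hypothetical and exist only for illustration; in practice this work happens in the data warehouse, not in a script.

```python
from datetime import date

# Hypothetical master-table rows (one per customer).
customers = [
    {"customer_id": 1, "email": "sarah@example.com"},
    {"customer_id": 2, "email": "lee@example.com"},
]

# Hypothetical order rows from another table.
orders = [
    {"customer_id": 1, "amount": 150.0, "ordered_on": date(2024, 12, 15)},
    {"customer_id": 1, "amount": 300.0, "ordered_on": date(2024, 11, 2)},
    {"customer_id": 2, "amount": 40.0, "ordered_on": date(2024, 10, 20)},
]


def build_attributes(orders):
    """Aggregate orders into one attribute record per customer."""
    attrs = {}
    for order in orders:
        rec = attrs.setdefault(
            order["customer_id"],
            {"lifetime_value": 0.0, "last_purchase_date": None},
        )
        rec["lifetime_value"] += order["amount"]
        last = rec["last_purchase_date"]
        if last is None or order["ordered_on"] > last:
            rec["last_purchase_date"] = order["ordered_on"]
    return attrs


def attach_attributes(customers, attrs):
    """Join attribute records to each customer row by customer_id."""
    empty = {"lifetime_value": 0.0, "last_purchase_date": None}
    return [{**c, **attrs.get(c["customer_id"], empty)} for c in customers]


profiles = attach_attributes(customers, build_attributes(orders))
print(profiles[0]["lifetime_value"])  # 450.0
```

Each customer keeps exactly one row, and the attribute columns are added alongside the master-table fields.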

Behaviors

Behaviors are customer actions—events with timestamps. They answer "what has this customer done?"

Example behaviors:

  • purchase — Customer made a purchase
  • page_view — Customer viewed a page
  • email_open — Customer opened an email
  • cart_abandon — Customer left items in cart

Behaviors let you target based on recency and frequency: "purchased in the last 30 days" or "opened more than 3 emails this month."
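The two example conditions above can be sketched as simple checks over timestamped events. This is an illustration only: the event tuples and the `did_within` and `did_more_than` helpers are hypothetical, and the real evaluation is done by the platform.

```python
from datetime import datetime, timedelta

# Hypothetical behavior events: (customer_id, behavior name, timestamp).
events = [
    (1, "purchase", datetime(2024, 12, 15, 10, 0)),
    (1, "email_open", datetime(2024, 12, 1, 9, 0)),
    (2, "purchase", datetime(2024, 10, 20, 14, 30)),
    (2, "email_open", datetime(2024, 12, 2, 8, 0)),
    (2, "email_open", datetime(2024, 12, 5, 8, 0)),
    (2, "email_open", datetime(2024, 12, 9, 8, 0)),
    (2, "email_open", datetime(2024, 12, 12, 8, 0)),
]


def did_within(events, behavior, days, now):
    """Recency: customers with at least one matching event in the last `days` days."""
    cutoff = now - timedelta(days=days)
    return {cid for cid, name, ts in events
            if name == behavior and cutoff <= ts <= now}


def did_more_than(events, behavior, count, since, now):
    """Frequency: customers with more than `count` matching events since `since`."""
    totals = {}
    for cid, name, ts in events:
        if name == behavior and since <= ts <= now:
            totals[cid] = totals.get(cid, 0) + 1
    return {cid for cid, n in totals.items() if n > count}


now = datetime(2024, 12, 20)
month_start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)

recent_buyers = did_within(events, "purchase", days=30, now=now)
engaged = did_more_than(events, "email_open", 3, since=month_start, now=now)
print(recent_buyers, engaged)  # {1} {2}
```

Both checks depend on the event timestamp, which is why behaviors need one and attributes do not.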

Mental Model: Customer Profiles

Imagine a customer profile card:

┌─────────────────────────────────────┐
│ Sarah Johnson                       │
│ sarah@example.com                   │
│ Customer since: Jan 2022            │
├─────────────────────────────────────┤
│ Attributes:                         │
│   Lifetime Value: $2,450            │
│   Loyalty Tier: Gold                │
│   Preferred Category: Electronics   │
├─────────────────────────────────────┤
│ Recent Behaviors:                   │
│   Dec 15: Purchased ($150)          │
│   Dec 10: Viewed product page       │
│   Dec 1: Opened promotional email   │
└─────────────────────────────────────┘

The parent segment creates this unified view for every customer. When you build a segment, you're filtering these profiles: "show me all customers where Loyalty Tier is Gold AND they purchased in the last 30 days."
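As a rough sketch of that filter, the snippet below applies "Loyalty Tier is Gold AND purchased in the last 30 days" to a few profile records. The profile layout, the extra customers, the years on the dates, and the `gold_recent_buyers` helper are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical unified profiles: identity, attributes, and behaviors together.
profiles = [
    {
        "name": "Sarah Johnson",
        "attributes": {"loyalty_tier": "Gold", "lifetime_value": 2450},
        "behaviors": [("purchase", date(2024, 12, 15)),
                      ("page_view", date(2024, 12, 10))],
    },
    {
        "name": "Lee Park",
        "attributes": {"loyalty_tier": "Gold", "lifetime_value": 900},
        "behaviors": [("purchase", date(2024, 9, 3))],
    },
    {
        "name": "Ana Ruiz",
        "attributes": {"loyalty_tier": "Silver", "lifetime_value": 1200},
        "behaviors": [("purchase", date(2024, 12, 18))],
    },
]


def gold_recent_buyers(profiles, today, days=30):
    """Keep profiles that are Gold AND have a purchase within `days` days."""
    cutoff = today - timedelta(days=days)
    return [
        p["name"]
        for p in profiles
        if p["attributes"]["loyalty_tier"] == "Gold"
        and any(name == "purchase" and when >= cutoff
                for name, when in p["behaviors"])
    ]


segment = gold_recent_buyers(profiles, today=date(2024, 12, 20))
print(segment)  # ['Sarah Johnson']
```

Lee Park is Gold but has no recent purchase, and Ana Ruiz bought recently but is Silver, so only one profile passes both conditions.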

Seeing Your Data

Ask AI to show you what's available:

> "What attributes are available in my parent segment?"

AI will run the appropriate tdx command and return a list:

```yaml
attributes:
  - name: lifetime_value
    type: number
  - name: loyalty_tier
    type: string
  - name: last_purchase_date
    type: timestamp
  - name: email_opt_in
    type: boolean
```

For behaviors:

> "What behaviors can I target?"

```yaml
behaviors:
  - name: purchase
    fields: [amount, product_id, category]
  - name: page_view
    fields: [url, duration]
  - name: email_open
    fields: [campaign_id]
```

What This Means for Segments

When you ask AI to build a segment, you're using these building blocks:

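| Targeting Need  | Data Type | Example                         |
|-----------------|-----------|---------------------------------|
| "High spenders" | Attribute | lifetime_value > 1000           |
| "Gold members"  | Attribute | loyalty_tier = 'Gold'           |
| "Recent buyers" | Behavior  | purchase in last 30 days        |
| "Email engaged" | Behavior  | email_open count > 3 this month |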

AI translates your intent into the correct attribute or behavior references.
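One way to picture that translation is as a lookup from a plain-language need to a rule over a named field. The sketch below does this for the two attribute rows of the table. The rule tuples, the `OPS` mapping, and the `match` helper are a hypothetical format invented for this example, not Treasure Data's actual segment syntax.

```python
import operator

# Hypothetical rule format: (field, comparison, value).
OPS = {">": operator.gt, "=": operator.eq}

rules = {
    "High spenders": ("lifetime_value", ">", 1000),
    "Gold members": ("loyalty_tier", "=", "Gold"),
}

# Hypothetical customers with their attributes already attached.
customers = [
    {"customer_id": 1, "lifetime_value": 2450, "loyalty_tier": "Gold"},
    {"customer_id": 2, "lifetime_value": 900, "loyalty_tier": "Gold"},
    {"customer_id": 3, "lifetime_value": 1200, "loyalty_tier": "Silver"},
]


def match(rule, customers):
    """Return the IDs of customers whose attribute satisfies the rule."""
    field, op, value = rule
    return [c["customer_id"] for c in customers if OPS[op](c[field], value)]


high_spenders = match(rules["High spenders"], customers)
gold_members = match(rules["Gold members"], customers)
print(high_spenders, gold_members)  # [1, 3] [1, 2]
```

Behavior rows work the same way in spirit, except that the rule also carries a time window and, for frequency conditions, a count.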

The Parent Segment YAML

Here's what a parent segment configuration looks like:

```yaml
name: All Customers
master_table:
  database: marketing
  table: customers
  key_column: customer_id

attributes:
  - name: lifetime_value
    database: analytics
    table: customer_metrics
    key_column: customer_id
    value_column: ltv

behaviors:
  - name: purchase
    database: events
    table: purchases
    key_column: customer_id
    timestamp_column: purchased_at
```
You won't write this from scratch—it's typically set up once by your data team. But understanding it helps you know what's possible.
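Conceptually, the configuration describes a set of joins on customer_id. The sketch below imitates that with Python's built-in sqlite3 module. It is an analogy under simplifying assumptions, not how Treasure Data builds a parent segment: the three databases from the config are collapsed into one in-memory database, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table and column names mirror the config above; the rows are hypothetical.
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE customer_metrics (customer_id INTEGER, ltv REAL);
CREATE TABLE purchases (customer_id INTEGER, purchased_at TEXT);

INSERT INTO customers VALUES (1, 'sarah@example.com'), (2, 'lee@example.com');
INSERT INTO customer_metrics VALUES (1, 2450.0);
INSERT INTO purchases VALUES
    (1, '2024-12-15'), (1, '2024-11-02'), (2, '2024-10-20');
""")

# One row per customer: the master table drives the result, the attribute is
# joined in by key_column, and the behavior is summarized via its timestamp.
rows = conn.execute("""
SELECT c.customer_id,
       m.ltv AS lifetime_value,
       (SELECT MAX(p.purchased_at)
          FROM purchases p
         WHERE p.customer_id = c.customer_id) AS last_purchase
  FROM customers c
  LEFT JOIN customer_metrics m ON m.customer_id = c.customer_id
 ORDER BY c.customer_id
""").fetchall()

for row in rows:
    print(row)
# (1, 2450.0, '2024-12-15')
# (2, None, '2024-10-20')
```

Customer 2 has no row in customer_metrics, so the attribute comes back empty rather than dropping the customer. That is the practical meaning of the master table being the source of truth for who exists.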

Pitfalls

"I don't see the field I need."

The field might not be in the parent segment yet. Ask your data team to add it as an attribute or behavior, or ask AI: "How would I add lifetime_value as an attribute?"

"Attribute vs. behavior—which do I use?"

  • Attributes are current states: "is a Gold member," "has lifetime value of $1000"
  • Behaviors are historical actions: "made a purchase," "opened an email"

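If you're targeting based on something the customer did, it's a behavior. If you're targeting based on something the customer is, it's an attribute. Some attributes summarize behaviors: last_purchase_date is a current state derived from purchase events, so "last bought before June" is an attribute check, while "purchased in the last 30 days" is a behavior check.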

"The data seems out of date."

Parent segments refresh on a schedule (usually daily). Ask AI: "When was the parent segment last updated?" Real-time data requires different approaches covered in advanced chapters.

What You've Learned

  • Customer data is organized in parent segments
  • Master tables hold core customer identity
  • Attributes are customer properties (what they are)
  • Behaviors are customer actions (what they did)
  • AI can show you available fields to target

Next Step

You understand the data. Chapter 6 shows you how to explore it—asking AI to investigate your tables and answer questions about what you can target.

