[MCP&A2A] 15. Cost Tracking and Management

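Overview

The operating costs of an AI system are hard to predict and can grow quickly. This chapter covers how to precisely track and control LLM API calls, embedding generation, and token usage.

Why does cost tracking matter?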

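A typical cost growth pattern in an AI project:

Day 1:  $100
Day 7:  $500  (5x increase!)
Day 14: $2,000 (20x increase!!)
Day 21: $5,000 (50x increase!!!)
Day 30: $12,000 (120x increase!!!!)

Causes:
- Unexpected traffic spikes
- Duplicate API calls
- Inefficient prompts
- No caching
- Infinite loops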

1. Cost Structure Analysis

1.1 AI System Cost Components

Total cost = LLM cost + embedding cost + infrastructure cost

LLM cost (50-70%)
├─ Input Tokens × $0.01/1K
├─ Output Tokens × $0.03/1K
└─ Varies by model (GPT-4 > GPT-3.5)

Embedding cost (20-30%)
├─ text-embedding-ada-002: $0.0001/1K tokens
├─ nomic-embed-text: no per-token fee (self-hosted)
└─ Vector DB storage cost

Infrastructure cost (10-20%)
├─ PostgreSQL: $50-200/month
├─ Redis: $20-100/month
├─ Kubernetes: $100-500/month
└─ Network: $10-50/month

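1.2 Cost Optimization Priorities

| Rank | Item | Potential savings | Difficulty |
|------|------|-------------------|------------|
| 1 | LLM caching | 60-80% | Low |
| 2 | Prompt optimization | 30-50% | Medium |
| 3 | Embedding caching | 80-90% | Low |
| 4 | Model selection | 40-60% | Low |
| 5 | Batch processing | 20-30% | Medium |
| 6 | Token limits | 10-20% | Low |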

2. Database Schema

2.1 Token Usage Table

-- Token usage tracking
CREATE TABLE token_usage (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),

    -- Tenant information
    tenant_id VARCHAR(100) NOT NULL,
    user_id VARCHAR(100),

    -- Service information
    service_type VARCHAR(50) NOT NULL, -- 'llm', 'embedding', 'rag'
    model_name VARCHAR(100) NOT NULL,   -- 'gpt-4', 'gpt-3.5-turbo', 'ada-002'

    -- Token usage
    input_tokens INTEGER DEFAULT 0,
    output_tokens INTEGER DEFAULT 0,
    total_tokens INTEGER NOT NULL,

    -- Cost information
    cost_per_1k_input DECIMAL(10, 6),
    cost_per_1k_output DECIMAL(10, 6),
    -- Six decimal places, so that sub-cent requests do not round to zero
    total_cost DECIMAL(12, 6) NOT NULL,

    -- Request information
    request_id VARCHAR(100),
    endpoint VARCHAR(255),

    -- Metadata
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    metadata JSONB
);

-- Indexes
CREATE INDEX idx_token_usage_tenant_created 
ON token_usage(tenant_id, created_at DESC);

CREATE INDEX idx_token_usage_service 
ON token_usage(service_type, created_at DESC);

CREATE INDEX idx_token_usage_user 
ON token_usage(user_id, created_at DESC);

-- Partitioning (for large data volumes)
-- Requires the parent table to be declared with PARTITION BY RANGE (created_at)
-- and the primary key to include created_at.
-- CREATE TABLE token_usage_2024_01 PARTITION OF token_usage
--     FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

2.2 Cost Budget Table

-- Per-tenant cost budgets
CREATE TABLE cost_budgets (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id VARCHAR(100) NOT NULL UNIQUE,

    -- Budget settings
    daily_budget DECIMAL(10, 2),
    monthly_budget DECIMAL(10, 2),

    -- Current spend (six decimal places, so that sub-cent increments are not lost)
    daily_spent DECIMAL(12, 6) DEFAULT 0,
    monthly_spent DECIMAL(12, 6) DEFAULT 0,

    -- Alert thresholds (%)
    warning_threshold INTEGER DEFAULT 80,  -- warn at 80%
    critical_threshold INTEGER DEFAULT 95, -- block at 95%

    -- Limit settings
    is_enabled BOOLEAN DEFAULT true,
    max_requests_per_minute INTEGER DEFAULT 100,

    -- Timestamps
    daily_reset_at TIMESTAMP,
    monthly_reset_at TIMESTAMP,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- No separate index on tenant_id is needed:
-- the UNIQUE constraint above already creates one.

2.3 Cost Alert Log

-- Cost alert history
CREATE TABLE cost_alerts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id VARCHAR(100) NOT NULL,

    -- Alert information
    alert_type VARCHAR(50) NOT NULL, -- 'warning', 'critical', 'exceeded'
    threshold_type VARCHAR(50) NOT NULL, -- 'daily', 'monthly'

    -- Cost information
    current_spent DECIMAL(10, 2),
    budget_limit DECIMAL(10, 2),
    usage_percentage DECIMAL(5, 2),

    -- Status
    is_notified BOOLEAN DEFAULT false,
    notified_at TIMESTAMP,

    -- Message
    message TEXT,

    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Indexes
CREATE INDEX idx_cost_alerts_tenant_created 
ON cost_alerts(tenant_id, created_at DESC);

3. Cost Tracking Implementation

3.1 Token Counter

// internal/billing/token_counter.go
package billing

import (
    "context"
    "time"
    
    "github.com/google/uuid"
    "github.com/jackc/pgx/v5/pgxpool"
)

// TokenUsage : a single token usage record
type TokenUsage struct {
    ID              string    `json:"id"`
    TenantID        string    `json:"tenant_id"`
    UserID          string    `json:"user_id"`
    ServiceType     string    `json:"service_type"`
    ModelName       string    `json:"model_name"`
    InputTokens     int       `json:"input_tokens"`
    OutputTokens    int       `json:"output_tokens"`
    TotalTokens     int       `json:"total_tokens"`
    CostPer1KInput  float64   `json:"cost_per_1k_input"`
    CostPer1KOutput float64   `json:"cost_per_1k_output"`
    TotalCost       float64   `json:"total_cost"`
    RequestID       string    `json:"request_id"`
    Endpoint        string    `json:"endpoint"`
    CreatedAt       time.Time `json:"created_at"`
}

// TokenCounter : token usage tracker
type TokenCounter struct {
    db *pgxpool.Pool
}

func NewTokenCounter(db *pgxpool.Pool) *TokenCounter {
    return &TokenCounter{db: db}
}

// RecordUsage : computes the cost of a usage record and stores it
func (tc *TokenCounter) RecordUsage(ctx context.Context, usage *TokenUsage) error {
    // Compute the cost
    usage.TotalCost = tc.calculateCost(usage)
    
    query := `
        INSERT INTO token_usage (
            id, tenant_id, user_id, service_type, model_name,
            input_tokens, output_tokens, total_tokens,
            cost_per_1k_input, cost_per_1k_output, total_cost,
            request_id, endpoint, created_at
        ) VALUES (
            $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14
        )
    `
    
    _, err := tc.db.Exec(ctx, query,
        uuid.New().String(),
        usage.TenantID,
        usage.UserID,
        usage.ServiceType,
        usage.ModelName,
        usage.InputTokens,
        usage.OutputTokens,
        usage.TotalTokens,
        usage.CostPer1KInput,
        usage.CostPer1KOutput,
        usage.TotalCost,
        usage.RequestID,
        usage.Endpoint,
        time.Now(),
    )
    
    return err
}

// calculateCost : cost from token counts and per-1K-token rates
func (tc *TokenCounter) calculateCost(usage *TokenUsage) float64 {
    inputCost := float64(usage.InputTokens) / 1000.0 * usage.CostPer1KInput
    outputCost := float64(usage.OutputTokens) / 1000.0 * usage.CostPer1KOutput
    return inputCost + outputCost
}

// GetDailyUsage : today's total cost for a tenant
func (tc *TokenCounter) GetDailyUsage(ctx context.Context, tenantID string) (float64, error) {
    query := `
        SELECT COALESCE(SUM(total_cost), 0)
        FROM token_usage
        WHERE tenant_id = $1
          AND created_at >= CURRENT_DATE
    `
    
    var totalCost float64
    err := tc.db.QueryRow(ctx, query, tenantID).Scan(&totalCost)
    return totalCost, err
}

// GetMonthlyUsage : this month's total cost for a tenant
func (tc *TokenCounter) GetMonthlyUsage(ctx context.Context, tenantID string) (float64, error) {
    query := `
        SELECT COALESCE(SUM(total_cost), 0)
        FROM token_usage
        WHERE tenant_id = $1
          AND created_at >= date_trunc('month', CURRENT_DATE)
    `
    
    var totalCost float64
    err := tc.db.QueryRow(ctx, query, tenantID).Scan(&totalCost)
    return totalCost, err
}

// GetUsageByService : cost per service type within a date range
func (tc *TokenCounter) GetUsageByService(ctx context.Context, tenantID string, startDate, endDate time.Time) (map[string]float64, error) {
    query := `
        SELECT service_type, SUM(total_cost) as cost
        FROM token_usage
        WHERE tenant_id = $1
          AND created_at BETWEEN $2 AND $3
        GROUP BY service_type
    `
    
    rows, err := tc.db.Query(ctx, query, tenantID, startDate, endDate)
    if err != nil {
        return nil, err
    }
    defer rows.Close()
    
    result := make(map[string]float64)
    for rows.Next() {
        var serviceType string
        var cost float64
        if err := rows.Scan(&serviceType, &cost); err != nil {
            return nil, err
        }
        result[serviceType] = cost
    }
    
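    // Surface any error that interrupted the row iteration
    if err := rows.Err(); err != nil {
        return nil, err
    }

    return result, nil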
}
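
The `calculateCost` method above works in `float64`, which is fine for a single request but can drift once many tiny amounts are summed. One possible alternative, sketched below, is to do the arithmetic in integer micro-dollars. The `MicroUSD` type and both helper functions are hypothetical names introduced for illustration and are not part of the original code.

```go
package main

import "fmt"

// MicroUSD is a hypothetical fixed-point money type: 1 USD = 1,000,000 MicroUSD.
type MicroUSD int64

// costMicroUSD computes the cost of `tokens` tokens at a per-1K-token rate
// expressed in MicroUSD, rounding half up to the nearest micro-dollar.
func costMicroUSD(tokens int, ratePer1K MicroUSD) MicroUSD {
	return (MicroUSD(tokens)*ratePer1K + 500) / 1000
}

// requestCost sums the input and output cost of one request.
func requestCost(inputTokens, outputTokens int, inRate, outRate MicroUSD) MicroUSD {
	return costMicroUSD(inputTokens, inRate) + costMicroUSD(outputTokens, outRate)
}

func main() {
	// Rates of $0.0005 and $0.0015 per 1K tokens, expressed in micro-dollars.
	c := requestCost(1200, 300, 500, 1500)
	fmt.Printf("%d micro-USD = $%.6f\n", c, float64(c)/1e6)
}
```

Integer amounts add up exactly, so the sum of per-request costs always matches what a later `SUM()` over the stored rows would report.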

3.2 Model Pricing Table

// internal/billing/pricing.go
package billing

// ModelPricing : per-model pricing (USD per 1K tokens)
type ModelPricing struct {
    InputCostPer1K  float64
    OutputCostPer1K float64
}

// PricingTable : price list
var PricingTable = map[string]ModelPricing{
    // OpenAI GPT-4
    "gpt-4": {
        InputCostPer1K:  0.03,
        OutputCostPer1K: 0.06,
    },
    "gpt-4-turbo": {
        InputCostPer1K:  0.01,
        OutputCostPer1K: 0.03,
    },
    
    // OpenAI GPT-3.5
    "gpt-3.5-turbo": {
        InputCostPer1K:  0.0005,
        OutputCostPer1K: 0.0015,
    },
    
    // Anthropic Claude
    "claude-3-opus": {
        InputCostPer1K:  0.015,
        OutputCostPer1K: 0.075,
    },
    "claude-3-sonnet": {
        InputCostPer1K:  0.003,
        OutputCostPer1K: 0.015,
    },
    "claude-3-haiku": {
        InputCostPer1K:  0.00025,
        OutputCostPer1K: 0.00125,
    },
    
    // Embeddings
    "text-embedding-ada-002": {
        InputCostPer1K:  0.0001,
        OutputCostPer1K: 0.0,
    },
    "text-embedding-3-small": {
        InputCostPer1K:  0.00002,
        OutputCostPer1K: 0.0,
    },
    "text-embedding-3-large": {
        InputCostPer1K:  0.00013,
        OutputCostPer1K: 0.0,
    },
}

// GetPricing : looks up the pricing for a model
func GetPricing(modelName string) (ModelPricing, bool) {
    pricing, exists := PricingTable[modelName]
    return pricing, exists
}

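// EstimateCost : estimates the cost of a request
func EstimateCost(modelName string, inputTokens, outputTokens int) float64 {
    pricing, exists := GetPricing(modelName)
    if !exists {
        // Unknown models yield 0, so a zero result does not mean the call is free
        return 0.0
    }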
    
    inputCost := float64(inputTokens) / 1000.0 * pricing.InputCostPer1K
    outputCost := float64(outputTokens) / 1000.0 * pricing.OutputCostPer1K
    
    return inputCost + outputCost
}
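
`EstimateCost` returns `0.0` for a model that is missing from the table, which makes an unpriced model look free to a budget check. A stricter variant could return an error instead. The sketch below assumes a hypothetical `estimateCostStrict` function and `ErrUnknownModel` sentinel, and copies only two entries from the pricing table above to stay short.

```go
package main

import (
	"errors"
	"fmt"
)

// ModelPricing mirrors the struct from pricing.go.
type ModelPricing struct {
	InputCostPer1K  float64
	OutputCostPer1K float64
}

// pricingTable is a two-entry excerpt of the table above.
var pricingTable = map[string]ModelPricing{
	"gpt-4":         {InputCostPer1K: 0.03, OutputCostPer1K: 0.06},
	"gpt-3.5-turbo": {InputCostPer1K: 0.0005, OutputCostPer1K: 0.0015},
}

// ErrUnknownModel is a hypothetical sentinel error for models without pricing.
var ErrUnknownModel = errors.New("unknown model")

// estimateCostStrict behaves like EstimateCost but fails loudly on unknown models.
func estimateCostStrict(model string, inputTokens, outputTokens int) (float64, error) {
	p, ok := pricingTable[model]
	if !ok {
		return 0, fmt.Errorf("%w: %q", ErrUnknownModel, model)
	}
	in := float64(inputTokens) / 1000.0 * p.InputCostPer1K
	out := float64(outputTokens) / 1000.0 * p.OutputCostPer1K
	return in + out, nil
}

func main() {
	cost, err := estimateCostStrict("gpt-4", 1000, 500)
	fmt.Println(cost, err)

	_, err = estimateCostStrict("unknown-model", 1000, 500)
	fmt.Println(errors.Is(err, ErrUnknownModel))
}
```

A caller can then decide explicitly whether to reject the request or fall back to a conservative default price.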

4. Budget Management

4.1 Budget Manager

// internal/billing/budget_manager.go
package billing

import (
    "context"
    "fmt"
    "time"
    
    "github.com/jackc/pgx/v5/pgxpool"
)

// CostBudget : cost budget
type CostBudget struct {
    ID                   string    `json:"id"`
    TenantID             string    `json:"tenant_id"`
    DailyBudget          float64   `json:"daily_budget"`
    MonthlyBudget        float64   `json:"monthly_budget"`
    DailySpent           float64   `json:"daily_spent"`
    MonthlySpent         float64   `json:"monthly_spent"`
    WarningThreshold     int       `json:"warning_threshold"`
    CriticalThreshold    int       `json:"critical_threshold"`
    IsEnabled            bool      `json:"is_enabled"`
    MaxRequestsPerMinute int       `json:"max_requests_per_minute"`
    DailyResetAt         time.Time `json:"daily_reset_at"`
    MonthlyResetAt       time.Time `json:"monthly_reset_at"`
}

// BudgetManager : budget manager
type BudgetManager struct {
    db           *pgxpool.Pool
    tokenCounter *TokenCounter
}

func NewBudgetManager(db *pgxpool.Pool, tc *TokenCounter) *BudgetManager {
    return &BudgetManager{
        db:           db,
        tokenCounter: tc,
    }
}

// CheckBudget : checks whether an estimated cost still fits within the budget
func (bm *BudgetManager) CheckBudget(ctx context.Context, tenantID string, estimatedCost float64) error {
    // Look up the budget
    budget, err := bm.GetBudget(ctx, tenantID)
    if err != nil {
        return err
    }
    
    if !budget.IsEnabled {
        return nil
    }
    
    // Daily budget check
    dailyTotal := budget.DailySpent + estimatedCost
    if budget.DailyBudget > 0 && dailyTotal > budget.DailyBudget {
        return fmt.Errorf("daily budget exceeded: $%.2f / $%.2f", 
            dailyTotal, budget.DailyBudget)
    }
    
    // Monthly budget check
    monthlyTotal := budget.MonthlySpent + estimatedCost
    if budget.MonthlyBudget > 0 && monthlyTotal > budget.MonthlyBudget {
        return fmt.Errorf("monthly budget exceeded: $%.2f / $%.2f", 
            monthlyTotal, budget.MonthlyBudget)
    }
    
    // Warning threshold check
    if budget.DailyBudget > 0 {
        dailyPercentage := (dailyTotal / budget.DailyBudget) * 100
        if dailyPercentage >= float64(budget.WarningThreshold) {
            // Best effort: a failed alert insert should not block the request
            _ = bm.createAlert(ctx, tenantID, "warning", "daily", dailyTotal, budget.DailyBudget)
        }
    }
    
    return nil
}

// UpdateSpent : adds a cost to the daily and monthly spend
func (bm *BudgetManager) UpdateSpent(ctx context.Context, tenantID string, cost float64) error {
    query := `
        UPDATE cost_budgets
        SET daily_spent = daily_spent + $1,
            monthly_spent = monthly_spent + $1,
            updated_at = CURRENT_TIMESTAMP
        WHERE tenant_id = $2
    `
    
    _, err := bm.db.Exec(ctx, query, cost, tenantID)
    return err
}

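// ResetDailyBudget : resets the daily spend
func (bm *BudgetManager) ResetDailyBudget(ctx context.Context) error {
    // daily_reset_at starts out NULL, so rows that were never reset must match too
    query := `
        UPDATE cost_budgets
        SET daily_spent = 0,
            daily_reset_at = CURRENT_TIMESTAMP
        WHERE daily_reset_at IS NULL OR daily_reset_at < CURRENT_DATE
    `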
    
    _, err := bm.db.Exec(ctx, query)
    return err
}

// ResetMonthlyBudget : resets the monthly spend
func (bm *BudgetManager) ResetMonthlyBudget(ctx context.Context) error {
    // monthly_reset_at starts out NULL, so rows that were never reset must match too
    query := `
        UPDATE cost_budgets
        SET monthly_spent = 0,
            monthly_reset_at = CURRENT_TIMESTAMP
        WHERE monthly_reset_at IS NULL
           OR monthly_reset_at < date_trunc('month', CURRENT_DATE)
    `
    
    _, err := bm.db.Exec(ctx, query)
    return err
}

// GetBudget : fetches a tenant's budget
func (bm *BudgetManager) GetBudget(ctx context.Context, tenantID string) (*CostBudget, error) {
    // The budget columns are nullable; COALESCE maps NULL to 0,
    // which CheckBudget treats as "no limit"
    query := `
        SELECT id, tenant_id, COALESCE(daily_budget, 0), COALESCE(monthly_budget, 0),
               daily_spent, monthly_spent, warning_threshold,
               critical_threshold, is_enabled, max_requests_per_minute
        FROM cost_budgets
        WHERE tenant_id = $1
    `
    
    budget := &CostBudget{}
    err := bm.db.QueryRow(ctx, query, tenantID).Scan(
        &budget.ID,
        &budget.TenantID,
        &budget.DailyBudget,
        &budget.MonthlyBudget,
        &budget.DailySpent,
        &budget.MonthlySpent,
        &budget.WarningThreshold,
        &budget.CriticalThreshold,
        &budget.IsEnabled,
        &budget.MaxRequestsPerMinute,
    )
    
    return budget, err
}

// createAlert : records an alert
func (bm *BudgetManager) createAlert(ctx context.Context, tenantID, alertType, thresholdType string, currentSpent, budgetLimit float64) error {
    usagePercentage := (currentSpent / budgetLimit) * 100
    
    query := `
        INSERT INTO cost_alerts (
            tenant_id, alert_type, threshold_type,
            current_spent, budget_limit, usage_percentage,
            message
        ) VALUES ($1, $2, $3, $4, $5, $6, $7)
    `
    
    message := fmt.Sprintf(
        "Budget %s alert: $%.2f / $%.2f (%.1f%%)",
        alertType, currentSpent, budgetLimit, usagePercentage,
    )
    
    _, err := bm.db.Exec(ctx, query,
        tenantID, alertType, thresholdType,
        currentSpent, budgetLimit, usagePercentage,
        message,
    )
    
    return err
}
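
`CheckBudget` only raises a `warning` alert, although the schema also defines a `critical_threshold` and the alert types `critical` and `exceeded`. One way to cover all three levels is a small pure function that classifies spend against a budget. The name `alertLevel` and its signature are assumptions made for this sketch.

```go
package main

import "fmt"

// alertLevel classifies spend against a budget using percentage thresholds.
// It returns "exceeded", "critical", "warning", or "" when no alert applies.
func alertLevel(spent, budget float64, warningPct, criticalPct int) string {
	if budget <= 0 {
		return "" // no budget configured, nothing to compare against
	}
	switch {
	case spent > budget:
		return "exceeded"
	// Compare spent*100 with budget*pct to avoid a division.
	case spent*100 >= budget*float64(criticalPct):
		return "critical"
	case spent*100 >= budget*float64(warningPct):
		return "warning"
	}
	return ""
}

func main() {
	for _, spent := range []float64{50, 80, 95, 120} {
		fmt.Printf("spent=%.0f level=%q\n", spent, alertLevel(spent, 100, 80, 95))
	}
}
```

The returned string could be passed directly as the `alertType` argument of `createAlert`, with an empty string meaning that no alert is written.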

5. Middleware Integration

5.1 Cost Tracking Middleware

// internal/middleware/billing_middleware.go
package middleware

import (
    "context"
    "net/http"
    
    "practice-go-lang/internal/billing"
)

// BillingMiddleware : cost tracking middleware
func BillingMiddleware(
    tokenCounter *billing.TokenCounter,
    budgetManager *billing.BudgetManager,
) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            ctx := r.Context()
            
            // Extract the tenant ID
            tenantID := getTenantID(ctx)
            if tenantID == "" {
                next.ServeHTTP(w, r)
                return
            }
            
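            // Budget check. Note that any CheckBudget error, including a failed
            // budget lookup, is reported to the client as an exceeded budget.
            estimatedCost := 0.01 // minimum estimated cost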
            if err := budgetManager.CheckBudget(ctx, tenantID, estimatedCost); err != nil {
                http.Error(w, "Budget limit exceeded", http.StatusPaymentRequired)
                return
            }
            
            // Wrap the ResponseWriter (to capture token counts)
            bw := &billingWriter{
                ResponseWriter: w,
                tenantID:       tenantID,
                tokenCounter:   tokenCounter,
                budgetManager:  budgetManager,
            }
            
            next.ServeHTTP(bw, r)
        })
    }
}

// billingWriter : response wrapper
type billingWriter struct {
    http.ResponseWriter
    tenantID      string
    tokenCounter  *billing.TokenCounter
    budgetManager *billing.BudgetManager
}

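// Write : writes the response
func (bw *billingWriter) Write(b []byte) (int, error) {
    // Extracting token counts from the actual response still needs to be implemented;
    // this placeholder only passes the bytes through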
    
    return bw.ResponseWriter.Write(b)
}

func getTenantID(ctx context.Context) string {
    tenantID, _ := ctx.Value("tenant_id").(string)
    return tenantID
}
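
The `billingWriter.Write` method above leaves token extraction unimplemented. As a rough sketch of one approach, the wrapper can copy the body into a buffer and parse it once the handler has finished. This assumes the downstream handler returns a non-streaming JSON body with an OpenAI-style `usage` object (`prompt_tokens`, `completion_tokens`); `captureWriter` and `extractUsage` are hypothetical names, and streamed responses would need different handling.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// usagePayload mirrors the "usage" object of an OpenAI-style completion response.
type usagePayload struct {
	Usage struct {
		PromptTokens     int `json:"prompt_tokens"`
		CompletionTokens int `json:"completion_tokens"`
	} `json:"usage"`
}

// captureWriter forwards every write and keeps a copy of the body for later parsing.
type captureWriter struct {
	http.ResponseWriter
	body bytes.Buffer
}

func (cw *captureWriter) Write(b []byte) (int, error) {
	cw.body.Write(b) // bytes.Buffer.Write never returns an error
	return cw.ResponseWriter.Write(b)
}

// extractUsage returns the input and output token counts found in body.
// ok is false when the body is not JSON or carries no token counts.
func extractUsage(body []byte) (in, out int, ok bool) {
	var p usagePayload
	if err := json.Unmarshal(body, &p); err != nil {
		return 0, 0, false
	}
	if p.Usage.PromptTokens == 0 && p.Usage.CompletionTokens == 0 {
		return 0, 0, false
	}
	return p.Usage.PromptTokens, p.Usage.CompletionTokens, true
}

func main() {
	cw := &captureWriter{ResponseWriter: httptest.NewRecorder()}
	// Simulate a handler that writes its JSON response in two chunks.
	cw.Write([]byte(`{"usage":{"prompt_tokens":120,`))
	cw.Write([]byte(`"completion_tokens":30}}`))

	in, out, ok := extractUsage(cw.body.Bytes())
	fmt.Println(in, out, ok)
}
```

In the middleware, the parsing step would run after `next.ServeHTTP` returns, and the resulting counts would feed `RecordUsage` and `UpdateSpent`. Buffering the full body costs memory, so a size cap may be worth adding for large responses.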

6. Cost Reports

6.1 Daily Report

// internal/billing/reporter.go
package billing

import (
    "context"
    "time"
)

// DailyReport : daily cost report
type DailyReport struct {
    TenantID      string             `json:"tenant_id"`
    Date          string             `json:"date"`
    TotalCost     float64            `json:"total_cost"`
    TotalTokens   int                `json:"total_tokens"`
    RequestCount  int                `json:"request_count"`
    ByService     map[string]float64 `json:"by_service"`
    ByModel       map[string]float64 `json:"by_model"`
    TopUsers      []UserCost         `json:"top_users"`
}

// UserCost : cost per user
type UserCost struct {
    UserID string  `json:"user_id"`
    Cost   float64 `json:"cost"`
}

// Reporter : cost reporter
type Reporter struct {
    tokenCounter *TokenCounter
}

func NewReporter(tc *TokenCounter) *Reporter {
    return &Reporter{tokenCounter: tc}
}

// GenerateDailyReport : builds the daily report
func (r *Reporter) GenerateDailyReport(ctx context.Context, tenantID string, date time.Time) (*DailyReport, error) {
    query := `
        SELECT 
            COALESCE(SUM(total_cost), 0) as total_cost,
            COALESCE(SUM(total_tokens), 0) as total_tokens,
            COUNT(*) as request_count
        FROM token_usage
        WHERE tenant_id = $1
          AND DATE(created_at) = $2
    `
    
    report := &DailyReport{
        TenantID: tenantID,
        Date:     date.Format("2006-01-02"),
    }
    
    err := r.tokenCounter.db.QueryRow(ctx, query, tenantID, date).Scan(
        &report.TotalCost,
        &report.TotalTokens,
        &report.RequestCount,
    )
    if err != nil {
        return nil, err
    }
    
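    // Cost by service
    if report.ByService, err = r.getByService(ctx, tenantID, date); err != nil {
        return nil, err
    }
    
    // Cost by model
    if report.ByModel, err = r.getByModel(ctx, tenantID, date); err != nil {
        return nil, err
    }
    
    // Top users
    if report.TopUsers, err = r.getTopUsers(ctx, tenantID, date, 10); err != nil {
        return nil, err
    }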
    
    return report, nil
}

// getByService : cost by service
func (r *Reporter) getByService(ctx context.Context, tenantID string, date time.Time) (map[string]float64, error) {
    query := `
        SELECT service_type, SUM(total_cost)
        FROM token_usage
        WHERE tenant_id = $1 AND DATE(created_at) = $2
        GROUP BY service_type
    `
    
    rows, err := r.tokenCounter.db.Query(ctx, query, tenantID, date)
    if err != nil {
        return nil, err
    }
    defer rows.Close()
    
    result := make(map[string]float64)
    for rows.Next() {
        var serviceType string
        var cost float64
        if err := rows.Scan(&serviceType, &cost); err != nil {
            return nil, err
        }
        result[serviceType] = cost
    }
    if err := rows.Err(); err != nil {
        return nil, err
    }
    
    return result, nil
}

// getByModel : cost by model
func (r *Reporter) getByModel(ctx context.Context, tenantID string, date time.Time) (map[string]float64, error) {
    query := `
        SELECT model_name, SUM(total_cost)
        FROM token_usage
        WHERE tenant_id = $1 AND DATE(created_at) = $2
        GROUP BY model_name
    `
    
    rows, err := r.tokenCounter.db.Query(ctx, query, tenantID, date)
    if err != nil {
        return nil, err
    }
    defer rows.Close()
    
    result := make(map[string]float64)
    for rows.Next() {
        var modelName string
        var cost float64
        if err := rows.Scan(&modelName, &cost); err != nil {
            return nil, err
        }
        result[modelName] = cost
    }
    if err := rows.Err(); err != nil {
        return nil, err
    }
    
    return result, nil
}

// getTopUsers : users with the highest cost
func (r *Reporter) getTopUsers(ctx context.Context, tenantID string, date time.Time, limit int) ([]UserCost, error) {
    // user_id is nullable, so COALESCE keeps the scan into a Go string from failing
    query := `
        SELECT COALESCE(user_id, ''), SUM(total_cost) as cost
        FROM token_usage
        WHERE tenant_id = $1 AND DATE(created_at) = $2
        GROUP BY user_id
        ORDER BY cost DESC
        LIMIT $3
    `
    
    rows, err := r.tokenCounter.db.Query(ctx, query, tenantID, date, limit)
    if err != nil {
        return nil, err
    }
    defer rows.Close()
    
    var users []UserCost
    for rows.Next() {
        var uc UserCost
        if err := rows.Scan(&uc.UserID, &uc.Cost); err != nil {
            return nil, err
        }
        users = append(users, uc)
    }
    if err := rows.Err(); err != nil {
        return nil, err
    }
    
    return users, nil
}
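
The report queries filter with `DATE(created_at) = $2`, which typically prevents PostgreSQL from using the `(tenant_id, created_at)` index. A common alternative is a half-open time range. The helpers below are a sketch under that assumption; `dayBounds` and `boundsFor` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// dayBounds returns the half-open interval [start, end) that covers the
// calendar day of t in t's own location.
func dayBounds(t time.Time) (start, end time.Time) {
	y, m, d := t.Date()
	start = time.Date(y, m, d, 0, 0, 0, 0, t.Location())
	end = start.AddDate(0, 0, 1)
	return start, end
}

// boundsFor parses a "2006-01-02" date (interpreted as UTC) and returns
// both bounds formatted as RFC 3339 strings.
func boundsFor(date string) (string, string, error) {
	t, err := time.Parse("2006-01-02", date)
	if err != nil {
		return "", "", err
	}
	start, end := dayBounds(t)
	return start.Format(time.RFC3339), end.Format(time.RFC3339), nil
}

func main() {
	start, end := dayBounds(time.Date(2024, 12, 13, 15, 4, 5, 0, time.UTC))
	fmt.Println(start.Format(time.RFC3339), end.Format(time.RFC3339))
	// The two bounds would be bound as $2 and $3 in a query such as:
	//   WHERE tenant_id = $1 AND created_at >= $2 AND created_at < $3
}
```

Using `AddDate` rather than adding 24 hours keeps the range correct on days with a daylight saving transition.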

7. API Endpoints

7.1 Cost Query API

// internal/handlers/billing_handler.go
package handlers

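import (
    "context"
    "encoding/json"
    "net/http"
    "time"
    
    "practice-go-lang/internal/billing"
)

// getTenantID : reads the tenant ID from the request context.
// The copy in the middleware package is unexported, so this package needs
// its own (assuming no other file in this package already defines it).
func getTenantID(ctx context.Context) string {
    tenantID, _ := ctx.Value("tenant_id").(string)
    return tenantID
}

// BillingHandler : cost query handler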
type BillingHandler struct {
    tokenCounter  *billing.TokenCounter
    budgetManager *billing.BudgetManager
    reporter      *billing.Reporter
}

func NewBillingHandler(tc *billing.TokenCounter, bm *billing.BudgetManager, r *billing.Reporter) *BillingHandler {
    return &BillingHandler{
        tokenCounter:  tc,
        budgetManager: bm,
        reporter:      r,
    }
}

// GetDailyUsage : GET /api/billing/daily
func (h *BillingHandler) GetDailyUsage(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    tenantID := getTenantID(ctx)
    
    cost, err := h.tokenCounter.GetDailyUsage(ctx, tenantID)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(map[string]interface{}{
        "tenant_id":  tenantID,
        "date":       time.Now().Format("2006-01-02"),
        "total_cost": cost,
    })
}

// GetMonthlyUsage : GET /api/billing/monthly
func (h *BillingHandler) GetMonthlyUsage(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    tenantID := getTenantID(ctx)
    
    cost, err := h.tokenCounter.GetMonthlyUsage(ctx, tenantID)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(map[string]interface{}{
        "tenant_id":  tenantID,
        "month":      time.Now().Format("2006-01"),
        "total_cost": cost,
    })
}

// GetDailyReport : GET /api/billing/report/daily
func (h *BillingHandler) GetDailyReport(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    tenantID := getTenantID(ctx)
    
    // Date parameter (default: today)
    dateStr := r.URL.Query().Get("date")
    var date time.Time
    if dateStr == "" {
        date = time.Now()
    } else {
        var err error
        date, err = time.Parse("2006-01-02", dateStr)
        if err != nil {
            http.Error(w, "Invalid date format", http.StatusBadRequest)
            return
        }
    }
    
    report, err := h.reporter.GenerateDailyReport(ctx, tenantID, date)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(report)
}

// GetBudget : GET /api/billing/budget
func (h *BillingHandler) GetBudget(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    tenantID := getTenantID(ctx)
    
    budget, err := h.budgetManager.GetBudget(ctx, tenantID)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(budget)
}
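
The handlers above still need to be mounted on a router, which the original does not show. A minimal sketch with the standard library `http.ServeMux` follows. `newBillingMux`, `getOnly`, `okHandler`, and `statusFor` are hypothetical helpers, and `okHandler` merely stands in for the real `BillingHandler` methods so that the example runs without a database.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// getOnly rejects every method except GET with 405 Method Not Allowed.
func getOnly(h http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		h(w, r)
	}
}

// newBillingMux mounts the four billing endpoints on a standard ServeMux.
func newBillingMux(daily, monthly, report, budget http.HandlerFunc) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/billing/daily", getOnly(daily))
	mux.HandleFunc("/api/billing/monthly", getOnly(monthly))
	mux.HandleFunc("/api/billing/report/daily", getOnly(report))
	mux.HandleFunc("/api/billing/budget", getOnly(budget))
	return mux
}

// okHandler is a placeholder standing in for the real BillingHandler methods.
func okHandler(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// statusFor sends one in-memory request and returns the response status code.
func statusFor(h http.Handler, method, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	mux := newBillingMux(okHandler, okHandler, okHandler, okHandler)
	fmt.Println(statusFor(mux, http.MethodGet, "/api/billing/daily"))
	fmt.Println(statusFor(mux, http.MethodPost, "/api/billing/budget"))
}
```

In a real service the four arguments would be `h.GetDailyUsage`, `h.GetMonthlyUsage`, `h.GetDailyReport`, and `h.GetBudget`, and the mux would sit behind whatever middleware places the tenant ID in the request context.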

8. Cost Optimization Strategies

8.1 Caching Strategy

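// Embedding caching: 80% savings
// Before: $1000 (100 calls)
// After:  $200  (80% cache hit rate)

// Redis cache key strategy (needs the "crypto/md5", "fmt", and "time" imports)
sum := md5.Sum([]byte(text))
key := fmt.Sprintf("emb:%x", sum)
ttl := 30 * 24 * time.Hour // 30 days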
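
The key strategy above can be tried out without Redis. The following sketch uses an in-memory map as a stand-in for the cache and SHA-256 instead of MD5 for the key hash; both substitutions are assumptions made for this example, as are the names `embeddingCache`, `embedFunc`, and `cacheKey`. TTL handling is omitted.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// embedFunc is a hypothetical signature for whatever computes an embedding.
type embedFunc func(text string) []float32

// embeddingCache is an in-memory stand-in for Redis. It is not safe for
// concurrent use; a real implementation would need a mutex or an external store.
type embeddingCache struct {
	store  map[string][]float32
	hits   int
	misses int
}

func newEmbeddingCache() *embeddingCache {
	return &embeddingCache{store: make(map[string][]float32)}
}

// cacheKey derives a fixed-length key from the text, following the "emb:<hash>" scheme.
func cacheKey(text string) string {
	sum := sha256.Sum256([]byte(text))
	return "emb:" + hex.EncodeToString(sum[:])
}

// get returns the cached embedding, calling compute only on a cache miss.
func (c *embeddingCache) get(text string, compute embedFunc) []float32 {
	key := cacheKey(text)
	if v, ok := c.store[key]; ok {
		c.hits++
		return v
	}
	c.misses++
	v := compute(text)
	c.store[key] = v
	return v
}

func main() {
	calls := 0
	embed := func(text string) []float32 {
		calls++ // each call here would be a paid API request
		return []float32{float32(len(text))}
	}

	c := newEmbeddingCache()
	for _, t := range []string{"a", "b", "a", "a", "b"} {
		c.get(t, embed)
	}
	fmt.Println("paid calls:", calls, "hits:", c.hits, "misses:", c.misses)
}
```

Only cache misses reach the paid API, so the achievable savings track the hit rate of the real workload.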

8.2 Prompt Optimization

Bad Prompt (500 tokens):
"I need you to analyze this document thoroughly and provide 
a comprehensive summary that covers all the main points, 
including background information, key findings, methodology, 
results, and conclusions. Please make sure to be very detailed..."

Good Prompt (100 tokens):
"Summarize: background, findings, methodology, results, conclusions"

Savings: 80% (500 → 100 tokens)

8.3 Model Selection

Best model for each task:

Simple classification: GPT-3.5 Turbo
Complex reasoning: GPT-4 Turbo
Bulk processing: Claude Haiku
Summarization: Claude Sonnet

Example:
GPT-4: $0.03/1K input → GPT-3.5: $0.0005/1K
Savings: about 98% on input tokens!
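
The task-to-model mapping above could be expressed as a small routing table. In this sketch the `taskKind` values, the mapping to pricing-table keys such as `claude-3-haiku`, and the fallback choice are all assumptions made for illustration.

```go
package main

import "fmt"

// taskKind is a hypothetical label for the kind of work a request performs.
type taskKind string

const (
	taskClassify  taskKind = "classification"
	taskReason    taskKind = "reasoning"
	taskBulk      taskKind = "bulk"
	taskSummarize taskKind = "summarization"
)

// modelByTask maps each task kind to a pricing-table key, following the list above.
var modelByTask = map[taskKind]string{
	taskClassify:  "gpt-3.5-turbo",
	taskReason:    "gpt-4-turbo",
	taskBulk:      "claude-3-haiku",
	taskSummarize: "claude-3-sonnet",
}

// pickModel returns the model for a task kind.
// Assumption: unknown kinds fall back to the inexpensive gpt-3.5-turbo.
func pickModel(kind taskKind) string {
	if m, ok := modelByTask[kind]; ok {
		return m
	}
	return "gpt-3.5-turbo"
}

// savingsPercent returns how much cheaper `after` is than `before`, in percent.
func savingsPercent(before, after float64) float64 {
	if before <= 0 {
		return 0
	}
	return (before - after) / before * 100
}

func main() {
	fmt.Println(pickModel(taskBulk))
	// Input price per 1K tokens: gpt-4 versus gpt-3.5-turbo.
	fmt.Printf("%.1f%%\n", savingsPercent(0.03, 0.0005))
}
```

Keeping the mapping in one table makes it easy to change the model for a task without touching the call sites.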

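Key Takeaways

Cost Tracking Checklist

Database

  • token_usage table
  • cost_budgets table
  • cost_alerts table

Core Components

  • TokenCounter (usage recording)
  • BudgetManager (budget management)
  • Reporter (report generation)

Optimization Strategies

  • Embedding caching (80% savings)
  • Prompt optimization (50% savings)
  • Model selection (60% savings)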

API Endpoints

  • GET /api/billing/daily
  • GET /api/billing/monthly
  • GET /api/billing/report/daily
  • GET /api/billing/budget

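Cost Savings

| Optimization | Before | After | Savings |
|--------------|--------|-------|---------|
| Embedding caching | $1000 | $200 | 80% |
| Prompt shortening | $500 | $250 | 50% |
| Model change | $1000 | $400 | 60% |
| Batch processing | $300 | $210 | 30% |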

Written: 2024-12-13

This article is licensed by its author under CC BY 4.0.