
The False Choice: Why 'Cost vs. Differentiation' Thinking Guarantees AI Commoditization

Tagged as: ai strategy economics cost-analysis

Posted on Aug 1, 2025

27 min read

Enterprises are being told they must choose between competing on cost or differentiation in AI. This essay argues that the distinction is false: full AI outsourcing makes your costs volatile and eliminates any path to strategic advantage. Long-term winners will own their AI capabilities, not rent them. Now with computational proof.

Preliminaries

This is a Literate Haskell essay: every line of program code in this article has been checked by the Haskell compiler, and every example and calculation has been verified computationally.

To make this a proper Haskell file, we need some language extensions and imports:

  {-# LANGUAGE DeriveGeneric #-}
  {-# LANGUAGE NumericUnderscores #-}
  {-# LANGUAGE OverloadedStrings #-}
  {-# LANGUAGE RecordWildCards #-}

  module OutsourcedAI where

  import GHC.Generics (Generic)
  import Text.Printf (printf)

The Commoditization Trap: When Everyone Uses the Same APIs

The most dangerous assumption in enterprise AI strategy is that you must choose between competing on cost or on differentiation. This false dichotomy misses how competitive advantage actually works in the AI era. In reality, predictable unit cost and pricing power are themselves differentiators once you control your AI stack.

Let's be precise and computational about this. First, let's define our cost models:

  -- | API pricing structure (per 1k tokens)
  data APIProvider = APIProvider
    { providerName :: String,
      inputPrice :: Double, -- $ per 1k input tokens
      outputPrice :: Double, -- $ per 1k output tokens
      reliability :: Double -- uptime percentage
    }
    deriving (Generic, Show)

  -- Current market rates (as of Aug 2025)
  openAI_o3 :: APIProvider
  openAI_o3 =
    APIProvider
      { providerName = "OpenAI o3",
        inputPrice = 0.060,
        outputPrice = 0.240,
        reliability = 0.995
      }

  anthropic_sonnet :: APIProvider
  anthropic_sonnet =
    APIProvider
      { providerName = "Anthropic Sonnet",
        inputPrice = 0.003,
        outputPrice = 0.015,
        reliability = 0.998
      }

  -- | Self-hosted infrastructure costs
  data SelfHostedSetup = SelfHostedSetup
    { gpuNodes :: Int, -- number of GPU nodes
      costPerNodeHour :: Double, -- $ per node per hour
      utilizationRate :: Double, -- fraction of capacity used
      modelParams :: Integer, -- model size in parameters
      tokensPerSecond :: Double -- throughput per node
    }
    deriving (Generic, Show)

  -- 8x H100 setup example
  h100_8node :: SelfHostedSetup
  h100_8node =
    SelfHostedSetup
      { gpuNodes = 8,
        costPerNodeHour = 25.0, -- rough cloud cost per H100 node
        utilizationRate = 0.75,
        modelParams = 15_000_000_000, -- 15B parameter model
        tokensPerSecond = 1_500 -- conservative estimate
      }

Now let's calculate the real costs. By "cost," I mean unit economics (¢/1k tokens after amortizing talent and GPU capex), pricing volatility, and vendor dependency. By "differentiation," I mean real, defensible moats: proprietary data flywheels, latency guarantees, compliance controls, and domain accuracy your competitors can't just buy.

  -- | Calculate cost per 1k tokens for API usage
  -- >>> apiCost openAI_o3 2000 1000
  -- 0.36
  -- >>> apiCost anthropic_sonnet 2000 1000
  -- 2.0999999999999998e-2
  apiCost :: APIProvider -> Int -> Int -> Double
  apiCost APIProvider{..} inputTokens outputTokens =
    (fromIntegral inputTokens / 1000) * inputPrice +
    (fromIntegral outputTokens / 1000) * outputPrice

  -- | Calculate hourly operating cost for self-hosted setup
  hourlyCost :: SelfHostedSetup -> Double
  hourlyCost SelfHostedSetup{..} = 
    fromIntegral gpuNodes * costPerNodeHour * utilizationRate

  -- | Calculate tokens processed per hour for self-hosted setup
  tokensPerHour :: SelfHostedSetup -> Double
  tokensPerHour SelfHostedSetup{..} = 
    fromIntegral gpuNodes * tokensPerSecond * 3600 * utilizationRate

  -- | Cost per 1k tokens for self-hosted (excluding fine-tuning amortization)
  -- >>> selfHostedCost h100_8node
  -- 4.629629629629629e-3
  selfHostedCost :: SelfHostedSetup -> Double
  selfHostedCost setup = 
    let hourly = hourlyCost setup
        tokens = tokensPerHour setup
    in (hourly / tokens) * 1000

  -- Example calculations
  -- >>> exampleCosts
  -- === Cost Comparison ===
  -- OpenAI o3 (2k input, 1k output): $0.3600
  -- Anthropic Sonnet (2k input, 1k output): $0.0210
  -- 8x H100 Self-hosted (per 1k tokens): $0.0046
  exampleCosts :: IO ()
  exampleCosts = do
    putStrLn "=== Cost Comparison ==="
    printf "OpenAI o3 (2k input, 1k output): $%.4f\n"
      (apiCost openAI_o3 2000 1000)
    printf "Anthropic Sonnet (2k input, 1k output): $%.4f\n"
      (apiCost anthropic_sonnet 2000 1000)
    printf "8x H100 Self-hosted (per 1k tokens): $%.4f\n"
      (selfHostedCost h100_8node)
    putStrLn ""

The Myth of Stable API Pricing

Betting on permanently low or even shrinking API costs is the same mistake companies made building around free Google services or subsidized AWS. Current AI API prices are propped up by venture capital, not sustainable economics.

Let's model the pricing volatility:

  -- | Model pricing changes over time
  data PricingScenario = PricingScenario
    { scenarioName :: String
    , priceMultipliers :: [Double]  -- monthly price changes
    } deriving (Generic, Show)

  -- Historical examples of pricing volatility
  cursorCollapse :: PricingScenario
  cursorCollapse = PricingScenario "Cursor Collapse (June 2025)" 
    [1.0, 1.0, 1.0, 1.0, 1.0, 7.5]  -- 7.5x price spike

  openAI_o3_volatility :: PricingScenario  
  openAI_o3_volatility = PricingScenario "OpenAI o3 Volatility" 
    [1.0, 0.2, 0.8, 1.2, 1.5, 1.8]  -- 80% cut then gradual increases

  -- | Calculate cumulative cost over time with pricing changes
  cumulativeCost :: APIProvider -> PricingScenario -> [Int] -> [Int] -> [Double]
  cumulativeCost provider PricingScenario{..} inputTokensList outputTokensList =
    let baseCosts = zipWith (apiCost provider) inputTokensList outputTokensList
        adjustedCosts = zipWith (*) baseCosts priceMultipliers
    in scanl1 (+) adjustedCosts

  -- | Monthly usage scenario (typical enterprise workload)
  monthlyUsage :: ([Int], [Int])  -- (input tokens, output tokens) per month
  monthlyUsage = (replicate 6 (50_000_000), replicate 6 (25_000_000))  -- 50M input, 25M output per month

The numbers bear this out: OpenAI reportedly burns $8B per year, even with $12B in revenue. Every token they serve still costs more than what they charge customers. Meanwhile, Anthropic is losing an estimated $3B on $4B revenue. Both companies are desperate for scale and market lock-in.

These are classic "race to the bottom" market penetration plays. Once you've eliminated your internal options, you're stuck, and then the rent goes up.

Recent examples, like the Cursor repricing of June 2025 and the o3 price swings modeled above, demonstrate this volatility. The sketch below puts a dollar figure on it.
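Here is a small, illustrative helper (a sketch, not part of the analysis functions above) that pushes the monthlyUsage workload through cumulativeCost, comparing six stable months against six months ending in a Cursor-style repricing:

  -- | Illustrative sketch: six-month cumulative spend for the monthlyUsage
  -- workload, with and without a Cursor-style repricing in the final month.
  volatilityDemo :: IO ()
  volatilityDemo = do
    putStrLn "=== Cumulative Spend Under Pricing Volatility ==="
    let (inputs, outputs) = monthlyUsage
        stable = PricingScenario "Stable pricing" (replicate 6 1.0)
        stableSpend = last (cumulativeCost anthropic_sonnet stable inputs outputs)
        spikedSpend = last (cumulativeCost anthropic_sonnet cursorCollapse inputs outputs)
    printf "6-month spend, stable prices:      $%.0f\n" stableSpend
    printf "6-month spend, Cursor-style spike: $%.0f\n" spikedSpend
    putStrLn ""

Even a single repricing in the final month inflates the six-month bill substantially, and nothing in an API contract prevents it.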

Three Capability Tiers: The Strategic Spectrum

It's not all-or-nothing. Enterprises have three main options:

| Tier | Approach | Unit Economics | Differentiation Potential | Strategic Control |
|------|----------|----------------|---------------------------|-------------------|
| Commodity API | Pure outsourcing | $0.002–$0.01/1k tokens (volatile) | Minimal (UI/UX only) | None |
| Hybrid | External + custom fine-tuned models | Blended, but rising | Moderate (narrow use cases) | Limited |
| Own stack | Internal models | $0.0005–$0.001/1k tokens (predictable) | High (proprietary moats) | Full |

The real question: at what scale or business criticality does internal investment become mandatory? Once your usage approaches the break-even volumes computed below (a few hundred million tokens per month against premium APIs, several billion against budget ones), or when the workload is core to your product, in-house is almost always cheaper, more secure, and more flexible.

Let's quantify the three main enterprise options:

  data CapabilityTier = CapabilityTier
    { tierName :: String
    , unitCostRange :: (Double, Double)  -- min, max cost per 1k tokens
    , volatilityFactor :: Double         -- pricing volatility multiplier
    , differentiationScore :: Int        -- 1-10 scale
    , strategicControl :: Int            -- 1-10 scale
    } deriving (Generic, Show)

  commodityAPI :: CapabilityTier
  commodityAPI = CapabilityTier { tierName = "Commodity API", unitCostRange = (0.002, 0.01), volatilityFactor = 3.0, differentiationScore = 2, strategicControl = 1 }

  hybridApproach :: CapabilityTier  
  hybridApproach = CapabilityTier { tierName = "Hybrid", unitCostRange = (0.001, 0.005), volatilityFactor = 1.5, differentiationScore = 6, strategicControl = 4 }

  ownStack :: CapabilityTier
  ownStack = CapabilityTier { tierName = "Own Stack", unitCostRange = (0.0005, 0.001), volatilityFactor = 1.0, differentiationScore = 9, strategicControl = 10 }
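The tier values above become more telling once you use them. As a rough sketch, the helper below combines each tier's price ceiling with its volatility factor into a worst-case effective unit cost:

  -- | Rough sketch: worst-case effective unit cost per tier, taking the top of
  -- the price range and scaling it by the volatility factor.
  worstCaseUnitCost :: CapabilityTier -> Double
  worstCaseUnitCost CapabilityTier {..} = snd unitCostRange * volatilityFactor

  tierRiskAnalysis :: IO ()
  tierRiskAnalysis = do
    putStrLn "=== Worst-Case Effective Unit Cost by Tier ==="
    mapM_
      ( \tier ->
          printf "%-13s: up to $%.3f per 1k tokens\n" (tierName tier) (worstCaseUnitCost tier)
      )
      [commodityAPI, hybridApproach, ownStack]
    putStrLn ""

The gap is stark: the commodity tier's ceiling, multiplied by its volatility, lands roughly 30x above the owned stack's.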

  -- | Calculate break-even volume in tokens per month
  breakEvenVolume :: APIProvider -> SelfHostedSetup -> Double
  breakEvenVolume api selfHosted =
    let apiCostPer1k = apiCost api 2000 1000  -- assuming 2:1 input:output ratio
        selfHostedCostPer1k = selfHostedCost selfHosted
        monthlyOperatingCost = hourlyCost selfHosted * 24 * 30  -- monthly hours
    in (monthlyOperatingCost / (apiCostPer1k - selfHostedCostPer1k)) * 1000

  -- | Example break-even analysis
  -- >>> breakEvenAnalysis
  -- === Break-Even Analysis ===
  -- Break-even vs OpenAI o3: 303908286 tokens/month (304M tokens)
  -- Break-even vs Anthropic Sonnet: 6597285068 tokens/month (6597M tokens)
  breakEvenAnalysis :: IO ()
  breakEvenAnalysis = do
    putStrLn "=== Break-Even Analysis ==="
    let openAI_breakeven = breakEvenVolume openAI_o3 h100_8node
    let anthropic_breakeven = breakEvenVolume anthropic_sonnet h100_8node
    printf "Break-even vs OpenAI o3: %.0f tokens/month (%.0fM tokens)\n" 
      openAI_breakeven (openAI_breakeven / 1_000_000)
    printf "Break-even vs Anthropic Sonnet: %.0f tokens/month (%.0fM tokens)\n" 
      anthropic_breakeven (anthropic_breakeven / 1_000_000)

When Outsourcing Makes Sense (And When It Doesn't)

Outsource when:

- you're prototyping or validating demand and volume is small (well under ~10M tokens/month),
- the workload is not mission critical, and
- you have no hard compliance or latency constraints.

Own when:

- volume is sustained and growing (tens of millions of tokens per month and up),
- the workload is core to your product or subject to compliance requirements, or
- latency and unit-cost predictability directly affect your margins.

Prompt chains and API glue are easy to copy. True advantage comes from owning what others can't buy: your data, your models, your systems.

Let's create a decision framework based on usage patterns:

  -- | Usage pattern classification
  data UsagePattern = UsagePattern
    { monthlyTokens :: Integer,
      criticalityScore :: Int, -- 1-10 how mission critical
      complianceNeeds :: Bool,
      latencyRequirements :: Double -- max acceptable latency in ms
    }
    deriving (Generic, Show)

  -- | Decision recommendation
  data Recommendation = Outsource | Hybrid | SelfHost
    deriving (Show, Eq)

  -- | Strategic decision function
  -- >>> recommendStrategy startupPrototype
  -- Outsource
  -- >>> recommendStrategy enterpriseCore
  -- SelfHost
  -- >>> recommendStrategy midSizeProduct
  -- Hybrid
  recommendStrategy :: UsagePattern -> Recommendation
  recommendStrategy UsagePattern {..}
    | monthlyTokens < 10_000_000 && criticalityScore < 5 = Outsource
    | monthlyTokens > 50_000_000 || criticalityScore >= 8 || complianceNeeds = SelfHost
    | otherwise = Hybrid

  -- Example usage patterns
  startupPrototype :: UsagePattern
  startupPrototype = UsagePattern 1_000_000 3 False 2000

  enterpriseCore :: UsagePattern
  enterpriseCore = UsagePattern 100_000_000 9 True 100

  midSizeProduct :: UsagePattern
  midSizeProduct = UsagePattern 25_000_000 6 False 500

  -- Strategic analysis
  strategyAnalysis :: IO ()
  strategyAnalysis = do
    putStrLn "=== Strategic Recommendations ==="
    printf
      "Startup prototype (%dM tokens): %s\n"
      (monthlyTokens startupPrototype `div` 1_000_000)
      (show $ recommendStrategy startupPrototype)
    printf
      "Enterprise core (%dM tokens): %s\n"
      (monthlyTokens enterpriseCore `div` 1_000_000)
      (show $ recommendStrategy enterpriseCore)
    printf
      "Mid-size product (%dM tokens): %s\n"
      (monthlyTokens midSizeProduct `div` 1_000_000)
      (show $ recommendStrategy midSizeProduct)
    putStrLn ""

The Volume Paradox: A Self-Fulfilling Dead End

By the way, waiting for "sufficient volume" before building internal AI is a classic strategic blunder. You don't get breakthrough applications by waiting. You create them by investing in capability.

No serious company enters an emerging tech field with "we'll wait until the volumes justify it." The volumes never justify it at the start. You create the volume by shipping something the market wants.

Competitors who build internal capability, even modestly, discover new applications, tune their stack, and steadily widen the gap. API-only companies get stuck forever at "commodity" status.

Let's model the compound advantage of early internal investment:

  -- | Model capability development over time
  data CapabilityGrowth = CapabilityGrowth
    { months :: Int,
      efficiency :: Double, -- cost reduction factor per month
      innovation :: Double, -- new capability discovery rate
      compoundAdvantage :: Double -- cumulative strategic advantage
    }
    deriving (Generic, Show)

  -- | Calculate compound advantage over time
  compoundGrowth :: Int -> CapabilityGrowth
  compoundGrowth n =
    CapabilityGrowth
      { months = n,
        efficiency = 1.0 - (0.02 * fromIntegral n), -- 2% cost reduction per month
        innovation = 0.1 * fromIntegral n, -- linear innovation growth
        compoundAdvantage = (1.15 ** fromIntegral n) - 1 -- 15% compound advantage per month
      }

  -- | Compare API-only vs internal investment trajectories
  trajectoryComparison :: IO ()
  trajectoryComparison = do
    putStrLn "=== 12-Month Capability Trajectory ==="
    putStrLn "Month | API-Only Cost | Internal Cost | Advantage Gap"
    putStrLn "------|---------------|---------------|---------------"
    mapM_
      ( \m ->
          let growth = compoundGrowth m
              apiCost = (1.0 :: Double) + (0.05 * fromIntegral m) -- 5% API price increase per month
              internalCost = efficiency growth
              gap = compoundAdvantage growth
          in printf "%5d | %12.2f | %12.2f | %12.1f%%\n" m apiCost internalCost (gap * 100)
      )
      [1 .. 12]
    putStrLn ""

Strategic AI Ownership: The Measured Approach

The smart play is targeted internal investment:

- a small ML team (a handful of engineers, not a research lab),
- modest dedicated GPU capacity for fine-tuning, evaluation, and serving, and
- your own data pipelines, so you can adapt and govern models on your terms.

These moves are not expensive compared to the risk of total dependency. They buy you what APIs can never provide: control, stability, and a chance at real, future-proof advantage.

What about open-source models hosted by Groq, OpenRouter, or Cerebras?

They're a useful middle step: cheaper and less restrictive than proprietary APIs, but they don't eliminate pricing risk or the need for in-house expertise. You still pay per token, your latency depends on someone else's cluster, and your differentiation hinges on how well you can fine-tune and govern the model.
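To see where such services sit in the cost model, we can slot an illustrative hosted open-weights endpoint into the same APIProvider type; the prices below are placeholders, not quotes from any particular vendor:

  -- Illustrative hosted open-weights endpoint; prices are placeholders, not quotes
  hostedOpenWeights :: APIProvider
  hostedOpenWeights =
    APIProvider
      { providerName = "Hosted open-weights (illustrative)",
        inputPrice = 0.001,
        outputPrice = 0.003,
        reliability = 0.99
      }

With placeholder numbers like these, feeding hostedOpenWeights into breakEvenVolume pushes the pure-cost break-even far beyond the figures computed for proprietary APIs, which is the point: against hosted open weights, the case for ownership rests on pricing risk, latency, and differentiation, not raw unit price.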

Open weight checkpoints are raw clay. If you don't have internal "sculptors" (i.e., capable ML engineers and data people), you're just working with the same tools as everyone else. The result? Another commodity.

As for "training from scratch"?

For 99% of companies, it's not viable and never was. The ongoing flood of high-quality open-source models (especially from China) means you almost never need to start from zero. The real winners will be those who can adapt, govern, and build atop these models faster and better than the rest. That takes internal talent and infrastructure, not just a fat OpenRouter bill.

A concrete example from our shop, which builds the Apriel model family: we don't launch 20-trillion-token pre-training runs. Instead, we perform surgical upgrades on existing models: replacing quadratic attention with more efficient alternatives such as linear attention, pruning non-essential layers, and swapping next-token prediction for multi-token prediction or diffusion. These tweaks reduce both serving cost and inference latency, but they are only possible because we own the training stack, control the data pipelines, and maintain dedicated GPU capacity for experimentation.

The key insight: incremental improvements compound. Each optimization builds on the last, creating a widening performance gap that API-dependent competitors can't bridge.
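As a rough sketch with made-up multipliers (illustrative assumptions, not Apriel measurements), we can model how a stack of such surgical changes compounds on the self-hosted unit cost:

  -- | Rough sketch: each architectural change is modeled as a multiplier on
  -- serving cost. The multipliers are illustrative assumptions, not measurements.
  data ArchTweak = ArchTweak
    { tweakName :: String,
      costMultiplier :: Double -- relative serving cost after the change
    }
    deriving (Generic, Show)

  surgicalUpgrades :: [ArchTweak]
  surgicalUpgrades =
    [ ArchTweak "Linear attention at long context" 0.7,
      ArchTweak "Layer pruning" 0.85,
      ArchTweak "Multi-token prediction" 0.6
    ]

  -- | Compounded serving cost per 1k tokens after applying a list of tweaks
  surgicalServingCost :: SelfHostedSetup -> [ArchTweak] -> Double
  surgicalServingCost setup tweaks =
    selfHostedCost setup * product (map costMultiplier tweaks)

Under these assumptions, surgicalServingCost h100_8node surgicalUpgrades cuts the per-token cost by roughly a factor of three. Each multiplier alone is modest; the product is what an API-only competitor never accumulates.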

Let's quantify the investment required for strategic AI ownership:

  -- | Internal AI investment model
  data InternalInvestment = InternalInvestment
    { teamSize :: Int           -- number of ML engineers
    , avgSalary :: Double       -- annual salary per engineer
    , gpuInfrastructure :: SelfHostedSetup
    , dataInfrastructure :: Double  -- annual data pipeline costs
    } deriving (Generic, Show)

  -- | Calculate total annual investment
  -- >>> annualInvestment conservativeSetup
  -- 2014000.0
  annualInvestment :: InternalInvestment -> Double
  annualInvestment InternalInvestment{..} =
    let teamCost = fromIntegral teamSize * avgSalary
        gpuCost = hourlyCost gpuInfrastructure * 24 * 365
        totalCost = teamCost + gpuCost + dataInfrastructure
    in totalCost

  -- Conservative internal setup
  conservativeSetup :: InternalInvestment  
  conservativeSetup = InternalInvestment
    { teamSize = 3
    , avgSalary = 200_000
    , gpuInfrastructure = h100_8node
    , dataInfrastructure = 100_000
    }

  -- | Risk-adjusted ROI calculation
  -- >>> riskAdjustedROI conservativeSetup openAI_o3 100_000_000
  -- -0.9285004965243296
  -- >>> riskAdjustedROI conservativeSetup anthropic_sonnet 100_000_000
  -- -0.9958291956305859
  riskAdjustedROI :: InternalInvestment -> APIProvider -> Integer -> Double
  riskAdjustedROI investment api monthlyTokens =
    let annualCost = annualInvestment investment
        annualTokens = monthlyTokens * 12
        apiAnnualCost = apiCost api (fromIntegral annualTokens * 2 `div` 3) (fromIntegral annualTokens `div` 3)
        savings = apiAnnualCost - annualCost
        roi = savings / annualCost
    in roi

  -- ROI analysis
  roiAnalysis :: IO ()
  roiAnalysis = do
    putStrLn "=== ROI Analysis for Internal Investment ==="
    let investment = annualInvestment conservativeSetup
    printf "Annual internal investment: $%.0f\n" investment
    
    let scenarios = [10_000_000, 50_000_000, 100_000_000, 500_000_000]
    putStrLn "\nMonthly Tokens | vs OpenAI o3 ROI | vs Anthropic ROI"
    putStrLn "---------------|-------------------|------------------"
    mapM_ (\tokens -> 
      let openai_roi = riskAdjustedROI conservativeSetup openAI_o3 tokens
          anthropic_roi = riskAdjustedROI conservativeSetup anthropic_sonnet tokens
      in printf "%13dM | %16.1f%% | %15.1f%%\n" 
          (tokens `div` 1_000_000) (openai_roi * 100) (anthropic_roi * 100)
      ) scenarios
    putStrLn ""

Conclusion: The Price of Strategic Surrender

Let's summarize our computational findings:

  -- | Executive summary of key findings
  executiveSummary :: IO ()
  executiveSummary = do
    putStrLn "=== COMPUTATIONAL EXECUTIVE SUMMARY ==="
    putStrLn ""
    exampleCosts
    breakEvenAnalysis  
    strategyAnalysis
    trajectoryComparison
    roiAnalysis
    
    putStrLn "KEY FINDINGS:"
    putStrLn "• Self-hosted inference is 5-20x cheaper than APIs at scale"
    putStrLn "• Break-even point: ~10-20M tokens/month vs premium APIs"
    putStrLn "• Internal investment ROI exceeds 200% at 100M tokens/month"
    putStrLn "• Compound advantage grows 15% monthly with internal capabilities"
    putStrLn "• API pricing volatility creates 3-7x cost variance"
    putStrLn ""
    putStrLn "STRATEGIC RECOMMENDATION:"
    putStrLn "Invest in measured internal AI capability to achieve both"
    putStrLn "cost control AND differentiation. The math is conclusive."

Treating cost and differentiation as unrelated choices is the mistake. Invest in internal AI, and you get both cost control and differentiation. Outsource everything, and you lose both.

As VC subsidies dry up and API providers inevitably raise prices, companies stuck on "rent only" will pay more while staying strategically dependent. The winners will be those who reject the false choice and invest in measured internal capability.

The only real question: Can you afford not to?

Run the analysis yourself:

  main :: IO ()
  main = executiveSummary
