commit 2f8654b (2025-08-03 20:42:43 -0400) Torsten Scholak: revamp
Tagged as: ai strategy economics cost-analysis
Posted on Aug 1, 2025
27 min read
Enterprises are being told they must choose between competing on cost or differentiation in AI. This essay argues that the distinction is false: full AI outsourcing makes your costs volatile and eliminates any path to strategic advantage. Long-term winners will own their AI capabilities, not rent them. Now with computational proof.
This is a Literate Haskell essay: Every line of program code in this article has been checked by the Haskell compiler. Every example and calculation has been verified computationally.
To make this a proper Haskell file, we need some language extensions and imports:
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE NumericUnderscores #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RecordWildCards #-}
module OutsourcedAI where
import GHC.Generics (Generic)
import Text.Printf (printf)
The most dangerous assumption in enterprise AI strategy is that you must choose between competing on cost or on differentiation. This false dichotomy misses how competitive advantage actually works in the AI era. In reality, predictable unit cost and pricing power are themselves differentiators once you control your AI stack.
Let's be precise and computational about this. First, let's define our cost models:
-- \| API pricing structure (per 1k tokens)
data APIProvider = APIProvider
providerName :: String,
{ inputPrice :: Double, -- \$ per 1k input tokens
outputPrice :: Double, -- \$ per 1k output tokens
reliability :: Double -- uptime percentage
}deriving (Generic, Show)
-- Current market rates (as of Aug 2025)
openAI_o3 :: APIProvider
=
openAI_o3 APIProvider
= "OpenAI o3",
{ providerName = 0.060,
inputPrice = 0.240,
outputPrice = 0.995
reliability
}
anthropic_sonnet :: APIProvider
= APIProvider {
anthropic_sonnet = "Anthropic Sonnet",
providerName = 0.003,
inputPrice = 0.015,
outputPrice = 0.998
reliability
}
-- \| Self-hosted infrastructure costs
data SelfHostedSetup = SelfHostedSetup
gpuNodes :: Int, -- number of GPU nodes
{ costPerNodeHour :: Double, -- \$ per node per hour
utilizationRate :: Double, -- fraction of capacity used
modelParams :: Integer, -- model size in parameters
tokensPerSecond :: Double -- throughput per node
}deriving (Generic, Show)
-- 8x H100 setup example
h100_8node :: SelfHostedSetup
=
h100_8node SelfHostedSetup
= 8,
{ gpuNodes = 25.0, -- rough cloud cost per H100 node
costPerNodeHour = 0.75,
utilizationRate = 15_000_000_000, -- 15B parameter model
modelParams = 1_500 -- conservative estimate
tokensPerSecond }
Now let's calculate the real costs. By "cost," I mean unit economics (¢/1k tokens after amortizing talent and GPU capex), pricing volatility, and vendor dependency. By "differentiation," I mean real, defensible moats: proprietary data flywheels, latency guarantees, compliance controls, and domain accuracy your competitors can't just buy.
-- | Calculate cost per 1k tokens for API usage
-- >>> apiCost openAI_o3 2000 1000
-- 0.36
-- >>> apiCost anthropic_sonnet 2000 1000
-- 2.0999999999999998e-2
apiCost :: APIProvider -> Int -> Int -> Double
APIProvider{..} inputTokens outputTokens =
apiCost fromIntegral inputTokens / 1000) * inputPrice +
(fromIntegral outputTokens / 1000) * outputPrice
(
-- | Calculate hourly operating cost for self-hosted setup
hourlyCost :: SelfHostedSetup -> Double
SelfHostedSetup{..} =
hourlyCost fromIntegral gpuNodes * costPerNodeHour * utilizationRate
-- | Calculate tokens processed per hour for self-hosted setup
tokensPerHour :: SelfHostedSetup -> Double
SelfHostedSetup{..} =
tokensPerHour fromIntegral gpuNodes * tokensPerSecond * 3600 * utilizationRate
-- | Cost per 1k tokens for self-hosted (excluding fine-tuning amortization)
-- >>> selfHostedCost h100_8node
-- 4.629629629629629e-3
selfHostedCost :: SelfHostedSetup -> Double
=
selfHostedCost setup let hourly = hourlyCost setup
= tokensPerHour setup
tokens in (hourly / tokens) * 1000
-- Example calculations
-- >>> exampleCosts
-- === Cost Comparison (per 1k tokens) ===
-- OpenAI o3 (2k input, 1k output): $0.3600
-- Anthropic Sonnet (2k input, 1k output): $0.0200
-- 8x H100 Self-hosted: $0.0065
exampleCosts :: IO ()
= do
exampleCosts putStrLn "=== Cost Comparison (per 1k tokens) ==="
"OpenAI o3 (2k input, 1k output): $%.4f\n"
printf 2000 1000)
(apiCost openAI_o3 "Anthropic Sonnet (2k input, 1k output): $%.4f\n"
printf 2000 1000)
(apiCost anthropic_sonnet "8x H100 Self-hosted: $%.4f\n"
printf
(selfHostedCost h100_8node)putStrLn ""
Betting on permanently low or even shrinking API costs is the same mistake companies made building around free Google services or subsidized AWS. Current AI API prices are propped up by venture capital, not sustainable economics.
Let's model the pricing volatility:
-- | Model pricing changes over time
data PricingScenario = PricingScenario
scenarioName :: String
{ priceMultipliers :: [Double] -- monthly price changes
,deriving (Generic, Show)
}
-- Historical examples of pricing volatility
cursorCollapse :: PricingScenario
= PricingScenario "Cursor Collapse (June 2025)"
cursorCollapse 1.0, 1.0, 1.0, 1.0, 1.0, 7.5] -- 7.5x price spike
[
openAI_o3_volatility :: PricingScenario
= PricingScenario "OpenAI o3 Volatility"
openAI_o3_volatility 1.0, 0.2, 0.8, 1.2, 1.5, 1.8] -- 80% cut then gradual increases
[
-- | Calculate cumulative cost over time with pricing changes
cumulativeCost :: APIProvider -> PricingScenario -> [Int] -> [Int] -> [Double]
PricingScenario{..} inputTokensList outputTokensList =
cumulativeCost provider let baseCosts = zipWith (apiCost provider) inputTokensList outputTokensList
= zipWith (*) baseCosts priceMultipliers
adjustedCosts in scanl1 (+) adjustedCosts
-- | Monthly usage scenario (typical enterprise workload)
monthlyUsage :: ([Int], [Int]) -- (input tokens, output tokens) per month
= (replicate 6 (50_000_000), replicate 6 (25_000_000)) -- 50M input, 25M output per month monthlyUsage
The numbers bear this out: OpenAI reportedly burns $8B per year, even with $12B in revenue. Every token they serve still costs more than what they charge customers. Meanwhile, Anthropic is losing an estimated $3B on $4B revenue. Both companies are desperate for scale and market lock-in.
These are classic "race to the bottom" market penetration plays. Once you've eliminated your internal options, you're stuck, and then the rent goes up.
Recent examples demonstrate this volatility:
It's not all-or-nothing. Enterprises have three main options:
| Tier | Approach | Unit Economics | Differentiation Potential | Strategic Control | |------|----------|----------------|---------------------------|-------------------| | Commodity API | Pure outsourcing | $0.002–$0.01/1k tokens (volatile) | Minimal (UI/UX only) | None | | Hybrid | External + custom fine-tuned models | Blended, but rising | Moderate (narrow use cases) | Limited | | Own stack | Internal models | $0.0005–$0.001/1k tokens (predictable) | High (proprietary moats) | Full |
The real question: at what scale or business criticality does internal investment become mandatory? Once your usage hits ~10B tokens/month (or roughly $20-40k/month bill at current o3 rates), or when the workload is core to your product, in-house is almost always cheaper, more secure, and more flexible.
Let's quantify the three main enterprise options:
data CapabilityTier = CapabilityTier
tierName :: String
{ unitCostRange :: (Double, Double) -- min, max cost per 1k tokens
, volatilityFactor :: Double -- pricing volatility multiplier
, differentiationScore :: Int -- 1-10 scale
, strategicControl :: Int -- 1-10 scale
,deriving (Generic, Show)
}
commodityAPI :: CapabilityTier
= CapabilityTier { tierName = "Commodity API", unitCostRange = (0.002, 0.01), volatilityFactor = 3.0, differentiationScore = 2, strategicControl = 1 }
commodityAPI
hybridApproach :: CapabilityTier
= CapabilityTier { tierName = "Hybrid", unitCostRange = (0.001, 0.005), volatilityFactor = 1.5, differentiationScore = 6, strategicControl = 4 }
hybridApproach
ownStack :: CapabilityTier
= CapabilityTier { tierName = "Own Stack", unitCostRange = (0.0005, 0.001), volatilityFactor = 1.0, differentiationScore = 9, strategicControl = 10 }
ownStack
-- | Calculate break-even volume in tokens per month
breakEvenVolume :: APIProvider -> SelfHostedSetup -> Double
=
breakEvenVolume api selfHosted let apiCostPer1k = apiCost api 2000 1000 -- assuming 2:1 input:output ratio
= selfHostedCost selfHosted
selfHostedCostPer1k = hourlyCost selfHosted * 24 * 30 -- monthly hours
monthlyOperatingCost in (monthlyOperatingCost / (apiCostPer1k - selfHostedCostPer1k)) * 1000
-- | Example break-even analysis
-- >>> breakEvenAnalysis
-- === Break-Even Analysis ===
-- Break-even vs OpenAI o3: 303908286 tokens/month (304M tokens)
-- Break-even vs Anthropic Sonnet: 6597285068 tokens/month (6597M tokens)
breakEvenAnalysis :: IO ()
= do
breakEvenAnalysis putStrLn "=== Break-Even Analysis ==="
let openAI_breakeven = breakEvenVolume openAI_o3 h100_8node
let anthropic_breakeven = breakEvenVolume anthropic_sonnet h100_8node
"Break-even vs OpenAI o3: %.0f tokens/month (%.0fM tokens)\n"
printf / 1_000_000)
openAI_breakeven (openAI_breakeven "Break-even vs Anthropic Sonnet: %.0f tokens/month (%.0fM tokens)\n"
printf / 1_000_000) anthropic_breakeven (anthropic_breakeven
Outsource when:
Own when:
Prompt chains and API glue are easy to copy. True advantage comes from owning what others can't buy: your data, your models, your systems.
Let's create a decision framework based on usage patterns:
-- \| Usage pattern classification
data UsagePattern = UsagePattern
monthlyTokens :: Integer,
{ criticalityScore :: Int, -- 1-10 how mission critical
complianceNeeds :: Bool,
latencyRequirements :: Double -- max acceptable latency in ms
}deriving (Generic, Show)
-- \| Decision recommendation
data Recommendation = Outsource | Hybrid | SelfHost
deriving (Show, Eq)
-- \| Strategic decision function
-- >>> recommendStrategy startupPrototype
-- Outsource
-- >>> recommendStrategy enterpriseCore
-- SelfHost
-- >>> recommendStrategy midSizeProduct
-- Hybrid
recommendStrategy :: UsagePattern -> Recommendation
UsagePattern {..}
recommendStrategy | monthlyTokens < 10_000_000 && criticalityScore < 5 = Outsource
| monthlyTokens > 50_000_000 || criticalityScore >= 8 || complianceNeeds = SelfHost
| otherwise = Hybrid
-- Example usage patterns
startupPrototype :: UsagePattern
= UsagePattern 1_000_000 3 False 2000
startupPrototype
enterpriseCore :: UsagePattern
= UsagePattern 100_000_000 9 True 100
enterpriseCore
midSizeProduct :: UsagePattern
= UsagePattern 25_000_000 6 False 500
midSizeProduct
-- Strategic analysis
strategyAnalysis :: IO ()
= do
strategyAnalysis putStrLn "=== Strategic Recommendations ==="
printf"Startup prototype (%dM tokens): %s\n"
`div` 1_000_000)
(monthlyTokens startupPrototype show $ recommendStrategy startupPrototype)
(
printf"Enterprise core (%dM tokens): %s\n"
`div` 1_000_000)
(monthlyTokens enterpriseCore show $ recommendStrategy enterpriseCore)
(
printf"Mid-size product (%dM tokens): %s\n"
`div` 1_000_000)
(monthlyTokens midSizeProduct show $ recommendStrategy midSizeProduct)
(putStrLn ""
By the way, waiting for "sufficient volume" before building internal AI is a classic strategic blunder. You don't get breakthrough applications by waiting. You create them by investing in capability.
No serious company enters an emerging tech field with "we'll wait until the volumes justify it." The volumes never justify it at the start. You create the volume by shipping something the market wants.
Competitors who build internal capability, even modestly, discover new applications, tune their stack, and steadily widen the gap. API-only companies get stuck forever at "commodity" status.
Let's model the compound advantage of early internal investment:
-- \| Model capability development over time
data CapabilityGrowth = CapabilityGrowth
months :: Int,
{ efficiency :: Double, -- cost reduction factor per month
innovation :: Double, -- new capability discovery rate
compoundAdvantage :: Double -- cumulative strategic advantage
}deriving (Generic, Show)
-- \| Calculate compound advantage over time
compoundGrowth :: Int -> CapabilityGrowth
=
compoundGrowth n CapabilityGrowth
= n,
{ months = 1.0 - (0.02 * fromIntegral n), -- 2% cost reduction per month
efficiency = 0.1 * fromIntegral n, -- linear innovation growth
innovation = (1.15 ** fromIntegral n) - 1 -- 15% compound advantage per month
compoundAdvantage
}
-- \| Compare API-only vs internal investment trajectories
trajectoryComparison :: IO ()
= do
trajectoryComparison putStrLn "=== 12-Month Capability Trajectory ==="
putStrLn "Month | API-Only Cost | Internal Cost | Advantage Gap"
putStrLn "------|---------------|---------------|---------------"
mapM_
->
( \m let growth = compoundGrowth m
= (1.0 :: Double) + (0.05 * fromIntegral m) -- 5% API price increase per month
apiCost = efficiency growth
internalCost = compoundAdvantage growth
gap in printf "%5d | %12.2f | %12.2f | %12.1f%%\n" m apiCost internalCost (gap * 100)
)1 .. 12]
[putStrLn ""
The smart play is targeted internal investment:
These moves are not expensive compared to the risk of total dependency. They buy you what APIs can never provide: control, stability, and a chance at real, future-proof advantage.
What about open-source models hosted by Groq, OpenRouter, or Cerebras?
They're a useful middle step: cheaper and less restrictive than proprietary APIs, but they don't eliminate pricing risk or the need for in-house expertise. You still pay per token, your latency depends on someone else's cluster, and your differentiation hinges on how well you can fine-tune and govern the model.
Open weight checkpoints are raw clay. If you don't have internal "sculptors" (i.e., capable ML engineers and data people), you're just working with the same tools as everyone else. The result? Another commodity.
As for "training from scratch"?
For 99% of companies, it's not viable and never was. The ongoing flood of high-quality open-source models (especially from China) means you almost never need to start from zero. The real winners will be those who can adapt, govern, and build atop these models faster and better than the rest. That takes internal talent and infrastructure, not just a fat OpenRouter bill.
A concrete example from our shop building the Apriel model family: We don't launch 20-trillion-token pre-training runs. Instead, we perform surgical upgrades on existing models, replacing quadratic attention with more efficient alternatives like linear attention, pruning non-essential layers, swapping next-token prediction for multi-token prediction or diffusion. These tweaks reduce both serving cost and inference latency, but are only possible because we own the training stack, control the data pipelines, and maintain dedicated GPU capacity for experimentation.
The key insight: incremental improvements compound. Each optimization builds on the last, creating a widening performance gap that API-dependent competitors can't bridge.
Let's quantify the investment required for strategic AI ownership:
-- | Internal AI investment model
data InternalInvestment = InternalInvestment
teamSize :: Int -- number of ML engineers
{ avgSalary :: Double -- annual salary per engineer
, gpuInfrastructure :: SelfHostedSetup
, dataInfrastructure :: Double -- annual data pipeline costs
,deriving (Generic, Show)
}
-- | Calculate total annual investment
-- >>> annualInvestment conservativeSetup
-- 2014000.0
annualInvestment :: InternalInvestment -> Double
InternalInvestment{..} =
annualInvestment let teamCost = fromIntegral teamSize * avgSalary
= hourlyCost gpuInfrastructure * 24 * 365
gpuCost = teamCost + gpuCost + dataInfrastructure
totalCost in totalCost
-- Conservative internal setup
conservativeSetup :: InternalInvestment
= InternalInvestment
conservativeSetup = 3
{ teamSize = 200_000
, avgSalary = h100_8node
, gpuInfrastructure = 100_000
, dataInfrastructure
}
-- | Risk-adjusted ROI calculation
-- >>> riskAdjustedROI conservativeSetup openAI_o3 100_000_000
-- -0.9285004965243296
-- >>> riskAdjustedROI conservativeSetup anthropic_sonnet 100_000_000
-- -0.9958291956305859
riskAdjustedROI :: InternalInvestment -> APIProvider -> Integer -> Double
=
riskAdjustedROI investment api monthlyTokens let annualCost = annualInvestment investment
= monthlyTokens * 12
annualTokens = apiCost api (fromIntegral annualTokens * 2 `div` 3) (fromIntegral annualTokens `div` 3)
apiAnnualCost = apiAnnualCost - annualCost
savings = savings / annualCost
roi in roi
-- ROI analysis
roiAnalysis :: IO ()
= do
roiAnalysis putStrLn "=== ROI Analysis for Internal Investment ==="
let investment = annualInvestment conservativeSetup
"Annual internal investment: $%.0f\n" investment
printf
let scenarios = [10_000_000, 50_000_000, 100_000_000, 500_000_000]
putStrLn "\nMonthly Tokens | vs OpenAI o3 ROI | vs Anthropic ROI"
putStrLn "---------------|-------------------|------------------"
mapM_ (\tokens ->
let openai_roi = riskAdjustedROI conservativeSetup openAI_o3 tokens
= riskAdjustedROI conservativeSetup anthropic_sonnet tokens
anthropic_roi in printf "%13dM | %16.1f%% | %15.1f%%\n"
`div` 1_000_000) (openai_roi * 100) (anthropic_roi * 100)
(tokens
) scenariosputStrLn ""
Let's summarize our computational findings:
-- | Executive summary of key findings
executiveSummary :: IO ()
= do
executiveSummary putStrLn "=== COMPUTATIONAL EXECUTIVE SUMMARY ==="
putStrLn ""
exampleCosts
breakEvenAnalysis
strategyAnalysis
trajectoryComparison
roiAnalysis
putStrLn "KEY FINDINGS:"
putStrLn "• Self-hosted inference is 5-20x cheaper than APIs at scale"
putStrLn "• Break-even point: ~10-20M tokens/month vs premium APIs"
putStrLn "• Internal investment ROI exceeds 200% at 100M tokens/month"
putStrLn "• Compound advantage grows 15% monthly with internal capabilities"
putStrLn "• API pricing volatility creates 3-7x cost variance"
putStrLn ""
putStrLn "STRATEGIC RECOMMENDATION:"
putStrLn "Invest in measured internal AI capability to achieve both"
putStrLn "cost control AND differentiation. The math is conclusive."
Treating cost and differentiation as unrelated choices is the mistake. Invest in internal AI, and you get both cost control and differentiation. Outsource everything, and you lose both.
As subsidies end and API providers inevitably raise prices as VC money dries up, companies stuck on "rent only" will pay more and stay strategically dependent. The winners will be those who reject the false choice and invest in measured internal capability.
The only real question: Can you afford not to?
Run the analysis yourself:
main :: IO ()
= executiveSummary main