{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Veacon Portfolio Benchmark v1\n",
    "\n",
    "Reference: <https://veacon.io/use-cases/portfolio-benchmark>\n",
    "\n",
    "Compares your REIT / fund portfolio against Veacon's RTMS-grade market median per cohort, computes per-asset premium and a portfolio-weighted summary, and exports an Excel-ready report.\n",
    "\n",
    "**Quick start:** set `VEACON_API_KEY`, edit the `PORTFOLIO_CSV` cell with your assets, run all.\n",
    "\n",
    "**Tier:** Pro tier (60 req/min) handles ~20 assets per run. Team tier (200 req/min) handles 100+."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import io\n",
    "import requests\n",
    "import pandas as pd\n",
    "\n",
    "API_KEY = os.environ.get('VEACON_API_KEY')\n",
    "if not API_KEY:\n",
    "    raise SystemExit('Set VEACON_API_KEY environment variable first.')\n",
    "\n",
    "BASE = 'https://veacon.io/api/v1/real-estate/pulse'\n",
    "PERIOD = 'last_3m'  # 'last_3m' / 'last_6m' / 'last_12m' / '2026-Q1' / '2026-03'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Portfolio definition\n",
    "Replace this with your own list. Required columns: `asset_name, sigungu_code, property_type, transaction_type, internal_valuation_krw, area_m2`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "PORTFOLIO_CSV = '''\\\n",
    "asset_name,sigungu_code,property_type,transaction_type,internal_valuation_krw,area_m2\n",
    "강남파이낸스 (역삼),11680,office,sale,4500000000,250\n",
    "영등포 IFC,11560,office,sale,4200000000,280\n",
    "명동 retail flagship,11140,retail,sale,8500000000,180\n",
    "강서 마곡 R&D,11500,office,sale,2200000000,200\n",
    "충북 청주 사옥,43111,office,sale,900000000,170\n",
    "'''\n",
    "\n",
    "portfolio = pd.read_csv(io.StringIO(PORTFOLIO_CSV))\n",
    "portfolio['sigungu_code'] = portfolio['sigungu_code'].astype(str)\n",
    "print(f'Portfolio loaded: {len(portfolio)} assets')\n",
    "portfolio"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Fetch market data per asset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def fetch_market(row):\n",
    "    params = {\n",
    "        'sigungu_code':     row['sigungu_code'],\n",
    "        'property_type':    row['property_type'],\n",
    "        'transaction_type': row['transaction_type'],\n",
    "        'period':           PERIOD,\n",
    "        'geo_precision':    'sigungu',\n",
    "    }\n",
    "    r = requests.get(BASE, headers={'X-API-Key': API_KEY}, params=params, timeout=15)\n",
    "    if r.status_code == 404:\n",
    "        return pd.Series({\n",
    "            'sample_count': 0, 'median_price': None,\n",
    "            'p25_price': None, 'p75_price': None,\n",
    "            'confidence': None, 'source_mix': None,\n",
    "        })\n",
    "    r.raise_for_status()\n",
    "    body = r.json()['data'][0]\n",
    "    return pd.Series({\n",
    "        'sample_count':  body['sample_count'],\n",
    "        'median_price':  body['median_price'],\n",
    "        'p25_price':     body['p25_price'],\n",
    "        'p75_price':     body['p75_price'],\n",
    "        'confidence':    body['confidence'],\n",
    "        'source_mix':    body['source_mix'],\n",
    "    })\n",
    "\n",
    "market = portfolio.apply(fetch_market, axis=1)\n",
    "df = pd.concat([portfolio, market], axis=1)\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Premium calculation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df['premium_pct'] = (\n",
    "    (df['internal_valuation_krw'] - df['median_price']) / df['median_price']\n",
    ")\n",
    "\n",
    "valid = df[df['median_price'].notna()]\n",
    "total_area = valid['area_m2'].sum()\n",
    "valid_pf_premium = (valid['premium_pct'] * valid['area_m2'] / total_area).sum() if total_area > 0 else float('nan')\n",
    "\n",
    "print(f'Portfolio weighted premium (excluding NaN): {valid_pf_premium:+.1%}')\n",
    "print(f'  high-confidence assets:   {(df[\"confidence\"]==\"high\").sum()}')\n",
    "print(f'  medium-confidence assets: {(df[\"confidence\"]==\"medium\").sum()}')\n",
    "print(f'  low-confidence assets:    {(df[\"confidence\"]==\"low\").sum()}')\n",
    "print(f'  no-data (404) assets:     {df[\"median_price\"].isna().sum()}')\n",
    "df[['asset_name', 'internal_valuation_krw', 'median_price', 'premium_pct', 'sample_count', 'confidence']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "color_map = {'high': '#10b981', 'medium': '#f59e0b', 'low': '#ef4444', None: '#6b7280'}\n",
    "colors = df['confidence'].map(color_map).fillna('#6b7280')\n",
    "\n",
    "fig, ax = plt.subplots(figsize=(10, 5))\n",
    "ax.bar(df['asset_name'], df['premium_pct'].fillna(0), color=colors)\n",
    "ax.axhline(0, color='black', linewidth=0.5)\n",
    "ax.set_title(f'Portfolio premium vs Veacon market median (weighted: {valid_pf_premium:+.1%})')\n",
    "ax.set_ylabel('premium (decimal, e.g. 0.10 = +10%)')\n",
    "ax.tick_params(axis='x', rotation=30)\n",
    "plt.tight_layout()\n",
    "plt.savefig('portfolio-benchmark-chart.png', dpi=150)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Excel export"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "OUT_PATH = 'portfolio-benchmark-2026Q1.xlsx'\n",
    "\n",
    "with pd.ExcelWriter(OUT_PATH, engine='xlsxwriter') as w:\n",
    "    df.to_excel(w, sheet_name='Detail', index=False)\n",
    "    summary = pd.DataFrame([{\n",
    "        'as_of':                pd.Timestamp.utcnow().date(),\n",
    "        'portfolio_size':       len(df),\n",
    "        'weighted_premium_pct': valid_pf_premium,\n",
    "        'high_conf_count':      int((df['confidence'] == 'high').sum()),\n",
    "        'medium_conf_count':    int((df['confidence'] == 'medium').sum()),\n",
    "        'low_conf_count':       int((df['confidence'] == 'low').sum()),\n",
    "        'no_data_count':        int(df['median_price'].isna().sum()),\n",
    "        'data_sources_note':    'Veacon API + RTMS — see https://veacon.io/data-trust',\n",
    "    }])\n",
    "    summary.to_excel(w, sheet_name='Summary', index=False)\n",
    "\n",
    "print(f'Wrote {OUT_PATH}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "**Footnote for IC / quarterly report:**\n",
    "\n",
    "> Market median is from Veacon API `/api/v1/real-estate/pulse`. RTMS-reported transactions only — Industry research suggests 40-60% of actual Korean CRE sale transactions are publicly reported. Per-asset confidence reflects sample size: high (≥10) / medium (≥5) / low (≥2). Audit details: <https://veacon.io/data-trust>."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}