Anbesa Bank S.C.

Modern Data Lakehouse Architecture

Unifying data to drive financial inclusion, real-time fraud detection, and NBE regulatory compliance.

1. Executive Summary

As Anbesa Bank S.C. continues to expand its digital footprint across Ethiopia, traditional data silos between our Core Banking System and digital channels create reporting latency.

The Data Lakehouse paradigm unifies the data lake and the data warehouse by adding a transactional metadata layer on top of low-cost object storage. This architecture enables Anbesa Bank to run streaming fraud detection for mobile transfers, automate National Bank of Ethiopia (NBE) reporting, and provide a 360-degree view of customers from Addis Ababa to regional branches.

2. High-Level Architecture

graph TD
    subgraph Sources ["1. Data Sources"]
        direction TB
        Core[("Core Banking<br/>(Sopra Banking Platform)")]:::source
        Cards[("Card Switch<br/>(EthSwitch)")]:::source
        Ext["External Data<br/>(NBE / ECX)"]:::source
        Unstruct["Unstructured<br/>(Docs / Audio Logs)"]:::source
        Mob["Anbesa Mobile /<br/>Internet Banking"]:::source
    end

    subgraph Ingestion ["2. Ingestion Layer"]
        Stream["Streaming Bus<br/>(Kafka/Event Hubs)"]:::process
        Batch["Batch ETL<br/>(ADF/Glue/Airflow)"]:::process
    end

    subgraph Lakehouse ["3. The Data Lakehouse (Storage + Metadata)"]
        direction TB
        Gov[("Unity/Purview Catalog<br/>(Governance and Lineage)")]:::governance
        subgraph Layers ["Medallion Architecture"]
            direction TB
            Bronze[("🥉 Bronze Layer<br/>(Raw/Landing)<br/>Native Format")]:::bronze
            Silver[("🥈 Silver Layer<br/>(Clean/Conformed)<br/>Delta/Iceberg")]:::silver
            Gold[("🥇 Gold Layer<br/>(Aggregated/Business)<br/>Star Schema")]:::gold
        end
    end

    subgraph Compute ["4. Processing Engine"]
        Spark["Spark / Databricks<br/>(Data Eng and ML)"]:::compute
        SQL["Serverless SQL<br/>(BI Warehousing)"]:::compute
        Vector["Vector Database<br/>(GenAI / RAG)"]:::ai
    end

    subgraph Consume ["5. Consumption Layer"]
        BI["BI and Reporting<br/>(PowerBI/Tableau)"]:::consume
        ML["AI/ML Models<br/>(Credit Scoring)"]:::consume
        Chat["GenAI Chatbot<br/>(Knowledge Base)"]:::consume
        Reg["NBE Regulatory<br/>Reports"]:::consume
    end

    %% Connections
    Core --> Batch
    Core -->|CDC Logs| Stream
    Cards --> Stream
    Mob --> Stream
    Ext --> Batch
    Unstruct --> Batch
    Stream --> Bronze
    Batch --> Bronze
    Bronze -->|Cleaning| Silver
    Silver -->|Aggregations| Gold
    Silver -->|Embeddings| Vector
    Gov -.- Bronze
    Gov -.- Silver
    Gov -.- Gold

    %% Compute Connections
    Silver --> Spark
    Spark --> ML
    Gold --> SQL
    SQL --> BI
    SQL --> Reg
    Vector --> Chat

    %% Styling Classes
    classDef source fill:#dbeafe,stroke:#3b82f6,stroke-width:2px,color:#1e3a8a
    classDef process fill:#f3f4f6,stroke:#9ca3af,stroke-width:2px,stroke-dasharray: 5 5
    classDef bronze fill:#78350f,stroke:#451a03,color:#fff,stroke-width:0px
    classDef silver fill:#64748b,stroke:#475569,color:#fff,stroke-width:0px
    classDef gold fill:#d97706,stroke:#b45309,color:#fff,stroke-width:0px
    classDef governance fill:#fae8ff,stroke:#d946ef,stroke-width:2px,stroke-dasharray: 5 5
    classDef compute fill:#f5d0fe,stroke:#c026d3,stroke-width:2px,color:#4a044e
    classDef ai fill:#c4b5fd,stroke:#7c3aed,stroke-width:2px,color:#4c1d95
    classDef consume fill:#dcfce7,stroke:#22c55e,stroke-width:2px

Figure 1: End-to-end data flow tailored to the Ethiopian banking context.

3. The Medallion Architecture Layers

Data is refined through three distinct stages to ensure quality for Anbesa Bank's operations.

🥉 Bronze Layer: Raw Ingestion

Landing zone for raw data. No schema enforcement to ensure fast ingestion.

  • Raw logs from Anbesa Mobile App
  • Unstructured: Call center audio, PDF Directives.
  • Retention: Indefinite (Audit trail) for compliance.
  • Handles Schema Drift (when source systems change fields).
  • Append-Only storage to preserve full history.

🥈 Silver Layer: Curated & Enriched

Filtered, cleaned, and augmented data. The "Enterprise View".

  • Linking EthSwitch transactions to accounts
  • Deduplication & Null handling
  • PII Masking: Tokenizing Fayda ID / TINs
  • Schema Enforcement: Rejects bad data that fails quality checks.
  • Standardization: Converting Ethiopian calendar (EC) dates to Gregorian (GC) and normalizing currency representation.
  • Handling SCD Type 2 (tracking customer address changes over time).
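
The Silver-layer steps above can be illustrated with a minimal sketch covering two of them: PII tokenization and EC-to-GC date conversion. It rests on several assumptions that are not part of the architecture itself: the field names (`fayda_id`, `txn_date_ec`, `amount`), the helper names, and the demo key are all hypothetical, and keyed hashing (HMAC-SHA256) stands in for whatever tokenization service the bank would actually deploy. The calendar conversion uses the standard Julian Day Number arithmetic for the Ethiopian era.

```python
import hashlib
import hmac
from datetime import date

# Assumption: a real deployment would load this key from a secrets manager.
TOKEN_KEY = b"demo-key-not-for-production"

# Julian Day Number offset of the Ethiopian (Amete Mihret) era.
ETHIOPIAN_EPOCH_JDN = 1723856
# Difference between a Julian Day Number and Python's Gregorian ordinal.
JDN_TO_ORDINAL = 1721425


def tokenize(value: str, key: bytes = TOKEN_KEY) -> str:
    """Return a deterministic keyed hash that replaces a sensitive identifier."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def ethiopian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert an Ethiopian calendar (EC) date to a Gregorian (GC) date."""
    if not 1 <= month <= 13:
        raise ValueError("Ethiopian month must be between 1 and 13")
    # Months 1-12 have 30 days; Pagume (month 13) has 6 days in a leap year, else 5.
    max_day = 30 if month <= 12 else (6 if year % 4 == 3 else 5)
    if not 1 <= day <= max_day:
        raise ValueError("day out of range for the given Ethiopian month")
    jdn = (ETHIOPIAN_EPOCH_JDN + 365) + 365 * (year - 1) + year // 4 + 30 * month + day - 31
    return date.fromordinal(jdn - JDN_TO_ORDINAL)


def to_silver(raw: dict) -> dict:
    """Turn one hypothetical Bronze record into a conformed Silver record."""
    year, month, day = (int(part) for part in raw["txn_date_ec"].split("-"))
    return {
        "customer_token": tokenize(raw["fayda_id"]),
        "txn_date_gc": ethiopian_to_gregorian(year, month, day).isoformat(),
        "amount_etb": float(raw["amount"]),
    }
```

Because the token is deterministic for a given key, the same Fayda ID always maps to the same `customer_token`, so Silver tables can still be joined on it without exposing the raw identifier.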

🥇 Gold Layer: Business Aggregates

Consumption-ready data tailored for specific business units.

  • NBE Reports: Daily liquidity & FX position
  • Branch Analytics: Performance per region
  • Credit Scoring: Digital lending models
  • Read-Optimized: Z-Ordering and Partitioning for fast BI queries.
  • Customer 360: Unified single view of a customer across all products.

4. Key Banking Capabilities

A. Regulatory Compliance (NBE)

The architecture is designed around NBE Directives. Because the table format keeps versioned snapshots (Time Travel), Anbesa Bank can reproduce the exact state of the ledger for any past date requested by auditors or regulators, provided snapshot retention covers that date. This supports transparency in FX and Reserve reporting.
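
To make the mechanism concrete, the following is a toy model of an "as of" read, not the Delta Lake or Iceberg API. The `VersionedTable` class and the account identifiers are hypothetical, and the model stores a full snapshot per commit for simplicity, whereas real table formats keep a log of file-level changes.

```python
from bisect import bisect_right
from datetime import datetime


class VersionedTable:
    """Toy append-only table: every commit stores a full snapshot of balances."""

    def __init__(self) -> None:
        self._commit_times: list[datetime] = []
        self._snapshots: list[dict[str, float]] = []

    def commit(self, at: datetime, changes: dict[str, float]) -> None:
        """Record a new version; earlier versions are never modified."""
        if self._commit_times and at <= self._commit_times[-1]:
            raise ValueError("commits must be strictly increasing in time")
        latest = dict(self._snapshots[-1]) if self._snapshots else {}
        latest.update(changes)
        self._commit_times.append(at)
        self._snapshots.append(latest)

    def as_of(self, at: datetime) -> dict[str, float]:
        """Return the table state at the latest commit made on or before `at`."""
        idx = bisect_right(self._commit_times, at)
        if idx == 0:
            raise LookupError("no commit exists on or before the requested time")
        return dict(self._snapshots[idx - 1])


ledger = VersionedTable()
ledger.commit(datetime(2024, 1, 1, 18), {"ACC-001": 1000.0})
ledger.commit(datetime(2024, 1, 2, 18), {"ACC-001": 750.0, "ACC-002": 250.0})

# State as an auditor would have seen it on the morning of 2 January.
morning_view = ledger.as_of(datetime(2024, 1, 2, 9))
```

Here `morning_view` contains only the first commit, because the second commit had not yet been written at the requested time.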

B. IFB & Conventional Banking

The Lakehouse supports multi-tenancy to segregate and manage data for both Conventional and Interest-Free Banking (IFB) windows, ensuring that profit-sharing calculations for IFB customers are accurate and compliant with Sharia principles.

C. Security & Governance

  • Fine-Grained Access Control (FGAC): Ensuring only authorized personnel can view sensitive customer data (TIN, Fayda ID).
  • Data Lineage: Tracing the origin of every report submitted to the NBE, from the Core Banking System transaction to the final PDF report.
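
As an illustration of fine-grained access control, the sketch below masks sensitive columns according to the caller's role. In practice this policy would be enforced by the catalog (for example as a column mask), not in application code. The role names, column names, and masking rule are assumptions made for the example.

```python
# Hypothetical policy: which columns are sensitive and which roles may see them.
SENSITIVE_COLUMNS = {"tin", "fayda_id"}
ROLES_WITH_PII_ACCESS = {"compliance_officer", "kyc_analyst"}


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a value."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def apply_column_policy(row: dict, role: str) -> dict:
    """Return a copy of `row` with sensitive columns masked for unprivileged roles."""
    if role in ROLES_WITH_PII_ACCESS:
        return dict(row)
    return {
        col: mask(str(val)) if col in SENSITIVE_COLUMNS else val
        for col, val in row.items()
    }


customer = {"name": "Abebe", "tin": "0012345678", "branch": "Bole"}
teller_view = apply_column_policy(customer, "branch_teller")
officer_view = apply_column_policy(customer, "compliance_officer")
```

The teller sees the TIN with all but its last four digits masked, while the compliance officer sees the row unchanged.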

5. Technology Stack Recommendations

Component        Technology Options
Storage          On-Prem Object Storage (MinIO / Dell ECS) or Hybrid Cloud (Azure/AWS)
Table Format     Delta Lake, Apache Iceberg
Processing       Apache Spark, Databricks, Starburst (Trino)
GenAI / Vector   Weaviate, Pinecone, Azure AI Search (for RAG)
Orchestration    Apache Airflow, Azure Data Factory

6. Use Case: Real-Time Mobile Fraud Detection

  1. Ingest: A transfer request initiated in the Anbesa Mobile App is published to Kafka.
  2. Process: Spark Streaming checks the amount against the customer's daily limit and evaluates the originating location.
  3. Check: The job queries the Silver Layer for historical patterns (e.g., an unexpected transfer to a new beneficiary).
  4. Action: If the risk score exceeds 80, the transaction is blocked and the customer is alerted via SMS.
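
The decision logic in these steps can be sketched as a simple rule-based scorer. Only the threshold of 80 comes from the use case; the data classes, the three rules, and their weights (50, 30, 25) are illustrative assumptions, and a production system would more likely combine such rules with a trained model.

```python
from dataclasses import dataclass

BLOCK_THRESHOLD = 80  # from the use case: block when the risk score exceeds 80


@dataclass(frozen=True)
class Transfer:
    customer_id: str
    amount: float
    beneficiary: str
    region: str


@dataclass(frozen=True)
class CustomerProfile:
    """Hypothetical profile built from Silver-layer history."""
    daily_limit: float
    spent_today: float
    usual_regions: frozenset
    known_beneficiaries: frozenset


def risk_score(transfer: Transfer, profile: CustomerProfile) -> int:
    """Add up illustrative rule weights and cap the result at 100."""
    score = 0
    if profile.spent_today + transfer.amount > profile.daily_limit:
        score += 50  # transfer would exceed the daily limit
    if transfer.region not in profile.usual_regions:
        score += 30  # request comes from an unusual location
    if transfer.beneficiary not in profile.known_beneficiaries:
        score += 25  # first transfer to this beneficiary
    return min(score, 100)


def decide(transfer: Transfer, profile: CustomerProfile) -> str:
    """Block and alert only when the score is strictly above the threshold."""
    if risk_score(transfer, profile) > BLOCK_THRESHOLD:
        return "BLOCK_AND_ALERT"
    return "ALLOW"
```

With these weights, a transfer that breaks the daily limit from an unusual region scores exactly 80 and is still allowed. A new beneficiary on top of that pushes the score past the threshold and blocks it.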

7. Generative AI Readiness

RAG (Retrieval-Augmented Generation)

The architecture includes a Vector Database in the processing layer. This allows Anbesa Bank to ingest unstructured data (PDFs of NBE Directives, Loan Contracts), index it as embeddings, and query it conversationally.

Example Use Case:

A Branch Manager asks the internal AI chatbot: "What is the latest NBE directive regarding forex retention for exporters?" The system searches the vector DB, retrieves the relevant PDF clause, and summarizes it in Amharic or English.
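
The retrieval half of this flow can be sketched without any external service. The example below is a toy: word counts stand in for learned embeddings, an in-memory dictionary stands in for the vector database, and the chunk identifiers and texts are invented placeholders, not real directive content. The summarization step by a language model is omitted.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: dict, top_k: int = 1) -> list:
    """Return the ids of the `top_k` chunks most similar to the question."""
    query_vec = embed(question)
    ranked = sorted(
        chunks,
        key=lambda chunk_id: cosine(query_vec, embed(chunks[chunk_id])),
        reverse=True,
    )
    return ranked[:top_k]


# Placeholder chunks; identifiers and wording are hypothetical.
chunks = {
    "directive-x/clause-1": "forex retention rules for exporters",
    "loan-template/clause-7": "collateral requirements for a term loan",
}
question = "What is the latest NBE directive regarding forex retention for exporters?"
best = retrieve(question, chunks)
```

In a full pipeline, the text of the retrieved chunk would be passed to the language model together with the question, so the answer can cite the specific clause it was drawn from.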