Database Maintenance Workflows
This document provides a high-level overview of the different workflows for maintaining and updating the Nepal Entity Service database. Understanding these workflows helps you choose the right approach for your use case.
Table of Contents
- Overview
- Core Principles
- User Roles
- Workflow Types
  - 1. Migration Workflow (Database Updates)
  - 2. API Consumption Workflow (Read-Only)
- Choosing the Right Workflow
  - Use Migration Workflow When
  - Use API Consumption When
- Common Scenarios
  - Scenario 1: Import Election Results
  - Scenario 2: Update Party Leadership
  - Scenario 3: Build Transparency Platform
  - Scenario 4: Fix Data Quality Issues
  - Scenario 5: Research Political Networks
- Workflow Decision Tree
- Additional Resources
- Support
Overview
The Nepal Entity Service supports two primary workflows: one for updating the database and one for consuming data. All database updates go through the migration workflow to ensure transparency, reproducibility, and community participation.
Core Principles
All workflows share these principles:
- Versioning: Every change creates a new version with full audit trail
- Author Attribution: All changes are attributed to an author
- Data Integrity: Business rules and validation ensure consistency
- Git-Based: Changes are tracked through Git for transparency and rollback capability
User Roles
Different workflows are designed for different roles:
- API Consumers: Read-only access to entity data via public API
- Contributors: Anyone proposing data updates via migrations
- Maintainers: Reviewers who approve and execute migrations
Workflow Types
1. Migration Workflow (Database Updates)
Who: Anyone proposing data updates - contributors, maintainers, researchers
When: All database updates including routine maintenance, bulk imports, data quality improvements, and structural changes
How: Create migration folders with Python scripts, submit via GitHub pull requests
Key Features:
- Community contributions welcome
- Code review before execution
- Versioned migration folders
- Deterministic execution (idempotent; see the guard sketch after the example below)
- Complete audit trail through Git
Process Flow:

```mermaid
flowchart TD
    subgraph Contributor["👤 CONTRIBUTOR"]
        A0[Fork Repo] --> A[Create Migration]
        A --> B[Test Locally]
        B --> C{Data OK?}
        C -->|No| B
        C -->|Yes| D[Submit PR]
    end
    subgraph CI["🤖 GitHub Actions - CI"]
        E[Run Tests] --> F[Build Migration]
    end
    subgraph Review["👥 MAINTAINER"]
        G[Review PR] --> H{Approved?}
        H -->|No| I[Request Changes]
        H -->|Yes| J[Merge to main]
    end
    subgraph Deploy["🚀 GitHub Actions - Deploy"]
        K[Auto-Build Triggered] --> L[Run Migration]
        L --> M[Auto-Commit & Push]
    end
    D --> E
    F --> G
    I -.-> B
    J --> K
    M --> O[(Database Updated)]

    style Contributor fill:#e3f2fd,stroke:#1976d2,stroke-width:4px
    style Review fill:#fff8e1,stroke:#f57c00,stroke-width:4px
    style CI fill:#fafafa,stroke:#757575,stroke-width:2px
    style Deploy fill:#fafafa,stroke:#757575,stroke-width:2px
    style A0 fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
    style A fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
    style B fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
    style C fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
    style D fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
    style E fill:#fff3e0,stroke:#f57c00,stroke-width:1px
    style F fill:#fff3e0,stroke:#f57c00,stroke-width:1px
    style G fill:#fff4e1,stroke:#ffa726,stroke-width:2px
    style H fill:#fff4e1,stroke:#ffa726,stroke-width:2px
    style I fill:#fff4e1,stroke:#ffa726,stroke-width:2px
    style J fill:#fff4e1,stroke:#ffa726,stroke-width:2px
    style K fill:#e8f5e9,stroke:#66bb6a,stroke-width:1px
    style L fill:#e8f5e9,stroke:#66bb6a,stroke-width:1px
    style M fill:#e8f5e9,stroke:#66bb6a,stroke-width:1px
    style O fill:#c8e6c9,stroke:#43a047,stroke-width:3px
```
Detailed Steps:
1. **Contributor: Fork Repository**
   - Fork the Service/API repository on GitHub
   - Clone your fork locally
2. **Contributor: Create Migration Locally**
   - Create a migration folder: `migrations/NNN-descriptive-name/` (see the layout sketch after these steps)
   - Add a `migrate.py` script and a `README.md`
   - Include data files (CSV, JSON, etc.) if needed
3. **Contributor: Test Migration**
   - Run: `nes migration run NNN-name`
   - Verify entities/relationships are created correctly
   - Check data quality and completeness
4. **Contributor: Submit Pull Request**
   - Push migration code to your fork
   - Create a PR with a description and data sources
   - The PR contains migration code only (no database changes)
5. **GitHub Actions: Automated Testing**
   - Run tests on migration code
   - Build and validate migration structure
   - Post results as a PR comment
6. **Maintainer: Review and Approve**
   - Review code quality and data sources
   - Verify migration logic
   - Approve and merge the PR to `main`
7. **GitHub Actions: Auto-Build and Deploy**
   - Triggered automatically on merge to `main`
   - Runs pending migrations
   - Creates entities/relationships in the database
   - Auto-commits and pushes changes to the Database Repository
8. **Database Updated**
   - Changes persisted in the `nes-db` repository
   - Git commits serve as migration tracking
   - Full audit trail maintained
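The folder-related steps above produce a migration directory like the following; this layout is a sketch (the folder name matches the example below, and file names other than `migrate.py` and `README.md` are illustrative):

```
migrations/005-add-ministers/
├── migrate.py       # entry point defining async migrate(context)
├── README.md        # description and data sources
└── ministers.csv    # data file read by the script
```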
**Example**:

```python
# migrations/005-add-ministers/migrate.py
async def migrate(context):
    data = context.read_csv("ministers.csv")
    for row in data:
        await context.publication.create_entity(
            entity_data={...},
            author_id="author:migration:005-add-ministers",
            change_description="Import minister"
        )
```
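The Key Features list above calls for deterministic, idempotent execution. A minimal guard sketch, assuming `context.db.get_entity` (the lookup used in the scenarios later in this document) returns `None` for a missing entity; the entity ID here is hypothetical:

```python
# Hypothetical idempotency guard: skip creation when the entity
# already exists, so re-running the migration is a no-op.
existing = await context.db.get_entity("entity:person/example-minister")
if existing is None:
    await context.publication.create_entity(
        entity_data={...},  # same shape as the example above
        author_id="author:migration:005-add-ministers",
        change_description="Import minister"
    )
```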
Why Migrations for Everything?
- Transparency: All changes are reviewed and documented
- Reproducibility: Database state can be recreated by replaying migrations (see the sketch after this list)
- Community Participation: Anyone can contribute data updates
- Audit Trail: Complete history of who changed what and why
- Quality Control: Code review ensures data quality
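Replaying in practice just means running every migration in order. A minimal sketch, assuming the `nes migration run` command from the Detailed Steps and a flat `migrations/` directory:

```python
# Hypothetical replay: run each migration folder in numeric order
# via the documented `nes migration run <name>` command.
import subprocess
from pathlib import Path

for folder in sorted(p for p in Path("migrations").iterdir() if p.is_dir()):
    subprocess.run(["nes", "migration", "run", folder.name], check=True)
```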
Documentation:
- Migration Contributor Guide
- Migration Maintainer Guide
- Migration Architecture
2. API Consumption Workflow (Read-Only)
Who: Application developers, researchers, data consumers
When: Building applications, analyzing data, displaying entity information
How: Use the public read-only REST API
Key Features:
- Read-only access
- No authentication required
- RESTful endpoints
- JSON responses
- Search and filtering capabilities
Process:
1. Application makes HTTP request:

   ```
   GET /api/entities?type=person&query=ram
   ```

2. API returns entity data:

   ```json
   {
     "entities": [...],
     "total": 42,
     "page": 1
   }
   ```

3. Application uses the data:
   - Display in UI
   - Analyze relationships
   - Build visualizations
Example:

```bash
# Search for entities (quote the URL so the shell doesn't interpret "&")
curl "https://nes.newnepal.org/api/entities?query=nepal&type=organization"

# Get a specific entity
curl "https://nes.newnepal.org/api/entities/entity:person/ram-chandra-poudel"

# Get version history
curl "https://nes.newnepal.org/api/entities/entity:person/ram-chandra-poudel/versions"
```
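The same endpoints can be scripted. A minimal sketch in Python with the `requests` library, relying only on the `entities` and `total` response fields shown above and the `id` field used in the relationship examples later in this document:

```python
import requests

# Search for organizations matching "nepal" (same query as the curl example)
response = requests.get(
    "https://nes.newnepal.org/api/entities",
    params={"query": "nepal", "type": "organization"},
)
response.raise_for_status()
data = response.json()

print(f"Found {data['total']} entities")
for entity in data["entities"]:
    print(entity["id"])
```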
Documentation: API Guide
Choosing the Right Workflow
Use Migration Workflow When:
✓ You need to update, create, or delete entities
✓ You need to modify relationships
✓ You're importing data from any source
✓ You're fixing data quality issues
✓ You're making structural changes
✓ You're performing bulk operations
Example Scenarios:
- Importing election results from official sources
- Updating politician positions after government formation
- Fixing data quality issues or typos
- Adding missing entities or relationships
- Batch updating entity attributes
- Contributing curated datasets
Note: All database updates, regardless of who makes them (maintainers or contributors), go through the migration workflow for transparency and reproducibility.
Use API Consumption When:
✓ Building applications that display entity data
✓ Analyzing entity relationships
✓ Creating visualizations
✓ Researching political networks
✓ You need read-only access
✓ You don't need to modify data
Example Scenarios:
- Building civic technology application
- Creating data journalism project
- Academic research on political networks
- Transparency platform displaying public officials
Common Scenarios
Scenario 1: Import Election Results
Workflow: Migration
```python
# migrations/010-election-results-2024/migrate.py
async def migrate(context):
    results = context.read_csv("results.csv")
    for row in results:
        await context.publication.create_entity(...)
    context.log(f"Imported {len(results)} results")
```
Process:
1. Create a migration folder with the CSV data
2. Submit a PR to the repository
3. Maintainer reviews and merges the PR
4. Migration runs automatically and the database is updated with a full audit trail
Scenario 2: Update Party Leadership
Workflow: Migration
```python
# migrations/011-update-party-leadership/migrate.py
async def migrate(context):
    updates = context.read_json("leadership_updates.json")
    for update in updates:
        entity = await context.db.get_entity(update["party_id"])
        entity.attributes.update(update["new_leadership"])
        await context.publication.update_entity(
            entity=entity,
            author_id="author:migration:011-update-party-leadership",
            change_description="Update party leadership"
        )
```
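The shape of `leadership_updates.json` follows from the fields the script reads. A hypothetical entry; the entity ID and the attribute keys inside `new_leadership` are assumptions:

```json
[
  {
    "party_id": "entity:organization/example-party",
    "new_leadership": {
      "chairperson": "..."
    }
  }
]
```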
Scenario 3: Build Transparency Platform
Best Workflow: API Consumption
Implementation:
```javascript
// Frontend application
async function searchPoliticians(query) {
  const response = await fetch(
    // Encode the user-supplied query so special characters survive the URL
    `https://nes.newnepal.org/api/entities?type=person&sub_type=politician&query=${encodeURIComponent(query)}`
  );
  const data = await response.json();
  return data.entities;
}

async function getPoliticianDetails(id) {
  const response = await fetch(
    `https://nes.newnepal.org/api/entities/${id}`
  );
  return await response.json();
}

async function getRelationships(id) {
  const response = await fetch(
    `https://nes.newnepal.org/api/relationships?source_entity_id=${id}`
  );
  return await response.json();
}
```
Scenario 4: Fix Data Quality Issues
Workflow: Migration
```python
# migrations/012-fix-name-spellings/migrate.py
async def migrate(context):
    corrections = context.read_json("name_corrections.json")
    for correction in corrections:
        entity = await context.db.get_entity(correction["entity_id"])
        # Apply correction
        for name in entity.names:
            if name.kind == correction["name_kind"]:
                if correction.get("en"):
                    name.en.full = correction["en"]
                if correction.get("ne"):
                    name.ne.full = correction["ne"]
        await context.publication.update_entity(
            entity=entity,
            author_id="author:migration:012-fix-name-spellings",
            change_description=f"Fix name spelling: {correction['description']}"
        )
```
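Again, the shape of `name_corrections.json` is implied by the fields the script reads. A hypothetical entry; the `name_kind` value and the description text are assumptions, and the entity ID reuses the one from the API examples above:

```json
[
  {
    "entity_id": "entity:person/ram-chandra-poudel",
    "name_kind": "primary",
    "en": "Ram Chandra Poudel",
    "description": "standardize romanized spelling"
  }
]
```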
Scenario 5: Research Political Networks
Best Workflow: API Consumption
Analysis Script:
```python
import requests
import networkx as nx

# Fetch politicians
response = requests.get(
    "https://nes.newnepal.org/api/entities?type=person&sub_type=politician&limit=1000"
)
politicians = response.json()["entities"]

# Build network graph
G = nx.Graph()
for politician in politicians:
    # Get relationships
    rel_response = requests.get(
        f"https://nes.newnepal.org/api/relationships?source_entity_id={politician['id']}"
    )
    relationships = rel_response.json()["relationships"]
    for rel in relationships:
        if rel["relationship_type"] == "MEMBER_OF":
            G.add_edge(politician["id"], rel["target_entity_id"])

# Analyze network
print(f"Network density: {nx.density(G)}")
print(f"Average clustering: {nx.average_clustering(G)}")
```
Workflow Decision Tree
```
Do you need to modify entity data?
│
├─ No  → Use API Consumption Workflow
│        └─ Build applications, analyze data, research
│
└─ Yes → Use Migration Workflow
         └─ Create migration, submit PR, get review, execute
```
Additional Resources
For Database Updates
- Migration Contributor Guide - Creating and submitting migrations
- Migration Maintainer Guide - Reviewing and executing migrations
- Migration Architecture - System design and technical details
- Database Setup - Git submodule configuration
For API Consumers
- API Guide - Using the public API
- Data Models - Understanding entity schemas
- Examples - Common usage patterns
- Usage Examples - Code examples and notebooks
Support
For questions about workflows:
- Review the relevant guide for your workflow
- Check examples in the `examples/` directory
- Explore notebooks in the `notebooks/` directory
- Open an issue for clarification or bugs
Last Updated: 2024
Version: 2.0