๐Ÿ‡ณ๐Ÿ‡ต Nepal Entity Service

Open Source, open data, and open API for Nepali public entities

Contributor Guide

This guide is for developers who want to contribute to the Nepal Entity Service project, run their own instance, or maintain data locally.

Prerequisites

Getting Started

1. Clone the Repository

# Clone with submodules (includes the database)
git clone --recurse-submodules https://github.com/NewNepal-org/NepalEntityService.git
cd NepalEntityService

# If you already cloned without submodules:
git submodule update --init --recursive

2. Install Dependencies

# Install with Poetry
poetry install

# Install with API extras (for running the server)
poetry install --extras api

# Install with all extras (API + scraping)
poetry install --extras all

3. Set Up the Database

The database is managed as a Git submodule. See the Database Setup guide for detailed configuration.

# Verify the database submodule
ls -la nes-db/v2/

# Set the database path (optional, defaults to ./nes-db/v2)
export NES_DB_URL=file:///absolute/path/to/nes-db/v2

Project Structure

NepalEntityService/
โ”œโ”€โ”€ nes/                          # Main package
โ”‚   โ”œโ”€โ”€ models/                   # Data models (Entity, Relationship, etc.)
โ”‚   โ”œโ”€โ”€ database/                 # Database layer (FileDatabase)
โ”‚   โ”œโ”€โ”€ services/                 # Business logic (Publication, Search)
โ”‚   โ”œโ”€โ”€ api/                      # FastAPI application
โ”‚   โ”œโ”€โ”€ cli/                      # Command-line interface
โ”‚   โ”œโ”€โ”€ scraping/                 # Web scraping and data collection
โ”‚   โ””โ”€โ”€ config.py                 # Configuration management
โ”œโ”€โ”€ nes-db/                       # Database submodule (Git repository)
โ”‚   โ””โ”€โ”€ v2/                       # Version 2 database files
โ”œโ”€โ”€ migrations/                   # Data migration scripts
โ”œโ”€โ”€ docs/                         # Documentation
โ”œโ”€โ”€ tests/                        # Test suite
โ”œโ”€โ”€ examples/                     # Example scripts
โ”œโ”€โ”€ notebooks/                    # Jupyter notebooks
โ””โ”€โ”€ pyproject.toml               # Poetry configuration

CLI Commands

The nes CLI provides commands for managing the service:

Server Commands

# Start production server
poetry run nes server start

# Start development server (with auto-reload)
poetry run nes server dev

# Alternative: Use the nes-dev shortcut for development
poetry run nes-dev

# Specify custom host/port
poetry run nes server start --host 0.0.0.0 --port 8080

Migration Commands

# Run a migration
poetry run nes migration run <migration-name>

# List available migrations
poetry run nes migration list

# List pending migrations
poetry run nes migration list --pending

For detailed migration information, see:
- Migration Contributor Guide - Creating migrations
- Migration Maintainer Guide - Reviewing and executing migrations
- Migration Architecture - System design

Development Workflow

Running the API Server

# Development mode with auto-reload
poetry run nes server dev

# Alternative: Use the nes-dev shortcut
poetry run nes-dev

# The API will be available at:
# - Documentation: http://localhost:8195/
# - API endpoints: http://localhost:8195/api/
# - OpenAPI docs: http://localhost:8195/docs

The nes-dev command is a convenient shortcut that starts the development server with live reload enabled. Any changes to the code will automatically restart the server, making it ideal for active development.

Running Tests

# Run all tests
poetry run pytest

# Run with coverage
poetry run pytest --cov=nes

# Run specific test file
poetry run pytest tests/services/test_publication_service.py

Code Quality

# Format code with Black
poetry run black nes/ tests/

# Sort imports with isort
poetry run isort nes/ tests/

# Lint with flake8
poetry run flake8 nes/ tests/

Data Maintenance

Using the Publication Service

The Publication Service provides a Pythonic interface for maintaining data:

from pathlib import Path
from nes.database.file_database import FileDatabase
from nes.services.publication import PublicationService

# Initialize services
db = FileDatabase(base_path="nes-db/v2")
pub_service = PublicationService(database=db)

# Create an entity
entity = await pub_service.create_entity(
    entity_data={
        "slug": "politician-name",
        "type": "person",
        "names": [{
            "kind": "PRIMARY",
            "en": {"full": "Full Name"},
            "ne": {"full": "เคชเฅ‚เคฐเคพ เคจเคพเคฎ"}
        }]
    },
    author_id="author:human:your-name",
    change_description="Added new politician"
)

# Update an entity
entity.attributes["position"] = "Minister"
updated = await pub_service.update_entity(
    entity=entity,
    author_id="author:human:your-name",
    change_description="Updated position"
)

For detailed data maintenance, see the Data Maintainer Guide.

Scraping Service

The scraping service automates data collection using GenAI/LLM:

Prerequisites

# Install with scraping extras
poetry install --extras scraping

# Set up Google Cloud credentials (for Vertex AI)
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json

Using the Scraper

from nes.services.scraping import WikipediaScraper, DataNormalizer

# Scrape Wikipedia
scraper = WikipediaScraper()
raw_data = await scraper.scrape_politician("Ram Chandra Poudel")

# Normalize to NES format
normalizer = DataNormalizer()
entity_data = await normalizer.normalize(raw_data)

Creating Migrations

Migrations are the recommended way to contribute data changes:

1. Create Migration Structure

mkdir -p migrations/001-add-new-politicians
cd migrations/001-add-new-politicians

2. Create Required Files

001-add-new-politicians/
โ”œโ”€โ”€ migrate.py          # Migration script
โ”œโ”€โ”€ README.md           # Documentation
โ””โ”€โ”€ data/              # Optional: supporting data files

3. Implement Migration

# migrate.py
from nes.services.publication import PublicationService

async def migrate(pub_service: PublicationService, dry_run: bool = False):
    """Add new politicians to the database."""

    # Create entities
    entity = await pub_service.create_entity(
        entity_data={...},
        author_id="author:human:contributor-name",
        change_description="Added via migration 001"
    )

    if dry_run:
        print(f"Would create entity: {entity.id}")
    else:
        print(f"Created entity: {entity.id}")

4. Submit Pull Request

See the Migration Contributor Guide for the complete workflow.

Contributing Guidelines

Code Style

Commit Messages

feat: Add new entity type for government bodies
fix: Correct relationship validation logic
docs: Update API guide with new endpoints
test: Add tests for search service

Pull Request Process

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run tests and linting
  5. Commit your changes
  6. Push to your fork
  7. Open a Pull Request

Package Extras

The project uses Poetry extras for optional dependencies:

# API only (FastAPI, Uvicorn)
poetry install --extras api

# Scraping only (Wikipedia, Google AI)
poetry install --extras scraping

# Everything
poetry install --extras all

Environment Variables

# Database path (file:// protocol required)
export NES_DB_URL=file:///absolute/path/to/nes-db/v2

# API server configuration
export NES_HOST=0.0.0.0
export NES_PORT=8195

# Google Cloud (for scraping)
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json

Docker Deployment

# Build image
docker build -t nepal-entity-service .

# Run container
docker run -p 8195:8195 nepal-entity-service

# Run with custom database
docker run -p 8195:8195 \
  -v /path/to/database:/app/nes-db/v2 \
  -e NES_DB_URL=file:///app/nes-db/v2 \
  nepal-entity-service

Resources

Documentation

Migration Guides

Data Maintenance

Getting Help

License

This project is open source. See the LICENSE file for details.