Contributor Guide
This guide is for developers who want to contribute to the Nepal Entity Service project, run their own instance, or maintain data locally.
Prerequisites
- Python 3.12+
- Poetry (Python package manager)
- Git
- Basic understanding of Python and REST APIs
Getting Started
1. Clone the Repository
# Clone with submodules (includes the database)
git clone --recurse-submodules https://github.com/NewNepal-org/NepalEntityService.git
cd NepalEntityService
# If you already cloned without submodules:
git submodule update --init --recursive
2. Install Dependencies
# Install with Poetry
poetry install
# Install with API extras (for running the server)
poetry install --extras api
# Install with all extras (API + scraping)
poetry install --extras all
3. Set Up the Database
The database is managed as a Git submodule. See the Database Setup guide for detailed configuration.
# Verify the database submodule
ls -la nes-db/v2/
# Set the database path (optional, defaults to ./nes-db/v2)
export NES_DB_URL=file:///absolute/path/to/nes-db/v2
Project Structure
NepalEntityService/
โโโ nes/ # Main package
โ โโโ models/ # Data models (Entity, Relationship, etc.)
โ โโโ database/ # Database layer (FileDatabase)
โ โโโ services/ # Business logic (Publication, Search)
โ โโโ api/ # FastAPI application
โ โโโ cli/ # Command-line interface
โ โโโ scraping/ # Web scraping and data collection
โ โโโ config.py # Configuration management
โโโ nes-db/ # Database submodule (Git repository)
โ โโโ v2/ # Version 2 database files
โโโ migrations/ # Data migration scripts
โโโ docs/ # Documentation
โโโ tests/ # Test suite
โโโ examples/ # Example scripts
โโโ notebooks/ # Jupyter notebooks
โโโ pyproject.toml # Poetry configuration
CLI Commands
The nes CLI provides commands for managing the service:
Server Commands
# Start production server
poetry run nes server start
# Start development server (with auto-reload)
poetry run nes server dev
# Alternative: Use the nes-dev shortcut for development
poetry run nes-dev
# Specify custom host/port
poetry run nes server start --host 0.0.0.0 --port 8080
Migration Commands
# Run a migration
poetry run nes migration run <migration-name>
# List available migrations
poetry run nes migration list
# List pending migrations
poetry run nes migration list --pending
For detailed migration information, see:
- Migration Contributor Guide - Creating migrations
- Migration Maintainer Guide - Reviewing and executing migrations
- Migration Architecture - System design
Development Workflow
Running the API Server
# Development mode with auto-reload
poetry run nes server dev
# Alternative: Use the nes-dev shortcut
poetry run nes-dev
# The API will be available at:
# - Documentation: http://localhost:8195/
# - API endpoints: http://localhost:8195/api/
# - OpenAPI docs: http://localhost:8195/docs
The nes-dev command is a convenient shortcut that starts the development server with live reload enabled. Any changes to the code will automatically restart the server, making it ideal for active development.
Running Tests
# Run all tests
poetry run pytest
# Run with coverage
poetry run pytest --cov=nes
# Run specific test file
poetry run pytest tests/services/test_publication_service.py
Code Quality
# Format code with Black
poetry run black nes/ tests/
# Sort imports with isort
poetry run isort nes/ tests/
# Lint with flake8
poetry run flake8 nes/ tests/
Data Maintenance
Using the Publication Service
The Publication Service provides a Pythonic interface for maintaining data:
from pathlib import Path
from nes.database.file_database import FileDatabase
from nes.services.publication import PublicationService
# Initialize services
db = FileDatabase(base_path="nes-db/v2")
pub_service = PublicationService(database=db)
# Create an entity
entity = await pub_service.create_entity(
entity_data={
"slug": "politician-name",
"type": "person",
"names": [{
"kind": "PRIMARY",
"en": {"full": "Full Name"},
"ne": {"full": "เคชเฅเคฐเคพ เคจเคพเคฎ"}
}]
},
author_id="author:human:your-name",
change_description="Added new politician"
)
# Update an entity
entity.attributes["position"] = "Minister"
updated = await pub_service.update_entity(
entity=entity,
author_id="author:human:your-name",
change_description="Updated position"
)
For detailed data maintenance, see the Data Maintainer Guide.
Scraping Service
The scraping service automates data collection using GenAI/LLM:
Prerequisites
# Install with scraping extras
poetry install --extras scraping
# Set up Google Cloud credentials (for Vertex AI)
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
Using the Scraper
from nes.services.scraping import WikipediaScraper, DataNormalizer
# Scrape Wikipedia
scraper = WikipediaScraper()
raw_data = await scraper.scrape_politician("Ram Chandra Poudel")
# Normalize to NES format
normalizer = DataNormalizer()
entity_data = await normalizer.normalize(raw_data)
Creating Migrations
Migrations are the recommended way to contribute data changes:
1. Create Migration Structure
mkdir -p migrations/001-add-new-politicians
cd migrations/001-add-new-politicians
2. Create Required Files
001-add-new-politicians/
โโโ migrate.py # Migration script
โโโ README.md # Documentation
โโโ data/ # Optional: supporting data files
3. Implement Migration
# migrate.py
from nes.services.publication import PublicationService
async def migrate(pub_service: PublicationService, dry_run: bool = False):
"""Add new politicians to the database."""
# Create entities
entity = await pub_service.create_entity(
entity_data={...},
author_id="author:human:contributor-name",
change_description="Added via migration 001"
)
if dry_run:
print(f"Would create entity: {entity.id}")
else:
print(f"Created entity: {entity.id}")
4. Submit Pull Request
See the Migration Contributor Guide for the complete workflow.
Contributing Guidelines
Code Style
- Follow PEP 8 style guidelines
- Use type hints for function signatures
- Write docstrings for public functions and classes
- Keep functions focused and small
Commit Messages
feat: Add new entity type for government bodies
fix: Correct relationship validation logic
docs: Update API guide with new endpoints
test: Add tests for search service
Pull Request Process
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Make your changes
- Run tests and linting
- Commit your changes
- Push to your fork
- Open a Pull Request
Package Extras
The project uses Poetry extras for optional dependencies:
# API only (FastAPI, Uvicorn)
poetry install --extras api
# Scraping only (Wikipedia, Google AI)
poetry install --extras scraping
# Everything
poetry install --extras all
Environment Variables
# Database path (file:// protocol required)
export NES_DB_URL=file:///absolute/path/to/nes-db/v2
# API server configuration
export NES_HOST=0.0.0.0
export NES_PORT=8195
# Google Cloud (for scraping)
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
Docker Deployment
# Build image
docker build -t nepal-entity-service .
# Run container
docker run -p 8195:8195 nepal-entity-service
# Run with custom database
docker run -p 8195:8195 \
-v /path/to/database:/app/nes-db/v2 \
-e NES_DB_URL=file:///app/nes-db/v2 \
nepal-entity-service
Resources
Documentation
- API Consumer Guide - Using the public API
- Data Models - Entity and relationship schemas
- Service Design - Architecture details
- Database Setup - Git submodule configuration
- Usage Examples - Code examples and notebooks
Migration Guides
- Migration Contributor Guide - Creating migrations
- Migration Maintainer Guide - Reviewing migrations
- Migration Architecture - System design
Data Maintenance
- Data Maintainer Guide - Local data maintenance
- Examples - Common usage patterns
Getting Help
- GitHub Issues: Report bugs or request features
- GitHub Discussions: Ask questions and share ideas
- Pull Requests: Contribute code or documentation
License
This project is open source. See the LICENSE file for details.