ChainOS Node Maintenance

Maintenance Overview

Regular maintenance is essential for keeping your ChainOS node running smoothly and securely. This guide covers best practices for updates, backups, and troubleshooting common issues.

Regular Maintenance Tasks

To keep the node performing well, carry out these tasks on a regular schedule:

Daily Tasks

  • Check node status and synchronization
  • Monitor system resources (CPU, RAM, disk, network)
  • Review logs for errors or warnings
  • Verify validator signing performance (for validators)

Weekly Tasks

  • Update operating system security patches
  • Check disk space usage and clean up if necessary
  • Review network peers and adjust if needed
  • Verify backup procedures are working
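
The weekly disk space check can be scripted so that it fails loudly instead of relying on someone reading `df` output. The following is a minimal sketch: the function names and the threshold value are illustrative assumptions, not part of ChainOS.

```shell
#!/bin/sh
# Print the used-space percentage (an integer without "%") for the
# filesystem that holds the given path.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Print a warning and return 1 when usage is at or above the threshold.
# Arguments: path to check, threshold in percent (hypothetical alert level).
check_disk() {
    used=$(disk_usage_percent "$1")
    if [ "$used" -ge "$2" ]; then
        echo "WARNING: $1 is ${used}% full (threshold $2%)"
        return 1
    fi
    echo "OK: $1 is ${used}% full"
}
```

For example, `check_disk "$HOME/.chainosd" 80` returns a non-zero status once the filesystem holding the node data is at least 80% full, which makes it easy to call from cron or a monitoring agent.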

Monthly Tasks

  • Perform full system updates
  • Test disaster recovery procedures
  • Review and optimize node configuration
  • Check for ChainOS software updates

Software Updates

Keeping your ChainOS node software up to date is critical for security and functionality:

Types of Updates

Update Process

Follow these steps to safely update your ChainOS node:

# 1. Stop the node service so the data directory does not change during the backup
sudo systemctl stop chainosd

# 2. Back up your node data
cp -r ~/.chainosd/data ~/.chainosd/data_backup_$(date +%Y%m%d)
cp -r ~/.chainosd/config ~/.chainosd/config_backup_$(date +%Y%m%d)

# 3. Update the software
cd ~/ChainOS--Mainnet
git fetch --all
git checkout v1.5.05  # Replace with the target version
make install

# 4. Verify the installation
chainosd version

# 5. Start the node service
sudo systemctl start chainosd

# 6. Monitor the logs
sudo journalctl -u chainosd -f --output cat

Important Update Notes

For major updates and hard forks:

  • Always read the release notes carefully before updating
  • Check for specific upgrade instructions that may differ from the standard process
  • Be aware of the scheduled upgrade height or time
  • For validators, coordinate with the community to ensure a smooth transition
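
When an upgrade is scheduled by block height, it helps to translate the height into an approximate wall-clock time. The sketch below does the arithmetic (remaining blocks multiplied by the average block time). The block time is an assumption you need to measure on the live network, and the function name is hypothetical.

```shell
#!/bin/sh
# Estimate how many seconds remain until a scheduled upgrade height.
# Arguments: current height, upgrade height, average block time in seconds
# (the block time is an assumed average and may be fractional, e.g. 5.5).
seconds_until_height() {
    awk -v current="$1" -v target="$2" -v block_time="$3" 'BEGIN {
        remaining = target - current
        if (remaining < 0) remaining = 0   # upgrade height already passed
        printf "%d\n", remaining * block_time
    }'
}
```

With a current height of 1000000, an upgrade height of 1014400, and a 6-second block time, `seconds_until_height 1000000 1014400 6` prints 86400, which is roughly 24 hours. Real block times drift, so recheck the estimate as the height approaches.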

Backups and Data Management

Regular backups are essential for node recovery in case of failures:

Critical Files to Back Up

At a minimum, back up the files referenced throughout this guide:

  • ~/.chainosd/config/priv_validator_key.json (validator signing key)
  • ~/.chainosd/data/priv_validator_state.json (validator signing state)
  • ~/.chainosd/config/config.toml and ~/.chainosd/config/app.toml (node configuration)

Backup Strategies

Implement these backup strategies for comprehensive protection:

Automated Scheduled Backups

Set up a cron job to perform regular backups:

# Add to crontab (crontab -e)
# Daily backup of critical files at 2:00 AM
0 2 * * * tar -czf /backup/chainosd_config_$(date +\%Y\%m\%d).tar.gz ~/.chainosd/config
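
A one-line cron entry gives no signal when the archive is truncated or the destination is missing. As a hedged alternative, the cron job could call a small script that creates the archive and then reads it back. The function name below is hypothetical, and the archive naming simply mirrors the cron line above.

```shell
#!/bin/sh
# Create a dated archive of a directory and confirm it is readable.
# Arguments: directory to back up, destination directory for archives.
# Prints the archive path on success; returns non-zero on any failure.
backup_dir() {
    archive="$2/chainosd_config_$(date +%Y%m%d).tar.gz"
    mkdir -p "$2" || return 1
    # tar stores absolute paths without the leading "/", as in the cron line.
    tar -czf "$archive" "$1" || return 1
    # Listing the archive reads it end to end, which catches truncated files.
    tar -tzf "$archive" >/dev/null || return 1
    echo "$archive"
}
```

A call such as `backup_dir "$HOME/.chainosd/config" /backup` could replace the `tar` command in the crontab entry, with the exit status fed into whatever alerting you already use.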

Cold Storage for Private Keys

For validator nodes, store a backup of your private keys in a secure offline location:

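# Export keys to an encrypted file
tar -czf validator_keys.tar.gz ~/.chainosd/config/priv_validator_key.json
gpg -c validator_keys.tar.gz  # Writes validator_keys.tar.gz.gpg

# Remove the unencrypted archive once the encrypted copy exists
shred -u validator_keys.tar.gz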

# Store the encrypted file on multiple secure media (USB drives, etc.)
# Store the encryption password separately
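
Copies kept on several media are only useful if each one is still intact. One way to check this, sketched below, is to store a SHA-256 checksum beside the encrypted file and verify each copy against it. This assumes GNU coreutils `sha256sum` is available; the function names are hypothetical.

```shell
#!/bin/sh
# Record a SHA-256 checksum next to a file, as <file>.sha256.
record_checksum() {
    ( cd "$(dirname "$1")" &&
      sha256sum "$(basename "$1")" > "$(basename "$1").sha256" )
}

# Verify a file against its recorded checksum.
# Returns non-zero when the file is missing or its contents have changed.
verify_checksum() {
    ( cd "$(dirname "$1")" &&
      sha256sum -c "$(basename "$1").sha256" >/dev/null 2>&1 )
}
```

Running `record_checksum validator_keys.tar.gz.gpg` once and `verify_checksum` on each stored copy during the monthly disaster recovery test gives an early warning of media corruption without ever decrypting the keys.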

Remote Backups

Store backups in a remote location for disaster recovery:

# Using rsync to a remote server
rsync -avz --delete ~/.chainosd/config/ user@backup-server:/backup/chainosd/config/

# Or using a cloud storage service
rclone copy ~/.chainosd/config remote:chainosd-backups/config

Data Pruning

To manage disk space, remove old log files and old backups on a schedule:

# Clean up old log files (older than 7 days)
find ~/.chainosd/logs -name "*.log" -type f -mtime +7 -delete

# Clean up old backups (older than 30 days)
find /backup -name "chainosd_*.tar.gz" -type f -mtime +30 -delete
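
Because `find ... -delete` is irreversible, it can be worth previewing what a retention rule matches before enforcing it. The sketch below wraps the same `find` expressions used above in two hypothetical helper functions, one that only lists and one that deletes.

```shell
#!/bin/sh
# List files under a directory that match a name pattern and are older
# than the given number of days. Nothing is deleted.
# Arguments: directory, name pattern, age in days.
list_old_files() {
    find "$1" -name "$2" -type f -mtime +"$3" -print
}

# Delete the same set of files that list_old_files would print.
prune_old_files() {
    find "$1" -name "$2" -type f -mtime +"$3" -delete
}
```

For example, run `list_old_files /backup 'chainosd_*.tar.gz' 30` first, check the output, and only then run `prune_old_files` with the same arguments.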

Node Recovery

In case of node failure, follow these recovery procedures:

Simple Recovery

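For issues where the configuration and keys are intact and the local chain data can be discarded and resynced:

# Stop the node
sudo systemctl stop chainosd

# Reset the data directory (this deletes the local blockchain data; the node resyncs after restart)
# Validators: keep a copy of data/priv_validator_state.json, because the reset also clears the signing state
chainosd unsafe-reset-all --home=$HOME/.chainosd --keep-addr-book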

# Start the node
sudo systemctl start chainosd
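
On Tendermint-based nodes, `unsafe-reset-all` resets `data/priv_validator_state.json` along with the chain data; this guide does not confirm that ChainOS behaves the same way, so treat it as an assumption. For validators, a saved copy of that file makes it possible to restore the last signed height and reduces the risk of double-signing. A minimal sketch, using a hypothetical helper function:

```shell
#!/bin/sh
# Copy the validator signing state to a dated file before a reset.
# Argument: node home directory (this guide uses $HOME/.chainosd).
# The copy is written outside data/ so that a reset does not remove it.
backup_validator_state() {
    state="$1/data/priv_validator_state.json"
    if [ ! -f "$state" ]; then
        echo "no signing state found at $state" >&2
        return 1
    fi
    copy="$1/priv_validator_state_$(date +%Y%m%d%H%M%S).json"
    cp -p "$state" "$copy" && echo "$copy"
}
```

Calling `backup_validator_state "$HOME/.chainosd"` after stopping the service and before the reset prints the path of the saved copy.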

Full Recovery from Backup

For complete node recovery:

# Stop the node
sudo systemctl stop chainosd

# Restore configuration files (the archive stores paths relative to /, e.g. home/<user>/.chainosd/config)
tar -xzf /backup/chainosd_config_20250515.tar.gz -C /

# Reset the data directory
chainosd unsafe-reset-all --home=$HOME/.chainosd --keep-addr-book

# Start the node and let it sync
sudo systemctl start chainosd

Fast Recovery with State Sync

For faster recovery using state sync:

# Stop the node
sudo systemctl stop chainosd

# Reset the data directory
chainosd unsafe-reset-all --home=$HOME/.chainosd

# Configure state sync in config.toml
# Get a recent trusted block height and hash
LATEST_HEIGHT=$(curl -s https://chainos.network:26657/block | jq -r .result.block.header.height)
TRUST_HEIGHT=$((LATEST_HEIGHT - 2000))
TRUST_HASH=$(curl -s "https://chainos.network:26657/block?height=$TRUST_HEIGHT" | jq -r .result.block_id.hash)

# Update config.toml with state sync settings
sed -i.bak -E "s|^(enable[[:space:]]+=[[:space:]]+).*$|\1true| ; \
s|^(rpc_servers[[:space:]]+=[[:space:]]+).*$|\1\"chainos.network:26657,chainos-backup.network:26657\"| ; \
s|^(trust_height[[:space:]]+=[[:space:]]+).*$|\1$TRUST_HEIGHT| ; \
s|^(trust_hash[[:space:]]+=[[:space:]]+).*$|\1\"$TRUST_HASH\"| ; \
s|^(trust_period[[:space:]]+=[[:space:]]+).*$|\1\"168h\"| ; \
" $HOME/.chainosd/config/config.toml

# Start the node
sudo systemctl start chainosd
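
If the RPC endpoint is unreachable, `jq -r` prints `null` and the `sed` command above silently writes unusable values into config.toml. A small sanity check before the `sed` step can catch this. The sketch below assumes block hashes are SHA-256 values rendered as 64 hexadecimal characters, as on Tendermint-based chains; the function name is hypothetical.

```shell
#!/bin/sh
# Validate state sync trust parameters before writing them to config.toml.
# Arguments: trust height, trust hash. Returns non-zero if either is unusable.
validate_trust_params() {
    case "$1" in
        ''|*[!0-9]*)
            echo "invalid trust height: '$1'" >&2
            return 1 ;;
    esac
    if [ "$1" -le 0 ]; then
        echo "trust height must be positive" >&2
        return 1
    fi
    # Assumption: the hash is 32 bytes, printed as 64 hex characters.
    if ! printf '%s' "$2" | grep -Eq '^[0-9A-Fa-f]{64}$'; then
        echo "invalid trust hash: '$2'" >&2
        return 1
    fi
}
```

In the procedure above it would be called as `validate_trust_params "$TRUST_HEIGHT" "$TRUST_HASH" || exit 1` right after the two `curl` commands. It also rejects the negative height that results when the chain is fewer than 2000 blocks old.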

Troubleshooting Common Issues

Here are solutions for common node issues:

Node Not Syncing

Symptoms: Node starts but doesn't sync with the network.

Solutions:

  1. Check network connectivity:
    curl -s localhost:26657/net_info | jq '.result.n_peers'
  2. Add more persistent peers in config.toml:
    persistent_peers = "2b89c755963a03a2a2c846d5efb97c06e6d2cdfe@chainos.network:26656,3b89c755963a03a2a2c846d5efb97c06e6d2cdfe@chainos-backup.network:26656"
  3. Check for firewall issues:
    sudo ufw status
    sudo ufw allow 26656/tcp
  4. Consider using state sync for faster synchronization
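
A typo in `persistent_peers` is a common reason for a node to sit at zero peers. The sketch below checks that each entry has the `nodeid@host:port` shape used in the example above, with a 40-character lowercase hexadecimal node ID. It validates format only and cannot tell whether a peer is reachable; the function names are hypothetical.

```shell
#!/bin/sh
# Return 0 when a single peer entry looks like <40 hex chars>@<host>:<port>.
valid_peer() {
    printf '%s' "$1" | grep -Eq '^[0-9a-f]{40}@[A-Za-z0-9.-]+:[0-9]+$'
}

# Check every entry in a comma-separated peer list.
# Prints each malformed entry and returns 1 if any were found.
check_peers() {
    status=0
    old_ifs=$IFS
    IFS=,
    for peer in $1; do
        if ! valid_peer "$peer"; then
            echo "malformed peer: $peer"
            status=1
        fi
    done
    IFS=$old_ifs
    return $status
}
```

Pass the value of `persistent_peers` from config.toml (without the surrounding quotes) as the single argument to `check_peers`.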

Out of Disk Space

Symptoms: Node crashes with disk space errors.

Solutions:

  1. Check disk usage:
    df -h
    du -h --max-depth=1 ~/.chainosd
  2. Enable pruning in app.toml:
    pruning = "custom"
    pruning-keep-recent = "100"
    pruning-keep-every = "0"
    pruning-interval = "10"
  3. Clean up old log files:
    find ~/.chainosd/logs -name "*.log" -type f -mtime +7 -delete
  4. Add more storage or migrate to a larger disk

High CPU/Memory Usage

Symptoms: Node consumes excessive system resources.

Solutions:

  1. Check resource usage:
    top
    htop
  2. Adjust database settings in config.toml:
    db_backend = "goleveldb"  # Try different backends; switching backends requires resyncing the node
    indexer = "null"  # Disable indexing if not needed
  3. Limit connections in config.toml:
    max_num_inbound_peers = 30
    max_num_outbound_peers = 10
  4. Upgrade hardware resources if necessary

Validator Missing Blocks

Symptoms: Validator node misses blocks and risks being jailed.

Solutions:

  1. Check validator status (the query expects the operator address, not the consensus address):
    chainosd query staking validator $(chainosd keys show validator-wallet --bech val -a)
  2. Check if the node is synced:
    chainosd status | jq '.SyncInfo'
  3. Ensure the validator key is correct:
    ls -la ~/.chainosd/config/priv_validator_key.json
    ls -la ~/.chainosd/data/priv_validator_state.json
  4. Check for network connectivity issues
  5. If jailed, unjail the validator:
    chainosd tx slashing unjail --from=validator-wallet
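
To judge how close a validator is to being jailed, the number of missed blocks can be expressed as an uptime percentage over the signing window. The sketch below only does the arithmetic. It assumes you already have a missed-block count and a window size, for example from the slashing module's signing info and parameters on Cosmos SDK chains; whether ChainOS exposes them in the same way is an assumption.

```shell
#!/bin/sh
# Compute signing uptime as a percentage with two decimal places.
# Arguments: blocks missed within the window, window size in blocks.
uptime_percent() {
    if [ "$2" -le 0 ]; then
        echo "window must be positive" >&2
        return 1
    fi
    awk -v missed="$1" -v window="$2" \
        'BEGIN { printf "%.2f\n", (1 - missed / window) * 100 }'
}
```

For instance, 50 missed blocks in a 10000-block window gives `uptime_percent 50 10000`, which prints 99.50.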

Performance Optimization

Optimize your node's performance with these tips:

System-Level Optimization

# Optimize system for network performance
# Add to /etc/sysctl.conf
net.core.somaxconn=1024
net.core.netdev_max_backlog=5000
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_wmem=4096 12582912 16777216
net.ipv4.tcp_rmem=4096 12582912 16777216
net.ipv4.tcp_max_syn_backlog=8096
net.ipv4.tcp_slow_start_after_idle=0
net.ipv4.tcp_tw_reuse=1

# Apply changes
sudo sysctl -p

Application-Level Optimization

# Optimized config.toml settings for validators
[p2p]
max_num_inbound_peers = 60
max_num_outbound_peers = 20
send_rate = 5120000
recv_rate = 5120000

[mempool]
size = 10000
cache_size = 20000

[tx_index]
indexer = "null"  # Disable indexing for validators

# Optimized app.toml settings (pruning keys are top-level, as in the pruning example under Out of Disk Space)
pruning = "custom"
pruning-keep-recent = "100"
pruning-keep-every = "0"
pruning-interval = "10"

Maintenance Schedule Example

Here's a sample maintenance schedule for a ChainOS node:

Daily (Automated)

  • Check node status and synchronization:
    chainosd status | jq '.SyncInfo'
  • Monitor system resources:
    df -h
    free -m
    top -b -n 1
  • Check logs for errors:
    grep -i error /var/log/syslog
    journalctl -u chainosd -n 100 | grep -i error
  • Backup configuration files:
    tar -czf /backup/chainosd_config_$(date +%Y%m%d).tar.gz ~/.chainosd/config
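
For the automated daily run, the log check is easier to act on when it yields a count and an exit status instead of raw `grep` output. A minimal sketch follows; the function names and the threshold are illustrative assumptions.

```shell
#!/bin/sh
# Count lines on standard input that contain "error" (case-insensitive).
# Always prints a number, including 0 when nothing matches.
count_errors() {
    grep -ci 'error' || true
}

# Read a log stream on standard input and return 1 when the number of
# error lines exceeds the allowed maximum (first argument).
report_errors() {
    n=$(count_errors)
    if [ "$n" -gt "$1" ]; then
        echo "ALERT: $n error lines (allowed $1)"
        return 1
    fi
    echo "OK: $n error lines"
}
```

A cron job could run `journalctl -u chainosd -n 100 --no-pager | report_errors 5` and send a notification when the command exits with a non-zero status.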

Weekly (Manual)

  • Update operating system:
    sudo apt update
    sudo apt upgrade -y
  • Check for ChainOS updates:
    cd ~/ChainOS--Mainnet
    git fetch --all
    git tag -l | sort -V | tail -n 5
  • Clean up old log files:
    find ~/.chainosd/logs -name "*.log" -type f -mtime +7 -delete
  • Test backup restoration:
    mkdir -p ~/test-restore
    tar -xzf /backup/chainosd_config_$(date +%Y%m%d).tar.gz -C ~/test-restore
    diff -r ~/test-restore"$HOME"/.chainosd/config ~/.chainosd/config  # The archive stores the full path below /

Monthly (Manual)

  • Full system maintenance:
    sudo apt update
    sudo apt full-upgrade -y
    sudo apt autoremove -y
    sudo apt autoclean
  • Review and optimize configuration:
    nano ~/.chainosd/config/config.toml
    nano ~/.chainosd/config/app.toml
  • Check disk health:
    sudo smartctl -a /dev/sda
    sudo badblocks -sv /dev/sda
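  • Perform full backup (stop the node first so the database is copied in a consistent state):
    sudo systemctl stop chainosd
    tar -czf /backup/chainosd_full_$(date +%Y%m%d).tar.gz ~/.chainosd
    sudo systemctl start chainosd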

Need Help?

If you need assistance with node maintenance, join our Discord community where our team and other node operators can help.