# Advanced Configuration

Comprehensive guide to fine-tuning **GTS-HolMirDas** for production environments and specific use cases.

## Environment Variables Reference

### Core Configuration

```bash
# GoToSocial Connection (Required)
GTS_SERVER_URL=https://your-gts-instance.tld
GTS_ACCESS_TOKEN=your_access_token_here

# Processing Control
MAX_POSTS_PER_RUN=25              # Posts per feed per run (1-100)
DELAY_BETWEEN_REQUESTS=1          # Seconds between API calls (0.5-5)
LOG_LEVEL=INFO                    # DEBUG, INFO, WARNING, ERROR

# File Configuration
RSS_URLS_FILE=/app/rss_feeds.txt  # Path to RSS feeds file

# Optional Features
HEALTHCHECK_URL=                  # Healthchecks.io ping URL
USER_AGENT=GTS-HolMirDas/1.1.0    # Custom User-Agent string
```
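
Since `GTS_SERVER_URL` and `GTS_ACCESS_TOKEN` are the only hard requirements, a small pre-flight check catches misconfiguration early. A minimal sketch — the `check_required_env` helper is illustrative, not part of GTS-HolMirDas:

```python
import os

REQUIRED = ("GTS_SERVER_URL", "GTS_ACCESS_TOKEN")

def check_required_env(env):
    """Return the names of required settings that are missing or empty."""
    return [name for name in REQUIRED if not env.get(name)]

missing = check_required_env(dict(os.environ))
if missing:
    print("missing required settings: " + ", ".join(missing))
```

Run it once at container start; an empty result means both required variables are set.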

### Advanced Processing Options

```bash
# Memory Management
DUPLICATE_CACHE_SIZE=10000  # Max URLs to cache (affects memory)
BATCH_SIZE=50               # Posts processed per batch

# Network Configuration
REQUEST_TIMEOUT=30          # Seconds to wait for RSS/API responses
MAX_RETRIES=3               # Retry attempts for failed requests
BACKOFF_FACTOR=2            # Exponential backoff multiplier

# Federation Control
INSTANCE_DISCOVERY=true     # Enable automatic instance discovery
MIN_INSTANCE_POSTS=5        # Minimum posts before counting instance
```
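
`MAX_RETRIES` and `BACKOFF_FACTOR` combine into an exponential retry schedule. A sketch of the resulting delays — the function name is illustrative, not taken from the GTS-HolMirDas codebase:

```python
import os

def backoff_delays(max_retries=None, base_delay=1.0, factor=None):
    """Seconds to wait before each retry attempt, growing exponentially."""
    if max_retries is None:
        max_retries = int(os.environ.get("MAX_RETRIES", "3"))
    if factor is None:
        factor = float(os.environ.get("BACKOFF_FACTOR", "2"))
    return [base_delay * factor ** attempt for attempt in range(max_retries)]

print(backoff_delays(3, 1.0, 2))  # → [1.0, 2.0, 4.0]
```

With the defaults above, a failing request waits 1 s, 2 s, then 4 s before giving up.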

### Production Hardening

```bash
# Security
VALIDATE_SSL=true         # Enforce SSL certificate validation
ALLOWED_DOMAINS=          # Comma-separated list of allowed RSS domains

# Resource Limits
MAX_FEED_SIZE=10MB        # Maximum RSS feed size to process
MAX_PROCESSING_TIME=1800  # Kill run after 30 minutes
MEMORY_LIMIT=512MB        # Container memory limit (Docker)

# Logging
LOG_FORMAT=json           # json, text
LOG_TO_FILE=true          # Enable file logging
LOG_RETENTION_DAYS=30     # Days to keep log files
```
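
`ALLOWED_DOMAINS` is a plain allow-list: an empty value means no restriction. A sketch of how such a filter can be applied to feed URLs — the `is_allowed` helper is illustrative:

```python
from urllib.parse import urlparse

def is_allowed(url, allowed_domains=""):
    """True if the URL's host appears in the comma-separated allow-list.
    An empty allow-list permits every domain."""
    allowed = {d.strip() for d in allowed_domains.split(",") if d.strip()}
    if not allowed:
        return True
    return urlparse(url).hostname in allowed

print(is_allowed("https://fosstodon.org/tags/homelab.rss",
                 "fosstodon.org,chaos.social"))  # → True
```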

## RSS Feed Strategies

### Feed Selection Methodology

#### High-Quality Instance Selection

**Tech-Focused Instances (Recommended):**

```bash
# Excellent signal-to-noise ratio
https://fosstodon.org/tags/homelab.rss?limit=100
https://infosec.exchange/tags/security.rss?limit=100
https://social.tchncs.de/tags/linux.rss?limit=75

# Specialized communities
https://chaos.social/tags/ccc.rss?limit=50
https://mas.to/tags/privacy.rss?limit=50
```

**Balanced General Instances:**

```bash
# Large instances with moderate limits
https://mastodon.social/tags/technology.rss?limit=50
https://mstdn.social/tags/programming.rss?limit=40
https://hachyderm.io/tags/devops.rss?limit=60
```

#### Hashtag Strategy

**Tier 1: Core Topics (use high limits)**

```bash
# Your primary interests - use limit=75-100
https://fosstodon.org/tags/homelab.rss?limit=100
https://fosstodon.org/tags/selfhosting.rss?limit=100
https://fosstodon.org/tags/docker.rss?limit=100
```

**Tier 2: Secondary Topics (moderate limits)**

```bash
# Related interests - use limit=50-75
https://mastodon.social/tags/linux.rss?limit=50
https://social.tchncs.de/tags/privacy.rss?limit=50
https://infosec.exchange/tags/cybersecurity.rss?limit=60
```

**Tier 3: Discovery Topics (conservative limits)**

```bash
# Exploration areas - use limit=25-40
https://mastodon.social/tags/photography.rss?limit=30
https://pixelfed.social/tags/art.rss?limit=25
```

### Feed Quality Assessment

#### Monitoring Feed Performance

```bash
# Check feed response times
curl -w "@curl-format.txt" -s -o /dev/null https://fosstodon.org/tags/homelab.rss

# curl-format.txt content:
#    time_namelookup: %{time_namelookup}\n
#       time_connect: %{time_connect}\n
#    time_appconnect: %{time_appconnect}\n
#   time_pretransfer: %{time_pretransfer}\n
#      time_redirect: %{time_redirect}\n
# time_starttransfer: %{time_starttransfer}\n
#         ----------\n
#         time_total: %{time_total}\n
```

#### Feed Quality Metrics

**High-Quality Indicators:**

- Response time < 2 seconds
- Consistent content updates
- Low duplicate rate with other feeds
- Active community engagement

**Red Flags:**

- Frequent timeouts or errors
- Very high duplicate rate
- Spam or low-quality content
- Instance frequently down

### Geographic and Language Considerations

```bash
# English-language instances
https://fosstodon.org/tags/homelab.rss?limit=75
https://hachyderm.io/tags/devops.rss?limit=50

# German-language instances
https://social.tchncs.de/tags/homelab.rss?limit=50
https://chaos.social/tags/34c3.rss?limit=25

# Multi-language instances
https://mastodon.social/tags/technology.rss?limit=40
```

## Production Deployment

### Docker Compose Production Configuration

```yaml
# compose.yml - resource limits and hardening
services:
  gts-holmirdas:
    image: gts-holmirdas:latest
    container_name: gts-holmirdas-prod
    restart: unless-stopped

    # Resource limits
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
        reservations:
          memory: 256M
          cpus: '0.25'

    # Security
    user: "1000:1000"
    read_only: true
    security_opt:
      - no-new-privileges:true

    # Networking
    networks:
      - gts-network

    # Environment
    env_file:
      - .env.production

    # Volumes
    volumes:
      - ./data:/app/data:rw
      - ./rss_feeds.txt:/app/rss_feeds.txt:ro
      - ./logs:/app/logs:rw
      - /tmp:/tmp:rw  # Required for read_only mode

    # Health check
    healthcheck:
      test: ["CMD", "python3", "-c", "import requests; requests.get('http://localhost:8080/health')"]
      interval: 30m
      timeout: 10s
      retries: 3
      start_period: 5m

    # Logging
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "5"
        compress: "true"

networks:
  gts-network:
    external: true
```

### Data Persistence Strategy

**Important data locations:**

- `data/processed_urls.json` - Processing history (prevents duplicates)
- `rss_feeds.txt` - RSS feed configuration
- `.env` - Environment configuration

### Production Environment Variables

```bash
# .env.production
GTS_SERVER_URL=https://social.yourdomain.com
GTS_ACCESS_TOKEN=prod_token_here

# Production tuning
MAX_POSTS_PER_RUN=50
DELAY_BETWEEN_REQUESTS=1
LOG_LEVEL=INFO
LOG_FORMAT=json
LOG_TO_FILE=true

# Security hardening
VALIDATE_SSL=true
REQUEST_TIMEOUT=30
MAX_RETRIES=2
MEMORY_LIMIT=512MB

# Monitoring
HEALTHCHECK_URL=https://hc-ping.com/your-production-uuid
```

### Monitoring and Alerting

#### Log Analysis Setup

```bash
# Structured logging for analysis
LOG_FORMAT=json

# Example log analysis with jq
docker logs gts-holmirdas 2>&1 | jq -r 'select(.level=="ERROR") | .message'

# Performance monitoring
docker logs gts-holmirdas 2>&1 | jq -r 'select(.posts_processed) | "\(.timestamp): \(.posts_processed) posts in \(.runtime)"'
```
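
The same filtering works without jq. A Python sketch that mirrors the ERROR filter above, assuming `LOG_FORMAT=json` with one JSON object per line:

```python
import json

def error_messages(log_lines):
    """Yield the message field of JSON log records with level ERROR."""
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines, e.g. container runtime noise
        if record.get("level") == "ERROR":
            yield record.get("message", "")

lines = [
    '{"level":"INFO","message":"run started"}',
    '{"level":"ERROR","message":"feed timeout"}',
    "plain text noise",
]
print(list(error_messages(lines)))  # → ['feed timeout']
```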

#### Metrics Collection

```bash
#!/bin/bash
# metrics-collector.sh - custom metrics script

STATS=$(docker logs gts-holmirdas --tail=100 2>&1 | grep "Run Statistics" -A 6)
RUNTIME=$(echo "$STATS" | grep "Runtime" | cut -d':' -f2- | tr -d ' ')
POSTS=$(echo "$STATS" | grep "Total posts" | cut -d':' -f2 | tr -d ' ')
INSTANCES=$(echo "$STATS" | grep "Current known" | cut -d':' -f2 | tr -d ' ')

# Send to monitoring system
curl -X POST "https://monitoring.yourdomain.com/metrics" \
  -H "Content-Type: application/json" \
  -d "{\"runtime\":\"$RUNTIME\",\"posts\":$POSTS,\"instances\":$INSTANCES}"
```

### Backup and Recovery

#### Automated Backup Script

```bash
#!/bin/bash
# backup-gts-holmirdas.sh

BACKUP_DIR="/backups/gts-holmirdas"
DATE=$(date +%Y%m%d_%H%M%S)
CONTAINER="gts-holmirdas"

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Stop container gracefully
docker compose stop gts-holmirdas

# Backup data directory
tar -czf "$BACKUP_DIR/data_$DATE.tar.gz" ./data

# Backup configuration
cp .env "$BACKUP_DIR/env_$DATE"
cp rss_feeds.txt "$BACKUP_DIR/rss_feeds_$DATE.txt"
cp compose.yml "$BACKUP_DIR/compose_$DATE.yml"

# Restart container
docker compose start gts-holmirdas

# Cleanup old backups (keep 30 days)
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +30 -delete
find "$BACKUP_DIR" -name "env_*" -mtime +30 -delete
find "$BACKUP_DIR" -name "rss_feeds_*" -mtime +30 -delete

echo "Backup completed: $DATE"
```

#### Recovery Procedure

```bash
#!/bin/bash
# restore-gts-holmirdas.sh - full recovery from backup

BACKUP_DATE=$1  # e.g., 20240115_143022

if [ -z "$BACKUP_DATE" ]; then
  echo "Usage: $0 <backup_date>"
  exit 1
fi

# Stop current container
docker compose down

# Restore data
tar -xzf "/backups/gts-holmirdas/data_$BACKUP_DATE.tar.gz"

# Restore configuration
cp "/backups/gts-holmirdas/env_$BACKUP_DATE" .env
cp "/backups/gts-holmirdas/rss_feeds_$BACKUP_DATE.txt" rss_feeds.txt
cp "/backups/gts-holmirdas/compose_$BACKUP_DATE.yml" compose.yml

# Restart with restored configuration
docker compose up -d

echo "Recovery completed from backup: $BACKUP_DATE"
```

## Multi-Instance Deployment

### Load Balancing Multiple Instances

For very large deployments, you can run multiple GTS-HolMirDas instances:

```yaml
# docker-compose-multi.yml
services:
  gts-holmirdas-1:
    image: gts-holmirdas:latest
    env_file: .env.1
    volumes:
      - ./data1:/app/data
      - ./feeds1.txt:/app/rss_feeds.txt:ro

  gts-holmirdas-2:
    image: gts-holmirdas:latest
    env_file: .env.2
    volumes:
      - ./data2:/app/data
      - ./feeds2.txt:/app/rss_feeds.txt:ro
```

Create the data directories with the correct ownership before the first run:

```bash
# For persistent storage
mkdir -p ./data1 ./data2
chown 1000:1000 ./data1 ./data2
```

### Feed Distribution Strategy

Split the feed list between instances by topic:

```bash
# feeds1.txt - Tech focus
https://fosstodon.org/tags/homelab.rss?limit=100
https://fosstodon.org/tags/docker.rss?limit=100
https://infosec.exchange/tags/security.rss?limit=100

# feeds2.txt - General topics
https://mastodon.social/tags/technology.rss?limit=50
https://hachyderm.io/tags/programming.rss?limit=50
https://social.tchncs.de/tags/linux.rss?limit=50
```

### Feed Categories & Organization

Organize your RSS feeds by content type:

```txt
# rss_feeds.txt - Organized by category

# Homelab & Self-hosting
https://mastodon.social/tags/homelab.rss
https://fosstodon.org/tags/selfhosting.rss

# Docker & Container Technology
https://social.tchncs.de/tags/docker.rss
https://mastodon.social/tags/kubernetes.rss

# Open Source & Development
https://fosstodon.org/tags/opensource.rss
https://hachyderm.io/tags/programming.rss
```

Monitor which feeds provide the best instance discovery:

```bash
# Check feed performance
grep "Successfully looked up" logs | \
  cut -d'/' -f3 | sort | uniq -c | sort -nr
```

## Integration with External Systems

### Webhook Integration

```bash
# Add to .env
WEBHOOK_URL=https://your-system.com/webhook/gts-holmirdas
WEBHOOK_SECRET=your_webhook_secret
```

Example webhook payload:

```json
{
  "timestamp": "2024-01-15T14:30:22Z",
  "runtime": "0:04:23",
  "posts_processed": 87,
  "instances_discovered": 12,
  "total_instances": 2847,
  "feeds_processed": 45,
  "success": true
}
```
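
`WEBHOOK_SECRET` implies authenticating deliveries. One common scheme is an HMAC-SHA256 signature over the JSON body, sent in a header the receiver verifies; this is a hypothetical sketch, since GTS-HolMirDas does not document its signing scheme:

```python
import hashlib
import hmac
import json

def sign_payload(payload, secret):
    """Hex HMAC-SHA256 over the canonical JSON body (hypothetical scheme)."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()

payload = {"posts_processed": 87, "success": True}
signature = sign_payload(payload, "your_webhook_secret")
print(signature)  # send e.g. as an X-Signature header alongside the payload
```

The receiver recomputes the HMAC over the raw body with the shared secret and compares the two values with a constant-time comparison (`hmac.compare_digest`).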

### Prometheus Metrics

```bash
#!/bin/bash
# prometheus-metrics.sh - custom metrics exporter

# Parse latest statistics
STATS=$(docker logs gts-holmirdas --tail=100 2>&1 | grep "Run Statistics" -A 6)

# Extract metrics
POSTS=$(echo "$STATS" | grep "Total posts" | grep -o '[0-9]\+')
INSTANCES=$(echo "$STATS" | grep "Current known" | grep -o '[0-9]\+')
RUNTIME_MIN=$(echo "$STATS" | grep "Runtime" | grep -o '[0-9]\+:[0-9]\+' | cut -d':' -f2)

# Export to Prometheus textfile format
cat > /tmp/gts-holmirdas-metrics.prom << EOF
# HELP gts_holmirdas_posts_processed_total Total posts processed in last run
# TYPE gts_holmirdas_posts_processed_total counter
gts_holmirdas_posts_processed_total $POSTS

# HELP gts_holmirdas_instances_known Total known fediverse instances
# TYPE gts_holmirdas_instances_known gauge
gts_holmirdas_instances_known $INSTANCES

# HELP gts_holmirdas_runtime_minutes Runtime of last processing run in minutes
# TYPE gts_holmirdas_runtime_minutes gauge
gts_holmirdas_runtime_minutes $RUNTIME_MIN
EOF
```

## Advanced Troubleshooting

### Performance Profiling

```bash
# Enable detailed profiling
LOG_LEVEL=DEBUG
PROFILE_ENABLED=true
PROFILE_OUTPUT_DIR=/app/profiles

# Analyze performance bottlenecks
docker compose exec gts-holmirdas python3 -m cProfile -o /app/profiles/profile.out /app/gts_holmirdas.py
```
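
If you cannot attach cProfile to the container, the same analysis works on any suspect function. A self-contained sketch of profiling with cProfile and printing the top entries with pstats — `hot_path` is just a stand-in:

```python
import cProfile
import io
import pstats

def hot_path():
    """Stand-in for an expensive section, e.g. feed parsing."""
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
hot_path()
profiler.disable()

# Print the five most expensive calls by cumulative time
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

For the dump produced by the `docker compose exec` command above, load `/app/profiles/profile.out` with `pstats.Stats(path)` instead of passing the profiler object.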

### Custom User Agent Configuration

```bash
# Avoid rate limiting by customizing User-Agent
USER_AGENT="GTS-HolMirDas/1.1.0 (+https://git.klein.ruhr/matthias/gts-holmirdas)"
```

### Network Optimization

```bash
# DNS caching for better performance
DNS_CACHE_TTL=300

# Connection pooling
CONNECTION_POOL_SIZE=10
CONNECTION_POOL_MAXSIZE=20
```
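
`DNS_CACHE_TTL` boils down to caching answers with an expiry time. A minimal TTL cache sketch — the class is illustrative, and the internal implementation may differ:

```python
import time

class TTLCache:
    """Tiny expiring cache, e.g. for resolved DNS answers."""

    def __init__(self, ttl):
        self.ttl = ttl  # lifetime in seconds (maps to DNS_CACHE_TTL)
        self._store = {}

    def get(self, key):
        hit = self._store.get(key)
        if hit is None:
            return None
        value, stored_at = hit
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]  # expired: drop the entry and miss
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, time.monotonic())

cache = TTLCache(ttl=300)
cache.put("fosstodon.org", "203.0.113.7")
print(cache.get("fosstodon.org"))  # → 203.0.113.7
```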

**High-value feed characteristics:**

- Active communities (20+ posts/day)
- Diverse user base (multiple instances)
- Technical content (better federation)
- Regular posting schedule

This advanced configuration guide should help you tune GTS-HolMirDas for any production environment or specific use case.