Compute Instance Issues

Instance Fails to Start

High Impact

Your instance remains in a "Starting" or "Error" state and fails to become available after several minutes.

Possible Causes

Resource availability constraints in the selected region
Issues with the base image or snapshot
Insufficient quota for the instance type
Infrastructure maintenance or outage

Resolution Steps

Check the instance status details
Navigate to the instance details page and check the status message for specific error information.
Verify your account quota
Go to Account → Quotas to ensure you haven't exceeded your instance limit for the region or instance type.
Try a different region
If resources are constrained in your selected region, try deploying the instance in a different region.
Check the system status page
Visit status.slyd.cloud to see if there are any ongoing infrastructure issues.
Restart the instance creation process
Delete the failed instance and create a new one, potentially with a different image or size.

Diagnostic Information

Check the Events tab in your instance details for specific error codes. Common error codes include:

INSUFFICIENT_CAPACITY: The region doesn't have enough resources available
QUOTA_EXCEEDED: You've reached your account limit for this resource type
IMAGE_CORRUPT: The selected image has issues and can't be deployed
HARDWARE_FAILURE: Physical hardware issues in the data center
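
If you script around failed launches, the codes above can be mapped to the matching resolution step. The sketch below is illustrative only: `suggest_action` is a hypothetical helper, not part of any SLYD tooling, and the suggestions restate the steps from this page (the HARDWARE_FAILURE advice is an assumption).

```shell
# suggest_action CODE
# Hypothetical helper: maps an Events-tab error code to a resolution step
# described on this page. Not an official SLYD command.
suggest_action() {
    case "$1" in
        INSUFFICIENT_CAPACITY) echo "Try a different region or instance size" ;;
        QUOTA_EXCEEDED)        echo "Check Account > Quotas for the region and instance type" ;;
        IMAGE_CORRUPT)         echo "Recreate the instance from a different image or snapshot" ;;
        # Assumption: hardware faults cannot be fixed by the user.
        HARDWARE_FAILURE)      echo "Check status.slyd.cloud, then contact support" ;;
        *)                     echo "Unknown code: contact support with the Events tab output" ;;
    esac
}
```

For example, `suggest_action QUOTA_EXCEEDED` prints the quota check step.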

When to Contact Support

If you've tried all the steps above and your instance still fails to start, contact support with the following information:

Instance ID
Region
Instance type
Time of failure
Error codes from the Events tab

Instance Suddenly Stopped or Restarted

High Impact

A running instance stopped or restarted unexpectedly, or keeps rebooting.

Possible Causes

Resource exhaustion (CPU, memory, disk)
Host system maintenance or issues
Operating system crashes or kernel panics
Security systems terminating problematic processes
Payment or billing issues leading to service suspension

Resolution Steps

Check instance metrics
Review resource usage metrics to identify potential resource exhaustion issues.

Review system logs

Connect to the instance and check system logs for error messages, OOM (Out of Memory) events, or crash reports.

# Check system logs
sudo journalctl -xb -p err

# Check for out of memory events
sudo grep -i "out of memory" /var/log/syslog

# Check kernel logs
sudo dmesg | grep -i error
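
To see which processes the kernel killed for lack of memory, the log output can be filtered further. This is a minimal sketch: `oom_victims` is a hypothetical helper name, and it assumes the common kernel message form "Out of memory: Killed process PID (NAME)", which can differ between kernel versions.

```shell
# oom_victims: read kernel log text on stdin and print "PID NAME" for each
# process terminated by the OOM killer.
# Assumes messages such as
#   "Out of memory: Killed process 1234 (node) ..."
# (older kernels print "Kill process"); lines in other formats are ignored.
oom_victims() {
    sed -n 's/.*Out of memory: Kill[a-z]* process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Usage on a live system:
#   sudo dmesg | oom_victims
#   sudo journalctl -k -b | oom_victims
```

The same process name appearing repeatedly is a hint that the application needs a memory limit or a larger instance.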

Verify billing status
Check your account billing page to ensure no payment issues or account suspensions.

Check resource usage and limits

Ensure your application isn't exceeding the instance's resource limits.

# Check current resource usage
htop

# Check disk space
df -h

# Check for processes consuming excessive resources
ps aux --sort=-%mem | head -10
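
A full disk is a common cause of unexpected stops, so it can help to flag nearly full filesystems automatically. The sketch below assumes the POSIX `df -P` column layout and mount points without spaces; `filesystems_over` is a hypothetical helper name.

```shell
# filesystems_over THRESHOLD
# Read `df -P` output on stdin and print "MOUNTPOINT USE%" for every
# filesystem whose usage is at or above THRESHOLD percent.
# Assumes the POSIX layout: capacity in column 5, mount point in column 6.
filesystems_over() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit) print $6 " " use "%"
    }'
}

# Usage:
#   df -P | filesystems_over 90
```

No output means every filesystem is below the threshold.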

Enable crash reporting
Configure the operating system to preserve crash dumps (for example, with kdump on Linux) so that a dump is available for analysis after a kernel panic.

Diagnostic Information

The instance Events tab will show restart events and any automated actions taken by the platform. Key files to examine include:

/var/log/syslog: General system logs
/var/log/kern.log: Kernel-related messages
/var/crash/: System crash dumps (if enabled)
~/.pm2/logs/: PM2 process manager logs (if you run Node.js applications under PM2)
/var/log/nginx/: Nginx web server logs

Preventative Measures

Implement proper resource monitoring with alerts for high usage
Configure auto-scaling for applications with variable loads
Use load testing to identify resource bottlenecks before production
Implement proper error handling in applications
Keep operating systems and applications updated

Instance Running Slowly

Medium Impact

Your instance is operational but experiencing poor performance, high latency, or slow response times.

Possible Causes

Resource contention (CPU, memory, disk I/O)
Network congestion or limitations
Inefficient application code or configuration
Database performance issues
Background processes consuming resources
Instance size inadequate for workload

Resolution Steps

Identify resource bottlenecks

Use monitoring tools to identify which resources are constrained.

# Check CPU, memory, and process information
htop

# Check disk I/O performance
iostat -x 1

# Check network performance (iftop needs root to capture packets)
sudo iftop
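
One quick way to judge CPU pressure is to compare the load average with the number of cores. This is a rough sketch under the assumption that a sustained value well above 1.00 per core indicates contention; `load_per_core` is a hypothetical helper name. On Linux the load average also counts tasks waiting on disk I/O, so a high value can point to storage rather than CPU.

```shell
# load_per_core LOAD CORES
# Divide a load average by the CPU count and print it with two decimals.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Usage on Linux (1-minute load from /proc/loadavg, core count from nproc):
#   load_per_core "$(cut -d " " -f 1 /proc/loadavg)" "$(nproc)"
```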

Optimize application configuration
Adjust application settings to better utilize available resources.
Check for resource-intensive processes
Identify and optimize or terminate unnecessary processes.
Optimize database operations
Review and optimize database queries, indexes, and connection pooling.
Resize the instance
If the workload consistently exceeds current resources, upgrade to a larger instance size.

Performance Analysis Tools

SLYD provides several built-in tools to help diagnose performance issues:

Resource Monitoring Dashboard: Real-time metrics for CPU, memory, disk, and network
Performance Insights: Historical performance data and trend analysis
Benson Performance Advisor: AI-powered recommendations for performance optimization

Additionally, you can install these common performance analysis tools:

# Install common performance tools
sudo apt update
sudo apt install htop iotop iftop sysstat netdata

Benson Recommends

"Use the Performance Score feature in your instance dashboard to get an overall health rating and specific recommendations. Scores below 70 indicate potential issues that should be addressed."

"Consider enabling auto-scaling if your workload has variable demand patterns. This allows your resources to adjust automatically based on actual usage."

Networking Issues

Cannot Connect to Instance

High Impact

Unable to connect to an instance via SSH, web interface, or application endpoints.

Possible Causes

Instance firewall rules blocking connections
Cloudflare tunnel configuration issues
SSH key or authentication problems
Instance network interface issues
Application or service not running on expected port
DNS propagation delays

Resolution Steps

Verify instance status
Ensure the instance is in "Running" state and has passed all health checks.
Check tunnel status
Navigate to Networking → Tunnels to verify the tunnel is active and properly configured.
Verify firewall rules
Check that your instance's firewall allows traffic on the required ports.
Test connection using SLYD Console
Use the web-based terminal in the SLYD Console to access your instance directly, bypassing external networking.

Check service status

If you can access the instance but not a specific service, verify the service is running:

# Check listening ports (ss replaces netstat, which many current distributions no longer install by default)
sudo ss -tulpn

# Check SSH service
sudo systemctl status ssh  # the unit is named sshd on RHEL-family distributions

# Check web server
sudo systemctl status nginx  # or apache2, etc.
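
To check for one specific port in a script, the socket listing can be filtered. The sketch below is an illustration that assumes `ss -tln` style output, where the local address ends in `:PORT`; `port_listening` is a hypothetical helper name.

```shell
# port_listening PORT
# Read `ss -tln` output on stdin; exit 0 if any address field ends in
# ":PORT", otherwise exit 1. The peer column shows "*" as its port for
# listening sockets, so it never matches a numeric port.
port_listening() {
    awk -v port="$1" '{
        for (i = 1; i <= NF; i++)
            if ($i ~ (":" port "$")) found = 1
    }
    END { exit found ? 0 : 1 }'
}

# Usage:
#   ss -tln | port_listening 22 && echo "something is listening on port 22"
```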

Diagnostic Information

If you cannot connect via SSH, try these diagnostic steps:

Enable verbose SSH logging

# Add -v (up to -vvv for maximum verbosity)
# Replace USER and HOSTNAME with your login name and instance address
ssh -v USER@HOSTNAME

Check DNS resolution

# Verify DNS resolution
nslookup instance-xyz.slyd.dev

# Check that port 22 is reachable, with a 5-second timeout (options go before the host)
nc -zv -w 5 instance-xyz.slyd.dev 22
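
The final error line printed by `ssh` usually points to one of the causes listed earlier. The sketch below maps common OpenSSH client messages to a likely area to investigate; `classify_ssh_error` and its category labels are hypothetical, and the mapping is a starting point rather than a definitive diagnosis.

```shell
# classify_ssh_error MESSAGE
# Map a common OpenSSH client error message to a likely problem area.
classify_ssh_error() {
    case "$1" in
        *"Could not resolve hostname"*) echo "dns" ;;
        # Host reachable, but nothing accepts connections on the port.
        *"Connection refused"*)         echo "service-or-port" ;;
        # Packets are dropped on the way: firewall rule, tunnel, or routing.
        *"timed out"*)                  echo "firewall-or-network" ;;
        *"Permission denied"*)          echo "authentication" ;;
        *)                              echo "unknown" ;;
    esac
}

# Usage (USER and HOSTNAME are placeholders):
#   classify_ssh_error "$(ssh -o BatchMode=yes -o ConnectTimeout=5 USER@HOSTNAME true 2>&1)"
```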

Connection Troubleshooting Flow

Slow Network Performance

Medium Impact

Instance is experiencing high latency, slow data transfer rates, or intermittent connectivity.

Possible Causes

Geographic distance between client and instance region
Network congestion in data center or Cloudflare network
Bandwidth throttling or limitations
Large data transfers saturating available bandwidth
DNS or CDN configuration issues
DDoS protection temporarily limiting connections

Resolution Steps

Run network diagnostics

Test network speed and latency from your instance.

# Install network tools
sudo apt install speedtest-cli mtr

# Run speed test
speedtest-cli

# Run traceroute with timing
mtr -rw google.com

Check for network-intensive processes

Identify processes that might be consuming excessive bandwidth.

# Monitor network usage by process
sudo apt install nethogs
sudo nethogs

Optimize content delivery
For web applications, enable compression and caching.
Consider a different region
If latency is consistently high, consider deploying to a region closer to your users.

Network Performance Benchmarking

Compare your network performance against SLYD benchmarks:

Metric	Expected Range	Poor Performance
Download Speed	500-2000 Mbps	< 200 Mbps
Upload Speed	400-1500 Mbps	< 150 Mbps
Latency (Same Region)	1-5 ms	> 10 ms
Latency (Cross-Region)	20-100 ms	> 150 ms
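
The latency rows of the table can be turned into a simple check for monitoring scripts. This is a sketch only: `classify_latency` is a hypothetical helper, the thresholds are copied from the table, and the "borderline" label for values between the expected range and the poor threshold is introduced here for illustration.

```shell
# classify_latency SCOPE MILLISECONDS
# SCOPE is "same-region" or "cross-region".
# Prints "expected", "borderline", or "poor".
classify_latency() {
    case "$1" in
        same-region)  good=5;   poor=10 ;;
        cross-region) good=100; poor=150 ;;
        *) echo "unknown scope: $1" >&2; return 2 ;;
    esac
    awk -v ms="$2" -v good="$good" -v poor="$poor" 'BEGIN {
        if (ms > poor)       print "poor"
        else if (ms <= good) print "expected"
        else                 print "borderline"
    }'
}
```

For example, `classify_latency same-region 12` prints `poor`, because 12 ms exceeds the 10 ms threshold.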

Network Optimization Tips

Enable HTTP/2 for web applications to improve connection efficiency
Use a content delivery network (CDN) for static assets
Implement proper caching headers for web content
Consider enabling Cloudflare's Argo Smart Routing for improved performance
Schedule large data transfers during off-peak hours
Compress data before transmission for large file transfers

Application Issues

Application Deployment Failure

Medium Impact

An application from the marketplace cannot be deployed, or a custom deployment fails.

Possible Causes

Insufficient resources for the application
Dependency conflicts or missing prerequisites
Network issues during package download
Incompatible application versions
Incorrect configuration parameters

Resolution Steps

Review deployment logs
Check the deployment logs for specific error messages.
Verify resource requirements
Ensure your instance meets the minimum requirements for the application.
Check for dependency conflicts
If deploying multiple applications, ensure they don't have conflicting dependencies.
Try manual installation
If marketplace deployment fails, try installing the application manually to identify specific issues.
Check application compatibility
Verify the application is compatible with your instance's operating system and environment.

Common Deployment Errors

ERROR_INSUFFICIENT_MEMORY

The instance does not have enough memory to run the application.

Solution: Resize your instance to a larger memory configuration or optimize the application's memory usage.

ERROR_DEPENDENCY_CONFLICT

The application has dependency conflicts with existing software.

Solution: Use container isolation or virtual environments to separate applications with conflicting dependencies.

ERROR_NETWORK_TIMEOUT

Network timeout occurred during package download.

Solution: Check network connectivity, try again later, or use a mirror repository if available.
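
For transient failures such as ERROR_NETWORK_TIMEOUT, "try again later" can be automated with a retry loop. The sketch below is one possible approach, with a delay that doubles after each failed attempt; `retry` is a hypothetical helper, and it uses plain (global) shell variables to stay POSIX-compatible.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or ATTEMPTS is reached, doubling the
# delay after each failure. Returns 0 on success, 1 after giving up.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        echo "attempt $n failed; retrying in ${delay}s" >&2
        sleep "$delay"
        n=$((n + 1))
        delay=$((delay * 2))
    done
}

# Usage: up to 4 attempts, waiting 5s, 10s, then 20s between them
#   retry 4 5 sudo apt update
```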

Application Crashes or Exits Unexpectedly

High Impact

Application starts but crashes or exits unexpectedly during operation.

Possible Causes

Application bugs or code errors
Resource exhaustion (memory leaks, CPU spikes)
Missing or corrupted configuration files
Unexpected input or data corruption
External service dependencies unavailable
System signals terminating the process

Resolution Steps

Check application logs

Review application logs for error messages or exceptions.

# Common log locations
# Web server logs
sudo tail -f /var/log/nginx/error.log

# Application-specific logs
sudo tail -f /var/log/your-application/error.log

# System journal for service errors
sudo journalctl -u your-application-service -f

Monitor resource usage
Check if the application is hitting resource limits before crashing.

Implement process monitoring

Use a process manager to automatically restart crashed applications.

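# For Node.js applications (PM2 restarts the process if it crashes)
# Add --watch only in development; it restarts the app whenever files change
npm install -g pm2
pm2 start app.js --name "myapp"

# For Python applications
pip install supervisor
# Generate a starter config if none exists yet
sudo mkdir -p /etc/supervisor
echo_supervisord_conf | sudo tee /etc/supervisor/supervisord.conf > /dev/null
supervisord -c /etc/supervisor/supervisord.conf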

Check external dependencies
Verify all external services the application depends on are available and responding.
Update the application
Check for application updates that might address known stability issues.

Advanced Debugging Techniques

Core dump analysis

Enable and analyze core dumps to identify crash causes.

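# Enable core dumps for the current shell session
# (the limit applies only to processes started from this shell)
ulimit -c unlimited
# Write core files to /tmp as core.<executable>.<pid>
echo '/tmp/core.%e.%p' | sudo tee /proc/sys/kernel/core_pattern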

# Analyze core dump (example for C/C++ applications)
gdb /path/to/application /tmp/core.applicationname.1234

Application profiling
Use profiling tools to identify performance bottlenecks or memory leaks.

Process tracing

Trace system calls and signals to understand application behavior.

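# Trace system calls (attaching to a running process usually requires root)
sudo strace -f -p [PROCESS_ID]

# Trace only specific calls (current C libraries open files with openat)
sudo strace -e trace=openat,read,write -p [PROCESS_ID]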

Billing & Account Issues

Unexpected or High Charges

Medium Impact

Your bill is higher than expected or contains charges you don't recognize.

Possible Causes

Instances left running when no longer needed
Resource-intensive applications consuming more than expected
Unoptimized storage usage or unnecessary snapshots
High network data transfer costs
Additional services or add-ons enabled

Resolution Steps

Review billing details
Examine your billing statement for a breakdown of charges by resource type and instance.
Check active resources
Inventory all running instances, stored snapshots, and persistent volumes.
Analyze usage patterns
Review resource utilization graphs to identify unexpected spikes or continuous high usage.
Implement cost controls
Set up budget alerts and usage notifications to prevent future surprises.
Optimize resource usage
Right-size instances, delete unnecessary snapshots, and remove or shrink unused storage volumes.

Cost Optimization Tips

Shut down development instances when not in use
Use resource scheduling to start and stop instances automatically at set times
Implement snapshot lifecycle policies to automatically delete old snapshots
Monitor network transfer costs and optimize data flow
Right-size your instances based on actual resource utilization
Use Benson's cost optimization recommendations for instance-specific advice
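
To estimate what a shutdown schedule could save, compare the hours an instance runs with the 168 hours in a week. The sketch below assumes compute is billed strictly in proportion to running time; storage, snapshots, and other charges that continue while an instance is stopped are not included, so actual savings on a bill will be lower. `weekly_savings_pct` is a hypothetical helper name.

```shell
# weekly_savings_pct HOURS_ON_PER_WEEK
# Percentage of weekly compute hours avoided compared with running 24/7.
weekly_savings_pct() {
    awk -v on="$1" 'BEGIN { printf "%.1f\n", (168 - on) / 168 * 100 }'
}

# Example: an instance that runs 12 hours a day on 5 weekdays (60 hours)
#   weekly_savings_pct 60
```

With 60 running hours per week the function prints `64.3`, meaning about 64% of compute hours are avoided.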

Benson Recommends

"I've analyzed your usage patterns and noticed that you could save approximately 35% by using a scheduled shutdown policy for your development instances during non-working hours (nights and weekends)."

"Consider using resource tagging to track costs by project or department, making it easier to identify where optimization opportunities exist."

Troubleshooting v1.0.0
