Troubleshooting CloudStack
Rajesh Battala, Likitha Shetty & Sailaja Mada
Wednesday, December 18, 2013
Agenda
- ACS developer
  - ACS Error codes
  - Debugging tips in ACS development
  - SSVM troubleshooting
  - ACS ports
- ACS Cloud Admin
  - Install, Configuration & Deployment
  - Log analysis
  - Important Global Config Parameters
  - Best Practices
  - Cloud Database
  - Reusing Hypervisors
- References
- Q&A

ACS Developer

ACS error codes
- Client error codes
public static final int MALFORMED_PARAMETER_ERROR = 430;
public static final int PARAM_ERROR = 431;
public static final int UNSUPPORTED_ACTION_ERROR = 432;
public static final int PAGE_LIMIT_EXCEED = 433;
- Server error codes
public static final int INTERNAL_ERROR = 530;
public static final int ACCOUNT_ERROR = 531;
public static final int ACCOUNT_RESOURCE_LIMIT_ERROR = 532;
public static final int INSUFFICIENT_CAPACITY_ERROR = 533;
public static final int RESOURCE_UNAVAILABLE_ERROR = 534;
public static final int RESOURCE_ALLOCATION_ERROR = 535;
public static final int RESOURCE_IN_USE_ERROR = 536;
public static final int NETWORK_RULE_CONFLICT_ERROR = 537;
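When scanning API responses or logs, it can help to tell at a glance whether a code falls in the client range (43x) or the server range (53x) listed above. The following is a minimal sketch in POSIX shell; the function name acs_error_class is hypothetical and not part of CloudStack.

```shell
# acs_error_class CODE
# Hypothetical helper: prints "client" for codes 430-433, "server" for
# codes 530-537 (the ranges listed on this slide), and "unknown" otherwise.
acs_error_class() {
  case "$1" in
    43[0-3]) echo client ;;
    53[0-7]) echo server ;;
    *)       echo unknown ;;
  esac
}
```

For example, acs_error_class 431 prints client and acs_error_class 533 prints server.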

Debugging tips in CS development
- Generally, use Eclipse to attach a debugger to the management server
- SystemVM agents
  - Kill the running process
  - Add -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8787 to /usr/local/cloud/systemvm/_run.sh
  - Open port 8787
  - Start the Java process: ./run.sh
- Usage
  - To check whether events are being logged, check usage_events in the cloud DB
  - To start the usage server in a dev setup:
    mvn -pl usage -Drun -Dpid=$$
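The JDWP options for the SystemVM agent are easy to mistype, so a small helper that prints them for a given port may be useful. This is only a sketch: the function name jdwp_opts is hypothetical, and the option string uses the legacy -Xdebug -Xrunjdwp form with the default port 8787 from the slide.

```shell
# jdwp_opts [PORT]
# Hypothetical helper: prints the JVM remote-debug options for the given
# port, defaulting to 8787. suspend=n lets the JVM start without waiting
# for a debugger to attach.
jdwp_opts() {
  port=${1:-8787}
  printf '%s\n' "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=${port}"
}
```

The printed string can then be pasted into the java command line in _run.sh.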

SSVM troubleshooting
- Login
  - ssh -i /root/.ssh/id_rsa.cloud -p 3922 root@ip, where ip is the link-local IP on XenServer and the private IP on VMware
- Script to check the health of the SSVM
  - /usr/local/cloud/systemvm/ssvm-check.sh
- Check that port 8250 is open
- In the global configuration, check that the value of 'host' is set to the management server IP
- Check agent status: service cloud status
- Logs can be found at
  - /var/log/cloud/cloud.log
- Template status can be found in the template_store_ref DB table
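The login step can be wrapped in a small function so that the key path and port do not have to be retyped. This is a sketch only; ssvm_ssh_cmd is a hypothetical name, and the function just prints the command from the slide for a given SSVM address instead of running it.

```shell
# ssvm_ssh_cmd SSVM_IP
# Hypothetical helper: prints the SSH command for logging in to a system VM.
# SSVM_IP is the link-local IP (XenServer) or the private IP (VMware).
ssvm_ssh_cmd() {
  if [ -z "$1" ]; then
    echo "usage: ssvm_ssh_cmd <ssvm-ip>" >&2
    return 1
  fi
  printf 'ssh -i /root/.ssh/id_rsa.cloud -p 3922 root@%s\n' "$1"
}
```

Once logged in, /usr/local/cloud/systemvm/ssvm-check.sh and service cloud status can be run as described above.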

And a couple more …
- DB Encryption
  - To decrypt the database secret key, use the following:
    java -classpath /usr/share/java/cloud-jasypt-1.8.jar org.jasypt.intf.cli.JasyptPBEStringDecryptionCLI decrypt.sh input=<encryptedValue> password=<secretKey> verbose=false
    (where secretKey is the value in the /etc/cloudstack/management/key file)
- GUI timeout
  - Default timeout is 15 minutes
  - To increase the timeout, edit /usr/share/cloud/management/webapps/client/WEB-INF/web.xml to add
    <session-config>
    <session-timeout>60</session-timeout>
    </session-config>
  - Restart the server

ACS Ports
- Management Server
  - 8080: Primary GUI / Authentication API Port
  - 8096: User/Client Management Server (unauthenticated)
  - 8787: CloudStack (Tomcat) debug socket
  - 9090: CloudStack Management Cluster Interface
- SystemVM Agent
  - 3922: SystemVM to Management (secure)
  - 8250: SystemVM to Management (unsecure)
- MySQL Server
  - 3306: MySQL Server
- Hypervisor
  - 22/443: XenServer
  - 22: KVM
  - 443: vCenter
- 7080: AWS API server
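A quick way to verify the ports listed above is to probe each one from the machine that needs to reach it. The sketch below assumes a netcat build that supports -z (scan without sending data) and -w (timeout in seconds); check_ports is a hypothetical helper name.

```shell
# check_ports HOST PORT...
# Hypothetical helper: reports each port on HOST as "open" or "closed".
# Assumes nc supports -z and -w, which is true of common netcat builds
# but should be confirmed on the system in use.
check_ports() {
  host=$1
  shift
  for port in "$@"; do
    if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
      echo "$port open"
    else
      echo "$port closed"
    fi
  done
}
```

For example, check_ports <management-server-ip> 8080 8250 9090 could be run from a system VM or hypervisor host.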

ACS Administrator
- Install, Configuration & Deployment
- Log analysis
- Important Global Config Parameters
- Best Practices
- Reuse of Hypervisors
- Cloud database

Install, Configuration & Deployment Issues
- Failed to log in to ACS Management server
  - 4.2 requires a minimum of 2 GB RAM
  - Redeploy the DB and start cloudstack-setup-management
- Issue with instances in isolated network
  - VLAN trunking in switch port configuration
- Failed to deploy instances
  - Insufficient resources: Management server log analysis

Install, Configuration & Deployment Issues
- Failed to add host
  - XCP host: copy Echo plugin
  - Host license
  - Compatible host while creating the cluster of hosts
- Host/Storage pool in avoid set
  - Reachability issues
  - Timeout
  - Capacity of the storage pool / host
  - Alert state
- Move XS hosts from Alert state
  - Unmanage the cluster with the affected host.
  - Clear the host tags of the affected host.
    xe host-param-clear param-name=tags uuid=<UUID of affected host>
  - Manage the cluster with the affected host.

Install, Configuration & Deployment Issues (continued)
- Host in Alert state
  - Monitor host root disk usage

Logs
- Management Server logs
  - /var/log/cloudstack/management/management-server.log
  - /var/log/cloudstack/management/apilog.log
- SSVM
  - /var/log/cloud/cloud.out
- KVM CloudStack Agent
  - /var/log/cloudstack/agent/agent.log
- vSphere logs
  - /var/log/hostd.log (host log)
  - /var/log/vmkernel.log (kernel log)
  - /var/log/vpxa.log (agent log)
- XenServer logs
  - /var/log/SMlog
  - /var/log/xensource.log
- /etc/cloudstack/management/log4j-cloud.xml: set the priority to TRACE
  - Levels: FATAL, ERROR, WARN, INFO, DEBUG, TRACE
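When a log is large, counting lines per level is a cheap first check before reading it in detail. The following sketch assumes that the level name appears as a separate word on each log line; count_log_level is a hypothetical helper name.

```shell
# count_log_level LEVEL FILE
# Hypothetical helper: prints the number of lines in FILE that contain
# LEVEL (for example ERROR or WARN) as a whole word.
count_log_level() {
  level=$1
  file=$2
  grep -c -w -- "$level" "$file"
}
```

For example, count_log_level ERROR followed by the path to the management server log gives a rough error count.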

Global Config Parameters
- expunge.delay: Determines how long (in seconds) to wait before actually expunging a destroyed VM. The default value equals the default value of expunge.interval. Value: 60
- expunge.interval: The interval (in seconds) to wait before running the expunge thread. Value: 60
- expunge.workers: Number of workers performing expunge. Value: 1
- network.gc.interval: Seconds to wait before checking for networks to shut down. Value: 600
- network.gc.wait: Time (in seconds) to wait before shutting down a network that is not in use. Value: 600
- pool.storage.allocated.capacity.disablethreshold: Percentage (as a value between 0 and 1) of allocated storage utilization above which allocators will stop using the pool because too little allocatable storage remains. Value: 1
- secstorage.allowed.internal.sites: Comma-separated list of CIDRs internal to the datacenter that can host template download servers. Note that 0.0.0.0 is not a valid site.
- wait: Time (in seconds) to wait for control commands to return. Value: 1800
- vmware.vcenter.session.timeout: VMware client timeout in seconds. Value: 12000
- integration.api.port: Default API port. Value: 8096
- storage.cleanup.interval: The interval (in seconds) to wait before running the storage cleanup thread. Value: 86400
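To make the meaning of pool.storage.allocated.capacity.disablethreshold concrete, the sketch below compares the allocated fraction of a pool against a threshold between 0 and 1. It only illustrates the arithmetic in the description above and is not CloudStack's allocator code; pool_over_threshold and its arguments are hypothetical.

```shell
# pool_over_threshold ALLOCATED TOTAL THRESHOLD
# Hypothetical helper: succeeds (exit status 0) when ALLOCATED / TOTAL is
# strictly greater than THRESHOLD, i.e. when the pool would be skipped
# under the rule described above. ALLOCATED and TOTAL use the same unit.
pool_over_threshold() {
  awk -v a="$1" -v t="$2" -v th="$3" 'BEGIN { exit !(t > 0 && a / t > th) }'
}
```

For example, pool_over_threshold 90 100 0.85 succeeds because 0.90 exceeds 0.85, while pool_over_threshold 50 100 0.85 fails.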

Best Practices
- Switch port configurations (VLANs must be trunked).
- Restrict the IP addresses that can access storage to avoid data loss.
- Monitor host disk space.
- All hosts must be 64-bit and must support HVM (Intel-VT or AMD-V enabled). All hosts within a cluster must be homogeneous.
- The volumes used for primary and secondary storage should be accessible from the Management Server and the hypervisors. These volumes should allow root users to read/write data. These volumes must be for the exclusive use of CloudStack and should not contain any data.
- With Advanced Networking, separate subnets must be used for private and public networks.
- The Management Servers communicate with the XenServers on ports 22 (SSH) and 80 (HTTP).
- The Management Servers communicate with VMware vCenter servers on port 443 (HTTPS).
- The Management Servers communicate with the KVM servers on port 22 (SSH).
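The HVM requirement can be checked on a Linux host, such as a KVM hypervisor, by looking for the vmx (Intel VT-x) or svm (AMD-V) CPU flag. The sketch below reads /proc/cpuinfo by default; has_hvm is a hypothetical helper name, and the flag may be hidden when virtualization is disabled in the BIOS.

```shell
# has_hvm [CPUINFO_FILE]
# Hypothetical helper: succeeds when the CPU flags line lists vmx
# (Intel VT-x) or svm (AMD-V). Reads /proc/cpuinfo unless a file is given.
has_hvm() {
  grep -E -q '^flags.*[[:space:]](vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}
```

A typical use is: has_hvm && echo "HVM supported" || echo "HVM not detected".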

Reusing Hypervisors
- XenServer
  - xe vm-uninstall --multiple --force
  - Unmount storage
  - xe vif-unplug uuid=<uuid>
  - xe vif-destroy uuid=<uuid>
  - xe network-destroy uuid=<cloud link Local uuid>
  - sh /opt/xensource/bin/cloud-clean-vlan.sh
  - Disable cloud tags created on host
- VMware
  - Delete all instances
  - Delete templates
  - Unmount datastores
  - Remove all cloud networks
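On a XenServer host with many leftover VIFs, the unplug and destroy steps above have to be repeated per UUID. The sketch below prints the commands for review instead of running them, since they are destructive; vif_cleanup_cmds is a hypothetical helper name and the UUIDs must be supplied by the operator.

```shell
# vif_cleanup_cmds UUID...
# Hypothetical dry-run helper: prints the xe vif-unplug and xe vif-destroy
# commands for each given VIF UUID without executing anything.
vif_cleanup_cmds() {
  for uuid in "$@"; do
    printf 'xe vif-unplug uuid=%s\n' "$uuid"
    printf 'xe vif-destroy uuid=%s\n' "$uuid"
  done
}
```

After checking the printed commands, they can be run manually or piped to sh on the host.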

Cloud Database
- op_dc_vnet_alloc
- op_dc_ip_address_alloc
- user_ip_address
- image_store
- vm_template
- template_store_ref
- volume
- storage_pool
- host
- vm_instance
- nics
- network_offering
- physical_network_traffic_types
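As an illustration of how these tables are used during troubleshooting, the sketch below builds a query against template_store_ref for a single template. It assumes that the database is named cloud and that the table has a template_id column; template_status_sql is a hypothetical helper name.

```shell
# template_status_sql TEMPLATE_ID
# Hypothetical helper: prints a SELECT statement for one template's rows
# in template_store_ref. Rejects non-numeric ids so the value is safe to
# embed in the statement.
template_status_sql() {
  case "$1" in
    ''|*[!0-9]*) echo "template id must be numeric" >&2; return 1 ;;
  esac
  printf 'SELECT * FROM cloud.template_store_ref WHERE template_id = %s;\n' "$1"
}
```

The printed statement can be passed to the mysql client with its -e option, using whatever database credentials the installation has configured.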

References
- https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM%2C+templates%2C+Secondary+storage+troubleshooting
- https://cwiki.apache.org/confluence/display/CLOUDSTACK/Ports+used+by+CloudStack
- http://dlafferty.blogspot.in/2013/08/using-cloudstacks-log-files-xenserver.html

Get Involved
- Web: http://cloudstack.apache.org/
- Mailing Lists: cloudstack.apache.org/mailing-lists.html
- IRC: irc.freenode.net:6667 #cloudstack
- Twitter: @cloudstack
- LinkedIn: www.linkedin.com/groups/CloudStack-Users-Group-3144859
If it didn’t happen on the mailing list, it didn’t happen.
