CloudPortal Services Manager Operations Guide v2.0

Date: November 2018

Contents

Purpose
Architecture
    Management Web Site
    SQL Databases
    Directory Web Service
    Message Queue Services
    Provisioning Engine
    Reporting Data Warehouse
    Service Specific Web Services
Deployment Scenarios
    Simple Scenarios
    Complex Scenarios
Monitoring & Maintenance
    Scheduled tasks
    Database Backups
        Recommendations
    Database Index maintenance
    Database Trimming of rapidly growing tables
        Provisioning Request Logs
        Data Warehouse data
    Data Transfer tasks
    Provisioning Logs (Object Status)
Troubleshooting
    Determining where the problem might be
        Launching the Logon Screen
        Logging on to the Portal
        Submitting a Provisioning Request
        Object Status remains in a Provisioning State (Orange)
        Provisioning Errors
    IIS Configuration
        IIS Bindings
        Application Pool setup
        ASP.Net and .Net Framework
        Web Page File permissions
        Other Common Errors
    Database Access
    Encryption
        SQLProperties
        AES 256bit encryption
    Recorded Errors
    Provisioning Request Logs
    Web Services
        Common Errors
        General Web Service Troubleshooting
    Troubleshooting a Working Web Service
        Web Service Credentials
        Manual Execution of Web Service Method
    Tracing issues with SQL Profiler
        Failure to load the customers screen

Purpose

This document is intended for Citrix Service Providers as a guide to configuring their CloudPortal Services Manager environment, and to troubleshooting issues which might arise during normal operation of the platform.

Throughout this document, the acronym CPSM is used in place of CloudPortal Services Manager.

Architecture

The diagram below shows the basic architecture of a single-location CPSM environment.

Figure 1: CPSM Component Architecture

The grey section shows the primary components of any CPSM installation and consists of the following:

• The Management Web Site.
• SQL Databases.
• Directory Web Service.
• Microsoft Message Queue services.
• A Provisioning Engine.
• A Reporting Data Warehouse (optional).

The yellow section reflects the Services Environment.
Services are optional add-ons which may or may not have an additional Web Service which the Provisioning Engine and Management Web Site talk to in order to query service-specific information or perform configuration changes.

Management Web Site

The Management Web Site is the user interface for the Configuration, Management, and Reporting of CPSM usage.

Figure 2: Simplistic view of the role the Management Web Site plays

Essentially, the Web Site is used to configure the system. Once configured, the system can then be managed either by the Service Provider themselves or by their end customers directly. Lastly, it is used for viewing usage reports.

With any installation of CPSM, there will only ever be one Management Web Site, which will always reside in the Primary location.

SQL Databases

There are 2 primary databases used within CPSM to store configuration information, provisioning rules, and diagnostic logs. With any installation of CPSM, these databases will only ever reside in the Primary location, and must be on the same server and within the same instance of SQL Server. The database server is normally aliased as CortexSQL.

These databases are:

Database Name: OLM
Purpose:
• Configuration Storage for:
  o Locations
  o Customer settings
  o Services
  o Users
  o Customer Service relationships
  o User Service relationships
  o Notification emails
  o Reporting
• Object State
• Provisioning Rule Configuration
• Logging and diagnostic information
  o Provisioning status logging
  o Audit logs

Database Name: OLMReports
Purpose:
• Configuration Settings
• Logging and diagnostic information
  o User Interface error logging

Table 1: CPSM Databases

Directory Web Service

The Directory Web Service differs from all of the other Web Services used within CPSM: it is the only one which is a core component, and the only one which is required in each Active Directory Location.
The Directory Web Service is used for Active Directory related configuration or retrieval of environment information, such as obtaining a list of the servers which belong to the location in which the Directory Web Service resides, or authenticating a user against Active Directory.

Message Queue Services

Microsoft Message Queueing is a mechanism used to queue requests for the Provisioning Engine to act upon. This decouples the Web Site from the Provisioning Engine: the Web Site does not need to wait for provisioning requests to complete before it becomes responsive again. An administrator can simply request a provisioning action and then continue on to other administrative tasks immediately. The Provisioning Engine processes the provisioning request in its own time and writes a response back to the database once the provisioning is complete.

CPSM uses a number of different Message Queues for different purposes. Each queue is explained in more detail in the following table:

Message Queue: CortexBulkRequest
Purpose: Processes requests which occur in bulk so that normal UI processing requests are not impacted. For example: Bulk reprovisioning, Password expiry email generation, or User Import requests.

Message Queue: CortexRequest
Purpose: This is the standard UI processing queue. For example: Customer, Service, User, and User Service provisioning.

Message Queue: CortexResponse
Purpose: This is an optional queue responsible for processing Provisioning response logging. It is most likely to be configured for Remote Locations where direct SQL access might not be available. When this queue is not configured, response logging is done directly to SQL from the Provisioning Engine.

Message Queue: CortexUsageData
Purpose: Much like the CortexBulkRequest queue, this queue is responsible for bulk usage data collection tasks. For example: Exchange mailbox usage statistics.

Table 2: Message Queues

Provisioning Engine

The Provisioning Engine is a Microsoft .NET Windows service which watches for task requests in the form of new messages arriving in the Message Queues. Each provisioning request is processed by a set of provisioning rules that determine which provisioning actions need to be executed to fulfill the request.

A Provisioning Engine is required in each Active Directory Location which is to be managed by CPSM.

Reporting Data Warehouse

The Reporting Data Warehouse is an optional component where usage information is gathered on a daily basis so that reports can be generated for various interested parties. Reporting within CPSM consists of Microsoft Reporting Services reports which utilize the data stored in the Data Warehouse.

Service Specific Web Services

Service specific web services are used to communicate with a service environment where no other API or SDK is available, or where the API or SDK does not expose enough functionality for CPSM to offer provisioning of the service. This allows service specific code to be isolated to the Web Service itself, and eliminates the need to duplicate similar functionality throughout the areas of CPSM which need to interact with the specific service environment.

Deployment Scenarios

This section covers some of the more common deployment scenarios for CloudPortal Services Manager which can be implemented without any customization or the involvement of Citrix Consulting Services.

Simple Scenarios

The following scenarios represent the more common implementations of CloudPortal Services Manager.
These scenarios form the foundation for the more complex configurations which we will look at in the “Complex Scenarios” section.

Single Shared Location

This first scenario represents a typical deployment and consists of a single “Shared” location.

Figure 3: Deployment - Single Shared Location

All machine infrastructure resides in the same Active Directory domain as the users which the platform manages.

All customers share the same domain and are segregated from each other to ensure that no customer knows anything about any other customer on the platform.

User IDs within the environment are suffixed with a unique code relating to the customer to which the user belongs. This ensures that there are no user name conflicts which might give away the fact that other customers are using the same platform.

This scenario is simple to deploy and to manage. There is no complex network design. Typically, all machines are within the same subnet, so there are no firewall or routing considerations to deal with.

Corporate (or Private) Location

Similar to the Single “Shared” Location, and a scenario which is becoming more and more popular, is the use of CloudPortal Services Manager as a management platform for a non-service provider, or Corporate, environment. This scenario is typically implemented by a larger company to assist in the management and deployment of its internal users and services.

Since the only users being added to this environment are from the company hosting the platform, the environment is configured so that the User IDs do not have the unique suffix added as in a shared environment.

A setup like this in a corporate environment can greatly reduce the reliance on technical personnel to deploy and configure user services. This task can simply be delegated to the helpdesk or even to departmental managers, enabling new users and services to be added within minutes.
Multiple Locations

CloudPortal Services Manager can be configured to manage multiple Active Directories (or Locations).

Figure 4: Deployment - Multiple Locations

The first location configured is always marked as the “Primary” location, which is the only location where the Management Web Site, SQL Server, and Encryption Service reside.

One or more additional locations can be added at any time to manage users and services from another Active Directory. The grey section in this diagram represents one or more additional locations.

This is normally done to support different customers' needs. For example, if one of your customers is not comfortable with sharing the domain and services with other customers, you might add an additional location purely for this customer. In this case the location will be configured as “Private”, or non-shared.

Additional locations can be configured either for multi-tenancy or to be dedicated to a single customer.

Complex Scenarios

All scenarios are based on the foundation of the scenarios mentioned earlier. Whether you have one location or several, the following features can be applied to any location, or in some cases to one or more customers individually.

Use of ADSync

ADSync, or Active Directory Sync, is a service which is applied to one or more customers. It allows the customer to manage their own users in their own Active Directory, while still being able to use services which are hosted by the Service Provider.
Figure 5: Deployment - ADSync

The users from the customer's Active Directory (including their passwords) are synchronized into CloudPortal Services Manager in the Service Provider's environment.

In order for passwords to be captured and synchronized successfully, the ADSync Agent should be installed on ALL Active Directory Servers in the customer's domain.

The ADSync agents communicate only with CPSM's web based API, which means that there is no need for any domain trusts or any connectivity other than standard HTTP or HTTPS traffic. If the customer does not allow their Active Directory servers to access the internet directly, the ADSync agents can also be configured to use a proxy.

Use of Remote Domains

The use of Remote Domains is in some ways similar to ADSync, in that a customer wishes to manage their user identities from within their own domain. Unlike ADSync, however, the user is not synchronized to the CPSM domain. Instead, a disabled user is created in the CPSM domain and linked to the account in the customer's domain.

Figure 6: Deployment - Remote Domain

This scenario is only supported for use with the Exchange and Lync or Skype for Business services.

Remote Domains can be configured on a customer by customer basis, so you can have a mixture of standard customers, whose users are created normally and can consume all hosted services, and customers who use the Remote Domain setup.

High Availability

To achieve High Availability, all of the IIS components (the Management Web Site, the Directory Web Service, and the service specific Web Services) should be load balanced with a NetScaler or similar load balancer.

The Microsoft Message Queue and Provisioning Server should be clustered for failover.

The SQL Databases should be configured for AlwaysOn availability.

The Encryption Web Service does not need to be load balanced, as this web service is only used when performing upgrades or installing new service components.
The Microsoft Reporting Services reports use the OLMReporting Database, which will also be configured within an AlwaysOn availability group.

For more information on the specifics of setting up CPSM for High Availability, refer to the “Deploying CloudPortal Services Manager 11.x for High Availability and Disaster Recovery” document at the following link: CloudPortal Services Manager 11.x High Availability and Disaster Recovery

Monitoring & Maintenance

Proper maintenance of CloudPortal Services Manager is essential to ensuring that the platform operates efficiently, both for you as a Service Provider and for your customers' experience. This section looks at the different areas of CPSM which need to be monitored regularly, as well as some tasks for you to complete to help achieve a smooth running environment.

Scheduled tasks

CPSM requires behind the scenes automation tasks to run periodically to keep various processes up to date. This might include, but is not limited to:

• Ensuring that user password properties, such as Account Locked or Account Expired, are synchronized with user objects within the CPSM Databases for the Password Expiry Email notification process
• Collecting service specific usage information for reporting purposes
• Workflow cleanup tasks
• Sending the Password Expiry emails

These tasks typically reside on the Provisioning Server, but Usage Data Collection tasks may reside on servers specific to the data being collected. These tasks should also reside in a folder specific to CPSM, as in the following image:

Figure 7: Scheduled Tasks folder structure

It is essential that these tasks run successfully, so they should be checked on a daily basis. Any failed tasks will need to be repaired as soon as possible.

Database Backups

The CPSM Databases are at the core of the CPSM Control Panel.
Losing one or more of these databases is likely to render the entire platform unusable, so taking regular backups is essential to being able to recover quickly in the case of any corruption or complete failure.

The CPSM Databases have all been created with the “Simple” Recovery Model, which means that transaction logs are not maintained. Without transaction logs, data can only be recovered back to the last FULL backup. For most installations this is fine, as the number of changes which occur daily is relatively small and a full backup is quick enough to be taken on a daily basis. For larger installations of CPSM, it might be beneficial to alter the Recovery Model to “FULL”, which enables the use of incremental (transaction log) backups. Bear in mind that this will also require a lot more active disk space for storing transaction data between backups.

Recommendations

Minimum Backup Strategy

Recovery Model    FULL backup    INCREMENTAL backup
Simple            Daily          -
Full              Weekly         Daily

Table 3: Database Backup Strategy

Restoring a full backup is a fairly straightforward task, but restoring transaction logs from an incremental backup is a little more complex. Make sure, as a part of your backup strategy, that you also perform test restores periodically so that you are familiar with the process when the need arises.

Database Index maintenance

Database maintenance is quite often overlooked, but is essential to a database running at optimal performance. Over time, modifications to data will cause both physical and logical fragmentation. This fragmentation is likely to increase disk IO activity on the SQL Server, because more reads are needed to retrieve the data than would be needed if the data were stored contiguously.
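To see how fragmented the indexes currently are, the SQL Server dynamic management function sys.dm_db_index_physical_stats can be queried. The following is a minimal sketch rather than a prescribed procedure: it assumes it is run against the OLM database (repeat it for the other CPSM databases), and the page count and fragmentation thresholds are illustrative assumptions, not values taken from this guide.

```sql
-- Sketch: list the most fragmented indexes in the current database.
-- Thresholds below are assumptions for illustration only.
USE OLM;

SELECT
    OBJECT_SCHEMA_NAME(ips.object_id) AS SchemaName,
    OBJECT_NAME(ips.object_id)        AS TableName,
    i.name                            AS IndexName,
    ips.avg_fragmentation_in_percent,
    ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON  i.object_id = ips.object_id
    AND i.index_id  = ips.index_id
WHERE ips.index_id > 0                          -- skip heaps
  AND ips.page_count >= 1000                    -- ignore very small indexes (assumed cut-off)
  AND ips.avg_fragmentation_in_percent >= 10    -- assumed reporting threshold
ORDER BY ips.avg_fragmentation_in_percent DESC;
```

The 'LIMITED' scan mode keeps the check itself lightweight, which matters on a busy server. Running the query before and after a maintenance job gives a simple indication of whether the job is having the intended effect.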
The CPSM databases have been created and initially configured with the assumption that maintenance is not being performed, and unfortunately this is likely to cause higher than desired fragmentation.

In addition to reorganizing or rebuilding indexes, it is recommended that you set the “Auto Shrink” Database option to False. Shrinking can be used to reduce the size of a data or log file, but it is a very intrusive, resource-heavy process that causes massive amounts of logical scan fragmentation in data files and leads to poor performance.

Unless you are familiar with SQL Server Integration Services (SSIS), creating a maintenance plan may be a difficult task. Fortunately, SQL Server comes with built in features for creating a maintenance plan which will be suitable for rebuilding indexes at frequent intervals.

Maintenance Plans can be found within SQL Server Management Studio under Management, as in the image below. Expand the “Management” node, right click Maintenance Plans and choose “Maintenance Plan Wizard”.

Figure 8: SSMS - Maintenance Plans

Follow the wizard to create a plan covering all CPSM databases (OLM, OLMReports, and OLMReporting) which at least rebuilds indexes (other tasks are optional) on a weekly basis.

The wizard allows you to receive an emailed report of the job upon completion. This might be desirable to avoid having to log onto the SQL Server to check the status of the maintenance task.

Database Trimming of rapidly growing tables

Some tables within the CPSM databases grow quite rapidly and can cause space and performance issues on your SQL Server(s). The 2 areas of special interest are:

• Provisioning Request Logs
• Data Warehouse data

Provisioning Request Logs

A high use CPSM environment is likely to result in many provisioning requests being generated.
Each provisioning request is logged and, additionally, almost every provisioning action is also logged. A single provisioning request may therefore result in anywhere between 20 and possibly 200 logged entries.

Provisioning Logs are useful in identifying the cause of provisioning related errors and will help lead you to a resolution. They are also useful in identifying when, and by whom, a certain action was performed. For example, if a user was deleted and you need to find out who deleted this user, the Provisioning Logs can help identify the responsible user.

You are therefore likely to want to keep a month or more of provisioning logs available, but anything much beyond this is just wasted space. In fact, an issue which is likely to arise from not maintaining these logs is that simply navigating to the Provisioning Requests page within CPSM may result in an error due to a timeout retrieving the data.

To configure a task to purge old Provisioning Request Logs

The OLM database contains a Stored Procedure for deleting Request Log entries. This Stored Procedure is called sp_RequestDelete and accepts the following 2 parameters:

• RequestID
• BeforeDate

At first glance, it would look like the best parameter to specify would be “BeforeDate”, but unfortunately the largest table (RequestLog) is not optimized around dates, so the procedure would perform quite badly. A better option is to use the RequestID parameter.
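Before purging anything, it can be useful to see how many requests fall outside the retention period you have in mind. The following read-only sketch assumes the Requests table in the OLM database with its RequestID and Created columns, as used by the purge script in this section; the 6 month value is only an example.

```sql
-- Sketch: count the requests older than the chosen retention period.
-- Read-only; nothing is deleted by this query.
USE OLM;

DECLARE @Months int = 6;  -- example retention period in months

SELECT
    COUNT(*)       AS RequestsToPurge,
    MIN(Created)   AS OldestRequest,
    MAX(RequestID) AS HighestRequestIDToPurge
FROM Requests
WHERE Created < DATEADD(month, -@Months, GETDATE());
```

A very large RequestsToPurge value suggests that the first purge will take a long time, so it may be worth starting with a longer retention period and reducing it gradually.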
Here is a query which can be used to call the sp_RequestDelete stored procedure repeatedly for every RequestID older than x months:

    Declare @RequestID int
    Declare @CreateDate datetime
    Declare @CurrentDate char(11)
    Declare @Months int

    Set @Months = 6 --******Change this value******
    Set @CurrentDate = '1 jan 1900'

    -- Find the oldest request which is older than the retention period
    Select top 1 @RequestID = RequestID, @CreateDate = Created
    from Requests
    Where Created < DateAdd(month, -@Months, getdate())
    order by RequestID

    While isnull(@RequestID, 0) > 0
    Begin
        -- Print a progress message each time the request date changes
        if @CurrentDate <> convert(char(11), @CreateDate, 113)
        Begin
            Select @CurrentDate = convert(char(11), @CreateDate, 113)
            Print 'Deleting Requests from ' + @CurrentDate
        End

        exec sp_RequestDelete @RequestID=@RequestID

        -- Reset, then look for the next request to delete; the loop ends when none is found
        Set @RequestID = 0
        Select top 1 @RequestID = RequestID, @CreateDate = Created
        from Requests
        Where Created < DateAdd(month, -@Months, getdate())
        order by RequestID
    End

This query should be scheduled to run on a daily basis by using SQL Server's equivalent of the Task Scheduler - the SQL Server Agent.

The following download link contains 2 scripts:

1. The above script for either executing manually, or creating your own Scheduled Task. (RequestLogs Trim Script.sql)
2. A script for automatically creating a Scheduled Task configured for running the above script with a default setting of 6 months of Request Log data to keep. (Create RequestLogs Trim Task.sql)

https://citrix.sharefile.com/d/s09d4fc6044a4612a

If you choose to have the Scheduled Task created automatically for you, execute the “Create RequestLogs Trim Task.sql” script within SQL Management Studio.

Once the task has been created, you will want to customize it to ensure it suits your specific needs.
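As a quick check that the job exists after running the creation script, the SQL Server Agent job list can also be queried from the msdb database. This is a sketch only: the exact name given to the job by the script is not stated here, so the name filter below is an assumption and may need adjusting.

```sql
-- Sketch: list SQL Server Agent jobs whose name mentions "Request".
-- The LIKE pattern is an assumed filter, not the job's confirmed name.
SELECT
    j.name,
    j.enabled,
    j.date_created
FROM msdb.dbo.sysjobs AS j
WHERE j.name LIKE N'%Request%'
ORDER BY j.date_created DESC;
```

If no rows are returned, remove the WHERE clause to list every job on the server.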
Find the new task under SQL Server Agent \ Jobs within SQL Management Studio.

Figure 9: SQL Agent - Trim Request Logs task

Right click the task and select “Properties”.

Recommended changes

• Edit the Job Step to change the number of months of Request Log data to keep
• Edit the schedule to have the trim job run at a more appropriate time (by default, it is configured to run at 12:30am on a daily basis)
• Add a notification to send an email stating the job's execution outcome

Data Warehouse data

The Data Warehouse stores reporting data for almost every object within CPSM. For example, here are some of the metrics which might be collected on a daily basis:

• # of users per customer
• # of services / instances per customer
• User Service Relationships
• The type of access a user has for each service provisioned (User Plan)
• Mailbox usage
• Customer's Public Folder usage, i.e. number of items in the public folder or the total size consumed by the public folder
• Applications assigned to each user for the Hosted Apps and Desktops Service
• Date and time each of these services were first provisioned
• Number of users within a customer being assigned to each service

The metrics mentioned above are all customer related. CPSM also totals up each of these metrics (and more) for each reseller.

The storage needed to collect this data will differ between environments, but imagine that this data for a single day consumed as little as 10MB (some environments are in excess of 100MB). Over a 1-year period, over 3.5GB would be needed, and at 100MB per day, over 35GB in a single year.

As you can see, this data growth can be quite rapid, so it is important to keep it in check by ensuring older data is removed from the database. Typically, customers are only interested in reporting on the past couple of months of usage, but may occasionally need to go back a little longer. It's unlikely that anyone needs to go back more than 6-12 months.
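The growth figures above can be reproduced with a small calculation, which also gives a rough idea of how much space a given retention period would occupy once the data reaches a steady state. This is an illustrative sketch: the 6 month retention value and the average of 30.4 days per month are assumptions, and 1GB is taken as 1000MB.

```sql
-- Sketch: estimate yearly growth and retained size for two daily volumes.
DECLARE @RetentionMonths int = 6;  -- hypothetical retention period

SELECT
    v.DailyMB,
    v.DailyMB * 365                                    AS YearlyMB,
    CAST(v.DailyMB * 365 / 1000.0 AS decimal(10, 2))   AS YearlyGB,
    CAST(v.DailyMB * 30.4 * @RetentionMonths / 1000.0
         AS decimal(10, 2))                            AS RetainedGB
FROM (VALUES (10), (100)) AS v(DailyMB);
```

At 10MB per day this gives 3,650MB (about 3.65GB) per year, and at 100MB per day about 36.5GB, which matches the figures quoted above.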
Thankfully, the Data Warehouse has been created to automatically trim (or remove) data in order to keep only the most recent x months of data. All you have to do is configure it with a value suitable for your environment.

To configure the Data Warehouse Purge value

• On the Server where the Data Warehouse Web Service is installed (Note: this might not be the SQL Server where the OLMReporting DB is installed), navigate to the c:\Program Files (x86)\Citrix\Cortex\Data Warehouse\Data Warehouse Service\config folder
• Open the config.xml file
• Find the following xml (it will be near the top of the config file):

    Purge Aged Data 0 End Destination 36

• Alter the AgeInMonths value
• Save the file

The next time the Data Transfer tasks run, which includes aggregating and storing the Data Warehouse reporting data, data older than the number of months specified in this config file will be deleted.

Note: This deletion task has been optimized to delete data in batches so as to minimize the impact on transaction logging.

Data Transfer tasks

For reporting to show accurate data, it is essential that, in addition to the Scheduled Tasks which collect service specific data, the Data Transfer task also runs successfully.

The Data Warehouse configuration, as well as the Data Transfer historical status, can be found under the Reports \ Configuration \ Data Warehouse menu item.

Figure 10: Reports Configuration

On this page you will find a section listing the last 3 Data Transfer attempts, including the execution time and status of each transfer.

Figure 11: Data Transfer history

Ideally, the most recent entry will be green, indicating a successful transfer. In that case, even if the older entries are red, there is nothing that you need to do. Odds are that the next execution will also be a success.

If the most recent entry is red, you can click on the start or end times, as these are links which will take you to the Data Transfer log for that specific transfer.
Figure 12: Data Transfer Error Log

The Data Transfer configuration allows the Data Transfer Log to be sent to an email recipient upon either success or failure. Configure this by editing the config.xml file on the Data Warehouse Server, located by default in the “C:\Program Files (x86)\Citrix\Cortex\Data Warehouse\Data Warehouse Service\config” folder.

The section is likely to be at the very end of the file and looks like the following piece of xml:

    {from email address}
    {to email address}
    Data Transfer - Success
    {smtp server name or ip address}
    25
    <h3>Data Transfer - Success</h3>

    {from email address}
    {to email address}
    Data Transfer - Error
    {smtp server name or ip address}
    25
    <h3>Data Transfer - Error</h3>

Edit the values appropriately for your environment and save the file.

Due to the complex nature of the Data Warehouse and Data Transfer configuration, errors related to the transfer of data are often difficult to resolve. You may be able to work out the cause and resolve it yourself, although it is likely that you will need to log a support call for a Citrix Support Engineer to assist in the diagnosis and resolution of these errors. The important thing is that you are on top of these errors as soon as they occur, to ensure that your reporting is as accurate as possible.

Provisioning Logs (Object Status)

When working with any software, the last thing you want to see is errors. The same goes for CPSM. This comes with a little more of a challenge, however, as not all provisioning actions will be ones which you (as the Service Provider) initiated, since customers can provision their own users, user services and so on. That does not mean that you are completely blind. This is where the Provisioning Requests screen can come in handy.
The Provisioning Requests screen not only shows the most recent provisioning requests, but also allows you to list objects which match a given state. This can be more useful than scrolling through screen after screen of each customer’s user base to find any provisioning failures. In fact, Provisioning Failures are not the only issues you will be interested in if you want to maintain a clean environment.

By default, the Provisioning Request screen displays “your” latest Provisioning Requests, regardless of the outcome of the request. The Request Filter pane on the left hand side allows you to filter these provisioning requests by Request Status, Object Status, or a combination of the two, with the default being “All Statuses”.

Figure 13: Provisioning Requests - Request Filter

Since you will be looking for Objects which are in a status other than “Provisioned”, the Request Status is irrelevant. Alter the filter to show “All Requests”, and set the “Object Status” to one of the following statuses:
• Failed
• Pending Changes
• Requested
• In Progress

Then click “Search”.

The results will show a combination of successful, in-progress, and failed requests. This might seem confusing: if you selected to show “Failed” objects, why would you be seeing successful requests? Remember, this page actually shows “Provisioning Requests”, but allows you to filter them by Object Status, so it shows all provisioning requests for any object which is currently in, let’s say, the Failed state. For such an object, it is the latest request which failed. You can now look through the requests to determine what caused the failure, then fix the issue manually. Odds are that an issue which affected one object will have also affected many others. So after you have resolved a few different types of issues, you can then look at re-provisioning objects in bulk.
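The reason successful requests appear when filtering on “Failed” objects can be illustrated with a small model. This is a sketch in Python under stated assumptions: the ProvisioningRequest type, the filter_by_object_status function and the sample object names are all hypothetical and do not reflect the CPSM database schema.

```python
from dataclasses import dataclass


@dataclass
class ProvisioningRequest:
    object_name: str      # label of the object the request was for
    sequence: int         # higher number = more recent request
    request_status: str   # outcome of this individual request


def filter_by_object_status(requests, object_status, wanted):
    """Return all requests whose object is currently in the wanted status."""
    matches = [r for r in requests if object_status[r.object_name] == wanted]
    # Show the most recent request first, as a request listing typically would.
    return sorted(matches, key=lambda r: r.sequence, reverse=True)


requests = [
    ProvisioningRequest("User A / Hosted Exchange", 1, "Succeeded"),
    ProvisioningRequest("User A / Hosted Exchange", 2, "Failed"),
    ProvisioningRequest("User B / Hosted Exchange", 3, "Succeeded"),
]
# Current status of each object (hypothetical sample data).
object_status = {
    "User A / Hosted Exchange": "Failed",
    "User B / Hosted Exchange": "Provisioned",
}

for request in filter_by_object_status(requests, object_status, "Failed"):
    print(request.sequence, request.object_name, request.request_status)
```

Both requests for the first object are listed, including the earlier successful one, because the filter is applied to the object's current status and not to the outcome of each request.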
The “Bulk System Provisioning” screen, which can be found under the menu Configuration \ Provisioning & Debug Tools, is specifically suited to re-provisioning objects in a failed state and looks like this:

Figure 14: Bulk System Provisioning

Unfortunately, it is not as simple as just asking it to re-provision ALL failed objects, nor will it detect objects in any other state, such as objects stuck in a Provisioning state (orange) or Pending Changes (blue). Objects in these other provisioning states need to be processed one by one manually.

To provision all failed objects of a given type, select the object type (or entity), select “Provisioning Failed” and click Provision.

You may find that some objects still fail to provision successfully. Simply return to the Provisioning Requests screen to determine the cause and, after resolving more issues, repeat the process.

Another source for detecting errored objects is the Customers screen. This will show you a warning icon beside either the User count or the Service count for the customer.

Figure 15: Customers screen - warning icons

Clicking on the warning icon will display a list of the objects in error. Clicking on a listed object will then take you to the actual configuration page for that object, allowing you to see more details about the error or to simply re-try the provisioning action.

Figure 16: Warning icon - Errored objects

Objects in any of the other states (Provisioning, Provisioning Requested, or Pending Changes) are a little more difficult to locate. Apart from simply navigating through every customer, user and user service screen, which is a pretty cumbersome task, the only real way to do it is to use the Provisioning Requests screen as described above. There are not likely to be many objects in any of these states, so the listings should be simple enough to manage. The hardest part is likely to be in identifying the object itself.
The Provisioning Request label does show some detail as to the object in question, but is often not specific enough. For example, in the image above (not the Provisioning Requests page, but similar in nature to the description of a provisioning item), the item “Aaron Lister – My Exchange” only tells you the username and the service, but what customer does this user belong to? A couple of ways to work out what the exact object is are:
1. Hover over the “Request Type” field for the provisioning entry. This will show you who submitted the provisioning request, which is likely to be an administrator belonging to the same customer.
2. Review the Request Logs themselves. These logs will have a log entry which identifies the object more clearly, such as the distinguishedName of the object, or maybe the customer code.

Perform this type of clean up periodically.

Troubleshooting

Troubleshooting issues which occur within the CPSM environment can be a difficult task. This section looks at how to troubleshoot some specific components of the platform, and in order to do so, it includes a more in-depth look at parts of the architecture. It also looks at some of the more common issues, and provides direction as to where to look for possible causes.

Determining where the problem might be

In order to narrow down your troubleshooting efforts, you first need to determine where the problem might be. To do so effectively, you need a fuller picture of the CPSM components and how they interact with each other. The following sections dig a little deeper into some common scenarios where you might encounter a problem. Each one explains in more detail the process involved in achieving the desired result. From this understanding you will see more clearly the areas that you need to focus on in order to diagnose the problem.
Launching the Logon Screen

There is a lot going on just to load the logon page of the Management Web Site, so let’s look at this a little closer.

Figure 17: Loading the Logon Screen

What happens when you browse to the URL of the Portal?

The Web Site is a Microsoft .Net Web Site, so we have IIS, Bindings, ASP.Net, and the .Net Framework involved.

The Web Site content is dynamically loaded from the database. This allows for customer specific customization of the content and branding for each page, as well as the language translation which CPSM offers. So we have Database Access.

The database connection strings are stored in the database and are retrieved the first time the Web Site is accessed. Once retrieved, these connection strings are cached as Application Settings within IIS and are re-used for each subsequent call to the Web Site.

No Web Service interaction occurs when simply loading this page.

What to check for:
The type of error you receive will often help with identifying at least where the issue is. But generally, you should check the following areas:
• IIS Configuration
• Database Connectivity

Logging on to the Portal

Figure 18: Logging onto the Portal

Logging on to the portal involves a little more than just authentication. CPSM needs to query the database to retrieve the following information:
• The location of the Directory Web Service which is needed to perform the authentication request against Active Directory
• The customer the user belongs to
• Branding assigned to the customer
• The Language the user has selected as their default language
• The Roles assigned to the user and customer

This demonstrates that there is again Database access involved, but given that the logon page was displayed, you can safely assume that database access will not be an issue with logging on. All of this information should be able to be retrieved successfully.
The only things which might cause any of these items to fail to return successfully are the Database Server or Databases suddenly becoming unavailable, or the data within the database being corrupt. The last scenario is pretty unlikely though.

The Web Site will also make a call to the Directory Web Service to make sure the user has entered the correct credentials.

There is no need to access the registry for the database access this time around, as the connection strings for the database access have already been retrieved and are available as Application Settings within IIS.

What to check for:
Even though Database connectivity was working correctly when the logon page was loaded, there is still a slight chance that connectivity has been lost between loading the logon page and actually making the request to log on, so Database access should still be investigated. The following things should be looked at:
• Database Connectivity
• Ensure that the Directory Web Service is functioning correctly

Submitting a Provisioning Request

When a user clicks “Provision” or “Deprovision” anywhere within the Management Portal, there is actually quite a lot going on before the request can be sent to the queue for processing.

Figure 19: Submitting a Provisioning Request

All user input needs to be validated. This also includes values which might not be initially obvious, as there might be values which are not visible on the screen at the time. For example, if a user service provisioning action was requested, the user plan will be selected on the user interface, but the properties associated with the user plan are hidden within the plan configuration. Additionally, the service settings are also hidden, but since all of these values need to be gathered together to build the provisioning request, there still needs to be some validation involved. Validation does not necessarily check that the value is correct; it might only look to see if a value is configured.
For example, a Mailstore database has been selected for the User Plan when provisioning Exchange to a user.

After validation, the configuration changes are saved to the database. For example, which User Plan was selected, or for User Provisioning, all of the user’s properties are saved.

The Request is built, but is not sent yet.

Next, it is the Web Site’s role to execute the PreRequest rules. These rules gather additional information which needs to be added to the provisioning request to ensure the provisioning engine has all the properties it needs to complete the environmental changes correctly. The execution of the PreRequest rules is likely to make additional changes to the Request built earlier.

Finally, the request can be sent to the MSMQ for processing.

Although not mentioned in the diagram above, there is a possibility that some of the properties being retrieved to build the request could be encrypted, so encryption needs to be included as an area of interest.

What to check for:
Given that there is a lot more going on when submitting a request for provisioning, there are also more areas which could cause issues. Troubleshooting begins to get a little more complex, and may now involve more skillsets to diagnose and resolve.

Tip: All steps mentioned above need to be completed BEFORE the current iFrame is closed. So if an error occurred before the iFrame closed, then the request will not have been sent to the provisioning queues.
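The order of the submission steps, and the consequence described in the Tip, can be sketched as follows. This is a simplified model written in Python for illustration only: the function, the rule, the property names and the use of a dictionary and a list in place of the database and the MSMQ are all assumptions, not CPSM code.

```python
class ValidationError(Exception):
    """Raised when a required value has not been configured."""


def submit_provisioning_request(values, required, pre_request_rules,
                                database, queue):
    # 1. Validate: only checks that each required value is configured,
    #    not that the value is correct.
    missing = [name for name in required if not values.get(name)]
    if missing:
        raise ValidationError("Missing values: " + ", ".join(missing))

    # 2. Save the configuration changes to the database.
    database.update(values)

    # 3. Build the request (it is not sent yet).
    request = dict(values)

    # 4. Run the PreRequest rules, which may add or change properties.
    for rule in pre_request_rules:
        rule(request)

    # 5. Only now is the request sent to the queue.
    queue.append(request)
    return request


def add_customer_code(request):
    """Hypothetical PreRequest rule that enriches the request."""
    request["CustomerCode"] = "CUST01"


database, queue = {}, []
try:
    submit_provisioning_request(
        {"UserPlan": "Standard", "MailstoreDatabase": ""},
        ["UserPlan", "MailstoreDatabase"],
        [add_customer_code], database, queue)
except ValidationError as error:
    print("Not queued:", error)
print(len(queue))
```

Because the send is the last step, an error raised at any earlier step leaves the queue empty, which matches the behaviour described in the Tip.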
The following things should be looked at:
• Make sure all required values have been configured to satisfy the provisioning request
• Database Connectivity
• For errors related to a failure to save the configuration values to the database:
  ◦ Database Connectivity
  ◦ To see what SQL activity is going on, look at Tracing Issues with SQL Profiler later in this document

Even though the MSMQ is involved, this is a decoupled process, so a failed attempt to send a message to the MSMQ will not result in an error. It’s more likely that the message just does not arrive in the queue as intended. In this case, Journaling should be turned on, and the message queue permissions should be looked at.

Object Status remains in a Provisioning State (Orange)

There is a possibility that when you attempt to provision (or deprovision) an object, the Object Status icon gets stuck on Orange. There are a few reasons why this might occur, but let’s first have a look at what happens AFTER the Management Web Site successfully sends a Provisioning Request.

Once the iFrame on the page where you clicked either “provision” or “deprovision” closes, we know that all required changes have been written to the database, and the Provisioning Request has been successfully sent. So what happens now?
• The Object Status icon is set to Yellow (provisioning requested)
• Under some circumstances the Web Site may set the Object Status icon to Orange (in-progress) prematurely
• The Provisioning Request arrives in the MSMQ
• The Provisioning Engine picks up the request and begins processing it
• The Provisioning Engine sets the Object Status to Orange (in-progress)
• Required actions are performed
• On error, the Provisioning Engine sets the Object Status to Red (failed)
• On success, the Provisioning Engine sets the Object Status to Green (succeeded)

What to check for:
There are pretty much only 4 things which would leave an object in an Orange state:
• Failure of the Request to arrive in the MSMQ
• The Provisioning Engine could not load the rules, and therefore does not know what actions to take
• The Provisioning Engine is not running
• Failure to update the Object Status to either Red or Green

The following things should be looked at:
• Check the Provisioning Request Logs. This will tell you if the request was processed by the Provisioning Engine
• Check the outbound queue on the Web Server to see if there is an active connection to the Provisioning Server, and that there is nothing in this queue
• Check that the Provisioning Request (message) has arrived in the appropriate Message Queue (most likely, the CortexRequest queue). If Messages are queueing up, then check to see if the Citrix Queue Monitor Service is running. If you see nothing in the queue, then check the Journal. You might need to turn on Journaling and resubmit the request in order to capture the Provisioning Request.
• If Requests seem to be just dropping through the queue but are not being processed (they turn up in the Journal), then it might be that there was a problem loading the rules initially when the Provisioning Engine was started. This could simply be caused by the database not being available at the time the service started.
Under this circumstance, you should restart the Citrix Queue Monitor Service.
• If the Provisioning Logs show that the Provisioning Request was completed (either in error or successfully), then perform a SQL Profiler trace to see what is stopping the Object from being updated successfully

Provisioning Errors

Errors occurring during a provisioning request will be the most common form of errors you will see with CPSM. Since provisioning is typically related to the configuration of the physical environment, the cause of the error can be related to almost anything. But before we look at actually diagnosing a provisioning request, let’s first look at what happens once a Provisioning Request reaches the MSMQ Message Queues.

Figure 20: Provisioning Process (diagram labels: Request Message, Properties, Provisioning Rules, Evaluate Rule, Execute Action, Next Rule, Log Response, Update Environment)

The Citrix Queue Monitor Service picks up the request for processing (Green). At this point the PreRequest rules have already been processed. The Provisioning Engine is only responsible for processing the “Request” rules and the “PostRequest” rules.

The Provisioning Engine begins processing the “Request” rules, using the properties available within the MSMQ message to help determine which actions to execute.

Each rule is evaluated (Red) to determine if the rule is to be processed or not.

If the rule condition evaluates to “True”, or if there is no condition on the rule, then the associated Action is processed (Grey). The Action can be anything from making a change to the physical environment (Yellow) by calling Web Service methods or communicating directly with Active Directory, to simply making a change to a property within the Request Message.

If the rule is configured to log a response, then the outcome of the rule is logged to the database (Blue). Logging a response is done in one of 2 ways. Either by:
1.
Sending a “Response” message to the CortexResponse Queue, whereby the Provisioning Engine will process the response separately, or
2. By communicating directly with SQL to log the response.

In the Primary location, a direct connection to SQL is always available, so responses are normally written directly to the database, but in a Remote location, SQL access might not be possible. In this situation, the location can be configured to log provisioning responses via http to the CortexResponse queue in the primary location.

Logged responses can be viewed in the Provisioning Request Logs, and always show the following information:
1. The timestamp of when the rule completed
2. The Rule Description and associated parameters
3. Any error information in the case of a failure

Figure 21: Sample Provisioning Request Log Entries

And lastly, once all Rules have been processed, the Provisioning Request Status is updated.

What to check for in the case of a Provisioning Error:

The first thing to do is to try the provisioning request again. Occasionally the error is related to a timing issue, and the request may just work the second time around. If this becomes a pattern, however, then additional troubleshooting should be performed to determine why this is a regular point of failure, and an attempt should be made to resolve it.

The next place to look should always be the Provisioning Request Logs.

If nothing has been logged, then chances are that no rules were actually processed. In this case, it is possible that when the Citrix Queue Monitor Service was first started, it was unable to load the provisioning rules. You should then restart the Citrix Queue Monitor Service and resubmit the request.

If an error is received, then there should be tell-tale signs within the error message as to where to look next:

Is there a Web Service connection parameter in the failed rule?
If so, then look to troubleshoot the Web Service.

Is there a PowerShell script error mentioned in the error message? Again, troubleshooting the Web Service will help you discover more about what is causing the error.

Do the parameter values look correct for the rule being executed? There is always a possibility that the request was built with incorrect values. This is likely to be more difficult to diagnose, but a good place to start is the Provisioning message itself. To view it, turn on journaling within the MSMQ and resubmit the request in order to capture the message. If you find that values are incorrect, then recheck the service configuration within the Management Web Site.

If there is no Web Service connection within the rule parameters, then hopefully the error message is descriptive enough to point you in the right direction.

Occasionally there might not be an error message at all. This is pretty rare, but is likely to point to a rule which has not been configured to log on either success or failure. In this case, you may need to edit the provisioning rules to enable logging for these rules. To narrow down your rules of interest, find the last rule which was logged and check each rule following that until you find the next rule which is configured to log on error. It will be one of these rules causing your error. Once logging is enabled, you will need to resubmit the request and recheck the logs.

IIS Configuration

There are many areas within the IIS configuration of a Web Site or Web Service which could go wrong. This section will only touch on the more obvious and common areas in which you are likely to encounter issues. Where possible, this section will refer to both the Management Web Site and the CPSM Web Services as a Web Site or just simply a site.

IIS Bindings

IIS must be configured to respond to the web request appropriately.
Typically, the CPSM Management Web Site is configured to respond on either port 80 or over SSL on port 443, and CPSM Web Services are configured to respond on port 8095. Check that the bindings on the site match what you would expect them to be, and ensure that you are either calling the site according to the host headers configured, or there is a binding which allows for any host header to be specified.

Host Headers are not normally used unless the Bindings use Port 80 and there is a conflicting Web Site. This is the reason why CPSM initially configures the CortexWeb host header when CPSM is first installed.

Given that the bindings only really determine which ports or host headers IIS should respond to, you will only need to look into these further if there is no response, or a response stating that the web site could not be found, when attempting to browse to the URL of the site.

Test the connectivity by browsing to the web site locally from the server hosting it. Also, for Web Services you should ensure that they are responsive from the Web Server and Provisioning Server.

If a Web Service is not responsive, check the Server Connection details within CPSM to ensure the settings match the settings which IIS is configured to respond to. To do this, navigate to Configuration \ System Manager \ Server Connections:

Figure 22: Server Connections

Then either configure IIS to match the settings within the CPSM Server Connections page, or alter the details within CPSM to match IIS.

Make sure any firewall is configured to allow the traffic on the specific port between the appropriate servers. Test using Telnet if Telnet is available on the Source Server.

Application Pool setup

The Application Pool configuration differs between sites, so for the most part, responding to the particulars in any error message which your browser displays is the best solution.

Some application pools are configured to use impersonation, whereas some will run under the Application Pool context.
Essentially all should use impersonation, but unfortunately some Web Services need to run in the context of the Application Pool. Exchange is one Web Service that needs to run under this context.

Note: When you test a Web Service whose Application Pool uses impersonation by browsing to it, the context in which the Web Service runs will be the user account that YOU are logged on as. This will likely differ from the account that would normally be configured to call the Web Service.

Most Application Pools should be configured to run as Integrated, but if you find that this setting is causing issues, then by all means attempt to change it to Classic.

Some might run .Net Framework 2.0, and some might be 4.0. The majority of CPSM Web Services will run the 4.0 version of the Framework, but a few won’t run properly if configured with 4.0. This might not be apparent when testing, and might only show up by way of one or 2 methods failing to run properly, so it is a difficult one to determine.

All application pools should be configured with the “Load User Profile” property set to True. This will eliminate the possibility of the web site’s temporary files being removed by another service.

ASP.Net and .Net Framework

If you receive a 404.17 error when browsing to the .asmx file, then it is likely that ASP.Net is not installed or enabled for the version of the Framework that the Application Pool is using.
• Ensure the correct version of the Framework is installed
• Ensure that ASP.Net is installed and is enabled in IIS as in the following image

Figure 23: IIS - ASP.Net configuration

Web Page File permissions

You might receive a 500.19 Internal Server Error (Cannot read configuration file due to insufficient permissions). This may indicate that IIS does not have permissions to read the files for the Web Service.
Typically, you should never get these errors if you use the Citrix installers to install the Web Service, as the installer should set the correct permissions for the Web Service to work properly. You will only get errors like this if the security has been altered in some way. To resolve errors like this, either re-install the Web Service using the install media, or check another web site which is working properly and attempt to replicate similar permissions. There are times, however, where all permissions are configured correctly but IIS continues to give these errors. In that case the only thing to do is to delete the site and reinstall it.

Other Common Errors

Here are some other common errors that you might receive that will stop you from loading the Web Service .asmx page.

Service Unavailable (503) Error when Browsing to the .asmx File

This is normally caused because the user that the application pool runs as does not have the Log on as a Batch Job right within the security policy of this machine. To resolve this, follow the steps below:
1. Launch the Local Security Policy Console from the Administrative Tools menu or run secpol.msc from the command line.
2. Expand Local Policies > User Rights Assignment.
3. Add the appropriate user to the Log on as a Batch Job policy.
   Note: If you cannot add the user because the option is unavailable (grayed out), then you need to add it to the Domain Policy instead.
4. Close the Local Security Policy Console.
5. From Command Prompt, run GPUpdate.
6. Ensure that the application pool is running because it normally stops when this error occurs.

401.1 Unauthorised

Often this is related to Kerberos configuration issues. This error is getting less common with later versions of IIS, but is still worth a mention here.

Refer to CTX129868 - Troubleshooting Kerberos Error FAQ for how to enable Kerberos logging and identify some common Kerberos issues.
If you have turned on Kerberos logging and are not receiving Kerberos errors in the event log, then the errors might not be related to Kerberos. Check the authentication methods that are available on the web site using the following steps:
1. Launch the command prompt.
2. Navigate to C:\inetpub\Adminscripts.
3. Run Cscript adsutil.vbs Get w3svc//root/NtAuthenticationProviders.

You should receive a response saying that "Negotiate,NTLM" are allowed methods on the site. If you do not, then you need to add them by running:

    Cscript adsutil.vbs Set w3svc//root/NtAuthenticationProviders "Negotiate,NTLM"

401.3 Access Denied

If you are still getting permissions issues, then turning on file system logging might be a good idea. To do so:
1. Open the Local Security Policy
2. Expand the Audit Policy
3. Select Success and Failure for:
   • Audit Account Logon Events
   • Audit Logon Events
   • Audit Object Access
4. Run GPUpdate from a cmd prompt
5. In Windows Explorer edit the Properties of the affected folder
6. Select the Security Tab and click Advanced
7. Go to the Auditing Tab and add “Everyone”. Give this selection Full Control

Now check the Security Event Log to see what account needs access to which files.

Database Access

Before any process related to CPSM can access the configuration databases, it first needs to be able to retrieve the database connection strings from a special configuration section of CPSM. But how does the process even access this specific area? There is an encrypted connection string stored in the registry on any Server which needs this access. This would normally only be the Provisioning Server, the Web Server, and the SQL Server.
The setting is stored in HKLM\Software\EMS\SQLProperties\Settings or HKLM\Software\Wow6432Node\EMS\SQLProperties\Settings with a property name of ConnectionString.

See the section on Encryption for a better understanding of how encryption works within CPSM.

Once this “Master” connection string has been retrieved and decrypted, a call is made to the database to retrieve additional connection strings for connecting to the OLM and the OLMReports databases.

When the Management Web Site makes this call (which only happens on the initial loading of the site), it will store the connection strings as Application Settings within IIS to remove the need for repeated calls to SQL for this information.

Here are some common database related errors you might receive:

Failure to find OLMDotNet Connection String:

Figure 24: SQLProperties Error - Cannot find OLMDotNet Connection String

This represents a failure to get the database connection strings when the Management Web Site was first initialized. This particular error refers to not being able to find the connection strings, or more specifically, it cannot find the “ConnectionStrings” property “OLMDotNet”. This is likely to be caused by the value being missing from the database. These connection strings are stored in the OLMReports database within a table called Settings.

How do you resolve this?
• Check the existence of the property by running:
    Select * from olmreports..settings where application = 'connectionstrings' and property = 'OLMDotNet'
• Make sure the “Master” connection string is pointing to the OLMReports database by decrypting the connection string from the registry.
Refer to Encryption for more information on how to decrypt and encrypt values
• Recycle the CortexMgmt Application Pool on the Web Server to force the Web Site to retry its attempt to get these connection strings

Another variation of this error message is:

Figure 25: SQLProperties Error - Connection String Property not initialized

Notice the slight difference with this error as compared to the earlier one. This one states:

    SQLProperties: Error fetching Application Properties. The ConnectionString property has not been initialized.

This is actually a failure to decrypt the “Master” connection string from the registry and should be resolved by the following actions:
• Make sure the “Master” connection string can be decrypted and is pointing to the OLMReports database by decrypting the connection string from the registry. Refer to Encryption for more information on how to decrypt and encrypt values
• Make sure the Encryption Key exists for LocalMachine within the Registry and that the key value is correct. Refer to Encryption for more information on how to work with the encryption keys.
• Recycle the CortexMgmt Application Pool on the Web Server to force the Web Site to retry its attempt to get these connection strings

Encryption

Encryption is used throughout CPSM to ensure that sensitive data cannot be compromised in the event that the CPSM databases are accessed by unauthorized parties.

Prior to V11.5, CPSM used a component called SQLProperties which used a custom Blowfish implementation to encrypt and decrypt data. This encryption did not support multi-byte character sets, so it was replaced in v11.5 with AES 256 bit encryption, which does handle these character sets. This allows CPSM to support a wider range of languages, such as Japanese and Simplified Chinese.
CPSM encrypts data in the following areas:
• The “Master” database connection string stored in the registry
• The credentials used for Server Connections
• Configuration Settings
• Throughout the service property hierarchy
• Security Questions and Answers

There may be times where you need to decrypt or re-encrypt a value, or to simply test that the encryption is working as intended. The following topics discuss each of these encryption methods in a little more depth and provide details on how they can be used to decrypt or encrypt values.

SQLProperties

SQLProperties is a 32bit COM component which resides on the Web Server and the Provisioning Server.

A value which is encrypted using SQLProperties will look something like this:

    6A4042304506B8E93C2D34530EA71E77992380FD0E88F4E6189E861987A8739FCB21C82D24BC4
    7906EE026C99E69808AA451FD31D1A83BA7A0F3E2C5C229F9DF2D19500BDE913AD4DC754F7670
    875A2C

Notice that the value is all in upper case with only alphanumeric characters.

To encrypt and decrypt values you can use the following 2 methods.
These should be executed on either the Provisioning Server or the Web Server:

VBScript

To encrypt a value, create a file with a .vbs extension and include the following contents:

    Dim s
    Dim encryptedval
    ' Create the 32bit SQLProperties COM object and encrypt the value
    Set s = CreateObject("SQLProperties.Session")
    encryptedval = s.Encrypt("ValueToEncrypt")
    wscript.echo encryptedval

Execute the script with the 32bit version of cScript.exe from within a cmd prompt like so:

    C:\windows\syswow64\cScript.exe encryptionexample.vbs

The result will be something like this:

    AC7D0FE0F41E8182FEEF8D21DCD8E8D0CDE8EC5B1E6FC260FB47478EE00F75AB

Then to decrypt a value, create a file with a .vbs extension and include the following contents:

    Dim s
    Dim decryptedval
    ' Create the 32bit SQLProperties COM object and decrypt the value
    Set s = CreateObject("SQLProperties.Session")
    decryptedval = s.Decrypt("AC7D0FE0F41E8182FEEF8D21DCD8E8D0CDE8EC5B1E6FC260FB47478EE00F75AB")
    wscript.echo decryptedval

Execute the script with the 32bit version of cScript.exe from within a cmd prompt like so:

    C:\windows\syswow64\cScript.exe decryptionexample.vbs

The result will be something like this:

    ValueToEncrypt

PowerShell

To encrypt a value, open a 32bit PowerShell window, and type the following:

    $s = New-Object -ComObject "SQLProperties.session"
    $s.Encrypt([ref]"ValueToEncrypt")

The result should be something like this:

    AC7D0FE0F41E8182FEEF8D21DCD8E8D0CDE8EC5B1E6FC260FB47478EE00F75AB

To decrypt a value, open a 32bit PowerShell window, and type the following:

    $s = New-Object -ComObject "SQLProperties.session"
    $s.Decrypt([ref]"AC7D0FE0F41E8182FEEF8D21DCD8E8D0CDE8EC5B1E6FC260FB47478EE00F75AB")

The result should be something like this:

    ValueToEncrypt

AES 256bit encryption

This is the newer encryption used within CPSM. It is a little more difficult to work with than SQLProperties, but as mentioned earlier, it supports a wider range of characters than SQLProperties did.
Figure 26: Retrieval of Encryption Keys

A value which is encrypted using AES 256bit encryption will look something like this:

    9HeWZecRtS8XxuLPnVaFqg==;cq1Xic1c8JD9na9lbk7+b8O4uH8cz4zmH1ktK52tNTFAUmR4if64RiJFzGIFNJivy6qjVFwHRnWULY6kDOi4pPItS1Kmr/2yr5S61dnzO5o4idFK60aZhU7/oGAzV9AIizWKGifaVOP2NsIMdTevWvhix1z+FIdZKsqOKwHyOHnF9JuvkkaUo8qcE4dSombID4X4IZK5CjYRUqAr0mMQeH5lq8w0vfirEQ90EkTcsiWxzTLFqRCodyg2Z2G1tdLCi3S3bQLzpJFaoLOkbWheKMQnL5sLPrx0iIQWpPSrLWw2lNmF2kmY+7gWuQCZwLRnf1UfnPH2sEtjJkAE41RTIMcQQLPuQmyNjbzPr7c/pK3W/JCST1Sn5mu16z4vm4zNHcvDyebA9tU+jSA0P3hykA==

Notice that the value, unlike the Blowfish implementation, is in mixed case and includes more than just alphanumeric characters.

In order for any user to be able to encrypt or decrypt values, the user must have access to the Master key. This Master key is created when the Encryption Web Service is installed, which must be the first component installed for any new installation and for any upgrade to V11.5 of CPSM. The Web Service is then used to obtain this Master key and to configure or create any user-specific versions of the key. This is only done during installation or upgrades of services.

NOTE: The Encryption Web Service does not need to be available for normal operation of CPSM.

The user-related Encryption Keys are stored in the registry in the following location:

    HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Citrix\Cortex\Keys\

The above diagram shows the user context which is used by each component, and therefore the user encryption key which will be used in each case. Since the Provisioning Manager is launched by the logged-on user, there needs to be a key in the registry for each user who wishes to use this component.
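When checking whether a key has been stored for a particular account, it can help to list what exists under the registry location above. The following is a minimal sketch only: the function name Get-CpsmEncryptionKeyEntry is made up for illustration, and because this guide does not state whether the per-user keys are stored as registry values or as subkeys, the sketch lists both. It returns names only and never reads the key material itself.

```powershell
# Hypothetical helper (not part of CPSM): lists the names of the entries
# found under the Keys location described in this guide.
function Get-CpsmEncryptionKeyEntry {
    param(
        [string]$Path = 'HKLM:\SOFTWARE\Wow6432Node\Citrix\Cortex\Keys'
    )

    # Nothing to list if the location does not exist on this machine.
    if (-not (Test-Path -Path $Path)) { return @() }

    $key = Get-Item -Path $Path

    # Assumption: keys may be stored as values or as subkeys, so collect both.
    $valueNames  = @($key.Property)
    $subKeyNames = @(Get-ChildItem -Path $Path | ForEach-Object { $_.PSChildName })

    # Drop empty entries and emit the remaining names.
    ($valueNames + $subKeyNames) | Where-Object { $_ }
}

# Usage: run on the Web Server or Provisioning Server.
Get-CpsmEncryptionKeyEntry
```

If the account you are troubleshooting is not represented in the output, that is consistent with the missing-key situation described above, although the exact naming of the entries is not confirmed by this guide.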
To provide a new user access to launch the Provisioning Manager, refer to the following article: How to Provide User Access to Launch Provisioning Manager for CPSM 11.5

Use the following information to help with resolving issues related to this encryption.

Setup Powershell to use the 4.0 version of the .net framework

Before working with the encryption libraries, you will need to set up Powershell to use the 4.0 version of the .net framework, as by default only the 2.0 version will be used. Create a powershell.exe.config file in the same folder as the powershell.exe (32bit) and add the following content:

After creating this file, you will need to restart Powershell for the change to take effect.

Assembly

All encryption management functionality is encapsulated within the following assembly:

    Citrix.Csm.dll

This assembly can be found at C:\Program Files (x86)\Citrix\Cortex\Provisioning Engine\ within the Provisioning Engine, and at C:\inetpub\CortexManagement\CortexDotNet\Bin\ within the Management Web Site.

From a 32-bit Powershell window, you can load the assembly with the following command:

    [system.reflection.assembly]::LoadFrom("\Citrix.Csm.dll")

Classes

There are 2 classes of interest:

    Class                    Purpose
    EncryptionKeyProvider    Used for key management, i.e. it writes keys to the registry
    EncryptionProvider       Used for encryption / decryption of values

Table 4: Encryption Classes of interest

Encryption Keys

To create an instance of the EncryptionKeyProvider class, you would execute the following:

    $provider = New-Object citrix.csm.encryption.encryptionkeyprovider

Run Get-Member on the $provider object to see what other methods are available. You will probably find the ones of interest are:

• GenerateKey
• GetKey
• SetKey

The GenerateKey method is called automatically when the Encryption Web Service is first installed and should never be executed manually, but the GetKey and SetKey methods can be useful.
Both the GetKey and SetKey methods work with the Encryption Key stored in the Registry for the executing user.

The GetKey method retrieves the encrypted key from the registry and decrypts it. Executing this from the context of each user who has a stored Encryption Key should result in the same decrypted key value. If the key value returned by this method is different for any user, then you are likely to have problems with encrypted data. You will need to determine which key is the correct one, then execute the SetKey method to repair the incorrect accounts.

The GetKey method does not require any parameters and is called like so:

    [system.reflection.assembly]::LoadFrom("\Citrix.Csm.dll")
    $provider = New-Object citrix.csm.encryption.encryptionkeyprovider
    $provider.GetKey()

The SetKey method creates or updates the Encryption Key in the registry for the current user, with the exception of the Management Web Site. The Web Site uses a special key called LocalMachine, which is set a little differently than for normal accounts.

The SetKey method can be called in 2 ways.
The most common method is to include a single parameter with the value of the key you are trying to set for the user, as in the following:

    [system.reflection.assembly]::LoadFrom("\Citrix.Csm.dll")
    $provider = New-Object citrix.csm.encryption.encryptionkeyprovider
    $provider.SetKey("{KeyValue}")

The less common method is to set the key for the context of the Local Machine, as in:

    [system.reflection.assembly]::LoadFrom("\Citrix.Csm.dll")
    $provider = New-Object citrix.csm.encryption.encryptionkeyprovider
    $provider.SetKey("{KeyValue}", "LocalMachine")

Encrypting and Decrypting Data

To encrypt or decrypt data you will use the EncryptionProvider class. To create an instance of the EncryptionProvider class, you would execute the following:

    $provider = New-Object citrix.csm.encryption.encryptionprovider

Encrypting a value is as simple as running:

    [system.reflection.assembly]::LoadFrom("\Citrix.Csm.dll")
    $provider = New-Object citrix.csm.encryption.encryptionprovider
    $provider.Encrypt("{Value To Encrypt}")

And decrypting an encrypted value is as simple as running:

    [system.reflection.assembly]::LoadFrom("\Citrix.Csm.dll")
    $provider = New-Object citrix.csm.encryption.encryptionprovider
    $provider.Decrypt("{Encrypted Value}")

Recorded Errors

Typically, errors which occur in the Management Web Site User Interface will display an error similar to the following:

Figure 27: User Interface error example 1

or

Figure 28: User Interface error example 2

When errors such as these are displayed, the error should also be logged in the database and can be viewed by navigating to the following page within the CPSM Management Web Site:

    Configuration \ Provisioning & Debug Tools \ Recorded Errors

Figure 29: Recorded Errors

The Recorded Errors page is ordered from newest message to oldest, and provides paging buttons to help you in navigating through the errors.
This will be useful if a customer has reported an error and you need to look back in time to see what occurred at the time of the incident. The error details within the Recorded Errors page may help you work out what is causing the error, or can at least point you in the right direction for further troubleshooting.

Provisioning Request Logs

Each provisioning request is processed by a set of provisioning rules that determine which provisioning actions need to be executed to fulfill the request. As a part of this configuration, each action also states whether or not to log the result of the action on success or failure. These logs can be very useful in narrowing down exactly where an issue might be, and even in the case of a successful execution they can help with determining areas of poor performance.

Provisioning Request Logs can be viewed by navigating to the following page within the CPSM Web Site:

    Configuration \ Provisioning & Debug Tools \ Provisioning Request Logs

Figure 30: Provisioning Requests

Use the Request Filter to narrow down your search. This will help in reducing the time to load the page, especially in more active environments. Initially only provisioning requests which you have submitted are displayed, so if the issue you are researching comes from a customer, you will need to select the “All Requests” radio button to be able to see the provisioning requests the customer submitted.

Request Filter Criteria

The following table describes the options for selecting a Request Filter in a little more detail:

Filter Criteria / Description

Type:
The following options are available:
• All Types
• Object Provision
• Object Deprovision
• Usage Data Request

Requested By:
Choose between “My Requests” or “All Requests”. “My Requests” is the default and is probably the most useful when you encounter problems while provisioning objects yourself, but is not so useful when looking to resolve a problem which has been reported by someone else.
Bear in mind that selecting “All Requests” will result in a larger result set and can cause timeouts in environments which have higher volumes of provisioning activity. To help with this, it’s a good idea to further narrow your search by using the date range criteria.

Request Status:
The following options are available:
• Provisioned
• Failed
• Pending Changes
• Requested
• In Progress
• Deprovisioned
This filter will list all historical provisioning requests which have been recorded.
NOTE: This is not the current status of an object. For example, an attempt to provision Bob resulted in an error, but was later corrected with a successful provision. There will be 2 entries in the Provisioning Request Logs for Bob: one as a failed request, and the other as successful. So it’s important to bear in mind that this relates to historical results, and might not be reflective of current status.

Object Status:
The Object Status has similar options to the Request Status. Object Status reflects the current status of an object, which is likely to match the most recent provisioning request status for the same object. This is useful for finding all objects which are currently in a “Failed” state so that actions can be taken to correct them.

Specify date range:
A date range helps in narrowing down the search criteria. This will result in quicker response times and less need to page through many pages of requests to find the one you are interested in.

Table 5: Provisioning Request Filter Options

When looking at “All Requests”, you will often want to know who submitted the request. By hovering over the request in the “Request Type” column, you will be presented with a pop-up or tool tip similar to the following image:

Figure 31: Provisioning Requests - Requestor popup

Provisioning Request Details

Clicking on the Request Type (“Object Provision”, “Object Deprovision”, etc.) will drill down into the Provisioning Request and allow you to see the individual rules which were executed.
Figure 32: Provisioning Request Details

Web Services

Web Services are used quite heavily within CPSM, both as a source of information and as a channel for making changes within the hosted environment. This is done for various reasons: for example, to have a single point of call for performing a task from different parts of the system, which eliminates the need for duplicating code, or to be able to perform an action that is local to the service that is being configured.

Not all Web Services are created equal. Different technologies are used in creating Web Services, so the methods used to troubleshoot one Web Service might differ from the methods used for another. For the most part, however, the steps outlined here should help with the majority of Web Services used by CPSM.

How does CPSM communicate with Web Services?

The following diagram shows the typical architecture of the CPSM components. The red highlighted parts show the flow that is related specifically to Web Services.

Figure 33: Web Service Interaction Overview

As you can see from this diagram, Web Services are called by the Management Web Site and by the Provisioning Engine.

The Management Web Site communicates with a Web Service to gather information it might need for displaying to the end user, or information that it might need to include in a Provisioning Request. In a few isolated cases where a queued request is not suitable, the Management Web Site might also need to call a Web Service to create an object directly, or to make a configuration change to an object. For example:

• When creating a Security or Distribution Group
• When configuring security on a Security Group
• When configuring a Distribution Group to be mail enabled, or
• When creating and setting security on Public Folders

The Provisioning Engine communicates with the Web Services to make actual environmental changes.
Server Connections

The configuration needed for CPSM to communicate with a Web Service is stored in the database and can be managed by navigating to the following page within the Management Web Site:

    Configuration \ System Manager \ Server Connections

Figure 34: Web Service Management

Clicking on the Test icon on the right hand side will instruct the Web Site to attempt to connect to the Web Service according to the configuration of the Server Connection.

NOTE: The Test method here only tests the connection to the Web Service from the Management Web Site; it does not test connectivity from the Provisioning Server.

Traffic light icons will replace the Test icon depending on the result of the test, as demonstrated in the above image. If the test fails, you can hover over the red icon to get a more detailed error message describing the cause of the failure.

Figure 35: Server Connection Test Failure

An even more in-depth error message should be available within the Recorded Errors page.

Click on the name of the Role to expand the iFrame to manage the server connection.

Figure 36: Web Services - Manage Server Connections

The values listed under the Server and Credentials dropdown boxes are configurable on the Server Roles and Credentials pages within the System Manager pages.

NOTE: Configuring these is outside the scope of this document at this time.

CPSM Web Services are typically configured to answer on port 8095, but this is not guaranteed to be true in all environments. A different port may have been needed because of a conflict or a business requirement in that environment.

Occasionally the URL Base points only to a folder (i.e. there is no page reference). This will be the case when there are multiple base pages. For example, the Windows Web Hosting Service has separate pages for IIS6 and IIS7 management tasks.
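To make the relationship between the Server, port and URL Base settings concrete, the sketch below assembles the address that would be browsed to for a given Server Connection. It is an illustration only: the function name Get-WebServiceUrl and its parameters are made up, plain http is assumed, and the server name and URL Base are simply the example values used elsewhere in this guide. The optional Page parameter covers the case where the URL Base points only to a folder.

```powershell
# Hypothetical helper (not part of CPSM): composes a Web Service address
# from the parts shown on the Server Connections page.
function Get-WebServiceUrl {
    param(
        [string]$Server,
        [int]$Port = 8095,        # typical CPSM port; may differ per environment
        [string]$UrlBase,
        [string]$Page = ''        # only needed when the URL Base is a folder
    )

    # Normalise the URL Base so it has exactly one leading slash
    # and no trailing slash.
    $path = '/' + $UrlBase.Trim('/')

    # Append the page name when the URL Base points only to a folder.
    if ($Page) { $path = $path + '/' + $Page }

    # Assumption: the Web Service is published over plain http.
    'http://{0}:{1}{2}' -f $Server, $Port, $path
}

# Usage with the example values from this guide:
Get-WebServiceUrl -Server 'CPSMEX2013' -UrlBase '/ExchangeWS/HostedExchange.asmx'
# → http://CPSMEX2013:8095/ExchangeWS/HostedExchange.asmx
```

The resulting address is the one to paste into a browser on the Web Server or Provisioning Server when checking whether the Web Service entry point loads.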
Common Errors

Some common errors you may receive when testing the Web Service via the Test icon are:

The remote name could not be resolved: ‘’

• Check that the server name can be resolved to an IP address. If there is an alias on the Server settings, check that the alias can be resolved as well.

Unable to connect to the remote server

The server name can be resolved, but no response is received from the Web Service.

• Check that the correct Server is configured for the Web Service and that the path and port are correct.
• Check that the port is accessible from the Web Server to the Server running the Web Service.
• Check that the Bindings in IIS will allow IIS to accept the request given the configured settings for the Server Connection.

General Web Service Troubleshooting

Firstly, you need to ensure that CPSM can call the Web Service, so let’s look at what might be involved in a simple Web Services call.

We need to be able to get the credentials and the path to the page which supplies the Web Service methods. These things are stored in the Configuration Database, so there is SQL connectivity and data to consider.

Then we need to connect to it. Here we have ports and IIS settings, as well as the IIS environment itself, including ASP.Net and the .Net Framework.

Can we browse to the Web Service entry point from the Web Server and the Provisioning Server? We obviously have the network to consider.

Does the Web Service method page load? If not, then there may be issues with the IIS config or web.config file, or maybe even the .Net Framework version.

Can we execute a simple method? Most Web Services have a connection method that can be used as a simple test.

Can we execute a method that interacts with the Service environment? Here there may be credentials to consider.

Does the Web Service call actually do what is intended?
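The first two checks (name resolution, then reachability of the port) can be scripted so that they can be repeated from both the Web Server and the Provisioning Server. The following is a sketch under stated assumptions: the function name Test-WebServiceEndpoint is made up for illustration, it relies only on standard .NET classes (System.Net.Dns and System.Net.Sockets.TcpClient), and it does not test IIS bindings, credentials or the Web Service itself.

```powershell
# Hypothetical helper (not part of CPSM): checks DNS resolution and TCP
# reachability for a Web Service host and port.
function Test-WebServiceEndpoint {
    param(
        [string]$Server,
        [int]$Port = 8095,       # typical CPSM port; may differ per environment
        [int]$TimeoutMs = 3000
    )

    $result = New-Object psobject -Property @{
        Server   = $Server
        Port     = $Port
        Resolved = $false
        PortOpen = $false
    }

    # Step 1: can the name be resolved? A failure here corresponds to the
    # "remote name could not be resolved" error.
    try {
        $null = [System.Net.Dns]::GetHostAddresses($Server)
        $result.Resolved = $true
    } catch {
        return $result
    }

    # Step 2: does anything accept a TCP connection on the port? A failure
    # here corresponds to "Unable to connect to the remote server".
    $client = New-Object System.Net.Sockets.TcpClient
    try {
        $async = $client.BeginConnect($Server, $Port, $null, $null)
        if ($async.AsyncWaitHandle.WaitOne($TimeoutMs)) {
            # EndConnect throws if the connection was refused.
            $client.EndConnect($async)
            $result.PortOpen = $true
        }
    } catch {
        # Connection refused or reset: leave PortOpen as $false.
    } finally {
        $client.Close()
    }

    $result
}

# Usage with the example server name from this guide:
Test-WebServiceEndpoint -Server 'CPSMEX2013' -Port 8095
```

Remember to run the check against the Server Alias as well as the server name if an alias is configured, since the alias is used in preference to the name.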
So more simply put, we have the following possible areas of failure:

• SQL Connectivity
• Web Service connection details could be wrong or missing
• IIS Bindings on the Web Service Site
• Ports between the Management Web Site or the Provisioning Engine and the Web Service
• Application Pool setup
• ASP.Net
• .Net Framework
• IIS config
• Web.config settings
• Web Page permissions
• Web Service Credentials (who are we impersonating to interact with the Service environment?)
• Powershell Scripts
• Service Environment

There is a lot going on here, and therefore there are a lot of possible failure points. If you are faced with an issue, especially one where you cannot even load the Web Service’s entry point, you can use the above diagrams and points to determine where in the process the issue is.

Additionally, you should perform the following tests:

• Browse to the Management Web Site. Any page will do because, as mentioned earlier, the Web Site loads page content from the Database. If the Web Site itself is still working, then Database Access is not likely to be a cause; otherwise refer to Database Access to troubleshoot further.
• Check that the Web Site is able to access the Web Service by running the “Test Connection” method as demonstrated within the Server Connections section earlier.
    o If this fails, then:
        • Check the credentials specified on the Server Connections screen.
        • Check the Server Alias configured on the Servers Configuration screens. NOTE: The alias is used in preference to the server name, and is often a way to bypass DNS to access a server’s resources.
• Browse to the Web Services entry point from the server where the request is failing.
    o If this fails, then browse to the Web Services entry point (.asmx or .svc page) locally from the server hosting the Web Service. This should show you a more descriptive error as to why it’s failing.
        • If the Web Service fails to load locally, then the error displayed should lead you to the cause. The IIS Configuration section might help in the resolution of this.
        • If the Web Service loads successfully, then the next section, “Troubleshooting a Working Web Service”, may be helpful.

Troubleshooting a Working Web Service

The rest of this section deals with troubleshooting a Web Service which appears to work but may not function as expected, i.e. the Test Connection method works successfully and the .asmx page loads, but executing one or more methods either fails or returns unexpected data.

Web Service Credentials

A Web Service executes under the context of the Application Pool unless impersonation is enabled on the Web Site, in which case it runs under the context of the calling user account. Most of the CPSM Web Services will be configured to use impersonation.

A Web Service may appear to work fine, but if the account that is being used to run the Web Service does not have the appropriate rights for the task being performed, you may receive unexpected results. An error like "User cannot be found", even though you can see in Active Directory that the user does exist, may indicate that the account does not have the rights to see the user.

So for modifying Active Directory, the Web Service account must have Domain Admin rights. For modifying the Exchange environment, the account must have at least the Organizational Management rights, and so on.

A good way to test this is to run the Web Service method manually (from the context of your logged-on account, which might be a Domain Admin). If you get different results, then it may point to different rights between your account and the Web Service account.
Manual Execution of Web Service Method

Manual execution of a Web Service method can sometimes show a more detailed error and give a better understanding of what is actually going on.

When running a method manually, the hardest part is determining the actual method to run and the parameters to supply. You can always make parameters up, but it's best to supply the same ones that caused the error, as the issue might actually be found in the parameters themselves.

If the error occurred when the Provisioning Engine was running one of its rules, you will find the error message by looking at the Provisioning Request Logs. Here is an example of a Provisioning Request error which relates to the execution of a Web Services method.

Figure 37: Web Service Action Failed Error Message

Notice that one of the parameters refers to a Web Service Connection. The details of the Web Service Connection are:

• Server Name = CPSMEX2013
• Port = 8095
• The Executing User = r2-ftl.local\csm_exchange_svc
• The URL Base = /ExchangeWS/HostedExchange.asmx

This does not tell you which Web Services method is called, but the rule name is normally closely related to the method name. It also provides you with the list of parameters. Bear in mind that these parameters are for the Provisioning Rule Action and not for the actual Web Service method, so they might not be the same, given that a lot of code is executed before the Web Service is even called. The odds are, though, that they will be close.

Browse to the Web Service at the URL provided, i.e. http://CPSMEX2013:8095/ExchangeWS/HostedExchange.asmx, and look for a method which resembles the name of the failing rule.

Here is a sample of some of the methods for the Exchange Web Service for CPSM v11.5.

Figure 38: Sample Web Service Methods for the Exchange Web Service

The method that we want here, which relates to the error above, might be LeastUsedMailDatabase.

Click the Method Name.
This will bring you to a Method Test page which in this case looks like this:

Figure 39: Web Service Test Method page

Some Web Service methods may accept complex parameters such as an array, a hashtable, or an object. Unfortunately, these methods cannot be executed through this Test interface, so they are a little harder to troubleshoot. However, there are free tools available on the internet which will allow you to execute Web Service methods which contain these complex parameter types, such as WCFStorm or SoapUI.

Fill in the parameters and click Invoke. This will execute the method, and if it fails, it may give you a more detailed error than the one you received earlier.

Powershell Scripts

With more and more Services these days adopting the use of Powershell, a lot of the CPSM Web Services also use Powershell to interact with the Service Environment. Each Web Service (if it uses Powershell) will have a similar layout.

Figure 40: Sample Web Service Filesystem Layout

The Scripts folder contains any Powershell scripts that are used within the Web Service.

The Custom folder is where you can place any scripts which you have customized to behave in a manner which is more suitable to your environment. The Web Service will first look in the Custom folder before it looks in the Scripts folder for the script it needs to run, and it uses the first one it finds. So if any modifications are needed, a similarly named script (with modifications) can be placed into the Custom folder and this will take precedence over the original script.

Note: Scripts in the Scripts folder should NEVER be modified.

Manual execution of the Powershell Script

Running a script manually will allow you to determine if:

• The parameters are wrong
• There is an environmental issue
• The script itself needs a modification in some way
• There is a preceding issue that was not actually logged

Sometimes an error message will contain a snippet of the Powershell script that caused the error.
It can be a good idea to copy this script and execute it within the Service Shell environment.

Following is an example of a Web Service error which also shows the Powershell script which failed during execution.

Failed to Create mailbox ('CN=Mailbox Database 0920661070,CN=Databases,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=R2-FTL,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=r2-ftl,DC=local','74017137b4764fd797ee4d1fc83a3199','False','alister_Atc','alister_atc','2010','EMS.Cortex.WebServiceConnection to r2-ftl.local\csm_exchange_svc @ http://CPSMEX2013:8095/ExchangeWS/HostedExchange.asmx','DC2.r2-ftl.local','0','','60','3','','[email protected]')

Error: Server was unable to process request. ---> Failed to create a mailbox for 'alister_Atc' in store 'CN=Mailbox Database 0920661070,CN=Databases,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=R2-FTL,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=r2-ftl,DC=local' ---> Unable to run the "$Identity = 'CN=alister_Atc,OU=Aarons test customer(Atc),OU=Customers,DC=r2-ftl,DC=local'; $Alias = 'alister_atc'; $Database = 'CN=Mailbox Database 0920661070,CN=Databases,CN=

Stack: Server was unable to process request. ---> Failed to create a mailbox for 'alister_Atc' in store 'CN=Mailbox Database 0920661070,CN=Databases,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=R2-FTL,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=r2-ftl,DC=local' ---> Unable to run the "$Identity = 'CN=alister_Atc,OU=Aarons test customer(Atc),OU=Customers,DC=r2-ftl,DC=local'; $Alias = 'alister_atc'; $Database = 'CN=Mailbox Database 0920661070,CN=Databases,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=R2-FTL,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=r2-ftl,DC=local'; $isRecoverMailboxEnabled = $False; $MailstoreLocations = '74017137b4764fd797ee4d1fc83a3199'.Trim().ToLower().
Split(","); $LegacyDN = ''.Trim(); $PrimarySmtpAddress = '[email protected]'; $User = Get-User -Identity $Identity -DomainController 'DC2.r2-ftl.local'; if ($User.RecipientType.value__ -eq 1) { $err=@(); $mailbox = $null; if ($isRecoverMailboxEnabled){ if ($LegacyDN -ne ''){ $mailbox = Get-MailboxStatistics -Identity $LegacyDN -ErrorAction silentlycontinue | Sort LastLogonTime -Descending; } if (!$mailbox) { $dbs = Get-MailboxDatabase | where { [System.Array]::IndexOf($MailstoreLocations, $_.GUID.ToString().Replace("-","").ToLower()) -ne -1 }; foreach ($mdb in $dbs) { Clean-MailboxDatabase $mdb.DistinguishedName -ErrorAction silentlycontinue; if ($LegacyDN -ne ''){ $mailbox = Get-MailboxStatistics -Server $mdb.Server.Name | where { $_.LegacyDN -eq $LegacyDN } | Sort LastLogonTime -Descending; } else { $mailbox = Get-MailboxStatistics -Server $mdb.Server.Name | where { $_.LegacyDN -like ('*cn=' + $User.Identity.Rdn.EscapedName) -and $_.ServerName -eq $mdb.Server.Name -and $_.DatabaseName -eq $mdb.Name -and $_.ItemCount -gt 0 } | Sort LastLogonTime -Descending; } if ($mailbox){ break; } } } if ($mailbox){ if ($mailbox -is [Array]) { $mailbox = $mailbox[0]; } $Database = $mailbox.Database.DistinguishedName; Connect-Mailbox -Identity $mailbox.Identity.MailboxGUID -Database $Database -User $Identity -Alias $Alias -confirm:$False; }; }; if (!$mailbox -or $err.count -ne 0){ Enable-Mailbox -Identity $Identity -Alias $Alias -Database $Database -PrimarySmtpAddress $PrimarySmtpAddress -DomainController 'DC2.r2-ftl.local'; }; }; " PowerShell script. ---> The Mailbox database "Mailbox Database 0920661070" isn't on a server running Exchange 2013 or a later version. If you want to create or enable a Exchange 2010 mailbox, or want to enable the archive or remote archive for a Exchange 2010 Sp1 mailbox, please execute the cmdlet from a Exchange 2010 server.
---> The Mailbox database "Mailbox Database 0920661070" isn't on a server running Exchange 2013 or a later version. If you want to create or enable a Exchange 2010 mailbox, or want to enable the archive or remote archive for a Exchange 2010 Sp1 mailbox, please execute the cmdlet from a Exchange 2010 server.
at System.Web.Services.Protocols.SoapHttpClientProtocol.ReadResponse(SoapClientMessage message, WebResponse response, Stream responseStream, Boolean asyncCall)
at System.Web.Services.Protocols.SoapHttpClientProtocol.Invoke(String methodName, Object[] parameters)
at EMS.Cortex.Service.HE.HEWebService.HostedExchange.CreateMailbox(String sAmAccountName, String MailboxAlias, String DestinationMDBUrl, String MailStoreLocations, Boolean MoveExistingMailBox, String DCServer, Int32 BadItemLimit, Boolean isRecoverMailboxEnabled, String LegacyDN, String PrimarySmtpAddress)
at EMS.Cortex.Provisioning.Actions.Exchange.CreateMailBox.Do(Hashtable Properties)
at Citrix.Csm.Sdk.V1.Provisioning.ActionBase.Citrix.Csm.Sdk.V1.Provisioning.IProvisioningAction.Do(Hashtable properties)
at EMS.Cortex.Provisioning.Actions.Standard.External.SdkActionAdapter.OnDo(Hashtable Properties)
at EMS.Cortex.Provisioning.Actions.Base.ActionBase.Do(Hashtable Properties)

It is important to note that the Error section is truncated after a certain number of characters, so the script shown there may not be complete. The script is then repeated in the Stack section of the error, where it will always be complete. The script is all of the text between the following phrases:

    ---> Unable to run the {Script} Powershell script --->

minus the surrounding double quotes.

Copy the script and execute it piece by piece (in this instance) in the Exchange Shell environment.
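Picking the script out of a long error message by hand is error-prone, so the extraction rule above can be sketched in a few lines. This is an illustration only: the function name Get-FailedScriptText is made up, and the pattern assumes the message has the shape shown in the example above, with the script wrapped in double quotes between "Unable to run the" and "PowerShell script". It reads from the Stack section when one is present, because the Error section may be truncated.

```powershell
# Hypothetical helper (not part of CPSM): extracts the failed script from a
# recorded Web Service error message.
function Get-FailedScriptText {
    param(
        [string]$ErrorText
    )

    # Prefer the Stack section, where the script is not truncated.
    $stackIndex = $ErrorText.IndexOf('Stack:')
    if ($stackIndex -ge 0) {
        $ErrorText = $ErrorText.Substring($stackIndex)
    }

    # (?s) lets "." span line breaks, (?i) ignores case. The lazy group stops
    # at the first double quote that is followed by "PowerShell script", so
    # double quotes inside the script itself are kept.
    $pattern = '(?si)Unable to run the\s+"(.*?)"\s+PowerShell script'
    $match = [regex]::Match($ErrorText, $pattern)

    if ($match.Success) {
        return $match.Groups[1].Value.Trim()
    }
    return $null
}

# Usage: paste the full error text into a variable and extract the script.
$errorText = 'Stack: Server was unable to process request. ---> Unable to run the "$Alias = ''sample''; Get-User -Identity $Alias;" PowerShell script. ---> sample failure'
Get-FailedScriptText -ErrorText $errorText
```

The returned text can then be pasted into the relevant Service Shell and run one statement at a time.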
The aim here is to identify the exact part of the script which is failing, then work backwards from there to resolve the issue.

You are likely to get the same error if there is actually something wrong with the parameters passed into the script, but occasionally it might actually work. Now that you have the script with all of the parameters included, you can make modifications until it works, and then possibly make the corresponding modification to the script in the Custom folder of the Web Service itself.

Tracing a Web Service request

Lastly, when all else has failed to uncover the cause of the issue, or you cannot find which script is being run, you can trace the Web Service requests. This will sometimes show you a little more information.

To set up tracing of a Web Service you need to:

1. Edit the web.config file that is in the Root folder of the Web Service.
2. Modify the following line: so the enabled parameter is "true" and the requestLimit is a value that will capture enough requests for you to see the request you need. For example, 40.
3. Save the web.config file.

Now you can browse to the trace.axd page. This page is located at the Root of the Web Service Application in IIS. In the example we have been using for the Exchange Web Service, this is located at:

    http://CPSMEX2013:8095/ExchangeWS/trace.axd

Now perform the action that failed again through the Control Panel and then refresh the trace page. You should see something like this.

Figure 41: Web Service Trace

The entry you are interested in is likely to be the one with a Status Code of 500. Click the "View Details" link for that entry.

Figure 42: Web Service Trace - Request Details

This trace shows the Powershell script that is being executed, which in this example is "GetMailboxDatabase.ps1". You can now perform further tests using this script as a base, and know which script to modify (in the Custom folder) in order to (hopefully) resolve the issue.
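Before modifying anything, it is worth confirming which copy of a script the Web Service would actually pick up, given that the Custom folder is searched before the Scripts folder. The sketch below mirrors that lookup order. It is an approximation rather than the Web Service's own code: the function name Resolve-WebServiceScript is made up, and it assumes that the Custom and Scripts folders sit directly under the Web Service root folder.

```powershell
# Hypothetical helper (not part of CPSM): returns the path of the script copy
# that takes precedence, checking Custom first and Scripts second.
function Resolve-WebServiceScript {
    param(
        [string]$WebServiceRoot,
        [string]$ScriptName
    )

    # Assumption: both folders are direct children of the Web Service root.
    foreach ($folder in 'Custom', 'Scripts') {
        $candidate = Join-Path (Join-Path $WebServiceRoot $folder) $ScriptName
        if (Test-Path -Path $candidate -PathType Leaf) {
            # The first match wins, so a Custom copy overrides the original.
            return $candidate
        }
    }

    # Neither folder contains the script.
    return $null
}

# Usage (the root path here is a placeholder, not a documented location):
Resolve-WebServiceScript -WebServiceRoot 'C:\SampleWebServiceRoot' -ScriptName 'GetMailboxDatabase.ps1'
```

If the result points at the Scripts folder, copy the script into the Custom folder and edit the copy, leaving the original untouched.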
Some traces might show that a single Web Service method executes multiple scripts, or may include further trace details that can tell you what is going on within the method call.

Finally

Not all Web Service failures can be resolved with these steps, and some may need the Citrix Development team to make further code changes. If you are still unable to resolve the issue after following these steps, a Helpdesk Ticket should be raised requesting further assistance.

Tracing issues with SQL Profiler

When troubleshooting an issue with CPSM, there are often times when there is no error message, or the error message is not descriptive enough to lead you to the actual issue. It could also be that you need to find out what is happening internally to determine why or how CPSM behaves in a particular way.

This section shows how you can use the SQL Profiler to trace what is happening and attempt to use that trace to lead you to the actual cause of a problem. It will not show you how to fix the problem, and it is by no means intended to document how to use the SQL Profiler tool in any depth. It just shows a way in which the tool is typically used in diagnosing issues with CPSM.

Capturing an initial Trace

Launch the SQL Profiler on the SQL Server that hosts the CPSM configuration database (OLM). Connect to the OLM database and switch to the Event Selection tab.

Figure 43: SQL Profiler Trace - Initial Event Selection

Clear the "Audit Login", "Audit Logout", and "ExistingConnection" events. These events just cloud your results.

Select the "Show all Columns" checkbox, then de-select and re-select the remaining three Events to ensure that newly added columns are included in the result set.

This is normally a good starting point as a base trace. It should be able to provide you with the actual Statement (query or stored procedure) that is causing the error.
Figure 44: SQL Profiler Trace - Basic Config

To ensure that unwanted data is not captured, the trace should initially be filtered to just the OLM database. Select "Column Filters", and add a filter for the "DatabaseName" column as in the following image.

Figure 45: SQL Profiler Trace - Edit Filter

Perform a quick test to see how much data is returned when you start the trace. If you find that you are being flooded with records, filter the results further by excluding entries that you see in the result set.

Before starting the trace, make sure your environment is ready to perform the action which generated the error, so that you get as small a result set as possible. Start the trace, and then perform the failing action. As soon as the error is displayed, stop the trace.

Reading SQL Profiler Trace Output

Now you need to find the records within the trace output that will begin to help in pinpointing the problem. If the database was active with other tasks at the time of the trace, the result pane may be swamped with records not related to your specific request. In this case it may be like finding a needle in a haystack.

The method you use to home in on the troublesome statement really depends on the problem you are trying to trace. Some examples are:

• Use the "Find" feature to search for a specific value (if you know what to look for, but odds are you won't).
  o A SQL timeout may be associated with an aborted process; in this case an error of type 2 (Abort) will be generated. You can search for a value of 2 in the Error column.
• Scroll through the result set manually and hope something jumps out. This might sound silly, but you'll probably find that this is the method you will use most often.
• Further filter your trace by specifying only long running statements (in the case of a timeout or poorly performing queries).
• Export your trace results to a table in your SQL database and manually construct queries to eliminate the records you do not need.
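The last option in the list above can be sketched as follows. This is only a minimal sketch, assuming the trace has been saved from Profiler to a table; the table name dbo.CpsmTrace and the one second threshold are hypothetical, and a column is only available if it was selected when the trace was captured.

```sql
-- Assumption: the Profiler trace was saved to a table called dbo.CpsmTrace
-- (hypothetical name). RowNumber is added by Profiler when it saves to a table.

-- 1. Statements that were aborted (Error = 2), for example by a timeout.
SELECT RowNumber,
       SPID,
       Duration,
       CAST(TextData AS nvarchar(max)) AS StatementText
FROM dbo.CpsmTrace
WHERE Error = 2
ORDER BY RowNumber;

-- 2. The longest running statements. Duration in a saved trace is normally
--    held in microseconds (the Profiler grid shows milliseconds by default),
--    so 1000000 here means one second. The threshold is arbitrary.
SELECT TOP (20)
       RowNumber,
       SPID,
       Duration / 1000 AS DurationMs,
       CPU,
       Reads,
       Writes,
       CAST(TextData AS nvarchar(max)) AS StatementText
FROM dbo.CpsmTrace
WHERE Duration >= 1000000
ORDER BY Duration DESC;
```

Keeping RowNumber in the output makes it easy to go back to the same position in the Profiler window and read the statements that ran immediately before and after the one of interest.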
Here is a sample trace output:

Figure 46: SQL Profiler Trace - Trace output

There are a number of columns of initial interest:

• EventClass
• TextData
• CPU
• Reads
• Writes
• Duration
• SPID
• Error
• ObjectName

The TextData column shows the actual statement that was executed and caused the issue. When you select a row in the output pane, the TextData content is duplicated in the lower pane to make it easier to select the statement and copy it to a Query Analyzer window for further analysis.

Tracing a Statement in the SQL Server Management Studio

Running the statement within the SQL Management Studio allows you to isolate your testing to just this statement, so you no longer need to wade through hundreds or thousands of lines of output to find the interesting entries. It does have its drawbacks, though, and may not be suitable in all situations. For example, a stored procedure which adds a value to a table may fail the second time around because the table does not allow duplicates, or a query which deletes a value from a table might finish immediately the second time it is executed because the entry no longer exists. In these types of scenarios, you will need to resort to adding additional trace events and capturing another Profiler trace.

If the statement can be executed manually, copy the statement from the Profiler window and paste it into a new Query window within the SQL Management Studio.

If you are interested in analyzing the performance of the statement, ensure that the "Include Actual Execution Plan" icon has been selected on the toolbar. This will show a graphical representation of how the statements were executed and can be very helpful in narrowing down the slower parts of a query or stored procedure.

Execute the statement.
Hopefully it will not take too long to finish. If it looks like it may take longer than you are prepared to wait, you can stop the execution and still determine which individual statement was hanging, because the "Execution Plan" tab is added to the Results pane as soon as the first query is completed, and each additional query is added as it completes. You should therefore be able to use the existing results to determine which queries have been executed, and which have not.

Once the statement has completed (or if you were forced to stop the execution), switch to the "Execution Plan" tab and analyze the results to further determine where the issue lies. The worst performing queries would normally have a high percentage Query Cost (relative to the Batch) in relation to the other queries, so look for high percentage entries as a starting point.

Figure 47: Execution Plan - Long performing statement

You can see from this that the statement that is taking up 43% of the batch is a Delete statement:

DELETE FROM O WITH (ROWLOCK) FROM dbo.Objects O INNER JOIN Deleted D ON O.ObjectID = D.ObjectID

This is probably where you should focus your efforts.

The "Execution Plan" within the SQL Management Studio combines the results of a statement with any statements arising from triggers associated with the object(s) being updated, which can make the Query Plan difficult or nearly impossible to dissect. In this case it may be better to expand your Profiler trace to include additional events and then run another Profiler trace which is filtered on the SPID of the window you are executing the statement from.

Expand Your Profiler Trace to Show Statement Details

Expanding your Profiler trace to include the statement starting and statement completed events will show each statement within a stored procedure, function or trigger when it is executed. Adding the "Showplan XML Statistics Profile" event will additionally show how the SQL optimizer has structured the statement for execution. These events will help in narrowing down your troubleshooting efforts to the exact statement which is at fault.

Add the following Trace Events to your current Trace:

• SP:StmtCompleted
• SP:StmtStarting
• SQL:StmtCompleted
• SQL:StmtStarting

If you are looking for performance issues, then you'll also want to add "Showplan XML Statistics Profile" so you can see the query plan that the SQL Optimizer decided to use in executing the statement.

An example is shown in the image below.

Figure 48: SQL Profiler - Expanded Profiler trace to show additional events and filtered to specific Query window

Then start the trace and perform the troublesome action again. Wait for the error or symptom to show itself and stop the trace. This time the trace output will include each statement issued, as in the following image:

Figure 49: SQL Profiler - Additional Trace Events

If you are looking for the XML Query Plans associated with a particular statement, then the rows you are interested in are likely to be the rows directly after the "StmtStarting" events. The above image highlights a Select statement, and the row directly after this is the "Showplan XML Statistics Profile" event associated with the Select query.

Columns of interest at this stage, apart from the columns already mentioned, are likely to be:

ObjectName
The name of the Stored Procedure, Function, or Trigger which contains the SQL statement shown in the TextData column.

NestLevel
The level of nesting with which the statement is associated. For example, if the initial statement is an Update statement on a table that has associated triggers, the statement shown may be a statement from a trigger, in which case the NestLevel would be one higher than that of the originating Update statement. Similarly, if functions are included as a joined object in a query, the statements associated with the function will have a higher NestLevel value.

SPID or TransactionID
Useful in identifying whether two statements were performed within the same transaction or not. Different SPID values will be associated with separate transactions. This can point to issues where table locking is causing timeouts.

Further information

• SQL Server Profiler
• Analyzing a Query
• Displaying Graphical Execution Plans

Sample Tracing Exercise

Now, let's investigate an issue which has occurred within CPSM in the past to see all of this in action. The following scenario will be used:

Failure to load the customers screen

When navigating to the Customers screen you receive the following error:

Failed to load the customers

And when you expand the error message you receive the following detailed error:

Violation of UNIQUE KEY constraint 'UQ__#Service__591E08AB32579BA2'. Cannot insert duplicate key in object 'dbo.#ServiceTemp'. The duplicate key value is (5, ).
The statement has been terminated.
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlDataReader.TrySetMetaData(_SqlMetaDataSet metaData, Boolean moreInfo)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
   at System.Data.SqlClient.SqlDataReader.get_MetaData()
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, SqlDataReader ds)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
   at System.Data.Common.DbCommand.System.Data.IDbCommand.ExecuteReader(CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset, DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.Fill(DataTable[] dataTables, Int32 startRecord, Int32 maxRecords, IDbCommand command, CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.Fill(DataTable dataTable)
   at EMS.Cortex.DBConnection.Execute(MarshalByValueComponent& Result, Int32 Timeout, String CommandName, IEnumerable`1 CommandParameters)
   at EMS.Cortex.DBConnection.ExecuteDataTable(String CommandName, IEnumerable`1 CommandParameters)
   at EMS.Cortex.CustomerSearch.Searcher.QuickSearch(Hashtable Criteria, Boolean Disabled, Boolean ReadUncommited)
   at EMS.Cortex.Web.Customers.BindData(Boolean isNewRequired)
   at EMS.Cortex.Web.Customers.CustomerSearch(Int32 pageIndex, Boolean DoDataBinding, Boolean isNewRequired)

The error points to a violation of a unique key while attempting to insert data into a temp table (dbo.#ServiceTemp), but what is this table, and what is causing the violation? We know that the duplicate key value is 5, but again, what does this refer to?

We can use SQL Profiler to capture the SQL statements which the CPSM Web Site is executing, to see if we can discover the actual SQL statements that are being issued.

1. Set up the Profiler for an initial trace as described above.
2. Log on to the Management Web Site.
3. Start the Profiler trace.
4. Click the Customers menu item.
5. When the error is displayed, stop the Profiler trace.
This particular trace should show the following statements being executed:

exec sp_ServicesGetALL @ServiceType=N'Top',@IncludeParent=NULL,@Custom=NULL,@Enabled=1,@Mandatory=NULL,@LocationID=0,@InclServiceAdmin=NULL,@ServiceName=NULL

exec sp_DisplayPropertiesGet @DisplayPropertyTypeID=2,@CustomerID=NULL,@isVisible=NULL,@isSearch=1,@isADAttributesOnly=0,@DisplayPropertyID=NULL,@Name=N''

exec sp_LocationsGet @LocationID=0,@Name=N'',@isHMC=NULL

exec sp_CustomerServicesFullList @CustomerID=1,@ServiceID=NULL,@Service=N'Reseller',@CustomerOnly=NULL,@IncludeProxy=0

exec sp_CustomerSearch @ParentID=0,@Domain=N'',@ContactName=N'',@PageIndex=1,@Location=N'',@Service=N'0',@UserCount=NULL,@LocationID=0,@Label=N'',@Name=N'',@UserCountLessThan=NULL,@RecordsPerPage=20,@ContactEmail=N'',@Disabled=1

You will now want to run each statement within the SQL Management Studio to see which one is failing. With some issues you might not see a failure at all, but the results of the statements may still help in directing you to the cause of the error. In this case, however, when executing the last statement (exec sp_CustomerSearch) you are likely to receive the following error:

Msg 2627, Level 14, State 1, Procedure sp_CustomerSearch, Line 583
Violation of UNIQUE KEY constraint 'UQ__#Service__591E08ABC4BEB194'. Cannot insert duplicate key in object 'dbo.#ServiceTemp'. The duplicate key value is (5, ).
The statement has been terminated.

We now know that the error is occurring within the sp_CustomerSearch stored procedure at line 583. The relevant line within the stored procedure looks like this:

insert INTO #ServiceTemp
exec sp_ExecuteSQL @Query, @Params,
    @LocationID=@LocationID, @Name=@Name, @Label=@Label, @Location=@Location,
    @ContactEmail=@ContactEmail, @ContactName=@ContactName, @UserCount=@UserCount,
    @UserCountLessThan=@UserCountLessThan, @CustomerID=@CustomerID

But the @Query parameter, which is the query passed into the sp_ExecuteSQL statement, is dynamically created.
It's difficult to reverse engineer this statement. An easier way to find out what the actual statement is, is to create a duplicate of this stored procedure, then edit the duplicated stored procedure to add the following 2 lines just before this statement is executed:

Print @Query
exec sp_ExecuteSQL @Query, @Params,
    @LocationID=@LocationID, @Name=@Name, @Label=@Label, @Location=@Location,
    @ContactEmail=@ContactEmail, @ContactName=@ContactName, @UserCount=@UserCount,
    @UserCountLessThan=@UserCountLessThan, @CustomerID=@CustomerID

The first statement just prints the @Query parameter to the Messages window, and the second statement executes the query to show the resulting data set which would be inserted into the temp table (without actually doing the insert).

Recompile the duplicated stored procedure and re-execute the earlier statement against it.

The query displayed looks like this:

SELECT C.*, BN.BrandLabel, ISNULL(vwU.UserCount, 0) AS UserCount, SL.RootServiceID column2,
    ISNULL(cd.CustomerDomain,'* domain not set *') As PrimaryDomain,
    ISNULL(cs.ServiceError,0) AS ServiceError,
    ISNULL(cs.UserError,0) AS UserError,
    ISNULL(cs.StatusID,6) AS StatusID,
    ISNULL(cs.StatusDescription,'Not provisioned') AS StatusDescription,
    ISNULL(cs.StatusColour,'grey') AS StatusColour,
    ISNULL(sc.ServiceCount,0) AS ServiceCount
FROM Customers C (NOLOCK)
    /*CustLocationsJoin*/
    /*PropertyFilter*/
    INNER JOIN BrandNames BN (NOLOCK) ON C.BrandName = BN.BrandName
    LEFT JOIN CustomerDomain CD (NOLOCK) ON CD.CustomerID = C.CustomerID AND CD.IsPrimary = 1
    LEFT JOIN #CustomerStatus CS ON CS.CustomerID = C.CustomerID
    LEFT JOIN #ServiceCount SC ON C.CustomerID = SC.CustomerID
    LEFT JOIN #UserCount vwU ON vwU.CustomerID = C.CustomerID
    LEFT JOIN #ServiceLookup SL ON SL.CustomerID = C.CustomerID
WHERE C.DateDeleted IS NULL
    /*DisabledFilter*/
    /*CustNameFilter*/
    /*CustLabelFilter*/
    /*CustLocationFilter*/
    /*CustContactEmailFilter*/
    /*CustContactNameFilter*/
    /*UserCountFilter*/
    /*UserCountLessThanFilter*/
    /*CustIDFilter*/
ORDER BY C.[Name]

Unfortunately, because the query references temp tables, it is still not possible to execute it manually to find out where our duplicate is. You can either manually create the contents of the temp tables from earlier statements within the stored procedure before attempting to run this, or you can use the output generated by the second line we added to see if something pops out.

Remember, the duplicate key value was 5. We may be able to use the table definition of the #ServiceTemp table and the query output to discover this duplicate entry.

Here is the structure of the #ServiceTemp table from the stored procedure:

Create Table #ServiceTemp
(
    CustomerID int NOT NULL,
    [Name] nvarchar(128) NOT NULL,
    Label nvarchar(128) NOT NULL,
    Location nvarchar(1024) NULL,
    ContactName nvarchar(128) NULL,
    ContactEMail nvarchar(256) NULL,
    [Description] nvarchar(1024) NULL,
    Created datetime NOT NULL,
    PhoneNumber nvarchar(50) NULL,
    FaxNumber nvarchar(50) NULL,
    MinPWLength int NOT NULL,
    BannerPWDays int NOT NULL,
    ModifiedBy int NOT NULL,
    DateDisabled datetime NULL,
    DateDeleted datetime NULL,
    BrandName nvarchar(50) NOT NULL,
    OU_Name nvarchar(128) NULL,
    ObjectID int NOT NULL,
    BillingID nvarchar(50) NULL,
    ImpersonatingUserID int NOT NULL,
    BrandType nvarchar(50) NOT NULL,
    ParentID int NULL,
    BrandLabel nVarchar(100),
    UserCount int not null,
    column2 int null,
    PrimaryDomain nvarchar(200),
    ServiceError int not null,
    UserError int not null,
    StatusID int not null,
    StatusDescription nVarchar(120) not null,
    StatusColour nVarchar(60) not null,
    ServiceCount int not null,
    unique (customerid, column2)
)

Now that we know that the unique constraint on the #ServiceTemp table is based on the CustomerID and column2 columns, we should be able to use the query results to find our duplicate without having to reconstruct the tables.
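The error message reports the duplicate key as (5, ), with nothing in the second position. That empty value is a NULL in column2, and SQL Server treats NULLs as equal to each other when it enforces a UNIQUE constraint. The following is a minimal sketch of that behaviour, using a hypothetical cut-down temp table (#UniqueDemo) and made-up domain names rather than the real #ServiceTemp table:

```sql
-- Hypothetical stand-in for #ServiceTemp: same unique key, far fewer columns.
CREATE TABLE #UniqueDemo
(
    CustomerID    int           NOT NULL,
    column2       int           NULL,
    PrimaryDomain nvarchar(200) NULL,
    UNIQUE (CustomerID, column2)
);

-- First row for customer 5 with no value in column2: succeeds.
INSERT INTO #UniqueDemo (CustomerID, column2, PrimaryDomain)
VALUES (5, NULL, N'first.example');

BEGIN TRY
    -- Second row with the same CustomerID and another NULL in column2.
    -- The two NULLs count as equal for uniqueness, so this raises error 2627.
    INSERT INTO #UniqueDemo (CustomerID, column2, PrimaryDomain)
    VALUES (5, NULL, N'second.example');
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;

DROP TABLE #UniqueDemo;
```

In other words, any two rows in the query output that share a CustomerID and both have NULL in column2 are enough to trigger the error seen on the Customers screen.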
Here is the output from the execution of the query (some columns have been omitted so that the results can be displayed successfully):

CustomerID  Name   Label                 ContactName    ObjectID  column2  PrimaryDomain
3           ADS    ADSync                Administrator  516656    NULL     adsynct.local
12          AOT    AppTeam               Admin          517214    NULL     aoteam.com
4           AR     Aarons Reseller       aaron          516726    NULL     ar.loc
5           at2    ajl test 2            ajl            516762    NULL     at2.loc
5           at2    ajl test 2            ajl            516762    NULL     lol.abc
2           Atc    Aarons test customer  Aaron Lister   516579    NULL     atc.local
6           ats    aaron test sub        ajl            516787    NULL     ats.loc
8           Cli    ClientOne             CustAdmin      516970    NULL     cli.com.au
1           CSP    Service Provider      Admin          516272    NULL     csp.local
10          FSAS   FSAS                  Ash            517056    NULL     fsas.in
7           gt     gt                    fwee           516919    NULL     abc.com
13          pfmig  pfmig                 pfmig          517301    NULL     at2.loc
11          TC2    Test Customer 2013    Admin          517073    NULL     tc2013.local
9           Vol    VolksWagon            Polo           517029    NULL     vw.com

From these results you can see that there are 2 entries with a CustomerID value of 5, and column2 is NULL for both of these rows. The issue is that the customer with CustomerID=5 seems to have 2 primary domains, which is not allowed.

To resolve this issue, you need to determine which domain is supposed to be the primary one, and which one to remove the primary flag from. To do so, execute the following query on the CustomerDomain table to see the domains configured for the customer with a CustomerID value of 5:

select * from customerdomain where customerid = 5

The results are shown here:

CustDomainID  CustomerID  CustomerDomain   IsOwner  IsPrimary  ExchangeAuthoritative  ModifiedBy  ImpersonatingUserID  IsDNSZone  ObjectID
5             5           at2.loc          1        1          NULL                   1           0                    0          516763
12            5           lol.abc          1        1          NULL                   1           0                    1          517206
14            5           lannister.kings  1        0          NULL                   1           0                    1          517210

Both at2.loc and lol.abc have the isPrimary flag set to 1.
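To check whether any other customer is in the same state, the duplicate primary flags can be listed directly. This is a minimal sketch that uses only the CustomerDomain columns shown in the result set above; the dbo schema prefix is an assumption.

```sql
-- Customers that have more than one domain flagged as primary.
SELECT CustomerID,
       COUNT(*) AS PrimaryDomainCount
FROM dbo.CustomerDomain
WHERE IsPrimary = 1
GROUP BY CustomerID
HAVING COUNT(*) > 1;

-- The individual rows behind those counts, so the CustDomainID
-- of each candidate for correction is visible.
SELECT cd.CustDomainID,
       cd.CustomerID,
       cd.CustomerDomain,
       cd.IsPrimary
FROM dbo.CustomerDomain AS cd
WHERE cd.IsPrimary = 1
  AND cd.CustomerID IN (SELECT CustomerID
                        FROM dbo.CustomerDomain
                        WHERE IsPrimary = 1
                        GROUP BY CustomerID
                        HAVING COUNT(*) > 1)
ORDER BY cd.CustomerID, cd.CustDomainID;
```

For the data in this example, the first query would be expected to return only CustomerID 5.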
Let's keep the at2.loc domain as the primary domain, and set the isPrimary flag of the other one back to 0 with the following statement:

Update CustomerDomain set isPrimary = 0 where CustDomainID = 12

Now you should be able to reload the Customers screen within the Management Web Site and the page should load successfully.

The issues you encounter will probably be different from this one, but hopefully you can use this as a guide to performing similar tasks to diagnose and resolve them. If you feel that changes need to be made to any SQL object, or if you have gone as far as you are comfortable going and have not been able to resolve the issue, log a Helpdesk call that includes your troubleshooting efforts so that a support engineer can take this further.
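When a corrective update such as the one in this exercise is applied directly to the OLM database, it can help to wrap it in a transaction and confirm the number of affected rows before committing. The following is a minimal sketch, assuming the same CustDomainID value of 12 used in the example and a dbo schema prefix:

```sql
BEGIN TRANSACTION;

UPDATE dbo.CustomerDomain
SET IsPrimary = 0
WHERE CustDomainID = 12
  AND IsPrimary = 1;

-- Exactly one row should have changed. Anything else suggests the data is
-- not in the state the example assumes, so the change is undone.
IF @@ROWCOUNT = 1
    COMMIT TRANSACTION;
ELSE
    ROLLBACK TRANSACTION;
```

The extra IsPrimary = 1 condition means that running the script a second time changes nothing and rolls back, rather than silently reporting success.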
